Mark Hopkins

2025

Using Encipherment to Isolate Conditions for the Successful Fine-tuning of Massively Multilingual Translation Models
Carter Louchheim | Denis Sotnichenko | Yukina Yamaguchi | Mark Hopkins
Proceedings of the Tenth Conference on Machine Translation

When fine-tuning massively multilingual translation models for low-resource languages, practitioners often include auxiliary languages to improve performance, but factors determining successful auxiliary language selection remain unclear. This paper investigates whether syntactic similarity or lexical overlap is more important for effective multilingual fine-tuning. We use encipherment to create controlled experimental conditions that disentangle these confounded factors, generating novel languages with identical syntax but no lexical overlap, and conversely, languages that preserve lexical overlap. Through extensive NLLB-200 fine-tuning experiments across Europarl and AmericasNLP datasets, we demonstrate that lexical overlap is the dominant factor. Syntactically identical auxiliary languages provide negligible benefits ( 1.0 ChrF), while languages with significant lexical overlap provide substantial improvements ( 5.0 ChrF), with effectiveness strongly correlated to KL-divergence between token distributions (r = -0.47, p .001). Our findings provide clear guidance: when selecting auxiliary languages for multilingual fine-tuning, prioritize lexical overlap over syntactic similarity.

2023

pdf bib abs

On the Evaluation of Neural Selective Prediction Methods for Natural Language Processing
Zhengyao Gu | Mark Hopkins
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We provide a survey and empirical comparison of the state-of-the-art in neural selective classification for NLP tasks. We also provide a methodological blueprint, including a novel metric called refinement that provides a calibrated evaluation of confidence functions for selective prediction. Finally, we supply documented, open-source code to support the future development of selective prediction techniques.

pdf bib abs

Williams College’s Submission for the Coco4MT 2023 Shared Task
Alex Root | Mark Hopkins
Proceedings of the Second Workshop on Corpus Generation and Corpus Augmentation for Machine Translation

Professional translation is expensive. As a consequence, when developing a translation system in the absence of a pre-existing parallel corpus, it is important to strategically choose sentences to have professionally translated for the training corpus. In our contribution to the Coco4MT 2023 Shared Task, we explore how sentence embeddings can be leveraged to choose an impactful set of sentences to translate. Based on six language pairs of the JHU Bible corpus, we demonstrate that a technique based on SimCSE embeddings outperforms a competitive suite of baselines.

2022

pdf bib abs

Towards More Natural Artificial Languages
Mark Hopkins
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)

A number of papers have recently argued in favor of using artificially generated languages to investigate the inductive biases of linguistic models, or to develop models for low-resource languages with underrepresented typologies. But the promise of artificial languages comes with a caveat: if these artificial languages are not sufficiently reflective of natural language, then using them as a proxy may lead to inaccurate conclusions. In this paper, we take a step towards increasing the realism of artificial language by introducing a variant of indexed grammars that draw their weights from hierarchical Pitman-Yor processes. We show that this framework generates languages that emulate the statistics of natural language corpora better than the current approach of directly formulating weighted context-free grammars.

2020

pdf bib abs

Reed at SemEval-2020 Task 9: Fine-Tuning and Bag-of-Words Approaches to Code-Mixed Sentiment Analysis
Vinay Gopalan | Mark Hopkins
Proceedings of the Fourteenth Workshop on Semantic Evaluation

We explore the task of sentiment analysis on Hinglish (code-mixed Hindi-English) tweets as participants of Task 9 of the SemEval-2020 competition, known as the SentiMix task. We had two main approaches: 1) applying transfer learning by fine-tuning pre-trained BERT models and 2) training feedforward neural networks on bag-of-words representations. During the evaluation phase of the competition, we obtained an F-score of 71.3% with our best model, which placed 4th out of 62 entries in the official system rankings.

2019

pdf bib abs

We report on the SemEval 2019 task on math question answering. We provided a question set derived from Math SAT practice exams, including 2778 training questions and 1082 test questions. For a significant subset of these questions, we also provided SMT-LIB logical form annotations and an interpreter that could solve these logical forms. Systems were evaluated based on the percentage of correctly answered questions. The top system correctly answered 45% of the test questions, a considerable improvement over the 17% random guessing baseline.

2018

pdf bib abs

Spot the Odd Man Out: Exploring the Associative Power of Lexical Resources
Gabriel Stanovsky | Mark Hopkins
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

We propose Odd-Man-Out, a novel task which aims to test different properties of word representations. An Odd-Man-Out puzzle is composed of 5 (or more) words, and requires the system to choose the one which does not belong with the others. We show that this simple setup is capable of teasing out various properties of different popular lexical resources (like WordNet and pre-trained word embeddings), while being intuitive enough to annotate on a large scale. In addition, we propose a novel technique for training multi-prototype word representations, based on unsupervised clustering of ELMo embeddings, and show that it surpasses all other representations on all Odd-Man-Out collections.

pdf bib abs

Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples
Vidur Joshi | Matthew Peters | Mark Hopkins
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We revisit domain adaptation for parsers in the neural era. First we show that recent advances in word representations greatly diminish the need for domain adaptation when the target domain is syntactically similar to the source domain. As evidence, we train a parser on the Wall Street Journal alone that achieves over 90% F1 on the Brown corpus. For more syntactically distant domains, we provide a simple way to adapt a parser using only dozens of partial annotations. For instance, we increase the percentage of error-free geometry-domain parses in a held-out set from 45% to 73% using approximately five dozen training examples. In the process, we demonstrate a new state-of-the-art single model result on the Wall Street Journal test set of 94.3%. This is an absolute increase of 1.7% over the previous state-of-the-art of 92.6%.

2017

pdf bib abs

Beyond Sentential Semantic Parsing: Tackling the Math SAT with a Cascade of Tree Transducers
Mark Hopkins | Cristian Petrescu-Prahova | Roie Levin | Ronan Le Bras | Alvaro Herrasti | Vidur Joshi
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present an approach for answering questions that span multiple sentences and exhibit sophisticated cross-sentence anaphoric phenomena, evaluating on a rich source of such questions – the math portion of the Scholastic Aptitude Test (SAT). By using a tree transducer cascade as its basic architecture, our system propagates uncertainty from multiple sources (e.g. coreference resolution or verb interpretation) until it can be confidently resolved. Experiments show the first-ever results 43% recall and 91% precision) on SAT algebra word problems. We also apply our system to the public Dolphin algebra question set, and improve the state-of-the-art F1-score from 73.9% to 77.0%.

pdf bib abs

Interactive Visualization for Linguistic Structure
Aaron Sarnat | Vidur Joshi | Cristian Petrescu-Prahova | Alvaro Herrasti | Brandon Stilson | Mark Hopkins
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We provide a visualization library and web interface for interactively exploring a parse tree or a forest of parses. The library is not tied to any particular linguistic representation, but provides a general-purpose API for the interactive exploration of hierarchical linguistic structure. To facilitate rapid understanding of a complex structure, the API offers several important features, including expand/collapse functionality, positional and color cues, explicit visual support for sequential structure, and dynamic highlighting to convey node-to-text correspondence.