Erick Fonseca

Also published as: Erick R. Fonseca, Erick Rocha Fonseca


Predicting Attention Sparsity in Transformers
Marcos Treviso | António Góis | Patrick Fernandes | Erick Fonseca | Andre Martins
Proceedings of the Sixth Workshop on Structured Prediction for NLP

Transformers’ quadratic complexity with respect to the input sequence length has motivated a body of work on efficient sparse approximations to softmax. An alternative path, used by entmax transformers, consists of having built-in exact sparse attention; however, this approach still requires quadratic computation. In this paper, we propose Sparsefinder, a simple model trained to identify the sparsity pattern of entmax attention before computing it. We experiment with three variants of our method, based on distances, quantization, and clustering, on two tasks: machine translation (attention in the decoder) and masked language modeling (encoder-only). Our work provides a new angle to study model efficiency through an extensive analysis of the tradeoff between the sparsity and recall of the predicted attention graph. This allows for detailed comparison between different models along their Pareto curves, which is important to guide future benchmarks for sparse attention models.

MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset
Marina Fomicheva | Shuo Sun | Erick Fonseca | Chrysoula Zerva | Frédéric Blain | Vishrav Chaudhary | Francisco Guzmán | Nina Lopatina | Lucia Specia | André F. T. Martins
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains annotations for eleven language pairs, including both high- and low-resource languages. Specifically, it is annotated for translation quality with human labels for up to 10,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level binary good/bad labels. Apart from the quality-related scores, each source-translation sentence pair is accompanied by the corresponding post-edited sentence, as well as the titles of the articles from which the sentences were extracted, and information on the neural MT models used to translate the text. We provide a thorough description of the data collection and annotation process as well as an analysis of the annotation distribution for each language pair. We also report the performance of baseline systems trained on the MLQE-PE dataset. The dataset is freely available and has already been used for several WMT shared tasks.


Revisiting Higher-Order Dependency Parsers
Erick Fonseca | André F. T. Martins
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural encoders have allowed dependency parsers to shift from higher-order structured models to simpler first-order ones, making decoding faster and still achieving better accuracy than non-neural parsers. This has led to a belief that neural encoders can implicitly encode structural constraints, such as siblings and grandparents in a tree. We tested this hypothesis and found that neural parsers may benefit from higher-order features, even when employing a powerful pre-trained encoder, such as BERT. While the gains of higher-order features are small in the presence of a powerful encoder, they are consistent for long-range dependencies and long sentences. In particular, higher-order models are more accurate on full sentence parses and on the exact match of modifier lists, indicating that they deal better with larger, more complex structures.

Findings of the WMT 2020 Shared Task on Quality Estimation
Lucia Specia | Frédéric Blain | Marina Fomicheva | Erick Fonseca | Vishrav Chaudhary | Francisco Guzmán | André F. T. Martins
Proceedings of the Fifth Conference on Machine Translation

We report the results of the WMT20 shared task on Quality Estimation, where the challenge is to predict the quality of the output of neural machine translation systems at the word, sentence and document levels. This edition included new data with open domain texts, direct assessment annotations, and multiple language pairs: English-German, English-Chinese, Russian-English, Romanian-English, Estonian-English, Sinhala-English and Nepali-English data for the sentence-level subtasks, English-German and English-Chinese for the word-level subtask, and English-French data for the document-level subtask. In addition, we made neural machine translation models available to participants. 19 participating teams from 27 institutions submitted altogether 1374 systems to different task variants and language pairs.


Findings of the WMT 2019 Shared Tasks on Quality Estimation
Erick Fonseca | Lisa Yankovskaya | André F. T. Martins | Mark Fishel | Christian Federmann
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

We report the results of the WMT19 shared task on Quality Estimation, i.e. the task of predicting the quality of the output of machine translation systems given just the source text and the hypothesis translations. The task includes estimation at three granularity levels: word, sentence and document. A novel addition is evaluating sentence-level QE against human judgments: in other words, designing MT metrics that do not need a reference translation. This year we include three language pairs, produced solely by neural machine translation systems. Participating teams from eleven institutions submitted a variety of systems to different task variants and language pairs.


Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks
Nathan Hartmann | Erick Fonseca | Christopher Shulby | Marcos Treviso | Jéssica Silva | Sandra Aluísio
Proceedings of the 11th Brazilian Symposium in Information and Human Language Technology


A Deep Architecture for Non-Projective Dependency Parsing
Erick Fonseca | Sandra Aluísio
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing

Semi-Automatic Construction of a Textual Entailment Dataset: Selecting Candidates with Vector Space Models
Erick R. Fonseca | Sandra Maria Aluísio
Proceedings of the 10th Brazilian Symposium in Information and Human Language Technology


Mac-Morpho Revisited: Towards Robust Part-of-Speech Tagging
Erick Rocha Fonseca | João Luís G. Rosa
Proceedings of the 9th Brazilian Symposium in Information and Human Language Technology