Katja Filippova


2021

Controlling Machine Translation for Multiple Attributes with Additive Interventions
Andrea Schioppa | David Vilar | Artem Sokolov | Katja Filippova
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Fine-grained control of machine translation (MT) outputs along multiple attributes is critical for many modern MT applications and is a requirement for gaining users’ trust. A standard approach for exerting control in MT is to prepend the input with a special tag to signal the desired output attribute. Despite its simplicity, attribute tagging has several drawbacks: continuous values must be binned into discrete categories, which is unnatural for certain applications, and interference between multiple tags is poorly understood. We address these problems by introducing vector-valued interventions, which allow for fine-grained control over multiple attributes simultaneously via a weighted linear combination of the corresponding vectors. For some attributes, our approach even allows for fine-tuning a model trained without annotations to support such interventions. In experiments with three attributes (length, politeness and monotonicity) and two language pairs (English to German and Japanese), our models achieve better control over a wider range of tasks compared to tagging, and translation quality does not degrade when no control is requested. Finally, we demonstrate how to enable control in an already trained model after a relatively cheap fine-tuning stage.
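
The "weighted linear combination of the corresponding vectors" mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say where in the model the combined vector is applied, so adding it to every token embedding is an assumption made here for illustration only; the function name `apply_interventions` and the attribute names are likewise hypothetical.

```python
from typing import Dict, List


def apply_interventions(
    token_embeddings: List[List[float]],
    attribute_vectors: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[List[float]]:
    """Add sum_k w_k * v_k to every token embedding.

    ASSUMPTION: applying the intervention at the embedding layer is an
    illustrative choice, not something the abstract specifies.
    """
    if not token_embeddings:
        return []
    dim = len(token_embeddings[0])

    # Combine the requested attributes into a single intervention vector.
    intervention = [0.0] * dim
    for name, weight in weights.items():
        vector = attribute_vectors[name]
        for i in range(dim):
            intervention[i] += weight * vector[i]

    # Shift each token embedding by the same combined vector.
    return [
        [x + delta for x, delta in zip(token, intervention)]
        for token in token_embeddings
    ]
```

With an empty `weights` dictionary the embeddings come back unchanged, which mirrors the "no control is requested" case. Because the weights are real numbers, a continuous attribute such as length does not need to be binned first.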

We Need To Talk About Random Splits
Anders Søgaard | Sebastian Ebert | Jasmijn Bastings | Katja Filippova
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

(CITATION) argued for using random splits rather than standard splits in NLP experiments. We argue that random splits, like standard splits, lead to overly optimistic performance estimates. We can also split data in biased or adversarial ways, e.g., training on short sentences and evaluating on long ones. Biased sampling has been used in domain adaptation to simulate real-world drift; this is known as the covariate shift assumption. In NLP, however, even worst-case splits, maximizing bias, often under-estimate the error observed on new samples of in-domain data, i.e., the data that models should minimally generalize to at test time. This invalidates the covariate shift assumption. Instead of using multiple random splits, future benchmarks should ideally include multiple, independent test sets; if that is infeasible, we argue that multiple biased splits lead to more realistic performance estimates than multiple random splits.
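
To make the contrast concrete, the sketch below places a random split next to the one biased split the abstract names (training on short sentences, evaluating on long ones). It is an illustration under simple assumptions, not the paper's experimental setup: sentence length is measured by whitespace tokens, and the function names and the `test_fraction` parameter are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def random_split(
    sentences: Sequence[str], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle the data and hold out a random fraction for testing."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def length_biased_split(
    sentences: Sequence[str], test_fraction: float = 0.2
) -> Tuple[List[str], List[str]]:
    """Train on the shortest sentences and test on the longest ones.

    ASSUMPTION: length is the number of whitespace-separated tokens.
    """
    ordered = sorted(sentences, key=lambda s: len(s.split()))
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]
```

Evaluating the same model on both kinds of split, and on a fresh in-domain sample where one is available, is the kind of comparison the abstract argues for.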

2020

Controlled Hallucinations: Learning to Generate Faithfully from Noisy Data
Katja Filippova
Findings of the Association for Computational Linguistics: EMNLP 2020

Neural text generation (data- or text-to-text) demonstrates remarkable performance when training data is abundant, which for many applications is not the case. To collect a large corpus of parallel data, heuristic rules are often used, but they inevitably let noise into the data, such as phrases in the output which cannot be explained by the input. Consequently, models pick up on the noise and may hallucinate, that is, generate fluent but unsupported text. Our contribution is a simple but powerful technique to treat such hallucinations as a controllable aspect of the generated text, without dismissing any input and without modifying the model architecture. On the WikiBio corpus (Lebret et al., 2016), a particularly noisy dataset, we demonstrate the efficacy of the technique both in an automatic and in a human evaluation.

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings | Katja Filippova
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

There is a recent surge of interest in using attention as explanation of model predictions, with mixed evidence on whether attention can be used as such. While attention conveniently gives us one weight per input token and is easily extracted, it is often unclear toward what goal it is used as explanation. We find that often that goal, whether explicitly stated or not, is to find out what input tokens are the most relevant to a prediction, and that the implied user for the explanation is a model developer. For this goal and user, we argue that input saliency methods are better suited, and that there are no compelling reasons to use attention, despite the coincidence that it provides a weight for each input. With this position paper, we hope to shift some of the recent focus on attention to saliency methods, and for authors to clearly state the goal and user for their explanations.

2018

Sentence-Level Fluency Evaluation: References Help, But Can Be Spared!
Katharina Kann | Sascha Rothe | Katja Filippova
Proceedings of the 22nd Conference on Computational Natural Language Learning

Motivated by recent findings on the probabilistic modeling of acceptability judgments, we propose syntactic log-odds ratio (SLOR), a normalized language model score, as a metric for referenceless fluency evaluation of natural language generation output at the sentence level. We further introduce WPSLOR, a novel WordPiece-based version, which harnesses a more compact language model. Even though word-overlap metrics like ROUGE are computed with the help of hand-written references, our referenceless methods obtain a significantly higher correlation with human fluency scores on a benchmark dataset of compressed sentences. Finally, we present ROUGE-LM, a reference-based metric which is a natural extension of WPSLOR to the case of available references. We show that ROUGE-LM yields a significantly higher correlation with human judgments than all baseline metrics, including WPSLOR on its own.
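
As a rough illustration of what "a normalized language model score" means here: SLOR is commonly defined as the sentence log-probability under a language model, minus the sum of the unigram log-probabilities of its tokens, divided by the sentence length. The sketch below assumes that definition; the language-model score is passed in as a number, and the function name `slor` and the `unigram_probs` dictionary are hypothetical placeholders rather than the authors' code.

```python
import math
from typing import Dict, Sequence


def slor(
    tokens: Sequence[str],
    lm_log_prob: float,
    unigram_probs: Dict[str, float],
) -> float:
    """Syntactic log-odds ratio of a tokenized sentence.

    tokens        -- the sentence as a non-empty token sequence
    lm_log_prob   -- ln p_M(S), the sentence log-probability from any
                     language model (ASSUMPTION: supplied by the caller)
    unigram_probs -- unigram probability of each token
    """
    # ln p_u(S): sum of the unigram log-probabilities of the tokens.
    unigram_log_prob = sum(math.log(unigram_probs[t]) for t in tokens)
    # Subtracting it discounts rare words; dividing by |S| discounts length.
    return (lm_log_prob - unigram_log_prob) / len(tokens)
```

Feeding the function WordPiece tokens and WordPiece-level probabilities instead of words gives a score in the spirit of the WPSLOR variant the abstract describes.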

2015

Sentence Compression by Deletion with LSTMs
Katja Filippova | Enrique Alfonseca | Carlos A. Colmenares | Lukasz Kaiser | Oriol Vinyals
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Idest: Learning a Distributed Representation for Event Patterns
Sebastian Krause | Enrique Alfonseca | Katja Filippova | Daniele Pighin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Modelling Events through Memory-based, Open-IE Patterns for Abstractive Summarization
Daniele Pighin | Marco Cornolti | Enrique Alfonseca | Katja Filippova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Opinion Mining on YouTube
Aliaksei Severyn | Alessandro Moschitti | Olga Uryupina | Barbara Plank | Katja Filippova
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2013

Overcoming the Lack of Parallel Data in Sentence Compression
Katja Filippova | Yasemin Altun
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

User Demographics and Language in an Implicit Social Network
Katja Filippova
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Pattern Learning for Relation Extraction with a Hierarchical Topic Model
Enrique Alfonseca | Katja Filippova | Jean-Yves Delort | Guillermo Garrido
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Proceedings of the Workshop on Monolingual Text-To-Text Generation
Katja Filippova | Stephen Wan
Proceedings of the Workshop on Monolingual Text-To-Text Generation

2010

Multi-Sentence Compression: Finding Shortest Paths in Word Graphs
Katja Filippova
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Company-Oriented Extractive Summarization of Financial News
Katja Filippova | Mihai Surdeanu | Massimiliano Ciaramita | Hugo Zaragoza
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

Tree Linearization in English: Improving Language Model Based Approaches
Katja Filippova | Michael Strube
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

2008

Dependency Tree Based Sentence Compression
Katja Filippova | Michael Strube
Proceedings of the Fifth International Natural Language Generation Conference

Sentence Fusion via Dependency Graph Compression
Katja Filippova | Michael Strube
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

2007

Generating Constituent Order in German Clauses
Katja Filippova | Michael Strube
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Extending the Entity-grid Coherence Model to Semantically Related Entities
Katja Filippova | Michael Strube
Proceedings of the Eleventh European Workshop on Natural Language Generation (ENLG 07)

2006

Using linguistically motivated features for paragraph boundary identification
Katja Filippova | Michael Strube
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing