Challenges in Measuring Bias via Open-Ended Language Generation
Afra Feyza Akyürek | Muhammed Yusuf Kocyigit | Sejin Paik | Derry Tanti Wijaya
Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)

Researchers have devised numerous ways to quantify social biases vested in pretrained language models. As some language models are capable of generating coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups, posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results. We find that the practice of measuring biases through text completion is prone to yielding contradictory results under different experiment settings. We additionally provide recommendations for reporting biases in open-ended language generation for a more complete outlook of biases exhibited by a given language model. Code to reproduce the results is released under


Majority Voting with Bidirectional Pre-translation For Bitext Retrieval
Alexander Jones | Derry Tanti Wijaya
Proceedings of the 14th Workshop on Building and Using Comparable Corpora (BUCC 2021)

Obtaining high-quality parallel corpora is of paramount importance for training NMT systems. However, as many language pairs lack adequate gold-standard training data, a popular approach has been to mine so-called “pseudo-parallel” sentences from paired documents in two languages. In this paper, we outline some drawbacks of current methods that rely on an embedding similarity threshold, and propose a heuristic method in their place. Our method involves translating both halves of a paired corpus before mining, and then performing a majority vote on sentence pairs mined in three ways: after translating documents in language x to language y, after translating language y to x, and using the original documents in languages x and y. We demonstrate success with this novel approach on the Tatoeba similarity search benchmark in 64 low-resource languages, and on NMT in Kazakh and Gujarati. We also uncover the effect of resource-related factors (i.e. how much monolingual/bilingual data is available for a given language) on the optimal choice of bitext mining method, demonstrating that there is currently no one-size-fits-all approach for this task. We make the code and data used in our experiments publicly available.
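
The voting step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' released code: the function name `majority_vote`, the `min_votes` parameter, and the toy sentence IDs are all hypothetical, and the sketch assumes each mining run has already produced a set of (source, target) sentence pairs.

```python
from collections import Counter


def majority_vote(mined_runs, min_votes=2):
    """Keep sentence pairs proposed by at least `min_votes` mining runs.

    `mined_runs` is an iterable of collections of (source, target) pairs,
    one collection per mining configuration (hypothetical interface).
    """
    votes = Counter()
    for run in mined_runs:
        votes.update(set(run))  # each run votes at most once per pair
    return {pair for pair, n in votes.items() if n >= min_votes}


# Toy, made-up mining output for the three configurations in the abstract.
mined_original = {("x1", "y1"), ("x2", "y2"), ("x3", "y9")}  # original x and y
mined_x_to_y = {("x1", "y1"), ("x2", "y2")}                  # after translating x to y
mined_y_to_x = {("x1", "y1"), ("x3", "y3")}                  # after translating y to x

kept = majority_vote([mined_original, mined_x_to_y, mined_y_to_x])
print(sorted(kept))
```

With three runs, `min_votes=2` corresponds to a simple majority; pairs found by only one configuration are discarded.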

IndoCollex: A Testbed for Morphological Transformation of Indonesian Colloquial Words
Haryo Akbarianto Wibowo | Made Nindyatama Nityasya | Afra Feyza Akyürek | Suci Fitriany | Alham Fikri Aji | Radityo Eko Prasojo | Derry Tanti Wijaya
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage
Isidora Tourni | Lei Guo | Taufiq Husada Daryanto | Fabian Zhafransyah | Edward Edberg Halim | Mona Jalal | Boqi Chen | Sha Lai | Hengchang Hu | Margrit Betke | Prakash Ishwar | Derry Tanti Wijaya
Findings of the Association for Computational Linguistics: EMNLP 2021

News media structure their reporting of events or issues using certain perspectives. When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called “frames” in communication research. We study, for the first time, the value of combining lead images and their contextual information with text to identify the frame of a given news article. We observe that using multiple modes of information (article- and image-derived features) improves prediction of news frames over any single mode of information when the images are relevant to the frames of the headlines. We also observe that frame image relevance is related to the ease of conveying frames via images, which we call frame concreteness. Additionally, we release the first multimodal news framing dataset related to gun violence in the U.S., curated and annotated by communication researchers. The dataset will allow researchers to further examine the use of multiple information modalities for studying media framing.

Cultural and Geographical Influences on Image Translatability of Words across Languages
Nikzad Khani | Isidora Tourni | Mohammad Sadegh Rasooli | Chris Callison-Burch | Derry Tanti Wijaya
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Neural Machine Translation (NMT) models have been observed to produce poor translations when there are few or no parallel sentences to train the models. In the absence of parallel data, several approaches have turned to the use of images to learn translations. Since images of words, e.g., horse, may be unchanged across languages, translations can be identified via images associated with words in different languages that have a high degree of visual similarity. However, translating via images has been shown to improve upon text-only models only marginally. To better understand when images are useful for translation, we study image translatability of words, which we define as the translatability of words via images, by measuring intra- and inter-cluster similarities of image representations of words that are translations of each other. We find that images of words are not always invariant across languages, and that language pairs with shared culture, meaning having either a common language family, ethnicity or religion, have improved image translatability (i.e., have more similar images for similar words) compared to language pairs without shared culture, regardless of their geographic proximity. In addition, in line with previous works that show images help more in translating concrete words, we find that concrete words have improved image translatability compared to abstract ones.
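
To make the intra- and inter-cluster comparison concrete, here is a small Python sketch. The abstract does not state which similarity measure or image features are used, so cosine similarity over toy 2-dimensional vectors is an assumption, and the function names and data below are hypothetical.

```python
import math
from itertools import combinations, product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def intra_cluster_similarity(cluster):
    """Mean pairwise similarity among the images of one word."""
    pairs = list(combinations(cluster, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def inter_cluster_similarity(cluster_a, cluster_b):
    """Mean similarity between the images of a word and those of its translation."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Made-up image feature vectors for a word in two languages.
images_lang_a = [[1.0, 0.0], [0.9, 0.1]]
images_lang_b = [[1.0, 0.1], [0.8, 0.0]]

intra = intra_cluster_similarity(images_lang_a)
inter = inter_cluster_similarity(images_lang_a, images_lang_b)
print(round(intra, 3), round(inter, 3))
```

Under this reading, a word pair whose inter-cluster similarity is close to the intra-cluster similarity of each side has images that look alike across the two languages.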

“Wikily” Supervised Neural Translation Tailored to Cross-Lingual Tasks
Mohammad Sadegh Rasooli | Chris Callison-Burch | Derry Tanti Wijaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

We present a simple but effective approach for leveraging Wikipedia for neural machine translation as well as cross-lingual tasks of image captioning and dependency parsing without using any direct supervision from external parallel data or supervised models in the target language. We show that first sentences and titles of linked Wikipedia pages, as well as cross-lingual image captions, are strong signals for seed parallel data to extract bilingual dictionaries and cross-lingual word embeddings for mining parallel text from Wikipedia. Our final model achieves high BLEU scores that are close to or sometimes higher than strong supervised baselines in low-resource languages; e.g. supervised BLEU of 4.0 versus 12.1 from our model in English-to-Kazakh. Moreover, we tailor our wikily translation models to unsupervised image captioning and cross-lingual dependency parser transfer. In image captioning, we train a multi-tasking machine translation and image captioning pipeline for Arabic and English in which the Arabic training data is a wikily translation of the English captioning data. Our captioning results on Arabic are slightly better than those of its supervised counterpart. In dependency parsing, we translate a large amount of monolingual text, and use it as artificial training data in an annotation projection framework. We show that our model outperforms recent work on cross-lingual transfer of dependency parsers.

OpenFraming: Open-sourced Tool for Computational Framing Analysis of Multilingual Data
Vibhu Bhatia | Vidya Prasad Akavoor | Sejin Paik | Lei Guo | Mona Jalal | Alyssa Smith | David Assefa Tofu | Edward Edberg Halim | Yimeng Sun | Margrit Betke | Prakash Ishwar | Derry Tanti Wijaya
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

When journalists cover a news story, they can cover the story from multiple angles or perspectives. These perspectives are called “frames,” and usage of one frame or another may influence public perception and opinion of the issue at hand. We develop a web-based system for analyzing frames in multilingual text documents. We propose and guide users through a five-step end-to-end computational framing analysis framework grounded in media framing theory in communication research. Users can use the framework to analyze multilingual text data, starting from the exploration of frames in users’ corpora and through review of previous framing literature (steps 1-3) to frame classification (step 4) and prediction (step 5). The framework combines unsupervised and supervised machine learning and leverages a state-of-the-art (SoTA) multilingual language model, which can significantly enhance frame prediction performance while requiring only a small sample of manual annotations. Through the interactive website, anyone can perform the proposed computational framing analysis, making advanced computational analysis available to researchers without a programming background and bridging the digital divide within the communication research discipline in particular and the academic community in general. The system is available online at, via an API, or through our GitHub page


Resolving Pronouns in Twitter Streams: Context can Help!
Anietie Andy | Chris Callison-Burch | Derry Tanti Wijaya
Proceedings of the Third Workshop on Computational Models of Reference, Anaphora and Coreference

Many people live-tweet televised events like presidential debates and popular TV shows and discuss people or characters in the event. Naturally, many tweets make pronominal reference to these people/characters. We propose an algorithm for resolving personal pronouns that make reference to people involved in an event, in tweet streams collected during the event.

Multi-Label and Multilingual News Framing Analysis
Afra Feyza Akyürek | Lei Guo | Randa Elanwar | Prakash Ishwar | Margrit Betke | Derry Tanti Wijaya
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

News framing refers to the practice in which aspects of specific issues are highlighted in the news to promote a particular interpretation. In NLP, although recent works have studied framing in English news, few have studied how the analysis can be extended to other languages and in a multi-label setting. In this work, we explore multilingual transfer learning to detect multiple frames from just the news headline in a genuinely low-resource context where there are few or no frame annotations in the target language. We propose a novel method that can leverage elementary resources consisting of a dictionary and a few annotations to detect frames in the target language. Our method performs comparably to or better than translating the entire target language headline to the source language for which we have annotated data. This work opens up an exciting new capability of scaling up frame analysis to many languages, even those without existing translation technologies. Lastly, we apply our method to detect frames on the issue of U.S. gun violence in multiple languages and obtain exciting insights on the relationship between different frames of the same problem across different countries with different languages.


Detecting Frames in News Headlines and Its Application to Analyzing News Framing Trends Surrounding U.S. Gun Violence
Siyi Liu | Lei Guo | Kate Mays | Margrit Betke | Derry Tanti Wijaya
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Different news articles about the same topic often offer a variety of perspectives: an article written about gun violence might emphasize gun control, while another might promote 2nd Amendment rights, and yet a third might focus on mental health issues. In communication research, these different perspectives are known as “frames”, which, when used in news media, will influence the opinion of their readers in multiple ways. In this paper, we present a method for effectively detecting frames in news headlines. Our training and performance evaluation are based on a new dataset of news headlines related to the issue of gun violence in the United States. This Gun Violence Frame Corpus (GVFC) was curated and annotated by journalism and communication experts. Our proposed approach sets a new state-of-the-art performance for multiclass news frame detection, significantly outperforming a recent baseline by 35.9% absolute difference in accuracy. We apply our frame detection approach in a large-scale study of 88k news headlines about the coverage of gun violence in the U.S. between 2016 and 2018.

Winter is here: Summarizing Twitter Streams related to Pre-Scheduled Events
Anietie Andy | Derry Tanti Wijaya | Chris Callison-Burch
Proceedings of the Second Workshop on Storytelling

Pre-scheduled events, such as TV shows and sports games, usually garner considerable attention from the public. Twitter captures large volumes of discussions and messages related to these events in real time. Twitter streams related to pre-scheduled events are characterized by the following: (1) spikes in the volume of published tweets reflect the highlights of the event and (2) some of the published tweets make reference to the characters involved in the event, in the context in which they are currently portrayed in a subevent. In this paper, we take advantage of these characteristics to identify the highlights of pre-scheduled events from tweet streams and we demonstrate a method to summarize these highlights. We evaluate our algorithm on tweets collected around 2 episodes of a popular TV show, Game of Thrones, Season 7.
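
Characteristic (1), volume spikes marking highlights, can be illustrated with a simple Python sketch. The abstract does not describe the detection rule, so the threshold used here (mean plus `k` standard deviations of per-minute tweet counts) is an assumption rather than the paper's algorithm, and `find_spikes`, `k`, and the counts are hypothetical.

```python
from statistics import mean, pstdev


def find_spikes(counts, k=2.0):
    """Return indices of time bins whose tweet count exceeds mean + k * std.

    `counts` is a list of tweet counts per time bin (e.g. per minute).
    """
    threshold = mean(counts) + k * pstdev(counts)
    return [i for i, c in enumerate(counts) if c > threshold]


# Made-up per-minute tweet counts with one burst at minute 4.
tweets_per_minute = [10, 12, 11, 9, 80, 10, 11, 12, 10, 9]
spikes = find_spikes(tweets_per_minute)
print(spikes)
```

The tweets falling inside the flagged bins would then be the candidates for a highlight summary.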


Learning Translations via Images with a Massively Multilingual Image Dataset
John Hewitt | Daphne Ippolito | Brendan Callahan | Reno Kriz | Derry Tanti Wijaya | Chris Callison-Burch
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We conduct the most comprehensive study to date into translating words via images. To facilitate research on the task, we introduce a large-scale multilingual corpus of images, each labeled with the word it represents. Past datasets have been limited to only a few high-resource languages and unrealistically easy translation settings. In contrast, we have collected by far the largest available dataset for this task, with images for approximately 10,000 words in each of 100 languages. We run experiments on a dozen high-resource languages and 20 low-resource languages, demonstrating the effect of word concreteness and part-of-speech on translation quality. We find that while image features work best for concrete nouns, they are sometimes effective on other parts of speech. To improve image-based translation, we introduce a novel method of predicting word concreteness from images, which improves on a previous state-of-the-art unsupervised technique. This allows us to predict when image-based translation may be effective, enabling consistent improvements to a state-of-the-art text-based word translation system. Our code and the Massively Multilingual Image Dataset (MMID) are available at


Learning Translations via Matrix Completion
Derry Tanti Wijaya | Brendan Callahan | John Hewitt | Jie Gao | Xiao Ling | Marianna Apidianaki | Chris Callison-Burch
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Bilingual Lexicon Induction is the task of learning word translations without bilingual parallel corpora. We model this task as a matrix completion problem, and present an effective and extendable framework for completing the matrix. This method harnesses diverse bilingual and monolingual signals, each of which may be incomplete or noisy. Our model achieves state-of-the-art performance for both high- and low-resource languages.


Mapping Verbs in Different Languages to Knowledge Base Relations using Web Text as Interlingua
Derry Tanti Wijaya | Tom M. Mitchell
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies


“A Spousal Relation Begins with a Deletion of engage and Ends with an Addition of divorce”: Learning State Changing Verbs from Wikipedia Revision History
Derry Tanti Wijaya | Ndapandula Nakashole | Tom Mitchell
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing


CTPs: Contextual Temporal Profiles for Time Scoping Facts using State Change Detection
Derry Tanti Wijaya | Ndapandula Nakashole | Tom M. Mitchell
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)