Dhivya Chinnappa


2023

pdf bib
Interpreting Answers to Yes-No Questions in User-Generated Content
Shivam Mathur | Keun Park | Dhivya Chinnappa | Saketh Kotamraju | Eduardo Blanco
Findings of the Association for Computational Linguistics: EMNLP 2023

Interpreting answers to yes-no questions in social media is difficult. Yes and no keywords are uncommon, and the few answers that include them are rarely to be interpreted what the keywords suggest. In this paper, we present a new corpus of 4,442 yes-no question-answer pairs from Twitter. We discuss linguistic characteristics of answers whose interpretation is yes or no, as well as answers whose interpretation is unknown. We show that large language models are far from solving this problem, even after fine-tuning and blending other corpora for the same problem but outside social media.

2022

pdf bib
An Analysis of Negation in Natural Language Understanding Corpora
Md Mosharaf Hossain | Dhivya Chinnappa | Eduardo Blanco
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

This paper analyzes negation in eight popular corpora spanning six natural language understanding tasks. We show that these corpora have few negations compared to general-purpose English, and that the few negations in them are often unimportant. Indeed, one can often ignore negations and still make the right predictions. Additionally, experimental results show that state-of-the-art transformers trained with these corpora obtain substantially worse results with instances that contain negation, especially if the negations are important. We conclude that new corpora accounting for negation are needed to solve natural language understanding tasks when negation is present.

pdf bib
Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts
Bharathi Raja Chakravarthi | Abirami Murugappan | Dhivya Chinnappa | Adeep Hane | Prasanna Kumar Kumeresan | Rahul Ponnusamy
Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts

pdf bib
Proceedings of the First Workshop on Multimodal Machine Learning in Low-resource Languages
Bharathi Raja Chakravarthi | Abirami Murugappan | Dhivya Chinnappa | Adeep Hane | Prasanna Kumar Kumeresan | Rahul Ponnusamy
Proceedings of the First Workshop on Multimodal Machine Learning in Low-resource Languages

2021

pdf bib
dhivya-hope-detection@LT-EDI-EACL2021: Multilingual Hope Speech Detection for Code-mixed and Transliterated Texts
Dhivya Chinnappa
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion

In this paper we work with a hope speech detection corpora that includes English, Tamil, and Malayalam datasets. We present a two phase mechanism to detect hope speech. In the first phase we build a classifier to identify the language of the text. In the second phase, we build a classifier to detect hope speech, non hope speech, or not lang labels. Experimental results show that hope speech detection is challenging and there is scope for improvement.

pdf bib
Tamil Lyrics Corpus: Analysis and Experiments
Dhivya Chinnappa | Praveenraj Dhandapani
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages

In this paper, we present a new Tamil lyrics corpus extracted from Tamil movies captured across a range of 65 years (1954 to 2019). We present a detailed corpus analysis showing the nature of Tamil lyrics with respect to lyricists and the year which it was written. We also present similar- ity score across different lyricists based on their song lyrics. We present experi- mental results based on the SOTA BERT Tamil models to identify the lyricists of a song. Finally, we present future research directions encouraging researchers to pur- sue Tamil NLP research.

2020

pdf bib
WikiPossessions: Possession Timeline Generation as an Evaluation Benchmark for Machine Reading Comprehension of Long Texts
Dhivya Chinnappa | Alexis Palmer | Eduardo Blanco
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper presents WikiPossessions, a new benchmark corpus for the task of temporally-oriented possession (TOP), or tracking objects as they change hands over time. We annotate Wikipedia articles for 90 different well-known artifacts paintings, diamonds, and archaeological artifacts), producing 799 artifact-possessor relations with associated attributes. For each article, we also produce a full possession timeline. The full version of the task combines straightforward entity-relation extraction with complex temporal reasoning, as well as verification of textual support for the relevant types of knowledge. Specifically, to complete the full TOP task for a given article, a system must do the following: a) identify possessors; b) anchor possessors to times/events; c) identify temporal relations between each temporal anchor and the possession relation it corresponds to; d) assign certainty scores to each possessor and each temporal relation; and e) assemble individual possession events into a global possession timeline. In addition to the corpus, we release evaluation scripts and a baseline model for the task.

pdf bib
Beyond Possession Existence: Duration and Co-Possession
Dhivya Chinnappa | Srikala Murugan | Eduardo Blanco
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper introduces two tasks: determining (a) the duration of possession relations and (b) co-possessions, i.e., whether multiple possessors possess a possessee at the same time. We present new annotations on top of corpora annotating possession existence and experimental results. Regarding possession duration, we derive the time spans we work with empirically from annotations indicating lower and upper bounds. Regarding co-possessions, we use a binary label. Cohen’s kappa coefficients indicate substantial agreement, and experimental results show that text is more useful than the image for solving these tasks.

pdf bib
Determining Event Outcomes: The Case of #fail
Srikala Murugan | Dhivya Chinnappa | Eduardo Blanco
Findings of the Association for Computational Linguistics: EMNLP 2020

This paper targets the task of determining event outcomes in social media. We work with tweets containing either #cookingFail or #bakingFail, and show that many of the events described in them resulted in something edible. Tweets that contain images are more likely to result in edible albeit imperfect outcomes. Experimental results show that edibility is easier to predict than outcome quality.

2019

pdf bib
Extracting Possessions from Social Media: Images Complement Language
Dhivya Chinnappa | Srikala Murugan | Eduardo Blanco
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

This paper describes a new dataset and experiments to determine whether authors of tweets possess the objects they tweet about. We work with 5,000 tweets and show that both humans and neural networks benefit from images in addition to text. We also introduce a simple yet effective strategy to incorporate visual information into any neural network beyond weights from pretrained networks. Specifically, we consider the tags identified in an image as an additional textual input, and leverage pretrained word embeddings as usually done with regular text. Experimental results show this novel strategy is beneficial.

2018

pdf bib
Mining Possessions: Existence, Type and Temporal Anchors
Dhivya Chinnappa | Eduardo Blanco
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

This paper presents a corpus and experiments to mine possession relations from text. Specifically, we target alienable and control possessions, and assign temporal anchors indicating when the possession holds between possessor and possessee. We present new annotations for this task, and experimental results using both traditional classifiers and neural networks. Results show that the three subtasks (predicting possession existence, possession type and temporal anchors) can be automated.

pdf bib
Possessors Change Over Time: A Case Study with Artworks
Dhivya Chinnappa | Eduardo Blanco
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

This paper presents a corpus and experimental results to extract possession relations over time. We work with Wikipedia articles about artworks, and extract possession relations along with temporal information indicating when these relations are true. The annotation scheme yields many possessors over time for a given artwork, and experimental results show that an LSTM ensemble can automate the task.