Snigdha Chaturvedi


2021

pdf bib
Proceedings of the Third Workshop on Narrative Understanding
Nader Akoury | Faeze Brahman | Snigdha Chaturvedi | Elizabeth Clark | Mohit Iyyer | Lara J. Martin
Proceedings of the Third Workshop on Narrative Understanding

pdf bib
How Helpful is Inverse Reinforcement Learning for Table-to-Text Generation?
Sayan Ghosh | Zheng Qi | Snigdha Chaturvedi | Shashank Srivastava
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Existing approaches for the Table-to-Text task suffer from issues such as missing information, hallucination and repetition. Many approaches to this problem use Reinforcement Learning (RL), which maximizes a single manually defined reward, such as BLEU. In this work, we instead pose the Table-to-Text task as Inverse Reinforcement Learning (IRL) problem. We explore using multiple interpretable unsupervised reward components that are combined linearly to form a composite reward function. The composite reward function and the description generator are learned jointly. We find that IRL outperforms strong RL baselines marginally. We further study the generalization of learned IRL rewards in scenarios involving domain adaptation. Our experiments reveal significant challenges in using IRL for this task.

2020

pdf bib
Predicting Depression in Screening Interviews from Latent Categorization of Interview Prompts
Alex Rinaldi | Jean Fox Tree | Snigdha Chaturvedi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Accurately diagnosing depression is difficult– requiring time-intensive interviews, assessments, and analysis. Hence, automated methods that can assess linguistic patterns in these interviews could help psychiatric professionals make faster, more informed decisions about diagnosis. We propose JLPC, a model that analyzes interview transcripts to identify depression while jointly categorizing interview prompts into latent categories. This latent categorization allows the model to define high-level conversational contexts that influence patterns of language in depressed individuals. We show that the proposed model not only outperforms competitive baselines, but that its latent prompt categories provide psycholinguistic insights about depression.

pdf bib
Bridging the Structural Gap Between Encoding and Decoding for Data-To-Text Generation
Chao Zhao | Marilyn Walker | Snigdha Chaturvedi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generating sequential natural language descriptions from graph-structured data (e.g., knowledge graph) is challenging, partly because of the structural differences between the input graph and the output text. Hence, popular sequence-to-sequence models, which require serialized input, are not a natural fit for this task. Graph neural networks, on the other hand, can better encode the input graph but broaden the structural gap between the encoder and decoder, making faithful generation difficult. To narrow this gap, we propose DualEnc, a dual encoding model that can not only incorporate the graph structure, but can also cater to the linear structure of the output text. Empirical comparisons with strong single-encoder baselines demonstrate that dual encoding can significantly improve the quality of the generated text.

pdf bib
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events
Claire Bonial | Tommaso Caselli | Snigdha Chaturvedi | Elizabeth Clark | Ruihong Huang | Mohit Iyyer | Alejandro Jaimes | Heng Ji | Lara J. Martin | Ben Miller | Teruko Mitamura | Nanyun Peng | Joel Tetreault
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events

pdf bib
Modeling Protagonist Emotions for Emotion-Aware Storytelling
Faeze Brahman | Snigdha Chaturvedi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Emotions and their evolution play a central role in creating a captivating story. In this paper, we present the first study on modeling the emotional trajectory of the protagonist in neural storytelling. We design methods that generate stories that adhere to given story titles and desired emotion arcs for the protagonist. Our models include Emotion Supervision (EmoSup) and two Emotion-Reinforced (EmoRL) models. The EmoRL models use special rewards designed to regularize the story generation process through reinforcement learning. Our automatic and manual evaluations demonstrate that these models are significantly better at generating stories that follow the desired emotion arcs compared to baseline methods, without sacrificing story quality.

pdf bib
Cue Me In: Content-Inducing Approaches to Interactive Story Generation
Faeze Brahman | Alexandru Petrusca | Snigdha Chaturvedi
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Automatically generating stories is a challenging problem that requires producing causally related and logical sequences of events about a topic. Previous approaches in this domain have focused largely on one-shot generation, where a language model outputs a complete story based on limited initial input from a user. Here, we instead focus on the task of interactive story generation, where the user provides the model mid-level sentence abstractions in the form of cue phrases during the generation process. This provides an interface for human users to guide the story generation. We present two content-inducing approaches to effectively incorporate this additional information. Experimental results from both automatic and human evaluations show that these methods produce more topically coherent and personalized stories compared to baseline methods.

2019

pdf bib
Named Entity Recognition with Partially Annotated Training Data
Stephen Mayhew | Snigdha Chaturvedi | Chen-Tse Tsai | Dan Roth
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Supervised machine learning assumes the availability of fully-labeled data, but in many cases, such as low-resource languages, the only data available is partially annotated. We study the problem of Named Entity Recognition (NER) with partially annotated training data in which a fraction of the named entities are labeled, and all other tokens, entities or otherwise, are labeled as non-entity by default. In order to train on this noisy dataset, we need to distinguish between the true and false negatives. To this end, we introduce a constraint-driven iterative algorithm that learns to detect false negatives in the noisy set and downweigh them, resulting in a weighted training set. With this set, we train a weighted NER model. We evaluate our algorithm with weighted variants of neural and non-neural NER models on data in 8 languages from several language and script families, showing strong ability to learn from partial data. Finally, to show real-world efficacy, we evaluate on a Bengali NER corpus annotated by non-speakers, outperforming the prior state-of-the-art by over 5 points F1.

pdf bib
Proceedings of the First Workshop on Narrative Understanding
David Bamman | Snigdha Chaturvedi | Elizabeth Clark | Madalina Fiterau | Mohit Iyyer
Proceedings of the First Workshop on Narrative Understanding

2018

pdf bib
Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
Daniel Khashabi | Snigdha Chaturvedi | Michael Roth | Shyam Upadhyay | Dan Roth
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We present a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. We solicit and verify questions and answers for this challenge through a 4-step crowdsourcing experiment. Our challenge dataset contains 6,500+ questions for 1000+ paragraphs across 7 different domains (elementary school science, news, travel guides, fiction stories, etc) bringing in linguistic diversity to the texts and to the questions wordings. On a subset of our dataset, we found human solvers to achieve an F1-score of 88.1%. We analyze a range of baselines, including a recent state-of-art reading comprehension system, and demonstrate the difficulty of this challenge, despite a high human performance. The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that requires reasoning skills.

pdf bib
Where Have I Heard This Story Before? Identifying Narrative Similarity in Movie Remakes
Snigdha Chaturvedi | Shashank Srivastava | Dan Roth
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

People can identify correspondences between narratives in everyday life. For example, an analogy with the Cinderella story may be made in describing the unexpected success of an underdog in seemingly different stories. We present a new task and dataset for story understanding: identifying instances of similar narratives from a collection of narrative texts. We present an initial approach for this problem, which finds correspondences between narratives in terms of plot events, and resemblances between characters and their social relationships. Our approach yields an 8% absolute improvement in performance over a competitive information-retrieval baseline on a novel dataset of plot summaries of 577 movie remakes from Wikipedia.

2017

pdf bib
A Joint Model for Semantic Sequences: Frames, Entities, Sentiments
Haoruo Peng | Snigdha Chaturvedi | Dan Roth
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Understanding stories – sequences of events – is a crucial yet challenging natural language understanding task. These events typically carry multiple aspects of semantics including actions, entities and emotions. Not only does each individual aspect contribute to the meaning of the story, so does the interaction among these aspects. Building on this intuition, we propose to jointly model important aspects of semantic knowledge – frames, entities and sentiments – via a semantic language model. We achieve this by first representing these aspects’ semantic units at an appropriate level of abstraction and then using the resulting vector representations for each semantic aspect to learn a joint representation via a neural language model. We show that the joint semantic language model is of high quality and can generate better semantic sequences than models that operate on the word level. We further demonstrate that our joint model can be applied to story cloze test and shallow discourse parsing tasks with improved performance and that each semantic aspect contributes to the model.

pdf bib
Story Comprehension for Predicting What Happens Next
Snigdha Chaturvedi | Haoruo Peng | Dan Roth
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Automatic story comprehension is a fundamental challenge in Natural Language Understanding, and can enable computers to learn about social norms, human behavior and commonsense. In this paper, we present a story comprehension model that explores three distinct semantic aspects: (i) the sequence of events described in the story, (ii) its emotional trajectory, and (iii) its plot consistency. We judge the model’s understanding of real-world stories by inquiring if, like humans, it can develop an expectation of what will happen next in a given story. Specifically, we use it to predict the correct ending of a given short story from possible alternatives. The model uses a hidden variable to weigh the semantic aspects in the context of the story. Our experiments demonstrate the potential of our approach to characterize these semantic aspects, and the strength of the hidden variable based approach. The model outperforms the state-of-the-art approaches and achieves best results on a publicly available dataset.

2016

pdf bib
Feuding Families and Former Friends: Unsupervised Learning for Dynamic Fictional Relationships
Mohit Iyyer | Anupam Guha | Snigdha Chaturvedi | Jordan Boyd-Graber | Hal Daumé III
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

pdf bib
Predicting Instructor’s Intervention in MOOC forums
Snigdha Chaturvedi | Dan Goldwasser | Hal Daumé III
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)