Ruihong Huang


2024

pdf bib
LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing
Jiangshu Du | Yibo Wang | Wenting Zhao | Zhongfen Deng | Shuaiqi Liu | Renze Lou | Henry Peng Zou | Pranav Narayanan Venkit | Nan Zhang | Mukund Srinath | Haoran Ranran Zhang | Vipul Gupta | Yinghui Li | Tao Li | Fei Wang | Qin Liu | Tianlin Liu | Pengzhi Gao | Congying Xia | Chen Xing | Cheng Jiayang | Zhaowei Wang | Ying Su | Raj Sanjay Shah | Ruohao Guo | Jing Gu | Haoran Li | Kangda Wei | Zihao Wang | Lu Cheng | Surangika Ranathunga | Meng Fang | Jie Fu | Fei Liu | Ruihong Huang | Eduardo Blanco | Yixin Cao | Rui Zhang | Philip S. Yu | Wenpeng Yin
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Claim: This work is not advocating the use of LLMs for paper (meta-)reviewing. Instead, wepresent a comparative analysis to identify and distinguish LLM activities from human activities. Two research goals: i) Enable better recognition of instances when someone implicitly uses LLMs for reviewing activities; ii) Increase community awareness that LLMs, and AI in general, are currently inadequate for performing tasks that require a high level of expertise and nuanced judgment.This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload?This study focuses on the topic of LLMs as NLP Researchers, particularly examining the effectiveness of LLMs in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) each review comes with “deficiency” labels and corresponding explanations for individual segments, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) “LLMs as Reviewers”, how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) “LLMs as Metareviewers”, how effectively can LLMs identify potential issues, such as Deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.

pdf bib
Boosting Logical Fallacy Reasoning in LLMs via Logical Structure Tree
Yuanyuan Lei | Ruihong Huang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Logical fallacy uses invalid or faulty reasoning in the construction of a statement. Despite the prevalence and harmfulness of logical fallacies, detecting and classifying logical fallacies still remains a challenging task. We observe that logical fallacies often use connective words to indicate an intended logical relation between two arguments, while the argument semantics does not actually support the logical relation. Inspired by this observation, we propose to build a logical structure tree to explicitly represent and track the hierarchical logic flow among relation connectives and their arguments in a statement. Specifically, this logical structure tree is constructed in an unsupervised manner guided by the constituency tree and a taxonomy of connectives for ten common logical relations, with relation connectives as non-terminal nodes and textual arguments as terminal nodes, and the latter are mostly elementary discourse units. We further develop two strategies to incorporate the logical structure tree into LLMs for fallacy reasoning. Firstly, we transform the tree into natural language descriptions and feed the textualized tree into LLMs as a part of the hard text prompt. Secondly, we derive a relation-aware tree embedding and insert the tree embedding into LLMs as a soft prompt. Experiments on benchmark datasets demonstrate that our approach based on logical structure tree significantly improves precision and recall for both fallacy detection and fallacy classification.

pdf bib
Are LLMs Good Annotators for Discourse-level Event Relation Extraction?
Kangda Wei | Aayush Gautam | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) have demonstrated proficiency in a wide array of natural language processing tasks. However, its effectiveness over discourse-level event relation extraction (ERE) tasks remains unexplored. In this paper, we assess the effectiveness of LLMs in addressing discourse-level ERE tasks characterized by lengthy documents and intricate relations encompassing coreference, temporal, causal, and subevent types. Evaluation is conducted using an commercial model, GPT-3.5, and an open-source model, LLaMA-2. Our study reveals a notable underperformance of LLMs compared to the baseline established through supervised learning. Although Supervised Fine-Tuning (SFT) can improve LLMs performance, it does not scale well compared to the smaller supervised baseline model. Our quantitative and qualitative analysis shows that LLMs have several weaknesses when applied for extracting event relations, including a tendency to fabricate event mentions, and failures to capture transitivity rules among relations, detect long distance relations, or comprehend contexts with dense event mentions.

pdf bib
Evaluating Gender Bias of LLMs in Making Morality Judgements
Divij Bajaj | Yuanyuan Lei | Jonathan Tong | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2024

Large Language Models (LLMs) have shown remarkable capabilities in a multitude of Natural Language Processing (NLP) tasks. However, these models are still not immune to limitations such as social biases, especially gender bias. This work investigates whether current closed and open-source LLMs possess gender bias, especially when asked to give moral opinions. To evaluate these models, we curate and introduce a new dataset GenMO (Gender-bias in Morality Opinions) comprising parallel short stories featuring male and female characters respectively. Specifically, we test models from the GPT family (GPT-3.5-turbo, GPT-3.5-turbo-instruct, GPT-4-turbo), Llama 3 and 3.1 families (8B/70B), Mistral-7B and Claude 3 families (Sonnet and Opus). Surprisingly, despite employing safety checks, all production-standard models we tested display significant gender bias with GPT-3.5-turbo giving biased opinions in 24% of the samples. Additionally, all models consistently favour female characters, with GPT showing bias in 68-85% of cases and Llama 3 in around 81-85% instances. Additionally, our study investigates the impact of model parameters on gender bias and explores real-world situations where LLMs reveal biases in moral decision-making.

pdf bib
Polarity Calibration for Opinion Summarization
Yuanyuan Lei | Kaiqiang Song | Sangwoo Cho | Xiaoyang Wang | Ruihong Huang | Dong Yu
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Opinion summarization is automatically generating summaries from a variety of subjective information, such as product reviews or political opinions. The challenge of opinions summarization lies in presenting divergent or even conflicting opinions. We conduct an analysis of previous summarization models, which reveals their inclination to amplify the polarity bias, emphasizing the majority opinions while ignoring the minority opinions. To address this issue and make the summarizer express both sides of opinions, we introduce the concept of polarity calibration, which aims to align the polarity of output summary with that of input text. Specifically, we develop a reinforcement training approach for polarity calibration. This approach feeds the polarity distance between output summary and input text as reward into the summarizer, and also balance polarity calibration with content preservation and language naturality. We evaluate our Polarity Calibration model (PoCa) on two types of opinions summarization tasks: summarizing product reviews and political opinions articles. Automatic and human evaluation demonstrate that our approach can mitigate the polarity mismatch between output summary and input text, as well as maintain the content semantic and language quality.

pdf bib
Sentence-level Media Bias Analysis with Event Relation Graph
Yuanyuan Lei | Ruihong Huang
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Media outlets are becoming more partisan and polarized nowadays. In this paper, we identify media bias at the sentence level, and pinpoint bias sentences that intend to sway readers’ opinions. As bias sentences are often expressed in a neutral and factual way, considering broader context outside a sentence can help reveal the bias. In particular, we observe that events in a bias sentence need to be understood in associations with other events in the document. Therefore, we propose to construct an event relation graph to explicitly reason about event-event relations for sentence-level bias identification. The designed event relation graph consists of events as nodes and four common types of event relations: coreference, temporal, causal, and subevent relations. Then, we incorporate event relation graph for bias sentences identification in two steps: an event-aware language model is built to inject the events and event relations knowledge into the basic language model via soft labels; further, a relation-aware graph attention network is designed to update sentence embedding with events and event relations information based on hard labels. Experiments on two benchmark datasets demonstrate that our approach with the aid of event relation graph improves both precision and recall of bias sentence identification.

pdf bib
EMONA: Event-level Moral Opinions in News Articles
Yuanyuan Lei | Md Messal Monem Miah | Ayesha Qamar | Sai Ramana Reddy | Jonathan Tong | Haotian Xu | Ruihong Huang
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Most previous research on moral frames has focused on social media short texts, little work has explored moral sentiment within news articles. In news articles, authors often express their opinions or political stance through moral judgment towards events, specifically whether the event is right or wrong according to social moral rules. This paper initiates a new task to understand moral opinions towards events in news articles. We have created a new dataset, EMONA, and annotated event-level moral opinions in news articles. This dataset consists of 400 news articles containing over 10k sentences and 45k events, among which 9,613 events received moral foundation labels. Extracting event morality is a challenging task, as moral judgment towards events can be very implicit. Baseline models were built for event moral identification and classification. In addition, we also conduct extrinsic evaluations to integrate event-level moral opinions into three downstream tasks. The statistical analysis and experiments show that moral opinions of events can serve as informative features for identifying ideological bias or subjective events.

2023

pdf bib
Composition-contrastive Learning for Sentence Embeddings
Sachin Chanchani | Ruihong Huang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Vector representations of natural language are ubiquitous in search applications. Recently, various methods based on contrastive learning have been proposed to learn textual representations from unlabelled data; by maximizing alignment between minimally-perturbed embeddings of the same text, and encouraging a uniform distribution of embeddings across a broader corpus. Differently, we propose maximizing alignment between texts and a composition of their phrasal constituents. We consider several realizations of this objective and elaborate the impact on representations in each case. Experimental results on semantic textual similarity tasks show improvements over baselines that are comparable with state-of-the-art approaches. Moreover, this work is the first to do so without incurring costs in auxiliary training objectives or additional network parameters.

pdf bib
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Danushka Bollegala | Ruihong Huang | Alan Ritter
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)

pdf bib
Hierarchical Fusion for Online Multimodal Dialog Act Classification
Md Messal Monem Miah | Adarsh Pyarelal | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2023

We propose a framework for online multimodal dialog act (DA) classification based on raw audio and ASR-generated transcriptions of current and past utterances. Existing multimodal DA classification approaches are limited by ineffective audio modeling and late-stage fusion. We showcase significant improvements in multimodal DA classification by integrating modalities at a more granular level and incorporating recent advancements in large language and audio models for audio feature extraction. We further investigate the effectiveness of self-attention and cross-attention mechanisms in modeling utterances and dialogs for DA classification. We achieve a substantial increase of 3 percentage points in the F1 score relative to current state-of-the-art models on two prominent DA classification datasets, MRDA and EMOTyDA.

pdf bib
Identifying Conspiracy Theories News based on Event Relation Graph
Yuanyuan Lei | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2023

Conspiracy theories, as a type of misinformation, are narratives that explains an event or situation in an irrational or malicious manner. While most previous work examined conspiracy theory in social media short texts, limited attention was put on such misinformation in long news documents. In this paper, we aim to identify whether a news article contains conspiracy theories. We observe that a conspiracy story can be made up by mixing uncorrelated events together, or by presenting an unusual distribution of relations between events. Achieving a contextualized understanding of events in a story is essential for detecting conspiracy theories. Thus, we propose to incorporate an event relation graph for each article, in which events are nodes, and four common types of event relations, coreference, temporal, causal, and subevent relations, are considered as edges. Then, we integrate the event relation graph into conspiracy theory identification in two ways: an event-aware language model is developed to augment the basic language model with the knowledge of events and event relations via soft labels; further, a heterogeneous graph attention network is designed to derive a graph embedding based on hard labels. Experiments on a large benchmark dataset show that our approach based on event relation graph improves both precision and recall of conspiracy theory identification, and generalizes well for new unseen media sources.

pdf bib
Who is Speaking? Speaker-Aware Multiparty Dialogue Act Classification
Ayesha Qamar | Adarsh Pyarelal | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2023

Utterances do not occur in isolation in dialogues; it is essential to have the information of who the speaker of an utterance is to be able to recover the speaker’s intention with respect to the surrounding context. Beyond simply capturing speaker switches, identifying how speakers interact with each other in a dialogue is crucial to understanding conversational flow. This becomes increasingly important and simultaneously difficult to model when more than two interlocutors take part in a conversation. To overcome this challenge, we propose to explicitly add speaker awareness to each utterance representation. To that end, we use a graph neural network to model how each speaker is behaving within the local context of a conversation. The speaker representations learned this way are then used to update their respective utterance representations. We experiment with both multiparticipant and dyadic conversations on the MRDA and SwDA datasets and show the effectiveness of our approach.

pdf bib
Discourse Structures Guided Fine-grained Propaganda Identification
Yuanyuan Lei | Ruihong Huang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Propaganda is a form of deceptive narratives that instigate or mislead the public, usually with a political purpose. In this paper, we aim to identify propaganda in political news at two fine-grained levels: sentence-level and token-level. We observe that propaganda content is more likely to be embedded in sentences that attribute causality or assert contrast to nearby sentences, as well as seen in opinionated evaluation, speculation and discussions of future expectation. Hence, we propose to incorporate both local and global discourse structures for propaganda discovery and construct two teacher models for identifying PDTB-style discourse relations between nearby sentences and common discourse roles of sentences in a news article respectively. We further devise two methods to incorporate the two types of discourse structures for propaganda identification by either using teacher predicted probabilities as additional features or soliciting guidance in a knowledge distillation framework. Experiments on the benchmark dataset demonstrate that leveraging guidance from discourse structures can significantly improve both precision and recall of propaganda content identification.

pdf bib
All Things Considered: Detecting Partisan Events from News Media with Cross-Article Comparison
Yujian Liu | Xinliang Zhang | Kaijian Zou | Ruihong Huang | Nicholas Beauchamp | Lu Wang
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Public opinion is shaped by the information news media provide, and that information in turn may be shaped by the ideological preferences of media outlets. But while much attention has been devoted to media bias via overt ideological language or topic selection, a more unobtrusive way in which the media shape opinion is via the strategic inclusion or omission of partisan events that may support one side or the other. We develop a latent variable-based framework to predict the ideology of news articles by comparing multiple articles on the same story and identifying partisan events whose inclusion or omission reveals ideology. Our experiments first validate the existence of partisan event selection, and then show that article alignment and cross-document comparison detect partisan events and article ideology better than competitive baselines. Our results reveal the high-level form of media bias, which is present even among mainstream media with strong norms of objectivity and nonpartisanship. Our codebase and dataset are available at https://github.com/launchnlp/ATC.

pdf bib
Semi-supervised News Discourse Profiling with Contrastive Learning
Ming Li | Ruihong Huang
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

2022

pdf bib
Crossroads, Buildings and Neighborhoods: A Dataset for Fine-grained Location Recognition
Pei Chen | Haotian Xu | Cheng Zhang | Ruihong Huang
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

General domain Named Entity Recognition (NER) datasets like CoNLL-2003 mostly annotate coarse-grained location entities such as a country or a city. But many applications require identifying fine-grained locations from texts and mapping them precisely to geographic sites, e.g., a crossroad, an apartment building, or a grocery store. In this paper, we introduce a new dataset HarveyNER with fine-grained locations annotated in tweets. This dataset presents unique challenges and characterizes many complex and long location mentions in informal descriptions. We built strong baseline models using Curriculum Learning and experimented with different heuristic curricula to better recognize difficult location mentions. Experimental results show that the simple curricula can improve the system’s performance on hard cases and its overall performance, and outperform several other baseline systems. The dataset and the baseline models can be found at https://github.com/brickee/HarveyNER.

pdf bib
Predicting Sentence Deletions for Text Simplification Using a Functional Discourse Structure
Bohan Zhang | Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Document-level text simplification often deletes some sentences besides performing lexical, grammatical or structural simplification to reduce text complexity. In this work, we focus on sentence deletions for text simplification and use a news genre-specific functional discourse structure, which categorizes sentences based on their contents and their function roles in telling a news story, for predicting sentence deletion. We incorporate sentence categories into a neural net model in two ways for predicting sentence deletions, either as additional features or by jointly predicting sentence deletions and sentence categories. Experimental results using human-annotated data show that incorporating the functional structure improves the recall of sentence deletion prediction by 6.5% and 10.7% respectively using the two methods, and improves the overall F1-score by 3.6% and 4.3% respectively.

pdf bib
Sentence-level Media Bias Analysis Informed by Discourse Structures
Yuanyuan Lei | Ruihong Huang | Lu Wang | Nick Beauchamp
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

As polarization continues to rise among both the public and the news media, increasing attention has been devoted to detecting media bias. Most recent work in the NLP community, however, identify bias at the level of individual articles. However, each article itself comprises multiple sentences, which vary in their ideological bias. In this paper, we aim to identify sentences within an article that can illuminate and explain the overall bias of the entire article. We show that understanding the discourse role of a sentence in telling a news story, as well as its relation with nearby sentences, can reveal the ideological leanings of an author even when the sentence itself appears merely neutral. In particular, we consider using a functional news discourse structure and PDTB discourse relations to inform bias sentence identification, and distill the auxiliary knowledge from the two types of discourse structure into our bias sentence identification system. Experimental results on benchmark datasets show that incorporating both the global functional discourse structure and local rhetorical discourse relations can effectively increase the recall of bias sentence identification by 8.27% - 8.62%, as well as increase the precision by 2.82% - 3.48%.

pdf bib
Few-Shot (Dis)Agreement Identification in Online Discussions with Regularized and Augmented Meta-Learning
Yuanyuan Lei | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2022

Online discussions are abundant with opinions towards a common topic, and identifying (dis)agreement between a pair of comments enables many opinion mining applications. Realizing the increasing needs to analyze opinions for emergent new topics that however tend to lack annotations, we present the first meta-learning approach for few-shot (dis)agreement identification that can be quickly applied to analyze opinions for new topics with few labeled instances. Furthermore, we enhance the meta-learner’s domain generalization ability from two perspectives. The first is domain-invariant regularization, where we design a lexicon-based regularization loss to enable the meta-learner to learn domain-invariant cues. The second is domain-aware augmentation, where we propose domain-aware task augmentation for meta-training to learn domain-specific expressions. In addition to using an existing dataset, we also evaluate our approach on two very recent new topics, mask mandate and COVID vaccine, using our newly annotated datasets containing 1.5k and 1.4k SubReddits comment pairs respectively. Extensive experiments on three domains/topics demonstrate the effectiveness of our meta-learning approach.

pdf bib
Modeling Document-level Temporal Structures for Building Temporal Dependency Graphs
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We propose to leverage news discourse profiling to model document-level temporal structures for building temporal dependency graphs. Our key observation is that the functional roles of sentences used for profiling news discourse signify different time frames relevant to a news story and can, therefore, help to recover the global temporal structure of a document. Our analyses and experiments with the widely used knowledge distillation technique show that discourse profiling effectively identifies distant inter-sentence event and (or) time expression pairs that are temporally related and otherwise difficult to locate.

2021

pdf bib
Explicitly Capturing Relations between Entity Mentions via Graph Neural Networks for Domain-specific Named Entity Recognition
Pei Chen | Haibo Ding | Jun Araki | Ruihong Huang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Named entity recognition (NER) is well studied for the general domain, and recent systems have achieved human-level performance for identifying common entity types. However, the NER performance is still moderate for specialized domains that tend to feature complicated contexts and jargonistic entity types. To address these challenges, we propose explicitly connecting entity mentions based on both global coreference relations and local dependency relations for building better entity mention representations. In our experiments, we incorporate entity mention relations by Graph Neural Networks and show that our system noticeably improves the NER performance on two datasets from different domains. We further show that the proposed lightweight system can effectively elevate the NER performance to a higher level even when only a tiny amount of labeled data is available, which is desirable for domain-specific NER.

pdf bib
Automatic Data Acquisition for Event Coreference Resolution
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

We propose to leverage lexical paraphrases and high precision rules informed by news discourse structure to automatically collect coreferential and non-coreferential event pairs from unlabeled English news articles. We perform both manual validation and empirical evaluation on multiple evaluation datasets with different event domains and text genres to assess the quality of our acquired event pairs. We found that a model trained on our acquired event pairs performs comparably as the supervised model when applied to new data out of the training data domains. Further, augmenting human-annotated data with the acquired event pairs provides empirical performance gains on both in-domain and out-of-domain evaluation datasets.

pdf bib
A Joint Model for Structure-based News Genre Classification with Application to Text Summarization
Zeyu Dai | Ruihong Huang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Profiling News Discourse Structure Using Explicit Subtopic Structures Guided Critics
Prafulla Kumar Choubey | Ruihong Huang
Findings of the Association for Computational Linguistics: EMNLP 2021

We present an actor-critic framework to induce subtopical structures in a news article for news discourse profiling. The model uses multiple critics that act according to known subtopic structures while the actor aims to outperform them. The content structures constitute sentences that represent latent subtopic boundaries. Then, we introduce a hierarchical neural network that uses the identified subtopic boundary sentences to model multi-level interaction between sentences, subtopics, and the document. Experimental results and analyses on the NewsDiscourse corpus show that the actor model learns to effectively segment a document into subtopics and improves the performance of the hierarchical model on the news discourse profiling task.

2020

pdf bib
Reconstructing Event Regions for Event Extraction via Graph Attention Networks
Pei Chen | Hang Yang | Kang Liu | Ruihong Huang | Yubo Chen | Taifeng Wang | Jun Zhao
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Event information is usually scattered across multiple sentences within a document. The local sentence-level event extractors often yield many noisy event role filler extractions in the absence of a broader view of the document-level context. Filtering spurious extractions and aggregating event information in a document remains a challenging problem. Following the observation that a document has several relevant event regions densely populated with event role fillers, we build graphs with candidate role filler extractions enriched by sentential embeddings as nodes, and use graph attention networks to identify event regions in a document and aggregate event information. We characterize edges between candidate extractions in a graph into rich vector representations to facilitate event region identification. The experimental results on two datasets of two languages show that our approach yields new state-of-the-art performance for the challenging event extraction task.

pdf bib
One Classifier for All Ambiguous Words: Overcoming Data Sparsity by Utilizing Sense Correlations Across Words
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the Twelfth Language Resources and Evaluation Conference

Most supervised word sense disambiguation (WSD) systems build word-specific classifiers by leveraging labeled data. However, when using word-specific classifiers, the sparseness of annotations leads to inferior sense disambiguation performance on less frequently seen words. To combat data sparsity, we propose to learn a single model that derives sense representations and meanwhile enforces congruence between a word instance and its right sense by using both sense-annotated data and lexical resources. The model is shared across words that allows utilizing sense correlations across words, and therefore helps to transfer common disambiguation rules from annotation-rich words to annotation-lean words. Empirical evaluation on benchmark datasets shows that the proposed shared model outperforms the equivalent classifier-based models by 1.7%, 2.5% and 3.8% in F1-score when using GloVe, ELMo and BERT word embeddings respectively.

pdf bib
Discourse as a Function of Event: Profiling Discourse Structure in News Articles around the Main Event
Prafulla Kumar Choubey | Aaron Lee | Ruihong Huang | Lu Wang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Understanding discourse structures of news articles is vital to effectively contextualize the occurrence of a news event. To enable computational modeling of news structures, we apply an existing theory of functional discourse structure for news articles that revolves around the main event and create a human-annotated corpus of 802 documents spanning over four domains and three media sources. Next, we propose several document-level neural-network models to automatically construct news content structures. Finally, we demonstrate that incorporating system predicted news structures yields new state-of-the-art performance for event coreference resolution. The news documents we annotated are openly available and the annotations are publicly released for future research.

pdf bib
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events
Claire Bonial | Tommaso Caselli | Snigdha Chaturvedi | Elizabeth Clark | Ruihong Huang | Mohit Iyyer | Alejandro Jaimes | Heng Ji | Lara J. Martin | Ben Miller | Teruko Mitamura | Nanyun Peng | Joel Tetreault
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events

pdf bib
Weakly Supervised Subevent Knowledge Acquisition
Wenlin Yao | Zeyu Dai | Maitreyi Ramaswamy | Bonan Min | Ruihong Huang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Subevents elaborate an event and widely exist in event descriptions. Subevent knowledge is useful for discourse analysis and event-centric applications. Acknowledging the scarcity of subevent knowledge, we propose a weakly supervised approach to extract subevent relation tuples from text and build the first large scale subevent knowledge base. We first obtain the initial set of event pairs that are likely to have the subevent relation, by exploiting two observations that 1) subevents are temporally contained by the parent event, and 2) the definitions of the parent event can be used to further guide the identification of subevents. Then, we collect rich weak supervision using the initial seed subevent pairs to train a contextual classifier using BERT and apply the classifier to identify new subevent pairs. The evaluation showed that the acquired subevent tuples (239K) are of high quality (90.1% accuracy) and cover a wide range of event types. The acquired subevent knowledge has been shown useful for discourse analysis and identifying a range of event-event relations.

pdf bib
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
Yun He | Zhuoer Wang | Yin Zhang | Ruihong Huang | James Caverlee
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We present a new benchmark dataset called PARADE for paraphrase identification that requires specialized domain knowledge. PARADE contains paraphrases that overlap very little at the lexical and syntactic level but are semantically equivalent based on computer science domain knowledge, as well as non-paraphrases that overlap greatly at the lexical and syntactic level but are not semantically equivalent based on this domain knowledge. Experiments show that both state-of-the-art neural models and non-expert human annotators have poor performance on PARADE. For example, BERT after fine-tuning achieves an F1 score of 0.709, which is much lower than its performance on other paraphrase identification datasets. PARADE can serve as a resource for researchers interested in testing models that incorporate domain knowledge. We make our data and code freely available.

2019

pdf bib
Improving Dialogue State Tracking by Discerning the Relevant Context
Sanuj Sharma | Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

A typical conversation comprises of multiple turns between participants where they go back and forth between different topics. At each user turn, dialogue state tracking (DST) aims to estimate user’s goal by processing the current utterance. However, in many turns, users implicitly refer to the previous goal, necessitating the use of relevant dialogue history. Nonetheless, distinguishing relevant history is challenging and a popular method of using dialogue recency for that is inefficient. We, therefore, propose a novel framework for DST that identifies relevant historical context by referring to the past utterances where a particular slot-value changes and uses that together with weighted system utterance to identify the relevant context. Specifically, we use the current user utterance and the most recent system utterance to determine the relevance of a system utterance. Empirical analyses show that our method improves joint goal accuracy by 2.75% and 2.36% on WoZ 2.0 and Multi-WoZ restaurant domain datasets respectively over the previous state-of-the-art GLAD model.

pdf bib
Modeling Document-level Causal Structures for Event Causal Relation Identification
Lei Gao | Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We aim to comprehensively identify all the event causal relations in a document, both within a sentence and across sentences, which is important for reconstructing pivotal event structures. The challenges we identified are two: 1) event causal relations are sparse among all possible event pairs in a document, in addition, 2) few causal relations are explicitly stated. Both challenges are especially true for identifying causal relations between events across sentences. To address these challenges, we model rich aspects of document-level causal structures for achieving comprehensive causal relation identification. The causal structures include heavy involvements of document-level main events in causal relations as well as several types of fine-grained constraints that capture implications from certain sentential syntactic relations and discourse relations as well as interactions between event causal relations and event coreference relations. Our experimental results show that modeling the global and fine-grained aspects of causal structures using Integer Linear Programming (ILP) greatly improves the performance of causal relation identification, especially in identifying cross-sentence causal relations.

pdf bib
A Regularization Approach for Incorporating Event Knowledge and Coreference Relations into Neural Discourse Parsing
Zeyu Dai | Ruihong Huang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We argue that external commonsense knowledge and linguistic constraints need to be incorporated into neural network models for mitigating data sparsity issues and further improving the performance of discourse parsing. Realizing that external knowledge and linguistic constraints may not always apply in understanding a particular context, we propose a regularization approach that tightly integrates these constraints with contexts for deriving word representations. Meanwhile, it balances attentions over contexts and constraints through adding a regularization term into the objective function. Experiments show that our knowledge regularization approach outperforms all previous systems on the benchmark dataset PDTB for discourse parsing.

pdf bib
In Plain Sight: Media Bias Through the Lens of Factual Reporting
Lisa Fan | Marshall White | Eva Sharma | Ruisi Su | Prafulla Kumar Choubey | Ruihong Huang | Lu Wang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

The increasing prevalence of political bias in news media calls for greater public awareness of it, as well as robust methods for its detection. While prior work in NLP has primarily focused on the lexical bias captured by linguistic attributes such as word choice and syntax, other types of bias stem from the actual content selected for inclusion in the text. In this work, we investigate the effects of informational bias: factual content that can nevertheless be deployed to sway reader opinion. We first produce a new dataset, BASIL, of 300 news articles annotated with 1,727 bias spans and find evidence that informational bias appears in news articles more frequently than lexical bias. We further study our annotations to observe how informational bias surfaces in news articles by different media outlets. Lastly, a baseline model for informational bias prediction is presented by fine-tuning BERT on our labeled data, indicating the challenges of the task and future directions.

pdf bib
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations
Sebastian Padó | Ruihong Huang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

2018

pdf bib
Improving Implicit Discourse Relation Classification by Modeling Inter-dependencies of Discourse Units in a Paragraph
Zeyu Dai | Ruihong Huang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We argue that semantic meanings of a sentence or clause can not be interpreted independently from the rest of a paragraph, or independently from all discourse relations and the overall paragraph-level discourse structure. With the goal of improving implicit discourse relation classification, we introduce a paragraph-level neural networks that model inter-dependencies between discourse units as well as discourse relation continuity and patterns, and predict a sequence of discourse relations in a paragraph. Experimental results show that our model outperforms the previous state-of-the-art systems on the benchmark corpus of PDTB.

pdf bib
Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations
Prafulla Kumar Choubey | Kaushik Raju | Ruihong Huang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

Identifying the most dominant and central event of a document, which governs and connects other foreground and background events in the document, is useful for many applications, such as text summarization, storyline generation and text segmentation. We observed that the central event of a document usually has many coreferential event mentions that are scattered throughout the document for enabling a smooth transition of subtopics. Our empirical experiments, using gold event coreference relations, have shown that the central event of a document can be well identified by mining properties of event coreference chains. But the performance drops when switching to system predicted event coreference relations. In addition, we found that the central event can be more accurately identified by further considering the number of sub-events as well as the realis status of an event.

pdf bib
Improving Event Coreference Resolution by Modeling Correlations between Event Coreference Chains and Document Topic Structures
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper proposes a novel approach for event coreference resolution that models correlations between event coreference chains and document topical structures through an Integer Linear Programming formulation. We explicitly model correlations between the main event chains of a document with topic transition sentences, inter-coreference chain correlations, event mention distributional characteristics and sub-event structure, and use them with scores obtained from a local coreference relation classifier for jointly resolving multiple event chains in a document. Our experiments across KBP 2016 and 2017 datasets suggest that each of the structures contribute to improving event coreference resolution performance.

pdf bib
Temporal Event Knowledge Acquisition via Identifying Narratives
Wenlin Yao | Ruihong Huang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Inspired by the double temporality characteristic of narrative texts, we propose a novel approach for acquiring rich temporal “before/after” event knowledge across sentences in narrative stories. The double temporality states that a narrative story often describes a sequence of events following the chronological order and therefore, the temporal order of events matches with their textual order. We explored narratology principles and built a weakly supervised approach that identifies 287k narrative paragraphs from three large corpora. We then extracted rich temporal event knowledge from these narrative paragraphs. Such event knowledge is shown useful to improve temporal relation classification and outperforms several recent neural network models on the narrative cloze task.

pdf bib
Domain-Sensitive Temporal Tagging By Jannik Strötgen, Michael Gertz
Ruihong Huang
Computational Linguistics, Volume 44, Issue 2 - June 2018

pdf bib
Fine-grained Structure-based News Genre Categorization
Zeyu Dai | Himanshu Taneja | Ruihong Huang
Proceedings of the Workshop Events and Stories in the News 2018

Journalists usually organize and present the contents of a news article following a well-defined structure. In this work, we propose a new task to categorize news articles based on their content presentation structures, which is beneficial for various NLP applications. We first define a small set of news elements considering their functions (e.g., introducing the main story or event, catching the reader’s attention and providing details) in a news story and their writing style (narrative or expository), and then formally define four commonly used news article structures based on their selections and organizations of news elements. We create an annotated dataset for structure-based news genre identification, and finally, we build a predictive model to assess the feasibility of this classification task using structure indicative features.

pdf bib
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)
Darja Fišer | Ruihong Huang | Vinodkumar Prabhakaran | Rob Voigt | Zeerak Waseem | Jacqueline Wernimont
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)

pdf bib
Building Context-aware Clause Representations for Situation Entity Type Classification
Zeyu Dai | Ruihong Huang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Capabilities to categorize a clause based on the type of situation entity (e.g., events, states and generic statements) the clause introduces to the discourse can benefit many NLP applications. Observing that the situation entity type of a clause depends on discourse functions the clause plays in a paragraph and the interpretation of discourse functions depends heavily on paragraph-wide contexts, we propose to build context-aware clause representations for predicting situation entity types of clauses. Specifically, we propose a hierarchical recurrent neural network model to read a whole paragraph at a time and jointly learn representations for all the clauses in the paragraph by extensively modeling context influences and inter-dependencies of clauses. Experimental results show that our model achieves the state-of-the-art performance for clause-level situation entity classification on the genre-rich MASC+Wiki corpus, which approaches human-level performance.

2017

pdf bib
Recognizing Explicit and Implicit Hate Speech Using a Weakly Supervised Two-path Bootstrapping Approach
Lei Gao | Alexis Kuppersmith | Ruihong Huang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In the wake of a polarizing election, social media is laden with hateful content. To address various limitations of supervised hate speech classification methods including corpus bias and huge cost of annotation, we propose a weakly supervised two-path bootstrapping approach for an online hate speech detection model leveraging large-scale unlabeled data. This system significantly outperforms hate speech detection systems that are trained in a supervised manner using manually annotated data. Applying this model on a large quantity of tweets collected before, after, and on election day reveals motivations and patterns of inflammatory language.

pdf bib
Using Context Events in Neural Network Models for Event Temporal Status Identification
Zeyu Dai | Wenlin Yao | Ruihong Huang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Focusing on the task of identifying event temporal status, we find that events directly or indirectly governing the target event in a dependency tree are most important contexts. Therefore, we extract dependency chains containing context events and use them as input in neural network models, which consistently outperform previous models using local context words as input. Visualization verifies that the dependency chain representation can effectively capture the context events which are closely related to the target event and play key roles in predicting event temporal status.

pdf bib
Detecting Online Hate Speech Using Context Aware Models
Lei Gao | Ruihong Huang
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

In the wake of a polarizing election, the cyber world is laden with hate speech. Context accompanying a hate speech text is useful for identifying hate speech, which however has been largely overlooked in existing datasets and hate speech detection models. In this paper, we provide an annotated corpus of hate speech with context information well kept. Then we propose two types of hate speech detection models that incorporate context information, a logistic regression model with context features and a neural network model with learning components for context. Our evaluation shows that both models outperform a strong baseline by around 3% to 4% in F1 score and combining these two models further improve the performance by another 7% in F1 score.

pdf bib
Online Deception Detection Refueled by Real World Data Collection
Wenlin Yao | Zeyu Dai | Ruihong Huang | James Caverlee
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

The lack of large realistic datasets presents a bottleneck in online deception detection studies. In this paper, we apply a data collection method based on social network analysis to quickly identify high quality deceptive and truthful online reviews1 from Amazon. The dataset contains more than 10,000 deceptive reviews and is diverse in product domains and reviewers. Using this dataset, we explore effective general features for online deception detection that perform well across domains. We demonstrate that with generalized features – advertising speak and writing complexity scores – deception detection performance can be further improved by adding additional deceptive reviews from assorted domains in training. Finally, reviewer level evaluation gives an interesting insight into different deceptive reviewers’ writing styles.

pdf bib
A Weakly Supervised Approach to Train Temporal Relation Classifiers and Acquire Regular Event Pairs Simultaneously
Wenlin Yao | Saipravallika Nettyam | Ruihong Huang
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

Capabilities of detecting temporal and causal relations between two events can benefit many applications. Most of existing temporal relation classifiers were trained in a supervised manner. Instead, we explore the observation that regular event pairs show a consistent temporal relation despite of their various contexts and these rich contexts can be used to train a contextual temporal relation classifier, which can further recognize new temporal relation contexts and identify new regular event pairs. We focus on detecting after and before temporal relations and design a weakly supervised learning approach that extracts thousands of regular event pairs and learns a contextual temporal relation classifier simultaneously. Evaluation shows that the acquired regular event pairs are of high quality and contain rich commonsense knowledge and domain specific knowledge. In addition, the weakly supervised trained temporal relation classifier achieves comparable performance with the state-of-the-art supervised systems.

pdf bib
A Sequential Model for Classifying Temporal Relations between Intra-Sentence Events
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We present a sequential model for temporal relation classification between intra-sentence events. The key observation is that the overall syntactic structure and compositional meanings of the multi-word context between events are important for distinguishing among fine-grained temporal relations. Specifically, our approach first extracts a sequence of context words that indicates the temporal relation between two events, which well align with the dependency path between two event mentions. The context word sequence, together with a parts-of-speech tag sequence and a dependency relation sequence that are generated corresponding to the word sequence, are then provided as input to bidirectional recurrent neural network (LSTM) models. The neural nets learn compositional syntactic and semantic representations of contexts surrounding the two events and predict the temporal relation between them. Evaluation of the proposed approach on TimeBank corpus shows that sequential modeling is capable of accurately recognizing temporal relations between events, which outperforms a neural net model using various discrete features as input that imitates previous feature based models.

pdf bib
Event Coreference Resolution by Iteratively Unfolding Inter-dependencies among Events
Prafulla Kumar Choubey | Ruihong Huang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

We introduce a novel iterative approach for event coreference resolution that gradually builds event clusters by exploiting inter-dependencies among event mentions within the same chain as well as across event chains. Among event mentions in the same chain, we distinguish within- and cross-document event coreference links by using two distinct pairwise classifiers, trained separately to capture differences in feature distributions of within- and cross-document event clusters. Our event coreference approach alternates between WD and CD clustering and combines arguments from both event clusters after every merge, continuing till no more merge can be made. And then it performs further merging between event chains that are both closely related to a set of other chains of events. Experiments on the ECB+ corpus show that our model outperforms state-of-the-art methods in joint task of WD and CD event coreference resolution.

2016

pdf bib
Distinguishing Past, On-going, and Future Events: The EventStatus Corpus
Ruihong Huang | Ignacio Cases | Dan Jurafsky | Cleo Condoravdi | Ellen Riloff
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Extracting Subevents via an Effective Two-phase Approach
Allison Badgett | Ruihong Huang
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Towards Accurate Event Detection in Social Media: A Weakly Supervised Approach for Learning Implicit Event Indicators
Ajit Jain | Girish Kasiviswanathan | Ruihong Huang
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)

Accurate event detection in social media is very challenging because user generated contents are extremely noisy and sparse in content. Event indicators are generally words or phrases that act as a trigger that help us understand the semantics of the context they occur in. We present a weakly supervised approach that relies on using a single strong event indicator phrase as a seed to acquire a variety of additional event cues. We propose to leverage various types of implicit event indicators, such as props, actors and precursor events, to achieve precise event detection. We experimented with civil unrest events and show that the automatically learnt event indicators are effective in identifying specific types of events.

pdf bib
Learning Event Expressions via Bilingual Structure Projection
Fangyuan Li | Ruihong Huang | Deyi Xiong | Min Zhang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Identifying events of a specific type is a challenging task as events in texts are described in numerous and diverse ways. Aiming to resolve high complexities of event descriptions, previous work (Huang and Riloff, 2013) proposes multi-faceted event recognition and a bootstrapping method to automatically acquire both event facet phrases and event expressions from unannotated texts. However, to ensure high quality of learned phrases, this method is constrained to only learn phrases that match certain syntactic structures. In this paper, we propose a bilingual structure projection algorithm that explores linguistic divergences between two languages (Chinese and English) and mines new phrases with new syntactic structures, which have been ignored in the previous work. Experiments show that our approach can successfully find novel event phrases and structures, e.g., phrases headed by nouns. Furthermore, the newly mined phrases are capable of recognizing additional event descriptions and increasing the recall of event recognition.

pdf bib
CaseSummarizer: A System for Automated Summarization of Legal Texts
Seth Polsley | Pooja Jhunjhunwala | Ruihong Huang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

Attorneys, judges, and others in the justice system are constantly surrounded by large amounts of legal text, which can be difficult to manage across many cases. We present CaseSummarizer, a tool for automated text summarization of legal documents which uses standard summary methods based on word frequency augmented with additional domain-specific knowledge. Summaries are then provided through an informative interface with abbreviations, significance heat maps, and other flexible controls. It is evaluated using ROUGE and human scoring against several other summarization systems, including summary text and feedback provided by domain experts.

2013

pdf bib
Sarcasm as Contrast between a Positive Sentiment and Negative Situation
Ellen Riloff | Ashequl Qadir | Prafulla Surve | Lalindra De Silva | Nathan Gilbert | Ruihong Huang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Classifying Message Board Posts with an Extracted Lexicon of Patient Attributes
Ruihong Huang | Ellen Riloff
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
Multi-faceted Event Recognition with Bootstrapped Dictionaries
Ruihong Huang | Ellen Riloff
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2012

pdf bib
Bootstrapped Training of Event Extraction Classifiers
Ruihong Huang | Ellen Riloff
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

pdf bib
Peeling Back the Layers: Detecting Event Role Fillers in Secondary Contexts
Ruihong Huang | Ellen Riloff
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
Inducing Domain-Specific Semantic Class Taggers from (Almost) Nothing
Ruihong Huang | Ellen Riloff
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2008

pdf bib
Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models
Yuanyong Feng | Ruihong Huang | Le Sun
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

Search
Co-authors