Yangqiu Song


2021

pdf bib
A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution in English
Hongming Zhang | Xinran Zhao | Yangqiu Song
Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference

Pronoun Coreference Resolution (PCR) is the task of resolving pronominal expressions to all mentions they refer to. Compared with the general coreference resolution task, the main challenge of PCR is the coreference relation prediction rather than the mention detection. As one important natural language understanding (NLU) component, pronoun resolution is crucial for many downstream tasks and still challenging for existing models, which motivates us to survey existing approaches and think about how to do better. In this survey, we first introduce representative datasets and models for the ordinary pronoun coreference resolution task. Then we focus on recent progress on hard pronoun coreference resolution problems (e.g., Winograd Schema Challenge) to analyze how well current models can understand commonsense. We conduct extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications (e.g., all SOTA models struggle on correctly resolving pronouns to infrequent objects). All experiment codes will be available upon acceptance.

pdf bib
Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model
Hongliang Dai | Yangqiu Song | Haixun Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recently, there is an effort to extend fine-grained entity typing by using a richer and ultra-fine set of types, and labeling noun phrases including pronouns and nominal nouns instead of just named entity mentions. A key challenge for this ultra-fine entity typing task is that human annotated data are extremely scarce, and the annotation ability of existing distant or weak supervision approaches is very limited. To remedy this problem, in this paper, we propose to obtain training data for ultra-fine entity typing by using a BERT Masked Language Model (MLM). Given a mention in a sentence, our approach constructs an input for the BERT MLM so that it predicts context dependent hypernyms of the mention, which can be used as type labels. Experimental results demonstrate that, with the help of these automatically generated labels, the performance of an ultra-fine entity typing model can be improved substantially. We also show that our approach can be applied to improve traditional fine-grained entity typing after performing simple type mapping.

pdf bib
Exploring Discourse Structures for Argument Impact Classification
Xin Liu | Jiefu Ou | Yangqiu Song | Xin Jiang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Discourse relations among arguments reveal logical structures of a debate conversation. However, no prior work has explicitly studied how the sequence of discourse relations influence a claim’s impact. This paper empirically shows that the discourse relations between two arguments along the context path are essential factors for identifying the persuasive power of an argument. We further propose DisCOC to inject and fuse the sentence-level structural discourse information with contextualized features derived from large-scale language models. Experimental results and extensive analysis show that the attention and gate mechanisms that explicitly model contexts and texts can indeed help the argument impact classification task defined by Durmus et al. (2019), and discourse structures among the context path of the claim to be classified can further boost the performance.

pdf bib
Probing Toxic Content in Large Pre-Trained Language Models
Nedjma Ousidhoum | Xinran Zhao | Tianqing Fang | Yangqiu Song | Dit-Yan Yeung
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Large pre-trained language models (PTLMs) have been shown to carry biases towards different social groups which leads to the reproduction of stereotypical and toxic content by major NLP systems. We propose a method based on logistic regression classifiers to probe English, French, and Arabic PTLMs and quantify the potentially harmful content that they convey with respect to a set of templates. The templates are prompted by a name of a social group followed by a cause-effect relation. We use PTLMs to predict masked tokens at the end of a sentence in order to examine how likely they enable toxicity towards specific communities. We shed the light on how such negative content can be triggered within unrelated and benign contexts based on evidence from a large-scale study, then we explain how to take advantage of our methodology to assess and mitigate the toxicity transmitted by PTLMs.

pdf bib
Joint Coreference Resolution and Character Linking for Multiparty Conversation
Jiaxin Bai | Hongming Zhang | Yangqiu Song | Kun Xu
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Character linking, the task of linking mentioned people in conversations to the real world, is crucial for understanding the conversations. For the efficiency of communication, humans often choose to use pronouns (e.g., “she”) or normal entities (e.g., “that girl”) rather than named entities (e.g., “Rachel”) in the spoken language, which makes linking those mentions to real people a much more challenging than a regular entity linking task. To address this challenge, we propose to incorporate the richer context from the coreference relations among different mentions to help the linking. On the other hand, considering that finding coreference clusters itself is not a trivial task and could benefit from the global character information, we propose to jointly solve these two tasks. Specifically, we propose Cˆ2, the joint learning model of Coreference resolution and Character linking. The experimental results demonstrate that Cˆ2 can significantly outperform previous works on both tasks. Further analyses are conducted to analyze the contribution of all modules in the proposed model and the effect of all hyper-parameters.

pdf bib
Variational Weakly Supervised Sentiment Analysis with Posterior Regularization
Ziqian Zeng | Yangqiu Song
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Sentiment analysis is an important task in natural language processing (NLP). Most of existing state-of-the-art methods are under the supervised learning paradigm. However, human annotations can be scarce. Thus, we should leverage more weak supervision for sentiment analysis. In this paper, we propose a posterior regularization framework for the variational approach to the weakly supervised sentiment analysis to better control the posterior distribution of the label assignment. The intuition behind the posterior regularization is that if extracted opinion words from two documents are semantically similar, the posterior distributions of two documents should be similar. Our experimental results show that the posterior regularization can improve the original variational approach to the weakly supervised sentiment analysis and the performance is more stable with smaller prediction variance.

pdf bib
Exophoric Pronoun Resolution in Dialogues with Topic Regularization
Xintong Yu | Hongming Zhang | Yangqiu Song | Changshui Zhang | Kun Xu | Dong Yu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Resolving pronouns to their referents has long been studied as a fundamental natural language understanding problem. Previous works on pronoun coreference resolution (PCR) mostly focus on resolving pronouns to mentions in text while ignoring the exophoric scenario. Exophoric pronouns are common in daily communications, where speakers may directly use pronouns to refer to some objects present in the environment without introducing the objects first. Although such objects are not mentioned in the dialogue text, they can often be disambiguated by the general topics of the dialogue. Motivated by this, we propose to jointly leverage the local context and global topics of dialogues to solve the out-of-text PCR problem. Extensive experiments demonstrate the effectiveness of adding topic regularization for resolving exophoric pronouns.

pdf bib
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset
Tianqing Fang | Weiqi Wang | Sehyun Choi | Shibo Hao | Hongming Zhang | Yangqiu Song | Bin He
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Reasoning over commonsense knowledge bases (CSKB) whose elements are in the form of free-text is an important yet hard task in NLP. While CSKB completion only fills the missing links within the domain of the CSKB, CSKB population is alternatively proposed with the goal of reasoning unseen assertions from external resources. In this task, CSKBs are grounded to a large-scale eventuality (activity, state, and event) graph to discriminate whether novel triples from the eventuality graph are plausible or not. However, existing evaluations on the population task are either not accurate (automatic evaluation with randomly sampled negative examples) or of small scale (human annotation). In this paper, we benchmark the CSKB population task with a new large-scale dataset by first aligning four popular CSKBs, and then presenting a high-quality human-annotated evaluation set to probe neural models’ commonsense reasoning ability. We also propose a novel inductive commonsense reasoning model that reasons over graphs. Experimental results show that generalizing commonsense reasoning on unseen assertions is inherently a hard task. Models achieving high accuracy during training perform poorly on the evaluation set, with a large gap between human performance. We will make the data publicly available for future contributions. Codes and data are available at https://github.com/HKUST-KnowComp/CSKB-Population.

2020

pdf bib
A Chinese Corpus for Fine-grained Entity Typing
Chin Lee | Hongliang Dai | Yangqiu Song | Xin Li
Proceedings of the 12th Language Resources and Evaluation Conference

Fine-grained entity typing is a challenging task with wide applications. However, most existing datasets for this task are in English. In this paper, we introduce a corpus for Chinese fine-grained entity typing that contains 4,800 mentions manually labeled through crowdsourcing. Each mention is annotated with free-form entity types. To make our dataset useful in more possible scenarios, we also categorize all the fine-grained types into 10 general types. Finally, we conduct experiments with some neural models whose structures are typical in fine-grained entity typing and show how well they perform on our dataset. We also show the possibility of improving Chinese fine-grained entity typing through cross-lingual transfer learning.

pdf bib
Analogous Process Structure Induction for Sub-event Sequence Prediction
Hongming Zhang | Muhao Chen | Haoyu Wang | Yangqiu Song | Dan Roth
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Computational and cognitive studies of event understanding suggest that identifying, comprehending, and predicting events depend on having structured representations of a sequence of events and on conceptualizing (abstracting) its components into (soft) event categories. Thus, knowledge about a known process such as “buying a car” can be used in the context of a new but analogous process such as “buying a house”. Nevertheless, most event understanding work in NLP is still at the ground level and does not consider abstraction. In this paper, we propose an Analogous Process Structure Induction (APSI) framework, which leverages analogies among processes and conceptualization of sub-event instances to predict the whole sub-event sequence of previously unseen open-domain processes. As our experiments and analysis indicate, APSI supports the generation of meaningful sub-event sequences for unseen processes and can help predict missing events.

pdf bib
Comparative Evaluation of Label-Agnostic Selection Bias in Multilingual Hate Speech Datasets
Nedjma Ousidhoum | Yangqiu Song | Dit-Yan Yeung
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Work on bias in hate speech typically aims to improve classification performance while relatively overlooking the quality of the data. We examine selection bias in hate speech in a language and label independent fashion. We first use topic models to discover latent semantics in eleven hate speech corpora, then, we present two bias evaluation metrics based on the semantic similarity between topics and search words frequently used to build corpora. We discuss the possibility of revising the data collection process by comparing datasets and analyzing contrastive case studies.

pdf bib
When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
Changlong Yu | Jialong Han | Peifeng Wang | Yangqiu Song | Hongming Zhang | Wilfred Ng | Shuming Shi
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We address hypernymy detection, i.e., whether an is-a relationship exists between words (x ,y), with the help of large textual corpora. Most conventional approaches to this task have been categorized to be either pattern-based or distributional. Recent studies suggest that pattern-based ones are superior, if large-scale Hearst pairs are extracted and fed, with the sparsity of unseen (x ,y) pairs relieved. However, they become invalid in some specific sparsity cases, where x or y is not involved in any pattern. For the first time, this paper quantifies the non-negligible existence of those specific cases. We also demonstrate that distributional methods are ideal to make up for pattern-based ones in such cases. We devise a complementary framework, under which a pattern-based and a distributional model collaborate seamlessly in cases which they each prefer. On several benchmark datasets, our framework demonstrates improvements that are both competitive and explainable.

pdf bib
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
Hongming Zhang | Xinran Zhao | Yangqiu Song
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In this paper, we present the first comprehensive categorization of essential commonsense knowledge for answering the Winograd Schema Challenge (WSC). For each of the questions, we invite annotators to first provide reasons for making correct decisions and then categorize them into six major knowledge categories. By doing so, we better understand the limitation of existing methods (i.e., what kind of knowledge cannot be effectively represented or inferred with existing methods) and shed some light on the commonsense knowledge that we need to acquire in the future for better commonsense reasoning. Moreover, to investigate whether current WSC models can understand the commonsense or they simply solve the WSC questions based on the statistical bias of the dataset, we leverage the collected reasons to develop a new task called WinoWhy, which requires models to distinguish plausible reasons from very similar but wrong reasons for all WSC questions. Experimental results prove that even though pre-trained language representation models have achieved promising progress on the original WSC dataset, they are still struggling at WinoWhy. Further experiments show that even though supervised models can achieve better performance, the performance of these models can be sensitive to the dataset distribution. WinoWhy and all codes are available at: https://github.com/HKUST-KnowComp/WinoWhy.

2019

pdf bib
A Variational Approach to Weakly Supervised Document-Level Multi-Aspect Sentiment Classification
Ziqian Zeng | Wenxuan Zhou | Xin Liu | Yangqiu Song
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In this paper, we propose a variational approach to weakly supervised document-level multi-aspect sentiment classification. Instead of using user-generated ratings or annotations provided by domain experts, we use target-opinion word pairs as “supervision.” These word pairs can be extracted by using dependency parsers and simple rules. Our objective is to predict an opinion word given a target word while our ultimate goal is to learn a sentiment polarity classifier to predict the sentiment polarity of each aspect given a document. By introducing a latent variable, i.e., the sentiment polarity, to the objective function, we can inject the sentiment polarity classifier to the objective via the variational lower bound. We can learn a sentiment polarity classifier by optimizing the lower bound. We show that our method can outperform weakly supervised baselines on TripAdvisor and BeerAdvocate datasets and can be comparable to the state-of-the-art supervised method with hundreds of labels per aspect.

pdf bib
Incorporating Context and External Knowledge for Pronoun Coreference Resolution
Hongming Zhang | Yan Song | Yangqiu Song
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Linking pronominal expressions to the correct references requires, in many cases, better analysis of the contextual information and external knowledge. In this paper, we propose a two-layer model for pronoun coreference resolution that leverages both context and external knowledge, where a knowledge attention mechanism is designed to ensure the model leveraging the appropriate source of external knowledge based on different context. Experimental results demonstrate the validity and effectiveness of our model, where it outperforms state-of-the-art models by a large margin.

pdf bib
Relation Discovery with Out-of-Relation Knowledge Base as Supervision
Yan Liang | Xin Liu | Jianwen Zhang | Yangqiu Song
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Unsupervised relation discovery aims to discover new relations from a given text corpus without annotated data. However, it does not consider existing human annotated knowledge bases even when they are relevant to the relations to be discovered. In this paper, we study the problem of how to use out-of-relation knowledge bases to supervise the discovery of unseen relations, where out-of-relation means that relations to discover from the text corpus and those in knowledge bases are not overlapped. We construct a set of constraints between entity pairs based on the knowledge base embedding and then incorporate constraints into the relation discovery by a variational auto-encoder based algorithm. Experiments show that our new approach can improve the state-of-the-art relation discovery performance by a large margin.

pdf bib
SP-10K: A Large-scale Evaluation Set for Selectional Preference Acquisition
Hongming Zhang | Hantian Ding | Yangqiu Song
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Selectional Preference (SP) is a commonly observed language phenomenon and proved to be useful in many natural language processing tasks. To provide a better evaluation method for SP models, we introduce SP-10K, a large-scale evaluation set that provides human ratings for the plausibility of 10,000 SP pairs over five SP relations, covering 2,500 most frequent verbs, nouns, and adjectives in American English. Three representative SP acquisition methods based on pseudo-disambiguation are evaluated with SP-10K. To demonstrate the importance of our dataset, we investigate the relationship between SP-10K and the commonsense knowledge in ConceptNet5 and show the potential of using SP to represent the commonsense knowledge. We also use the Winograd Schema Challenge to prove that the proposed new SP relations are essential for the hard pronoun coreference resolution problem.

pdf bib
Knowledge-aware Pronoun Coreference Resolution
Hongming Zhang | Yan Song | Yangqiu Song | Dong Yu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Resolving pronoun coreference requires knowledge support, especially for particular domains (e.g., medicine). In this paper, we explore how to leverage different types of knowledge to better resolve pronoun coreference with a neural model. To ensure the generalization ability of our model, we directly incorporate knowledge in the format of triplets, which is the most common format of modern knowledge graphs, instead of encoding it with features or rules as that in conventional approaches. Moreover, since not all knowledge is helpful in certain contexts, to selectively use them, we propose a knowledge attention module, which learns to select and use informative knowledge based on contexts, to enhance our model. Experimental results on two datasets from different domains prove the validity and effectiveness of our model, where it outperforms state-of-the-art baselines by a large margin. Moreover, since our model learns to use external knowledge rather than only fitting the training data, it also demonstrates superior performance to baselines in the cross-domain setting.

pdf bib
Neural Aspect and Opinion Term Extraction with Mined Rules as Weak Supervision
Hongliang Dai | Yangqiu Song
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Lack of labeled training data is a major bottleneck for neural network based aspect and opinion term extraction on product reviews. To alleviate this problem, we first propose an algorithm to automatically mine extraction rules from existing training examples based on dependency parsing results. The mined rules are then applied to label a large amount of auxiliary data. Finally, we study training procedures to train a neural model which can learn from both the data automatically labeled by the rules and a small amount of data accurately annotated by human. Experimental results show that although the mined rules themselves do not perform well due to their limited flexibility, the combination of human annotated data and rule labeled auxiliary data can improve the neural model and allow it to achieve performance better than or comparable with the current state-of-the-art.

pdf bib
Multilingual and Multi-Aspect Hate Speech Analysis
Nedjma Ousidhoum | Zizheng Lin | Hongming Zhang | Yangqiu Song | Dit-Yan Yeung
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Current research on hate speech analysis is typically oriented towards monolingual and single classification tasks. In this paper, we present a new multilingual multi-aspect hate speech analysis dataset and use it to test the current state-of-the-art multilingual multitask learning approaches. We evaluate our dataset in various classification settings, then we discuss how to leverage our annotations in order to improve hate speech detection and classification in general.

pdf bib
What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues
Xintong Yu | Hongming Zhang | Yangqiu Song | Yan Song | Changshui Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Grounding a pronoun to a visual object it refers to requires complex reasoning from various information sources, especially in conversational scenarios. For example, when people in a conversation talk about something all speakers can see, they often directly use pronouns (e.g., it) to refer to it without previous introduction. This fact brings a huge challenge for modern natural language understanding systems, particularly conventional context-based pronoun coreference models. To tackle this challenge, in this paper, we formally define the task of visual-aware pronoun coreference resolution (PCR) and introduce VisPro, a large-scale dialogue PCR dataset, to investigate whether and how the visual information can help resolve pronouns in dialogues. We then propose a novel visual-aware PCR model, VisCoref, for this task and conduct comprehensive experiments and case studies on our dataset. Results demonstrate the importance of the visual information in this PCR case and show the effectiveness of the proposed model.

pdf bib
Multiplex Word Embeddings for Selectional Preference Acquisition
Hongming Zhang | Jiaxin Bai | Yan Song | Kun Xu | Changlong Yu | Yangqiu Song | Wilfred Ng | Dong Yu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Conventional word embeddings represent words with fixed vectors, which are usually trained based on co-occurrence patterns among words. In doing so, however, the power of such representations is limited, where the same word might be functionalized separately under different syntactic relations. To address this limitation, one solution is to incorporate relational dependencies of different words into their embeddings. Therefore, in this paper, we propose a multiplex word embedding model, which can be easily extended according to various relations among words. As a result, each word has a center embedding to represent its overall semantics, and several relational embeddings to represent its relational dependencies. Compared to existing models, our model can effectively distinguish words with respect to different relations without introducing unnecessary sparseness. Moreover, to accommodate various relations, we use a small dimension for relational embeddings and our model is able to keep their effectiveness. Experiments on selectional preference acquisition and word similarity demonstrate the effectiveness of the proposed model, and a further study of scalability also proves that our embeddings only need 1/20 of the original embedding size to achieve better performance.

pdf bib
Improving Fine-grained Entity Typing with Entity Linking
Hongliang Dai | Donghong Du | Xin Li | Yangqiu Song
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Fine-grained entity typing is a challenging problem since it usually involves a relatively large tag set and may require to understand the context of the entity mention. In this paper, we use entity linking to help with the fine-grained entity type classification process. We propose a deep neural model that makes predictions based on both the context and the information obtained from entity linking results. Experimental results on two commonly used datasets demonstrates the effectiveness of our approach. On both datasets, it achieves more than 5% absolute strict accuracy improvement over the state of the art.

2018

pdf bib
Entity Linking within a Social Media Platform: A Case Study on Yelp
Hongliang Dai | Yangqiu Song | Liwei Qiu | Rijia Liu
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In this paper, we study a new entity linking problem where both the entity mentions and the target entities are within a same social media platform. Compared with traditional entity linking problems that link mentions to a knowledge base, this new problem have less information about the target entities. However, if we can successfully link mentions to entities within a social media platform, we can improve a lot of applications such as comparative study in business intelligence and opinion leader finding. To study this problem, we constructed a dataset called Yelp-EL, where the business mentions in Yelp reviews are linked to their corresponding businesses on the platform. We conducted comprehensive experiments and analysis on this dataset with a learning to rank model that takes different types of features as input, as well as a few state-of-the-art entity linking approaches. Our experimental results show that two types of features that are not available in traditional entity linking: social features and location features, can be very helpful for this task.

pdf bib
CogCompNLP: Your Swiss Army Knife for NLP
Daniel Khashabi | Mark Sammons | Ben Zhou | Tom Redman | Christos Christodoulopoulos | Vivek Srikumar | Nicholas Rizzolo | Lev Ratinov | Guanheng Luo | Quang Do | Chen-Tse Tsai | Subhro Roy | Stephen Mayhew | Zhili Feng | John Wieting | Xiaodong Yu | Yangqiu Song | Shashank Gupta | Shyam Upadhyay | Naveen Arivazhagan | Qiang Ning | Shaoshi Ling | Dan Roth
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

pdf bib
Joint Embeddings of Chinese Words, Characters, and Fine-grained Subcharacter Components
Jinxing Yu | Xun Jian | Hao Xin | Yangqiu Song
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Word embeddings have attracted much attention recently. Different from alphabetic writing systems, Chinese characters are often composed of subcharacter components which are also semantically informative. In this work, we propose an approach to jointly embed Chinese words as well as their characters and fine-grained subcharacter components. We use three likelihoods to evaluate whether the context words, characters, and components can predict the current target word, and collected 13,253 subcharacter components to demonstrate the existing approaches of decomposing Chinese characters are not enough. Evaluation on both word similarity and word analogy tasks demonstrates the superior performance of our model.

pdf bib
Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension
Yichun Yin | Yangqiu Song | Ming Zhang
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Document-level multi-aspect sentiment classification is an important task for customer relation management. In this paper, we model the task as a machine comprehension problem where pseudo question-answer pairs are constructed by a small number of aspect-related keywords and aspect ratings. A hierarchical iterative attention model is introduced to build aspectspecific representations by frequent and repeated interactions between documents and aspect questions. We adopt a hierarchical architecture to represent both word level and sentence level information, and use the attention operations for aspect questions and documents alternatively with the multiple hop mechanism. Experimental results on the TripAdvisor and BeerAdvocate datasets show that our model outperforms classical baselines. We will release our code and data for the method replicability.

pdf bib
NNEMBs at SemEval-2017 Task 4: Neural Twitter Sentiment Classification: a Simple Ensemble Method with Different Embeddings
Yichun Yin | Yangqiu Song | Ming Zhang
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Recently, neural twitter sentiment classification has become one of state-of-thearts, which relies less feature engineering work compared with traditional methods. In this paper, we propose a simple and effective ensemble method to further boost the performances of neural models. We collect several word embedding sets which are publicly released (often are learned on different corpus) or constructed by running Skip-gram on released large-scale corpus. We make an assumption that different word embeddings cover different words and encode different semantic knowledge, thus using them together can improve the generalizations and performances of neural models. In the SemEval 2017, our method ranks 1st in Accuracy, 5th in AverageR. Meanwhile, the additional comparisons demonstrate the superiority of our model over these ones based on only one word embedding set. We release our code for the method duplicability.

2016

pdf bib
Event Detection and Co-reference with Minimal Supervision
Haoruo Peng | Yangqiu Song | Dan Roth
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Word Embeddings with Limited Memory
Shaoshi Ling | Yangqiu Song | Dan Roth
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

pdf bib
Unsupervised Sparse Vector Densification for Short Text Similarity
Yangqiu Song | Dan Roth
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Improving a Pipeline Architecture for Shallow Discourse Parsing
Yangqiu Song | Haoruo Peng | Parisa Kordjamshidi | Mark Sammons | Dan Roth
Proceedings of the Nineteenth Conference on Computational Natural Language Learning - Shared Task