Fan Yang


2022

pdf bib
DESED: Dialogue-based Explanation for Sentence-level Event Detection
Yinyi Wei | Shuaipeng Liu | Jianwei Lv | Xiangyu Xi | Hailei Yan | Wei Ye | Tong Mo | Fan Yang | Guanglu Wan
Proceedings of the 29th International Conference on Computational Linguistics

Many recent sentence-level event detection efforts focus on enriching sentence semantics, e.g., via multi-task or prompt-based learning. Despite the promising performance, these methods commonly depend on label-extensive manual annotations or require domain expertise to design sophisticated templates and rules. This paper proposes a new paradigm, named dialogue-based explanation, to enhance sentence semantics for event detection. By saying dialogue-based explanation of an event, we mean explaining it through a consistent information-intensive dialogue, with the original event description as the start utterance. We propose three simple dialogue generation methods, whose outputs are then fed into a hybrid attention mechanism to characterize the complementary event semantics. Extensive experimental results on two event detection datasets verify the effectiveness of our method and suggest promising research opportunities in the dialogue-based explanation paradigm.

pdf bib
Improving Relevance Quality in Product Search using High-Precision Query-Product Semantic Similarity
Alireza Bagheri Garakani | Fan Yang | Wen-Yu Hua | Yetian Chen | Michinari Momma | Jingyuan Deng | Yan Gao | Yi Sun
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

Ensuring relevance quality in product search is a critical task as it impacts the customer’s ability to find intended products in the short-term as well as the general perception and trust of the e-commerce system in the long term. In this work we leverage a high-precision cross-encoder BERT model for semantic similarity between customer query and products and survey its effectiveness for three ranking applications where offline-generated scores could be used: (1) as an offline metric for estimating relevance quality impact, (2) as a re-ranking feature covering head/torso queries, and (3) as a training objective for optimization. We present results on effectiveness of this strategy for the large e-commerce setting, which has general applicability for choice of other high-precision models and tasks in ranking.

pdf bib
Spelling Correction using Phonetics in E-commerce Search
Fan Yang | Alireza Bagheri Garakani | Yifei Teng | Yan Gao | Jia Liu | Jingyuan Deng | Yi Sun
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

In E-commerce search, spelling correction plays an important role to find desired products for customers in processing user-typed search queries. However, resolving phonetic errors is a critical but much overlooked area. The query with phonetic spelling errors tends to appear correct based on pronunciation but is nonetheless inaccurate in spelling (e.g., “bluetooth sound system” vs. “blutut sant sistam”) with numerous noisy forms and sparse occurrences. In this work, we propose a generalized spelling correction system integrating phonetics to address phonetic errors in E-commerce search without additional latency cost. Using India (IN) E-commerce market for illustration, the experiment shows that our proposed phonetic solution significantly improves the F1 score by 9%+ and recall of phonetic errors by 8%+. This phonetic spelling correction system has been deployed to production, currently serving hundreds of millions of customers.

pdf bib
MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis
Cong Chen | Jiansong Chen | Cao Liu | Fan Yang | Guanglu Wan | Jinxiong Xia
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

Sentiment analysis is a fundamental task, and structure sentiment analysis (SSA) is an important component of sentiment analysis. However, traditional SSA is suffering from some important issues: (1) lack of interactive knowledge of different languages; (2) small amount of annotation data or even no annotation data. To address the above problems, we incorporate data augment and auxiliary tasks within a cross-lingual pretrained language model into SSA. Specifically, we employ XLM-Roberta to enhance mutually interactive information when parallel data is available in the pretraining stage. Furthermore, we leverage two data augment strategies and auxiliary tasks to improve the performance on few-label data and zero-shot cross-lingual settings. Experiments demonstrate the effectiveness of our models. Our models rank first on the cross-lingual sub-task and rank second on the monolingual sub-task of SemEval-2022 task 10.

2021

pdf bib
Domain-Lifelong Learning for Dialogue State Tracking via Knowledge Preservation Networks
Qingbin Liu | Pengfei Cao | Cao Liu | Jiansong Chen | Xunliang Cai | Fan Yang | Shizhu He | Kang Liu | Jun Zhao
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Dialogue state tracking (DST), which estimates user goals given a dialogue context, is an essential component of task-oriented dialogue systems. Conventional DST models are usually trained offline, which requires a fixed dataset prepared in advance. This paradigm is often impractical in real-world applications since online dialogue systems usually involve continually emerging new data and domains. Therefore, this paper explores Domain-Lifelong Learning for Dialogue State Tracking (DLL-DST), which aims to continually train a DST model on new data to learn incessantly emerging new domains while avoiding catastrophically forgetting old learned domains. To this end, we propose a novel domain-lifelong learning method, called Knowledge Preservation Networks (KPN), which consists of multi-prototype enhanced retrospection and multi-strategy knowledge distillation, to solve the problems of expression diversity and combinatorial explosion in the DLL-DST task. Experimental results show that KPN effectively alleviates catastrophic forgetting and outperforms previous state-of-the-art lifelong learning methods by 4.25% and 8.27% of whole joint goal accuracy on the MultiWOZ benchmark and the SGD benchmark, respectively.

pdf bib
From Paraphrasing to Semantic Parsing: Unsupervised Semantic Parsing via Synchronous Semantic Decoding
Shan Wu | Bo Chen | Chunlei Xin | Xianpei Han | Le Sun | Weipeng Zhang | Jiansong Chen | Fan Yang | Xunliang Cai
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Semantic parsing is challenging due to the structure gap and the semantic gap between utterances and logical forms. In this paper, we propose an unsupervised semantic parsing method - Synchronous Semantic Decoding (SSD), which can simultaneously resolve the semantic gap and the structure gap by jointly leveraging paraphrasing and grammar-constrained decoding. Specifically, we reformulate semantic parsing as a constrained paraphrasing problem: given an utterance, our model synchronously generates its canonical utterancel and meaning representation. During synchronously decoding: the utterance paraphrasing is constrained by the structure of the logical form, therefore the canonical utterance can be paraphrased controlledly; the semantic decoding is guided by the semantics of the canonical utterance, therefore its logical form can be generated unsupervisedly. Experimental results show that SSD is a promising approach and can achieve state-of-the-art unsupervised semantic parsing performance on multiple datasets.

pdf bib
Opinion Prediction with User Fingerprinting
Kishore Tumarada | Yifan Zhang | Fan Yang | Eduard Dragut | Omprakash Gnawali | Arjun Mukherjee
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Opinion prediction is an emerging research area with diverse real-world applications, such as market research and situational awareness. We identify two lines of approaches to the problem of opinion prediction. One uses topic-based sentiment analysis with time-series modeling, while the other uses static embedding of text. The latter approaches seek user-specific solutions by generating user fingerprints. Such approaches are useful in predicting user’s reactions to unseen content. In this work, we propose a novel dynamic fingerprinting method that leverages contextual embedding of user’s comments conditioned on relevant user’s reading history. We integrate BERT variants with a recurrent neural network to generate predictions. The results show up to 13% improvement in micro F1-score compared to previous approaches. Experimental results show novel insights that were previously unknown such as better predictions for an increase in dynamic history length, the impact of the nature of the article on performance, thereby laying the foundation for further research.

pdf bib
Improving Evidence Retrieval with Claim-Evidence Entailment
Fan Yang | Eduard Dragut | Arjun Mukherjee
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Claim verification is challenging because it requires first to find textual evidence and then apply claim-evidence entailment to verify a claim. Previous works evaluate the entailment step based on the retrieved evidence, whereas we hypothesize that the entailment prediction can provide useful signals for evidence retrieval, in the sense that if a sentence supports or refutes a claim, the sentence must be relevant. We propose a novel model that uses the entailment score to express the relevancy. Our experiments verify that leveraging entailment prediction improves ranking multiple pieces of evidence.

2020

pdf bib
Predicting Personal Opinion on Future Events with Fingerprints
Fan Yang | Eduard Dragut | Arjun Mukherjee
Proceedings of the 28th International Conference on Computational Linguistics

Predicting users’ opinions in their response to social events has important real-world applications, many of which political and social impacts. Existing approaches derive a population’s opinion on a going event from large scores of user generated content. In certain scenarios, we may not be able to acquire such content and thus cannot infer an unbiased opinion on those emerging events. To address this problem, we propose to explore opinion on unseen articles based on one’s fingerprinting: the prior reading and commenting history. This work presents a focused study on modeling and leveraging fingerprinting techniques to predict a user’s future opinion. We introduce a recurrent neural network based model that integrates fingerprinting. We collect a large dataset that consists of event-comment pairs from six news websites. We evaluate the proposed model on this dataset. The results show substantial performance gains demonstrating the effectiveness of our approach.

pdf bib
Logic-guided Semantic Representation Learning for Zero-Shot Relation Classification
Juan Li | Ruoxu Wang | Ningyu Zhang | Wen Zhang | Fan Yang | Huajun Chen
Proceedings of the 28th International Conference on Computational Linguistics

Relation classification aims to extract semantic relations between entity pairs from the sentences. However, most existing methods can only identify seen relation classes that occurred during training. To recognize unseen relations at test time, we explore the problem of zero-shot relation classification. Previous work regards the problem as reading comprehension or textual entailment, which have to rely on artificial descriptive information to improve the understandability of relation types. Thus, rich semantic knowledge of the relation labels is ignored. In this paper, we propose a novel logic-guided semantic representation learning model for zero-shot relation classification. Our approach builds connections between seen and unseen relations via implicit and explicit semantic representations with knowledge graph embeddings and logic rules. Extensive experimental results demonstrate that our method can generalize to unseen relation types and achieve promising improvements.

pdf bib
XGLUE: A New Benchmark Datasetfor Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang | Nan Duan | Yeyun Gong | Ning Wu | Fenfei Guo | Weizhen Qi | Ming Gong | Linjun Shou | Daxin Jiang | Guihong Cao | Xiaodong Fan | Ruofei Zhang | Rahul Agrawal | Edward Cui | Sining Wei | Taroon Bharti | Ying Qiao | Jiun-Hung Chen | Winnie Wu | Shuguang Liu | Fan Yang | Daniel Campos | Rangan Majumder | Ming Zhou
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks. Comparing to GLUE (Wang et al.,2019), which is labeled in English and includes natural language understanding tasks only, XGLUE has three main advantages: (1) it provides two corpora with different sizes for cross-lingual pre-training; (2) it provides 11 diversified tasks that cover both natural language understanding and generation scenarios; (3) for each task, it provides labeled data in multiple languages. We extend a recent cross-lingual pre-trained model Unicoder (Huang et al., 2019) to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline. We also evaluate the base versions (12-layer) of Multilingual BERT, XLM and XLM-R for comparison.

2019

pdf bib
Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification
Fan Yang | Xiaochang Peng | Gargi Ghosh | Reshef Shilon | Hao Ma | Eider Moore | Goran Predovic
Proceedings of the Third Workshop on Abusive Language Online

Interactions among users on social network platforms are usually positive, constructive and insightful. However, sometimes people also get exposed to objectionable content such as hate speech, bullying, and verbal abuse etc. Most social platforms have explicit policy against hate speech because it creates an environment of intimidation and exclusion, and in some cases may promote real-world violence. As users’ interactions on today’s social networks involve multiple modalities, such as texts, images and videos, in this paper we explore the challenge of automatically identifying hate speech with deep multimodal technologies, extending previous research which mostly focuses on the text signal alone. We present a number of fusion approaches to integrate text and photo signals. We show that augmenting text with image embedding information immediately leads to a boost in performance, while applying additional attention fusion methods brings further improvement.

2018

pdf bib
Attending Sentences to detect Satirical Fake News
Sohan De Sarkar | Fan Yang | Arjun Mukherjee
Proceedings of the 27th International Conference on Computational Linguistics

Satirical news detection is important in order to prevent the spread of misinformation over the Internet. Existing approaches to capture news satire use machine learning models such as SVM and hierarchical neural networks along with hand-engineered features, but do not explore sentence and document difference. This paper proposes a robust, hierarchical deep neural network approach for satire detection, which is capable of capturing satire both at the sentence level and at the document level. The architecture incorporates pluggable generic neural networks like CNN, GRU, and LSTM. Experimental results on real world news satire dataset show substantial performance gains demonstrating the effectiveness of our proposed approach. An inspection of the learned models reveals the existence of key sentences that control the presence of satire in news.

2017

pdf bib
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Fan Yang | Arjun Mukherjee | Eduard Dragut
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Satirical news is considered to be entertainment, but it is potentially deceptive and harmful. Despite the embedded genre in the article, not everyone can recognize the satirical cues and therefore believe the news as true news. We observe that satirical cues are often reflected in certain paragraphs rather than the whole document. Existing works only consider document-level features to detect the satire, which could be limited. We consider paragraph-level linguistic features to unveil the satire by incorporating neural network and attention mechanism. We investigate the difference between paragraph-level features and document-level features, and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals what features are important at which level.

2016

pdf bib
An Empirical Study of Automatic Chinese Word Segmentation for Spoken Language Understanding and Named Entity Recognition
Wencan Luo | Fan Yang
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Leveraging Multiple Domains for Sentiment Classification
Fan Yang | Arjun Mukherjee | Yifan Zhang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Sentiment classification becomes more and more important with the rapid growth of user generated content. However, sentiment classification task usually comes with two challenges: first, sentiment classification is highly domain-dependent and training sentiment classifier for every domain is inefficient and often impractical; second, since the quantity of labeled data is important for assessing the quality of classifier, it is hard to evaluate classifiers when labeled data is limited for certain domains. To address the challenges mentioned above, we focus on learning high-level features that are able to generalize across domains, so a global classifier can benefit with a simple combination of documents from multiple domains. In this paper, the proposed model incorporates both sentiment polarity and unlabeled data from multiple domains and learns new feature representations. Our model doesn’t require labels from every domain, which means the learned feature representation can be generalized for sentiment domain adaptation. In addition, the learned feature representation can be used as classifier since our model defines the meaning of feature value and arranges high-level features in a prefixed order, so it is not necessary to train another classifier on top of the new features. Empirical evaluations demonstrate our model outperforms baselines and yields competitive results to other state-of-the-art works on benchmark datasets.

2014

pdf bib
Semi-Supervised Chinese Word Segmentation Using Partial-Label Learning With Conditional Random Fields
Fan Yang | Paul Vozila
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
An Empirical Study Of Semi-Supervised Chinese Word Segmentation Using Co-Training
Fan Yang | Paul Vozila
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2011

pdf bib
An Investigation of Interruptions and Resumptions in Multi-Tasking Dialogues
Fan Yang | Peter A. Heeman | Andrew L. Kun
Computational Linguistics, Volume 37, Issue 1 - March 2011

2009

pdf bib
A Chinese-English Organization Name Translation System Using Heuristic Web Mining and Asymmetric Alignment
Fan Yang | Jun Zhao | Kang Liu
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

pdf bib
Switching to Real-Time Tasks in Multi-Tasking Dialogue
Fan Yang | Peter A. Heeman | Andrew Kun
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
Chinese-English Backward Transliteration Assisted with Mining Monolingual Web Pages
Fan Yang | Jun Zhao | Bo Zou | Kang Liu | Feifan Liu
Proceedings of ACL-08: HLT

pdf bib
CRFs-Based Named Entity Recognition Incorporated with Heuristic Entity List Searching
Fan Yang | Jun Zhao | Bo Zou
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2007

pdf bib
Avoiding and Resolving Initiative Conflicts in Dialogue
Fan Yang | Peter A. Heeman
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2006

pdf bib
A Data Driven Approach to Relevancy Recognition for Contextual Question Answering
Fan Yang | Junlan Feng | Giuseppe Di Fabbrizio
Proceedings of the Interactive Question Answering Workshop at HLT-NAACL 2006

2005

pdf bib
DialogueView: an Annotation Tool for Dialogue
Fan Yang | Peter A. Heeman
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

2002

pdf bib
DialogueView - An Annotation Tool for Dialogue
Peter A. Heeman | Fan Yang | Susan E. Strayer
Proceedings of the Third SIGdial Workshop on Discourse and Dialogue