Wei Gao


2021

Boundary Detection with BERT for Span-level Emotion Cause Analysis
Xiangju Li | Wei Gao | Shi Feng | Yifei Zhang | Daling Wang
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

Debunking Rumors on Twitter with Tree Transformer
Jing Ma | Wei Gao
Proceedings of the 28th International Conference on Computational Linguistics

Rumors are manufactured with no respect for accuracy, but can circulate quickly and widely by “word-of-post” through social media conversations. A conversation tree encodes important information indicative of the credibility of a rumor. Existing conversation-based techniques for rumor detection either strictly follow tree edges or treat all the posts as fully connected during feature learning. In this paper, we propose a novel detection model based on a tree transformer to better utilize user interactions in the dialogue, where post-level self-attention plays the key role in aggregating the intra-/inter-subtree stances. Experimental results on the TWITTER and PHEME datasets show that the proposed approach consistently improves rumor detection performance.

2019

Using Customer Service Dialogues for Satisfaction Analysis with Context-Assisted Multiple Instance Learning
Kaisong Song | Lidong Bing | Wei Gao | Jun Lin | Lujun Zhao | Jiancheng Wang | Changlong Sun | Xiaozhong Liu | Qiong Zhang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Customers ask questions and customer service staff answer them, which is the basic service model of multi-turn customer service (CS) dialogues on E-commerce platforms. Existing studies fail to provide comprehensive service satisfaction analysis, namely satisfaction polarity classification (e.g., well satisfied, met and unsatisfied) and sentimental utterance identification (e.g., positive, neutral and negative). In this paper, we conduct a pilot study on the task of service satisfaction analysis (SSA) based on multi-turn CS dialogues. We propose an extensible Context-Assisted Multiple Instance Learning (CAMIL) model to predict the sentiments of all the customer utterances and then aggregate those sentiments into service satisfaction polarity. After that, we propose a novel Context Clue Matching Mechanism (CCMM) to enhance the representations of all customer utterances with their matched context clues, i.e., sentiment and reasoning clues. We construct two CS dialogue datasets from a top E-commerce platform. Extensive experimental results are presented and contrasted against a few previous models to demonstrate the efficacy of our model.

Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks
Jing Ma | Wei Gao | Shafiq Joty | Kam-Fai Wong
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Claim verification is generally the task of verifying the veracity of a given claim, which is critical to many downstream applications. It is cumbersome and inefficient for human fact-checkers to find consistent pieces of evidence, from which a solid verdict can be inferred against the claim. In this paper, we propose a novel end-to-end hierarchical attention network focusing on learning to represent coherent evidence as well as their semantic relatedness with the claim. Our model consists of three main components: 1) A coherence-based attention layer embeds coherent evidence considering the claim and sentences from relevant articles; 2) An entailment-based attention layer attends to sentences that can semantically infer the claim on top of the first attention; and 3) An output layer predicts the verdict based on the embedded evidence. Experimental results on three public benchmark datasets show that our proposed model outperforms a set of state-of-the-art baselines.

2018

Rumor Detection on Twitter with Tree-structured Recursive Neural Networks
Jing Ma | Wei Gao | Kam-Fai Wong
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Automatic rumor detection is technically very challenging. In this work, we try to learn discriminative features from tweet content by following their non-sequential propagation structure and generate more powerful representations for identifying different types of rumors. We propose two recursive neural models based on bottom-up and top-down tree-structured neural networks for rumor representation learning and classification, which naturally conform to the propagation layout of tweets. Results on two public Twitter datasets demonstrate that our recursive neural models 1) achieve much better performance than state-of-the-art approaches; 2) demonstrate superior capacity for detecting rumors at a very early stage.

Personalized Microblog Sentiment Classification via Adversarial Cross-lingual Multi-task Learning
Weichao Wang | Shi Feng | Wei Gao | Daling Wang | Yifei Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Sentiment expression in microblog posts can be affected by a user’s personal character, opinion bias, political stance and so on. Most existing personalized microblog sentiment classification methods suffer from the insufficiency of discriminative tweets for personalization learning. We observed that microblog users have consistent individuality and opinion bias in different languages. Based on this observation, in this paper we propose a novel user-attention-based Convolutional Neural Network (CNN) model with an adversarial cross-lingual learning framework. The user attention mechanism is leveraged in the CNN model to capture a user’s language-specific individuality from the posts. Then the attention-based CNN model is incorporated into a novel adversarial cross-lingual learning framework, in which, with the help of user properties as a bridge between languages, we can extract the language-specific features and language-independent features to enrich the user post representation so as to alleviate the data insufficiency problem. Results on English and Chinese microblog datasets confirm that our method outperforms state-of-the-art baseline algorithms by large margins.

2017

Detect Rumors in Microblog Posts Using Propagation Structure via Kernel Learning
Jing Ma | Wei Gao | Kam-Fai Wong
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

How does fake news go viral via social media? How does its propagation pattern differ from real stories? In this paper, we attempt to address the problem of identifying rumors, i.e., fake information, out of microblog posts based on their propagation structure. We first model the diffusion of microblog posts with propagation trees, which provide valuable clues on how an original message is transmitted and developed over time. We then propose a kernel-based method called Propagation Tree Kernel, which captures high-order patterns differentiating different types of rumors by evaluating the similarities between their propagation tree structures. Experimental results on two real-world datasets demonstrate that the proposed kernel-based approach can detect rumors more quickly and accurately than state-of-the-art rumor detection models.

2016

Topic Extraction from Microblog Posts Using Conversation Structures
Jing Li | Ming Liao | Wei Gao | Yulan He | Kam-Fai Wong
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

QCRI at SemEval-2016 Task 4: Probabilistic Methods for Binary and Ordinal Quantification
Giovanni Da San Martino | Wei Gao | Fabrizio Sebastiani
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

Using Content-level Structures for Summarizing Microblog Repost Trees
Jing Li | Wei Gao | Zhongyu Wei | Baolin Peng | Kam-Fai Wong
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Using Tweets to Help Sentence Compression for News Highlights Generation
Zhongyu Wei | Yang Liu | Chen Li | Wei Gao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

QCRI: Answer Selection for Community Question Answering - Experiments for Arabic and English
Massimo Nicosia | Simone Filice | Alberto Barrón-Cedeño | Iman Saleh | Hamdy Mubarak | Wei Gao | Preslav Nakov | Giovanni Da San Martino | Alessandro Moschitti | Kareem Darwish | Lluís Màrquez | Shafiq Joty | Walid Magdy
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

Simple Effective Microblog Named Entity Recognition: Arabic as an Example
Kareem Darwish | Wei Gao
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Despite many recent papers on Arabic Named Entity Recognition (NER) in the news domain, little work has been done on microblog NER. NER on microblogs presents many complications such as informality of language, shortened named entities, brevity of expressions, and inconsistent capitalization (for cased languages). We introduce simple effective language-independent approaches for improving NER on microblogs, based on using large gazetteers, domain adaptation, and a two-pass semi-supervised method. We use Arabic as an example language to compare the relative effectiveness of the approaches and when best to use them. We also present a new dataset for the task. Results of combining the proposed approaches show an improvement of 35.3 F-measure points over a baseline system trained on news data and an improvement of 19.9 F-measure points over the same system but trained on microblog data.

Utilizing Microblogs for Automatic News Highlights Extraction
Zhongyu Wei | Wei Gao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

An Empirical Study on Uncertainty Identification in Social Media Context
Zhongyu Wei | Junwen Chen | Wei Gao | Binyang Li | Lanjun Zhou | Yulan He | Kam-Fai Wong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Cross-Lingual Identification of Ambiguous Discourse Connectives for Resource-Poor Language
Lanjun Zhou | Wei Gao | Binyang Li | Zhongyu Wei | Kam-Fai Wong
Proceedings of COLING 2012: Posters

Information-theoretic Multi-view Domain Adaptation
Pei Yang | Wei Gao | Qi Tan | Kam-Fai Wong
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
Lanjun Zhou | Binyang Li | Wei Gao | Zhongyu Wei | Kam-Fai Wong
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Generating Aspect-oriented Multi-Document Summarization with Event-aspect model
Peng Li | Yinglin Wang | Wei Gao | Jing Jiang
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Query Weighting for Ranking Model Adaptation
Peng Cai | Wei Gao | Aoying Zhou | Kam-Fai Wong
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2009

Exploiting Bilingual Information to Improve Web Search
Wei Gao | John Blitzer | Ming Zhou | Kam-Fai Wong
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2005

NIL Is Not Nothing: Recognition of Chinese Network Informal Language Expressions
Yunqing Xia | Kam-Fai Wong | Wei Gao
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing