Binyang Li


Social Bot-Aware Graph Neural Network for Early Rumor Detection
Zhen Huang | Zhilong Lv | Xiaoyun Han | Binyang Li | Menglong Lu | Dongsheng Li
Proceedings of the 29th International Conference on Computational Linguistics

Early rumor detection is a key and challenging task in preventing rumors from spreading widely. Sociological research shows that social bots’ behavior in the early stage has become the main reason for the wide spread of rumors. However, current models do not explicitly distinguish genuine users from social bots, and therefore fail to identify rumors in a timely manner. This paper thus aims at early rumor detection by accounting for social bots’ behavior, and presents a Social Bot-Aware Graph Neural Network, named SBAG. SBAG first pre-trains a multi-layer perceptron network to capture social bot features, and then constructs multiple graph neural networks by embedding the features to model the early propagation of posts, which is further used to detect rumors. Extensive experiments on three benchmark datasets show that SBAG achieves significant improvements against the baselines and also identifies rumors within 3 hours while maintaining more than 90% accuracy.


CHIME: Cross-passage Hierarchical Memory Network for Generative Review Question Answering
Junru Lu | Gabriele Pergola | Lin Gui | Binyang Li | Yulan He
Proceedings of the 28th International Conference on Computational Linguistics

We introduce CHIME, a cross-passage hierarchical memory network for question answering (QA) via text generation. It extends XLNet by introducing an auxiliary memory module consisting of two components: the context memory, which collects cross-passage evidence, and the answer memory, which works as a buffer that continually refines the generated answers. Empirically, we show the efficacy of the proposed architecture in multi-passage generative QA, outperforming the state-of-the-art baselines with more syntactically well-formed answers and increased precision in addressing the questions of the AmazonQA review dataset. An additional qualitative analysis revealed the interpretability introduced by the memory module.


Early Rumour Detection
Kaimin Zhou | Chang Shu | Binyang Li | Jey Han Lau
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Rumours can spread quickly through social media, and malicious ones can bring about significant economic and social impact. Motivated by this, our paper focuses on the task of rumour detection; particularly, we are interested in understanding how early we can detect them. Although there are numerous studies on rumour detection, few are concerned with the timing of the detection. A successfully detected malicious rumour can still cause significant damage if it isn’t detected in a timely manner, and so timing is crucial. To address this, we present a novel methodology for early rumour detection. Our model treats social media posts (e.g. tweets) as a data stream and integrates reinforcement learning to learn the minimum number of posts required before we classify an event as a rumour. Experiments on Twitter and Weibo demonstrate that our model identifies rumours earlier than state-of-the-art systems while maintaining a comparable accuracy.

Context-aware Embedding for Targeted Aspect-based Sentiment Analysis
Bin Liang | Jiachen Du | Ruifeng Xu | Binyang Li | Hejiao Huang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Attention-based neural models were employed to detect the different aspects and sentiment polarities of the same target in targeted aspect-based sentiment analysis (TABSA). However, existing methods do not specifically pre-train reasonable embeddings for targets and aspects in TABSA. This may result in targets or aspects having the same vector representations in different contexts and losing the context-dependent information. To address this problem, we propose a novel method to refine the embeddings of targets and aspects. Such pivotal embedding refinement utilizes a sparse coefficient vector to adjust the embeddings of target and aspect from the context. Hence the embeddings of targets and aspects can be refined from the highly correlative words instead of using context-independent or randomly initialized vectors. Experimental results on two benchmark datasets show that our approach yields state-of-the-art performance on the TABSA task.


ISCLAB at SemEval-2018 Task 1: UIR-Miner for Affect in Tweets
Meng Li | Zhenyuan Dong | Zhihao Fan | Kongming Meng | Jinghua Cao | Guanqi Ding | Yuhan Liu | Jiawei Shan | Binyang Li
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper presents the UIR-Miner system for the emotion and sentiment analysis evaluation on Twitter in SemEval 2018. Our system consists of four main modules: a preprocessing module, a stacking module for intensity prediction of emotion and sentiment, an LSTM network module for multi-label classification, and a hierarchical attention network module for emotion and sentiment classification. According to the metrics of SemEval 2018, our system achieves final scores of 0.636, 0.531, 0.731, 0.708, and 0.408 on the 5 subtasks, respectively.

The UIR Uncertainty Corpus for Chinese: Annotating Chinese Microblog Corpus for Uncertainty Identification from Social Media
Binyang Li | Jun Xiang | Le Chen | Xu Han | Xiaoyan Yu | Ruifeng Xu | Tengjiao Wang | Kam-fai Wong
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


ACE: Automatic Colloquialism, Typographical and Orthographic Errors Detection for Chinese Language
Shichao Dong | Gabriel Pui Cheong Fung | Binyang Li | Baolin Peng | Ming Liao | Jia Zhu | Kam-fai Wong
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We present a system called ACE for Automatic Colloquialism and Errors detection for written Chinese. ACE is based on a combination of an N-gram model and a rule-based model. Although it focuses on detecting colloquial Cantonese (a dialect of Chinese) at the current stage, it can be extended to detect other dialects. We chose Cantonese because it has many interesting properties, such as a unique grammar system and a huge number of colloquial terms, that make the detection task extremely challenging. We conducted experiments using real data and synthetic data. The results indicated that ACE is highly reliable and effective.


UIR-PKU: Twitter-OpinMiner System for Sentiment Analysis in Twitter at SemEval 2015
Xu Han | Binyang Li | Jing Ma | Yuxiao Zhang | Gaoyan Ou | Tengjiao Wang | Kam-fai Wong
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

Overview of Topic-based Chinese Message Polarity Classification in SIGHAN 2015
Xiangwen Liao | Binyang Li | Liheng Xu
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing


Exploiting Community Emotion for Microblog Event Detection
Gaoyan Ou | Wei Chen | Tengjiao Wang | Zhongyu Wei | Binyang Li | Dongqing Yang | Kam-Fai Wong
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank
Lanjun Zhou | Binyang Li | Zhongyu Wei | Kam-Fai Wong
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The lack of an open discourse corpus for Chinese limits many natural language processing tasks. In this work, we present the first open discourse treebank for Chinese, namely, the Discourse Treebank for Chinese (DTBC). At the current stage, we annotated explicit intra-sentence discourse connectives, their corresponding arguments and senses for all 890 documents of the Chinese Treebank 5. We started by analysing the characteristics of discourse annotation for Chinese, and adapted the annotation scheme of Penn Discourse Treebank 2 (PDTB2) to the Chinese language while maintaining compatibility as far as possible. We made adjustments to 3 essential aspects according to previous studies of Chinese linguistics: sense hierarchy, argument scope and semantics of arguments. An agreement study showed that our annotation scheme could achieve highly reliable results.

Web Information Mining and Decision Support Platform for the Modern Service Industry
Binyang Li | Lanjun Zhou | Zhongyu Wei | Kam-fai Wong | Ruifeng Xu | Yunqing Xia
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations


Is Twitter A Better Corpus for Measuring Sentiment Similarity?
Shi Feng | Le Zhang | Binyang Li | Daling Wang | Ge Yu | Kam-Fai Wong
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

An Empirical Study on Uncertainty Identification in Social Media Context
Zhongyu Wei | Junwen Chen | Wei Gao | Binyang Li | Lanjun Zhou | Yulan He | Kam-Fai Wong
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Cross-Lingual Identification of Ambiguous Discourse Connectives for Resource-Poor Language
Lanjun Zhou | Wei Gao | Binyang Li | Zhongyu Wei | Kam-Fai Wong
Proceedings of COLING 2012: Posters


Unsupervised Discovery of Discourse Relations for Eliminating Intra-sentence Polarity Ambiguities
Lanjun Zhou | Binyang Li | Wei Gao | Zhongyu Wei | Kam-Fai Wong
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing


A Unified Graph Model for Sentence-Based Opinion Retrieval
Binyang Li | Lanjun Zhou | Shi Feng | Kam-Fai Wong
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics