Xiao Li


2020

Using a Penalty-based Loss Re-estimation Method to Improve Implicit Discourse Relation Classification
Xiao Li | Yu Hong | Huibin Ruan | Zhen Huang
Proceedings of the 28th International Conference on Computational Linguistics

We tackle implicit discourse relation classification, a task of automatically determining semantic relationships between arguments. The attention-worthy words in arguments are crucial clues for classifying the discourse relations. Attention mechanisms have been proven effective in highlighting the attention-worthy words during encoding. However, our survey shows that some inessential words are unintentionally misjudged as attention-worthy words and, therefore, assigned heavier attention weights than they should be. We propose a penalty-based loss re-estimation method to regulate the attention learning process, integrating penalty coefficients into the computation of loss by means of overstability of attention weight distributions. We conduct experiments on the Penn Discourse TreeBank (PDTB) corpus. The test results show that our loss re-estimation method leads to substantial improvements for a variety of attention mechanisms, and it obtains highly competitive performance compared to the state-of-the-art methods.

Improving Variational Autoencoder for Text Modelling with Timestep-Wise Regularisation
Ruizhe Li | Xiao Li | Guanyi Chen | Chenghua Lin
Proceedings of the 28th International Conference on Computational Linguistics

The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences. However, an issue known as posterior collapse (or KL loss vanishing) happens when the VAE is used in text modelling, where the approximate posterior collapses to the prior, and the model will totally ignore the latent variables and be degraded to a plain language model during text generation. Such an issue is particularly prevalent when RNN-based VAE models are employed for text modelling. In this paper, we propose a simple, generic architecture called Timestep-Wise Regularisation VAE (TWR-VAE), which can effectively avoid posterior collapse and can be applied to any RNN-based VAE models. The effectiveness and versatility of our model are demonstrated in different tasks, including language modelling and dialogue response generation.
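For orientation, the posterior collapse described above can be read off the standard VAE training objective (the evidence lower bound). The notation below is an assumption for illustration and is not taken from the paper: q_\phi(z|x) is the approximate posterior, p_\theta(x|z) the decoder, and p(z) the prior. This is the generic objective, not the TWR-VAE loss itself.

```latex
\mathcal{L}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
% Posterior collapse: the KL term goes to 0, i.e. q_\phi(z | x) \approx p(z) for every x,
% so z carries no information about x and the decoder behaves like a plain language model.
```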

Multi-Task Neural Model for Agglutinative Language Translation
Yirong Pan | Xiao Li | Yating Yang | Rui Dong
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Neural machine translation (NMT) has achieved impressive performance recently by using large-scale parallel corpora. However, it struggles in the low-resource, morphologically rich scenarios of the agglutinative language translation task. Inspired by the finding that monolingual data can greatly improve NMT performance, we propose a multi-task neural model that jointly learns to perform bi-directional translation and agglutinative language stemming. Our approach employs a shared encoder and decoder to train a single model without changing the standard NMT architecture, instead adding a token before each source-side sentence to specify the desired target outputs of the two different tasks. Experimental results on Turkish-English and Uyghur-Chinese show that our proposed approach can significantly improve the translation performance on agglutinative languages by using a small amount of monolingual data.
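The token-prefixing idea in this abstract can be sketched as a small preprocessing step. This is a minimal illustration under stated assumptions, not the authors' code: the tag strings (`<2trans>`, `<2stem>`) and the helper names `tag_source` and `build_mixed_corpus` are hypothetical, and the abstract does not say what the actual tokens look like.

```python
# Hypothetical task tags; the abstract does not give the actual token strings.
TASK_TAGS = {"translate": "<2trans>", "stem": "<2stem>"}


def tag_source(sentence: str, task: str) -> str:
    """Prepend a task token to a source-side sentence."""
    if task not in TASK_TAGS:
        raise ValueError(f"unknown task: {task}")
    return f"{TASK_TAGS[task]} {sentence.strip()}"


def build_mixed_corpus(translation_pairs, stemming_pairs):
    """Merge both tasks into one list of (tagged_source, target) examples.

    A single shared encoder-decoder can then be trained on the merged list,
    with the leading token telling it which kind of output to produce.
    """
    examples = [(tag_source(src, "translate"), tgt) for src, tgt in translation_pairs]
    examples += [(tag_source(src, "stem"), tgt) for src, tgt in stemming_pairs]
    return examples
```

In this sketch the model architecture is untouched; only the training data changes, which matches the abstract's description of leaving the standard NMT architecture as it is.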

DGST: a Dual-Generator Network for Text Style Transfer
Xiao Li | Guanyi Chen | Chenghua Lin | Ruizhe Li
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer. Our model employs two generators only, and does not rely on any discriminators or parallel corpus for training. Both quantitative and qualitative experiments on the Yelp and IMDb datasets show that our model gives competitive performance compared to several strong baselines with more complicated architecture designs.

2019

GeoSQA: A Benchmark for Scenario-based Question Answering in the Geography Domain at High School Level
Zixian Huang | Yulin Shen | Xiao Li | Yu’ang Wei | Gong Cheng | Lin Zhou | Xinyu Dai | Yuzhong Qu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Scenario-based question answering (SQA) has attracted increasing research attention. It typically requires retrieving and integrating knowledge from multiple sources, and applying general knowledge to a specific case described by a scenario. SQA is common in the medical, geography, and legal domains, both in practice and in exams. In this paper, we introduce the GeoSQA dataset. It consists of 1,981 scenarios and 4,110 multiple-choice questions in the geography domain at high school level, where diagrams (e.g., maps, charts) have been manually annotated with natural language descriptions to benefit NLP research. Benchmark results on a variety of state-of-the-art methods for question answering, textual entailment, and reading comprehension demonstrate the unique challenges presented by SQA for future research.

A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification
Ruizhe Li | Chenghua Lin | Matthew Collinson | Xiao Li | Guanyi Chen
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Recognising dialogue acts (DA) is important for many natural language processing tasks such as dialogue generation and intention recognition. In this paper, we propose a dual-attention hierarchical recurrent neural network for DA classification. Our model is partially inspired by the observation that conversational utterances are normally associated with both a DA and a topic, where the former captures the social act and the latter describes the subject matter. However, such a dependency between DAs and topics has not been utilised by most existing systems for DA classification. With a novel dual task-specific attention mechanism, our model is able, for utterances, to capture information about both DAs and topics, as well as information about the interactions between them. Experimental results show that by modelling topic as an auxiliary task, our model can significantly improve DA classification, yielding better or comparable performance to the state-of-the-art method on three public datasets.

A Stable Variational Autoencoder for Text Modelling
Ruizhe Li | Xiao Li | Chenghua Lin | Matthew Collinson | Rui Mao
Proceedings of the 12th International Conference on Natural Language Generation

The Variational Autoencoder (VAE) is a powerful method for learning representations of high-dimensional data. However, VAEs can suffer from an issue known as latent variable collapse (or KL term vanishing), where the posterior collapses to the prior and the model ignores the latent codes in generative tasks. Such an issue is particularly prevalent when employing VAE-RNN architectures for text modelling (Bowman et al., 2016; Yang et al., 2017). In this paper, we present a new architecture called Full-Sampling-VAE-RNN, which can effectively avoid latent variable collapse. Compared to the general VAE-RNN architectures, we show that our model can achieve a much more stable training process and can generate text of significantly better quality.

2018

Statistical NLG for Generating the Content and Form of Referring Expressions
Xiao Li | Kees van Deemter | Chenghua Lin
Proceedings of the 11th International Conference on Natural Language Generation

This paper argues that a new generic approach to statistical NLG can be made to perform Referring Expression Generation (REG) successfully. The model does not only select attributes and values for referring to a target referent, but also performs Linguistic Realisation, generating an actual Noun Phrase. Our evaluations suggest that the attribute selection aspect of the algorithm exceeds classic REG algorithms, while the Noun Phrases generated are as similar to those in a previously developed corpus as were Noun Phrases produced by a new set of human speakers.

2017

Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus
Kees van Deemter | Le Sun | Rint Sybesma | Xiao Li | Bo Chen | Muyun Yang
Proceedings of the 10th International Conference on Natural Language Generation

East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number. We present the first Data-Text corpus for Referring Expressions in Mandarin, and we use this corpus to test some initial hypotheses inspired by the theoretical linguistics literature. Our findings suggest that function words deserve more attention in Referring Expressions Generation than they have so far received, and they have a bearing on the debate about whether different languages make different trade-offs between clarity and brevity.

Log-linear Models for Uyghur Segmentation in Spoken Language Translation
Chenggang Mi | Yating Yang | Rui Dong | Xi Zhou | Lei Wang | Xiao Li | Tonghai Jiang
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017

To alleviate data sparsity in spoken Uyghur machine translation, we propose a log-linear morphological segmentation approach. Instead of learning a model only from a monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpora. Our approach relies on several features, such as a traditional conditional random field (CRF) feature, a bilingual word alignment feature, and a monolingual suffix-word co-occurrence feature. Experimental results show that our proposed segmentation model for Uyghur spoken translation achieves an improvement of 1.6 BLEU points over the state-of-the-art baseline.

2016

Statistics-Based Lexical Choice for NLG from Quantitative Information
Xiao Li | Kees van Deemter | Chenghua Lin
Proceedings of the 9th International Natural Language Generation conference

Recurrent Neural Network Based Loanwords Identification in Uyghur
Chenggang Mi | Yating Yang | Xi Zhou | Lei Wang | Xiao Li | Tonghai Jiang
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Oral Papers

2010

Understanding the Semantic Structure of Noun Phrase Queries
Xiao Li
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

Semantic Tagging of Web Search Queries
Mehdi Manshadi | Xiao Li
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

On the Use of Virtual Evidence in Conditional Random Fields
Xiao Li
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Discovery of Term Variation in Japanese Web Search Queries
Hisami Suzuki | Xiao Li | Jianfeng Gao
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

Learning N-Best Correction Models from Implicit User Feedback in a Multi-Modal Local Search Application
Dan Bohus | Xiao Li | Patrick Nguyen | Geoffrey Zweig
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue

2005

The Vocal Joystick: A Voice-Based Human-Computer Interface for Individuals with Motor Impairments
Jeff A. Bilmes | Xiao Li | Jonathan Malkin | Kelley Kilanski | Richard Wright | Katrin Kirchhoff | Amar Subramanya | Susumu Harada | James Landay | Patricia Dowden | Howard Chizeck
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing