Jian Wang


pdf bib
Domain-specific knowledge distillation yields smaller and better models for conversational commerce
Kristen Howell | Jian Wang | Akshay Hazare | Joseph Bradley | Chris Brew | Xi Chen | Matthew Dunn | Beth Hockey | Andrew Maurer | Dominic Widdows
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

We demonstrate that knowledge distillation can be used not only to reduce model size, but to simultaneously adapt a contextual language model to a specific domain. We use Multilingual BERT (mBERT; Devlin et al., 2019) as a starting point and follow the knowledge distillation approach of (Sahn et al., 2019) to train a smaller multilingual BERT model that is adapted to the domain at hand. We show that for in-domain tasks, the domain-specific model shows on average 2.3% improvement in F1 score, relative to a model distilled on domain-general data. Whereas much previous work with BERT has fine-tuned the encoder weights during task training, we show that the model improvements from distillation on in-domain data persist even when the encoder weights are frozen during task training, allowing a single encoder to support classifiers for multiple tasks and languages.

pdf bib
多特征融合的越英端到端语音翻译方法(A Vietnamese-English end-to-end speech translation method based on multi-feature fusion)
Houli Ma (马候丽) | Ling Dong (董凌) | Wenjun Wang (王文君) | Jian Wang (王剑) | Shengxiang Gao (高盛祥) | Zhengtao Yu (余正涛)
Proceedings of the 21st Chinese National Conference on Computational Linguistics


pdf bib
Two Languages Are Better than One: Bilingual Enhancement for Chinese Named Entity Recognition
Jinzhong Ning | Zhihao Yang | Zhizheng Wang | Yuanyuan Sun | Hongfei Lin | Jian Wang
Proceedings of the 29th International Conference on Computational Linguistics

Chinese Named Entity Recognition (NER) has continued to attract research attention. However, most existing studies only explore the internal features of the Chinese language but neglect other lingual modal features. Actually, as another modal knowledge of the Chinese language, English contains rich prompts about entities that can potentially be applied to improve the performance of Chinese NER. Therefore, in this study, we explore the bilingual enhancement for Chinese NER and propose a unified bilingual interaction module called the Adapted Cross-Transformers with Global Sparse Attention (ACT-S) to capture the interaction of bilingual information. We utilize a model built upon several different ACT-Ss to integrate the rich English information into the Chinese representation. Moreover, our model can learn the interaction of information between bilinguals (inter-features) and the dependency information within Chinese (intra-features). Compared with existing Chinese NER methods, our proposed model can better handle entities with complex structures. The English text that enhances the model is automatically generated by machine translation, avoiding high labour costs. Experimental results on four well-known benchmark datasets demonstrate the effectiveness and robustness of our proposed model.

pdf bib
RealMedDial: A Real Telemedical Dialogue Dataset Collected from Online Chinese Short-Video Clips
Bo Xu | Hongtong Zhang | Jian Wang | Xiaokun Zhang | Dezhi Hao | Linlin Zong | Hongfei Lin | Fenglong Ma
Proceedings of the 29th International Conference on Computational Linguistics

Intelligent medical services have attracted great research interests for providing automated medical consultation. However, the lack of corpora becomes a main obstacle to related research, particularly data from real scenarios. In this paper, we construct RealMedDial, a Chinese medical dialogue dataset based on real medical consultation. RealMedDial contains 2,637 medical dialogues and 24,255 utterances obtained from Chinese short-video clips of real medical consultations. We collected and annotated a wide range of meta-data with respect to medical dialogue including doctor profiles, hospital departments, diseases and symptoms for fine-grained analysis on language usage pattern and clinical diagnosis. We evaluate the performance of medical response generation, department routing and doctor recommendation on RealMedDial. Results show that RealMedDial are applicable to a wide range of NLP tasks with respect to medical dialogue.


pdf bib
TransS-Driven Joint Learning Architecture for Implicit Discourse Relation Recognition
Ruifang He | Jian Wang | Fengyu Guo | Yugui Han
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Implicit discourse relation recognition is a challenging task due to the lack of connectives as strong linguistic clues. Previous methods primarily encode two arguments separately or extract the specific interaction patterns for the task, which have not fully exploited the annotated relation signal. Therefore, we propose a novel TransS-driven joint learning architecture to address the issues. Specifically, based on the multi-level encoder, we 1) translate discourse relations in low-dimensional embedding space (called TransS), which could mine the latent geometric structure information of argument-relation instances; 2) further exploit the semantic features of arguments to assist discourse understanding; 3) jointly learn 1) and 2) to mutually reinforce each other to obtain the better argument representations, so as to improve the performance of the task. Extensive experimental results on the Penn Discourse TreeBank (PDTB) show that our model achieves competitive results against several state-of-the-art systems.

pdf bib
Dual Dynamic Memory Network for End-to-End Multi-turn Task-oriented Dialog Systems
Jian Wang | Junhao Liu | Wei Bi | Xiaojiang Liu | Kejing He | Ruifeng Xu | Min Yang
Proceedings of the 28th International Conference on Computational Linguistics

Existing end-to-end task-oriented dialog systems struggle to dynamically model long dialog context for interactions and effectively incorporate knowledge base (KB) information into dialog generation. To conquer these limitations, we propose a Dual Dynamic Memory Network (DDMN) for multi-turn dialog generation, which maintains two core components: dialog memory manager and KB memory manager. The dialog memory manager dynamically expands the dialog memory turn by turn and keeps track of dialog history with an updating mechanism, which encourages the model to filter irrelevant dialog history and memorize important newly coming information. The KB memory manager shares the structural KB triples throughout the whole conversation, and dynamically extracts KB information with a memory pointer at each turn. Experimental results on three benchmark datasets demonstrate that DDMN significantly outperforms the strong baselines in terms of both automatic evaluation and human evaluation. Our code is available at https://github.com/siat-nlp/DDMN.


pdf bib
WECA: A WordNet-Encoded Collocation-Attention Network for Homographic Pun Recognition
Yufeng Diao | Hongfei Lin | Di Wu | Liang Yang | Kan Xu | Zhihao Yang | Jian Wang | Shaowu Zhang | Bo Xu | Dongyu Zhang
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Homographic puns have a long history in human writing, widely used in written and spoken literature, which usually occur in a certain syntactic or stylistic structure. How to recognize homographic puns is an important research. However, homographic pun recognition does not solve very well in existing work. In this work, we first use WordNet to understand and expand word embedding for settling the polysemy of homographic puns, and then propose a WordNet-Encoded Collocation-Attention network model (WECA) which combined with the context weights for recognizing the puns. Our experiments on the SemEval2017 Task7 and Pun of the Day demonstrate that the proposed model is able to distinguish between homographic pun and non-homographic pun texts. We show the effectiveness of the model to present the capability of choosing qualitatively informative words. The results show that our model achieves the state-of-the-art performance on homographic puns recognition.

pdf bib
Think Visually: Question Answering through Virtual Imagery
Ankit Goyal | Jian Wang | Jia Deng
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we study the problem of geometric reasoning (a form of visual reasoning) in the context of question-answering. We introduce Dynamic Spatial Memory Network (DSMN), a new deep network architecture that specializes in answering questions that admit latent visual representations, and learns to generate and reason over such representations. Further, we propose two synthetic benchmarks, FloorPlanQA and ShapeIntersection, to evaluate the geometric reasoning capability of QA systems. Experimental results validate the effectiveness of our proposed DSMN for visual thinking tasks.


pdf bib
Alibaba at IJCNLP-2017 Task 2: A Boosted Deep System for Dimensional Sentiment Analysis of Chinese Phrases
Xin Zhou | Jian Wang | Xu Xie | Changlong Sun | Luo Si
Proceedings of the IJCNLP 2017, Shared Tasks

This paper introduces Team Alibaba’s systems participating IJCNLP 2017 shared task No. 2 Dimensional Sentiment Analysis for Chinese Phrases (DSAP). The systems mainly utilize a multi-layer neural networks, with multiple features input such as word embedding, part-of-speech-tagging (POST), word clustering, prefix type, character embedding, cross sentiment input, and AdaBoost method for model training. For word level task our best run achieved MAE 0.545 (ranked 2nd), PCC 0.892 (ranked 2nd) in valence prediction and MAE 0.857 (ranked 1st), PCC 0.678 (ranked 2nd) in arousal prediction. For average performance of word and phrase task we achieved MAE 0.5355 (ranked 3rd), PCC 0.8965 (ranked 3rd) in valence prediction and MAE 0.661 (ranked 3rd), PCC 0.766 (ranked 2nd) in arousal prediction. In the final our submitted system achieved 2nd in mean rank.


pdf bib
DUTIR in BioNLP-ST 2016: Utilizing Convolutional Network and Distributed Representation to Extract Complicate Relations
Honglei Li | Jianhai Zhang | Jian Wang | Hongfei Lin | Zhihao Yang
Proceedings of the 4th BioNLP Shared Task Workshop


pdf bib
Biography-Dependent Collaborative Entity Archiving for Slot Filling
Yu Hong | Xiaobin Wang | Yadong Chen | Jian Wang | Tongtao Zhang | Heng Ji
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing