2019
Chameleon: A Language Model Adaptation Toolkit for Automatic Speech Recognition of Conversational Speech
Yuanfeng Song
|
Di Jiang
|
Weiwei Zhao
|
Qian Xu
|
Raymond Chi-Wing Wong
|
Qiang Yang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations
Language models are a vital component of modern automatic speech recognition (ASR) systems. Since a “one-size-fits-all” language model works suboptimally for conversational speech, language model adaptation (LMA) is considered a promising solution to this problem. To compare state-of-the-art LMA techniques and systematically demonstrate their effect on conversational speech recognition, we develop a novel toolkit named Chameleon, which includes state-of-the-art cache-based and topic-based LMA techniques. This demonstration not only vividly visualizes the underlying working mechanisms of a variety of state-of-the-art LMA models but also provides an interface for users to customize their hyperparameters. With this demonstration, the audience can experience the effect of LMA in an interactive and real-time fashion. We hope this demonstration will inspire more research on better language modeling techniques for ASR.
DAL: Dual Adversarial Learning for Dialogue Generation
Shaobo Cui
|
Rongzhong Lian
|
Di Jiang
|
Yuanfeng Song
|
Siqi Bao
|
Yong Jiang
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation
In open-domain dialogue systems, generative approaches have attracted much attention for response generation. However, existing methods are heavily plagued by two problems: generating safe responses and generating unnatural responses. To alleviate both problems, we propose a novel framework named Dual Adversarial Learning (DAL) for high-quality response generation. DAL innovatively exploits the duality between query generation and response generation to avoid safe responses and increase the diversity of the generated responses. Additionally, DAL uses adversarial learning to mimic human judges and guide the system toward generating natural responses. Experimental results demonstrate that DAL effectively improves both the diversity and the overall quality of the generated responses, outperforming state-of-the-art methods on both automatic metrics and human evaluations.
2016
Latent Topic Embedding
Di Jiang
|
Lei Shi
|
Rongzhong Lian
|
Hua Wu
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Topic modeling and word embedding are two important techniques for deriving latent semantics from data. General-purpose topic models typically work at coarse granularity, capturing word co-occurrence at the document or sentence level. In contrast, word embedding models usually work at much finer granularity, modeling word co-occurrence within small sliding windows. With the aim of deriving latent semantics by considering word co-occurrence at different levels of granularity, we propose a novel model named Latent Topic Embedding (LTE), which seamlessly integrates topic generation and embedding learning in one unified framework. We further propose an efficient Monte Carlo EM algorithm to estimate the parameters of interest. By retaining the individual advantages of topic modeling and word embedding, LTE produces better latent topics and word embeddings. Extensive experiments verify the superiority of LTE over state-of-the-art methods.
2006
A Full Inspection on Chinese Characters Used in the Secret History of the Mongols
Di Jiang
|
Xuewen Zhou
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
The Current Status of Sorting Order of Tibetan Dictionaries and Standardization
Di Jiang
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
2005
The Verbal Entries and Their Description in a Grammatical Information-Dictionary of Contemporary Tibetan
Di Jiang
|
Congjun Long
|
Jichuan Zhang
Second International Joint Conference on Natural Language Processing: Full Papers