Liner Yang


pdf bib
Multitasking Framework for Unsupervised Simple Definition Generation
Cunliang Kong | Yun Chen | Hengyuan Zhang | Liner Yang | Erhong Yang
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The definition generation task can help language learners by providing explanations for unfamiliar words. This task has attracted much attention in recent years. We propose a novel task of Simple Definition Generation (SDG) to help language learners and low literacy readers. A significant challenge of this task is the lack of learner’s dictionaries in many languages, and therefore the lack of data for supervised training. We explore this task and propose a multitasking framework SimpDefiner that only requires a standard dictionary with complex definitions and a corpus containing arbitrary simple texts. We disentangle the complexity factors from the text by carefully designing a parameter sharing scheme between two decoders. By jointly training these components, the framework can generate both complex and simple definitions simultaneously. We demonstrate that the framework can generate relevant, simple definitions for the target words through automatic and manual evaluations on English and Chinese datasets. Our method outperforms the baseline model by a 1.77 SARI score on the English dataset, and raises the proportion of the low level (HSK level 1-3) words in Chinese definitions by 3.87%.


pdf bib
中美学者学术英语写作中词汇难度特征比较研究——以计算语言学领域论文为例(A Comparative Study of the Features of Lexical Sophistication in Academic English Writing by Chinese and American)
Yonghui Xie (谢永慧) | Yang Liu (刘洋) | Erhong Yang (杨尔弘) | Liner Yang (杨麟儿)
Proceedings of the 20th Chinese National Conference on Computational Linguistics


pdf bib
Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction
Zhenghao Liu | Xiaoyuan Yi | Maosong Sun | Liner Yang | Tat-Seng Chua
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Grammatical Error Correction (GEC) aims to correct writing errors and help language learners improve their writing skills. However, existing GEC models tend to produce spurious corrections or fail to detect lots of errors. The quality estimation model is necessary to ensure learners get accurate GEC results and avoid misleading from poorly corrected sentences. Well-trained GEC models can generate several high-quality hypotheses through decoding, such as beam search, which provide valuable GEC evidence and can be used to evaluate GEC quality. However, existing models neglect the possible GEC evidence from different hypotheses. This paper presents the Neural Verification Network (VERNet) for GEC quality estimation with multiple hypotheses. VERNet establishes interactions among hypotheses with a reasoning graph and conducts two kinds of attention mechanisms to propagate GEC evidence to verify the quality of generated hypotheses. Our experiments on four GEC datasets show that VERNet achieves state-of-the-art grammatical error detection performance, achieves the best quality estimation results, and significantly improves GEC performance by reranking hypotheses. All data and source codes are available at


pdf bib
面向汉语作为第二语言学习的个性化语法纠错(Personalizing Grammatical Error Correction for Chinese as a Second Language)
Shengsheng Zhang (张生盛) | Guina Pang (庞桂娜) | Liner Yang (杨麟儿) | Chencheng Wang (王辰成) | Yongping Du (杜永萍) | Erhong Yang (杨尔弘) | Yaping Huang (黄雅平)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


pdf bib
基于BERT与柱搜索的中文释义生成(Chinese Definition Modeling Based on BERT and Beam Seach)
Qinan Fan (范齐楠) | Cunliang Kong (孔存良) | Liner Yang (杨麟儿) | Erhong Yang (杨尔弘)
Proceedings of the 19th Chinese National Conference on Computational Linguistics


pdf bib
汉语学习者依存句法树库构建(Construction of a Treebank of Learner Chinese)
Jialu Shi (师佳璐) | Xinyu Luo (罗昕宇) | Liner Yang (杨麟儿) | Dan Xiao (肖丹) | Zhengsheng Hu (胡正声) | Yijun Wang (王一君) | Jiaxin Yuan (袁佳欣) | Yu Jingsi (余婧思) | Erhong Yang (杨尔弘)
Proceedings of the 19th Chinese National Conference on Computational Linguistics



pdf bib
The BLCU System in the BEA 2019 Shared Task
Liner Yang | Chencheng Wang
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

This paper describes the BLCU Group submissions to the Building Educational Applications (BEA) 2019 Shared Task on Grammatical Error Correction (GEC). The task is to detect and correct grammatical errors that occurred in essays. We participate in 2 tracks including the Restricted Track and the Unrestricted Track. Our system is based on a Transformer model architecture. We integrate many effective methods proposed in recent years. Such as, Byte Pair Encoding, model ensemble, checkpoints average and spell checker. We also corrupt the public monolingual data to further improve the performance of the model. On the test data of the BEA 2019 Shared Task, our system yields F0.5 = 58.62 and 59.50, ranking twelfth and fourth respectively.


pdf bib
A Novel Fast Framework for Topic Labeling Based on Similarity-preserved Hashing
Xian-Ling Mao | Yi-Jing Hao | Qiang Zhou | Wen-Qing Yuan | Liner Yang | Heyan Huang
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Recently, topic modeling has been widely applied in data mining due to its powerful ability. A common, major challenge in applying such topic models to other tasks is to accurately interpret the meaning of each topic. Topic labeling, as a major interpreting method, has attracted significant attention recently. However, most of previous works only focus on the effectiveness of topic labeling, and less attention has been paid to quickly creating good topic descriptors; meanwhile, it’s hard to assign labels for new emerging topics by using most of existing methods. To solve the problems above, in this paper, we propose a novel fast topic labeling framework that casts the labeling problem as a k-nearest neighbor (KNN) search problem in a probability vector set. Our experimental results show that the proposed sequential interleaving method based on locality sensitive hashing (LSH) technology is efficient in boosting the comparison speed among probability distributions, and the proposed framework can generate meaningful labels to interpret topics, including new emerging topics.