Liu Jian
2022
Towards Making the Most of Pre-trained Translation Model for Quality Estimation
Li Chunyou | Di Hui | Huang Hui | Ouchi Kazushige | Chen Yufeng | Liu Jian | Xu Jinan
Proceedings of the 21st Chinese National Conference on Computational Linguistics
Machine translation quality estimation (QE) aims to evaluate the quality of machine translation automatically without relying on any reference. One common practice is applying the translation model as a feature extractor. However, there exist several discrepancies between the translation model and the QE model. The translation model is trained in an autoregressive manner, while the QE model operates in a non-autoregressive manner. Besides, the translation model only learns to model human-crafted parallel data, while the QE model needs to model machine-translated noisy data. In order to bridge these discrepancies, we propose two strategies to post-train the translation model, namely Conditional Masked Language Modeling (CMLM) and Denoising Restoration (DR). Specifically, CMLM learns to predict masked tokens at the target side conditioned on the source sentence. DR first introduces noise to the target side of parallel data, and the model is trained to detect and recover the introduced noise. Both strategies can adapt the pre-trained translation model to the QE-style prediction task. Experimental results show that our model achieves impressive results, significantly outperforming the baseline model, verifying the effectiveness of our proposed methods.
2021
Improving Low-Resource Named Entity Recognition via Label-Aware Data Augmentation and Curriculum Denoising
Zhu Wenjing | Liu Jian | Xu Jinan | Chen Yufeng | Zhang Yujie
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Deep neural networks have achieved state-of-the-art performance on named entity recognition (NER) with sufficient training data, while they perform poorly in low-resource scenarios due to data scarcity. To solve this problem, we propose a novel data augmentation method based on a pre-trained language model (PLM) and a curriculum learning strategy. Concretely, we use the PLM to generate diverse training instances by predicting different masked words, and design a task-specific curriculum learning strategy to alleviate the influence of noise. We evaluate the effectiveness of our approach on three datasets: CoNLL-2003, OntoNotes5.0, and MaScip, of which the first two are simulated low-resource scenarios and the last one is a real low-resource dataset in the material science domain. Experimental results show that our method consistently outperforms the baseline model. Specifically, our method achieves an absolute improvement of 3.46% F1 score on the 1% CoNLL-2003, 2.58% on the 1% OntoNotes5.0, and 0.99% on the full MaScip dataset.
Co-authors
- Xu Jinan 2
- Chen Yufeng 2
- Li Chunyou 1
- Di Hui 1
- Huang Hui 1
Venues
- ccl 2