JANUS: Joint Autoregressive and Non-autoregressive Training with Auxiliary Loss for Sequence Generation
Xiaobo Liang | Lijun Wu | Juntao Li | Min Zhang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Transformer-based autoregressive and non-autoregressive models have played an essential role in sequence generation tasks. The autoregressive (AR) model can obtain excellent performance, while the non-autoregressive (NAR) model brings fast decoding speed for inference. In this paper, we propose JANUS, a Joint Autoregressive and Non-autoregressive training method using aUxiliary losS to enhance the model performance in both AR and NAR manners simultaneously and effectively alleviate the problem of distribution discrepancy. Further, we pre-train BART with JANUS on a large corpus with minimal cost (16 GPU days) and make the BART-JANUS capable of non-autoregressive generation, demonstrating that our approach can transfer the AR knowledge to NAR. Empirically, we show our approach and BART-JANUS can achieve significant improvement on multiple generation tasks, including machine translation and GLGE benchmarks. Our code is available at GitHub.
Cross-Domain NER using Cross-Domain Language Modeling
Chen Jia | Xiaobo Liang | Yue Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Due to the limited availability of labeled resources, cross-domain named entity recognition (NER) has been a challenging task. Most existing work considers a supervised setting, making use of labeled data for both the source and target domains. A disadvantage of such methods is that they cannot train for domains without NER data. To address this issue, we consider using cross-domain language modeling (LM) as a bridge across domains for NER domain adaptation, performing cross-domain and cross-task knowledge transfer by designing a novel parameter generation network. Results show that our method can effectively extract domain differences from cross-domain LM contrast, allowing unsupervised domain adaptation while also giving state-of-the-art results among supervised domain adaptation methods.
Neural Relation Classification with Text Descriptions
Feiliang Ren | Di Zhou | Zhihui Liu | Yongcheng Li | Rongsheng Zhao | Yongkang Liu | Xiaobo Liang
Proceedings of the 27th International Conference on Computational Linguistics
Relation classification is an important task in natural language processing. State-of-the-art methods usually concentrate on building deep neural network-based classification models on training data in which the relations of the labeled entity pairs are given. However, these methods usually suffer greatly from the data sparsity issue. On the other hand, we notice that it is very easy to obtain concise text descriptions for almost all of the entities in a relation classification task. These text descriptions can provide helpful supplementary information for relation classification, but they are ignored by most existing methods. In this paper, we propose DesRC, a new neural relation classification method which integrates entities’ text descriptions into deep neural network models. We design a two-level attention mechanism to select the most useful information from the “intra-sentence” aspect and the “cross-sentence” aspect. In addition, adversarial training is used to further improve the classification performance. Finally, we evaluate the proposed method on the SemEval 2010 dataset. Extensive experiments show that our method achieves much better experimental results than other state-of-the-art relation classification methods.