X-PuDu at SemEval-2022 Task 7: A Replaced Token Detection Task Pre-trained Model with Pattern-aware Ensembling for Identifying Plausible Clarifications
Junyuan Shang | Shuohuan Wang | Yu Sun | Yanjun Yu | Yue Zhou | Li Xiang | Guixiu Yang
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
This paper describes our winning system on SemEval 2022 Task 7: Identifying Plausible Clarifications ofImplicit and Underspecified Phrases in Instructional Texts. A replaced token detection pre-trained model is utilized with minorly different task-specific heads for SubTask-A: Multi-class Classification and SubTask-B: Ranking. Incorporating a pattern-aware ensemble method, our system achieves a 68.90% accuracy score and 0.8070 spearman’s rank correlation score surpassing the 2nd place with a large margin by 2.7 and 2.2 percent points for SubTask-A and SubTask-B, respectively. Our approach is simple and easy to implement, and we conducted ablation studies and qualitative and quantitative analyses for the working strategies used in our system.
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
SiYu Ding | Junyuan Shang | Shuohuan Wang | Yu Sun | Hao Tian | Hua Wu | Haifeng Wang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Transformers are not suited for processing long documents, due to their quadratically increasing memory and time consumption. Simply truncating a long document or applying the sparse attention mechanism will incur the context fragmentation problem or lead to an inferior modeling capability against comparable model sizes. In this paper, we propose ERNIE-Doc, a document-level language pretraining model based on Recurrence Transformers. Two well-designed techniques, namely the retrospective feed mechanism and the enhanced recurrence mechanism, enable ERNIE-Doc, which has a much longer effective context length, to capture the contextual information of a complete document. We pretrain ERNIE-Doc to explicitly learn the relationships among segments with an additional document-aware segment-reordering objective. Various experiments were conducted on both English and Chinese document-level tasks. ERNIE-Doc improved the state-of-the-art language modeling result of perplexity to 16.8 on WikiText-103. Moreover, it outperformed competitive pretraining models by a large margin on most language understanding tasks, such as text classification and question answering.