Chunguang Pan


2021

MapRE: An Effective Semantic Mapping Approach for Low-resource Relation Extraction
Manqing Dong | Chunguang Pan | Zhipeng Luo
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Neural relation extraction models have shown promising results in recent years; however, model performance drops dramatically when only a few training samples are available. Recent work leverages advances in few-shot learning to address this low-resource problem by training label-agnostic models that directly compare the semantic similarity of context sentences in the embedding space. However, label-aware information, i.e., the relation label that carries the semantic knowledge of the relation itself, is often neglected at prediction time. In this work, we propose a framework that considers both label-agnostic and label-aware semantic mapping information for low-resource relation extraction. We show that incorporating these two types of mapping information in both pretraining and fine-tuning can significantly improve model performance on low-resource relation extraction tasks.

DeepBlueAI at TextGraphs 2021 Shared Task: Treating Multi-Hop Inference Explanation Regeneration as A Ranking Problem
Chunguang Pan | Bingyan Song | Zhipeng Luo
Proceedings of the Fifteenth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-15)

This paper describes the winning system for the TextGraphs 2021 shared task: multi-hop inference explanation regeneration. Given a question and its correct answer, the task is to select, from a large knowledge base, the facts that explain why the answer is correct. To address this problem while also accelerating training, our strategy has two steps. First, we fine-tune pre-trained language models (PLMs) with a triplet loss to recall the top-K relevant facts for each question-answer pair. Then, we adopt the same architecture to train a re-ranking model that ranks the top-K candidates. To further improve performance, we average the results of models based on different PLMs (e.g., RoBERTa) and different parameter settings to make the final predictions. The official evaluation shows that our system outperforms the second-best system by 4.93 points, which demonstrates its effectiveness. Our code is open source and available at https://github.com/DeepBlueAI/TextGraphs-15
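The recall-then-re-rank idea in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: the cosine-distance form of the triplet loss, the margin value, and the helper names `cosine`, `triplet_loss`, `recall_top_k`, and `rerank` are all assumptions made for illustration, and the toy vectors stand in for embeddings that a fine-tuned PLM would produce.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on cosine distance (assumed form).

    The loss is zero once the relevant fact is closer to the
    question-answer embedding than the irrelevant one by `margin`.
    """
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def recall_top_k(query_vec, fact_vecs, k):
    """Step 1: keep the k fact ids most similar to the query embedding."""
    ranked = sorted(
        fact_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [fact_id for fact_id, _ in ranked[:k]]


def rerank(candidates, scorer):
    """Step 2: reorder the recalled candidates with a second scoring function."""
    return sorted(candidates, key=scorer, reverse=True)


if __name__ == "__main__":
    # Hypothetical 2-d embeddings; a real system would use PLM outputs.
    facts = {"f1": [1.0, 0.0], "f2": [0.0, 1.0], "f3": [1.0, 1.0]}
    query = [1.0, 0.0]
    top = recall_top_k(query, facts, k=2)
    print(top)
    print(rerank(top, scorer=lambda fid: cosine(query, facts[fid])))
```

In this sketch the re-ranker only sees the k recalled candidates, so the more expensive second-stage scorer never has to run over the whole knowledge base.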

DeepBlueAI at SemEval-2021 Task 1: Lexical Complexity Prediction with A Deep Ensemble Approach
Chunguang Pan | Bingyan Song | Shengguang Wang | Zhipeng Luo
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

Lexical complexity plays an important role in reading comprehension. Lexical complexity prediction (LCP) can be used not only as a component of lexical simplification systems, but also as a stand-alone application that helps people read better. This paper presents the winning system we submitted to the LCP Shared Task of SemEval 2021, which is capable of handling both subtasks. We first fine-tune a number of pre-trained language models (PLMs) with various hyperparameters and different training strategies, such as pseudo-labelling and data augmentation. Then an effective stacking mechanism is applied on top of the fine-tuned PLMs to obtain the final prediction. Experimental results on the Complex dataset show the validity of our method: we rank first on subtask 2 and second on subtask 1.
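As a rough illustration of stacking for a regression target such as a complexity score, the sketch below fits a linear blend of two base models' held-out predictions by least squares. The abstract does not say which meta-learner was used, so the linear blend without an intercept, the restriction to two base models, and the names `fit_stacking_weights` and `stack_predict` are assumptions, and the numbers are made up.

```python
def fit_stacking_weights(preds_a, preds_b, targets):
    """Least-squares weights (w_a, w_b) minimising sum((w_a*a + w_b*b - y)^2).

    Solves the 2x2 normal equations with Cramer's rule. Falls back to a
    plain average when the two prediction vectors are collinear.
    """
    s_aa = sum(a * a for a in preds_a)
    s_bb = sum(b * b for b in preds_b)
    s_ab = sum(a * b for a, b in zip(preds_a, preds_b))
    s_ay = sum(a * y for a, y in zip(preds_a, targets))
    s_by = sum(b * y for b, y in zip(preds_b, targets))
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        return 0.5, 0.5
    w_a = (s_ay * s_bb - s_ab * s_by) / det
    w_b = (s_aa * s_by - s_ab * s_ay) / det
    return w_a, w_b


def stack_predict(pred_a, pred_b, weights):
    """Blend one prediction from each base model with the fitted weights."""
    w_a, w_b = weights
    return w_a * pred_a + w_b * pred_b


if __name__ == "__main__":
    # Hypothetical held-out predictions from two fine-tuned models.
    model_a = [0.2, 0.4, 0.6]
    model_b = [0.1, 0.5, 0.3]
    gold = [0.15, 0.45, 0.45]
    weights = fit_stacking_weights(model_a, model_b, gold)
    print(weights)
    print(stack_predict(0.3, 0.5, weights))
```

The weights are fitted on predictions for data the base models did not train on, which is what keeps the meta-learner from simply rewarding the most overfit base model.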

DeepBlueAI at SemEval-2021 Task 7: Detecting and Rating Humor and Offense with Stacking Diverse Language Model-Based Methods
Bingyan Song | Chunguang Pan | Shengguang Wang | Zhipeng Luo
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)

This paper describes the winning system for SemEval-2021 Task 7: Detecting and Rating Humor and Offense. Our strategy is to stack diverse pre-trained language models (PLMs) such as RoBERTa and ALBERT. We first fine-tune these two PLMs with various hyperparameters and different training strategies. Then a stacking mechanism is applied on top of the fine-tuned PLMs to obtain the final prediction. Experimental results on the dataset released by the task organizers show the validity of our method: we took first place on subtask 2 and third place on subtask 1a.

DeepBlueAI at WANLP-EACL2021 task 2: A Deep Ensemble-based Method for Sarcasm and Sentiment Detection in Arabic
Bingyan Song | Chunguang Pan | Shengguang Wang | Zhipeng Luo
Proceedings of the Sixth Arabic Natural Language Processing Workshop

Sarcasm is one of the main challenges for sentiment analysis systems because it expresses opinions through implicit, indirect phrasing, especially in Arabic. This paper presents the system we submitted to the Sarcasm and Sentiment Detection task of WANLP-2021, which is capable of handling both subtasks. We first fine-tune two kinds of pre-trained language models (PLMs) with different training strategies. Then an effective stacking mechanism is applied on top of the fine-tuned PLMs to obtain the final prediction. Experimental results on the ArSarcasm-v2 dataset show the effectiveness of our method: we rank third on subtask 1 and second on subtask 2.