MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis

Cong Chen, Jiansong Chen, Cao Liu, Fan Yang, Guanglu Wan, Jinxiong Xia


Abstract
Sentiment analysis is a fundamental task, and structured sentiment analysis (SSA) is an important component of it. However, traditional SSA suffers from two important issues: (1) a lack of interactive knowledge across different languages; (2) a small amount of annotated data, or even none at all. To address these problems, we incorporate data augmentation and auxiliary tasks into SSA within a cross-lingual pretrained language model. Specifically, we employ XLM-RoBERTa to enhance mutually interactive information when parallel data is available in the pretraining stage. Furthermore, we leverage two data augmentation strategies and auxiliary tasks to improve performance in few-label and zero-shot cross-lingual settings. Experiments demonstrate the effectiveness of our models, which rank first on the cross-lingual sub-task and second on the monolingual sub-task of SemEval-2022 Task 10.
Anthology ID:
2022.semeval-1.185
Volume:
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month:
July
Year:
2022
Address:
Seattle, United States
Editors:
Guy Emerson, Natalie Schluter, Gabriel Stanovsky, Ritesh Kumar, Alexis Palmer, Nathan Schneider, Siddharth Singh, Shyam Ratan
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1329–1335
URL:
https://aclanthology.org/2022.semeval-1.185
DOI:
10.18653/v1/2022.semeval-1.185
Cite (ACL):
Cong Chen, Jiansong Chen, Cao Liu, Fan Yang, Guanglu Wan, and Jinxiong Xia. 2022. MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1329–1335, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
MT-Speech at SemEval-2022 Task 10: Incorporating Data Augmentation and Auxiliary Task with Cross-Lingual Pretrained Language Model for Structured Sentiment Analysis (Chen et al., SemEval 2022)
PDF:
https://aclanthology.org/2022.semeval-1.185.pdf
Video:
https://aclanthology.org/2022.semeval-1.185.mp4
Data
MPQA Opinion Corpus