UoB at ProfNER 2021: Data Augmentation for Classification Using Machine Translation

Frances Adriana Laureano De Leon; Harish Tayyar Madabushi; Mark Lee

doi:10.18653/v1/2021.smm4h-1.23

Abstract

This paper describes the participation of the UoB-NLP team in the ProfNER-ST shared subtask 7a. The task aimed to detect mentions of professions in social media text. Our team experimented with two methods of improving the performance of pre-trained models: data augmentation through translation, and the merging of multiple language inputs. While the best-performing model on the test data was mBERT fine-tuned on data augmented using back-translation, the improvement is minor, possibly because multilingual pre-trained models such as mBERT already have access to the kind of information provided through back-translation and bilingual data.

Anthology ID: 2021.smm4h-1.23
Volume: Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task
Month: June
Year: 2021
Address: Mexico City, Mexico
Editors: Arjun Magge, Ari Klein, Antonio Miranda-Escalada, Mohammed Ali Al-garadi, Ilseyar Alimova, Zulfat Miftahutdinov, Eulalia Farre-Maduell, Salvador Lima Lopez, Ivan Flores, Karen O'Connor, Davy Weissenbacher, Elena Tutubalina, Abeed Sarker, Juan M Banda, Martin Krallinger, Graciela Gonzalez-Hernandez
Venue: SMM4H
Publisher: Association for Computational Linguistics
Pages: 115–117
URL: https://aclanthology.org/2021.smm4h-1.23/
DOI: 10.18653/v1/2021.smm4h-1.23
Cite (ACL): Frances Adriana Laureano De Leon, Harish Tayyar Madabushi, and Mark Lee. 2021. UoB at ProfNER 2021: Data Augmentation for Classification Using Machine Translation. In Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task, pages 115–117, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): UoB at ProfNER 2021: Data Augmentation for Classification Using Machine Translation (Laureano De Leon et al., SMM4H 2021)
PDF: https://aclanthology.org/2021.smm4h-1.23.pdf
