The University of Texas System Submission for the Code-Switching Workshop Shared Task 2018

Florian Janke, Tongrui Li, Eric Rincón, Gualberto Guzmán, Barbara Bullock, Almeida Jacqueline Toribio


Abstract
This paper describes the system for the Named Entity Recognition Shared Task of the Third Workshop on Computational Approaches to Linguistic Code-Switching (CALCS) submitted by the Bilingual Annotations Tasks (BATs) research group of the University of Texas. Our system uses several features to train a Conditional Random Field (CRF) model for classifying input words as Named Entities (NEs) using the Inside-Outside-Beginning (IOB) tagging scheme. We participated in the Modern Standard Arabic-Egyptian Arabic (MSA-EGY) and English-Spanish (ENG-SPA) tasks, achieving weighted average F-scores of 65.62 and 54.16, respectively. We also describe the performance of a deep neural network (NN) trained on a subset of the CRF features, which did not surpass CRF performance.
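The IOB scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the helper name `spans_to_iob`, the example sentence, and the entity labels `PER` and `LOC` are assumptions made only for illustration. It shows how entity spans over a token sequence map to per-token tags, which is the kind of label sequence a CRF tagger would be trained to predict.

```python
def spans_to_iob(tokens, spans):
    """Convert entity spans to IOB tags.

    `spans` is a list of (start, end, label) tuples over token indices,
    with `end` exclusive. The function name and span format are
    hypothetical, chosen for this illustration only.
    """
    tags = ["O"] * len(tokens)           # Outside by default
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # Beginning of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"       # Inside the entity
    return tags


# Made-up code-switched example sentence with illustrative labels.
tokens = ["Maria", "Lopez", "lives", "en", "Austin"]
spans = [(0, 2, "PER"), (4, 5, "LOC")]
for token, tag in zip(tokens, spans_to_iob(tokens, spans)):
    print(token, tag)
```

Each token receives exactly one tag, so a multi-word entity such as a full name is marked by a `B-` tag followed by one or more `I-` tags, and all other tokens are `O`.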
Anthology ID:
W18-3216
Volume:
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Thamar Solorio, Mona Diab, Julia Hirschberg
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
120–125
URL:
https://aclanthology.org/W18-3216
DOI:
10.18653/v1/W18-3216
Cite (ACL):
Florian Janke, Tongrui Li, Eric Rincón, Gualberto Guzmán, Barbara Bullock, and Almeida Jacqueline Toribio. 2018. The University of Texas System Submission for the Code-Switching Workshop Shared Task 2018. In Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pages 120–125, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
The University of Texas System Submission for the Code-Switching Workshop Shared Task 2018 (Janke et al., ACL 2018)
PDF:
https://aclanthology.org/W18-3216.pdf