Language-Independent Named Entity Analysis Using Parallel Projection and Rule-Based Disambiguation

James Mayfield, Paul McNamee, Cash Costello


Abstract
The 2017 shared task at the Balto-Slavic NLP workshop requires identifying coarse-grained named entities in seven languages, identifying each entity’s base form, and clustering name mentions across the multilingual set of documents. The absence of training data for building supervised classifiers adds to the difficulty. To complete the task, we first use publicly available parallel texts to project named entity recognition capability from English to each evaluation language. We do not attempt the subtask of identifying the non-inflected forms of names. Finally, we create cross-document entity identifiers by clustering named mentions using a procedure-based approach.
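The abstract does not spell out how projection is carried out, so the following is only a minimal sketch of one common form of annotation projection, assuming word alignments between an English sentence and its translation are already available. The function name `project_tags`, the sentence pair, the tag labels, and the alignment are all hypothetical illustrations, not the authors' implementation or data.

```python
from typing import List, Tuple


def project_tags(
    src_tags: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Copy entity tags from source tokens to the target tokens aligned to them.

    src_tags:  one tag per source (English) token, "O" for non-entities.
    alignment: (source_index, target_index) pairs from a word aligner.
    tgt_len:   number of tokens in the target sentence.
    """
    tgt_tags = ["O"] * tgt_len  # unaligned target tokens stay untagged
    for src_i, tgt_j in alignment:
        tag = src_tags[src_i]
        # Keep the first entity tag that reaches a target token.
        if tag != "O" and tgt_tags[tgt_j] == "O":
            tgt_tags[tgt_j] = tag
    return tgt_tags


# Hypothetical English/Polish pair with a one-to-one alignment (assumption).
english_tags = ["PER", "PER", "O", "LOC"]  # Angela Merkel visited Warsaw
polish = ["Angela", "Merkel", "odwiedziła", "Warszawę"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]

projected = project_tags(english_tags, alignment, len(polish))
for token, tag in zip(polish, projected):
    print(token, tag)
```

Target sentences tagged this way could then serve as training data for a tagger in the evaluation language, which is one way to work around the lack of supplied training data.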
Anthology ID:
W17-1414
Volume:
Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Month:
April
Year:
2017
Address:
Valencia, Spain
Venues:
BSNLP | WS
SIG:
SIGSLAV
Publisher:
Association for Computational Linguistics
Pages:
92–96
URL:
https://aclanthology.org/W17-1414
DOI:
10.18653/v1/W17-1414
Cite (ACL):
James Mayfield, Paul McNamee, and Cash Costello. 2017. Language-Independent Named Entity Analysis Using Parallel Projection and Rule-Based Disambiguation. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 92–96, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Language-Independent Named Entity Analysis Using Parallel Projection and Rule-Based Disambiguation (Mayfield et al., 2017)
PDF:
https://aclanthology.org/W17-1414.pdf