Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking

Yifei Li, Pratheeksha Nair, Kellin Pelrine, Reihaneh Rabbany


Abstract
Online escort advertisement websites are widely used for advertising victims of human trafficking. Domain experts agree that advertising multiple people in the same ad is a strong indicator of trafficking. Thus, extracting person names from the text of these ads can provide valuable clues for further analysis. However, Named-Entity Recognition (NER) on escort ads is challenging because the text is noisy and colloquial, and often lacks proper grammar and punctuation. Most existing state-of-the-art NER models fail to demonstrate satisfactory performance on this task. In this paper, we propose NEAT (Name Extraction Against Trafficking) for extracting person names. It combines classic rule-based and dictionary extractors with a contextualized language model to capture ambiguous names (e.g., penny, hazel), and it adapts to adversarial changes in the text by expanding its dictionary. NEAT shows a 19% average improvement in the F1 classification score for name extraction over the previous state of the art on two domain-specific datasets.
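The combination described in the abstract can be illustrated with a minimal sketch. It is not the NEAT implementation: the dictionary contents, the set of ambiguous names, the introduction-cue pattern, the `accept_in_context` callable (a stand-in for the contextualized language model), and the choice to expand the dictionary with rule-based hits are all assumptions made for illustration.

```python
import re

# Hypothetical toy dictionary and ambiguity list; the paper's resources are not shown here.
NAME_DICTIONARY = {"amy", "lily", "penny", "hazel"}
AMBIGUOUS_NAMES = {"penny", "hazel"}

# Hypothetical rule: the token that follows a self-introduction cue is a name.
INTRO_PATTERN = re.compile(
    r"\b(?:my name is|call me|ask for)\s+([a-z]+)", re.IGNORECASE
)


def rule_candidates(text):
    """Return lowercased tokens that follow an introduction cue."""
    return {m.group(1).lower() for m in INTRO_PATTERN.finditer(text)}


def dictionary_candidates(text, dictionary):
    """Return lowercased tokens of the text that appear in the dictionary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t in dictionary}


def extract_names(text, dictionary, accept_in_context=lambda token, text: True):
    """Combine rule and dictionary hits, then expand the dictionary in place.

    accept_in_context stands in for a contextualized language model that
    decides whether an ambiguous dictionary word is used as a name.
    """
    rule_hits = rule_candidates(text)
    names = set(rule_hits)
    for token in dictionary_candidates(text, dictionary):
        needs_context = token in AMBIGUOUS_NAMES and token not in rule_hits
        if not needs_context or accept_in_context(token, text):
            names.add(token)
    # Assumed expansion step: names found by rules are added to the
    # dictionary so that later ads can match them without a cue.
    dictionary |= rule_hits
    return names
```

In this sketch, an unseen name that follows a cue such as "call me" is picked up by the rule and then added to the dictionary, while a word like "penny" is kept only if the context check accepts it or a rule also matched it.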
Anthology ID:
2022.findings-acl.225
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2854–2868
URL:
https://aclanthology.org/2022.findings-acl.225
DOI:
10.18653/v1/2022.findings-acl.225
Cite (ACL):
Yifei Li, Pratheeksha Nair, Kellin Pelrine, and Reihaneh Rabbany. 2022. Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking. In Findings of the Association for Computational Linguistics: ACL 2022, pages 2854–2868, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking (Li et al., Findings 2022)
PDF:
https://aclanthology.org/2022.findings-acl.225.pdf
Data
WNUT 2017