Halidanmu Abudukelimu


Error Analysis of Uyghur Name Tagging: Language-specific Techniques and Remaining Challenges
Halidanmu Abudukelimu | Abudoukelimu Abulizi | Boliang Zhang | Xiaoman Pan | Di Lu | Heng Ji | Yang Liu
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


Embracing Non-Traditional Linguistic Resources for Low-resource Language Name Tagging
Boliang Zhang | Di Lu | Xiaoman Pan | Ying Lin | Halidanmu Abudukelimu | Heng Ji | Kevin Knight
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Current supervised name tagging approaches are inadequate for most low-resource languages due to the lack of annotated data and actionable linguistic knowledge. All supervised learning methods (including deep neural networks (DNN)) are sensitive to noise and thus they are not quite portable without massive clean annotations. We found that the F-scores of DNN-based name taggers drop rapidly (20%-30%) when we replace clean manual annotations with noisy annotations in the training data. We propose a new solution to incorporate many non-traditional language universal resources that are readily available but rarely explored in the Natural Language Processing (NLP) community, such as the World Atlas of Linguistic Structure, CIA names, PanLex and survival guides. We acquire and encode various types of non-traditional linguistic resources into a DNN name tagger. Experiments on three low-resource languages show that feeding linguistic knowledge can make DNN significantly more robust to noise, achieving 8%-22% absolute F-score gains on name tagging without using any human annotation.