Title: Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus
Authors: Isaac Caswell; Theresa Breiner; Daan van Esch; Ankur Bapna
Date: 2020-12
Type of resource: text
Proceedings: Proceedings of the 28th International Conference on Computational Linguistics
Editors: Donia Scott; Nuria Bel; Chengqing Zong
Publisher: International Committee on Computational Linguistics
Place: Barcelona, Spain (Online)
Genre: conference publication
Citation key: caswell-etal-2020-language
DOI: 10.18653/v1/2020.coling-main.579
URL: https://aclanthology.org/2020.coling-main.579/
Pages: 6588-6608