How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language

Shiyue Zhang, Ben Frey, Mohit Bansal


Abstract
More than 43% of the languages spoken in the world are endangered, and language loss currently occurs at an accelerated rate because of globalization and neocolonialism. Saving and revitalizing endangered languages has become very important for maintaining the cultural diversity on our planet. In this work, we focus on discussing how NLP can help revitalize endangered languages. We first suggest three principles that may help NLP practitioners to foster mutual understanding and collaboration with language communities, and we discuss three ways in which NLP can potentially assist in language education. We then take Cherokee, a severely endangered Native American language, as a case study. After reviewing the language’s history, linguistic features, and existing resources, we (in collaboration with Cherokee community members) arrive at a few meaningful ways NLP practitioners can collaborate with community partners. We suggest two approaches to enrich the Cherokee language’s resources with machine-in-the-loop processing, and discuss several NLP tools that people from the Cherokee community have shown interest in. We hope that our work serves not only to inform the NLP community about Cherokee, but also to provide inspiration for future work on endangered languages in general.
Anthology ID:
2022.acl-long.108
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1529–1541
URL:
https://aclanthology.org/2022.acl-long.108
DOI:
10.18653/v1/2022.acl-long.108
Cite (ACL):
Shiyue Zhang, Ben Frey, and Mohit Bansal. 2022. How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1529–1541, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
How can NLP Help Revitalize Endangered Languages? A Case Study and Roadmap for the Cherokee Language (Zhang et al., ACL 2022)
PDF:
https://aclanthology.org/2022.acl-long.108.pdf
Video:
https://aclanthology.org/2022.acl-long.108.mp4
Code:
zhangshiyue/revitalizecherokee