Piyapath Spencer


2024

Documenting Endangered Languages with LangDoc: A Wordlist-Based System and A Case Study on Moklen
Piyapath Spencer
Proceedings of the 3rd Workshop on NLP Applications to Field Linguistics (Field Matters 2024)

Language documentation, especially for languages lacking standardised writing systems, is a laborious and time-consuming process. This paper introduces LangDoc, a comprehensive system designed to address these challenges and improve the efficiency and accuracy of language documentation projects. LangDoc offers several features, including tools for managing, recording, and reviewing collected data. It operates both online and offline, which is crucial for fieldwork in remote locations. The paper also presents a comparative analysis of LangDoc’s efficiency relative to other methods. A case study of the Moklen language documentation project demonstrates how these features address the specific challenges of working with endangered languages and remote communities. Future development areas include integration with NLP tools for advanced linguistic analysis, emphasising LangDoc’s potential to support the preservation of language diversity.