Exploring a Choctaw Language Corpus with Word Vectors and Minimum Description Length
Jacqueline Brixey | David Sides | Timothy Vizthum | David Traum | Khalil Iskarous
Proceedings of the Twelfth Language Resources and Evaluation Conference
This work introduces additions to ChoCo, a multimodal corpus for the American indigenous language Choctaw. Using texts from the corpus, we develop new computational resources with two off-the-shelf tools: word2vec and Linguistica. Our work illustrates how these tools can be applied successfully to a small corpus.