Molyé: A Corpus-based Approach to Language Contact in Colonial France

Rasul Dent, Juliette Janes, Thibault Clerice, Pedro Ortiz Suarez, Benoît Sagot


Abstract
Whether several Creole languages that developed during the early modern period can be considered genetic descendants of European languages has been the subject of intense debate, in large part because evidence of intermediate forms is lacking. This work introduces a new open corpus, the Molyé corpus, which combines stereotypical representations of three kinds of language variation in Europe with early attestations of French-based Creole languages across a period of 400 years. It is intended to facilitate future research on the continuity between contact situations in Europe and Creolophone (former) colonies.
Anthology ID:
2024.nlp4dh-1.18
Volume:
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month:
November
Year:
2024
Address:
Miami, USA
Editors:
Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venue:
NLP4DH
Publisher:
Association for Computational Linguistics
Pages:
189–199
URL:
https://aclanthology.org/2024.nlp4dh-1.18
Cite (ACL):
Rasul Dent, Juliette Janes, Thibault Clerice, Pedro Ortiz Suarez, and Benoît Sagot. 2024. Molyé: A Corpus-based Approach to Language Contact in Colonial France. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 189–199, Miami, USA. Association for Computational Linguistics.
Cite (Informal):
Molyé: A Corpus-based Approach to Language Contact in Colonial France (Dent et al., NLP4DH 2024)
PDF:
https://aclanthology.org/2024.nlp4dh-1.18.pdf