How to Parse a Creole: When Martinican Creole Meets French

Ludovic Mompelat, Daniel Dakota, Sandra Kübler


Abstract
We investigate methods to develop a parser for Martinican Creole, a highly under-resourced language, using a French treebank. We compare transfer learning and multi-task learning models and examine different input features and strategies to handle the massive size imbalance between the treebanks. Surprisingly, we find that a simple concatenated (French + Martinican Creole) baseline yields optimal results even though it has access to only 80 Martinican Creole sentences. POS embeddings work better than lexical ones, but they suffer from negative transfer.
Anthology ID:
2022.coling-1.387
Volume:
Proceedings of the 29th International Conference on Computational Linguistics
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
4397–4406
URL:
https://aclanthology.org/2022.coling-1.387
Cite (ACL):
Ludovic Mompelat, Daniel Dakota, and Sandra Kübler. 2022. How to Parse a Creole: When Martinican Creole Meets French. In Proceedings of the 29th International Conference on Computational Linguistics, pages 4397–4406, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
Cite (Informal):
How to Parse a Creole: When Martinican Creole Meets French (Mompelat et al., COLING 2022)
PDF:
https://aclanthology.org/2022.coling-1.387.pdf