DAG: Dictionary-Augmented Generation for Disambiguation of Sentences in Endangered Uralic Languages using ChatGPT

Mika Hämäläinen


Abstract
We show that ChatGPT can be used to disambiguate lemmas in two endangered languages that ChatGPT is not proficient in, namely Erzya and Skolt Sami. We augment our prompt with dictionary translations of the candidate lemmas into a majority language, Finnish in our case. This dictionary-augmented generation approach results in 50% accuracy for Skolt Sami and 41% accuracy for Erzya. On closer inspection, many of the errors were of a kind that even an untrained human annotator would make.
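The prompt augmentation described in the abstract can be illustrated with a minimal sketch. The prompt wording, the function names `build_dag_prompt` and `accuracy`, and the placeholder lemmas are assumptions made for illustration only; they are not taken from the paper, and the Finnish glosses are arbitrary example words rather than real dictionary entries for Erzya or Skolt Sami.

```python
def build_dag_prompt(sentence, target, candidates):
    """Build a disambiguation prompt that lists each candidate lemma
    together with its majority-language (here Finnish) dictionary glosses.

    The prompt layout is hypothetical, not the one used in the paper.
    """
    lines = [
        f"Sentence: {sentence}",
        f"Ambiguous word: {target}",
        "Candidate lemmas with Finnish dictionary translations:",
    ]
    # Number the candidates so the model can answer with a single index.
    for i, (lemma, glosses) in enumerate(candidates.items(), start=1):
        lines.append(f"{i}. {lemma}: {', '.join(glosses)}")
    lines.append("Answer with the number of the correct lemma.")
    return "\n".join(lines)


def accuracy(predicted, gold):
    """Fraction of positions where the predicted lemma equals the gold lemma."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


if __name__ == "__main__":
    # Placeholder sentence and lemmas; a real run would use dictionary lookups.
    prompt = build_dag_prompt(
        sentence="w1 w2 w3",
        target="w2",
        candidates={"lemma_a": ["talo"], "lemma_b": ["kala", "kalastaa"]},
    )
    print(prompt)
```

The resulting string would be sent to the language model, and the returned index mapped back to a lemma; scoring the chosen lemmas against gold annotations with a function like `accuracy` gives figures of the kind reported in the abstract.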
Anthology ID:
2024.iwclul-1.4
Volume:
Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages
Month:
November
Year:
2024
Address:
Helsinki, Finland
Editors:
Mika Hämäläinen, Flammie Pirinen, Melany Macias, Mario Crespo Avila
Venue:
IWCLUL
SIG:
SIGUR
Publisher:
Association for Computational Linguistics
Pages:
36–40
URL:
https://aclanthology.org/2024.iwclul-1.4
Cite (ACL):
Mika Hämäläinen. 2024. DAG: Dictionary-Augmented Generation for Disambiguation of Sentences in Endangered Uralic Languages using ChatGPT. In Proceedings of the 9th International Workshop on Computational Linguistics for Uralic Languages, pages 36–40, Helsinki, Finland. Association for Computational Linguistics.
Cite (Informal):
DAG: Dictionary-Augmented Generation for Disambiguation of Sentences in Endangered Uralic Languages using ChatGPT (Hämäläinen, IWCLUL 2024)
PDF:
https://aclanthology.org/2024.iwclul-1.4.pdf