NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages

Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Linuwih, Bryan Wilie, Galih Muridan, Genta Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung


Anthology ID:
2023.ijcnlp-main.60
Volume:
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
November
Year:
2023
Address:
Nusa Dua, Bali
Editors:
Jong C. Park, Yuki Arase, Baotian Hu, Wei Lu, Derry Wijaya, Ayu Purwarianti, Adila Alfa Krisnadhi
Venues:
IJCNLP | AACL
Publisher:
Association for Computational Linguistics
Pages:
921–945
URL:
https://aclanthology.org/2023.ijcnlp-main.60
DOI:
10.18653/v1/2023.ijcnlp-main.60
Cite (ACL):
Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Dea Adhista, Emmanuel Dave, Sarah Oktavianti, Salsabil Akbar, Jhonson Lee, Nuur Shadieq, Tjeng Wawan Cenggoro, Hanung Linuwih, Bryan Wilie, Galih Muridan, Genta Winata, David Moeljadi, Alham Fikri Aji, Ayu Purwarianti, and Pascale Fung. 2023. NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 921–945, Nusa Dua, Bali. Association for Computational Linguistics.
Cite (Informal):
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages (Cahyawijaya et al., IJCNLP-AACL 2023)
PDF:
https://aclanthology.org/2023.ijcnlp-main.60.pdf