Universal Dependencies for Punjabi

Aryaman Arora


Abstract
We introduce the first Universal Dependencies treebank for Punjabi (written in the Gurmukhi script) and discuss corpus design and linguistic phenomena encountered in annotation. The treebank covers a variety of genres and has been annotated for POS tags, dependency relations, and graph-based Enhanced Dependencies. We aim to expand the diversity of coverage of Indo-Aryan languages in UD.
Anthology ID:
2022.lrec-1.613
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5705–5711
URL:
https://aclanthology.org/2022.lrec-1.613
Cite (ACL):
Aryaman Arora. 2022. Universal Dependencies for Punjabi. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 5705–5711, Marseille, France. European Language Resources Association.
Cite (Informal):
Universal Dependencies for Punjabi (Arora, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.613.pdf
Data
IndicCorp, Universal Dependencies