A Corpus of Adpositional Supersenses for Mandarin Chinese

Siyao Peng, Yang Liu, Yilun Zhu, Austin Blodgett, Yushi Zhao, Nathan Schneider


Abstract
Adpositions are frequent markers of semantic relations, but they are highly ambiguous and vary significantly from language to language. Moreover, there is a dearth of annotated corpora for investigating the cross-linguistic variation of adposition semantics, or for building multilingual disambiguation systems. This paper presents a corpus in which all adpositions have been semantically annotated in Mandarin Chinese; to the best of our knowledge, this is the first Chinese corpus to be broadly annotated with adposition semantics. Our approach adapts a framework that defined a general set of supersenses according to ostensibly language-independent semantic criteria, though its development focused primarily on English prepositions (Schneider et al., 2018). We find that the supersense categories are well-suited to Chinese adpositions despite syntactic differences from English. On a Mandarin translation of The Little Prince, we achieve high inter-annotator agreement and analyze semantic correspondences of adposition tokens in bitext.
Anthology ID:
2020.lrec-1.733
Volume:
Proceedings of the 12th Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5986–5994
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.733
Cite (ACL):
Siyao Peng, Yang Liu, Yilun Zhu, Austin Blodgett, Yushi Zhao, and Nathan Schneider. 2020. A Corpus of Adpositional Supersenses for Mandarin Chinese. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 5986–5994, Marseille, France. European Language Resources Association.
Cite (Informal):
A Corpus of Adpositional Supersenses for Mandarin Chinese (Peng et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.733.pdf
Data
English Web Treebank, Universal Dependencies