Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese

Marco Antonio Sobrevilla Cabezudo, Thiago Pardo


Abstract
Abstract Meaning Representation (AMR) is a recent and prominent semantic representation that is well accepted and has several applications in Natural Language Processing. For English, there is a large annotated corpus (approximately 39K sentences) that supports research on the representation. However, to the best of our knowledge, there is only one restricted corpus for Portuguese, which contains 1,527 sentences. In this context, this paper presents an effort to build a general-purpose AMR-annotated corpus for Brazilian Portuguese by translating and adapting the English AMR guidelines. Our results show that such an approach is feasible, but some challenging phenomena remain to be solved. Moreover, efforts are needed to increase the coverage of the lexical resource that supports the annotation.
Anthology ID:
W19-4028
Volume:
Proceedings of the 13th Linguistic Annotation Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Venue:
LAW
SIG:
SIGANN
Publisher:
Association for Computational Linguistics
Pages:
236–244
URL:
https://aclanthology.org/W19-4028
DOI:
10.18653/v1/W19-4028
Cite (ACL):
Marco Antonio Sobrevilla Cabezudo and Thiago Pardo. 2019. Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese. In Proceedings of the 13th Linguistic Annotation Workshop, pages 236–244, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Towards a General Abstract Meaning Representation Corpus for Brazilian Portuguese (Sobrevilla Cabezudo & Pardo, LAW 2019)
PDF:
https://aclanthology.org/W19-4028.pdf