Identification and Analysis of Personification in Hungarian: The PerSECorp project

Gábor Simon


Abstract
Despite recent findings on the conceptual and linguistic organization of personification, we know relatively little about its lexical patterns and grammatical templates. This is especially true of Hungarian, which remains understudied with regard to the constructions that generate figurative meaning. The present paper offers a corpus-driven approach to personification analysis in the framework of cognitive linguistics. The approach is based on building a semi-automatically processed research corpus (the PerSE corpus) in which personifying linguistic structures are annotated manually. The current test version of the corpus consists of online car reviews written in Hungarian (10,468 words altogether): the texts were tokenized, lemmatized, morphologically analyzed, syntactically parsed, and PoS-tagged with the e-magyar NLP tool. Personifications were identified with an adaptation of the MIPVU protocol, combined with an additional analysis of semantic relations within personifying multi-word expressions. The paper presents the structure of the corpus and the levels of annotation. It also gives an overview of the data types that emerge from the analysis: lexical patterns, grammatical characteristics, and the construction-like behavior of personifications in Hungarian.
Anthology ID:
2022.lrec-1.292
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2730–2738
URL:
https://aclanthology.org/2022.lrec-1.292
Cite (ACL):
Gábor Simon. 2022. Identification and Analysis of Personification in Hungarian: The PerSECorp project. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 2730–2738, Marseille, France. European Language Resources Association.
Cite (Informal):
Identification and Analysis of Personification in Hungarian: The PerSECorp project (Simon, LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.292.pdf