Construction and Validation of a Japanese Honorific Corpus Based on Systemic Functional Linguistics

Muxuan Liu, Ichiro Kobayashi


Abstract
Japanese uses different expressions, called honorifics, depending on the social status of the speaker and the listener. Unlike many other languages, Japanese has many types of honorific expressions, and machine translation and dialogue systems must handle the differences in meaning among them correctly. However, there is still no corpus that treats honorific expressions in terms of social status. In this study, we developed an honorific corpus (the KeiCO corpus) that includes social status information based on Systemic Functional Linguistics, which describes language use in a situation in terms of a social group’s values and common understanding. As a general-purpose language resource, it fills this gap in Japanese honorific resources. We expect the KeiCO corpus to be helpful for various tasks, such as improving the accuracy of machine translation, automatic evaluation, correction of Japanese composition, and style transformation. We also verified the accuracy of our corpus with a BERT-based classification task.
Anthology ID:
2022.dclrl-1.3
Volume:
Proceedings of the Workshop on Dataset Creation for Lower-Resourced Languages within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Jonne Sälevä, Constantine Lignos
Venue:
DCLRL
Publisher:
European Language Resources Association
Pages:
19–26
URL:
https://aclanthology.org/2022.dclrl-1.3
Cite (ACL):
Muxuan Liu and Ichiro Kobayashi. 2022. Construction and Validation of a Japanese Honorific Corpus Based on Systemic Functional Linguistics. In Proceedings of the Workshop on Dataset Creation for Lower-Resourced Languages within the 13th Language Resources and Evaluation Conference, pages 19–26, Marseille, France. European Language Resources Association.
Cite (Informal):
Construction and Validation of a Japanese Honorific Corpus Based on Systemic Functional Linguistics (Liu & Kobayashi, DCLRL 2022)
PDF:
https://aclanthology.org/2022.dclrl-1.3.pdf