Muxuan Liu
2023
Constructing a Japanese Business Email Corpus Based on Social Situations
Muxuan Liu
|
Tatsuya Ishigaki
|
Yusuke Miyao
|
Hiroya Takamura
|
Ichiro Kobayashi
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation
2022
Construction and Validation of a Japanese Honorific Corpus Based on Systemic Functional Linguistics
Muxuan Liu
|
Ichiro Kobayashi
Proceedings of the Workshop on Dataset Creation for Lower-Resourced Languages within the 13th Language Resources and Evaluation Conference
In Japanese, there are different expressions used in speech depending on the speaker’s and listener’s social status, called honorifics. Unlike other languages, Japanese has many types of honorific expressions, and it is vital for machine translation and dialogue systems to handle the differences in meaning correctly. However, there is still no corpus that deals with honorific expressions based on social status. In this study, we developed an honorific corpus (KeiCO corpus) that includes social status information based on Systemic Functional Linguistics, which expresses language use in situations from the social group’s values and common understanding. As a general-purpose language resource, it filled in the Japanese honorific blanks. We expect the KeiCO corpus could be helpful for various tasks, such as improving the accuracy of machine translation, automatic evaluation, correction of Japanese composition and style transformation. We also verified the accuracy of our corpus by a BERT-based classification task.