Natsuko Nakagawa
2024
Language Atlas of Japanese and Ryukyuan (LAJaR): A Linguistic Typology Database for Endangered Japonic Languages
Kanji Kato
|
So Miyagawa
|
Natsuko Nakagawa
Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP
LAJaR (Language Atlas of Japanese and Ryukyuan) is a linguistic typology database focusing on micro-variation of the Japonic (Japanese and Ryukyuan) languages. This paper aims to report the design and progress of this ongoing database project. Finally, we also show a case study utilizing its database on zero copulas among the Japonic languages.
2012
Annotation of anaphoric relations and topic continuity in Japanese conversation
Natsuko Nakagawa
|
Yasuharu Den
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
This paper proposes a basic scheme for annotating anaphoric relations in Japanese conversations. More specifically, we propose methods of (i) dividing discourse segments into meaningful units, (ii) identifying zero pronouns and other overt anaphors, (iii) classifying zero pronouns, and (iv) identifying anaphoric relations. We discuss various kinds of problems involved in the annotation mainly caused by on-line processing of discourse and/or interactions between the participants. These problems do not arise in annotating written languages. This paper also proposes a method to compute topic continuity based on anaphoric relations. The topic continuity involves the information status of the noun in question (given, accessible, and new) and persistence (whether the noun is mentioned multiple times or not). We show that the topic continuity correlates with short-utterance units, which are determined prosodically through the previous annotations; nouns of high topic continuity tend to be prosodically separated from the predicates. This result indicates the validity of our annotations of anaphoric relations and topic continuity and the usefulness for further studies on discourse and interaction.
Search