Kimi Kaneko
2021
Tell Me What You Read: Automatic Expertise-Based Annotator Assignment for Text Annotation in Expert Domains
Hiyori Yoshikawa | Tomoya Iwakura | Kimi Kaneko | Hiroaki Yoshida | Yasutaka Kumano | Kazutaka Shimada | Rafal Rzepka | Patrycja Swieczkowska
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
This paper investigates the effectiveness of automatic annotator assignment for text annotation in expert domains. In the task of creating high-quality annotated corpora, expert domains often cover multiple sub-domains (e.g., organic and inorganic chemistry in the chemistry domain), either explicitly or implicitly. Therefore, it is crucial to assign annotators to documents relevant to their fine-grained domain expertise. However, most existing crowdsourcing methods estimate the reliability of each annotator or annotated instance only after the annotation process. To address this issue, we propose a method to estimate the domain expertise of each annotator before the annotation process, using information easily available from the annotators beforehand. We propose two measures to estimate annotator expertise: an explicit measure using the predefined categories of sub-domains, and an implicit measure using distributed representations of the documents. The experimental results on chemical name annotation tasks show that annotation accuracy improves when the explicit and implicit measures for annotator assignment are combined.
2016
Annotation and Analysis of Discourse Relations, Temporal Relations and Multi-Layered Situational Relations in Japanese Texts
Kimi Kaneko | Saku Sugawara | Koji Mineshima | Daisuke Bekki
Proceedings of the 12th Workshop on Asian Language Resources (ALR12)
This paper proposes a methodology for building a specialized Japanese data set for recognizing temporal relations and discourse relations. In addition to temporal and discourse relations, multi-layered situational relations that distinguish generic and specific states belonging to different layers in a discourse are annotated. Our methodology has been applied to 170 text fragments taken from Wikinews articles in Japanese. The validity of our methodology is evaluated and analyzed in terms of the degree of annotator agreement and the frequency of errors.
2014
Building a Japanese Corpus of Temporal-Causal-Discourse Structures Based on SDRT for Extracting Causal Relations
Kimi Kaneko | Daisuke Bekki
Proceedings of the EACL 2014 Workshop on Computational Approaches to Causality in Language (CAtoCL)
Toward a Discourse Theory for Annotating Causal Relations in Japanese
Kimi Kaneko | Daisuke Bekki
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing