Yuka Tateisi

Also published as: Yuka Tateishi


2016

Typed Entity and Relation Annotation on Computer Science Papers
Yuka Tateisi | Tomoko Ohta | Sampo Pyysalo | Yusuke Miyao | Akiko Aizawa
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe our ongoing effort to establish an annotation scheme for describing the semantic structures of research articles in the computer science domain, with the intended use of developing search systems that can refine their results by the roles of the entities denoted by the query keys. In our scheme, mentions of entities are annotated with ontology-based types, and the roles of the entities are annotated as relations with other entities described in the text. So far, we have annotated 400 abstracts from the ACL Anthology and the ACM Digital Library. In this paper, the scheme and the annotated dataset are described, along with the problems found in the course of annotation. We also show the results of automatic annotation and evaluate the corpus in a practical setting in application to topic extraction.

2014

Corpus for Coreference Resolution on Scientific Papers
Panot Chaimongkol | Akiko Aizawa | Yuka Tateisi
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

The ever-growing number of published scientific papers prompts the need for automatic knowledge extraction to help scientists keep up with the state-of-the-art in their respective fields. To construct a good knowledge extraction system, annotated corpora in the scientific domain are required to train machine learning models. As described in this paper, we have constructed an annotated corpus for coreference resolution in multiple scientific domains, based on an existing corpus. We have modified the annotation scheme from Message Understanding Conference to better suit scientific texts. Then we applied that to the corpus. The annotated corpus is then compared with corpora in general domains in terms of distribution of resolution classes and performance of the Stanford Dcoref coreference resolver. Through these comparisons, we have demonstrated quantitatively that our manually annotated corpus differs from a general-domain corpus, which suggests deep differences between general-domain texts and scientific texts and which shows that different approaches can be made to tackle coreference resolution for general texts and scientific texts.

Annotation of Computer Science Papers for Semantic Relation Extraction
Yuka Tateisi | Yo Shidahara | Yusuke Miyao | Akiko Aizawa
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We designed a new annotation scheme for formalising relation structures in research papers, through the investigation of computer science papers. The annotation scheme is based on the hypothesis that identifying the role of entities and events that are described in a paper is useful for intelligent information retrieval in academic literature, and the role can be determined by the relationship between the author and the described entities or events, and relationships among them. Using the scheme, we have annotated research abstracts from the IPSJ Journal published in Japanese by the Information Processing Society of Japan. On the basis of the annotated corpus, we have developed a prototype information extraction system which has the facility to classify sentences according to the relationship between entities mentioned, to help find the role of the entity in which the searcher is interested.

2013

Relation Annotation for Understanding Research Papers
Yuka Tateisi | Yo Shidahara | Yusuke Miyao | Akiko Aizawa
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse

Clinical Vocabulary and Clinical Finding Concepts in Medical Literature
Takashi Okumura | Eiji Aramaki | Yuka Tateisi
The First Workshop on Natural Language Processing for Medical and Healthcare Fields

2011

Parsing Natural Language Queries for Life Science Knowledge
Tadayoshi Hara | Yuka Tateisi | Jin-Dong Kim | Yusuke Miyao
Proceedings of BioNLP 2011 Workshop

2008

Toward an Underspecifiable Corpus Annotation Scheme
Yuka Tateisi
Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation

GENIA-GR: a Grammatical Relation Corpus for Parser Evaluation in the Biomedical Domain
Yuka Tateisi | Yusuke Miyao | Kenji Sagae | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report the construction of a corpus for parser evaluation in the biomedical domain. A 50-abstract subset (492 sentences) of the GENIA corpus (Kim et al., 2003) is annotated with labeled head-dependent relations using the grammatical relations (GR) evaluation scheme (Carroll et al., 1998), which has been used for parser evaluation in the newswire domain.

2006

Linguistic and Biological Annotations of Biological Interaction Events
Tomoko Ohta | Yuka Tateisi | Jin-Dong Kim | Akane Yakushiji | Jun-ichi Tsujii
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper discusses an augmentation of a corpus of research abstracts in the biomedical domain (the GENIA corpus) with two kinds of annotations: tree annotation and event annotation. The tree annotation identifies the linguistic structure that encodes the relations among entities. The event annotation reveals the semantic structure of the biological interaction events encoded in the text. With these annotations we aim to provide a link between the clue and the target of biological event information extraction.

An Intelligent Search Engine and GUI-based Efficient MEDLINE Search Tool Based on Deep Syntactic Parsing
Tomoko Ohta | Yusuke Miyao | Takashi Ninomiya | Yoshimasa Tsuruoka | Akane Yakushiji | Katsuya Masuda | Jumpei Takeuchi | Kazuhiro Yoshida | Tadayoshi Hara | Jin-Dong Kim | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions

Automatic Construction of Predicate-argument Structure Patterns for Biomedical Information Extraction
Akane Yakushiji | Yusuke Miyao | Tomoko Ohta | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

Subdomain adaptation of a POS tagger with a small corpus
Yuka Tateisi | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology

2005

Syntax Annotation for the GENIA Corpus
Yuka Tateisi | Akane Yakushiji | Tomoko Ohta | Jun’ichi Tsujii
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

Finding Anchor Verbs for Biomedical IE Using Predicate-Argument Structures
Akane Yakushiji | Yuka Tateisi | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Introduction to the Bio-entity Recognition Task at JNLPBA
Nigel Collier | Tomoko Ohta | Yoshimasa Tsuruoka | Yuka Tateisi | Jin-Dong Kim
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)

Part-of-Speech Annotation of Biology Research Abstracts
Yuka Tateisi | Jun-ichi Tsujii
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

A Debug Tool for Practical Grammar Development
Akane Yakushiji | Yuka Tateisi | Yusuke Miyao | Naoki Yoshinaga | Jun’ichi Tsujii
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

Encoding Biomedical Resources in TEI: The Case of the GENIA Corpus
Tomaz Erjavec | Jin-Dong Kim | Tomoko Ohta | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine

Stretching TEI: Converting the Genia Corpus
Tomaz Erjavec | Jin-Dong Kim | Tomoko Ohta | Yuka Tateisi | Jun-ichi Tsujii
Proceedings of 4th International Workshop on Linguistically Interpreted Corpora (LINC-03) at EACL 2003

2000

Building an Annotated Corpus in the Molecular-Biology Domain
Yuka Tateisi | Tomoko Ohta | Nigel Collier | Chikashi Nobata | Jun-ichi Tsujii
Proceedings of the COLING-2000 Workshop on Semantic Annotation and Intelligent Content

1999

The GENIA project: corpus-based knowledge acquisition and information extraction from genome research papers
Nigel Collier | Hyun Seok Park | Norihiro Ogata | Yuka Tateishi | Chikashi Nobata | Tomoko Ohta | Tateshi Sekimizu | Hisao Imai | Katsutoshi Ibushi | Jun-ichi Tsujii
Ninth Conference of the European Chapter of the Association for Computational Linguistics

1998

Packing of feature structures for optimizing the HPSG-style grammar translated from TAG
Yusuke Miyao | Kentaro Torisawa | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

Translating the XTAG English grammar to HPSG
Yuka Tateisi | Kentaro Torisawa | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4)

1988

A Computer Readability Formula of Japanese Texts for Machine Scoring
Yuka Tateisi | Yoshihiko Ono | Hisao Yamada
Coling Budapest 1988 Volume 2: International Conference on Computational Linguistics