Yoshihide Kato

Also published as: Yoshihide Sato

2024

pdf bib abs
Negation Scope Conversion: Towards a Unified Negation-Annotated Dataset
Asahi Yoshida | Yoshihide Kato | Shigeki Matsubara
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Negation scope resolution is the task that identifies the part of a sentence affected by the negation cue. The three major corpora used for this task, the BioScope corpus, the SFU review corpus and the Sherlock dataset, have different annotation schemes for negation scope. Due to the different annotations, the negation scope resolution models based on pre-trained language models (PLMs) perform worse when fine-tuned on the simply combined dataset consisting of the three corpora. To address this issue, we propose a method for automatically converting the scopes of BioScope and SFU to those of Sherlock and merge them into a unified dataset. To verify the effectiveness of the proposed method, we conducted experiments using the unified dataset for fine-tuning PLM-based models. The experimental results demonstrate that the performances of the models increase when fine-tuned on the unified dataset unlike the simply combined one. In the token-level metric, the model fine-tuned on the unified dataset archived the state-of-the-art performance on the Sherlock dataset.

2023

pdf bib abs
Revisiting Syntax-Based Approach in Negation Scope Resolution
Asahi Yoshida | Yoshihide Kato | Shigeki Matsubara
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

Negation scope resolution is the process of detecting the negated part of a sentence. Unlike the syntax-based approach employed in previous research, state-of-the-art methods performed better without the explicit use of syntactic structure. This work revisits the syntax-based approach and re-evaluates the effectiveness of syntactic structure in negation scope resolution. We replace the parser utilized in the prior works with state-of-the-art parsers and modify the syntax-based heuristic rules. The experimental results demonstrate that the simple modifications enhance the performance of the prior syntax-based method to the same level as state-of-the-art end-to-end neural-based methods.

2022

pdf bib
A Model-Theoretic Formalization of Natural Language Inference Using Neural Network and Tableau Method
Ayahito Saji | Yoshihide Kato | Shigeki Matsubara
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

2021

pdf bib abs
A New Representation for Span-based CCG Parsing
Yoshihide Kato | Shigeki Matsubara
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

This paper proposes a new representation for CCG derivations. CCG derivations are represented as trees whose nodes are labeled with categories strictly restricted by CCG rule schemata. This characteristic is not suitable for span-based parsing models because they predict node labels independently. In other words, span-based models may generate invalid CCG derivations that violate the rule schemata. Our proposed representation decomposes CCG derivations into several independent pieces and prevents the span-based parsing models from violating the schemata. Our experimental result shows that an off-the-shelf span-based parser with our representation is comparable with previous CCG parsers.

pdf bib
Natural Language Inference using Neural Network and Tableau Method
Ayahito Saji | Daiki Takao | Yoshihide Kato | Shigeki Matsubara
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

2020

pdf bib abs
Parsing Gapping Constructions Based on Grammatical and Semantic Roles
Yoshihide Kato | Shigeki Matsubara
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

A gapping construction consists of a coordinated structure where redundant elements are elided from all but one conjuncts. This paper proposes a method of parsing sentences with gapping to recover elided elements. The proposed method is based on constituent trees annotated with grammatical and semantic roles that are useful for identifying elided elements. Our method outperforms the previous method in terms of F-measure and recall.

2019

pdf bib abs
PTB Graph Parsing with Tree Approximation
Yoshihide Kato | Shigeki Matsubara
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The Penn Treebank (PTB) represents syntactic structures as graphs due to nonlocal dependencies. This paper proposes a method that approximates PTB graph-structured representations by trees. By our approximation method, we can reduce nonlocal dependency identification and constituency parsing into single tree-based parsing. An experimental result demonstrates that our approximation method with an off-the-shelf tree-based constituency parser significantly outperforms the previous methods in nonlocal dependency identification.

2018

pdf bib
Model-Theoretic Incremental Interpretation Based on Discourse Representation Theory
Yoshihide Kato | Shigeki Matsubara
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2016

pdf bib abs
Correcting Errors in a Treebank Based on Tree Mining
Kanta Suzuki | Yoshihide Kato | Shigeki Matsubara
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper provides a new method to correct annotation errors in a treebank. The previous error correction method constructs a pseudo parallel corpus where incorrect partial parse trees are paired with correct ones, and extracts error correction rules from the parallel corpus. By applying these rules to a treebank, the method corrects errors. However, this method does not achieve wide coverage of error correction. To achieve wide coverage, our method adopts a different approach. In our method, we consider that an infrequent pattern which can be transformed to a frequent one is an annotation error pattern. Based on a tree mining technique, our method seeks such infrequent tree patterns, and constructs error correction rules each of which consists of an infrequent pattern and a corresponding frequent pattern. We conducted an experiment using the Penn Treebank. We obtained 1,987 rules which are not constructed by the previous method, and the rules achieved good precision.

pdf bib
Transition-Based Left-Corner Parsing for Identifying PTB-Style Nonlocal Dependencies
Yoshihide Kato | Shigeki Matsubara
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper presents a corpus search system utilizing lexical dependency structure. The user's query consists of lexical dependency structure. The user's query consists of a sequence of keywords. For a given query, the system automatically generates the dependency structure patterns which consist of keywords in the query, and returns the sentences whose dependency structures match the generated patterns. The dependency structure patterns are generated by using two operations: combining and interpolation, which utilize dependency structures in the searched corpus. The operations enable the system to generate only the dependency structure patterns that occur in the corpus. The system achieves simple and intuitive corpus search and it is enough linguistically sophisticated to utilize structural information.