Yuya Sawada
2026
entity-linkings: A Unified Library for Entity Linking
Yuya Sawada | Tsuyoshi Fujita | Yusuke Sakai | Taro Watanabe
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Yuya Sawada | Tsuyoshi Fujita | Yusuke Sakai | Taro Watanabe
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 3: System Demonstrations)
Entity linking (EL) aims to disambiguate named entities in text by mapping them to the appropriate entities in a knowledge base. However, it is difficult to use some EL methods, as they sometimes have issues in reproducibility due to limited maintenance or the lack of official resources.To address this, we introduce , a unified library for using and developing entity linking systems through a unified interface. Our library flexibly integrates various candidate retrievers and re-ranking models, making it easy to compare and use any entity linking methods within a unified framework. In addition, it is designed with a strong emphasis on API usability, making it highly extensible, and it supports both command-line tools and APIs. Our code is available on GitHub and is also distributed via PyPI under the MIT-license. The video is available on YouTube.
Toward Automatic Delegation Extraction in Japanese Law
Tsuyoshi Fujita | Yuya Sawada | Yusuke Sakai | Taro Watanabe
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Tsuyoshi Fujita | Yuya Sawada | Yusuke Sakai | Taro Watanabe
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
The legal systems have a hierarchical structure, and a higher-level law often authorizes a lower-level law to implement detailed provisions, which is called delegation. When interpreting legal texts with delegation, readers must repeatedly consult the lower-level laws that stipulate the detailed provisions, imposing a substantial workload. Therefore, it is necessary to develop a system that enables readers to instantly refer to relevant laws in delegation. However, manually annotating delegation is difficult because it requires extensive legal expertise, careful reading of numerous legal texts, and continuous adaptation to newly enacted laws. In this study, we focus on Japanese law and develop a two-stage pipeline system for automatic delegation annotation. First, we extract keywords that indicate delegation using a named entity recognition approach. Second, we identify the delegated provision corresponding to each keyword as an entity disambiguation task. In our experiments, the proposed system demonstrates sufficient performance to assist manual annotation in practice.
2025
A Text Embedding Model with Contrastive Example Mining for Point-of-Interest Geocoding
Hibiki Nakatani | Hiroki Teranishi | Shohei Higashiyama | Yuya Sawada | Hiroki Ouchi | Taro Watanabe
Proceedings of the 31st International Conference on Computational Linguistics
Hibiki Nakatani | Hiroki Teranishi | Shohei Higashiyama | Yuya Sawada | Hiroki Ouchi | Taro Watanabe
Proceedings of the 31st International Conference on Computational Linguistics
Geocoding is a fundamental technique that links location mentions to their geographic positions, which is important for understanding texts in terms of where the described events occurred. Unlike most geocoding studies that targeted coarse-grained locations, we focus on geocoding at a fine-grained point-of-interest (POI) level. To address the challenge of finding appropriate geo-database entries from among many candidates with similar POI names, we develop a text embedding-based geocoding model and investigate (1) entry encoding representations and (2) hard negative mining approaches suitable for enhancing the model’s disambiguation ability. Our experiments show that the second factor significantly impact the geocoding accuracy of the model.
JaCorpTrack: Corporate History Event Extraction for Tracking Organizational Changes
Yuya Sawada | Hiroki Ouchi | Yuichiro Yasui | Hiroki Teranishi | Yuji Matsumoto | Taro Watanabe | Masayuki Ishii
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Yuya Sawada | Hiroki Ouchi | Yuichiro Yasui | Hiroki Teranishi | Yuji Matsumoto | Taro Watanabe | Masayuki Ishii
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Corporate history in corporate annual reports includes events related to organizational changes, which can provide useful cues for a comprehensive understanding of corporate actions.However, extracting organizational changes requires identifying differences in companies before and after an event, raising concerns about whether existing information extraction systems can accurately capture the relations.This work introduces JaCorpTrack, a novel event extraction task designed to identify events related to organizational changes.JaCorpTrack defines five event types related to organizational changes and is designed to identify the company names before and after each event, as well as the corresponding date.Experimental results indicate that large language models (LLMs) exhibit notable disparities in performance across event types.Our analysis reveals that these systems face challenges in identifying company names before and after events, and in interpreting event types expressed under ambiguous terminology.We will publicly release our dataset and experimental code at https://github.com/naist-nlp/JaCorpTrack
2020
Coordination Boundary Identification without Labeled Data for Compound Terms Disambiguation
Yuya Sawada | Takashi Wada | Takayoshi Shibahara | Hiroki Teranishi | Shuhei Kondo | Hiroyuki Shindo | Taro Watanabe | Yuji Matsumoto
Proceedings of the 28th International Conference on Computational Linguistics
Yuya Sawada | Takashi Wada | Takayoshi Shibahara | Hiroki Teranishi | Shuhei Kondo | Hiroyuki Shindo | Taro Watanabe | Yuji Matsumoto
Proceedings of the 28th International Conference on Computational Linguistics
We propose a simple method for nominal coordination boundary identification. As the main strength of our method, it can identify the coordination boundaries without training on labeled data, and can be applied even if coordination structure annotations are not available. Our system employs pre-trained word embeddings to measure the similarities of words and detects the span of coordination, assuming that conjuncts share syntactic and semantic similarities. We demonstrate that our method yields good results in identifying coordinated noun phrases in the GENIA corpus and is comparable to a recent supervised method for the case when the coordinator conjoins simple noun phrases.