Hideo Kobayashi

2025

pdf bib abs
You Only Read Once (YORO): Learning to Internalize Database Knowledge for Text-to-SQL
Hideo Kobayashi | Wuwei Lan | Peng Shi | Shuaichen Chang | Jiang Guo | Henghui Zhu | Zhiguo Wang | Patrick Ng
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

While significant progress has been made on the text-to-SQL task, recent solutions repeatedly encode the same database schema for every question, resulting in unnecessary high inference cost and often overlooking crucial database knowledge. To address these issues, we propose You Only Read Once (YORO), a novel paradigm that directly internalizes database knowledge into the parametric knowledge of a text-to-SQL model during training and eliminates the need for schema encoding during inference. YORO significantly reduces the input token length by 66%-98%. Despite its shorter inputs, our empirical results demonstrate YORO’s competitive performances with traditional systems on three benchmarks as well as its significant outperformance on large databases. Furthermore, YORO excels in handling questions with challenging value retrievals such as abbreviation.

2023

pdf bib abs
PairSpanBERT: An Enhanced Language Model for Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present PairSpanBERT, a SpanBERT-based pre-trained model specialized for bridging resolution. To this end, we design a novel pre-training objective that aims to learn the contexts in which two mentions are implicitly linked to each other from a large amount of data automatically generated either heuristically or via distance supervision with a knowledge graph. Despite the noise inherent in the automatically generated data, we achieve the best results reported to date on three evaluation datasets for bridging resolution when replacing SpanBERT with PairSpanBERT in a state-of-the-art resolver that jointly performs entity coreference resolution and bridging resolution.

2022

pdf bib abs
Constrained Multi-Task Learning for Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We examine the extent to which supervised bridging resolvers can be improved without employing additional labeled bridging data by proposing a novel constrained multi-task learning framework for bridging resolution, within which we (1) design cross-task consistency constraints to guide the learning process; (2) pre-train the entity coreference model in the multi-task framework on the large amount of publicly available coreference data; and (3) integrating prior knowledge encoded in rule-based resolvers. Our approach achieves state-of-the-art results on three standard evaluation corpora.

pdf bib abs
Neural Anaphora Resolution in Dialogue Revisited
Shengjie Li | Hideo Kobayashi | Vincent Ng
Proceedings of the CODI-CRAC 2022 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

We present the systems that we developed for all three tracks of the CODI-CRAC 2022 shared task, namely the anaphora resolution track, the bridging resolution track, and the discourse deixis resolution track. Combining an effective encoding of the input using the SpanBERT_Large encoder with an extensive hyperparameter search process, our systems achieved the highest scores in all phases of all three tracks.

pdf bib abs
End-to-End Neural Bridging Resolution
Hideo Kobayashi | Yufang Hou | Vincent Ng
Proceedings of the 29th International Conference on Computational Linguistics

The state of bridging resolution research is rather unsatisfactory: not only are state-of-the-art resolvers evaluated in unrealistic settings, but the neural models underlying these resolvers are weaker than those used for entity coreference resolution. In light of these problems, we evaluate bridging resolvers in an end-to-end setting, strengthen them with better encoders, and attempt to gain a better understanding of them via perturbation experiments and a manual analysis of their outputs.

pdf bib abs
Analyzing Coreference and Bridging in Product Reviews
Hideo Kobayashi | Christopher Malon
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference

Product reviews may have complex discourse including coreference and bridging relations to a main product, competing products, and interacting products. Current approaches to aspect-based sentiment analysis (ABSA) and opinion summarization largely ignore this complexity. On the other hand, existing systems for coreference and bridging were trained in a different domain. We collect mention type annotations relevant to coreference and bridging for 498 product reviews. Using these annotations, we show that a state-of-the-art factuality score fails to catch coreference errors in product reviews, and that a state-of-the-art coreference system trained on OntoNotes does not perform nearly as well on product mentions. As our dataset grows, we expect it to help ABSA and opinion summarization systems to avoid entity reference errors.

2021

pdf bib abs
Neural Anaphora Resolution in Dialogue
Hideo Kobayashi | Shengjie Li | Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

We describe the systems that we developed for the three tracks of the CODI-CRAC 2021 shared task, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. Our team ranked second for entity coreference resolution, first for bridging resolution, and first for discourse deixis resolution.

pdf bib abs
The CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis Resolution in Dialogue: A Cross-Team Analysis
Shengjie Li | Hideo Kobayashi | Vincent Ng
Proceedings of the CODI-CRAC 2021 Shared Task on Anaphora, Bridging, and Discourse Deixis in Dialogue

The CODI-CRAC 2021 shared task is the first shared task that focuses exclusively on anaphora resolution in dialogue and provides three tracks, namely entity coreference resolution, bridging resolution, and discourse deixis resolution. We perform a cross-task analysis of the systems that participated in the shared task in each of these tracks.

pdf bib abs
Bridging Resolution: Making Sense of the State of the Art
Hideo Kobayashi | Vincent Ng
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

While Yu and Poesio (2020) have recently demonstrated the superiority of their neural multi-task learning (MTL) model to rule-based approaches for bridging anaphora resolution, there is little understanding of (1) how it is better than the rule-based approaches (e.g., are the two approaches making similar or complementary mistakes?) and (2) what should be improved. To shed light on these issues, we (1) propose a hybrid rule-based and MTL approach that would enable a better understanding of their comparative strengths and weaknesses; and (2) perform a manual analysis of the errors made by the MTL model.

2020

pdf bib abs
Bridging Resolution: A Survey of the State of the Art
Hideo Kobayashi | Vincent Ng
Proceedings of the 28th International Conference on Computational Linguistics

Bridging reference resolution is an anaphora resolution task that is arguably more challenging and less studied than entity coreference resolution. Given that significant progress has been made on coreference resolution in recent years, we believe that bridging resolution will receive increasing attention in the NLP community. Nevertheless, progress on bridging resolution is currently hampered in part by the scarcity of large annotated corpora for model training as well as the lack of standardized evaluation protocols. This paper presents a survey of the current state of research on bridging reference resolution and discusses future research directions.

Co-authors

Venues

Fix author