Kohei Oda
2026
One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations
Kohei Oda | Po-Min Chuang | Kiyoaki Shirai | Natthawut Kertkeidkachorn
Findings of the Association for Computational Linguistics: EACL 2026
Sentence embedding methods have made remarkable progress, yet they still struggle to capture the implicit semantics within sentences. This limitation stems from conventional sentence embedding methods assigning only a single vector per sentence. To overcome it, we propose DualCSE, a sentence embedding method that assigns two embeddings to each sentence: one representing the explicit semantics and the other representing the implicit semantics. These embeddings coexist in a shared space, enabling the selection of the desired semantics for specific purposes such as information retrieval and text classification. Experimental results demonstrate that DualCSE can effectively encode both explicit and implicit meanings and improve performance on downstream tasks.
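The abstract's key idea, two embeddings per sentence in one shared space, selected per task, can be illustrated with a minimal retrieval sketch. All names and the toy corpus below are hypothetical; this is not DualCSE's actual training or encoding procedure, only the lookup pattern the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(v):
    return v / np.linalg.norm(v)

# Toy corpus: each sentence stores two embeddings in the same space,
# one for its literal (explicit) meaning and one for its implied meaning.
corpus = {
    "It is freezing in here.": {
        "explicit": normalize(rng.normal(size=8)),  # literal: low temperature
        "implicit": normalize(rng.normal(size=8)),  # implied: please close the window
    },
}

def retrieve(query_vec, corpus, semantics="explicit"):
    """Rank sentences by cosine similarity against the chosen embedding type."""
    q = normalize(np.asarray(query_vec, dtype=float))
    scores = {sent: float(embs[semantics] @ q) for sent, embs in corpus.items()}
    return max(scores, key=scores.get)
```

A downstream application would pass `semantics="implicit"` when the task (e.g. intent-oriented retrieval) needs the implied meaning, and `semantics="explicit"` otherwise.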
2025
Improving Interpretability of Lexical Semantic Change with Neurobiological Features
Kohei Oda | Hiroya Takamura | Kiyoaki Shirai | Natthawut Kertkeidkachorn
Proceedings of the 39th Pacific Asia Conference on Language, Information and Computation
2024
Learning Contextualized Box Embeddings with Prototypical Networks
Kohei Oda | Kiyoaki Shirai | Natthawut Kertkeidkachorn
Proceedings of the 9th Workshop on Representation Learning for NLP (RepL4NLP-2024)
This paper proposes ProtoBox, a novel method to learn contextualized box embeddings. Unlike an ordinary word embedding, which represents a word as a single vector, a box embedding represents the meaning of a word as a box in a high-dimensional space, which is well suited to representing semantic relations between words. In addition, our method aims to obtain a "contextualized" box embedding, an abstract representation of a word in a specific context. ProtoBox is based on Prototypical Networks, a robust method for classification problems, and focuses on learning the hypernym–hyponym relation between senses. ProtoBox is evaluated on three tasks: Word Sense Disambiguation (WSD), New Sense Classification (NSC), and Hypernym Identification (HI). Experimental results show that ProtoBox outperforms baselines on the HI task and performs comparably on the WSD and NSC tasks.
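The geometric intuition behind box embeddings can be shown with a minimal sketch. This is an assumed generic box representation, not ProtoBox's actual parameterization or training objective: a sense is an axis-aligned box given by lower and upper corner vectors, and the hypernym–hyponym relation can be read off as box containment.

```python
import numpy as np

class Box:
    """An axis-aligned box defined by lower and upper corner vectors."""

    def __init__(self, lower, upper):
        self.lower = np.asarray(lower, dtype=float)
        self.upper = np.asarray(upper, dtype=float)

    def contains(self, other):
        # A hypernym's box should enclose its hyponym's box in every dimension.
        return bool(np.all(self.lower <= other.lower) and
                    np.all(other.upper <= self.upper))

    def volume(self):
        # Side lengths are clipped at zero so degenerate boxes have volume 0.
        return float(np.prod(np.clip(self.upper - self.lower, 0.0, None)))

# Toy 2-D example: "dog" lies inside "animal", so containment suggests hypernymy.
animal = Box([0.0, 0.0], [1.0, 1.0])
dog = Box([0.2, 0.1], [0.6, 0.5])
```

Here `animal.contains(dog)` holds while `dog.contains(animal)` does not, mirroring the asymmetry of the hypernym–hyponym relation that single-vector embeddings cannot express directly.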