WSDPO: A Generative Word Sense Disambiguation Framework with Chain-of-Thought and Preference Optimization

Kunpeng Kang; Shuaimin Li; Kaiyuan Zhang; Luyang Zhang; Jiasheng Si; Bing Xu; Kehai Chen (陈科海); Muyun Yang (杨沐昀); Wenpeng Lu

Abstract

Word sense disambiguation (WSD) is a foundational task in natural language processing. Recent research has reformulated WSD for large language models (LLMs) as a generative task, where the model produces a definition to convey the intended meaning of an ambiguous word in context. In practice, most existing approaches implement this formulation through straightforward supervised fine-tuning, which tends to prioritize superficial context-to-gloss memorization over true contextual sense discrimination, leading to degraded performance on less frequent senses (LFS), particularly in unseen settings. To address this issue, we propose WSDPO, a training framework for generative WSD with chain-of-thought (CoT) and preference optimization. WSDPO consists of three stages: (1) disambiguation-aware CoT construction, which produces training data containing explicit disambiguation steps for the later stage; (2) disambiguation-guided supervised fine-tuning, which explicitly trains the model to discriminate word sense before generating the final definition; and (3) preference-based optimization, which further strengthens the model’s ability to generate sense-faithful definitions by optimizing it using preference pairs constructed from multiple sampled CoT outputs. Extensive experiments across benchmark datasets and multiple backbone LLMs demonstrate that WSDPO achieves substantial performance gains on rare and unseen settings, and exhibits strong generalization in standard evaluation settings.
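The third stage, building preference pairs from multiple sampled CoT outputs, might be sketched as follows. This is only an illustration under stated assumptions and not the authors' implementation: the abstract does not say how samples are judged, so the sketch assumes a sample counts as sense-faithful when its final definition matches the gold gloss after normalization, and it assumes each CoT output ends with a `Definition:` marker. The names `PreferencePair`, `final_definition`, and `build_preference_pairs` are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class PreferencePair:
    """One (prompt, chosen, rejected) triple for preference optimization."""
    prompt: str
    chosen: str
    rejected: str


def final_definition(cot_output: str, marker: str = "Definition:") -> str:
    """Return the normalized text after the last marker.

    Assumption: sampled outputs end with 'Definition: <gloss>'. If the marker
    is missing, the whole output is treated as the definition.
    """
    _, _, tail = cot_output.rpartition(marker)
    return " ".join(tail.lower().split())


def build_preference_pairs(prompt, samples, gold_gloss, max_pairs=4):
    """Pair sense-faithful samples (chosen) with unfaithful ones (rejected).

    Assumption: faithfulness is an exact match with the gold gloss after
    lowercasing and whitespace normalization; the paper may use another test.
    """
    gold = " ".join(gold_gloss.lower().split())
    faithful = [s for s in samples if final_definition(s) == gold]
    unfaithful = [s for s in samples if final_definition(s) != gold]
    pairs = [PreferencePair(prompt, c, r) for c, r in product(faithful, unfaithful)]
    return pairs[:max_pairs]
```

A prompt whose samples are all faithful, or all unfaithful, yields no pairs under this scheme, so only prompts on which the fine-tuned model is inconsistent contribute training signal.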

Anthology ID: 2026.acl-long.1610
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 34870–34889
URL: https://aclanthology.org/2026.acl-long.1610/
Cite (ACL): Kunpeng Kang, Shuaimin Li, Kaiyuan Zhang, Luyang Zhang, Jiasheng Si, Bing Xu, Kehai Chen, Muyun Yang, and Wenpeng Lu. 2026. WSDPO: A Generative Word Sense Disambiguation Framework with Chain-of-Thought and Preference Optimization. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 34870–34889, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): WSDPO: A Generative Word Sense Disambiguation Framework with Chain-of-Thought and Preference Optimization (Kang et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-long.1610.pdf
Checklist: 2026.acl-long.1610.checklist.pdf
