Junsong Li


2024

pdf bib
A Unified Generative Framework for Bilingual Euphemism Detection and Identification
Yuxue Hu | Junsong Li | Tongguan Wang | Dongyu Su | Guixin Su | Ying Sha
Findings of the Association for Computational Linguistics ACL 2024

Various euphemisms are emerging in social networks, attracting widespread attention from the natural language processing community. However, existing euphemism datasets are only domain-specific or language-specific. In addition, existing approaches to the study of euphemisms are one-sided. Either only the euphemism detection task or only the euphemism identification task is accomplished, lacking a unified framework. To this end, we construct a large-scale Bilingual Multi-category dataset of Euphemisms named BME, which covers a total of 12 categories for two languages, English and Chinese. Then, we first propose a unified generative model to Jointly conduct the tasks of bilingual Euphemism Detection and Identification named JointEDI. By comparing with LLMs and human evaluation, we demonstrate the effectiveness of the proposed JointEDI and the feasibility of unifying euphemism detection and euphemism identification tasks. Moreover, the BME dataset also provides a new reference standard for euphemism detection and euphemism identification.