Exploring Accurate and Generic Simile Knowledge from Pre-trained Language Models

Zhou Shuhan, Ma Longxuan, Shao Yanqiu


Abstract
A simile is an important linguistic phenomenon in daily communication and an important task in natural language processing (NLP). In recent years, pre-trained language models (PLMs) have achieved great success in NLP since they learn generic knowledge from a large corpus. However, PLMs still have hallucination problems: they can generate unrealistic or context-unrelated information. In this paper, we aim to explore more accurate simile knowledge from PLMs. To this end, we first fine-tune a single model to perform three main simile tasks (recognition, interpretation, and generation). In this way, the model gains a better understanding of simile knowledge. However, this understanding may be limited by the distribution of the training data. To explore more generic simile knowledge from PLMs, we further add semantic dependency features to the three tasks. The semantic dependency feature serves as a global signal and helps the model learn simile knowledge that can be applied to unseen domains. We test on seen and unseen domains after training. Automatic evaluations demonstrate that our method helps the PLMs to explore more accurate and generic simile knowledge for downstream tasks. Our method of exploring more accurate knowledge is not only useful for simile study but also for other NLP tasks that leverage knowledge from PLMs. Our code and data will be released on GitHub.
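The abstract describes casting recognition, interpretation, and generation as tasks for a single fine-tuned model, with a semantic dependency feature added as a global signal. The paper does not specify its input serialization here, so the sketch below is only an illustrative assumption of how such multi-task examples might be formatted as text-to-text pairs; the task prefixes, the `<dep>` tag, and the dependency strings are hypothetical, not the authors' actual format.

```python
# Illustrative sketch only (not the paper's actual format): one plausible way to
# serialize the three simile tasks as text-to-text examples for a single PLM,
# with a hypothetical semantic-dependency string prepended as a global feature.

def build_example(task: str, sentence: str, dep_feature: str, target: str) -> dict:
    """Serialize one training instance; the task prefix and <dep> markup are assumptions."""
    source = f"{task}: <dep> {dep_feature} </dep> {sentence}"
    return {"source": source, "target": target}

examples = [
    # Simile recognition: decide whether the sentence contains a simile.
    build_example(
        task="recognition",
        sentence="Her smile was like sunshine.",
        dep_feature="smile -nsubj-> was ; sunshine -pobj-> like",
        target="simile",
    ),
    # Simile interpretation: produce the shared property of tenor and vehicle.
    build_example(
        task="interpretation",
        sentence="Her smile was like sunshine.",
        dep_feature="smile -nsubj-> was ; sunshine -pobj-> like",
        target="warm",
    ),
    # Simile generation: rewrite a literal sentence as a simile.
    build_example(
        task="generation",
        sentence="Her smile was warm.",
        dep_feature="smile -nsubj-> was ; warm -acomp-> was",
        target="Her smile was like sunshine.",
    ),
]

for ex in examples:
    print(ex["source"], "=>", ex["target"])
```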
Anthology ID:
2023.ccl-1.78
Volume:
Proceedings of the 22nd Chinese National Conference on Computational Linguistics
Month:
August
Year:
2023
Address:
Harbin, China
Editors:
Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
918–929
Language:
English
URL:
https://aclanthology.org/2023.ccl-1.78
Cite (ACL):
Zhou Shuhan, Ma Longxuan, and Shao Yanqiu. 2023. Exploring Accurate and Generic Simile Knowledge from Pre-trained Language Models. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics, pages 918–929, Harbin, China. Chinese Information Processing Society of China.
Cite (Informal):
Exploring Accurate and Generic Simile Knowledge from Pre-trained Language Models (Shuhan et al., CCL 2023)
PDF:
https://aclanthology.org/2023.ccl-1.78.pdf