Sardar Khan Khayamkhani


2025

GPT-Based Lexical Simplification for Multi-Word Expressions Using Prompt Engineering
Sardar Khan Khayamkhani | Matthew Shardlow
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era

Multiword Lexical Simplification (MWLS) is the task of replacing a complex phrase in a sentence with a simpler alternative. Whereas previous approaches to MWLS made use of the BERT language model, we make use of the Generative Pre-trained Transformer architecture. Our approach employs Large Language Models in an auto-regressive format, making use of prompt engineering and few-shot learning to develop new strategies for the MWLS task. We experiment with several GPT-based models and differing experimental settings, including varying the number of requested examples, changing the base model type, adapting the prompt, and using zero-shot, one-shot and k-shot in-context learning. We show that a GPT-4o model with k-shot in-context learning (k=6) demonstrates state-of-the-art performance on the MWLS1 dataset with NDCG=0.3143 and PREC@5=0.1048, beating the previous BERT-based approach by a wide margin on several metrics and consistently across subsets. Our findings indicate that GPT-based models are superior to BERT-based models for the MWLS task.
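
To make the k-shot in-context learning setup concrete, the following is a minimal sketch of how a few-shot prompt for multi-word lexical simplification might be assembled and sent to GPT-4o through the OpenAI chat completions API. The prompt wording, the few-shot examples, and the simplify_mwe helper are illustrative assumptions, not the prompts or code used in the paper.

# Minimal sketch of k-shot prompting for multi-word lexical simplification.
# Assumes the OpenAI Python client and an OPENAI_API_KEY in the environment;
# prompts and examples below are hypothetical, not the authors' actual setup.
from openai import OpenAI

client = OpenAI()

# Hypothetical in-context examples: (sentence, complex phrase, simpler substitutions)
FEW_SHOT_EXAMPLES = [
    ("The committee will take into consideration your request.",
     "take into consideration", "consider; think about"),
    ("He decided to put an end to the argument.",
     "put an end to", "stop; end"),
]

def simplify_mwe(sentence: str, phrase: str, k: int = 2) -> str:
    """Ask a GPT model for simpler substitutions of a multi-word expression."""
    # Build the k in-context demonstrations, then append the target instance.
    shots = [
        f"Sentence: {src}\nPhrase: {mwe}\nSimpler substitutions: {subs}"
        for src, mwe, subs in FEW_SHOT_EXAMPLES[:k]
    ]
    prompt = (
        "Replace the complex phrase with simpler alternatives, ranked best first.\n\n"
        + "\n\n".join(shots)
        + f"\n\nSentence: {sentence}\nPhrase: {phrase}\nSimpler substitutions:"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(simplify_mwe("She was obliged to comply with the terms of the agreement.",
                   "comply with"))

In the paper's best-performing configuration the number of demonstrations is k=6; the two examples above merely stand in for such a demonstration set, and the model's ranked substitutions would then be scored against gold annotations with metrics such as NDCG and PREC@5.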