The Overlooked Repetitive Lengthening Form in Sentiment Analysis

Lei Wang, Eduard Dragut


Abstract
Individuals engaging in online communication frequently express personal opinions in informal styles (e.g., memes and emojis). While Language Models (LMs) for informal communication have been widely discussed, a unique and emphatic style, the Repetitive Lengthening Form (RLF), has been overlooked for years. In this paper, we explore answers to two research questions: 1) Is RLF important for sentiment analysis (SA)? 2) Can LMs understand RLF? Inspired by previous linguistic research, we curate **Lengthening**, the first multi-domain dataset with 850k samples focused on RLF for sentiment analysis. Moreover, we introduce **ExpInstruct**, a two-stage Explainable Instruction Tuning framework aimed at improving both the performance and explainability of LLMs for RLF. We further propose a novel unified approach to quantify LMs' understanding of informal expressions. We show that RLF sentences are expressive and can serve as signatures of document-level sentiment, and that RLF has potential value for online content analysis. Our comprehensive results show that fine-tuned Pre-trained Language Models (PLMs) can surpass zero-shot GPT-4 in performance but not in explanation for RLF. Finally, we show that ExpInstruct can improve open-source LLMs to match zero-shot GPT-4 in both performance and explainability for RLF with limited samples. Code and sample data are available at https://github.com/Tom-Owl/OverlookedRLF
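To illustrate the phenomenon the abstract describes (this is not the paper's released code), here is a minimal Python sketch of how RLF tokens might be detected and naively normalized, assuming the common heuristic that a character repeated three or more times signals lengthening:

```python
import re

# Heuristic for Repetitive Lengthening Form (RLF): a word character
# repeated three or more times, e.g., "soooo", "goooood", "nooooo".
RLF_PATTERN = re.compile(r"(\w)\1{2,}")

def contains_rlf(text: str) -> bool:
    """Return True if the text contains at least one lengthened token."""
    return bool(RLF_PATTERN.search(text))

def normalize_rlf(text: str) -> str:
    """Collapse runs of 3+ identical characters to a single character.
    Naive collapsing discards the emphasis signal and can over-correct
    (e.g., "goooood" -> "god"), which is one reason RLF-aware handling
    matters for sentiment analysis."""
    return RLF_PATTERN.sub(r"\1", text)

if __name__ == "__main__":
    example = "This movie was soooo goooood!!!"
    print(contains_rlf(example))   # True
    print(normalize_rlf(example))  # "This movie was so god!!!"
```

The detection heuristic and normalization strategy shown here are illustrative assumptions; the dataset construction and modeling details are in the paper and the linked repository.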
Anthology ID:
2024.findings-emnlp.952
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
16225–16238
URL:
https://aclanthology.org/2024.findings-emnlp.952
Cite (ACL):
Lei Wang and Eduard Dragut. 2024. The Overlooked Repetitive Lengthening Form in Sentiment Analysis. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16225–16238, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
The Overlooked Repetitive Lengthening Form in Sentiment Analysis (Wang & Dragut, Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.952.pdf