Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation

Letian Peng; Yuwei Zhang; Jingbo Shang

Abstract

Prompting large language models (LLMs) for data augmentation has recently become common practice in few-shot NLP tasks. In this paper, we propose Chain-of-Thought Attribute Manipulation (CoTAM), a novel approach that generates new data from existing examples by changing only a user-provided, task-specific attribute, e.g., sentiment polarity or topic in movie reviews. Instead of the conventional approach of controlling latent representations, we leverage chain-of-thought prompting to directly edit the text in three steps: (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction. Extensive results on various tasks, such as text (pair) classification and aspect-based sentiment analysis, verify the superiority of CoTAM over other LLM-based augmentation methods with the same number of training examples for both fine-tuning and in-context learning. Remarkably, a 2D visualization of the augmented dataset using principal component analysis reveals a human-recognizable decision boundary that is likely hinted at by the attribute manipulation, demonstrating the potential of our proposed approach.
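
The three-step edit described in the abstract can be sketched as a prompt template. This is only an illustration under stated assumptions: the abstract does not give the paper's actual prompt wording, so the step texts, the function names `build_prompt` and `extract_reconstruction`, and the numbered-answer format are all hypothetical, and no LLM call is made.

```python
# Hypothetical step wording; the paper's real prompts are not shown in the abstract.
STEPS = (
    "Attribute decomposition: list the attributes of the sentence, "
    "including its {name}.",
    "Manipulation proposal: propose how to change the {name} from "
    "'{source}' to '{target}' while keeping every other attribute fixed.",
    "Sentence reconstruction: write a new sentence that applies the proposal.",
)


def build_prompt(sentence, name, source, target):
    """Assemble a chain-of-thought prompt that walks through the three steps."""
    lines = [f'Sentence: "{sentence}"', "Answer in three numbered steps."]
    for i, step in enumerate(STEPS, start=1):
        lines.append(f"{i}. " + step.format(name=name, source=source, target=target))
    return "\n".join(lines)


def extract_reconstruction(response):
    """Return the text of the last line starting with '3.', or None if absent.

    Assumes the model answers with numbered lines, which is a convention
    chosen for this sketch rather than something the source specifies.
    """
    for line in reversed(response.splitlines()):
        stripped = line.strip()
        if stripped.startswith("3."):
            return stripped[2:].strip()
    return None
```

In use, `build_prompt("A delightful film.", "sentiment", "positive", "negative")` would be sent to an LLM of the reader's choice, and `extract_reconstruction` would pull the rewritten sentence out of the reply so it can be added to the training set with the target label.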

Anthology ID: 2024.findings-acl.1
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand and virtual meeting
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1–16
URL: https://aclanthology.org/2024.findings-acl.1
Cite (ACL): Letian Peng, Yuwei Zhang, and Jingbo Shang. 2024. Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation. In Findings of the Association for Computational Linguistics: ACL 2024, pages 1–16, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal): Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation (Peng et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-acl.1.pdf
