Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models

Yuki Zenimoto; Ryo Hasegawa; Takehito Utsuro; Masaharu Yoshioka; Noriko Kando

doi:10.18653/v1/2024.naacl-srw.26

Abstract

Survey research using open-ended responses is an important method that contributes to the discovery of unknown issues and new needs. However, survey research generally requires time-consuming and costly manual data processing, which makes it difficult to analyze large datasets. To address this issue, we propose an LLM-based method to automate parts of the grounded theory approach (GTA), a representative approach to qualitative data analysis. We generated and annotated pseudo open-ended responses and used them as training data for the coding procedures of GTA. Through evaluations, we showed that the models trained with pseudo open-ended responses are quite effective compared with those trained with manually annotated open-ended responses. We also demonstrate that the LLM-based approach is highly efficient and cost-saving compared to a human-based approach.

Anthology ID: 2024.naacl-srw.26
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Yang (Trista) Cao, Isabel Papadimitriou, Anaelia Ovalle, Marcos Zampieri, Francis Ferraro, Swabha Swayamdipta
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 242–254
URL: https://aclanthology.org/2024.naacl-srw.26/
DOI: 10.18653/v1/2024.naacl-srw.26
Cite (ACL): Yuki Zenimoto, Ryo Hasegawa, Takehito Utsuro, Masaharu Yoshioka, and Noriko Kando. 2024. Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 242–254, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Coding Open-Ended Responses using Pseudo Response Generation by Large Language Models (Zenimoto et al., NAACL 2024)
PDF: https://aclanthology.org/2024.naacl-srw.26.pdf
Video: https://aclanthology.org/2024.naacl-srw.26.mp4
