Automating Qualitative Data Analysis with Large Language Models

Angelina Parfenova, Alexander Denzler, Jörgen Pfeffer


Abstract
This PhD proposal investigates ways of automating qualitative data analysis, specifically the thematic coding of texts. Although existing methods are widely covered in the literature, they rely mainly on topic modeling and other quantitative approaches, whose output is far from resembling the outcome of a human's analysis. The proposal examines the limitations of current research in the field and proposes a novel methodology based on Large Language Models to tackle automated coding and bring its results as close as possible to those of human researchers. The paper covers prior studies in this field and their limitations, existing software, the problem of duplicating researcher bias, and the proposed methodology.
Anthology ID:
2024.acl-srw.17
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Xiyan Fu, Eve Fleisig
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
177–185
URL:
https://aclanthology.org/2024.acl-srw.17
Cite (ACL):
Angelina Parfenova, Alexander Denzler, and Jörgen Pfeffer. 2024. Automating Qualitative Data Analysis with Large Language Models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 177–185, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Automating Qualitative Data Analysis with Large Language Models (Parfenova et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-srw.17.pdf