Exploring Large Language Models for Qualitative Data Analysis

Tim Fischer; Chris Biemann

doi:10.18653/v1/2024.nlp4dh-1.41

Abstract

This paper explores the potential of Large Language Models (LLMs) to enhance qualitative data analysis (QDA) workflows within the open-source QDA platform developed at our university. We identify several opportunities within a typical QDA workflow where AI assistance can boost researcher productivity and translate these opportunities into corresponding NLP tasks: document classification, information extraction, span classification, and text generation. A benchmark tailored to these QDA activities is constructed, utilizing English and German datasets that align with relevant use cases. Focusing on efficiency and accessibility, we evaluate the performance of three prominent open-source LLMs - Llama 3.1, Gemma 2, and Mistral NeMo - on this benchmark. Our findings reveal the promise of LLM integration for streamlining QDA workflows, particularly for English-language projects. Consequently, we have implemented the LLM Assistant as an opt-in feature within our platform and report the implementation details. With this, we hope to further democratize access to AI capabilities for qualitative data analysis.

Anthology ID: 2024.nlp4dh-1.41
Volume: Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Month: November
Year: 2024
Address: Miami, USA
Editors: Mika Hämäläinen, Emily Öhman, So Miyagawa, Khalid Alnajjar, Yuri Bizzoni
Venues: NLP4DH | WS
Publisher: Association for Computational Linguistics
Pages: 423–437
URL: https://aclanthology.org/2024.nlp4dh-1.41/
DOI: 10.18653/v1/2024.nlp4dh-1.41
Cite (ACL): Tim Fischer and Chris Biemann. 2024. Exploring Large Language Models for Qualitative Data Analysis. In Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities, pages 423–437, Miami, USA. Association for Computational Linguistics.
Cite (Informal): Exploring Large Language Models for Qualitative Data Analysis (Fischer & Biemann, NLP4DH 2024)
PDF: https://aclanthology.org/2024.nlp4dh-1.41.pdf
