Exploring the Challenges of Behaviour Change Language Classification: A Study on Semi-Supervised Learning and the Impact of Pseudo-Labelled Data

Selina Meyer, Marcos Fernandez-Pichel, David Elsweiler, David E. Losada


Abstract
Automatic classification of behaviour change language can enhance conversational agents’ capabilities to adjust their behaviour based on users’ current situations and to encourage individuals to make positive changes. However, the lack of annotated language data of change-seekers hampers the performance of existing classifiers. In this study, we investigate the use of semi-supervised learning (SSL) to classify highly imbalanced texts around behaviour change. We assess the impact of including pseudo-labelled data from various sources and examine the balance between the amount of added pseudo-labelled data and the strictness of the inclusion criteria. Our findings indicate that while adding pseudo-labelled samples to the training data has limited classification impact, it does not significantly reduce performance regardless of the source of these new samples. This reinforces previous findings on the feasibility of applying classifiers trained on behaviour change language to diverse contexts.
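The trade-off the abstract mentions, between how much pseudo-labelled data is added and how strict the inclusion criteria are, can be illustrated with a minimal sketch. It assumes the inclusion criterion is a confidence threshold on the classifier's top predicted probability; the abstract does not specify the criterion, so this is an illustrative assumption rather than the authors' method. The function name `select_pseudo_labels`, the three-class setup, and the probability values are all hypothetical.

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs: List[List[float]], threshold: float
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) pairs for unlabelled samples
    whose highest class probability is at least `threshold`.

    Assumption: confidence thresholding stands in for the paper's
    "inclusion criteria", which the abstract does not define.
    """
    selected = []
    for i, p in enumerate(probs):
        # Index of the most probable class for this sample.
        best = max(range(len(p)), key=lambda c: p[c])
        if p[best] >= threshold:
            selected.append((i, best))
    return selected


# Hypothetical class probabilities predicted for four unlabelled texts
# by a classifier trained on the annotated data.
unlabelled_probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.10, 0.82, 0.08],
    [0.05, 0.15, 0.80],
]

# A stricter threshold admits fewer, but more confident, pseudo-labels.
for threshold in (0.5, 0.81, 0.9):
    picked = select_pseudo_labels(unlabelled_probs, threshold)
    print(f"threshold={threshold}: {len(picked)} samples -> {picked}")
```

In a self-training loop, the selected pairs would be appended to the labelled training set before the classifier is retrained; with imbalanced classes, a high threshold may admit very few minority-class samples.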
Anthology ID:
2024.cl4health-1.28
Volume:
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Dina Demner-Fushman, Sophia Ananiadou, Paul Thompson, Brian Ondov
Venues:
CL4Health | WS
Publisher:
ELRA and ICCL
Pages:
229–239
URL:
https://aclanthology.org/2024.cl4health-1.28
Cite (ACL):
Selina Meyer, Marcos Fernandez-Pichel, David Elsweiler, and David E. Losada. 2024. Exploring the Challenges of Behaviour Change Language Classification: A Study on Semi-Supervised Learning and the Impact of Pseudo-Labelled Data. In Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024, pages 229–239, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Exploring the Challenges of Behaviour Change Language Classification: A Study on Semi-Supervised Learning and the Impact of Pseudo-Labelled Data (Meyer et al., CL4Health-WS 2024)
PDF:
https://aclanthology.org/2024.cl4health-1.28.pdf