A Counselling Corpus in Cantonese

John Lee, Tianyuan Cai, Wenxiu Xie, Lam Xing


Abstract
Virtual agents are increasingly used for delivering health information in general, and mental health assistance in particular. This paper presents a corpus designed for training a virtual counsellor in Cantonese, a variety of Chinese. The corpus consists of a domain-independent subcorpus that supports small talk for rapport building with users, and a domain-specific subcorpus that provides material for a particular area of counselling. The former consists of ELIZA style responses, chitchat expressions, and a dataset of general dialog, all of which are reusable across counselling domains. The latter consists of example user inputs and appropriate chatbot replies relevant to the specific domain. In a case study, we created a chatbot with a domain-specific subcorpus that addressed 25 issues in test anxiety, with 436 inputs solicited from native speakers of Cantonese and 150 chatbot replies harvested from mental health websites. Preliminary evaluations show that Word Mover’s Distance achieved 56% accuracy in identifying the issue in user input, outperforming a number of baselines.
Anthology ID:
2020.sltu-1.50
Volume:
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Dorothee Beermann, Laurent Besacier, Sakriani Sakti, Claudia Soria
Venue:
SLTU
SIG:
Publisher:
European Language Resources association
Note:
Pages:
358–361
Language:
English
URL:
https://aclanthology.org/2020.sltu-1.50
DOI:
Bibkey:
Cite (ACL):
John Lee, Tianyuan Cai, Wenxiu Xie, and Lam Xing. 2020. A Counselling Corpus in Cantonese. In Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pages 358–361, Marseille, France. European Language Resources association.
Cite (Informal):
A Counselling Corpus in Cantonese (Lee et al., SLTU 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.sltu-1.50.pdf