Huikai Chua


2022

pdf bib
A unified framework for cross-domain and cross-task learning of mental health conditions
Huikai Chua | Andrew Caines | Helen Yannakoudakis
Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)

The detection of mental health conditions based on an individual’s use of language has received considerable attention in the NLP community. However, most work has focused on single-task and single-domain models, limiting the semantic space that they are able to cover and risking significant cross-domain loss. In this paper, we present two approaches towards a unified framework for cross-domain and cross-task learning for the detection of depression, post-traumatic stress disorder and suicide risk across different platforms that further utilizes inductive biases across tasks. Firstly, we develop a lightweight model using a general set of features that sets a new state of the art on several tasks while matching the performance of more complex task- and domain-specific systems on others. We also propose a multi-task approach and further extend our framework to explicitly capture the affective characteristics of someone’s language, further consolidating transfer of inductive biases and of shared linguistic characteristics. Finally, we present a novel dynamically adaptive loss weighting approach that allows for more stable learning across imbalanced datasets and better neural generalization performance. Our results demonstrate the effectiveness of our unified framework for mental ill-health detection across a number of diverse English datasets.

2021

pdf bib
Stylistic approaches to predicting Reddit popularity in diglossia
Huikai Chua
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop

Past work investigating what makes a Reddit post popular has indicated that style is a far better predictor than content, where posts conforming to a subreddit’s community style are better received. However, what about a diglossia, when there are two community styles? In Singapore, the basilect (‘Singlish’) co-exists with an acrolect (standard English), each with contrasting advantages of community identity and prestige respectively. In this paper, I apply stylistic approaches to predicting Reddit post scores in a diglossia. Using data from the Singaporean and British subreddits, I show that while the acrolect’s prestige attracts more upvotes, the most popular posts also draw on Singlish vocabulary to appeal to the community identity.