Jiahe Liu
2026
CHiRPE: A Step Towards Real-World Clinical NLP with Clinician-Oriented Model Explanations
Stephanie Fong | Zimu Wang | Guilherme C Oliveira | Xiangyu Zhao | Yiwen Jiang | Jiahe Liu | Beau-Luke Colton | Scott W. Woods | Martha Shenton | Barnaby Nelson | Zongyuan Ge | Dominic Dwyer
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 2: Short Papers)
The medical adoption of NLP tools requires interpretability by end users, yet traditional explainable AI (XAI) methods are misaligned with clinical reasoning and lack clinician input. We introduce CHiRPE (Clinical High-Risk Prediction with Explainability), an NLP pipeline that uses transcribed semi-structured clinical interviews to: (i) predict psychosis risk; and (ii) generate novel SHAP explanation formats co-developed with clinicians. Trained on 944 semi-structured interview transcripts from 24 international clinics in the AMP-SCZ study, the CHiRPE pipeline integrates symptom-domain mapping, LLM summarisation, and BERT classification. CHiRPE achieved over 90% accuracy across three BERT variants and outperformed baseline models. Explanation formats were evaluated by 28 clinical experts, who indicated a strong preference for our novel concept-guided explanations, especially hybrid graph-and-text summary formats. CHiRPE demonstrates that clinically-guided model development produces results that are both accurate and interpretable. Our next step is real-world testing across our 24 international sites.
2024
Breaking the Silence: How Online Forums Address Lung Cancer Stigma and Offer Support
Jiahe Liu | Mike Conway | Daniel Cabrera Lozoya
Proceedings of the 22nd Annual Workshop of the Australasian Language Technology Association
Lung cancer remains a leading cause of cancer-related deaths, but public support for individuals living with lung cancer is often constrained by stigma and misconceptions, leading to serious emotional and social consequences for those diagnosed. Understanding how this stigma manifests and affects individuals is vital for developing inclusive interventions. Online discussion forums offer a unique opportunity to examine how lung cancer stigma is expressed and experienced. This study combines qualitative analysis and unsupervised learning (topic modelling) to explore stigma-related content within an online lung cancer forum. Our findings highlight the role of online forums as a key space for addressing anti-discriminatory attitudes and sharing experiences of lung cancer stigma. We found that users both with and without lung cancer engage in discussions pertaining to supportive and welcoming topics, highlighting the online forum’s role in facilitating social and informational support.
Optimizing Multimodal Large Language Models for Detection of Alcohol Advertisements via Adaptive Prompting
Daniel Cabrera Lozoya | Jiahe Liu | Simon D’Alfonso | Mike Conway
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Adolescents exposed to advertisements promoting addictive substances exhibit a higher likelihood of subsequent substance use. The predominant source of youth exposure to such advertisements is online content accessed via smartphones. Detecting these advertisements is crucial for establishing and maintaining a safer online environment for young people. In our study, we utilized Multimodal Large Language Models (MLLMs) to identify addictive substance advertisements in digital media. The performance of MLLMs depends on the quality of the prompt used to instruct the model. To optimize our prompts, we implemented an adaptive prompt engineering approach, leveraging a genetic algorithm to refine and enhance the prompts. To evaluate the model’s performance, we augmented the RICO dataset, consisting of Android user interface screenshots, by superimposing alcohol ads onto them. Our results indicate that the MLLM can detect advertisements promoting alcohol with an accuracy of 0.94 and an F1 score of 0.94.