Jiahe Liu


2026

Medical adoption of NLP tools requires interpretability for end users, yet traditional explainable AI (XAI) methods are misaligned with clinical reasoning and are developed with little clinician input. We introduce CHiRPE (Clinical High-Risk Prediction with Explainability), an NLP pipeline that processes transcribed semi-structured clinical interviews to: (i) predict psychosis risk; and (ii) generate novel SHAP explanation formats co-developed with clinicians. Trained on 944 semi-structured interview transcripts from the 24 international clinics of the AMP-SCZ study, the CHiRPE pipeline integrates symptom-domain mapping, LLM summarisation, and BERT classification. CHiRPE achieved over 90% accuracy across three BERT variants and outperformed baseline models. The explanation formats were evaluated by 28 clinical experts, who indicated a strong preference for our novel concept-guided explanations, especially hybrid graph-and-text summary formats. CHiRPE demonstrates that clinically guided model development can produce results that are both accurate and interpretable. Our next step is real-world testing across the 24 international sites.
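To illustrate the concept-guided explanation idea, here is a minimal sketch of aggregating token-level SHAP attributions into clinician-defined symptom domains. The domain lexicon, the example tokens, and the function name are illustrative assumptions, not CHiRPE's actual mapping; it presumes attributions have already been computed by a SHAP explainer over a BERT classifier.

```python
# Hypothetical symptom-domain lexicon (illustrative only): maps
# transcript keywords to clinician-defined symptom domains.
SYMPTOM_DOMAINS = {
    "voices": "perceptual abnormalities",
    "hearing": "perceptual abnormalities",
    "suspicious": "suspiciousness",
    "watched": "suspiciousness",
    "confused": "disorganised communication",
}

def aggregate_by_domain(token_attributions):
    """Sum per-token SHAP values into per-domain contributions.

    token_attributions: list of (token, shap_value) pairs produced by
    an explainer run over one transcript.
    """
    totals = {}
    for token, value in token_attributions:
        domain = SYMPTOM_DOMAINS.get(token.lower())
        if domain is not None:
            totals[domain] = totals.get(domain, 0.0) + value
    # Rank domains by absolute contribution to the risk prediction,
    # so explanations lead with the most influential symptom domain.
    return sorted(totals.items(), key=lambda kv: abs(kv[1]), reverse=True)

ranked = aggregate_by_domain([("voices", 0.3), ("watched", 0.2),
                              ("hearing", 0.1)])
```

Here `ranked` places "perceptual abnormalities" first, since its two tokens jointly outweigh "suspiciousness"; a concept-level summary like this is what a graph-and-text format would render for clinicians.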

2024

Lung cancer remains a leading cause of cancer-related deaths, but public support for individuals living with lung cancer is often constrained by stigma and misconceptions, leading to serious emotional and social consequences for those diagnosed. Understanding how this stigma manifests and affects individuals is vital for developing inclusive interventions. Online discussion forums offer a unique opportunity to examine how lung cancer stigma is expressed and experienced. This study combines qualitative analysis and unsupervised learning (topic modelling) to explore stigma-related content within an online lung cancer forum. Our findings highlight the role of online forums as a key space for addressing anti-discriminatory attitudes and sharing experiences of lung cancer stigma. We found that users both with and without lung cancer engage in discussions pertaining to supportive and welcoming topics, highlighting the online forum’s role in facilitating social and informational support.
Adolescents exposed to advertisements promoting addictive substances exhibit a higher likelihood of subsequent substance use. The predominant source of youth exposure to such advertisements is online content accessed via smartphones. Detecting these advertisements is crucial for establishing and maintaining a safer online environment for young people. In our study, we utilized Multimodal Large Language Models (MLLMs) to identify addictive substance advertisements in digital media. The performance of MLLMs depends on the quality of the prompt used to instruct the model, so we implemented an adaptive prompt engineering approach that leverages a genetic algorithm to refine and enhance the prompts. To evaluate the model's performance, we augmented the RICO dataset, consisting of Android user interface screenshots, by superimposing alcohol ads onto them. Our results indicate that the MLLM can detect advertisements promoting alcohol with 0.94 accuracy and a 0.94 F1 score.
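The genetic-algorithm prompt search can be sketched as follows. Everything here is an illustrative assumption rather than the study's implementation: in the paper, a prompt's fitness is the MLLM's detection accuracy on labelled screenshots, whereas the toy keyword score below merely stands in for that expensive evaluation, and the instruction fragments are invented.

```python
import random

random.seed(0)

# Invented instruction fragments a candidate prompt can contain.
FRAGMENTS = [
    "You are a content moderator.",
    "Look at the screenshot.",
    "Does it contain an alcohol advertisement?",
    "Check for brand logos.",
    "Answer yes or no.",
    "Describe the layout first.",
]

# Toy fitness: reward prompts containing the key task fragments.
# (A stand-in for MLLM accuracy on a labelled validation set.)
TARGET = {"Does it contain an alcohol advertisement?", "Answer yes or no."}

def fitness(prompt):
    return sum(1 for f in TARGET if f in prompt)

def mutate(prompt):
    # Replace one random fragment with another from the pool.
    p = list(prompt)
    p[random.randrange(len(p))] = random.choice(FRAGMENTS)
    return tuple(p)

def crossover(a, b):
    # Single-point crossover over fragment positions.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(generations=30, pop_size=12, length=3):
    pop = [tuple(random.sample(FRAGMENTS, k=length)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half, then refill with mutated offspring.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
```

Because parents survive each generation, the best prompt's fitness never decreases; in the real pipeline each fitness call would be a batch of MLLM classifications, which is why keeping the population and generation counts small matters.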