William Cheung

Also published as: William K. Cheung


2026

While multimodal generative models have advanced radiology report generation (RRG), challenges remain in making reports accessible to patients and ensuring reliable evaluation. The technical language and templated nature of professional reports hinder patient comprehension and enable models to artificially boost lexical metrics such as BLEU by reproducing common report patterns. To address these limitations, we propose the Layman’s RRG framework, which leverages layperson-friendly language to enhance patient accessibility and promote more robust evaluation and report generation by encouraging models to focus on semantic accuracy over rigid templates. Our approach also introduces and releases two refined layman-style datasets (at the sentence and report levels), along with a semantics-based evaluation metric that mitigates inflated lexical scores and a layman-guided training strategy. Experiments show that training on layman-style data helps models better capture the meaning of clinical findings. Notably, we observe a positive scaling law: model performance improves with more layman-style data, in contrast to the inverse trend observed with templated professional language.

2025

This paper presents the setup and results of the third edition of the BioLaySumm shared task on Lay Summarization of Biomedical Research Articles and Radiology Reports, hosted at the BioNLP Workshop at ACL 2025. In this task edition, we aim to build on the first two editions’ successes by further increasing research interest in this important task and encouraging participants to explore novel approaches that will help advance the state-of-the-art. Specifically, we introduce the new task of Radiology Report Generation with Layman’s terms, which is parallel to the task of lay summarization of biomedical articles in the first two editions. Overall, our results show that a broad range of innovative approaches were adopted by task participants, including inspiring explorations of latest RL techniques adopted in the training of general-domain large reasoning models.

2024

Low-rank adaptation (LoRA) achieves parameter efficient fine-tuning for large language models (LLMs) by decomposing the model weight update into a pair of low-rank projection matrices. Yet, the memory overhead restricts it to scale up when the model size increases. We propose Randomized LoRA (RLoRA) which adopts Randomized Walsh-Hadamard Transform to achieve significant reduction in the size of trainable parameters compared to LoRA. At the same time, it allows a PAC-Bayes regularizer to be efficiently incorporated to improve generalization. We evaluate the effectiveness of RLoRA on LLMs RoBERTa, GPT-2 and LLaMA-7B using GLUE, E2E and math reasoning benchmarks. With a much lower memory requirement, RLoRA can give similar performance as the SOTA low-rank adaptation methods for these three tasks and significantly better performance under few-shot settings.

2022

Recent work on non-autoregressive neural machine translation (NAT) that leverages alignment information to explicitly reduce the modality of target distribution has reported comparable performance with counterparts that tackle multi-modality problem by implicitly modeling dependencies. Effectiveness in handling alignment is vital for models that follow this approach, where a token reordering mechanism is typically involved and plays a vital role. We review the reordering capability of the respective mechanisms in recent NAT models, and our experimental results show that their performance is sub-optimal. We propose to learn a non-autoregressive language model (NALM) based on transformer which can be combined with Viterbi decoding to achieve better reordering performance. We evaluate the proposed NALM using the PTB dataset where sentences with words permuted in different ways are expected to have their ordering recovered. Our empirical results show that the proposed method can outperform the state-of-the-art reordering mechanisms under different word permutation settings, with a 2-27 BLEU improvement, suggesting high potential for word alignment in NAT.

2019

While broadly applicable to many natural language processing (NLP) tasks, variational autoencoders (VAEs) are hard to train due to the posterior collapse issue where the latent variable fails to encode the input data effectively. Various approaches have been proposed to alleviate this problem to improve the capability of the VAE. In this paper, we propose to introduce a mutual information (MI) term between the input and its latent variable to regularize the objective of the VAE. Since estimating the MI in the high-dimensional space is intractable, we employ neural networks for the estimation of the MI and provide a training algorithm based on the convex duality approach. Our experimental results on three benchmark datasets demonstrate that the proposed model, compared to the state-of-the-art baselines, exhibits less posterior collapse and has comparable or better performance in language modeling and text generation. We also qualitatively evaluate the inferred latent space and show that the proposed model can generate more reasonable and diverse sentences via linear interpolation in the latent space.