Mong Yuan Sim


CSIRO Data61 Team at BioLaySumm Task 1: Lay Summarisation of Biomedical Research Articles Using Generative Models
Mong Yuan Sim | Xiang Dai | Maciej Rybinski | Sarvnaz Karimi
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

Lay summarisation aims to generate summaries for a non-expert audience, allowing them to keep up to date with the latest research in a specific field. Despite significant advances in text summarisation, lay summarisation remains relatively under-explored. We present a comprehensive set of experiments and analyses investigating the effectiveness of existing pre-trained language models in generating lay summaries. When our models were evaluated in the BioNLP shared task BioLaySumm, our submission ranked second on the relevance criterion and third overall among 21 competing teams.


An Empirical Study on Topic Preservation in Multi-Document Summarization
Mong Yuan Sim | Wei Emma Zhang | Congbo Ma
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing: Student Research Workshop

Multi-document summarization (MDS) is the process of generating an informative and concise summary from multiple topic-related documents. Many studies have analyzed the quality of MDS datasets or models; however, none has done so from the perspective of topic preservation. In this work, we fill this gap by performing an empirical analysis of two MDS datasets and studying topic preservation in summaries generated by 8 MDS models. Our key findings are that i) the Multi-News dataset has better gold summaries than Multi-XScience in terms of topic distribution consistency, and ii) extractive approaches preserve topic information from source documents better than abstractive approaches. We hope our findings will help in developing summarization models that generate topic-focused summaries, and will inspire researchers creating datasets for this challenging task.