Ran Liu

2025

With the popularity of large language models and their high-quality text generation capabilities, researchers are using them as auxiliary tools for text summary writing. Although summaries generated by these large language models are smooth and capture key information sufficiently, the quality of their output depends on the prompt, and the generated text is somewhat procedural to a certain extent. We construct LecSumm to verify whether language models truly capture human writing preferences, in which we recruit 200 college students to write summaries for lecture notes on ten different machine-learning topics and analyze writing preferences in real-world human summaries through the dimensions of length, content depth, tone & style, and summary format. We define the method of capturing human writing preferences by language models as finetuning pre-trained models with data and designing prompts to optimize the output of large language models. The results of translating the analyzed human writing preferences into prompts and conducting experiments show that both models still fail to capture human writing preferences effectively. Our LecSumm dataset brings new challenges to finetuned and prompt-based large language models on the task of human-centered text summarization.

2024

pdf bib abs

With the popularity of large language models (LLMs) and their ability to handle longer input documents, there is a growing need for high-quality long document summarization datasets. Although many models already support 16k input, current lengths of summarization datasets are inadequate, and salient information is not evenly distributed. To bridge these gaps, we collect a new summarization dataset called SumSurvey, consisting of more than 18k scientific survey papers. With an average document length exceeding 12k and a quarter exceeding 16k, as well as the uniformity metric outperforming current mainstream long document summarization datasets, SumSurvey brings new challenges and expectations to both fine-tuned models and LLMs. The informativeness of summaries and the models supporting the evaluation of long document summarization warrant further attention. Automatic and human evaluation results on this abstractive dataset confirm this view. Our dataset and code are available at https://github.com/Oswald1997/SumSurvey.

Co-authors

Jingbao Luo 1

Yongpan Sheng 1

Peng Wu 1

Min Yu 1

He Zhang 1

Venues

Findings2

Fix author