Judy Li
2023
Revisiting Cross-Lingual Summarization: A Corpus-based Study and A New Benchmark with Improved Annotation
Yulong Chen | Huajian Zhang | Yijie Zhou | Xuefeng Bai | Yueguan Wang | Ming Zhong | Jianhao Yan | Yafu Li | Judy Li | Xianchao Zhu | Yue Zhang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Most existing cross-lingual summarization (CLS) work constructs CLS corpora by directly translating pre-annotated summaries from one language to another, so the resulting corpora can contain errors from both the summarization and the translation processes. To address this issue, we propose ConvSumX, a cross-lingual conversation summarization benchmark, through a new annotation schema that explicitly considers source input context. ConvSumX consists of 2 sub-tasks under different real-world scenarios, with each covering 3 language directions. We conduct a thorough analysis of ConvSumX and 3 widely used manually annotated CLS corpora and empirically find that ConvSumX is more faithful to the input text. Additionally, based on the same intuition, we propose a 2-Step method, which takes both the conversation and the summary as input to simulate the human annotation process. Experimental results show that the 2-Step method surpasses strong baselines on ConvSumX under both automatic and human evaluation. Analysis shows that both the source input text and the summary are crucial for modeling cross-lingual summaries.
Co-authors
- Xuefeng Bai (白雪峰) 1
- Yulong Chen 1
- Yafu Li 1
- Yueguan Wang 1
- Jianhao Yan 1
Venues
- ACL 1