MultiHumES: Multilingual Humanitarian Dataset for Extractive Summarization

Jenny Paola Yela-Bello; Ewan Oglethorpe; Navid Rekabsaz

doi:10.18653/v1/2021.eacl-main.146

Abstract

When responding to a disaster, humanitarian experts must rapidly process large amounts of secondary data to gain situational awareness and guide decision-making. While these documents contain valuable information, processing them manually is extremely time-consuming when an expedient response is necessary. Effective summarization models are a valuable tool for humanitarian response experts because they provide digestible overviews of the essential information in secondary data. This paper focuses on extractive summarization for the humanitarian response domain, and describes and makes public a new multilingual data collection for this purpose. The collection, called MultiHumES, provides multilingual documents coupled with informative snippets that have been annotated by humanitarian analysts over the past four years. We report the performance of a recent neural network-based summarization model together with other baselines. We hope that the released data collection will stimulate further research on multilingual extractive summarization in the humanitarian response domain.

Anthology ID: 2021.eacl-main.146
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month: April
Year: 2021
Address: Online
Editors: Paola Merlo, Jörg Tiedemann, Reut Tsarfaty
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1713–1717
URL: https://aclanthology.org/2021.eacl-main.146/
DOI: 10.18653/v1/2021.eacl-main.146
Cite (ACL): Jenny Paola Yela-Bello, Ewan Oglethorpe, and Navid Rekabsaz. 2021. MultiHumES: Multilingual Humanitarian Dataset for Extractive Summarization. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 1713–1717, Online. Association for Computational Linguistics.
Cite (Informal): MultiHumES: Multilingual Humanitarian Dataset for Extractive Summarization (Yela-Bello et al., EACL 2021)
PDF: https://aclanthology.org/2021.eacl-main.146.pdf
