LOCOST: State-Space Models for Long Document Abstractive Summarization

Florian Le Bronnec; Song Duong; Mathieu Ravaut; Alexandre Allauzen; Nancy Chen; Vincent Guigue; Alberto Lumbreras; Laure Soulier; Patrick Gallinari

Abstract

State-space models are a low-complexity alternative to transformers for encoding long sequences and capturing long-term dependencies. We propose LOCOST: an encoder-decoder architecture based on state-space models for conditional text generation with long context inputs. With a computational complexity of 𝒪(L log L), this architecture can handle significantly longer sequences than state-of-the-art models that are based on sparse attention patterns. We evaluate our model on a series of long document abstractive summarization tasks. The model reaches 93-96% of the performance of the top-performing sparse transformers of the same size while saving up to 50% memory during training and up to 87% during inference. Additionally, LOCOST effectively handles input texts exceeding 600K tokens at inference time, setting new state-of-the-art results on full-book summarization and opening new perspectives for long input processing.
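
The abstract states the 𝒪(L log L) cost without saying where it comes from. One common route in state-space models, assumed here rather than taken from this page, is to unroll the recurrence into a single long convolution kernel and apply it with an FFT instead of a direct 𝒪(L²) sum. The sketch below illustrates only that idea in plain Python; the function names `fft`, `causal_conv_fft`, and `causal_conv_direct` are hypothetical, and a practical implementation would call a library FFT.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    The inverse transform is left unscaled (the caller divides by n).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def causal_conv_fft(u, kernel):
    """Causal convolution y[t] = sum_{s<=t} kernel[s] * u[t-s] via FFT."""
    L = len(u)
    kernel = kernel[:L]  # kernel taps beyond the input length never contribute
    n = 1
    while n < 2 * L:  # zero-pad to avoid circular wrap-around
        n *= 2
    U = fft([complex(v) for v in u] + [0j] * (n - L))
    K = fft([complex(v) for v in kernel] + [0j] * (n - len(kernel)))
    Y = fft([a * b for a, b in zip(U, K)], inverse=True)
    return [(y / n).real for y in Y[:L]]


def causal_conv_direct(u, kernel):
    """Reference O(L^2) version of the same convolution."""
    return [
        sum(kernel[s] * u[t - s] for s in range(min(t + 1, len(kernel))))
        for t in range(len(u))
    ]
```

For instance, a scalar state-space model with state transition 0.5 and unit input and output weights (an illustrative assumption) has kernel `[1.0, 0.5, 0.25, 0.125]` over four steps, and both functions should return the same output for any input of that length, up to floating-point error.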

Anthology ID: 2024.eacl-long.69
Original: 2024.eacl-long.69v1
Version 2: 2024.eacl-long.69v2
Volume: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: March
Year: 2024
Address: St. Julian’s, Malta
Editors: Yvette Graham, Matthew Purver
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 1144–1159
URL: https://aclanthology.org/2024.eacl-long.69/
Award: Best Paper Award
Cite (ACL): Florian Le Bronnec, Song Duong, Mathieu Ravaut, Alexandre Allauzen, Nancy Chen, Vincent Guigue, Alberto Lumbreras, Laure Soulier, and Patrick Gallinari. 2024. LOCOST: State-Space Models for Long Document Abstractive Summarization. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1144–1159, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal): LOCOST: State-Space Models for Long Document Abstractive Summarization (Le Bronnec et al., EACL 2024)
PDF: https://aclanthology.org/2024.eacl-long.69.pdf
Video: https://aclanthology.org/2024.eacl-long.69.mp4
