How Do Hyenas Deal with Human Speech? Speech Recognition and Translation with ConfHyena

Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli


Abstract
The attention mechanism, a cornerstone of state-of-the-art neural models, faces computational hurdles in processing long sequences due to its quadratic complexity. Consequently, research efforts in the last few years have focused on finding more efficient alternatives. Among them, Hyena (Poli et al., 2023) stands out for achieving competitive results in both language modeling and image classification, while offering sub-quadratic memory and computational complexity. Building on these promising results, we propose ConfHyena, a Conformer whose encoder self-attentions are replaced with an adaptation of Hyena for speech processing, where the long input sequences cause high computational costs. Through experiments in automatic speech recognition (for English) and translation (from English into 8 target languages), we show that our best ConfHyena model significantly reduces the training time by 27%, at the cost of minimal quality degradation (∼1%), which, in most cases, is not statistically significant.
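
As a rough illustration of where a sub-quadratic cost can come from: Hyena, as described by Poli et al. (2023) rather than in this abstract, builds on long convolutions evaluated with the FFT. The sketch below is not the ConfHyena or Hyena implementation; it only compares a direct O(n²) causal convolution with an O(n log n) FFT-based one, assuming the filter has the same length as the input. The function names (`fft`, `direct_causal_conv`, `fft_causal_conv`) are hypothetical and chosen for this example.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

def direct_causal_conv(x, h):
    """Quadratic-time causal convolution: y[t] = sum over s <= t of h[t - s] * x[s]."""
    n = len(x)
    return [sum(h[t - s] * x[s] for s in range(t + 1)) for t in range(n)]

def fft_causal_conv(x, h):
    """Same output via zero-padded FFTs; assumes len(h) == len(x)."""
    n = len(x)
    size = 1
    while size < 2 * n:  # pad so the circular convolution does not wrap around
        size *= 2
    fx = fft(list(x) + [0.0] * (size - n))
    fh = fft(list(h) + [0.0] * (size - len(h)))
    prod = [a * b for a, b in zip(fx, fh)]
    y = fft(prod, invert=True)
    # The inverse transform above is unnormalized, so divide by the FFT size.
    return [(v / size).real for v in y[:n]]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    h = [1.0, 0.5, 0.25, 0.125]
    print(direct_causal_conv(x, h))
    print(fft_causal_conv(x, h))
```

Both functions return the same values up to floating-point error; only the number of operations differs as the sequence length grows, which is what matters for long speech inputs.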
Anthology ID:
2024.lrec-main.717
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
8184–8191
URL:
https://aclanthology.org/2024.lrec-main.717
Cite (ACL):
Marco Gaido, Sara Papi, Matteo Negri, and Luisa Bentivogli. 2024. How Do Hyenas Deal with Human Speech? Speech Recognition and Translation with ConfHyena. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 8184–8191, Torino, Italia. ELRA and ICCL.
Cite (Informal):
How Do Hyenas Deal with Human Speech? Speech Recognition and Translation with ConfHyena (Gaido et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.717.pdf