Characterizing Mamba’s Selective Memory using Auto-Encoders

Tamanna Hossain; Robert L. Logan Iv; Chandrasekhara Ganesh Jagadeesan; Sameer Singh; Joel Tetreault; Alejandro Jaimes

Abstract

State space models (SSMs) are a promising alternative to transformers for language modeling because they use fixed memory during inference. However, this fixed memory usage requires some information loss in the hidden state when processing long sequences. While prior work has studied the sequence length at which this information loss occurs, it does not characterize the types of information SSM language models (LMs) tend to forget. In this paper, we address this knowledge gap by identifying the types of tokens (e.g., parts of speech, named entities) and sequences (e.g., code, math problems) that are more frequently forgotten by SSM LMs. We achieve this by training an auto-encoder to reconstruct sequences from the SSM’s hidden state and measuring information loss by comparing inputs with their reconstructions. We perform experiments using the Mamba family of SSM LMs (130M–1.4B) on sequences ranging from 4–256 tokens. Our results show significantly higher rates of information loss on math-related tokens (e.g., numbers, variables), mentions of organization entities, and dialects other than Standard American English. We then examine how frequently these tokens appear in Mamba’s pretraining data and find that less prevalent tokens tend to be the ones Mamba is most likely to forget. By identifying these patterns, our work provides clear direction for future research to develop methods that better control Mamba’s ability to retain important information.
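The comparison of inputs with their reconstructions, broken down by token type, can be illustrated with a minimal sketch. It assumes position-wise exact match as the measure of loss; the abstract does not specify the metric used in the paper, so this is only one plausible reading. The function name `loss_rate_by_category`, the category labels, and the toy tokens are all hypothetical and chosen for illustration, not taken from the paper.

```python
from collections import defaultdict


def loss_rate_by_category(original, reconstructed, categories):
    """Return, per category, the fraction of positions where the
    reconstructed token differs from the original token.

    Assumption: information loss is approximated by position-wise
    exact-match failure. The paper may use a different measure.
    """
    if not (len(original) == len(reconstructed) == len(categories)):
        raise ValueError("all three sequences must have the same length")

    totals = defaultdict(int)  # tokens seen per category
    misses = defaultdict(int)  # mismatched tokens per category
    for orig_tok, recon_tok, cat in zip(original, reconstructed, categories):
        totals[cat] += 1
        if orig_tok != recon_tok:
            misses[cat] += 1

    return {cat: misses[cat] / totals[cat] for cat in totals}


# Hypothetical toy data: an input sequence, a made-up reconstruction,
# and a made-up category label for each token.
original = ["Acme", "paid", "42", "dollars", "for", "7", "widgets"]
reconstructed = ["Apex", "paid", "40", "dollars", "for", "7", "widgets"]
categories = ["ORG", "WORD", "NUM", "WORD", "WORD", "NUM", "WORD"]

rates = loss_rate_by_category(original, reconstructed, categories)
for cat, rate in sorted(rates.items()):
    print(f"{cat}: {rate:.2f}")
```

In this toy case the organization name and one of the two numbers are reconstructed incorrectly, so those categories show higher loss rates than ordinary words. Aggregating such rates over many sequences is one way to compare categories, under the exact-match assumption stated above.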

Anthology ID: 2025.ijcnlp-long.109
Volume: Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month: December
Year: 2025
Address: Mumbai, India
Editors: Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venues: IJCNLP | AACL
Publisher: The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Pages: 2007–2022
URL: https://aclanthology.org/2025.ijcnlp-long.109/
Cite (ACL): Tamanna Hossain, Robert L. Logan Iv, Chandrasekhara Ganesh Jagadeesan, Sameer Singh, Joel R. Tetreault, and Alejandro Jaimes. 2025. Characterizing Mamba’s Selective Memory using Auto-Encoders. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 2007–2022, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal): Characterizing Mamba’s Selective Memory using Auto-Encoders (Hossain et al., IJCNLP-AACL 2025)
PDF: https://aclanthology.org/2025.ijcnlp-long.109.pdf
