Besher Hassan
2026
Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling
Ivan Rodkin | Daniil Orel | Konstantin Smirnov | Arman Bolatov | Bilal Elbouardi | Besher Hassan | Yuri Kuratov | Aydar Bulatov | Preslav Nakov | Timothy Baldwin | Artem Shelmanov | Mikhail Burtsev
Findings of the Association for Computational Linguistics: ACL 2026
Reasoning is a core capability of large language models (LLMs), yet how multi-step reasoning is learned and executed remains unclear. We study this question in a controlled one-dimensional cellular-automata (1dCA) framework that excludes memorization by using disjoint training and test rules. Given a short state sequence, the model is required to infer the hidden local rule and then chain it to predict multiple future steps. Our evaluation shows that LLMs largely fail to reliably solve a natural-language proxy of the proposed task. We find that most neural architectures trained from scratch can learn rule inference and achieve high next-step accuracy, but performance drops sharply as the required number of intermediate reasoning steps increases. Experiments show that increasing model depth is crucial, and extending effective depth via recurrence, memory, or test-time compute improves results but remains bounded. Code is available on GitHub: https://github.com/RodkinIvan/associative-recurrent-memory-transformer/tree/ACT.
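To make the task in this abstract concrete, the following is a minimal sketch of rule inference followed by multi-step rollout. It is not the authors' code. The abstract does not specify the neighbourhood size, state alphabet, or boundary handling, so the sketch assumes an elementary binary automaton with radius 1, periodic boundaries, and Wolfram rule numbering. The function names `step`, `rollout`, and `consistent_rules` are hypothetical.

```python
def step(state, rule):
    """Apply one update of an elementary cellular automaton.

    Assumptions (not from the paper): binary cells, radius-1 neighbourhood,
    wrap-around boundaries, Wolfram numbering for `rule` (0..255).
    """
    n = len(state)
    out = []
    for i in range(n):
        left, centre, right = state[i - 1], state[i], state[(i + 1) % n]
        idx = (left << 2) | (centre << 1) | right  # neighbourhood as a 3-bit index
        out.append((rule >> idx) & 1)              # look up that bit in the rule table
    return out


def rollout(state, rule, steps):
    """Chain the rule `steps` times; returns the initial state plus every successor."""
    states = [list(state)]
    for _ in range(steps):
        states.append(step(states[-1], rule))
    return states


def consistent_rules(history):
    """Brute-force rule inference: all rules that reproduce every observed transition."""
    return [
        r for r in range(256)
        if all(step(a, r) == b for a, b in zip(history, history[1:]))
    ]


if __name__ == "__main__":
    observed = rollout([0, 0, 0, 1, 0, 0, 0], rule=90, steps=3)
    candidates = consistent_rules(observed)
    # Predict further ahead with any candidate that explains the observations.
    future = rollout(observed[-1], candidates[0], steps=2)
    print(candidates, future[-1])
```

A short observation window may leave several rules consistent with the data, since neighbourhoods that never occur in the window place no constraint on the rule table. The brute-force search here is only feasible because the assumed rule space has 256 entries.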
Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages
Saeed Almheiri | Bilal Elbouardi | Salsabila Zahirah Pranida | Irina Nikishina | Ashwath Rao B | Parameswari Krishnamurthy | Muhammad Cendekia Airlangga | Rifo Ahmad Genadi | Nguyen Phan Gia Bao | Amir Hossein Yari | Hawau Olamide Toyin | Nurdaulet Mukhituly | Mena Attia | Besher Hassan | Ahmad Fathan Hidayatullah | Tatsuki Kuribayashi | Haonan Li | Suma Bhat | Fajri Koto
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Idiomatic expressions pose a major challenge for multilingual NLP because their meanings shift between figurative and literal usage, often requiring context for accurate interpretation. Prior work has focused on high-resource languages and typically evaluates isolated idiom-meaning questions, overlooking realistic discourse. We introduce MIDI, a multilingual idiom dataset spanning 3 high-, 3 medium-, and 12 low-resource languages, curated by native speakers. Unlike previous datasets, MIDI provides idioms embedded in both sentence-level and conversational contexts, capturing both literal and figurative readings. Benchmarking state-of-the-art models shows that idiom comprehension degrades in low-resource languages and that, in all resource tiers, literal interpretations are substantially harder than figurative ones. Conversational context improves performance but does not eliminate these disparities. Through controlled tests and interventions on hidden representations, we further separate memorization from reasoning, exposing core limitations of current models.
Co-authors
- Bilal Elbouardi 2
- Muhammad Cendekia Airlangga 1
- Saeed Almheiri 1
- Mena Attia 1
- Ashwath Rao B 1
- Timothy Baldwin 1
- Nguyen Phan Gia Bao 1
- Suma Bhat 1
- Arman Bolatov 1
- Aydar Bulatov 1
- Mikhail Burtsev 1
- Rifo Ahmad Genadi 1
- Ahmad Fathan Hidayatullah 1
- Fajri Koto 1
- Parameswari Krishnamurthy 1
- Yurii Kuratov 1
- Tatsuki Kuribayashi 1
- Haonan Li 1
- Nurdaulet Mukhituly 1
- Preslav Nakov 1
- Irina Nikishina 1
- Daniil Orel 1
- Salsabila Zahirah Pranida 1
- Ivan Rodkin 1
- Artem Shelmanov 1
- Konstantin Smirnov 1
- Hawau Olamide Toyin 1
- Amir Hossein Yari 1