An L* Algorithm for Deterministic Weighted Regular Languages

Clemente Pasti, Talu Karagöz, Franz Nowak, Anej Svete, Reda Boumasmoud, Ryan Cotterell


Abstract
Extracting finite state automata (FSAs) fromblack-box models offers a powerful approachto gaining interpretable insights into complexmodel behaviors. To support this pursuit, wepresent a weighted variant of Angluin’s (1987)L* algorithm for learning FSAs. We stay faithful to the original formulation, devising a wayto exactly learn deterministic weighted FSAswhose weights support division. Furthermore,we formulate the learning process in a mannerthat highlights the connection with FSA minimization, showing how L* directly learns aminimal automaton for the target language.
Anthology ID:
2024.emnlp-main.468
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
8197–8210
Language:
URL:
https://aclanthology.org/2024.emnlp-main.468
DOI:
Bibkey:
Cite (ACL):
Clemente Pasti, Talu Karagöz, Franz Nowak, Anej Svete, Reda Boumasmoud, and Ryan Cotterell. 2024. An L* Algorithm for Deterministic Weighted Regular Languages. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 8197–8210, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
An L* Algorithm for Deterministic Weighted Regular Languages (Pasti et al., EMNLP 2024)
Copy Citation:
PDF:
https://aclanthology.org/2024.emnlp-main.468.pdf