Clemente Pasti
2024
An L* Algorithm for Deterministic Weighted Regular Languages
Clemente Pasti
|
Talu Karagöz
|
Franz Nowak
|
Anej Svete
|
Reda Boumasmoud
|
Ryan Cotterell
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Extracting finite state automata (FSAs) fromblack-box models offers a powerful approachto gaining interpretable insights into complexmodel behaviors. To support this pursuit, wepresent a weighted variant of Angluin’s (1987)L* algorithm for learning FSAs. We stay faithful to the original formulation, devising a wayto exactly learn deterministic weighted FSAswhose weights support division. Furthermore,we formulate the learning process in a mannerthat highlights the connection with FSA minimization, showing how L* directly learns aminimal automaton for the target language.
2023
On the Intersection of Context-Free and Regular Languages
Clemente Pasti
|
Andreas Opedal
|
Tiago Pimentel
|
Tim Vieira
|
Jason Eisner
|
Ryan Cotterell
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
The Bar-Hillel construction is a classic result in formal language theory. It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free. In the construction, the regular language is specified by a finite-state automaton. However, neither the original construction (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle finite-state automata with ε-arcs. While it is possible to remove ε-arcs from a finite-state automaton efficiently without modifying the language, such an operation modifies the automaton’s set of paths. We give a construction that generalizes the Bar- Hillel in the case the desired automaton has ε-arcs, and further prove that our generalized construction leads to a grammar that encodes the structure of both the input automaton and grammar while retaining the asymptotic size of the original construction.
Search
Co-authors
- Ryan Cotterell 2
- Talu Karagöz 1
- Franz Nowak 1
- Anej Svete 1
- Reda Boumasmoud 1
- show all...