Roni Katzir


2023

Benchmarking Neural Network Generalization for Grammar Induction
Nur Lan | Emmanuel Chemla | Roni Katzir
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

How well do neural networks generalize? Even for grammar induction tasks, where the target generalization is fully known, previous works have left the question open, testing very limited ranges beyond the training set and using different success criteria. We provide a measure of neural network generalization based on fully specified formal languages. Given a model and a formal grammar, the method assigns a generalization score representing how well a model generalizes to unseen samples, in inverse relation to the amount of data it was trained on. The benchmark includes languages such as a^nb^n, a^nb^nc^n, a^nb^mc^(n+m), and Dyck-1 and Dyck-2. We evaluate selected architectures using the benchmark and find that networks trained with a Minimum Description Length (MDL) objective generalize better and with less data than networks trained with standard loss functions. The benchmark is available at https://github.com/taucompling/bliss.
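
A minimal, hypothetical sketch of the kind of score the abstract describes (this is not the actual scoring code from the bliss repository): evaluate a next-symbol predictor on a^nb^n strings with n beyond the training range, counting only positions whose continuation is fully determined, and weight the result against the amount of training data the model consumed. All function names and the weighting formula are illustrative assumptions.

    def anbn(n: int) -> str:
        return "a" * n + "b" * n + "#"      # '#' marks end of string

    def deterministic_accuracy(predict_next, test_ns) -> float:
        """Fraction of correct predictions at deterministic positions of unseen
        a^n b^n strings; predict_next(prefix) stands in for the trained model."""
        correct = total = 0
        for n in test_ns:
            s = anbn(n)
            first_b = s.index("b")
            # After the first 'b', the remaining symbols are fully determined.
            for i in range(first_b + 1, len(s)):
                correct += predict_next(s[:i]) == s[i]
                total += 1
        return correct / total

    def generalization_score(predict_next, test_ns, n_train_samples: int) -> float:
        """Toy score: unseen-string accuracy, discounted by training-set size,
        so that equal accuracy reached with less data scores higher."""
        return deterministic_accuracy(predict_next, test_ns) / (1 + n_train_samples)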

2022

Minimum Description Length Recurrent Neural Networks
Nur Lan | Michal Geyer | Emmanuel Chemla | Roni Katzir
Transactions of the Association for Computational Linguistics, Volume 10

We train neural networks to optimize a Minimum Description Length score, that is, to balance the complexity of the network against its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as a^nb^n, a^nb^nc^n, a^nb^(2n), and a^nb^mc^(n+m), and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.
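
A rough, hypothetical sketch of an MDL-style objective in the spirit the abstract describes (not the paper's actual encoding scheme for networks): the score is the number of bits needed to describe the network plus the number of bits needed to describe the training data given the network, i.e., its cross-entropy on the data in bits. The function names and the probability floor are illustrative assumptions.

    import math

    def data_given_model_bits(predict_proba, training_strings) -> float:
        """Code length of the data under the model: -log2 of the probability
        the model assigns to each observed next symbol; predict_proba(prefix,
        symbol) stands in for the trained network."""
        bits = 0.0
        for s in training_strings:
            for i in range(1, len(s)):
                p = predict_proba(s[:i], s[i])
                bits += -math.log2(max(p, 1e-12))   # floor avoids log2(0)
        return bits

    def mdl_score(model_description_bits: float, predict_proba, training_strings) -> float:
        """MDL objective: model complexity plus data-given-model code length.
        Smaller is better; a small network that still predicts the data well wins."""
        return model_description_bits + data_given_model_bits(predict_proba, training_strings)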