Emmanuel Chemla

2024

pdf bib abs
Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length
Nur Lan | Emmanuel Chemla | Roni Katzir
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Neural networks offer good approximation to many tasks but consistently fail to reach perfect generalization, even when theoretical work shows that such perfect solutions can be expressed by certain architectures. Using the task of formal language learning, we focus on one simple formal language and show that the theoretically correct solution is in fact not an optimum of commonly used objectives — even with regularization techniques that according to common wisdom should lead to simple weights and good generalization (L1, L2) or other meta-heuristics (early-stopping, dropout). On the other hand, replacing standard targets with the Minimum Description Length objective (MDL) results in the correct solution being an optimum.

pdf bib abs
Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach
Maxime Poli | Emmanuel Chemla | Emmanuel Dupoux
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Recent progress in Spoken Language Modeling has shown that learning language directly from speech is feasible. Generating speech through a pipeline that operates at the text level typically loses nuances, intonations, and non-verbal vocalizations. Modeling directly from speech opens up the path to more natural and expressive systems. On the other hand, speech-only systems require up to three orders of magnitude more data to catch up to their text-based counterparts in terms of their semantic abilities. We show that fine-tuning speech representation models on phoneme classification leads to more context-invariant representations, and language models trained on these units achieve comparable lexical comprehension to ones trained on hundred times more data.

2023

pdf bib abs
Benchmarking Neural Network Generalization for Grammar Induction
Nur Lan | Emmanuel Chemla | Roni Katzir
Proceedings of the 2023 CLASP Conference on Learning with Small Data (LSD)

How well do neural networks generalize? Even for grammar induction tasks, where the target generalization is fully known, previous works have left the question open, testing very limited ranges beyond the training set and using different success criteria. We provide a measure of neural network generalization based on fully specified formal languages. Given a model and a formal grammar, the method assigns a generalization score representing how well a model generalizes to unseen samples in inverse relation to the amount of data it was trained on. The benchmark includes languages such as aⁿbⁿ, aⁿbⁿcⁿ, aⁿb^mc^n+m, and Dyck-1 and 2. We evaluate selected architectures using the benchmark and find that networks trained with a Minimum Description Length objective (MDL) generalize better and using less data than networks trained using standard loss functions. The benchmark is available at https://github.com/taucompling/bliss.

pdf bib abs
It is a Bird Therefore it is a Robin: On BERT’s Internal Consistency Between Hypernym Knowledge and Logical Words
Nicolas Guerin | Emmanuel Chemla
Findings of the Association for Computational Linguistics: ACL 2023

The lexical knowledge of NLP systems shouldbe tested (i) for their internal consistency(avoiding groundedness issues) and (ii) bothfor content words and logical words. In thispaper we propose a new method to test the understandingof the hypernymy relationship bymeasuring its antisymmetry according to themodels. Previous studies often rely only on thedirect question (e.g., A robin is a ...), where weargue a correct answer could only rely on collocationalcues, rather than hierarchical cues. We show how to control for this, and how it isimportant. We develop a method to ask similarquestions about logical words that encode anentailment-like relation (e.g., because or therefore).Our results show important weaknessesof BERT-like models on these semantic tasks.

2022

pdf bib abs
Minimum Description Length Recurrent Neural Networks
Nur Lan | Michal Geyer | Emmanuel Chemla | Roni Katzir
Transactions of the Association for Computational Linguistics, Volume 10

We train neural networks to optimize a Minimum Description Length score, that is, to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as anbn, anbncn, anb2n, anbmcn +m, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.

2020

pdf bib abs
On the Spontaneous Emergence of Discrete and Compositional Signals
Nur Geffen Lan | Emmanuel Chemla | Shane Steinert-Threlkeld
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We propose a general framework to study language emergence through signaling games with neural agents. Using a continuous latent space, we are able to (i) train using backpropagation, (ii) show that discrete messages nonetheless naturally emerge. We explore whether categorical perception effects follow and show that the messages are not compositional.

Emmanuel Chemla

2024

2023

2022

2020

2014

Co-authors

Venues