Yoram Bachrach

2021

pdf bib abs
Game-theoretic Vocabulary Selection via the Shapley Value and Banzhaf Index
Roma Patel | Marta Garnelo | Ian Gemp | Chris Dyer | Yoram Bachrach
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The input vocabulary and the representations learned are crucial to the performance of neural NLP models. Using the full vocabulary results in less explainable and more memory intensive models, with the embedding layer often constituting the majority of model parameters. It is thus common to use a smaller vocabulary to lower memory requirements and construct more interpertable models. We propose a vocabulary selection method that views words as members of a team trying to maximize the model’s performance. We apply power indices from cooperative game theory, including the Shapley value and Banzhaf index, that measure the relative importance of individual team members in accomplishing a joint task. We approximately compute these indices to identify the most influential words. Our empirical evaluation examines multiple NLP tasks, including sentence and document classification, question answering and textual entailment. We compare to baselines that select words based on frequency, TF-IDF and regression coefficients under L1 regularization, and show that this game-theoretic vocabulary selection outperforms all baseline on a range of different tasks and datasets.

2018

pdf bib abs
Named Entity Recognition With Parallel Recurrent Neural Networks
Andrej Žukov-Gregorič | Yoram Bachrach | Sam Coope
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We present a new architecture for named entity recognition. Our model employs multiple independent bidirectional LSTM units across the same input and promotes diversity among them by employing an inter-model regularization term. By distributing computation across multiple smaller LSTMs we find a significant reduction in the total number of parameters. We find our architecture achieves state-of-the-art performance on the CoNLL 2003 NER dataset.

Yoram Bachrach

2021

2018

2016

2015

Co-authors

Venues