<?xml version="1.0" encoding="UTF-8" ?>
<volume id="W18">
  <paper id="5400">
    <title>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</title>
    <editor><first>Tal</first><last>Linzen</last></editor>
    <editor><first>Grzegorz</first><last>Chrupała</last></editor>
    <editor><first>Afra</first><last>Alishahi</last></editor>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <url>http://www.aclweb.org/anthology/W18-54</url>
    <bibtype>book</bibtype>
    <bibkey>BlackboxNLP:2018</bibkey>
  </paper>

  <paper id="5401">
    <title>When does deep multi-task learning work for loosely related document classification tasks?</title>
    <author><first>Emma</first><last>Kerinec</last></author>
    <author><first>Chlo&#233;</first><last>Braud</last></author>
    <author><first>Anders</first><last>Søgaard</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>1&#8211;8</pages>
    <url>http://www.aclweb.org/anthology/W18-5401</url>
    <abstract>This work aims to contribute to our understanding of when multi-task learning through parameter sharing in deep neural networks leads to improvements over single-task learning. We focus on the setting of learning from loosely related tasks, for which no theoretical guarantees exist. We therefore approach the question empirically, studying which properties of datasets and single-task learning characteristics correlate with improvements from multi-task learning. We are the first to study this in a text classification setting and across more than 500 different task pairs.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kerinec-braud-sgaard:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5402">
    <title>Analyzing Learned Representations of a Deep ASR Performance Prediction Model</title>
    <author><first>Zied</first><last>Elloumi</last></author>
    <author><first>Laurent</first><last>Besacier</last></author>
    <author><first>Olivier</first><last>Galibert</last></author>
    <author><first>Benjamin</first><last>Lecouteux</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>9&#8211;15</pages>
    <url>http://www.aclweb.org/anthology/W18-5402</url>
    <abstract>This paper addresses a relatively new task: prediction of ASR performance on unseen broadcast programs. </abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>elloumi-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5403">
    <title>Explaining non-linear Classifier Decisions within Kernel-based Deep Architectures</title>
    <author><first>Danilo</first><last>Croce</last></author>
    <author><first>Daniele</first><last>Rossini</last></author>
    <author><first>Roberto</first><last>Basili</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>16&#8211;24</pages>
    <url>http://www.aclweb.org/anthology/W18-5403</url>
    <abstract>Nonlinear methods such as deep neural networks achieve state-of-the-art performances in several semantic NLP tasks. </abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>croce-rossini-basili:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5404">
    <title>Nightmare at test time: How punctuation prevents parsers from generalizing</title>
    <author><first>Anders</first><last>Søgaard</last></author>
    <author><first>Miryam</first><last>de Lhoneux</last></author>
    <author><first>Isabelle</first><last>Augenstein</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>25&#8211;29</pages>
    <url>http://www.aclweb.org/anthology/W18-5404</url>
    <abstract>Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this signal. Punctuation is a diversion, however, since human language processing does not rely on punctuation to the same extent, and in informal texts, we therefore often leave out punctuation. We also use punctuation ungrammatically for emphatic or creative purposes, or simply by mistake. We show that (a) dependency parsers are sensitive to both absence of punctuation and to alternative uses; (b) neural parsers tend to be more sensitive than vintage parsers; (c) training neural parsers without punctuation outperforms all out-of-the-box parsers across all scenarios where punctuation departs from standard punctuation. Our main experiments are on synthetically corrupted data to study the effect of punctuation in isolation and avoid potential confounds, but we also show effects on out-of-domain data.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>sgaard-delhoneux-augenstein:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5405">
    <title>Evaluating Textual Representations through Image Generation</title>
    <author><first>Graham</first><last>Spinks</last></author>
    <author><first>Marie-Francine</first><last>Moens</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>30&#8211;39</pages>
    <url>http://www.aclweb.org/anthology/W18-5405</url>
    <abstract>We present a methodology for determining the quality of textual representations through the ability to generate images from them. Continuous representations of textual input are ubiquitous in modern Natural Language Processing techniques, either at the core of machine learning algorithms or as a by-product at any given layer of a neural network. While current techniques to evaluate such representations focus on their performance on particular tasks, they don't provide a clear understanding of the level of informational detail that is stored within them, especially their ability to represent spatial information. The central premise of this paper is that visual inspection or analysis is the most convenient method to quickly and accurately determine information content. Through the use of text-to-image neural networks, we propose a new technique to compare the quality of textual representations by visualizing their information content. The method is illustrated on a medical dataset where the correct representation of spatial information and shorthands are of particular importance. For four different well-known textual representations, we show with a quantitative analysis that some representations are consistently able to deliver higher quality visualizations of the information content. Additionally, we show that the quantitative analysis technique correlates with the judgment of a human expert evaluator in terms of alignment.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>spinks-moens:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5406">
    <title>On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis</title>
    <author><first>Jose</first><last>Camacho-Collados</last></author>
    <author><first>Mohammad Taher</first><last>Pilehvar</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>40&#8211;46</pages>
    <url>http://www.aclweb.org/anthology/W18-5406</url>
    <abstract>Text preprocessing is often the first step in the pipeline of a Natural Language Processing (NLP) system, with potential impact in its final performance. Despite its importance, text preprocessing has not received much attention in the deep learning literature. In this paper we investigate the impact of simple text preprocessing decisions (particularly tokenizing, lemmatizing, lowercasing and multiword grouping) on the performance of a standard neural text classifier. We perform an extensive evaluation on standard benchmarks from text categorization and sentiment analysis. While our experiments show that a simple tokenization of input text is generally adequate, they also highlight significant degrees of variability across preprocessing techniques. This reveals the importance of paying attention to this usually-overlooked step in the pipeline, particularly when comparing different models. Finally, our evaluation provides insights into the best preprocessing practices for training word embeddings.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>camachocollados-pilehvar:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5407">
    <title>Jump to better conclusions: SCAN both left and right</title>
    <author><first>Joost</first><last>Bastings</last></author>
    <author><first>Marco</first><last>Baroni</last></author>
    <author><first>Jason</first><last>Weston</last></author>
    <author><first>Kyunghyun</first><last>Cho</last></author>
    <author><first>Douwe</first><last>Kiela</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>47&#8211;55</pages>
    <url>http://www.aclweb.org/anthology/W18-5407</url>
    <abstract>Lake &#38; Baroni (2018) recently introduced the SCAN data set, which consists of simple commands paired with action sequences and is intended to test the </abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>bastings-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5408">
    <title>Understanding Convolutional Neural Networks for Text Classification</title>
    <author><first>Alon</first><last>Jacovi</last></author>
    <author><first>Oren</first><last>Sar Shalom</last></author>
    <author><first>Yoav</first><last>Goldberg</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>56&#8211;65</pages>
    <url>http://www.aclweb.org/anthology/W18-5408</url>
    <abstract>We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text. CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery. We aim to understand the method by which the networks process and classify text. We examine a common hypothesis about this problem: that filters, accompanied by global max-pooling, serve as ngram detectors. We show that filters may capture several different semantic classes of ngrams by using different activation patterns, and that global max-pooling induces behavior which separates important ngrams from the rest. Finally, we show practical use cases derived from our findings in the form of model interpretability (explaining a trained model by deriving a concrete identity for each filter, bridging the gap between visualization tools in vision tasks and NLP) and prediction interpretability (explaining predictions).</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>jacovi-sarshalom-goldberg:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5409">
    <title>Linguistic representations in multi-task neural networks for ellipsis resolution</title>
    <author><first>Ola</first><last>Rønning</last></author>
    <author><first>Daniel</first><last>Hardt</last></author>
    <author><first>Anders</first><last>Søgaard</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>66&#8211;73</pages>
    <url>http://www.aclweb.org/anthology/W18-5409</url>
    <abstract>Sluicing resolution is the task of identifying the antecedent to a question ellipsis. Antecedents are often sentential constituents, and previous work has therefore relied on syntactic parsing, together with complex linguistic features. A recent model instead used partial parsing as an auxiliary task in sequential neural network architectures to inject syntactic information. We explore the linguistic information being brought to bear by such networks, both by defining subsets of the data exhibiting relevant linguistic characteristics, and by examining the internal representations of the network. Both perspectives provide evidence for substantial linguistic knowledge being deployed by the neural networks.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>rnning-hardt-sgaard:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5410">
    <title>Unsupervised Token-wise Alignment to Improve Interpretation of Encoder-Decoder Models</title>
    <author><first>Shun</first><last>Kiyono</last></author>
    <author><first>Sho</first><last>Takase</last></author>
    <author><first>Jun</first><last>Suzuki</last></author>
    <author><first>Naoaki</first><last>Okazaki</last></author>
    <author><first>Kentaro</first><last>Inui</last></author>
    <author><first>Masaaki</first><last>Nagata</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>74&#8211;81</pages>
    <url>http://www.aclweb.org/anthology/W18-5410</url>
    <abstract>Developing a method for understanding the inner workings of black-box neural methods is an important research endeavor.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kiyono-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5411">
    <title>Rule induction for global explanation of trained models</title>
    <author><first>Madhumita</first><last>Sushil</last></author>
    <author><first>Simon</first><last>Suster</last></author>
    <author><first>Walter</first><last>Daelemans</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>82&#8211;97</pages>
    <url>http://www.aclweb.org/anthology/W18-5411</url>
    <abstract>Understanding the behavior of a trained network and finding explanations for its outputs is important for improving the network's performance and generalization ability, and for ensuring trust in automated systems. Several approaches have previously been proposed to identify and visualize the most important features by analyzing a trained network. However, the relations between different features and classes are lost in most cases. We propose a technique to induce sets of if-then-else rules that capture these relations to globally explain the predictions of a network. We first calculate the importance of the features in the trained network. We then weigh the original inputs with these feature importance scores, simplify the transformed input space, and finally fit a rule induction model to explain the model predictions. We find that the output rule-sets can explain the predictions of a neural network trained for 4-class text classification from the 20 newsgroups dataset to a macro-averaged F-score of 0.80. We make the code available at https://github.com/clips/interpret_with_rules.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>sushil-suster-daelemans:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5412">
    <title>Can LSTM Learn to Capture Agreement? The Case of Basque</title>
    <author><first>Shauli</first><last>Ravfogel</last></author>
    <author><first>Yoav</first><last>Goldberg</last></author>
    <author><first>Francis</first><last>Tyers</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>98&#8211;107</pages>
    <url>http://www.aclweb.org/anthology/W18-5412</url>
    <abstract>We focus on the task of agreement prediction in Basque, as a case study for a task that requires implicit understanding of sentence structure and the acquisition of a complex but consistent morphological system. In a series of controlled experiments, we probe the ability of sequential models to learn agreement patterns and assess different aspects of the problem. Analyzing experimental results from two syntactic prediction tasks &#8211; verb number prediction and suffix recovery &#8211; we find that sequential models perform worse on agreement prediction in Basque than one might expect on the basis of previous agreement prediction work in English. Tentative findings based on diagnostic classifiers suggest the network makes use of local heuristics as a proxy for the hierarchical structure of the sentence. We propose the Basque agreement prediction task as a challenging benchmark for models that attempt to learn regularities in human language.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>ravfogel-goldberg-tyers:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5413">
    <title>Rearranging the Familiar: Testing Compositional Generalization in Recurrent Networks</title>
    <author><first>Joao</first><last>Loula</last></author>
    <author><first>Marco</first><last>Baroni</last></author>
    <author><first>Brenden</first><last>Lake</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>108&#8211;114</pages>
    <url>http://www.aclweb.org/anthology/W18-5413</url>
    <abstract>Systematic compositionality is the ability to recombine meaningful units with regular and predictable outcomes, and it's seen as key to humans' capacity for generalization in language. Recent work (Lake and Baroni, 2018) has studied systematic compositionality in modern seq2seq models using generalization to novel navigation instructions in a grounded environment as a probing tool. Lake and Baroni's main experiment required the models to quickly bootstrap the meaning of new words. We extend this framework here to settings where the model needs only to recombine well-trained functional words (such as "around" and "right") in novel contexts. Our findings confirm and strengthen the earlier ones: seq2seq models can be impressively good at generalizing to novel combinations of previously-seen input, but only when they receive extensive training on the specific pattern to be generalized (e.g., generalizing from many examples of "X around right" to "jump around right"), while failing when generalization requires novel application of compositional rules (e.g., inferring the meaning of "around right" from those of "right" and "around").</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>loula-baroni-lake:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5414">
    <title>Evaluating the Ability of LSTMs to Learn Context-Free Grammars</title>
    <author><first>Luzi</first><last>Sennhauser</last></author>
    <author><first>Robert</first><last>Berwick</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>115&#8211;124</pages>
    <url>http://www.aclweb.org/anthology/W18-5414</url>
    <abstract>While long short-term memory (LSTM) neural net architectures are designed to capture sequence information, human language is generally composed of hierarchical structures. This raises the question as to whether LSTMs can learn hierarchical structures. We explore this question with a well-formed bracket prediction task using two types of brackets modeled by an LSTM.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>sennhauser-berwick:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5415">
    <title>Interpretable Neural Architectures for Attributing an Ad’s Performance to its Writing Style</title>
    <author><first>Reid</first><last>Pryzant</last></author>
    <author><first>Sugato</first><last>Basu</last></author>
    <author><first>Kazoo</first><last>Sone</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>125&#8211;135</pages>
    <url>http://www.aclweb.org/anthology/W18-5415</url>
    <abstract>How much does "free shipping!" help an advertisement's ability to persuade?</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>pryzant-basu-sone:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5416">
    <title>Interpreting Neural Networks with Nearest Neighbors</title>
    <author><first>Eric</first><last>Wallace</last></author>
    <author><first>Shi</first><last>Feng</last></author>
    <author><first>Jordan</first><last>Boyd-Graber</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>136&#8211;144</pages>
    <url>http://www.aclweb.org/anthology/W18-5416</url>
    <abstract>Local model interpretation methods explain individual predictions by</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wallace-feng-boydgraber:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5417">
    <title>'Indicatements' that character language models learn English morpho-syntactic units and regularities</title>
    <author><first>Yova</first><last>Kementchedjhieva</last></author>
    <author><first>Adam</first><last>Lopez</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>145&#8211;153</pages>
    <url>http://www.aclweb.org/anthology/W18-5417</url>
    <abstract>Character language models have access to surface morphological patterns, but it is not clear whether or how they learn abstract morphological regularities. We instrument a character language model with several probes, finding that it can develop a specific unit to identify word boundaries and, by extension, morpheme boundaries, which allows it to capture linguistic properties and regularities of these units. Our language model proves surprisingly good at identifying the selectional restrictions of English derivational morphemes, a task that requires both morphological and syntactic awareness. Thus we conclude that, when morphemes overlap extensively with the words of a language, a character language model can perform morphological abstraction.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>kementchedjhieva-lopez:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5418">
    <title>LISA: Explaining Recurrent Neural Network Judgments via Layer-wIse Semantic Accumulation and Example to Pattern Transformation</title>
    <author><first>Pankaj</first><last>Gupta</last></author>
    <author><first>Hinrich</first><last>Sch&#252;tze</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>154&#8211;164</pages>
    <url>http://www.aclweb.org/anthology/W18-5418</url>
    <abstract>Recurrent neural networks (RNNs) are temporal</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>gupta-schtze:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5419">
    <title>Analysing the potential of seq-to-seq models for incremental interpretation in task-oriented dialogue</title>
    <author><first>Dieuwke</first><last>Hupkes</last></author>
    <author><first>Sanne</first><last>Bouwmeester</last></author>
    <author><first>Raquel</first><last>Fern&#225;ndez</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>165&#8211;174</pages>
    <url>http://www.aclweb.org/anthology/W18-5419</url>
    <abstract>We investigate how encoder-decoder models trained on a synthetic dataset of task-oriented dialogues process disfluencies, such as hesitations and self-corrections. We find that, contrary to earlier results, disfluencies have very little impact on the task success of seq-to-seq models with attention. Using visualisations and diagnostic classifiers, we analyse the representations that are incrementally built by the model, and discover that models develop little to no awareness of the structure of disfluencies. However, adding disfluencies to the data appears to help the model create clearer representations overall, as evidenced by the attention patterns the different models exhibit.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hupkes-bouwmeester-fernndez:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5420">
    <title>An Operation Sequence Model for Explainable Neural Machine Translation</title>
    <author><first>Felix</first><last>Stahlberg</last></author>
    <author><first>Danielle</first><last>Saunders</last></author>
    <author><first>Bill</first><last>Byrne</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>175&#8211;186</pages>
    <url>http://www.aclweb.org/anthology/W18-5420</url>
    <abstract>We propose to achieve explainable neural machine translation (NMT) by changing the output representation to explain itself. We present a novel approach to NMT which generates the target sentence by monotonically walking through the source sentence. Word reordering is modeled by operations which allow setting markers in the target sentence and moving a target-side write head between those markers. In contrast to many modern neural models, our system emits explicit word alignment information, which is often crucial to practical machine translation as it improves explainability. Our technique can outperform a plain text system in terms of BLEU score under the recent Transformer architecture on Japanese-English and Portuguese-English, and is within 0.5 BLEU difference on Spanish-English.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>stahlberg-saunders-byrne:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5421">
    <title>Introspection for convolutional automatic speech recognition</title>
    <author><first>Andreas</first><last>Krug</last></author>
    <author><first>Sebastian</first><last>Stober</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>187&#8211;199</pages>
    <url>http://www.aclweb.org/anthology/W18-5421</url>
    <abstract>Artificial Neural Networks (ANNs) have experienced great success in the past few years. The increasing complexity of these models leads to less understanding about their decision processes. Therefore, introspection techniques have been proposed, mostly for images as input data. Patterns or relevant regions in images can be intuitively interpreted by a human observer. This is not the case for more complex data like speech recordings. In this work, we investigate the application of common introspection techniques from computer vision to an Automatic Speech Recognition (ASR) task. To this end, we use a model similar to image classification, which predicts letters from spectrograms. We show difficulties in applying image introspection to ASR. To tackle these problems, we propose normalized averaging of aligned inputs (NAvAI): a data-driven method to reveal learned patterns for prediction of specific classes. Our method integrates information from many data examples through local introspection techniques for Convolutional Neural Networks (CNNs). We demonstrate that our method provides better interpretability of letter-specific patterns than existing methods.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>krug-stober:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5422">
    <title>Learning and Evaluating Sparse Interpretable Sentence Embeddings</title>
    <author><first>Valentin</first><last>Trifonov</last></author>
    <author><first>Octavian-Eugen</first><last>Ganea</last></author>
    <author><first>Anna</first><last>Potapenko</last></author>
    <author><first>Thomas</first><last>Hofmann</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>200&#8211;210</pages>
    <url>http://www.aclweb.org/anthology/W18-5422</url>
    <abstract>Previous research on word embeddings has shown that sparse representations, which can be either learned on top of existing dense embeddings or obtained through model constraints during training time, have the benefit of increased interpretability properties: to some degree, each dimension can be understood by a human and associated with a recognizable feature in the data. In this paper, we transfer this idea to sentence embeddings and explore several approaches to obtain a sparse representation. We further introduce a novel, quantitative and automated evaluation metric for sentence embedding interpretability, based on topic coherence methods. We observe an increase in interpretability compared to dense models, on a dataset of movie dialogs and on the scene descriptions from the MS COCO dataset.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>trifonov-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5423">
    <title>What do RNN Language Models Learn about Filler&#8211;Gap Dependencies?</title>
    <author><first>Ethan</first><last>Wilcox</last></author>
    <author><first>Roger</first><last>Levy</last></author>
    <author><first>Takashi</first><last>Morita</last></author>
    <author><first>Richard</first><last>Futrell</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>211&#8211;221</pages>
    <url>http://www.aclweb.org/anthology/W18-5423</url>
    <abstract>RNN language models have achieved state-of-the-art perplexity results and have proven useful in a suite of NLP tasks, but it is as yet unclear what syntactic generalizations they learn. Here we investigate whether state-of-the-art RNN language models represent long-distance filler&#8211;gap dependencies and constraints on them. Examining RNN behavior on experimentally controlled sentences designed to expose filler&#8211;gap dependencies, we show that RNNs can represent the relationship in multiple syntactic positions and over large spans of text. Furthermore, we show that RNNs learn a subset of the known restrictions on filler&#8211;gap dependencies, known as island constraints: RNNs show evidence for wh-islands, adjunct islands, and complex NP islands. These studies demonstrate that state-of-the-art RNN models are able to learn and generalize about empty syntactic positions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wilcox-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5424">
    <title>Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items</title>
    <author><first>Jaap</first><last>Jumelet</last></author>
    <author><first>Dieuwke</first><last>Hupkes</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>222&#8211;231</pages>
    <url>http://www.aclweb.org/anthology/W18-5424</url>
    <abstract>In this paper, we attempt to link the inner workings of a neural language model to linguistic theory, focusing on a complex phenomenon well discussed in formal linguistics: (negative) polarity items.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>jumelet-hupkes:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5425">
    <title>Closing Brackets with Recurrent Neural Networks</title>
    <author><first>Natalia</first><last>Skachkova</last></author>
    <author><first>Thomas</first><last>Trost</last></author>
    <author><first>Dietrich</first><last>Klakow</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>232&#8211;239</pages>
    <url>http://www.aclweb.org/anthology/W18-5425</url>
    <abstract>Many natural and formal languages contain words or symbols that require a matching counterpart for making an expression well-formed. The combination of opening and closing brackets is a typical example of such a construction. Due to their commonness, the ability to follow such rules is important for language modeling. Currently, recurrent neural networks (RNNs) are extensively used for this task. We investigate whether they are capable of learning the rules of opening and closing brackets by applying them to synthetic Dyck languages that consist of different types of brackets. We provide an analysis of the statistical properties of these languages as a baseline and show strengths and limits of Elman-RNNs, GRUs and LSTMs in experiments on random samples of these languages. In terms of perplexity and prediction accuracy, the RNNs get close to the theoretical baseline in most cases.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>skachkova-trost-klakow:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5426">
    <title>Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information</title>
    <author><first>Mario</first><last>Giulianelli</last></author>
    <author><first>Jack</first><last>Harding</last></author>
    <author><first>Florian</first><last>Mohnert</last></author>
    <author><first>Dieuwke</first><last>Hupkes</last></author>
    <author><first>Willem</first><last>Zuidema</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>240&#8211;248</pages>
    <url>http://www.aclweb.org/anthology/W18-5426</url>
    <abstract>How do neural language models keep track of number agreement between subject and verb? We show that ‘diagnostic classifiers’, trained to predict number from the internal states of the language model, provide a detailed understanding of how, when, and where this information is represented. Moreover, they give us insight into when and where this information is corrupted in cases where the language model ends up making agreement errors. To demonstrate the causal role that the representations we find play, we then use this information to influence the course of the LSTM during the processing of difficult sentences. Results from such an intervention show a large increase in the language model’s accuracy. Together, these results show that diagnostic classifiers give us an unrivalled, detailed look into the representation of linguistic information in neural models, and moreover demonstrate that this knowledge can be used to improve their performance.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>giulianelli-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5427">
    <title>Iterative Recursive Attention Model for Interpretable Sequence Classification</title>
    <author><first>Martin</first><last>Tutek</last></author>
    <author><first>Jan</first><last>Šnajder</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>249&#8211;257</pages>
    <url>http://www.aclweb.org/anthology/W18-5427</url>
    <abstract>Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an iterative recursive attention model, which constructs incremental representations of input data through reusing results of previously computed queries. We train our model on sentiment classification datasets and demonstrate its capacity to identify and combine different aspects of the input in an easily interpretable manner, while obtaining performance close to the state of the art.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>tutek-najder:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5428">
    <title>Interpreting Word-Level Hidden State Behaviour of Character-Level LSTM Language Models</title>
    <author><first>Avery</first><last>Hiebert</last></author>
    <author><first>Cole</first><last>Peterson</last></author>
    <author><first>Alona</first><last>Fyshe</last></author>
    <author><first>Nishant</first><last>Mehta</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>258&#8211;266</pages>
    <url>http://www.aclweb.org/anthology/W18-5428</url>
    <abstract>While Long Short-Term Memory networks (LSTMs) and other forms of recurrent neural network have been successfully applied to language modeling on a character level, the hidden state dynamics of these models can be difficult to interpret. We investigate the hidden states of such a model by using the HDBSCAN clustering algorithm to identify points in the text at which the hidden state is similar. Focusing on whitespace characters prior to the beginning of a word reveals interpretable clusters that offer insight into how the LSTM may combine contextual and character-level information to identify parts of speech. We also introduce a method for deriving word vectors from the hidden state representation in order to investigate the word-level knowledge of the model. These word vectors encode meaningful semantic information even for words that appear only once in the training text.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hiebert-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5429">
    <title>Importance of Self-Attention for Sentiment Analysis</title>
    <author><first>Gaël</first><last>Letarte</last></author>
    <author><first>Fr&#233;d&#233;rik</first><last>Paradis</last></author>
    <author><first>Philippe</first><last>Giguère</last></author>
    <author><first>François</first><last>Laviolette</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>267&#8211;275</pages>
    <url>http://www.aclweb.org/anthology/W18-5429</url>
    <abstract>Despite their superior performance, deep learning models often lack interpretability. In this paper, we explore the modeling of insightful relations between words, in order to understand and enhance predictions. To this effect, we propose the Self-Attention Network (SANet), a flexible and interpretable architecture for text classification. Experiments indicate that the gains obtained by self-attention are task-dependent. For instance, experiments on sentiment analysis tasks showed an improvement of around 2% when using self-attention compared to a baseline without attention, while topic classification showed no gain. The interpretability brought forward by our architecture highlighted the importance of neighboring word interactions to extract sentiment.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>letarte-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5430">
    <title>Firearms and Tigers are Dangerous, Kitchen Knives and Zebras are Not: Testing whether Word Embeddings Can Tell</title>
    <author><first>Pia</first><last>Sommerauer</last></author>
    <author><first>Antske</first><last>Fokkens</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>276&#8211;286</pages>
    <url>http://www.aclweb.org/anthology/W18-5430</url>
    <abstract>This paper presents an approach for investigating the nature of semantic information captured by word embeddings. We propose a method that extends an existing human-elicited semantic property dataset with gold negative examples using crowd judgments. Our experimental approach tests the ability of supervised classifiers to identify semantic features in word embedding vectors and compares this to a feature-identification method based on full vector cosine similarity. The idea behind this method is that properties identified by classifiers, but not through full vector comparison, are captured by embeddings. Properties that cannot be identified by either method are not. Our results provide an initial indication that semantic properties relevant for the way entities interact (e.g. dangerous) are captured, while perceptual information (e.g. colors) is not represented. We conclude that, though preliminary, these results show that our method is suitable for identifying which properties are captured by embeddings.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>sommerauer-fokkens:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5431">
    <title>An Analysis of Encoder Representations in Transformer-Based Machine Translation</title>
    <author><first>Alessandro</first><last>Raganato</last></author>
    <author><first>Jörg</first><last>Tiedemann</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>287&#8211;297</pages>
    <url>http://www.aclweb.org/anthology/W18-5431</url>
    <abstract>The attention mechanism is a successful technique in modern NLP, especially in tasks like machine translation. The recently proposed network architecture of the Transformer is based entirely on attention mechanisms and achieves new state of the art results in neural machine translation, outperforming other sequence-to-sequence models.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>raganato-tiedemann:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5432">
    <title>Evaluating Grammaticality in Seq2seq Models with a Broad Coverage HPSG Grammar: A Case Study on Machine Translation</title>
    <author><first>Johnny</first><last>Wei</last></author>
    <author><first>Khiem</first><last>Pham</last></author>
    <author><first>Brendan</first><last>O'Connor</last></author>
    <author><first>Brian</first><last>Dillon</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>298&#8211;305</pages>
    <url>http://www.aclweb.org/anthology/W18-5432</url>
    <abstract>Sequence to sequence (seq2seq) models are often employed in settings where the target output is natural language. However, the syntactic properties of the language generated from these models are not well understood. We explore whether such output belongs to a formal and realistic grammar, by employing the English Resource Grammar (ERG), a broad coverage, linguistically precise HPSG-based grammar of English. From a French to English parallel corpus, we analyze the parseability and grammatical constructions occurring in output from a seq2seq translation model. Over 93% of the model translations are parseable, suggesting that it learns to generate conforming to a grammar. The model has trouble learning the distribution of rarer syntactic rules, and we pinpoint several constructions that differentiate translations between the references and our model.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wei-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5433">
    <title>Context-Free Transductions with Neural Stacks</title>
    <author><first>Yiding</first><last>Hao</last></author>
    <author><first>William</first><last>Merrill</last></author>
    <author><first>Dana</first><last>Angluin</last></author>
    <author><first>Robert</first><last>Frank</last></author>
    <author><first>Noah</first><last>Amsel</last></author>
    <author><first>Andrew</first><last>Benz</last></author>
    <author><first>Simon</first><last>Mendelsohn</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>306&#8211;315</pages>
    <url>http://www.aclweb.org/anthology/W18-5433</url>
    <abstract>This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex stack-augmented networks often find approximate solutions by using the stack as unstructured memory.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>hao-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5434">
    <title>Learning Explanations from Language Data</title>
    <author><first>David</first><last>Harbecke</last></author>
    <author><first>Robert</first><last>Schwarzenberg</last></author>
    <author><first>Christoph</first><last>Alt</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>316&#8211;318</pages>
    <url>http://www.aclweb.org/anthology/W18-5434</url>
    <abstract>PatternAttribution is a recent method, introduced in the vision domain, that explains classifications of deep neural networks. We demonstrate that it also generates meaningful interpretations in the language domain.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>harbecke-schwarzenberg-alt:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5435">
    <title>How much should you ask? On the question structure in QA systems.</title>
    <author><first>Barbara</first><last>Rychalska</last></author>
    <author><first>Dominika</first><last>Basaj</last></author>
    <author><first>Anna</first><last>Wr&#243;blewska</last></author>
    <author><first>Przemyslaw</first><last>Biecek</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>319&#8211;321</pages>
    <url>http://www.aclweb.org/anthology/W18-5435</url>
    <abstract>Datasets that boosted state-of-the-art solutions for Question Answering (QA) systems prove that it is possible to ask questions in a natural language manner. However, users are still used to query-like systems, where they type in keywords to search for an answer. In this study we validate which parts of questions are essential for obtaining a valid answer. To this end, we take advantage of LIME, a framework that explains predictions by local approximation. We find that grammar and natural language are disregarded by QA systems: a state-of-the-art model can answer properly even if ’asked’ with only a few words with high coefficients calculated with LIME. To our knowledge, this is the first time that a QA model has been explained with LIME.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>rychalska-EtAl:2018:BlackboxNLP1</bibkey>
  </paper>

  <paper id="5436">
    <title>Does it care what you asked? Understanding Importance of Verbs in Deep Learning QA System</title>
    <author><first>Barbara</first><last>Rychalska</last></author>
    <author><first>Dominika</first><last>Basaj</last></author>
    <author><first>Anna</first><last>Wr&#243;blewska</last></author>
    <author><first>Przemyslaw</first><last>Biecek</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>322&#8211;324</pages>
    <url>http://www.aclweb.org/anthology/W18-5436</url>
    <abstract>In this paper we present the results of an investigation of the importance of verbs in a deep learning QA system trained on the SQuAD dataset. We show that main verbs in questions carry little influence on the decisions made by the system: in over 90% of the studied cases, swapping verbs for their antonyms did not change the system’s decision. We track this phenomenon down into the internals of the net, analyzing the mechanism of self-attention and the values contained in the hidden layers of the RNN. Finally, we identify the characteristics of the SQuAD dataset as the source of the problem. Our work relates to the recently popular topic of adversarial examples in NLP, combined with an investigation of deep net structure.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>rychalska-EtAl:2018:BlackboxNLP2</bibkey>
  </paper>

  <paper id="5437">
    <title>Interpretable Textual Neuron Representations for NLP</title>
    <author><first>Nina</first><last>Poerner</last></author>
    <author><first>Benjamin</first><last>Roth</last></author>
    <author><first>Hinrich</first><last>Sch&#252;tze</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>325&#8211;327</pages>
    <url>http://www.aclweb.org/anthology/W18-5437</url>
    <abstract>Input optimization methods, such as Google Deep Dream, create interpretable representations of neurons for computer vision DNNs. We propose and evaluate ways of transferring this technology to NLP. Our results suggest that gradient ascent with a Gumbel softmax layer produces n-gram representations that outperform naive corpus search in terms of target neuron activation. The representations highlight differences in syntax awareness between the language and visual models of the Imaginet architecture.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>poerner-roth-schtze:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5438">
    <title>Language Models Learn POS First</title>
    <author><first>Naomi</first><last>Saphra</last></author>
    <author><first>Adam</first><last>Lopez</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>328&#8211;330</pages>
    <url>http://www.aclweb.org/anthology/W18-5438</url>
    <abstract>A glut of recent research shows that language models capture linguistic structure. Such work answers the question of whether a model represents linguistic structure. But how and when are these structures acquired? Rather than treating the training process itself as a black box, we investigate how representations of linguistic structure are learned over time. In particular, we demonstrate that different aspects of linguistic structure are learned at different rates, with part of speech tagging acquired early and global topic information learned continuously.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>saphra-lopez:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5439">
    <title>Predicting and interpreting embeddings for out of vocabulary words in downstream tasks</title>
    <author><first>Nicolas</first><last>Garneau</last></author>
    <author><first>Jean-Samuel</first><last>Leboeuf</last></author>
    <author><first>Luc</first><last>Lamontagne</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>331&#8211;333</pages>
    <url>http://www.aclweb.org/anthology/W18-5439</url>
    <abstract>We propose a novel way to handle out of vocabulary (OOV) words in downstream natural language processing (NLP) tasks. We implement a network that predicts useful embeddings for OOV words based on their morphology and on the context in which they appear. Our model also incorporates an attention mechanism indicating the focus allocated to the left context words, the right context words or the word’s characters, hence making the prediction more interpretable. The model is a</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>garneau-leboeuf-lamontagne:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5440">
    <title>Probing sentence embeddings for structure-dependent tense</title>
    <author><first>Geoff</first><last>Bacon</last></author>
    <author><first>Terry</first><last>Regier</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>334&#8211;336</pages>
    <url>http://www.aclweb.org/anthology/W18-5440</url>
    <abstract>Learning universal sentence representations which accurately model sentential semantic content is a current goal of natural language processing research. A prominent and successful approach is to train recurrent neural networks (RNNs) to encode sentences into fixed length vectors. Many core linguistic phenomena that one would like to model in universal sentence representations depend on syntactic structure. Despite the fact that RNNs do not have explicit syntactic structural representations, there is some evidence that RNNs can approximate such structure-dependent phenomena under certain conditions, in addition to their widespread success in practical tasks. In this work, we assess RNNs' ability to learn the structure-dependent phenomenon of main clause tense.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>bacon-regier:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5441">
    <title>Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation</title>
    <author><first>Adam</first><last>Poliak</last></author>
    <author><first>Aparajita</first><last>Haldar</last></author>
    <author><first>Rachel</first><last>Rudinger</last></author>
    <author><first>J. Edward</first><last>Hu</last></author>
    <author><first>Ellie</first><last>Pavlick</last></author>
    <author><first>Aaron Steven</first><last>White</last></author>
    <author><first>Benjamin</first><last>Van Durme</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>337&#8211;340</pages>
    <url>http://www.aclweb.org/anthology/W18-5441</url>
    <abstract>We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation encoded by a neural network captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. Our collection of diverse datasets is available at http://www.decomp.net/ and will grow over time as additional resources are recast and added from novel sources.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>poliak-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5442">
    <title>Interpretable Word Embedding Contextualization</title>
    <author><first>Kyoung-Rok</first><last>Jang</last></author>
    <author><first>Sung-Hyon</first><last>Myaeng</last></author>
    <author><first>Sang-Bum</first><last>Kim</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>341&#8211;343</pages>
    <url>http://www.aclweb.org/anthology/W18-5442</url>
    <abstract>In this paper, we propose a method of calibrating a word embedding so that the semantics it conveys become more relevant to the context. Our method is novel in that its output shows clearly which senses originally present in a target word embedding become stronger or weaker. This is made possible by utilizing a previously introduced technique that uses sparse coding to recover the senses that comprise a word embedding.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>jang-myaeng-kim:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5443">
    <title>State Gradients for RNN Memory Analysis</title>
    <author><first>Lyan</first><last>Verwimp</last></author>
    <author><first>Hugo</first><last>Van hamme</last></author>
    <author><first>Vincent</first><last>Renkens</last></author>
    <author><first>Patrick</first><last>Wambacq</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>344&#8211;346</pages>
    <url>http://www.aclweb.org/anthology/W18-5443</url>
    <abstract>We present a framework for analyzing what the state in RNNs remembers from its input embeddings. We compute the gradients of the states with respect to the input embeddings and decompose the gradient matrix with Singular Value Decomposition to analyze which directions in the embedding space are best transferred to the hidden state space, characterized by the largest singular values. We apply our approach to LSTM language models and investigate to what extent and for how long certain classes of words are remembered on average for a certain corpus. Additionally, the extent to which a specific property or relationship is remembered by the RNN can be tracked by comparing a vector characterizing that property with the direction(s) in embedding space that are best preserved in hidden state space.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>verwimp-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5444">
    <title>Extracting Syntactic Trees from Transformer Encoder Self-Attentions</title>
    <author><first>David</first><last>Mareček</last></author>
    <author><first>Rudolf</first><last>Rosa</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>347&#8211;349</pages>
    <url>http://www.aclweb.org/anthology/W18-5444</url>
    <abstract>This is a work in progress about extracting the sentence tree structures from the encoder's self-attention weights, when translating into another language using the Transformer neural network architecture. We visualize the structures and discuss their characteristics with respect to the existing syntactic theories and annotations.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>mareek-rosa:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5445">
    <title>Portable, layer-wise task performance monitoring for NLP models</title>
    <author><first>Tom</first><last>Lippincott</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>350&#8211;352</pages>
    <url>http://www.aclweb.org/anthology/W18-5445</url>
    <abstract>There is a long-standing interest in understanding the internal behavior of neural networks. Deep neural architectures for natural language processing (NLP) are often accompanied by explanations for their effectiveness, from general observations (e.g. RNNs can represent unbounded dependencies in a sequence) to specific arguments about linguistic phenomena (early layers encode lexical information, deeper layers syntactic). The recent ascendancy of DNNs is fueling efforts in the NLP community to explore these claims. Previous work has tended to focus on easily-accessible representations like word or sentence embeddings, with deeper structure requiring more ad hoc methods to extract and examine. In this work, we introduce Vivisect, a toolkit that aims at a general solution for broad and fine-grained monitoring in the major DNN frameworks, with minimal change to research patterns.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>lippincott:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5446">
    <title>GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding</title>
    <author><first>Alex</first><last>Wang</last></author>
    <author><first>Amanpreet</first><last>Singh</last></author>
    <author><first>Julian</first><last>Michael</last></author>
    <author><first>Felix</first><last>Hill</last></author>
    <author><first>Omer</first><last>Levy</last></author>
    <author><first>Samuel</first><last>Bowman</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>353&#8211;355</pages>
    <url>http://www.aclweb.org/anthology/W18-5446</url>
    <abstract>For natural language understanding (NLU) technology to be maximally useful, it must be able to process language in a way that is not exclusively tailored to a specific task, genre, or dataset.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>wang-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5447">
    <title>Explicitly modeling case improves neural dependency parsing</title>
    <author><first>Clara</first><last>Vania</last></author>
    <author><first>Adam</first><last>Lopez</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>356&#8211;358</pages>
    <url>http://www.aclweb.org/anthology/W18-5447</url>
    <abstract>Neural dependency parsing models that compose word representations from characters can presumably exploit morphosyntax when making attachment decisions. How much do they know about morphology? We investigate how well they handle morphological case, which is important for parsing. Our experiments on Czech, German and Russian suggest that adding explicit morphological case&#8211;either oracle or predicted&#8211;improves neural dependency parsing, indicating that the learned representations in these models do not fully encode the morphological knowledge that they need, and can still benefit from targeted forms of explicit linguistic modeling.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>vania-lopez:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5448">
    <title>Language Modeling Teaches You More than Translation Does: Lessons Learned Through Auxiliary Syntactic Task Analysis</title>
    <author><first>Kelly</first><last>Zhang</last></author>
    <author><first>Samuel</first><last>Bowman</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>359&#8211;361</pages>
    <url>http://www.aclweb.org/anthology/W18-5448</url>
    <abstract>Recently, researchers have found that deep LSTMs trained on tasks like machine translation learn substantial syntactic and semantic information about their input sentences, including part-of-speech. These findings begin to shed light on why pretrained representations, like ELMo and CoVe, are so beneficial for neural language understanding models. We still, though, do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn. With this in mind, we compare four objectives&#8211;language modeling, translation, skip-thought, and autoencoding&#8211;on their ability to induce syntactic and part-of-speech information, holding constant the quantity and genre of the training data, as well as the LSTM architecture.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>zhang-bowman:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5449">
    <title>Representation of Word Meaning in the Intermediate Projection Layer of a Neural Language Model</title>
    <author><first>Steven</first><last>Derby</last></author>
    <author><first>Paul</first><last>Miller</last></author>
    <author><first>Brian</first><last>Murphy</last></author>
    <author><first>Barry</first><last>Devereux</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>362&#8211;364</pages>
    <url>http://www.aclweb.org/anthology/W18-5449</url>
    <abstract>In this work, we evaluate latent semantic knowledge present in the LSTM activation patterns produced before and after the word of interest. We evaluate whether these activations predict human similarity ratings, human-derived property knowledge, and brain imaging data. In this way, we test the model's ability to encode important semantic information relevant to word prediction, and its relationship with human cognitive semantic representations.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>derby-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5450">
    <title>Interpretable Structure Induction via Sparse Attention</title>
    <author><first>Ben</first><last>Peters</last></author>
    <author><first>Vlad</first><last>Niculae</last></author>
    <author><first>Andr&#233; F. T.</first><last>Martins</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>365&#8211;367</pages>
    <url>http://www.aclweb.org/anthology/W18-5450</url>
    <abstract>Neural network methods are experiencing wide adoption in NLP, thanks to their empirical performance on many tasks. Modern neural architectures go way beyond simple feedforward and recurrent models: they are complex pipelines that perform soft, differentiable computation instead of discrete logic. The price of such soft computing is the introduction of dense dependencies, which make it hard to disentangle the patterns that trigger a prediction. Our recent work on sparse and structured latent computation presents a promising avenue for enhancing interpretability of such neural pipelines. Through this extended abstract, we aim to discuss and explore the potential and impact of our methods.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>peters-niculae-martins:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5451">
    <title>Debugging Sequence-to-Sequence Models with Seq2Seq-Vis</title>
    <author><first>Hendrik</first><last>Strobelt</last></author>
    <author><first>Sebastian</first><last>Gehrmann</last></author>
    <author><first>Michael</first><last>Behrisch</last></author>
    <author><first>Adam</first><last>Perer</last></author>
    <author><first>Hanspeter</first><last>Pfister</last></author>
    <author><first>Alexander</first><last>Rush</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>368&#8211;370</pages>
    <url>http://www.aclweb.org/anthology/W18-5451</url>
    <abstract>Neural sequence-to-sequence models have proven to be</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>strobelt-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5452">
    <title>Grammar Induction with Neural Language Models: An Unusual Replication</title>
    <author><first>Phu Mon</first><last>Htut</last></author>
    <author><first>Kyunghyun</first><last>Cho</last></author>
    <author><first>Samuel</first><last>Bowman</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>371&#8211;373</pages>
    <url>http://www.aclweb.org/anthology/W18-5452</url>
    <abstract>Grammar induction is the task of learning syntactic structure without the expert-labeled treebanks (Charniak and Carroll, 1992; Klein and Manning, 2002). Recent work on latent tree learning offers a new family of approaches to this problem by inducing syntactic structure using the supervision from a downstream NLP task (Yogatama et al., 2017; Maillard et al., 2017; Choi et al., 2018). In a recent paper published at ICLR, Shen et al. (2018) introduce such a model and report near state-of-the-art results on the target task of language modeling, and the first strong latent tree learning result on constituency parsing. During the analysis of this model, we discover issues that make the original results hard to trust, including tuning and even training on what is effectively the test set. Here, we analyze the model under different configurations to understand what it learns and to identify the conditions under which it succeeds. We find that this model represents the first empirical success for neural network latent tree learning, and that neural language modeling warrants further study as a setting for grammar induction.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>htut-cho-bowman:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5453">
    <title>Does Syntactic Knowledge in Multilingual Language Models Transfer Across Languages?</title>
    <author><first>Prajit</first><last>Dhar</last></author>
    <author><first>Arianna</first><last>Bisazza</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>374&#8211;377</pages>
    <url>http://www.aclweb.org/anthology/W18-5453</url>
    <abstract>Recent work has shown that neural models can be successfully trained on multiple languages simultaneously.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>dhar-bisazza:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5454">
    <title>Exploiting Attention to Reveal Shortcomings in Memory Models</title>
    <author><first>Kaylee</first><last>Burns</last></author>
    <author><first>Aida</first><last>Nematzadeh</last></author>
    <author><first>Erin</first><last>Grant</last></author>
    <author><first>Alison</first><last>Gopnik</last></author>
    <author><first>Tom</first><last>Griffiths</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>378&#8211;380</pages>
    <url>http://www.aclweb.org/anthology/W18-5454</url>
    <abstract>The decision making processes of deep networks are difficult to understand and while their accuracy often improves with increased architectural complexity, so too does their opacity. Practical use of machine learning models, especially for question-answering applications, demands a system that is interpretable. We analyze the attention of a memory network model to reconcile contradictory performance on a challenging question-answering dataset that is inspired by theory-of-mind experiments. We equate success on questions to task classification, which explains not only test-time failures but also how well the model generalizes to new training conditions.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>burns-EtAl:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5455">
    <title>End-to-end Image Captioning Exploits Distributional Similarity in Multimodal Space</title>
    <author><first>Pranava Swaroop</first><last>Madhyastha</last></author>
    <author><first>Josiah</first><last>Wang</last></author>
    <author><first>Lucia</first><last>Specia</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>381&#8211;383</pages>
    <url>http://www.aclweb.org/anthology/W18-5455</url>
    <abstract>We hypothesize that end-to-end neural image captioning systems work seemingly well because they exploit and learn `distributional similarity' in a multimodal feature space, by mapping a test image to similar training images in this space and generating a caption from the same space. To validate our hypothesis, we focus on the `image' side of image captioning, and vary the input image representation but keep the RNN text generation model of a CNN-RNN constant. Our analysis indicates that image captioning models (i) are capable of separating structure from noisy input representations; (ii) experience virtually no significant performance loss when a high dimensional representation is compressed to a lower dimensional space; (iii) cluster images with similar visual and linguistic information together. Our experiments all point to one fact: that our distributional similarity hypothesis holds. We conclude that, regardless of the image representation, image captioning systems seem to match images and generate captions in a learned joint image-text semantic subspace.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>madhyastha-wang-specia:2018:BlackboxNLP</bibkey>
  </paper>

  <paper id="5456">
    <title>Limitations in learning an interpreted language with recurrent models</title>
    <author><first>Denis</first><last>Paperno</last></author>
    <booktitle>Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP</booktitle>
    <month>November</month>
    <year>2018</year>
    <address>Brussels, Belgium</address>
    <publisher>Association for Computational Linguistics</publisher>
    <pages>384&#8211;386</pages>
    <url>http://www.aclweb.org/anthology/W18-5456</url>
    <abstract>In this submission I report work in progress on learning simplified interpreted languages by means of recurrent models. The data is constructed to reflect core properties of natural language as modeled in formal syntax and semantics. Preliminary results suggest that LSTM networks do generalise to compositional interpretation, albeit only in the most favorable learning setting.</abstract>
    <bibtype>inproceedings</bibtype>
    <bibkey>paperno:2018:BlackboxNLP</bibkey>
  </paper>

</volume>

