Yoav Goldberg


2021

pdf bib
Proceedings of the 1st Workshop on Understanding Implicit and Underspecified Language
Michael Roth | Reut Tsarfaty | Yoav Goldberg
Proceedings of the 1st Workshop on Understanding Implicit and Underspecified Language

pdf bib
Data Augmentation for Sign Language Gloss Translation
Amit Moryossef | Kayo Yin | Graham Neubig | Yoav Goldberg
Proceedings of the 1st International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL)

Sign language translation (SLT) is often decomposed into video-to-gloss recognition and gloss to-text translation, where a gloss is a sequence of transcribed spoken-language words in the order in which they are signed. We focus here on gloss-to-text translation, which we treat as a low-resource neural machine translation (NMT) problem. However, unlike traditional low resource NMT, gloss-to-text translation differs because gloss-text pairs often have a higher lexical overlap and lower syntactic overlap than pairs of spoken languages. We exploit this lexical overlap and handle syntactic divergence by proposing two rule-based heuristics that generate pseudo-parallel gloss-text pairs from monolingual spoken language text. By pre-training on this synthetic data, we improve translation from American Sign Language (ASL) to English and German Sign Language (DGS) to German by up to 3.14 and 2.20 BLEU, respectively.

pdf bib
Scalable Evaluation and Improvement of Document Set Expansion via Neural Positive-Unlabeled Learning
Alon Jacovi | Gang Niu | Yoav Goldberg | Masashi Sugiyama
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

We consider the situation in which a user has collected a small set of documents on a cohesive topic, and they want to retrieve additional documents on this topic from a large collection. Information Retrieval (IR) solutions treat the document set as a query, and look for similar documents in the collection. We propose to extend the IR approach by treating the problem as an instance of positive-unlabeled (PU) learning—i.e., learning binary classifiers from only positive (the query documents) and unlabeled (the results of the IR engine) data. Utilizing PU learning for text with big neural networks is a largely unexplored field. We discuss various challenges in applying PU learning to the setting, showing that the standard implementations of state-of-the-art PU solutions fail. We propose solutions for each of the challenges and empirically validate them with ablation tests. We demonstrate the effectiveness of the new method using a series of experiments of retrieving PubMed abstracts adhering to fine-grained topics, showing improvements over the common IR solution and other baselines.

pdf bib
Bootstrapping Relation Extractors using Syntactic Search by Examples
Matan Eyal | Asaf Amrami | Hillel Taub-Tabib | Yoav Goldberg
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

The advent of neural-networks in NLP brought with it substantial improvements in supervised relation extraction. However, obtaining a sufficient quantity of training data remains a key challenge. In this work we propose a process for bootstrapping training datasets which can be performed quickly by non-NLP-experts. We take advantage of search engines over syntactic-graphs (Such as Shlain et al. (2020)) which expose a friendly by-example syntax. We use these to obtain positive examples by searching for sentences that are syntactically similar to user input examples. We apply this technique to relations from TACRED and DocRED and show that the resulting models are competitive with models trained on manually annotated data and on data obtained from distant supervision. The models also outperform models trained using NLG data augmentation techniques. Extending the search-based approach with the NLG method further improves the results.

pdf bib
Hebrew Psychological Lexicons
Natalie Shapira | Dana Atzil-Slonim | Daniel Juravski | Moran Baruch | Dana Stolowicz-Melman | Adar Paz | Tal Alfi-Yogev | Roy Azoulay | Adi Singer | Maayan Revivo | Chen Dahbash | Limor Dayan | Tamar Naim | Lidar Gez | Boaz Yanai | Adva Maman | Adam Nadaf | Elinor Sarfati | Amna Baloum | Tal Naor | Ephraim Mosenkis | Badreya Sarsour | Jany Gelfand Morgenshteyn | Yarden Elias | Liat Braun | Moria Rubin | Matan Kenigsbuch | Noa Bergwerk | Noam Yosef | Sivan Peled | Coral Avigdor | Rahav Obercyger | Rachel Mann | Tomer Alper | Inbal Beka | Ori Shapira | Yoav Goldberg
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

We introduce a large set of Hebrew lexicons pertaining to psychological aspects. These lexicons are useful for various psychology applications such as detecting emotional state, well being, relationship quality in conversation, identifying topics (e.g., family, work) and many more. We discuss the challenges in creating and validating lexicons in a new language, and highlight our methodological considerations in the data-driven lexicon construction process. Most of the lexicons are publicly available, which will facilitate further research on Hebrew clinical psychology text analysis. The lexicons were developed through data driven means, and verified by domain experts, clinical psychologists and psychology students, in a process of reconciliation with three judges. Development and verification relied on a dataset of a total of 872 psychotherapy session transcripts. We describe the construction process of each collection, the final resource and initial results of research studies employing this resource.

pdf bib
Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?
Eric Lehman | Sarthak Jain | Karl Pichotta | Yoav Goldberg | Byron Wallace
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Large Transformers pretrained over clinical notes from Electronic Health Records (EHR) have afforded substantial gains in performance on predictive clinical tasks. The cost of training such models (and the necessity of data access to do so) coupled with their utility motivates parameter sharing, i.e., the release of pretrained models such as ClinicalBERT. While most efforts have used deidentified EHR, many researchers have access to large sets of sensitive, non-deidentified EHR with which they might train a BERT model (or similar). Would it be safe to release the weights of such a model if they did? In this work, we design a battery of approaches intended to recover Personal Health Information (PHI) from a trained BERT. Specifically, we attempt to recover patient names and conditions with which they are associated. We find that simple probing methods are not able to meaningfully extract sensitive information from BERT trained over the MIMIC-III corpus of EHR. However, more sophisticated “attacks” may succeed in doing so: To facilitate such research, we make our experimental setup and baseline probing models available at https://github.com/elehman16/exposing_patient_data_release.

pdf bib
Ab Antiquo: Neural Proto-language Reconstruction
Carlo Meloni | Shauli Ravfogel | Yoav Goldberg
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Historical linguists have identified regularities in the process of historic sound change. The comparative method utilizes those regularities to reconstruct proto-words based on observed forms in daughter languages. Can this process be efficiently automated? We address the task of proto-word reconstruction, in which the model is exposed to cognates in contemporary daughter languages, and has to predict the proto word in the ancestor language. We provide a novel dataset for this task, encompassing over 8,000 comparative entries, and show that neural sequence models outperform conventional methods applied to this task so far. Error analysis reveals a variability in the ability of neural model to capture different phonological changes, correlating with the complexity of the changes. Analysis of learned embeddings reveals the models learn phonologically meaningful generalizations, corresponding to well-attested phonological shifts documented by historical linguistics.

pdf bib
Including Signed Languages in Natural Language Processing
Kayo Yin | Amit Moryossef | Julie Hochgesang | Yoav Goldberg | Malihe Alikhani
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Signed languages are the primary means of communication for many deaf and hard of hearing individuals. Since signed languages exhibit all the fundamental linguistic properties of natural language, we believe that tools and theories of Natural Language Processing (NLP) are crucial towards its modeling. However, existing research in Sign Language Processing (SLP) seldom attempt to explore and leverage the linguistic organization of signed languages. This position paper calls on the NLP community to include signed languages as a research area with high social and scientific impact. We first discuss the linguistic properties of signed languages to consider during their modeling. Then, we review the limitations of current SLP models and identify the open challenges to extend NLP to signed languages. Finally, we urge (1) the adoption of an efficient tokenization method; (2) the development of linguistically-informed models; (3) the collection of real-world signed language data; (4) the inclusion of local signed language communities as an active and leading voice in the direction of research.

pdf bib
Neural Extractive Search
Shauli Ravfogel | Hillel Taub-Tabib | Yoav Goldberg
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

Domain experts often need to extract structured information from large corpora. We advocate for a search paradigm called “extractive search”, in which a search query is enriched with capture-slots, to allow for such rapid extraction. Such an extractive search system can be built around syntactic structures, resulting in high-precision, low-recall results. We show how the recall can be improved using neural retrieval and alignment. The goals of this paper are to concisely introduce the extractive-search paradigm; and to demonstrate a prototype neural retrieval system for extractive search and its benefits and potential. Our prototype is available at https://spike.neural-sim.apps.allenai.org/ and a video demonstration is available at https://vimeo.com/559586687.

2020

pdf bib
Facts2Story: Controlling Text Generation by Key Facts
Eyal Orbach | Yoav Goldberg
Proceedings of the 28th International Conference on Computational Linguistics

Recent advancements in self-attention neural network architectures have raised the bar for open-ended text generation. Yet, while current methods are capable of producing a coherent text which is several hundred words long, attaining control over the content that is being generated—as well as evaluating it—are still open questions. We propose a controlled generation task which is based on expanding a sequence of facts, expressed in natural language, into a longer narrative. We introduce human-based evaluation metrics for this task, as well as a method for deriving a large training dataset. We evaluate three methods on this task, based on fine-tuning pre-trained models. We show that while auto-regressive, unidirectional Language Models such as GPT2 produce better fluency, they struggle to adhere to the requested facts. We propose a plan-and-cloze model (using fine-tuned XLNet) which produces competitive fluency while adhering to the requested content.

pdf bib
Interactive Extractive Search over Biomedical Corpora
Hillel Taub Tabib | Micah Shlain | Shoval Sadde | Dan Lahav | Matan Eyal | Yaara Cohen | Yoav Goldberg
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing

We present a system that allows life-science researchers to search a linguistically annotated corpus of scientific texts using patterns over dependency graphs, as well as using patterns over token sequences and a powerful variant of boolean keyword queries. In contrast to previous attempts to dependency-based search, we introduce a light-weight query language that does not require the user to know the details of the underlying linguistic representations, and instead to query the corpus by providing an example sentence coupled with simple markup. Search is performed at an interactive speed due to efficient linguistic graph-indexing and retrieval engine. This allows for rapid exploration, development and refinement of user queries. We demonstrate the system using example workflows over two corpora: the PubMed corpus including 14,446,243 PubMed abstracts and the CORD-19 dataset, a collection of over 45,000 research papers focused on COVID-19 research. The system is publicly available at https://allenai.github.io/spike

pdf bib
Break It Down: A Question Understanding Benchmark
Tomer Wolfson | Mor Geva | Ankit Gupta | Matt Gardner | Yoav Goldberg | Daniel Deutch | Jonathan Berant
Transactions of the Association for Computational Linguistics, Volume 8

Understanding natural language questions entails the ability to break down a question into the requisite steps for computing its answer. In this work, we introduce a Question Decomposition Meaning Representation (QDMR) for questions. QDMR constitutes the ordered list of steps, expressed through natural language, that are necessary for answering a question. We develop a crowdsourcing pipeline, showing that quality QDMRs can be annotated at scale, and release the Break dataset, containing over 83K pairs of questions and their QDMRs. We demonstrate the utility of QDMR by showing that (a) it can be used to improve open-domain question answering on the HotpotQA dataset, (b) it can be deterministically converted to a pseudo-SQL formal language, which can alleviate annotation in semantic parsing applications. Last, we use Break to train a sequence-to-sequence model with copying that parses questions into QDMR structures, and show that it substantially outperforms several natural baselines.

pdf bib
oLMpics-On What Language Model Pre-training Captures
Alon Talmor | Yanai Elazar | Yoav Goldberg | Jonathan Berant
Transactions of the Association for Computational Linguistics, Volume 8

Recent success of pre-trained language models (LMs) has spurred widespread interest in the language capabilities that they possess. However, efforts to understand whether LM representations are useful for symbolic reasoning tasks have been limited and scattered. In this work, we propose eight reasoning tasks, which conceptually require operations such as comparison, conjunction, and composition. A fundamental challenge is to understand whether the performance of a LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data. To address this, we propose an evaluation protocol that includes both zero-shot evaluation (no fine-tuning), as well as comparing the learning curve of a fine-tuned LM to the learning curve of multiple controls, which paints a rich picture of the LM capabilities. Our main findings are that: (a) different LMs exhibit qualitatively different reasoning abilities, e.g., RoBERTa succeeds in reasoning tasks where BERT fails completely; (b) LMs do not reason in an abstract manner and are context-dependent, e.g., while RoBERTa can compare ages, it can do so only when the ages are in the typical range of human ages; (c) On half of our reasoning tasks all models fail completely. Our findings and infrastructure can help future work on designing new datasets, models, and objective functions for pre-training.

pdf bib
A Formal Hierarchy of RNN Architectures
William Merrill | Gail Weiss | Yoav Goldberg | Roy Schwartz | Noah A. Smith | Eran Yahav
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We develop a formal hierarchy of the expressive capacity of RNN architectures. The hierarchy is based on two formal properties: space complexity, which measures the RNN’s memory, and rational recurrence, defined as whether the recurrent update can be described by a weighted finite-state machine. We place several RNN variants within this hierarchy. For example, we prove the LSTM is not rational, which formally separates it from the related QRNN (Bradbury et al., 2016). We also show how these models’ expressive capacity is expanded by stacking multiple layers or composing them with different pooling functions. Our results build on the theory of “saturated” RNNs (Merrill, 2019). While formally extending these findings to unsaturated RNNs is left to future work, we hypothesize that the practical learnable capacity of unsaturated RNNs obeys a similar hierarchy. We provide empirical results to support this conjecture. Experimental findings from training unsaturated networks on formal languages support this conjecture.

pdf bib
Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora
Hila Gonen | Ganesh Jawahar | Djamé Seddah | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The problem of comparing two bodies of text and searching for words that differ in their usage between them arises often in digital humanities and computational social science. This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large. However, these methods often require extensive filtering of the vocabulary to perform well, and - as we show in this work - result in unstable, and hence less reliable, results. We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word. The method is simple, interpretable and stable. We demonstrate its effectiveness in 9 different setups, considering different corpus splitting criteria (age, gender and profession of tweet authors, time of tweet) and different languages (English, French and Hebrew).

pdf bib
Towards Faithfully Interpretable NLP Systems: How Should We Define and Evaluate Faithfulness?
Alon Jacovi | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

With the growing popularity of deep-learning based NLP models, comes a need for interpretable systems. But what is interpretability, and what constitutes a high-quality interpretation? In this opinion piece we reflect on the current state of interpretability evaluation research. We call for more clearly differentiating between different desired criteria an interpretation should satisfy, and focus on the faithfulness criteria. We survey the literature with respect to faithfulness evaluation, and arrange the current approaches around three assumptions, providing an explicit form to how faithfulness is “defined” by the community. We provide concrete guidelines on how evaluation of interpretation methods should and should not be conducted. Finally, we claim that the current binary definition for faithfulness sets a potentially unrealistic bar for being considered faithful. We call for discarding the binary notion of faithfulness in favor of a more graded one, which we believe will be of greater practical utility.

pdf bib
A Two-Stage Masked LM Method for Term Set Expansion
Guy Kushilevitz | Shaul Markovitch | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We tackle the task of Term Set Expansion (TSE): given a small seed set of example terms from a semantic class, finding more members of that class. The task is of great practical utility, and also of theoretical utility as it requires generalization from few examples. Previous approaches to the TSE task can be characterized as either distributional or pattern-based. We harness the power of neural masked language models (MLM) and propose a novel TSE algorithm, which combines the pattern-based and distributional approaches. Due to the small size of the seed set, fine-tuning methods are not effective, calling for more creative use of the MLM. The gist of the idea is to use the MLM to first mine for informative patterns with respect to the seed set, and then to obtain more members of the seed class by generalizing these patterns. Our method outperforms state-of-the-art TSE algorithms. Implementation is available at: https://github.com/ guykush/TermSetExpansion-MPB/

pdf bib
Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection
Shauli Ravfogel | Yanai Elazar | Hila Gonen | Michael Twiton | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The ability to control for the kinds of information encoded in neural representation has a variety of use cases, especially in light of the challenge of interpreting these models. We present Iterative Null-space Projection (INLP), a novel method for removing information from neural representations. Our method is based on repeated training of linear classifiers that predict a certain property we aim to remove, followed by projection of the representations on their null-space. By doing so, the classifiers become oblivious to that target property, making it hard to linearly separate the data according to it. While applicable for multiple uses, we evaluate our method on bias and fairness use-cases, and show that our method is able to mitigate bias in word embeddings, as well as to increase fairness in a setting of multi-class classification.

pdf bib
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The notion of “in-domain data” in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality. In addition, domain labels are many times unavailable, making it challenging to build domain-specific systems. We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision – suggesting a simple data-driven definition of domains in textual data. We harness this property and propose domain data selection methods based on such models, which require only a small set of in-domain monolingual data. We evaluate our data selection methods for neural machine translation across five diverse domains, where they outperform an established approach as measured by both BLEU and precision and recall with respect to an oracle selection.

pdf bib
Syntactic Search by Example
Micah Shlain | Hillel Taub-Tabib | Shoval Sadde | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present a system that allows a user to search a large linguistically annotated corpus using syntactic patterns over dependency graphs. In contrast to previous attempts to this effect, we introduce a light-weight query language that does not require the user to know the details of the underlying syntactic representations, and instead to query the corpus by providing an example sentence coupled with simple markup. Search is performed at an interactive speed due to efficient linguistic graph-indexing and retrieval engine. This allows for rapid exploration, development and refinement of syntax-based queries. We demonstrate the system using queries over two corpora: the English wikipedia, and a collection of English pubmed abstracts. A demo of the wikipedia system is available at https://allenai.github.io/spike/ .

pdf bib
pyBART: Evidence-based Syntactic Transformations for IE
Aryeh Tiktinsky | Yoav Goldberg | Reut Tsarfaty
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Syntactic dependencies can be predicted with high accuracy, and are useful for both machine-learned and pattern-based information extraction tasks. However, their utility can be improved. These syntactic dependencies are designed to accurately reflect syntactic relations, and they do not make semantic relations explicit. Therefore, these representations lack many explicit connections between content words, that would be useful for downstream applications. Proposals like English Enhanced UD improve the situation by extending universal dependency trees with additional explicit arcs. However, they are not available to Python users, and are also limited in coverage. We introduce a broad-coverage, data-driven and linguistically sound set of transformations, that makes event-structure and many lexical relations explicit. We present pyBART, an easy-to-use open-source Python library for converting English UD trees either to Enhanced UD graphs or to our representation. The library can work as a standalone package or be integrated within a spaCy NLP pipeline. When evaluated in a pattern-based relation extraction scenario, our representation results in higher extraction scores than Enhanced UD, while requiring fewer patterns.

pdf bib
Nakdan: Professional Hebrew Diacritizer
Avi Shmidman | Shaltiel Shmidman | Moshe Koppel | Yoav Goldberg
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

We present a system for automatic diacritization of Hebrew Text. The system combines modern neural models with carefully curated declarative linguistic knowledge and comprehensive manually constructed tables and dictionaries. Besides providing state of the art diacritization accuracy, the system also supports an interface for manual editing and correction of the automatic output, and has several features which make it particularly useful for preparation of scientific editions of historical Hebrew texts. The system supports Modern Hebrew, Rabbinic Hebrew and Poetic Hebrew. The system is freely accessible for all use at http://nakdanpro.dicta.org.il

pdf bib
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shachar Rosenman | Alon Jacovi | Yoav Goldberg
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

The process of collecting and annotating training data may introduce distribution artifacts which may limit the ability of models to learn correct generalization behavior. We identify failure modes of SOTA relation extraction (RE) models trained on TACRED, which we attribute to limitations in the data annotation process. We collect and annotate a challenge-set we call Challenging RE (CRE), based on naturally occurring corpus examples, to benchmark this behavior. Our experiments with four state-of-the-art RE models show that they have indeed adopted shallow heuristics that do not generalize to the challenge-set data. Further, we find that alternative question answering modeling performs significantly better than the SOTA models on the challenge-set, despite worse overall TACRED performance. By adding some of the challenge data as training examples, the performance of the model improves. Finally, we provide concrete suggestion on how to improve RE data collection to alleviate this behavior.

pdf bib
The Extraordinary Failure of Complement Coercion Crowdsourcing
Yanai Elazar | Victoria Basmov | Shauli Ravfogel | Yoav Goldberg | Reut Tsarfaty
Proceedings of the First Workshop on Insights from Negative Results in NLP

Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years. In this work, we follow known methodologies of collecting labeled data for the complement coercion phenomenon. These are constructions with an implied action — e.g., “I started a new book I bought last week”, where the implied action is reading. We aim to collect annotated data for this phenomenon by reducing it to either of two known tasks: Explicit Completion and Natural Language Inference. However, in both cases, crowdsourcing resulted in low agreement scores, even though we followed the same methodologies as in previous work. Why does the same process fail to yield high agreement scores? We specify our modeling schemes, highlight the differences with previous work and provide some insights about the task and possible explanations for the failure. We conclude that specific phenomena require tailored solutions, not only in specialized algorithms, but also in data collection methods.

pdf bib
Compressing Pre-trained Language Models by Matrix Decomposition
Matan Ben Noach | Yoav Goldberg
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Large pre-trained language models reach state-of-the-art results on many different NLP tasks when fine-tuned individually; They also come with a significant memory and computational requirements, calling for methods to reduce model sizes (green AI). We propose a two-stage model-compression method to reduce a model’s inference time cost. We first decompose the matrices in the model into smaller matrices and then perform feature distillation on the internal representation to recover from the decomposition. This approach has the benefit of reducing the number of parameters while preserving much of the information within the model. We experimented on BERT-base model with the GLUE benchmark dataset and show that we can reduce the number of parameters by a factor of 0.4x, and increase inference speed by a factor of 1.45x, while maintaining a minimal loss in metric performance.

pdf bib
It’s not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT
Hila Gonen | Shauli Ravfogel | Yanai Elazar | Yoav Goldberg
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations, that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning. The results suggest that most of this information is encoded in a non-linear way, while some of it can also be recovered with purely linear tools. As part of our analysis, we test the hypothesis that mBERT learns representations which contain both a language-encoding component and an abstract, cross-lingual component, and explicitly identify an empirical language-identity subspace within mBERT representations.

pdf bib
Unsupervised Distillation of Syntactic Information from Contextualized Word Representations
Shauli Ravfogel | Yanai Elazar | Jacob Goldberger | Yoav Goldberg
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Contextualized word representations, such as ELMo and BERT, were shown to perform well on various semantic and syntactic task. In this work, we tackle the task of unsupervised disentanglement between semantics and structure in neural language representations: we aim to learn a transformation of the contextualized vectors, that discards the lexical semantics, but keeps the structural information. To this end, we automatically generate groups of sentences which are structurally similar but semantically different, and use metric-learning approach to learn a transformation that emphasizes the structural component that is encoded in the vectors. We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics. Finally, we demonstrate the utility of our distilled representations by showing that they outperform the original contextualized representations in a few-shot parsing setting.

2019

pdf bib
Transfer Learning Between Related Tasks Using Expected Label Proportions
Matan Ben Noach | Yoav Goldberg
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Deep learning systems thrive on abundance of labeled training data but such data is not always available, calling for alternative methods of supervision. One such method is expectation regularization (XR) (Mann and McCallum, 2007), where models are trained based on expected label proportions. We propose a novel application of the XR framework for transfer learning between related tasks, where knowing the labels of task A provides an estimation of the label proportion of task B. We then use a model trained for A to label a large corpus, and use this corpus with an XR loss to train a model for task B. To make the XR framework applicable to large-scale deep-learning setups, we propose a stochastic batched approximation procedure. We demonstrate the approach on the task of Aspect-based Sentiment classification, where we effectively use a sentence-level sentiment predictor to train accurate aspect-based predictor. The method improves upon fully supervised neural system trained on aspect-level data, and is also cumulative with LM-based pretraining, as we demonstrate by improving a BERT-based Aspect-based Sentiment model.

pdf bib
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva | Yoav Goldberg | Jonathan Berant
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Crowdsourcing has been the prevalent paradigm for creating natural language understanding datasets in recent years. A common crowdsourcing practice is to recruit a small number of high-quality workers, and have them massively generate examples. Having only a few workers generate the majority of examples raises concerns about data diversity, especially when workers freely generate sentences. In this paper, we perform a series of experiments showing these concerns are evident in three recent NLP datasets. We show that model performance improves when training with annotator identifiers as features, and that models are able to recognize the most productive annotators. Moreover, we show that often models do not generalize well to examples from annotators that did not contribute to the training set. Our findings suggest that annotator bias should be monitored during dataset creation, and that test set annotators should be disjoint from training set annotators.

pdf bib
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training
Hila Gonen | Yoav Goldberg
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We focus on the problem of language modeling for code-switched language, in the context of automatic speech recognition (ASR). Language modeling for code-switched language is challenging for (at least) three reasons: (1) lack of available large-scale code-switched data for training; (2) lack of a replicable evaluation setup that is ASR directed yet isolates language modeling performance from the other intricacies of the ASR system; and (3) the reliance on generative modeling. We tackle these three issues: we propose an ASR-motivated evaluation setup which is decoupled from an ASR system and the choice of vocabulary, and provide an evaluation dataset for English-Spanish code-switching. This setup lends itself to a discriminative training approach, which we demonstrate to work better than generative language modeling. Finally, we explore a variety of training protocols and verify the effectiveness of training with large amounts of monolingual data followed by fine-tuning with small amounts of code-switched data, for both the generative and discriminative cases.

pdf bib
Aligning Vector-spaces with Noisy Supervised Lexicon
Noa Yehezkel Lubin | Jacob Goldberger | Yoav Goldberg
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The problem of learning to translate between two vector spaces given a set of aligned points arises in several application areas of NLP. Current solutions assume that the lexicon which defines the alignment pairs is noise-free. We consider the case where the set of aligned points is allowed to contain an amount of noise, in the form of incorrect lexicon pairs and show that this arises in practice by analyzing the edited dictionaries after the cleaning process. We demonstrate that such noise substantially degrades the accuracy of the learned translation when using current methods. We propose a model that accounts for noisy pairs. This is achieved by introducing a generative model with a compatible iterative EM algorithm. The algorithm jointly learns the noise level in the lexicon, finds the set of noisy pairs, and learns the mapping between the spaces. We demonstrate the effectiveness of our proposed algorithm on two alignment problems: bilingual word embedding translation, and mapping between diachronic embedding spaces for recovering the semantic shifts of words across time periods.

pdf bib
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
Hila Gonen | Yoav Goldberg
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society. This phenomenon is pervasive and consistent across different word embedding models, causing serious concern. Several recent works tackle this problem, and propose methods for significantly reducing this gender bias in word embeddings, demonstrating convincing results. However, we argue that this removal is superficial. While the bias is indeed substantially reduced according to the provided bias definition, the actual effect is mostly hiding the bias, not removing it. The gender bias information is still reflected in the distances between “gender-neutralized” words in the debiased embeddings, and can be recovered from them. We present a series of experiments to support this claim, for two debiasing methods. We conclude that existing bias removal techniques are insufficient, and should not be trusted for providing gender-neutral modeling.

pdf bib
Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation
Amit Moryossef | Yoav Goldberg | Ido Dagan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Data-to-text generation can be conceptually divided into two parts: ordering and structuring the information (planning), and generating fluent language describing the information (realization). Modern neural generation systems conflate these two steps into a single end-to-end differentiable system. We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization. For training a plan-to-text generator, we present a method for matching reference texts to their corresponding text plans. For inference time, we describe a method for selecting high-quality text plans for new inputs. We implement and evaluate our approach on the WebNLG benchmark. Our results demonstrate that decoupling text planning from neural realization indeed improves the system’s reliability and adequacy while maintaining fluent output. We observe improvements both in BLEU scores and in manual evaluations. Another benefit of our approach is the ability to output diverse realizations of the same input, paving the way to explicit control over the generated text structure.

pdf bib
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Shauli Ravfogel | Yoav Goldberg | Tal Linzen
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

How do typological properties such as word order and morphological case marking affect the ability of neural sequence models to acquire the syntax of a language? Cross-linguistic comparisons of RNNs’ syntactic performance (e.g., on subject-verb agreement prediction) are complicated by the fact that any two languages differ in multiple typological properties, as well as by differences in training corpus. We propose a paradigm that addresses these issues: we create synthetic versions of English, which differ from English in one or more typological parameters, and generate corpora for those languages based on a parsed English corpus. We report a series of experiments in which RNNs were trained to predict agreement features for verbs in each of those synthetic languages. Among other findings, (1) performance was higher in subject-verb-object order (as in English) than in subject-object-verb order (as in Japanese), suggesting that RNNs have a recency bias; (2) predicting agreement with both subject and object (polypersonal agreement) improves over predicting each separately, suggesting that underlying syntactic knowledge transfers across the two tasks; and (3) overt morphological case makes agreement prediction significantly easier, regardless of word order.

pdf bib
How Does Grammatical Gender Affect Noun Representations in Gender-Marking Languages?
Hila Gonen | Yova Kementchedjhieva | Yoav Goldberg
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Many natural languages assign grammatical gender also to inanimate nouns in the language. In such languages, words that relate to the gender-marked nouns are inflected to agree with the noun’s gender. We show that this affects the word representations of inanimate nouns, resulting in nouns with the same gender being closer to each other than nouns with different gender. While “embedding debiasing” methods fail to remove the effect, we demonstrate that a careful application of methods that neutralize grammatical gender signals from the words’ context when training word embeddings is effective in removing it. Fixing the grammatical gender bias yields a positive effect on the quality of the resulting word embeddings, both in monolingual and cross-lingual settings. We note that successfully removing gender signals, while achievable, is not trivial to do and that a language-specific morphological analyzer, together with careful usage of it, are essential for achieving good results.

pdf bib
Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP
Anna Rogers | Aleksandr Drozd | Anna Rumshisky | Yoav Goldberg
Proceedings of the 3rd Workshop on Evaluating Vector Space Representations for NLP

bib
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
Hila Gonen | Yoav Goldberg
Proceedings of the 2019 Workshop on Widening NLP

Word embeddings are widely used in NLP for a vast range of tasks. It was shown that word embeddings derived from text corpora reflect gender biases in society, causing serious concern. Several recent works tackle this problem, and propose methods for significantly reducing this gender bias in word embeddings, demonstrating convincing results. However, we argue that this removal is superficial. While the bias is indeed substantially reduced according to the provided bias definition, the actual effect is mostly hiding the bias, not removing it. The gender bias information is still reflected in the distances between “gender-neutralized” words in the debiased embeddings, and can be recovered from them. We present a series of experiments to support this claim, for two debiasing methods. We conclude that existing bias removal techniques are insufficient, and should not be trusted for providing gender-neutral modeling.

bib
How does Grammatical Gender Affect Noun Representations in Gender-Marking Languages?
Hila Gonen | Yova Kementchedjhieva | Yoav Goldberg
Proceedings of the 2019 Workshop on Widening NLP

Many natural languages assign grammatical gender also to inanimate nouns in the language. In such languages, words that relate to the gender-marked nouns are inflected to agree with the noun’s gender. We show that this affects the word representations of inanimate nouns, resulting in nouns with the same gender being closer to each other than nouns with different gender. While “embedding debiasing” methods fail to remove the effect, we demonstrate that a careful application of methods that neutralize grammatical gender signals from the words’ context when training word embeddings is effective in removing it. Fixing the grammatical gender bias results in a positive effect on the quality of the resulting word embeddings, both in monolingual and cross lingual settings. We note that successfully removing gender signals, while achievable, is not trivial to do and that a language-specific morphological analyzer, together with careful usage of it, are essential for achieving good results.

pdf bib
Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection
Amit Moryossef | Roee Aharoni | Yoav Goldberg
Proceedings of the First Workshop on Gender Bias in Natural Language Processing

When translating from a language that does not morphologically mark information such as gender and number into a language that does, translation systems must “guess” this missing information, often leading to incorrect translations in the given context. We propose a black-box approach for injecting the missing information to a pre-trained neural machine translation system, allowing to control the morphological variations in the generated translations without changing the underlying model or training data. We evaluate our method on an English to Hebrew translation task, and show that it is effective in injecting the gender and number information and that supplying the correct information improves the translation accuracy in up to 2.3 BLEU on a female-speaker test set for a state-of-the-art online black-box system. Finally, we perform a fine-grained syntactic analysis of the generated translations that shows the effectiveness of our method.

pdf bib
Improving Quality and Efficiency in Plan-based Neural Data-to-text Generation
Amit Moryossef | Yoav Goldberg | Ido Dagan
Proceedings of the 12th International Conference on Natural Language Generation

We follow the step-by-step approach to neural data-to-text generation proposed by Moryossef et al (2019), in which the generation process is divided into a text planning stage followed by a plan realization stage. We suggest four extensions to that framework: (1) we introduce a trainable neural planning component that can generate effective plans several orders of magnitude faster than the original planner; (2) we incorporate typing hints that improve the model’s ability to deal with unseen relations and entities; (3) we introduce a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts; (4) we incorporate a simple but effective referring expression generation module. These extensions result in a generation process that is faster, more fluent, and more accurate.

pdf bib
Where’s My Head? Definition, Data Set, and Models for Numeric Fused-Head Identification and Resolution
Yanai Elazar | Yoav Goldberg
Transactions of the Association for Computational Linguistics, Volume 7

We provide the first computational treatment of fused-heads constructions (FHs), focusing on the numeric fused-heads (NFHs). FHs constructions are noun phrases in which the head noun is missing and is said to be “fused” with its dependent modifier. This missing information is implicit and is important for sentence understanding. The missing references are easily filled in by humans but pose a challenge for computational models. We formulate the handling of FHs as a two stages process: Identification of the FH construction and resolution of the missing head. We explore the NFH phenomena in large corpora of English text and create (1) a data set and a highly accurate method for NFH identification; (2) a 10k examples (1 M tokens) crowd-sourced data set of NFH resolution; and (3) a neural baseline for the NFH resolution task. We release our code and data set, to foster further research into this challenging problem.

2018

pdf bib
SetExpander: End-to-end Term Set Expansion Based on Multi-Context Term Embeddings
Jonathan Mamou | Oren Pereg | Moshe Wasserblat | Ido Dagan | Yoav Goldberg | Alon Eirew | Yael Green | Shira Guskin | Peter Izsak | Daniel Korat
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class. SetExpander implements an iterative end-to end workflow for term set expansion. It enables users to easily select a seed set of terms, expand it, view the expanded set, validate it, re-expand the validated set and store it, thus simplifying the extraction of domain-specific fine-grained semantic classes. SetExpander has been used for solving real-life use cases including integration in an automated recruitment system and an issues and defects resolution system. A video demo of SetExpander is available at https://drive.google.com/open?id=1e545bB87Autsch36DjnJHmq3HWfSd1Rv .

pdf bib
Breaking NLI Systems with Sentences that Require Simple Lexical Inferences
Max Glockner | Vered Shwartz | Yoav Goldberg
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We create a new NLI test set that shows the deficiency of state-of-the-art models in inferences that require lexical and world knowledge. The new examples are simpler than the SNLI test set, containing sentences that differ by at most one word from sentences in the training set. Yet, the performance on the new test set is substantially worse across systems trained on SNLI, demonstrating that these systems are limited in their generalization ability, failing to capture many simple inferences.

pdf bib
Split and Rephrase: Better Evaluation and Stronger Baselines
Roee Aharoni | Yoav Goldberg
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Splitting and rephrasing a complex sentence into several shorter sentences that convey the same meaning is a challenging problem in NLP. We show that while vanilla seq2seq models can reach high scores on the proposed benchmark (Narayan et al., 2017), they suffer from memorization of the training set which contains more than 89% of the unique simple sentences from the validation and test sets. To aid this, we present a new train-development-test data split and neural models augmented with a copy-mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.

pdf bib
On the Practical Computational Power of Finite Precision RNNs for Language Recognition
Gail Weiss | Yoav Goldberg | Eran Yahav
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM and the Elman-RNN with ReLU activation are strictly stronger than the RNN with a squashing activation and the GRU. This is achieved because LSTMs and ReLU-RNNs can easily implement counting behavior. We show empirically that the LSTM does indeed learn to effectively use the counting mechanism.

pdf bib
Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP
Georgiana Dinu | Miguel Ballesteros | Avirup Sil | Sam Bowman | Wael Hamza | Anders Sogaard | Tahira Naseem | Yoav Goldberg
Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP

pdf bib
Understanding Convolutional Neural Networks for Text Classification
Alon Jacovi | Oren Sar Shalom | Yoav Goldberg
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

We present an analysis into the inner workings of Convolutional Neural Networks (CNNs) for processing text. CNNs used for computer vision can be interpreted by projecting filters into image space, but for discrete sequence inputs CNNs remain a mystery. We aim to understand the method by which the networks process and classify text. We examine common hypotheses to this problem: that filters, accompanied by global max-pooling, serve as ngram detectors. We show that filters may capture several different semantic classes of ngrams by using different activation patterns, and that global max-pooling induces behavior which separates important ngrams from the rest. Finally, we show practical use cases derived from our findings in the form of model interpretability (explaining a trained model by deriving a concrete identity for each filter, bridging the gap between visualization tools in vision tasks and NLP) and prediction interpretability (explaining predictions).

pdf bib
Can LSTM Learn to Capture Agreement? The Case of Basque
Shauli Ravfogel | Yoav Goldberg | Francis Tyers
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

Sequential neural networks models are powerful tools in a variety of Natural Language Processing (NLP) tasks. The sequential nature of these models raises the questions: to what extent can these models implicitly learn hierarchical structures typical to human language, and what kind of grammatical phenomena can they acquire? We focus on the task of agreement prediction in Basque, as a case study for a task that requires implicit understanding of sentence structure and the acquisition of a complex but consistent morphological system. Analyzing experimental results from two syntactic prediction tasks – verb number prediction and suffix recovery – we find that sequential models perform worse on agreement prediction in Basque than one might expect on the basis of a previous agreement prediction work in English. Tentative findings based on diagnostic classifiers suggest the network makes use of local heuristics as a proxy for the hierarchical structure of the sentence. We propose the Basque agreement prediction task as challenging benchmark for models that attempt to learn regularities in human language.

pdf bib
Adversarial Removal of Demographic Attributes from Text Data
Yanai Elazar | Yoav Goldberg
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Recent advances in Representation Learning and Adversarial Training seem to succeed in removing unwanted features from the learned representation. We show that demographic information of authors is encoded in—and can be recovered from—the intermediate representations learned by text-based neural classifiers. The implication is that decisions of classifiers trained on textual data are not agnostic to—and likely condition on—demographic attributes. When attempting to remove such demographic information using adversarial training, we find that while the adversarial component achieves chance-level development-set accuracy during training, a post-hoc classifier, trained on the encoded sentences from the first part, still manages to reach substantially higher classification accuracies on the same data. This behavior is consistent across several tasks, demographic properties and datasets. We explore several techniques to improve the effectiveness of the adversarial component. Our main conclusion is a cautionary one: do not rely on the adversarial training to achieve invariant representation to sensitive features.

pdf bib
Word Sense Induction with Neural biLM and Symmetric Patterns
Asaf Amrami | Yoav Goldberg
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

An established method for Word Sense Induction (WSI) uses a language model to predict probable substitutes for target words, and induces senses by clustering these resulting substitute vectors. We replace the ngram-based language model (LM) with a recurrent one. Beyond being more accurate, the use of the recurrent LM allows us to effectively query it in a creative way, using what we call dynamic symmetric patterns. The combination of the RNN-LM and the dynamic symmetric patterns results in strong substitute vectors for WSI, allowing to surpass the current state-of-the-art on the SemEval 2013 WSI shared task by a large margin.

2017

pdf bib
Greedy Transition-Based Dependency Parsing with Stack LSTMs
Miguel Ballesteros | Chris Dyer | Yoav Goldberg | Noah A. Smith
Computational Linguistics, Volume 43, Issue 2 - June 2017

We introduce a greedy transition-based parser that learns to represent parser states using recurrent neural networks. Our primary innovation that enables us to do this efficiently is a new control structure for sequential neural networks—the stack long short-term memory unit (LSTM). Like the conventional stack data structures used in transition-based parsers, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LSTM maintains a continuous space embedding of the stack contents. Our model captures three facets of the parser’s state: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of transition actions taken by the parser, and (iii) the complete contents of the stack of partially built tree fragments, including their internal structures. In addition, we compare two different word representations: (i) standard word vectors based on look-up tables and (ii) character-based models of words. Although standard word embedding models work well in all languages, the character-based models improve the handling of out-of-vocabulary words, particularly in morphologically rich languages. Finally, we discuss the use of dynamic oracles in training the parser. During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time. Training our model with dynamic oracles yields a linear-time greedy parser with very competitive performance.

pdf bib
Controlling Linguistic Style Aspects in Neural Language Generation
Jessica Ficler | Yoav Goldberg
Proceedings of the Workshop on Stylistic Variation

Most work on neural natural language generation (NNLG) focus on controlling the content of the generated text. We experiment with controling several stylistic aspects of the generated text, in addition to its content. The method is based on conditioned RNN language model, where the desired content as well as the stylistic parameters serve as conditioning contexts. We demonstrate the approach on the movie reviews domain and show that it is successful in generating coherent sentences corresponding to the required linguistic style and content.

pdf bib
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP
Samuel Bowman | Yoav Goldberg | Felix Hill | Angeliki Lazaridou | Omer Levy | Roi Reichart | Anders Søgaard
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP

pdf bib
Capturing Dependency Syntax with “Deep” Sequential Models
Yoav Goldberg
Proceedings of the Fourth International Conference on Dependency Linguistics (Depling 2017)

pdf bib
Exploring the Syntactic Abilities of RNNs with Multi-task Learning
Émile Enguehard | Yoav Goldberg | Tal Linzen
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Recent work has explored the syntactic abilities of RNNs using the subject-verb agreement task, which diagnoses sensitivity to sentence structure. RNNs performed this task well in common cases, but faltered in complex sentences (Linzen et al., 2016). We test whether these errors are due to inherent limitations of the architecture or to the relatively indirect supervision provided by most agreement dependencies in a corpus. We trained a single RNN to perform both the agreement task and an additional task, either CCG supertagging or language modeling. Multi-task training led to significantly lower error rates, in particular on complex sentences, suggesting that RNNs have the ability to evolve more sophisticated syntactic representations than shown before. We also show that easily available agreement training data can improve performance on other syntactic tasks, in particular when only a limited amount of training data is available for those tasks. The multi-task paradigm can also be leveraged to inject grammatical knowledge into language models.

pdf bib
From Raw Text to Universal Dependencies - Look, No Tags!
Miryam de Lhoneux | Yan Shao | Ali Basirat | Eliyahu Kiperwasser | Sara Stymne | Yoav Goldberg | Joakim Nivre
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

We present the Uppsala submission to the CoNLL 2017 shared task on parsing from raw text to universal dependencies. Our system is a simple pipeline consisting of two components. The first performs joint word and sentence segmentation on raw text; the second predicts dependency trees from raw words. The parser bypasses the need for part-of-speech tagging, but uses word embeddings based on universal tag distributions. We achieved a macro-averaged LAS F1 of 65.11 in the official test run, which improved to 70.49 after bug fixes. We obtained the 2nd best result for sentence segmentation with a score of 89.03.

pdf bib
Morphological Inflection Generation with Hard Monotonic Attention
Roee Aharoni | Yoav Goldberg
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We present a neural model for morphological inflection generation which employs a hard attention mechanism, inspired by the nearly-monotonic alignment commonly found between the characters in a word and the characters in its inflection. We evaluate the model on three previously studied morphological inflection generation datasets and show that it provides state of the art results in various setups compared to previous neural and non-neural approaches. Finally we present an analysis of the continuous representations learned by both the hard and soft (Bahdanau, 2014) attention models for the task, shedding some light on the features such models extract.

pdf bib
Towards String-To-Tree Neural Machine Translation
Roee Aharoni | Yoav Goldberg
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We present a simple method to incorporate syntactic information about the target language in a neural machine translation system by translating into linearized, lexicalized constituency trees. An experiment on the WMT16 German-English news translation task resulted in an improved BLEU score when compared to a syntax-agnostic NMT baseline trained on the same dataset. An analysis of the translations from the syntax-aware system shows that it performs more reordering during translation in comparison to the baseline. A small-scale human evaluation also showed an advantage to the syntax-aware system.

pdf bib
A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments
Omer Levy | Anders Søgaard | Yoav Goldberg
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

While cross-lingual word embeddings have been studied extensively in recent years, the qualitative differences between the different algorithms remain vague. We observe that whether or not an algorithm uses a particular feature set (sentence IDs) accounts for a significant performance gap among these algorithms. This feature set is also used by traditional alignment algorithms, such as IBM Model-1, which demonstrate similar performance to state-of-the-art embedding algorithms on a variety of benchmarks. Overall, we observe that different algorithmic approaches for utilizing the sentence ID feature space result in similar performance. This paper draws both empirical and theoretical parallels between the embedding and alignment literature, and suggests that adding additional sources of information, which go beyond the traditional signal of bilingual sentence-aligned corpora, may substantially improve cross-lingual word embeddings, and that future baselines should at least take such features into account.

pdf bib
Improving a Strong Neural Parser with Conjunction-Specific Features
Jessica Ficler | Yoav Goldberg
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

While dependency parsers reach very high overall accuracy, some dependency relations are much harder than others. In particular, dependency parsers perform poorly in coordination construction (i.e., correctly attaching the conj relation). We extend a state-of-the-art dependency parser with conjunction-specific features, focusing on the similarity between the conjuncts head words. Training the extended parser yields an improvement in conj attachment as well as in overall dependency parsing accuracy on the Stanford dependency conversion of the Penn TreeBank.

pdf bib
The Interplay of Semantics and Morphology in Word Embeddings
Oded Avraham | Yoav Goldberg
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We explore the ability of word embeddings to capture both semantic and morphological similarity, as affected by the different types of linguistic properties (surface form, lemma, morphological tag) used to compose the representation of each word. We train several models, where each uses a different subset of these properties to compose its representations. By evaluating the models on semantic and morphological measures, we reveal some useful insights on the relationship between semantics and morphology.

2016

pdf bib
A Neural Network for Coordination Boundary Prediction
Jessica Ficler | Yoav Goldberg
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf bib
Training with Exploration Improves a Greedy Stack LSTM Parser
Miguel Ballesteros | Yoav Goldberg | Chris Dyer | Noah A. Smith
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

bib
Practical Neural Networks for NLP: From Theory to Code
Chris Dyer | Yoav Goldberg | Graham Neubig
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

This tutorial aims to bring NLP researchers up to speed with the current techniques in deep learning and neural networks, and show them how they can turn their ideas into practical implementations. We will start with simple classification models (logistic regression and multilayer perceptrons) and cover more advanced patterns that come up in NLP such as recurrent networks for sequence tagging and prediction problems, structured networks (e.g., compositional architectures based on syntax trees), structured output spaces (sequences and trees), attention for sequence-to-sequence transduction, and feature induction for complex algorithm states. A particular emphasis will be on learning to represent complex objects as recursive compositions of simpler objects. This representation will reflect characterize standard objects in NLP, such as the composition of characters and morphemes into words, and words into sentences and documents. In addition, new opportunities such as learning to embed "algorithm states" such as those used in transition-based parsing and other sequential structured prediction models (for which effective features may be difficult to engineer by hand) will be covered.Everything in the tutorial will be grounded in code — we will show how to program seemingly complex neural-net models using toolkits based on the computation-graph formalism. Computation graphs decompose complex computations into a DAG, with nodes representing inputs, target outputs, parameters, or (sub)differentiable functions (e.g., "tanh", "matrix multiply", and "softmax"), and edges represent data dependencies. These graphs can be run "forward" to make predictions and compute errors (e.g., log loss, squared error) and then "backward" to compute derivatives with respect to model parameters. In particular we'll cover the Python bindings of the CNN library. CNN has been designed from the ground up for NLP applications, dynamically structured NNs, rapid prototyping, and a transparent data and execution model.

pdf bib
Coordination Annotation Extension in the Penn Tree Bank
Jessica Ficler | Yoav Goldberg
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Improving Hypernymy Detection with an Integrated Path-based and Distributional Method
Vered Shwartz | Yoav Goldberg | Ido Dagan
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Improved Parsing for Argument-Clusters Coordination
Jessica Ficler | Yoav Goldberg
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Deep multi-task learning with low level tasks supervised at lower layers
Anders Søgaard | Yoav Goldberg
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss
Barbara Plank | Anders Søgaard | Yoav Goldberg
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Universal Dependencies v1: A Multilingual Treebank Collection
Joakim Nivre | Marie-Catherine de Marneffe | Filip Ginter | Yoav Goldberg | Jan Hajič | Christopher D. Manning | Ryan McDonald | Slav Petrov | Sampo Pyysalo | Natalia Silveira | Reut Tsarfaty | Daniel Zeman
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Cross-linguistically consistent annotation is necessary for sound comparative evaluation and cross-lingual learning experiments. It is also useful for multilingual system development and comparative linguistic studies. Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. In this paper, we describe v1 of the universal guidelines, the underlying design principles, and the currently available treebanks for 33 languages.

pdf bib
Improving Sequence to Sequence Learning for Morphological Inflection Generation: The BIU-MIT Systems for the SIGMORPHON 2016 Shared Task for Morphological Reinflection
Roee Aharoni | Yoav Goldberg | Yonatan Belinkov
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

pdf bib
Improving Reliability of Word Similarity Evaluation by Redesigning Annotation Task and Performance Measure
Oded Avraham | Yoav Goldberg
Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP

pdf bib
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning
Stefan Riezler | Yoav Goldberg
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

pdf bib
Improving sentence compression by learning to predict gaze
Sigrid Klerke | Yoav Goldberg | Anders Søgaard
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations
Eliyahu Kiperwasser | Yoav Goldberg
Transactions of the Association for Computational Linguistics, Volume 4

We present a simple and effective scheme for dependency parsing which is based on bidirectional-LSTMs (BiLSTMs). Each sentence token is associated with a BiLSTM vector representing the token in its sentential context, and feature vectors are constructed by concatenating a few BiLSTM vectors. The BiLSTM is trained jointly with the parser objective, resulting in very effective feature extractors for parsing. We demonstrate the effectiveness of the approach by applying it to a greedy transition-based parser as well as to a globally optimized graph-based parser. The resulting parsers have very simple architectures, and match or surpass the state-of-the-art accuracies on English and Chinese.

pdf bib
Easy-First Dependency Parsing with Hierarchical Tree LSTMs
Eliyahu Kiperwasser | Yoav Goldberg
Transactions of the Association for Computational Linguistics, Volume 4

We suggest a compositional vector representation of parse trees that relies on a recursive combination of recurrent-neural network encoders. To demonstrate its effectiveness, we use the representation as the backbone of a greedy, bottom-up dependency parser, achieving very strong accuracies for English and Chinese, without relying on external word embeddings. The parser’s implementation is available for download at the first author’s webpage.

pdf bib
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
Tal Linzen | Emmanuel Dupoux | Yoav Goldberg
Transactions of the Association for Computational Linguistics, Volume 4

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture’s grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.

pdf bib
Semi Supervised Preposition-Sense Disambiguation using Multilingual Data
Hila Gonen | Yoav Goldberg
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Prepositions are very common and very ambiguous, and understanding their sense is critical for understanding the meaning of the sentence. Supervised corpora for the preposition-sense disambiguation task are small, suggesting a semi-supervised approach to the task. We show that signals from unannotated multilingual data can be used to improve supervised preposition-sense disambiguation. Our approach pre-trains an LSTM encoder for predicting the translation of a preposition, and then incorporates the pre-trained encoder as a component in a supervised classification system, and fine-tunes it for the task. The multilingual signals consistently improve results on two preposition-sense datasets.

2015

pdf bib
Semi-supervised Dependency Parsing using Bilexical Contextual Features from Auto-Parsed Data
Eliyahu Kiperwasser | Yoav Goldberg
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Template Kernels for Dependency Parsing
Hillel Taub-Tabib | Yoav Goldberg | Amir Globerson
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Improving Distributional Similarity with Lessons Learned from Word Embeddings
Omer Levy | Yoav Goldberg | Ido Dagan
Transactions of the Association for Computational Linguistics, Volume 3

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks. We reveal that much of the performance gains of word embeddings are due to certain system design choices and hyperparameter optimizations, rather than the embedding algorithms themselves. Furthermore, we show that these modifications can be transferred to traditional distributional models, yielding similar gains. In contrast to prior reports, we observe mostly local or insignificant performance differences between the methods, with no global advantage to any single approach over the others.

2014

pdf bib
A Tabular Method for Dynamic Oracles in Transition-Based Parsing
Yoav Goldberg | Francesco Sartorio | Giorgio Satta
Transactions of the Association for Computational Linguistics, Volume 2

We develop parsing oracles for two transition-based dependency parsers, including the arc-standard parser, solving a problem that was left open in (Goldberg and Nivre, 2013). We experimentally show that using these oracles during training yields superior parsing accuracies on many languages.

pdf bib
Squibs: Constrained Arc-Eager Dependency Parsing
Joakim Nivre | Yoav Goldberg | Ryan McDonald
Computational Linguistics, Volume 40, Issue 2 - June 2014

pdf bib
Linguistic Regularities in Sparse and Explicit Word Representations
Omer Levy | Yoav Goldberg
Proceedings of the Eighteenth Conference on Computational Natural Language Learning

pdf bib
Intermediary Semantic Representation through Proposition Structures
Gabriel Stanovsky | Jessica Ficler | Ido Dagan | Yoav Goldberg
Proceedings of the ACL 2014 Workshop on Semantic Parsing

pdf bib
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages
Yoav Goldberg | Yuval Marton | Ines Rehbein | Yannick Versley | Özlem Çetinoğlu | Joel Tetreault
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages

pdf bib
Automatic Detection of Machine Translated Text and Translation Quality Estimation
Roee Aharoni | Moshe Koppel | Yoav Goldberg
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Dependency-Based Word Embeddings
Omer Levy | Yoav Goldberg
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

pdf bib
A Non-Monotonic Arc-Eager Transition System for Dependency Parsing
Matthew Honnibal | Yoav Goldberg | Mark Johnson
Proceedings of the Seventeenth Conference on Computational Natural Language Learning

pdf bib
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages
Yoav Goldberg | Yuval Marton | Ines Rehbein | Yannick Versley
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages

pdf bib
Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages
Djamé Seddah | Reut Tsarfaty | Sandra Kübler | Marie Candito | Jinho D. Choi | Richárd Farkas | Jennifer Foster | Iakes Goenaga | Koldo Gojenola Galletebeitia | Yoav Goldberg | Spence Green | Nizar Habash | Marco Kuhlmann | Wolfgang Maier | Joakim Nivre | Adam Przepiórkowski | Ryan Roth | Wolfgang Seeker | Yannick Versley | Veronika Vincze | Marcin Woliński | Alina Wróblewska | Eric Villemonte de la Clergerie
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages

pdf bib
Dynamic-oracle Transition-based Parsing with Calibrated Probabilistic Output
Yoav Goldberg
Proceedings of the 13th International Conference on Parsing Technologies (IWPT 2013)

pdf bib
Training Deterministic Parsers with Non-Deterministic Oracles
Yoav Goldberg | Joakim Nivre
Transactions of the Association for Computational Linguistics, Volume 1

Greedy transition-based parsers are very fast but tend to suffer from error propagation. This problem is aggravated by the fact that they are normally trained using oracles that are deterministic and incomplete in the sense that they assume a unique canonical path through the transition system and are only valid as long as the parser does not stray from this path. In this paper, we give a general characterization of oracles that are nondeterministic and complete, present a method for deriving such oracles for transition systems that satisfy a property we call arc decomposition, and instantiate this method for three well-known transition systems from the literature. We say that these oracles are dynamic, because they allow us to dynamically explore alternative and nonoptimal paths during training — in contrast to oracles that statically assume a unique optimal path. Experimental evaluation on a wide range of data sets clearly shows that using dynamic oracles to train greedy parsers gives substantial improvements in accuracy. Moreover, this improvement comes at no cost in terms of efficiency, unlike other techniques like beam search.

pdf bib
Universal Dependency Annotation for Multilingual Parsing
Ryan McDonald | Joakim Nivre | Yvonne Quirmbach-Brundage | Yoav Goldberg | Dipanjan Das | Kuzman Ganchev | Keith Hall | Slav Petrov | Hao Zhang | Oscar Täckström | Claudia Bedini | Núria Bertomeu Castelló | Jungmee Lee
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Efficient Implementation of Beam-Search Incremental Parsers
Yoav Goldberg | Kai Zhao | Liang Huang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

pdf bib
Word Segmentation, Unknown-word Resolution, and Morphological Agreement in a Hebrew Parsing System
Yoav Goldberg | Michael Elhadad
Computational Linguistics, Volume 39, Issue 1 - March 2013

pdf bib
A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books
Yoav Goldberg | Jon Orwant
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

2012

pdf bib
Domain Adaptation of a Dependency Parser with a Class-Class Selectional Preference Model
Raphael Cohen | Yoav Goldberg | Michael Elhadad
Proceedings of ACL 2012 Student Research Workshop

pdf bib
A Dynamic Oracle for Arc-Eager Dependency Parsing
Yoav Goldberg | Joakim Nivre
Proceedings of COLING 2012

2011

pdf bib
Language-Independent Parsing with Empty Elements
Shu Cai | David Chiang | Yoav Goldberg
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

pdf bib
Joint Hebrew Segmentation and Parsing using a PCFGLA Lattice Parser
Yoav Goldberg | Michael Elhadad
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

pdf bib
An Efficient Algorithm for Easy-First Non-Directional Dependency Parsing
Yoav Goldberg | Michael Elhadad
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

pdf bib
Statistical Parsing of Morphologically Rich Languages (SPMRL) What, How and Whither
Reut Tsarfaty | Djamé Seddah | Yoav Goldberg | Sandra Kuebler | Yannick Versley | Marie Candito | Jennifer Foster | Ines Rehbein | Lamia Tounsi
Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages

pdf bib
Easy-First Dependency Parsing of Modern Hebrew
Yoav Goldberg | Michael Elhadad
Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages

pdf bib
Inspecting the Structural Biases of Dependency Parsing Algorithms
Yoav Goldberg | Michael Elhadad
Proceedings of the Fourteenth Conference on Computational Natural Language Learning

2009

pdf bib
Enhancing Unlexicalized Parsing Performance Using a Wide Coverage Lexicon, Fuzzy Tag-Set Mapping, and EM-HMM-Based Lexical Probabilities
Yoav Goldberg | Reut Tsarfaty | Meni Adler | Michael Elhadad
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
On the Role of Lexical Features in Sequence Labeling
Yoav Goldberg | Michael Elhadad
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

pdf bib
Gaiku : Generating Haiku with Word Associations Norms
Yael Netzer | David Gabay | Yoav Goldberg | Michael Elhadad
Proceedings of the Workshop on Computational Approaches to Linguistic Creativity

pdf bib
Hebrew Dependency Parsing: Initial Results
Yoav Goldberg | Michael Elhadad
Proceedings of the 11th International Conference on Parsing Technologies (IWPT’09)

2008

pdf bib
Tagging a Hebrew Corpus: the Case of Participles
Meni Adler | Yael Netzer | Yoav Goldberg | David Gabay | Michael Elhadad
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We report on an effort to build a corpus of Modern Hebrew tagged with part-of-speech and morphology. We designed a tagset specific to Hebrew while focusing on four aspects: the tagset should be consistent with common linguistic knowledge; there should be maximal agreement among taggers as to the tags assigned to maintain consistency; the tagset should be useful for machine taggers and learning algorithms; and the tagset should be effective for applications relying on the tags as input features. In this paper, we illustrate these issues by explaining our decision to introduce a tag for beinoni forms in Hebrew. We explain how this tag is defined, and how it helped us improve manual tagging accuracy to a high-level, while improving automatic tagging and helping in the task of syntactic chunking.

pdf bib
Word-Based or Morpheme-Based? Annotation Strategies for Modern Hebrew Clitics
Reut Tsarfaty | Yoav Goldberg
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Morphologically rich languages pose a challenge to the annotators of treebanks with respect to the status of orthographic (space-delimited) words in the syntactic parse trees. In such languages an orthographic word may carry various, distinct, sorts of information and the question arises whether we should represent such words as a sequence of their constituent morphemes (i.e., a Morpheme-Based annotation strategy) or whether we should preserve their special orthographic status within the trees (i.e., a Word-Based annotation strategy). In this paper we empirically address this challenge in the context of the development of Language Resources for Modern Hebrew. We compare and contrast the Morpheme-Based and Word-Based annotation strategies of pronominal clitics in Modern Hebrew and we show that the Word-Based strategy is more adequate for the purpose of training statistical parsers as it provides a better PP-attachment disambiguation capacity and a better alignment with initial surface forms. Our findings in turn raise new questions concerning the interaction of morphological and syntactic processing of which investigation is facilitated by the parallel treebank we made available.

pdf bib
A Single Generative Model for Joint Morphological Segmentation and Syntactic Parsing
Yoav Goldberg | Reut Tsarfaty
Proceedings of ACL-08: HLT

pdf bib
Unsupervised Lexicon-Based Resolution of Unknown Words for Full Morphological Analysis
Meni Adler | Yoav Goldberg | David Gabay | Michael Elhadad
Proceedings of ACL-08: HLT

pdf bib
EM Can Find Pretty Good HMM POS-Taggers (When Given a Good Start)
Yoav Goldberg | Meni Adler | Michael Elhadad
Proceedings of ACL-08: HLT

pdf bib
splitSVM: Fast, Space-Efficient, non-Heuristic, Polynomial Kernel Computation for NLP Applications
Yoav Goldberg | Michael Elhadad
Proceedings of ACL-08: HLT, Short Papers

2007

pdf bib
SVM Model Tampering and Anchored Learning: A Case Study in Hebrew NP Chunking
Yoav Goldberg | Michael Elhadad
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

pdf bib
Noun Phrase Chunking in Hebrew: Influence of Lexical and Morphological Features
Yoav Goldberg | Meni Adler | Michael Elhadad
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

Search
Co-authors