Robert C. Berwick

Also published as: Robert Berwick, Robert Cregar Berwick


2021

Evaluating Universal Dependency Parser Recovery of Predicate Argument Structure via CompChain Analysis
Sagar Indurkhya | Beracah Yankama | Robert C. Berwick
Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics

Accurate recovery of predicate-argument structure from a Universal Dependency (UD) parse is central to downstream tasks such as extraction of semantic roles or event representations. This study introduces compchains, a categorization of the hierarchy of predicate dependency relations present within a UD parse. Accuracy of compchain classification serves as a proxy for measuring accurate recovery of predicate-argument structure from sentences with embedding. We analyzed the distribution of compchains in three UD English treebanks, EWT, GUM and LinES, revealing that these treebanks are sparse with respect to sentences with predicate-argument structure that includes predicate-argument embedding. We evaluated the CoNLL 2018 Shared Task UDPipe (v1.2) baseline (dependency parsing) models as compchain classifiers for the EWT, GUM and LinES UD treebanks. Our results indicate that these three baseline models exhibit poorer performance on sentences with predicate-argument structure with more than one level of embedding; we used compchains to characterize the errors made by these parsers and present examples of erroneous parses produced by the parser that were identified using compchains. We also analyzed the distribution of compchains in 58 non-English UD treebanks and then used compchains to evaluate the CoNLL 2018 Shared Task baseline model for each of these treebanks. Our analysis shows that performance with respect to compchain classification is only weakly correlated with the official evaluation metrics (LAS, MLAS and BLEX). We identify gaps in the distribution of compchains in several of the UD treebanks, thus providing a roadmap for how these treebanks may be supplemented. We conclude by discussing how compchains provide a new perspective on the sparsity of training data for UD parsers, as well as the accuracy of the resulting UD parses.

2018

Evaluating the Ability of LSTMs to Learn Context-Free Grammars
Luzi Sennhauser | Robert Berwick
Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP

While long short-term memory (LSTM) neural net architectures are designed to capture sequence information, human language is generally composed of hierarchical structures. This raises the question as to whether LSTMs can learn hierarchical structures. We explore this question with a well-formed bracket prediction task using two types of brackets modeled by an LSTM. Demonstrating that such a system is learnable by an LSTM is the first step in demonstrating that the entire class of context-free languages (CFLs) is also learnable. We observe that the model requires exponential memory in terms of the number of characters and embedding depth, whereas sub-linear memory should suffice. Still, the model does more than memorize the training input. It learns how to distinguish between relevant and irrelevant information. On the other hand, we also observe that the model does not generalize well. We conclude that LSTMs do not learn the relevant underlying context-free rules, suggesting the good overall performance is attained rather by an efficient way of evaluating nuisance variables. LSTMs are a way to quickly reach good results for many natural language tasks, but to understand and generate natural language one has to investigate other concepts that can make more direct use of natural language’s structural nature.

2015

Proceedings of the Sixth Workshop on Cognitive Aspects of Computational Language Learning
Robert Berwick | Anna Korhonen | Alessandro Lenci | Thierry Poibeau | Aline Villavicencio
Proceedings of the Sixth Workshop on Cognitive Aspects of Computational Language Learning

2013

Language Acquisition and Probabilistic Models: keeping it simple
Aline Villavicencio | Marco Idiart | Robert Berwick | Igor Malioutov
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Proceedings of the Workshop on Computational Models of Language Acquisition and Loss
Robert Berwick | Anna Korhonen | Thierry Poibeau | Aline Villavicencio
Proceedings of the Workshop on Computational Models of Language Acquisition and Loss

An annotated English child language database
Aline Villavicencio | Beracah Yankama | Rodrigo Wilkens | Marco Idiart | Robert Berwick
Proceedings of the Workshop on Computational Models of Language Acquisition and Loss

Get out but don’t fall down: verb-particle constructions in child language
Aline Villavicencio | Marco Idiart | Carlos Ramisch | Vítor Araújo | Beracah Yankama | Robert Berwick
Proceedings of the Workshop on Computational Models of Language Acquisition and Loss

A large scale annotated child language construction database
Aline Villavicencio | Beracah Yankama | Marco Idiart | Robert Berwick
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Large scale annotated corpora of child language can be of great value in assessing theoretical proposals regarding language acquisition models. For example, they can help determine whether the type and amount of data required by a proposed language acquisition model can actually be found in a naturalistic data sample. To this end, several recent efforts have augmented the CHILDES child language corpora with POS tagging and parsing information for languages such as English. With the increasing availability of robust NLP systems and electronic resources, these corpora can be further annotated with more detailed information about the properties of words, verb argument structure, and sentences. This paper describes such an initiative for combining information from various sources to extend the annotation of the English CHILDES corpora with linguistic, psycholinguistic and distributional information, along with an example illustrating an application of this approach to the extraction of verb alternation information. The end result, the English CHILDES Verb Construction Database, is an integrated resource containing information such as grammatical relations, verb semantic classes, and age of acquisition, enabling more targeted complex searches involving different levels of annotation that can facilitate a more detailed analysis of the linguistic input available to children.

1996

Principle-based Parsing for Chinese
Charles D. Yang | Robert C. Berwick
Proceedings of the 11th Pacific Asia Conference on Language, Information and Computation

1994

An Architecture for a Universal Lexicon: A Case Study on Shared Syntactic Information in Japanese, Hindi, Bengali, Greek, and English
Naoyuki Nomura | Douglas A. Jones | Robert C. Berwick
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

A Markov Language Learning Model for Finite Parameter Spaces
Partha Niyogi | Robert C. Berwick
32nd Annual Meeting of the Association for Computational Linguistics

1992

Isolating Cross-linguistic Parsing Complexity with a Principles-and-Parameters Parser: A Case Study of Japanese and English
Sandiway Fong | Robert C. Berwick
COLING 1992 Volume 2: The 14th International Conference on Computational Linguistics

1991

Proceedings of the Second International Workshop on Parsing Technologies (IWPT ’91)
Masaru Tomita | Martin Kay | Robert Berwick | Eva Hajicova | Aravind Joshi | Ronald Kaplan | Makoto Nagao | Yorick Wilks
Proceedings of the Second International Workshop on Parsing Technologies

February 13-15, 1991

A Computational Model of First Language Acquisition
Robert C. Berwick
Computational Linguistics, Volume 17, Number 3, September 1991

Automatic Acquisition of Subcategorization Frames from Tagged Text
Michael R. Brent | Robert C. Berwick
Speech and Natural Language: Proceedings of a Workshop Held at Pacific Grove, California, February 19-22, 1991

1989

The Computational Implementation of Principle-Based Parsers
Sandiway Fong | Robert C. Berwick
Proceedings of the First International Workshop on Parsing Technologies

This paper addresses the issue of how to organize linguistic principles for efficient processing. Based on the general characterization of principles in terms of purely computational properties, the effects of principle-ordering on parser performance are investigated. A novel parser that exploits the possible variation in principle-ordering to dynamically re-order principles is described. Heuristics for minimizing the amount of unnecessary work performed during the parsing process are also discussed.

1985

New Approaches to Parsing Conjunctions Using Prolog
Sandiway Fong | Robert C. Berwick
23rd Annual Meeting of the Association for Computational Linguistics

1984

Strong Generative Capacity, Weak Generative Capacity, and Modern Linguistic Theories
Robert C. Berwick
Computational Linguistics. Formerly the American Journal of Computational Linguistics, Volume 10, Number 3-4, July-December 1984

Bounded Context Parsing and Easy Learnability
Robert C. Berwick
10th International Conference on Computational Linguistics and 22nd Annual Meeting of the Association for Computational Linguistics

1983

Syntactic Constraints and Efficient Parsability
Robert C. Berwick | Amy S. Weinberg
21st Annual Meeting of the Association for Computational Linguistics

1982

Computational Complexity and Lexical-Functional Grammar
Robert C. Berwick
American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982

1981

Computational Complexity and Lexical Functional Grammar
Robert C. Berwick
19th Annual Meeting of the Association for Computational Linguistics

1980

Computational Analogues of Constraints on Grammars: A Model of Syntactic Acquisition
Robert Cregar Berwick
18th Annual Meeting of the Association for Computational Linguistics