William Dyer


2023

Evaluating a Century of Progress on the Cognitive Science of Adjective Ordering
William Dyer | Charles Torres | Gregory Scontras | Richard Futrell
Transactions of the Association for Computational Linguistics, Volume 11

The literature on adjective ordering abounds with proposals meant to account for why certain adjectives appear before others in multi-adjective strings (e.g., the small brown box). However, these proposals have been developed and tested primarily in isolation and based on English; few researchers have looked at the combined performance of multiple factors in the determination of adjective order, and few have evaluated predictors across multiple languages. The current work approaches both of these objectives by using technologies and datasets from natural language processing to look at the combined performance of existing proposals across 32 languages. Comparing this performance with both random and idealized baselines, we show that the literature on adjective ordering has made significant meaningful progress across its many decades, but there remains quite a gap yet to be explained.

2021

Predicting cross-linguistic adjective order with information gain
William Dyer | Richard Futrell | Zoey Liu | Greg Scontras
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

What determines the order of adjectives in English? Comparing efficiency-based theories using dependency treebanks
Richard Futrell | William Dyer | Greg Scontras
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We take up the scientific question of what determines the preferred order of adjectives in English, in phrases such as big blue box where multiple adjectives modify a following noun. We implement and test four quantitative theories, all of which are theoretically motivated in terms of efficiency in human language production and comprehension. The four theories we test are subjectivity (Scontras et al., 2017), information locality (Futrell, 2019), integration cost (Dyer, 2017), and information gain, which we introduce. We evaluate theories based on their ability to predict orders of unseen adjectives in hand-parsed and automatically-parsed dependency treebanks. We find that subjectivity, information locality, and information gain are all strong predictors, with some evidence for a two-factor account, where subjectivity and information gain reflect a factor involving semantics, and information locality reflects collocational preferences.

2019

DepDist: Surface realization via regex and learned dependency-distance tolerance
William Dyer
Proceedings of the 2nd Workshop on Multilingual Surface Realisation (MSR 2019)

This paper describes a method of inflecting and linearizing a lemmatized dependency tree by: (1) determining a regular expression and substitution to describe each productive wordform rule; (2) learning the dependency distance tolerance for each head-dependent pair, resulting in an edge-weighted directed acyclic graph (DAG); and (3) topologically sorting the DAG into a surface realization based on edge weight. The method’s output for 11 languages across 18 treebanks is competitive with the other submissions to the Second Multilingual Surface Realization Shared Task (SR ’19).
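
As a rough illustration of steps (1) and (3) in the abstract above, the following Python sketch applies one regex substitution rule to a lemma and then topologically sorts a small edge-weighted DAG. It is not the paper's implementation. The rule, the function names `inflect` and `linearize`, the toy edges, and in particular the way edge weights break ties (a node's priority is its lightest incoming edge) are all assumptions made for this example.

```python
import heapq
import re

# Step (1), hypothetical rule: a regex and its substitution for one wordform pattern.
PLURAL_RULE = (re.compile(r"y$"), "ies")


def inflect(lemma, rule):
    """Apply a (compiled pattern, replacement) rule to a lemma."""
    pattern, replacement = rule
    return pattern.sub(replacement, lemma)


def linearize(nodes, edges):
    """Topologically sort nodes given edges {(before, after): weight}.

    Uses Kahn's algorithm. Assumption for this sketch only: among nodes that
    are ready to be emitted, the one whose lightest incoming edge is smallest
    goes first, with input position as the final tie-break.
    """
    position = {n: i for i, n in enumerate(nodes)}
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    priority = {n: 0.0 for n in nodes}  # sources keep priority 0.0
    has_incoming = set()
    for (before, after), weight in edges.items():
        indegree[after] += 1
        successors[before].append(after)
        if after in has_incoming:
            priority[after] = min(priority[after], weight)
        else:
            priority[after] = weight
            has_incoming.add(after)

    ready = [(priority[n], position[n], n) for n in nodes if indegree[n] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, _, node = heapq.heappop(ready)
        order.append(node)
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (priority[nxt], position[nxt], nxt))
    if len(order) != len(nodes):
        raise ValueError("edges contain a cycle, so no linear order exists")
    return order


# Toy input (invented): precedence edges over three lemmas.
toy_nodes = ["box", "small", "the"]
toy_edges = {("the", "small"): 1.0, ("small", "box"): 1.0, ("the", "box"): 2.0}
surface_order = linearize(toy_nodes, toy_edges)
```

Step (2), learning the tolerance values that become edge weights, is not sketched here because the abstract does not say how they are estimated.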

Weighted posets: Learning surface order from dependency trees
William Dyer
Proceedings of the 18th International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2019)

2018

Integration complexity and the order of cosisters
William Dyer
Proceedings of the Second Workshop on Universal Dependencies (UDW 2018)

The cost of integrating dependent constituents to their heads is thought to involve the distance between dependent and head and the complexity of the integration (Gibson, 1998). The former has been convincingly addressed by Dependency Distance Minimization (DDM) (cf. Liu et al., 2017). The current study addresses the latter by proposing a novel theory of integration complexity derived from the entropy of the probability distribution of a dependent’s heads. An analysis of Universal Dependency corpora provides empirical evidence regarding the preferred order of isomorphic cosisters—sister constituents of the same syntactic form on the same side of their head—such as the adjectives in pretty blue fish. Integration complexity, alongside DDM, allows for a general theory of constituent order based on integration cost.
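
The central quantity in the abstract above, the entropy of the probability distribution of a dependent's heads, can be illustrated with a minimal Python sketch. It assumes plain Shannon entropy over relative frequencies; the function name `head_entropy` and the toy head lists are invented for this example. The sketch makes no claim about how the paper estimates the distribution or about which direction of entropy predicts which order.

```python
import math
from collections import Counter


def head_entropy(heads):
    """Shannon entropy (in bits) of the empirical distribution over heads.

    `heads` lists the head observed for each occurrence of one dependent.
    """
    counts = Counter(heads)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Invented toy observations: the heads each adjective was seen modifying.
observed_heads = {
    "blue": ["fish", "box", "sky", "car"],
    "pretty": ["fish", "fish", "fish", "flower"],
}
entropies = {adj: head_entropy(heads) for adj, heads in observed_heads.items()}
```

A dependent that attaches to many different heads with similar frequency gets a high value, while one that almost always attaches to the same head gets a value near zero.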