Proceedings of the Seventh International Conference on Dependency Linguistics (Depling, GURT/SyntaxFest 2023)

Owen Rambow, François Lareau (Editors)

Washington, D.C.
DepLing | SyntaxFest
Association for Computational Linguistics

The development of dependency length minimization in early child language: A case study of the dative alternation
Zoey Liu | Stefanie Wulff

How does the preference for dependency length minimization (DLM) develop in early child language? This study takes up this question with the dative alternation in English as the test case. We built a large-scale dataset of dative constructions using transcripts of naturalistic child-parent interactions. Across different developmental stages of children, there appears to be a strong tendency for DLM. The tendency emerges in the 12-18 month age range, slightly decreases until 30-36 months, then becomes more pronounced and approaches parents’ production preferences after 48 months. We further show that the extent of DLM depends on how a given dative construction is realized: the tendency for shorter dependencies is much more pronounced in double object structures, whereas prepositional object structures are associated with longer dependencies.

Which Sentence Representation is More Informative: An Analysis on Text Classification
Necva Bölücü | Burcu Can

Text classification is a popular and well-studied problem in Natural Language Processing. Most previous work on text classification has focused on deep neural networks such as LSTMs and CNNs. However, text classification studies using syntactic and semantic information are very limited in the literature. In this study, we propose a model using a Graph Attention Network (GAT) that incorporates semantic and syntactic information as input for the text classification task. The semantic representations of UCCA and AMR are used as semantic information, and the dependency tree is used as syntactic information. Extensive experimental results and in-depth analysis show that the UCCA-GAT model, which is semantic-aware, outperforms the AMR-GAT and DEP-GAT models, which are semantic- and syntax-aware, respectively. We also provide a comprehensive analysis of the proposed model to understand the limitations of the representations for the problem.

Formal Semantics for Dependency Grammar
Dag T. T. Haug | Jamie Y. Findlay

In this paper, we provide an explicit interface to formal semantics for Dependency Grammar, based on Glue Semantics. Glue Semantics has mostly been developed in the context of Lexical Functional Grammar, which shares two crucial assumptions with Dependency Grammar: lexical integrity and allowance of nonbinary-branching syntactic structure. We show how Glue can be adapted to the Dependency Grammar setting and provide sample semantic analyses of quantifier scope, control infinitives and relative clauses.

Predicates and entities in Abstract Meaning Representation
Antoine Venant | François Lareau

Nodes in Abstract Meaning Representation (AMR) are generally thought of as neo-Davidsonian entities. We review existing translations into neo-Davidsonian representations and show that these translations handle copula sentences inconsistently. We link the problem to an asymmetry arising from a problematic handling of words with no associated PropBank frames for the underlying predicate. We introduce a method to automatically and uniformly decompose AMR nodes into an entity part and a predicative part, which offers a consistent treatment of copula sentences and quasi-predicates such as brother or client.

Character-level Dependency Annotation of Chinese
Li Yixuan

In this paper, we propose a new model for annotating dependency relations at the Mandarin character level, with the aim of building treebanks that cope with the unsatisfactory performance of existing word segmentation and syntactic analysis models in specific scientific domains, such as Chinese patent texts. The result is a treebank of 100 sentences annotated according to our scheme. It also serves as a training corpus to facilitate the subsequent development of a joint word segmenter and dependency analyzer, which enables downstream tasks in Chinese to be decoupled from the non-standardized pre-processing step of word segmentation.

What quantifying word order freedom can tell us about dependency corpora
Maja Buljan

Building upon existing work on word order freedom and syntactic annotation, this paper investigates whether we can differentiate between findings that reveal inherent properties of natural languages and their syntax, and features dependent on the annotations used in computing the measures. An existing quantifiable and linguistically interpretable measure of word order freedom in language is applied to take a closer look at the robustness of the basic measure (word order entropy) to variations in the dependency corpora used in the analysis. Measures are compared at three levels of generality, applied to corpora annotated according to the Universal Dependencies v1 and v2 annotation guidelines, with 31 languages selected for analysis. Preliminary results show that certain measures, such as subject-object relation order freedom, are sensitive to slight changes in annotation guidelines, while simpler measures are more robust, highlighting aspects of these metrics that should be taken into consideration when using dependency corpora for linguistic analysis and generalisation.

Word order flexibility: a typometric study
Sylvain Kahane | Ziqian Peng | Kim Gerdes

This paper introduces a typometric measure of flexibility, which quantifies the variability of head-dependent word order over the whole set of treebanks of a language or over specific constructions. The measure is based on the notion of head-initiality, and we show that it can be computed for all languages of the Universal Dependencies treebank set, that it does not require ad-hoc thresholds to categorize languages or constructions, and that it can be applied at any granularity of constructions and languages. We compare our results with Bakker’s (1998) categorical flexibility index. Typometric flexibility is shown to be a good measure for characterizing the language distribution with respect to word order for a given construction, and for estimating whether a construction predicts the global word order behavior of a language.

Measure words are measurably different from sortal classifiers
Yamei Wang | Géraldine Walther

Nominal classifiers categorize nouns based on salient semantic properties. Past studies have long debated whether sortal classifiers (related to intrinsic semantic noun features) and mensural classifiers (related to quantity) should be considered the same grammatical category. Suggested diagnostic tests rely on functional and distributional criteria, typically evaluated in terms of isolated example sentences obtained through elicitation. This paper offers a systematic re-evaluation of this long-standing question: using 981,076 nominal phrases from a 489 MB dependency-parsed word corpus, corresponding extracted contextual word embeddings from a Chinese BERT model, and information-theoretic measures of mutual information, we show that mensural classifiers can be distributionally and functionally distinguished from sortal classifiers, justifying the existence of distinct syntactic categories for mensural and sortal classifiers. Our study also entails broader implications for the typological study of classifier systems.

A Pipeline for Extracting Abstract Dependency Templates for Data-to-Text Natural Language Generation
Simon Mille | Josep Ricci | Alexander Shvets | Anya Belz

We present work in progress that aims to address the coverage issue faced by rule-based text generators. We propose a pipeline for extracting abstract dependency templates (predicate-argument structures) from Wikipedia text to be used as input for generating text from structured data with the FORGe system. The pipeline comprises three main components: (i) candidate sentence retrieval, (ii) clause extraction, ranking and selection, and (iii) conversion to predicate-argument form. We present an approach and a preliminary evaluation for the ranking and selection module.