Investigating Dominant Word Order on Universal Dependencies with Graph Rewriting

Hee-Soo Choi, Bruno Guillaume, Karën Fort, Guy Perrier


Abstract
This paper details experiments we performed on the Universal Dependencies 2.7 corpora in order to investigate the dominant word order in the available languages. For this purpose, we used a graph rewriting tool, GREW, which allowed us to go beyond the surface annotations and identify the implicit subjects. We first measured the distribution of the six different word orders (SVO, SOV, VSO, VOS, OVS, OSV) in the corpora and investigated when there was a significant difference in the corpora within a given language. Then, we compared the obtained results with information provided in the WALS database (Dryer and Haspelmath, 2013) and in (Östling, 2015). Finally, we examined the impact of using a graph rewriting tool for this task. The tools and resources used for this research are all freely available.
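As a rough illustration of what measuring the distribution of the six word orders can look like, the sketch below counts surface orders of subject, verb, and object in CoNLL-U data. It is not the authors' GREW-based method: it uses plain Python, looks only at explicit `nsubj` and `obj` dependents of tokens tagged `VERB`, and therefore does not recover implicit subjects. The function names `read_sentences` and `count_word_orders` and the three sample sentences are hypothetical and were made up for this example.

```python
from collections import Counter

# Hand-made sample in CoNLL-U column order (hypothetical, not from a UD treebank).
# Columns are written with spaces for readability and converted to tabs below.
SAMPLE = """\
1 Mary Mary PROPN _ _ 2 nsubj _ _
2 reads read VERB _ _ 0 root _ _
3 books book NOUN _ _ 2 obj _ _

1 books book NOUN _ _ 3 obj _ _
2 Mary Mary PROPN _ _ 3 nsubj _ _
3 reads read VERB _ _ 0 root _ _

1 John John PROPN _ _ 2 nsubj _ _
2 eats eat VERB _ _ 0 root _ _
3 apples apple NOUN _ _ 2 obj _ _
""".replace(" ", "\t")


def read_sentences(conllu_text):
    """Yield each sentence as a list of (id, upos, head, deprel) tuples."""
    sentence = []
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        # Skip multiword token ranges (e.g. 1-2), empty nodes (e.g. 1.1),
        # and tokens without a numeric HEAD.
        if not cols[0].isdigit() or not cols[6].isdigit():
            continue
        sentence.append((int(cols[0]), cols[3], int(cols[6]), cols[7]))
    if sentence:
        yield sentence


def count_word_orders(conllu_text):
    """Count surface orders (SVO, SOV, ...) for verbs with one nsubj and one obj."""
    counts = Counter()
    for sentence in read_sentences(conllu_text):
        for tok_id, upos, _, _ in sentence:
            if upos != "VERB":
                continue
            subjects = [i for i, _, head, rel in sentence
                        if head == tok_id and rel == "nsubj"]
            objects = [i for i, _, head, rel in sentence
                       if head == tok_id and rel == "obj"]
            # Assumption: keep only clauses with exactly one explicit S and O.
            if len(subjects) != 1 or len(objects) != 1:
                continue
            ordered = sorted([(subjects[0], "S"), (tok_id, "V"), (objects[0], "O")])
            counts["".join(label for _, label in ordered)] += 1
    return counts


if __name__ == "__main__":
    print(count_word_orders(SAMPLE).most_common())
```

On real treebanks, a count like this would miss every clause whose subject is not an explicit dependent of the verb, which is the kind of case the abstract says graph rewriting was used to address.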
Anthology ID:
2021.ranlp-1.33
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
281–290
URL:
https://aclanthology.org/2021.ranlp-1.33
Cite (ACL):
Hee-Soo Choi, Bruno Guillaume, Karën Fort, and Guy Perrier. 2021. Investigating Dominant Word Order on Universal Dependencies with Graph Rewriting. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 281–290, Held Online. INCOMA Ltd.
Cite (Informal):
Investigating Dominant Word Order on Universal Dependencies with Graph Rewriting (Choi et al., RANLP 2021)
PDF:
https://aclanthology.org/2021.ranlp-1.33.pdf
Data
Universal Dependencies