A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque

Itziar Gonzalez-Dios, María Jesús Aranzabe, Arantza Díaz de Ilarraza


Abstract
In this paper, we present a comparative analysis of statistically predictive syntactic features of complexity and the treatment of these features by humans when simplifying texts. To that end, we have used a list of the five most statistically predictive features, obtained automatically, and the Corpus of Basque Simplified Texts (CBST) to analyse how the syntactic phenomena in these features have been manually simplified. Our aim is to go beyond the descriptions of operations found in the corpus and relate the multidisciplinary findings in order to understand text complexity from different points of view. We also present some issues that can be important when analysing linguistic complexity.
Anthology ID:
W16-4110
Volume:
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)
Month:
December
Year:
2016
Address:
Osaka, Japan
Venues:
CL4LC | WS
Publisher:
The COLING 2016 Organizing Committee
Pages:
89–97
URL:
https://aclanthology.org/W16-4110
Cite (ACL):
Itziar Gonzalez-Dios, María Jesús Aranzabe, and Arantza Díaz de Ilarraza. 2016. A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque. In Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC), pages 89–97, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque (Gonzalez-Dios et al., 2016)
PDF:
https://aclanthology.org/W16-4110.pdf