Profiling-UD: a Tool for Linguistic Profiling of Texts

Dominique Brunato, Andrea Cimino, Felice Dell’Orletta, Giulia Venturi, Simonetta Montemagni


Abstract
In this paper, we introduce Profiling-UD, a new text analysis tool inspired by the principles of linguistic profiling that can support language variation research from different perspectives. It allows the extraction of more than 130 features spanning different levels of linguistic description. Beyond the large number of features that can be monitored, a key novelty of Profiling-UD is that it has been specifically devised to be multilingual, since it is based on the Universal Dependencies framework. In the second part of the paper, we demonstrate the effectiveness of these features in a number of theoretical and applied studies in which they were successfully used for text and author profiling.
Anthology ID:
2020.lrec-1.883
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
7145–7151
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.883
Cite (ACL):
Dominique Brunato, Andrea Cimino, Felice Dell’Orletta, Giulia Venturi, and Simonetta Montemagni. 2020. Profiling-UD: a Tool for Linguistic Profiling of Texts. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 7145–7151, Marseille, France. European Language Resources Association.
Cite (Informal):
Profiling-UD: a Tool for Linguistic Profiling of Texts (Brunato et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.883.pdf