Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts

Tatiana Litvinova, Pavel Seredin, Olga Litvinova, Olga Zagorovskaya


Abstract
The differences in the frequencies of some parts of speech (POS), particularly function words, and lexical diversity in male and female speech have been pointed out in a number of papers. The classifiers using exclusively context-independent parameters have proved to be highly effective. However, there are still issues that have to be addressed as a lot of studies are performed for English and the genre and topic of texts are sometimes neglected. The aim of this paper is to investigate the association between context-independent parameters of Russian written texts and the gender of their authors and to design predictive regression models. A number of correlations were found. The obtained data is in good agreement with the results obtained for other languages. The model based on 5 parameters with the highest correlation coefficients was designed.
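For readers unfamiliar with the measure named in the title, type-token ratio is commonly computed as the number of distinct word forms (types) divided by the total number of word tokens. The sketch below illustrates that common definition only. It assumes simple regex tokenization, lowercasing, and no lemmatization, none of which is stated in the abstract, and the function name `type_token_ratio` is hypothetical rather than taken from the paper.

```python
import re


def type_token_ratio(text: str) -> float:
    """Return distinct word forms divided by total word tokens.

    Assumption: tokens are runs of word characters, compared after
    lowercasing. This is an illustrative choice, not the paper's setup.
    """
    # In Python 3, \w is Unicode-aware for str patterns, so Cyrillic
    # words are matched as well (digits and underscores match too).
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        # Avoid division by zero on empty or punctuation-only input.
        return 0.0
    return len(set(tokens)) / len(tokens)
```

Because this ratio tends to fall as texts get longer, comparisons are usually made between texts of similar length.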
Anthology ID:
W17-4909
Volume:
Proceedings of the Workshop on Stylistic Variation
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Julian Brooke, Thamar Solorio, Moshe Koppel
Venue:
Style-Var
Publisher:
Association for Computational Linguistics
Pages:
69–73
URL:
https://aclanthology.org/W17-4909
DOI:
10.18653/v1/W17-4909
Cite (ACL):
Tatiana Litvinova, Pavel Seredin, Olga Litvinova, and Olga Zagorovskaya. 2017. Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts. In Proceedings of the Workshop on Stylistic Variation, pages 69–73, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Differences in type-token ratio and part-of-speech frequencies in male and female Russian written texts (Litvinova et al., Style-Var 2017)
PDF:
https://aclanthology.org/W17-4909.pdf