Anastasia Semikozova
2025
The PRECOM-SM Corpus: Gambling in Spanish Social Media
Pablo Álvarez-Ojeda | María Victoria Cantero-Romero | Anastasia Semikozova | Arturo Montejo-Raez
Proceedings of the 31st International Conference on Computational Linguistics
Gambling addiction is a “silent problem” in society, particularly among young people in recent years, owing to the easy access to betting and gambling sites on the Internet through smartphones and personal computers. As online communities in messaging apps, forums and other sites where teenagers gather keep growing, more text becomes available for studying the problem. This work collects text from Spanish-speaking online communities and analyses it to find patterns in the written language of frequent and infrequent users of the collected platforms, so that an emerging gambling addiction problem can be detected. The paper introduces the newly built corpus and describes in detail how it was constructed. It also reports baseline experiments on the data, in which features generated from the text analysis are used with different machine learning approaches, such as the bag-of-words model and deep neural network encodings.
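As a rough illustration of what a bag-of-words baseline involves, the sketch below counts tokens per message and assigns a new message to the class whose summed counts are closest by cosine similarity. It is a minimal sketch, not the authors' pipeline: the nearest-centroid classifier, the whitespace tokenizer, the label names `frequent` and `infrequent`, and the toy messages are all assumptions made for illustration, and none of the text comes from the PRECOM-SM corpus.

```python
from collections import Counter
import math


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # likely use a Spanish-aware tokenizer.
    return text.lower().split()


def bag_of_words(text):
    # Map each token to the number of times it occurs in the text.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train_centroids(examples):
    # Sum the bag-of-words vectors of all training texts for each label.
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def predict(centroids, text):
    # Return the label whose summed vector is most similar to the text.
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical toy messages, invented for this sketch only.
TOY_EXAMPLES = [
    ("aposté todo el sueldo otra vez", "frequent"),
    ("otra apuesta perdida necesito recuperar el dinero", "frequent"),
    ("vi el partido con amigos", "infrequent"),
    ("hoy no juego solo miro el partido", "infrequent"),
]

centroids = train_centroids(TOY_EXAMPLES)
label = predict(centroids, "necesito recuperar el sueldo")
```

Sparse `Counter` vectors keep the example free of external dependencies; a practical baseline would more likely use weighted features and a trained classifier from a machine learning library.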