Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students
Alejandro Dorantes | Gerardo Sierra | Tlauhlia Yamín Donohue Pérez | Gemma Bel-Enguix | Mónica Jasso Rosales
Proceedings of the Sixth International Workshop on Natural Language Processing for Social Media
This work presents the Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students, a corpus of raw data for general use. Its purpose is to offer data for the study of language and interactions via Instant Messaging (IM) among undergraduate students. Our paper gives an overview of both the corpus’s content and its demographic metadata. Furthermore, it presents the research currently being conducted with the corpus, namely on parenthetical expressions, orality traits, and code-switching. This work also includes a brief outline of similar corpora and recent studies in the field of IM.