Aileen Joan Vicente
2024
Language Identification of Philippine Creole Spanish: Discriminating Chavacano From Related Languages
Aileen Joan Vicente | Charibeth Cheng
Proceedings of the Eleventh Workshop on NLP for Similar Languages, Varieties, and Dialects (VarDial 2024)
Chavacano is a Spanish Creole widely spoken in the southern regions of the Philippines. It is one of the many Philippine languages yet to be studied computationally. This paper presents a language identification model for Chavacano, built with character convolutional networks, that distinguishes it from the languages that influence its creolization. Unlike studies that discriminated similar languages based on geographical proximity, this paper considers similarity that arises from the creolization of a language. We established the similarity of Chavacano and its related languages (Spanish, Portuguese, Cebuano, and Hiligaynon) from the number of words they share in the corpus. We report an accuracy of 93% for the model generated using ten filters with a filter width of 5. The training experiments reveal that increasing the filter width, number of filters, or training epochs is unnecessary, even if the accuracy increases, because the resulting models show irregular learning behavior or may already be overfitted. This study also demonstrates that the character features extracted by convolutional neural networks, similar to n-grams, are sufficient for identifying Chavacano. Future work on the language identification of Chavacano includes improving classification accuracy for short or code-switched texts for practical applications such as social media sensors for disaster response and management.
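The abstract says similarity was established from the number of common words across the language corpora, but it does not give the exact procedure. The following is a minimal sketch of one plausible reading, assuming "common words" means word types that occur in both corpora. The function names, the tokenization rule, and the toy sentences are hypothetical and are not taken from the paper.

```python
import re


def vocabulary(text):
    """Return the set of lowercase word types in a text (letters only)."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def common_word_count(corpus_a, corpus_b):
    """Count the word types that appear in both corpora."""
    return len(vocabulary(corpus_a) & vocabulary(corpus_b))


# Made-up toy snippets for illustration only; not data from the paper.
samples = {
    "lang_a": "el gato negro come pan",
    "lang_b": "el perro negro come carne",
    "lang_c": "ang iro nagkaon og pan",
}

# Pairwise counts of shared word types between the toy corpora.
names = sorted(samples)
pairwise = {
    (a, b): common_word_count(samples[a], samples[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
```

Under this reading, a higher count for a pair of corpora suggests a closer lexical relationship. Real corpora of different sizes would likely need the counts normalized, which this sketch does not attempt.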
2017
#ActuallyDepressed: Characterization of Depressed Tumblr Users’ Online Behavior from Rules Generation Machine Learning Technique
Czarina Rae Cahutay | Aileen Joan Vicente
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation
Automatic Categorization of Tagalog Documents Using Support Vector Machines
April Dae Bation | Aileen Joan Vicente | Erlyn Manguilimotan
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation