Samantha Kent
2022
Fraunhofer FKIE @ SMM4H 2022: System Description for Shared Tasks 2, 4 and 9
Daniel Claeser | Samantha Kent
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
We present our results for shared tasks 2, 4 and 9 at the SMM4H Workshop at COLING 2022, achieved by successfully fine-tuning pre-trained language models on the downstream tasks. We identify the occurrence of code-switching in the test data for task 2 as a possible source of considerable performance degradation on the test set. We successfully exploit structural linguistic similarities in the datasets of tasks 4 and 9 for training on joined datasets, scoring first in task 9 and on par with SOTA in task 4.
2021
CASE 2021 Task 2 Socio-political Fine-grained Event Classification using Fine-tuned RoBERTa Document Embeddings
Samantha Kent | Theresa Krumbiegel
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)
We present our submission to Task 2 of the Socio-political and Crisis Events Detection Shared Task at the CASE @ ACL-IJCNLP 2021 workshop. The task aims at the fine-grained classification of socio-political events. Our best model was a fine-tuned RoBERTa transformer model using document embeddings. The corpus consisted of a balanced selection of sub-events extracted from the ACLED event dataset. We achieved a macro F-score of 0.923 and a micro F-score of 0.932 during our preliminary experiments on a held-out test set. The same model also performed best on the shared task test data (weighted F-score = 0.83). To analyze the results, we calculated the topic compactness of the commonly misclassified events and conducted an error analysis.
2018
Multilingual Named Entity Recognition on Spanish-English Code-switched Tweets using Support Vector Machines
Daniel Claeser | Samantha Kent | Dennis Felske
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
This paper describes our system submission for the ACL 2018 shared task on named entity recognition (NER) in code-switched Twitter data. Our best result (F1 = 53.65) was obtained using a Support Vector Machine (SVM) with 14 features combined with rule-based post-processing.