Silvan Wehrli

2024

pdf bib abs
Guiding Sentiment Analysis with Hierarchical Text Clustering: Analyzing the German X/Twitter Discourse on Face Masks in the 2020 COVID-19 Pandemic
Silvan Wehrli | Chisom Ezekannagha | Georges Hattab | Tamara Boender | Bert Arnrich | Christopher Irrgang
Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

Social media are a critical component of the information ecosystem during public health crises. Understanding the public discourse is essential for effective communication and misinformation mitigation. Computational methods can aid these efforts through online social listening. We combined hierarchical text clustering and sentiment analysis to examine the face mask-wearing discourse in Germany during the COVID-19 pandemic using a dataset of 353,420 German X (formerly Twitter) posts from 2020. For sentiment analysis, we annotated a subsample of the data to train a neural network for classifying the sentiments of posts (neutral, negative, or positive). In combination with clustering, this approach uncovered sentiment patterns of different topics and their subtopics, reflecting the online public response to mask mandates in Germany. We show that our approach can be used to examine long-term narratives and sentiment dynamics and to identify specific topics that explain peaks of interest in the social media discourse.

2023

pdf bib
German Text Embedding Clustering Benchmark
Silvan Wehrli | Bert Arnrich | Christopher Irrgang
Proceedings of the 19th Conference on Natural Language Processing (KONVENS 2023)

2022

pdf bib abs
CLUZH at SIGMORPHON 2022 Shared Tasks on Morpheme Segmentation and Inflection Generation
Silvan Wehrli | Simon Clematide | Peter Makarov
Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper describes the submissions of the team of the Department of Computational Linguistics, University of Zurich, to the SIGMORPHON 2022 Shared Tasks on Morpheme Segmentation and Inflection Generation. Our submissions use a character-level neural transducer that operates over traditional edit actions. While this model has been found particularly wellsuited for low-resource settings, using it with large data quantities has been difficult. Existing implementations could not fully profit from GPU acceleration and did not efficiently implement mini-batch training, which could be tricky for a transition-based system. For this year’s submission, we have ported the neural transducer to PyTorch and implemented true mini-batch training. This has allowed us to successfully scale the approach to large data quantities and conduct extensive experimentation. We report competitive results for morpheme segmentation (including sharing first place in part 2 of the challenge). We also demonstrate that reducing sentence-level morpheme segmentation to a word-level problem is a simple yet effective strategy. Additionally, we report strong results in inflection generation (the overall best result for large training sets in part 1, the best results in low-resource learning trajectories in part 2). Our code is publicly available.

Co-authors

Georges Hattab 1

Peter Makarov 1

Venues

Fix data