2021
DUTH at SemEval-2021 Task 7: Is Conventional Machine Learning for Humorous and Offensive Tasks enough in 2021?
Alexandros Karasakalidis | Dimitrios Effrosynidis | Avi Arampatzis
Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021)
This paper describes the approach developed by the DUTH team for SemEval-2021 Task 7 (HaHackathon: Incorporating Demographic Factors into Shared Humor Tasks). We compared a variety of preprocessing techniques, vectorization methods, and conventional machine learning algorithms to construct classification and regression models for the given tasks. To improve our system’s performance, we combined the models’ outputs with those of small neural networks (NNs), using majority voting for the classification tasks and the mean for the regression tasks. Although these methods proved weaker than modern deep learning models, they remain relevant in research because they require less computational power and train faster.
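The ensembling step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the toy predictions are hypothetical, and the sketch only assumes that each model emits one label (classification) or one score (regression) per sample.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of lists, one inner list per model, equal lengths.
    With an odd number of models and binary labels no ties can occur.
    """
    combined = []
    for sample_votes in zip(*predictions):
        # Pick the label that most models agree on for this sample.
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined


def mean_ensemble(predictions):
    """Combine per-model score lists by averaging them per sample."""
    return [mean(sample_scores) for sample_scores in zip(*predictions)]


# Hypothetical outputs of three models (illustrative values only).
clf_outputs = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
reg_outputs = [
    [2.0, 1.0],
    [3.0, 1.5],
    [4.0, 2.0],
]

votes = majority_vote(clf_outputs)
scores = mean_ensemble(reg_outputs)
print(votes)
print(scores)
```

Using an odd number of voters is a common way to avoid ties in binary tasks; how the authors handled ties is not stated in the abstract.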
2018
DUTH at SemEval-2018 Task 2: Emoji Prediction in Tweets
Dimitrios Effrosynidis | Georgios Peikos | Symeon Symeonidis | Avi Arampatzis
Proceedings of the 12th International Workshop on Semantic Evaluation
This paper describes the approach developed by the DUTH team for SemEval-2018 Task 2 (Multilingual Emoji Prediction). First, we applied a combination of preprocessing techniques to reduce the noise in tweets and produce a number of features. Then, we built several N-grams to represent combinations of words and emojis. Finally, we trained our system with a tuned LinearSVC classifier. Our approach ranked 18th among 48 teams on the leaderboard.
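As a rough illustration of the N-gram step, the sketch below builds word-level unigrams and bigrams from a tweet and treats an emoji as an ordinary token. The helper names, the whitespace tokenization, and the sample tweet are assumptions made for this example; the abstract does not specify how the authors constructed their N-grams.

```python
def word_ngrams(tokens, n):
    """Return all contiguous n-grams of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Lowercase, split on whitespace, and collect 1..max_n word n-grams.

    Assumption: emojis are separated from words by spaces, so each emoji
    becomes its own token and can appear inside a bigram with a word.
    """
    tokens = text.lower().split()
    features = []
    for n in range(1, max_n + 1):
        features.extend(word_ngrams(tokens, n))
    return features


# Hypothetical tweet, for illustration only.
features = ngram_features("Good morning ☀ everyone")
print(features)
```

Features of this kind would typically be counted or TF-IDF weighted before being passed to a linear classifier such as the LinearSVC named in the abstract.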
2017
DUTH at SemEval-2017 Task 4: A Voting Classification Approach for Twitter Sentiment Analysis
Symeon Symeonidis | Dimitrios Effrosynidis | John Kordonis | Avi Arampatzis
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
This report describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter, specifically in subtasks A, B, and C. Our approach to text sentiment classification is based on a majority vote scheme and combines supervised machine learning methods with classical linguistic resources, including bag-of-words and sentiment lexicon features.
DUTH at SemEval-2017 Task 5: Sentiment Predictability in Financial Microblogging and News Articles
Symeon Symeonidis | John Kordonis | Dimitrios Effrosynidis | Avi Arampatzis
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)
We present the system developed by the DUTH team for SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News, subtasks A and B. Our approach to determining the sentiment of microblog messages and of news statements and headlines is based on linguistic preprocessing, feature engineering, and supervised machine learning techniques. To forecast sentiment scores, we trained Neural Network Regression, Linear Regression, Boosted Decision Tree Regression, and Decision Forest Regression models. Finally, we present an error measure intended to improve the performance of the system's forecasting methods.