Julius Gonsior
2023
Mr-Fosdick at SemEval-2023 Task 5: Comparing Dataset Expansion Techniques for Non-Transformer and Transformer Models: Improving Model Performance through Data Augmentation
Christian Falkenberg | Erik Schönwälder | Tom Rietzke | Chris-Andris Görner | Robert Walther | Julius Gonsior | Anja Reusch
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
In supervised learning, a significant amount of data is essential. To obtain it, we generated and evaluated datasets based on the provided dataset using transformer and non-transformer models. By utilizing these generated datasets during the training of new models, we attain a higher balanced accuracy during validation compared to using only the original dataset.
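The sketch below illustrates the kind of comparison the abstract describes, assuming lists of original, generated, and validation examples prepared elsewhere; it is not the authors' code, and the simple TF-IDF classifier stands in for whatever models they actually trained.

```python
# Minimal sketch (not the authors' system): train on the original data alone
# versus original + generated examples, and compare balanced accuracy on a
# held-out validation set. The classifier and variable names are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.pipeline import make_pipeline


def validate(train_texts, train_labels, val_texts, val_labels):
    """Fit a simple text classifier and return balanced accuracy on validation."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(train_texts, train_labels)
    return balanced_accuracy_score(val_labels, model.predict(val_texts))


# original_*, generated_*, val_* are hypothetical lists prepared elsewhere.
# score_original = validate(original_texts, original_labels, val_texts, val_labels)
# score_augmented = validate(original_texts + generated_texts,
#                            original_labels + generated_labels,
#                            val_texts, val_labels)
# print(f"balanced accuracy: {score_original:.3f} vs. {score_augmented:.3f}")
```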
Sabrina Spellman at SemEval-2023 Task 5: Discover the Shocking Truth Behind this Composite Approach to Clickbait Spoiling!
Simon Birkenheuer | Jonathan Drechsel | Paul Justen | Jimmy Pöhlmann | Julius Gonsior | Anja Reusch
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
This paper describes an approach to automatically close the knowledge gap of Clickbait-Posts via a transformer model trained for Question-Answering, augmented by a task-specific post-processing step. This was part of the SemEval 2023 Clickbait shared task (Fröbe et al., 2023a) - specifically task 2. We devised strategies to improve the existing model to fit the task better, e.g. with different special models and a post-processor tailored to different inherent challenges of the task. Furthermore, we explored the possibility of expanding the original training data by using strategies from Heuristic Labeling and Semi-Supervised Learning. With those adjustments, we were able to improve the baseline by 9.8 percentage points to a BLEU-4 score of 48.0%.
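As a rough illustration of the general pattern the abstract describes (extractive question answering over a clickbait post plus a post-processing step), here is a minimal sketch. The model checkpoint and the cleanup rules are assumptions for illustration, not the authors' configuration.

```python
# Minimal sketch (assumptions, not the authors' system): treat the clickbait
# title as the question and the linked article as the context, extract an
# answer span with a QA transformer, then apply a simple post-processing step.
from transformers import pipeline

# Illustrative checkpoint; the paper's task-specific models may differ.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")


def postprocess(spoiler: str) -> str:
    """Illustrative cleanup: trim whitespace and dangling punctuation."""
    return spoiler.strip().strip(",;:")


def spoil(post_title: str, linked_article: str) -> str:
    """Produce a spoiler for a clickbait post from its linked article."""
    answer = qa(question=post_title, context=linked_article)
    return postprocess(answer["answer"])
```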