Kevin Glocker
2022
Hierarchical Multi-Task Transformers for Crosslingual Low Resource Phoneme Recognition
Kevin Glocker | Munir Georges
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)
2020
TëXtmarkers at SemEval-2020 Task 10: Emphasis Selection with Agreement Dependent Crowd Layers
Kevin Glocker | Stefanos Andreas Markianos Wright
Proceedings of the Fourteenth Workshop on Semantic Evaluation
In visual communication, the ability of a short piece of text to catch someone’s eye in a single glance or from a distance is of paramount importance. In our approach to the SemEval-2020 task “Emphasis Selection For Written Text in Visual Media”, we use contextualized word representations from a pretrained model of the state-of-the-art BERT architecture together with a stacked bidirectional GRU network to predict token-level emphasis probabilities. To tackle low inter-annotator agreement in the dataset, we model multiple annotators jointly by introducing initialization with agreement-dependent noise to a crowd layer architecture. We found that this approach both performs substantially better than initialization with identities and outperforms a baseline trained with token-level majority voting. Our submitted system reaches a substantially higher Match_m on the development set than the task baseline (0.779), but only slightly outperforms the test set baseline (0.754) using a three-model ensemble.
2019
Tüpa at SemEval-2019 Task1: (Almost) feature-free Semantic Parsing
Tobias Pütz | Kevin Glocker
Proceedings of the 13th International Workshop on Semantic Evaluation
Our submission for Task 1 ‘Cross-lingual Semantic Parsing with UCCA’ at SemEval-2019 is a feed-forward neural network that builds upon an existing state-of-the-art transition-based directed acyclic graph parser. We replace most of its features with deep contextualized word embeddings and introduce an approximation that represents non-terminal nodes in the graph as an aggregation of their terminal children. We further demonstrate that augmenting the data using the baseline systems provides a consistent advantage in all open submission tracks. We submitted results to all open tracks (English in-domain and out-of-domain, German in-domain, and French in-domain low-resource). Our system achieves competitive performance in all settings except French, where we did not augment the data. Post-evaluation experiments showed that data augmentation is especially crucial in this setting.