Yuxuan Chen


A Comparative Study of Pre-trained Encoders for Low-Resource Named Entity Recognition
Yuxuan Chen | Jonas Mikkelsen | Arne Binder | Christoph Alt | Leonhard Hennig
Proceedings of the 7th Workshop on Representation Learning for NLP

Pre-trained language models (PLMs) are effective components of few-shot named entity recognition (NER) approaches when augmented with continued pre-training on task-specific out-of-domain data or fine-tuning on in-domain data. However, their performance in low-resource scenarios, where such data is not available, remains an open question. We introduce an encoder evaluation framework and use it to systematically compare the performance of state-of-the-art pre-trained representations on the task of low-resource NER. We analyze a wide range of encoders pre-trained with different strategies, model architectures, intermediate-task fine-tuning, and contrastive learning. Our experimental results across ten benchmark NER datasets in English and German show that encoder performance varies significantly, suggesting that the choice of encoder for a specific low-resource scenario needs to be carefully evaluated.

Why only Micro-F1? Class Weighting of Measures for Relation Classification
David Harbecke | Yuxuan Chen | Leonhard Hennig | Christoph Alt
Proceedings of NLP Power! The First Workshop on Efficient Benchmarking in NLP

Relation classification models are conventionally evaluated using only a single measure, e.g., micro-F1, macro-F1 or AUC. In this work, we analyze weighting schemes, such as micro and macro, for imbalanced datasets. We introduce a framework for weighting schemes, where existing schemes are extremes, and two new intermediate schemes. We show that reporting results of different weighting schemes better highlights strengths and weaknesses of a model.
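
To make the two existing extremes concrete, the following is a minimal sketch of micro-F1 and macro-F1 for single-label classification. It does not reproduce the paper's framework or its two intermediate schemes; the function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def per_class_counts(gold, pred):
    """Count true positives, false positives and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[g] += 1  # gold label g was missed
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(gold, pred):
    """Pool counts over all classes, then compute one F1 (instance-weighted)."""
    tp, fp, fn = per_class_counts(gold, pred)
    return f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))


def macro_f1(gold, pred):
    """Compute F1 per class, then average with equal weight per class."""
    tp, fp, fn = per_class_counts(gold, pred)
    labels = sorted(set(gold) | set(pred))
    return sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)


# Hypothetical imbalanced toy data: 8 instances of "a", 2 of "b".
gold = ["a"] * 8 + ["b"] * 2
pred = ["a"] * 8 + ["a", "b"]

print(micro_f1(gold, pred))
print(macro_f1(gold, pred))
```

On this toy data micro-F1 is 0.9 while macro-F1 is about 0.80, because the single error falls on the rare class "b" and macro averaging gives that class the same weight as the frequent one.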

Multilingual Relation Classification via Efficient and Effective Prompting
Yuxuan Chen | David Harbecke | Leonhard Hennig
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Prompting pre-trained language models has achieved impressive performance on various NLP tasks, especially in low data regimes. Despite the success of prompting in monolingual settings, applying prompt-based methods in multilingual scenarios has been limited to a narrow set of tasks, due to the high cost of handcrafting multilingual prompts. In this paper, we present the first work on prompt-based multilingual relation classification (RC), by introducing an efficient and effective method that constructs prompts from relation triples and involves only minimal translation for the class labels. We evaluate its performance in fully supervised, few-shot and zero-shot scenarios, and analyze its effectiveness across 14 languages, prompt variants, and English-task training in cross-lingual settings. We find that in both fully supervised and few-shot scenarios, our prompt method beats competitive baselines: fine-tuning XLM-R_EM and null prompts. It also outperforms the random baseline by a large margin in zero-shot experiments. Our method requires little in-language knowledge and can be used as a strong baseline for similar multilingual classification tasks.


Query-Key Normalization for Transformers
Alex Henry | Prudhvi Raj Dachapally | Shubham Shantaram Pawar | Yuxuan Chen
Findings of the Association for Computational Linguistics: EMNLP 2020

Low-resource language translation is a challenging but socially valuable NLP task. Building on recent work adapting the Transformer’s normalization to this setting, we propose QKNorm, a normalization technique that modifies the attention mechanism to make the softmax function less prone to arbitrary saturation without sacrificing expressivity. Specifically, we apply l2-normalization along the head dimension of each query and key matrix prior to multiplying them and then scale up by a learnable parameter instead of dividing by the square root of the embedding dimension. We show improvements averaging 0.928 BLEU over state-of-the-art bilingual benchmarks for 5 low-resource translation pairs from the TED Talks corpus and IWSLT’15.
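
The normalization step described in the abstract can be sketched for a single attention head as follows. This is a plain-Python illustration under stated assumptions, not the authors' implementation: the scale `g` is a fixed float here (in the paper it is a learnable parameter), there is no masking or multi-head handling, and all function names are hypothetical.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def qknorm_attention(queries, keys, values, g):
    """Single-head attention with L2-normalized queries and keys.

    The dot product of unit vectors is a cosine similarity in [-1, 1];
    it is multiplied by g instead of being divided by sqrt(d).
    Returns (outputs, attention_weights).
    """
    q_hat = [l2_normalize(q) for q in queries]
    k_hat = [l2_normalize(k) for k in keys]
    outputs, all_weights = [], []
    for q in q_hat:
        scores = [g * sum(a * b for a, b in zip(q, k)) for k in k_hat]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output dimension at a time.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical toy inputs: 2 queries, 2 keys, 2 values, head dimension 2.
queries = [[1.0, 0.0], [3.0, 4.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
outputs, weights = qknorm_attention(queries, keys, values, g=2.0)
print(weights)
```

Because queries and keys are normalized, every pre-softmax score lies in [-g, g] regardless of the vectors' magnitudes, so multiplying a query by 100 leaves the attention weights unchanged. That bound is what keeps the softmax from saturating arbitrarily in this sketch.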