Yuchen Wei


2024

Sense of the Day: Short Timeframe Temporal-Aware Word Sense Disambiguation
Yuchen Wei | Milton King
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

The predominant sense of a lemma can vary with the timeframe (years, decades, centuries) in which the text was written. In our work, we explore the predominant sense within shorter timeframes (days, months, seasons, etc.) and find that different short timeframes can have predominant senses that differ from each other and from the predominant sense of a corpus. Leveraging the predominant sense and sense distribution of a short timeframe, we design short-timeframe temporal-aware word sense disambiguation (WSD) models that outperform a temporal-agnostic model. Likewise, author-aware WSD models tend to outperform author-agnostic models, so we augment our temporal-aware models with knowledge of author-level predominant senses and sense distributions to create temporal- and author-aware WSD models. In addition, we find that considering recent usages of a lemma by the same author can assist a WSD model. Our approach requires only a small amount of text from authors and timeframes.

Team AT at SemEval-2024 Task 8: Machine-Generated Text Detection with Semantic Embeddings
Yuchen Wei
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This study investigates the detection of machine-generated text using several semantic embedding techniques, a critical issue in the era of advanced language models. Several methods were compared against a fine-tuned RoBERTa baseline: GloVe embeddings, N-gram embedding models, Sentence BERT, and a concatenated embedding approach. The research was conducted within the framework of SemEval-2024 Task 8, which encompasses tasks for binary and multi-class classification of machine-generated text.

2023

StFX NLP at SemEval-2023 Task 1: Multimodal Encoding-based Methods for Visual Word Sense Disambiguation
Yuchen Wei | Milton King
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

SemEval-2023 Task 1, Visual Word Sense Disambiguation, combines text semantics and visual semantics: the goal is to select, from a list of candidates, the image that best exhibits a given target word in a small context. We tried several methods, including an image captioning method and CLIP-based methods, and submitted our predictions to the competition for this task. This paper describes the methods we used, reports their performance, and provides an analysis and discussion of the results.