Ryszard Staruch
2025
Oddballness: universal anomaly detection with language models
Filip Gralinski | Ryszard Staruch | Krzysztof Jurkiewicz
Proceedings of the 31st International Conference on Computational Linguistics
We present a new method to detect anomalies in texts (and, in general, in sequences of any data) using language models, in a totally unsupervised manner. The method considers probabilities (likelihoods) generated by a language model, but instead of focusing on low-likelihood tokens, it uses a new metric introduced in this paper: oddballness. Oddballness measures how “strange” a given token is according to the language model. We demonstrate on grammatical error detection tasks (a specific case of text anomaly detection) that oddballness outperforms simply flagging low-likelihood events, if a totally unsupervised setup is assumed.
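The intuition can be sketched on a toy next-token distribution. The `oddballness` function below is an illustrative assumption, not necessarily the paper's exact definition: it measures how much probability mass sits strictly above the observed token's probability, so a token counts as "odd" when many alternatives were far more likely, not merely when its own probability is low.

```python
# Illustrative sketch only; the exact oddballness formula is defined in the
# paper, and this toy distribution stands in for a real language model.

def oddballness(dist, token):
    """Total probability mass by which other outcomes exceed the observed token."""
    p = dist[token]
    return sum(max(0.0, q - p) for q in dist.values())

# Hypothetical next-token distribution after the prefix "I have a ..."
dist = {"dog": 0.6, "cat": 0.3, "carburetor": 0.1}

oddballness(dist, "dog")         # 0.0 -- the most likely token is never odd
oddballness(dist, "carburetor")  # (0.6-0.1) + (0.3-0.1) = 0.7
```

Note how the score separates "merely improbable" from "anomalous": a token in a flat distribution has low probability but low oddballness, whereas a low-probability token in a peaked distribution scores high.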
Adapting LLMs for Minimal-edit Grammatical Error Correction
Ryszard Staruch | Filip Gralinski | Daniel Dzienisiewicz
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Decoder-only large language models have shown superior performance on fluency-edit English Grammatical Error Correction, but their adaptation to minimal-edit English GEC is still underexplored. To improve their effectiveness in the minimal-edit approach, we explore error-rate adaptation and propose a novel training-schedule method. Our experiments set a new state-of-the-art result for a single-model system on the BEA-test set. We also detokenize the most common English GEC datasets to match the natural way of writing text, and in the process we find errors in them. Our experiments analyze whether training on detokenized datasets affects the results and measure the impact of using datasets with corrected erroneous examples. To facilitate reproducibility, we have released the source code used to train our models.
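The detokenization mentioned above can be illustrated with a hypothetical minimal rule-based pass that rejoins tokenized punctuation and contractions; the paper's actual procedure may differ and handle many more cases.

```python
import re

# Hypothetical minimal detokenizer, sketching the kind of normalisation the
# abstract describes: GEC corpora are often distributed pre-tokenized
# ("I do n't know ."), which does not match the natural way of writing text.

def detokenize(text):
    # Reattach split contractions such as "do n't" -> "don't"
    text = re.sub(r"\s+(n't|'s|'re|'ve|'ll|'d|'m)\b", r"\1", text)
    # Remove spaces before sentence punctuation
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    return text

detokenize("I do n't know .")  # -> "I don't know."
```

A rule set like this is lossy in corner cases (quotes, hyphens, non-English text), which is one reason a careful detokenization pass can surface pre-existing errors in the datasets.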
PolEval 2025
Łukasz Kobyliński | Ryszard Staruch | Alina Wróblewska | Maciej Ogrodniczuk
Proceedings of the PolEval 2025 Workshop
PolEval is an annual shared-task evaluation campaign dedicated to advancing natural language processing for the Polish language. This paper presents an overview of PolEval 2025, the eighth edition of the campaign, which included three completed tasks covering machine-generated text detection, gender-inclusive language generation, and speech emotion recognition. The evaluation was conducted using standardised datasets and metrics via the AmuEval platform. PolEval 2025 attracted 15 teams and over 100 submissions, demonstrating continued engagement from the Polish NLP community. We describe the organisation of the campaign, the evaluation setup, and the role of PolEval in fostering reproducible research and community-driven benchmarking.