Aman Saini


2024

pdf bib
Spivavtor: An Instruction Tuned Ukrainian Text Editing Model
Aman Saini | Artem Chernodub | Vipul Raheja | Vivek Kulkarni
Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024

We introduce Spivavtor, a dataset, and instruction-tuned models for text editing focused on the Ukrainian language. Spivavtor is the Ukrainian-focused adaptation of the English-only CoEdIT (Raheja et al., 2023) model. Similar to CoEdIT, Spivavtor performs text editing tasks by following instructions in Ukrainian like “Виправте граматику в цьому реченнi” and “Спростiть це речення” which translate to “Correct the grammar in this sentence” and “Simplify this sentence” in English, respectively. This paper describes the details of the Spivavtor-Instruct dataset and Spivavtor models. We evaluate Spivavtor on a variety of text editing tasks in Ukrainian, such as Grammatical Error Correction (GEC), Text Simplification, Coherence, and Paraphrasing, and demonstrate its superior performance on all of them. We publicly release our best performing models and data as resources to the community to advance further research in this space.