Marius Leordeanu
2024
“Vorbești Românește?” A Recipe to Train Powerful Romanian LLMs with English Instructions
Mihai Masala
|
Denis Ilie-Ablachim
|
Alexandru Dima
|
Dragos Georgian Corlatescu
|
Miruna-Andreea Zavelca
|
Ovio Olaru
|
Simina-Maria Terian
|
Andrei Terian
|
Marius Leordeanu
|
Horia Velicu
|
Marius Popescu
|
Mihai Dascalu
|
Traian Rebedea
Findings of the Association for Computational Linguistics: EMNLP 2024
In recent years, Large Language Models (LLMs) have achieved almost human-like performance on various tasks. While some LLMs have been trained on multilingual data, most of the training data is in English; hence, their performance in English greatly exceeds other languages. To our knowledge, we are the first to collect and translate a large collection of texts, instructions, and benchmarks and train, evaluate, and release open-source LLMs tailored for Romanian. We evaluate our methods on four different categories, including academic benchmarks, MT-Bench (manually translated), and a professionally built historical, cultural, and social benchmark adapted to Romanian. We argue for the usefulness and high performance of RoLLMs by obtaining state-of-the-art results across the board. We publicly release all resources (i.e., data, training and evaluation code, models) with the goal of supporting and encouraging research on Romanian LLMs while concurrently creating a generalizable recipe adequate for other low or less-resourced languages.
2020
A hierarchical approach to vision-based language generation: from simple sentences to complex natural language
Simion-Vlad Bogolin
|
Ioana Croitoru
|
Marius Leordeanu
Proceedings of the 28th International Conference on Computational Linguistics
Automatically describing videos in natural language is an ambitious problem, which could bridge our understanding of vision and language. We propose a hierarchical approach, by first generating video descriptions as sequences of simple sentences, followed at the next level by a more complex and fluent description in natural language. While the simple sentences describe simple actions in the form of (subject, verb, object), the second-level paragraph descriptions, indirectly using information from the first-level description, presents the visual content in a more compact, coherent and semantically rich manner. To this end, we introduce the first video dataset in the literature that is annotated with captions at two levels of linguistic complexity. We perform extensive tests that demonstrate that our hierarchical linguistic representation, from simple to complex language, allows us to train a two-stage network that is able to generate significantly more complex paragraphs than current one-stage approaches.