George Kokush
2023
Team NLLG submission for Eval4NLP 2023 Shared Task: Retrieval-Augmented In-Context Learning for NLG Evaluation
Daniil Larionov
|
Vasiliy Viskov
|
George Kokush
|
Alexander Panchenko
|
Steffen Eger
Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems
In this paper, we propose a retrieval-augmented in-context learning for natural language generation (NLG) evaluation. This method allows practitioners to utilize large language models (LLMs) for various NLG evaluation tasks without any fine-tuning. We apply our approach to Eval4NLP 2023 Shared Task in translation evaluation and summarization evaluation subtasks. The findings suggest that retrieval-augmented in-context learning is a promising approach for creating LLM-based evaluation metrics for NLG. Further research directions include exploring the performance of various publicly available LLM models and identifying which LLM properties help boost the quality of the metric.
Semantically-Informed Regressive Encoder Score
Vasiliy Viskov
|
George Kokush
|
Daniil Larionov
|
Steffen Eger
|
Alexander Panchenko
Proceedings of the Eighth Conference on Machine Translation
Machine translation is natural language generation (NLG) problem of translating source text from one language to another. As every task in machine learning domain it requires to have evaluation metric. The most obvious one is human evaluation but it is expensive in case of money and time consumption. In last years with appearing of pretrained transformer architectures and large language models (LLMs) state-of-the-art results in automatic machine translation evaluation got a huge quality step in terms of correlation with expert assessment. We introduce MRE-Score, seMantically-informed Regression Encoder Score, the approach with constructing automatic machine translation evaluation system based on regression encoder and contrastive pretraining for downstream problem.