Ilan Kernerman

2026

MTQE.en-he: Machine Translation Quality Estimation for English-Hebrew
Andy Rosenbaum | Assaf Siani | Ilan Kernerman
Proceedings of the Second Workshop on Language Models for Low-Resource Languages (LoResLM 2026)

We release MTQE.en-he: to our knowledge, the first publicly available English-Hebrew benchmark for Machine Translation Quality Estimation. MTQE.en-he contains 959 English segments from WMT24++, each paired with a machine translation into Hebrew, and Direct Assessment scores of the translation quality annotated by three human experts. We benchmark ChatGPT prompting, TransQuest, and CometKiwi and show that ensembling the three models outperforms the best single model (CometKiwi) by 6.4 percentage points Pearson and 5.8 percentage points Spearman. Fine-tuning experiments with TransQuest and CometKiwi reveal that full-model updates are sensitive to overfitting and distribution collapse, yet parameter-efficient methods (LoRA, BitFit, and FTHead, i.e., fine-tuning only the classification head) train stably and yield improvements of 2-3 percentage points. MTQE.en-he and our experimental results enable future research on this under-resourced language pair.

2025

Linking the Lexicala Latin-French Dictionary to the LiLa Knowledge Base
Adriano De Paoli | Marco Carlo Passarotti | Paolo Ruffolo | Giovanni Moretti | Ilan Kernerman
Proceedings of the 5th Conference on Language, Data and Knowledge

This paper presents the integration of the Lexicala Latin–French Dictionary into the LiLa Knowledge Base of linguistic resources for Latin made interoperable through their publication as Linked Open Data. The entries of the dictionary are linked to the large collection of Latin lemmas of LiLa (Lemma Bank), enabling interaction with the other resources published therein. The paper details the data modelling process, the linking methodology, and a couple of practical use cases, showing how interlinking resources via LOD can support advancement in (multilingual) linguistic research.

2022

Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference
Ilan Kernerman | Simon Krek
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference

TIAD 2022: The Fifth Translation Inference Across Dictionaries Shared Task
Jorge Gracia | Besim Kabashi | Ilan Kernerman
Proceedings of Globalex Workshop on Linked Lexicography within the 13th Language Resources and Evaluation Conference

The objective of the Translation Inference Across Dictionaries (TIAD) series of shared tasks is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual/multilingual lexicographic resources. In this fifth edition, the participating systems were asked to generate new translations automatically among three languages - English, French, Portuguese - based on known indirect translations contained in the Apertium RDF graph. These evaluation pairs have been the same for the last four TIAD editions. Since the fourth edition, however, a larger graph, Apertium RDF v2, has been used as the basis for producing the translations. The organisers evaluated the results against manually compiled language pairs of K Dictionaries. For the second time in the TIAD series, some systems beat the proposed baselines. This paper gives an overall description of the shared task, the evaluation data and methodology, and the systems’ results.

Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data
Ilan Kernerman | Sara Carvalho | Carlos A. Iglesias | Rachele Sprugnoli
Proceedings of the 2nd Workshop on Sentiment Analysis and Linguistic Linked Data

2020

2019

We present a portfolio of natural legal language processing and document curation services currently under development in a collaborative European project. First, we give an overview of the project and the different use cases, while, in the main part of the article, we focus upon the 13 different processing services that are being deployed in different prototype applications using a flexible and scalable microservices architecture. Their orchestration is operationalised using a content and document curation workflow manager.
