Mohamed Khemakhem

2025

The integration of artificial intelligence (AI) with terminology management (TM) has opened new avenues for enhancing efficiency and precision in both fields, necessitating standardized approaches to ensure interoperability and ethical application. The newly formed ISO/TC 37/SC 3/WG 6 represents the first dedicated initiative to study the standardization of the mutual improvements of AI and TM. This group aims to develop standardized frameworks and guidelines that optimize the interaction between AI technologies and terminology resources, benefiting professionals, systems, and practices in both domains. This article presents the state of the art in the mutual relationship between AI and TM, highlighting opportunities for bidirectional advancements. It also addresses limitations and challenges from a standardization perspective. By tackling these issues, ISO/TC 37/SC 3/WG 6 seeks to establish principles that ensure scalability, precision, and ethical considerations, shaping future standards to support global communication and knowledge exchange.

2020

Modelling Etymology in LMF/TEI: The Grande Dicionário Houaiss da Língua Portuguesa Dictionary as a Use Case
Fahad Khan | Laurent Romary | Ana Salgado | Jack Bowers | Mohamed Khemakhem | Toma Tasovac
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this article we introduce two of the new parts of the multi-part version of the Lexical Markup Framework (LMF) ISO standard, namely Part 3 (ISO 24613-3), which deals with etymological and diachronic data, and Part 4 (ISO 24613-4), which consists of a TEI serialisation of all of the prior parts of the model. We demonstrate the use of both standards by describing the LMF encoding of a small number of examples taken from a sample conversion of the reference Portuguese dictionary Grande Dicionário Houaiss da Língua Portuguesa, part of a broader experiment comprising the analysis of different, heterogeneously encoded, Portuguese lexical resources. We present the examples in the Unified Modelling Language (UML) and, in a couple of cases, in TEI.

2016

Sense-annotating a Lexical Substitution Data Set with Ubyline
Tristan Miller | Mohamed Khemakhem | Richard Eckart de Castilho | Iryna Gurevych
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We describe the construction of GLASS, a newly sense-annotated version of the German lexical substitution data set used at the GermEval 2015: LexSub shared task. Using the two annotation layers, we conduct the first known empirical study of the relationship between manually applied word senses and lexical substitutions. We find that synonymy and hypernymy/hyponymy are the only semantic relations directly linking targets to their substitutes, and that substitutes in the target’s hypernymy/hyponymy taxonomy closely align with the synonyms of a single GermaNet synset. Despite this, these substitutes account for a minority of those provided by the annotators. The results of our analysis accord with those of a previous study on English-language data (albeit with automatically induced word senses), leading us to suspect that the sense–substitution relations we discovered may be of a universal nature. We also tentatively conclude that relatively cheap lexical substitution annotations can be used as a knowledge source for automatic WSD. Also introduced in this paper is Ubyline, the web application used to produce the sense annotations. Ubyline presents an intuitive user interface optimized for annotating lexical sample data, and is readily adaptable to sense inventories other than GermaNet.