Hamit Kavas
2025
Multilingual Skill Extraction for Job Vacancy–Job Seeker Matching in Knowledge Graphs
Hamit Kavas
|
Marc Serra-Vidal
|
Leo Wanner
Proceedings of the Workshop on Generative AI and Knowledge Graphs (GenAIK)
In the modern labor market, accurate matching of job vacancies with suitable candidate CVs is critical. We present a novel multilingual knowledge graph-based framework designed to enhance the matching by accurately extracting the skills requested by a job and provided by a job seeker in a multilingual setting and aligning them via the standardized skill labels of the European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy. The proposed framework employs a combination of state-of-the-art techniques to extract relevant skills from job postings and candidate experiences. These extracted skills are then filtered and mapped to the ESCO taxonomy and integrated into a multilingual knowledge graph that incorporates hierarchical relationships and cross-linguistic variations through embeddings. Our experiments demonstrate a significant improvement of the matching quality compared to the state of the art.
2024
Enhancing Job Posting Classification with Multilingual Embeddings and Large Language Models
Hamit Kavas
|
Marc Serra-Vidal
|
Leo Wanner
Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)
In the modern labour market, taxonomies such the European Skills, Competences, Qualifications and Occupations (ESCO) classification are used as an interlingua to match job postings with job seeker profiles. Both are classified with respect to ESCO occupations, and match if they align with the same occupation and the same skills assigned to the occupation. However, matching models usually struggle with the classification because of overlapping skills and similar definitions of occupations defined in the ESCO taxonomy. This often leads to imprecise classification outcomes. In this paper, we focus on the challenge of the classification of job postings written in Italian or Spanish against ESCO occupations written in English. We experiment with multilingual embeddings, zero-shot classification, and use of a large language model (LLM) and show that the use of an LLM leads to best results.