Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning

Mike Zhang, Kristian Nørgaard Jensen, Barbara Plank


Abstract
Skill Classification (SC) is the task of classifying job competences from job postings. This work is the first to apply SC to Danish job vacancy data. We release the first Danish job posting dataset: *Kompetencer* (_en_: competences), annotated for nested spans of competences. To improve upon coarse-grained annotations, we make use of the European Skills, Competences, Qualifications and Occupations (ESCO; le Vrang et al., 2014) taxonomy API to obtain fine-grained labels via distant supervision. We study two setups: the zero-shot and the few-shot classification setting. We fine-tune English-based models and RemBERT (Chung et al., 2020) and compare them to in-language Danish models. Our results show that RemBERT significantly outperforms all other models in both the zero-shot and the few-shot setting.
Anthology ID:
2022.lrec-1.46
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
436–447
URL:
https://aclanthology.org/2022.lrec-1.46
Cite (ACL):
Mike Zhang, Kristian Nørgaard Jensen, and Barbara Plank. 2022. Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 436–447, Marseille, France. European Language Resources Association.
Cite (Informal):
Kompetencer: Fine-grained Skill Classification in Danish Job Postings via Distant Supervision and Transfer Learning (Zhang et al., LREC 2022)
PDF:
https://aclanthology.org/2022.lrec-1.46.pdf
Code:
Kaleidophon/deep-significance
Data:
Kompetencer, SkillSpan