Enhancing Job Evaluation with Data Augmentation and Text Classification

Samaneh Jalilian; Niels van Weeren; Mohammad Shokri; Thijmen Bijl; Suzan Verberne

Abstract

Accurate job grading and evaluation are essential for ensuring fair compensation in Human Resources (HR) planning. In this research, we propose to improve job evaluation by using text-based classification models to semi-automate a process that is currently manual, time-consuming, and inconsistent. We address three prediction tasks: job title classification, grade prediction, and compensation prediction. For job title classification, we fine-tune a RoBERTa model and use Gemini to generate synthetic job descriptions for rare job titles. For grade and compensation prediction, we compare TF-IDF and transformer-based embeddings (DistilRoBERTa, MPNet, MiniLM) in combination with deep neural networks and tree-based models (Random Forest, XGBoost). We tune the hyperparameters of all models using grid search with cross-validation. The results show that job title classification by RoBERTa with Gemini-generated descriptions reaches an accuracy of about 97%. In our regression experiments, a tuned TF-IDF + XGBoost model achieves a mean absolute error (MAE) of 0.185 for grade prediction, and MiniLM embeddings with XGBoost achieve an MAE of €1,587 for annual salary prediction. These findings demonstrate that a semi-automated pipeline can enhance traditional manual processes by improving consistency, speeding up HR workflows, and reducing biased assessments.
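To illustrate the kind of regression setup the abstract describes (TF-IDF features, a tree-based model, grid search with cross-validation, MAE as the metric), the following is a minimal sketch and not the authors' implementation. It assumes scikit-learn and uses Random Forest, one of the tree-based models named above, in place of XGBoost to keep dependencies small. The job descriptions, grade values, and parameter grid are invented purely for illustration.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import Pipeline

# Hypothetical toy data: invented descriptions and grades, not from the paper.
descriptions = [
    "Answers customer phone calls and logs support tickets",
    "Develops backend services and reviews code for the team",
    "Leads the engineering department and sets the technical strategy",
    "Files invoices and updates records in the finance system",
    "Analyses sales data and builds monthly reports for managers",
    "Manages the regional budget and directs senior staff",
    "Maintains office supplies and schedules meeting rooms",
    "Designs test plans and automates regression tests",
    "Heads the legal department and advises the executive board",
]
grades = [3.0, 7.0, 11.0, 3.0, 7.0, 11.0, 3.0, 7.0, 11.0]

# TF-IDF text features feeding a tree-based regressor.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("model", RandomForestRegressor(random_state=0)),
])

# Assumed, deliberately small grid; the paper's actual grid is not given here.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "model__n_estimators": [25, 50],
}

# Grid search with 3-fold cross-validation, selecting by mean absolute error.
search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(descriptions, grades)

predictions = search.predict(descriptions)
train_mae = mean_absolute_error(grades, predictions)
cv_mae = -search.best_score_  # scikit-learn reports the negated MAE

print(f"best params: {search.best_params_}")
print(f"cross-validated MAE: {cv_mae:.3f}")
print(f"training MAE: {train_mae:.3f}")
```

On nine toy examples the resulting errors say nothing about real performance; the sketch only shows how the pieces fit together. Swapping in another regressor or an embedding-based feature extractor would change only the corresponding pipeline step.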

Anthology ID: 2026.acl-industry.59
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Yunyao Li, Georg Rehm, Mei Tu
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 872–883
URL: https://aclanthology.org/2026.acl-industry.59/
Cite (ACL): Samaneh Jalilian, Niels van Weeren, Mohammad Shokri, Thijmen Bijl, and Suzan Verberne. 2026. Enhancing Job Evaluation with Data Augmentation and Text Classification. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 872–883, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Enhancing Job Evaluation with Data Augmentation and Text Classification (Jalilian et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-industry.59.pdf
