Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration

Nan Li, Bo Kang, Tijl De Bie


Abstract
Creating robust occupation taxonomies, vital for applications ranging from job recommendation to labor market intelligence, is challenging.Manual curation is slow, while existing automated methods are either not adaptive to dynamic regional markets (top-down) or struggle to build coherent hierarchies from noisy data (bottom-up). We introduce CLIMB (CLusterIng-based Multi-agent taxonomy Builder), a framework that fully automates the creation of high-quality, data-driven taxonomies from raw job postings. CLIMB uses global semantic clustering to distill core occupations, then employs a reflection-based multi-agent system to iteratively build a coherent hierarchy. On three diverse, real-world datasets, we show that CLIMB produces taxonomies that are more coherent and scalable than existing methods and successfully capture unique regional characteristics. We release our code and datasets at https://github.com/aida-ugent/CLIMB.
Anthology ID:
2025.emnlp-industry.113
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
November
Year:
2025
Address:
Suzhou (China)
Editors:
Saloni Potdar, Lina Rojas-Barahona, Sebastien Montella
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1596–1614
Language:
URL:
https://aclanthology.org/2025.emnlp-industry.113/
DOI:
Bibkey:
Cite (ACL):
Nan Li, Bo Kang, and Tijl De Bie. 2025. Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 1596–1614, Suzhou (China). Association for Computational Linguistics.
Cite (Informal):
Building Data-Driven Occupation Taxonomies: A Bottom-Up Multi-Stage Approach via Semantic Clustering and Multi-Agent Collaboration (Li et al., EMNLP 2025)
Copy Citation:
PDF:
https://aclanthology.org/2025.emnlp-industry.113.pdf