Legal Terminology Extraction in Spanish: Gold-standard Generation and LLM Evaluation

Lucia Palacios Palacios, Beatriz Guerrero García, Patricia Martín Chozas, Elena Montiel Ponsoda


Abstract
This study aims to develop a gold standard for terminology extraction in Castilian Spanish within the domain of labour law. To achieve this, a methodology was developed based on established linguistic theories and reviewed by a team of expert terminologists. Building on previous extraction studies and reference theoretical frameworks, candidate terms were identified by their morphosyntactic patterns, and this selection was enriched by assessing their degree of specialisation in reference resources. The candidate terms were then subjected to manual validation. To evaluate the applicability of the gold standard, we assessed the performance of the LLaMA3-8B and Mistral-7B language models in extracting labour law terms from the latest version of the Real Decreto Legislativo 2/2015 Ley del Estatuto de los Trabajadores. YAKE was also included as a statistical baseline, allowing a comparison between traditional methods and generative approaches. All three systems were evaluated against the validated gold standard.
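As an illustration of what evaluating an extractor against a validated gold standard can look like, here is a minimal Python sketch. The abstract does not say which metrics or matching criteria the authors used, so exact-match precision, recall and F1 over normalised term sets are assumptions made here. The function names `normalise` and `evaluate` and the sample terms are hypothetical and are not taken from the paper's data.

```python
def normalise(term: str) -> str:
    """Lowercase and collapse whitespace so trivial surface variants match."""
    return " ".join(term.lower().split())


def evaluate(predicted, gold):
    """Return (precision, recall, F1) for exact matches between two term lists.

    Assumption: a predicted term counts as correct only if its normalised
    form appears in the gold standard. The paper may use another criterion.
    """
    pred = {normalise(t) for t in predicted}
    ref = {normalise(t) for t in gold}
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative terms only; these are not from the paper's gold standard.
gold_terms = [
    "contrato de trabajo",
    "convenio colectivo",
    "despido improcedente",
    "salario",
]
extracted_terms = ["Contrato de  trabajo", "convenio colectivo", "trabajador"]

precision, recall, f1 = evaluate(extracted_terms, gold_terms)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

The same function could be applied to the output of each system (for example, terms parsed from an LLM response or keywords returned by a statistical extractor), so that all of them are scored against one reference list.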
Anthology ID:
2025.ranlp-1.98
Volume:
Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era
Month:
September
Year:
2025
Address:
Varna, Bulgaria
Editors:
Galia Angelova, Maria Kunilovskaya, Marie Escribe, Ruslan Mitkov
Venue:
RANLP
Publisher:
INCOMA Ltd., Shoumen, Bulgaria
Pages:
860–869
URL:
https://aclanthology.org/2025.ranlp-1.98/
Cite (ACL):
Lucia Palacios Palacios, Beatriz Guerrero García, Patricia Martín Chozas, and Elena Montiel Ponsoda. 2025. Legal Terminology Extraction in Spanish: Gold-standard Generation and LLM Evaluation. In Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing - Natural Language Processing in the Generative AI Era, pages 860–869, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal):
Legal Terminology Extraction in Spanish: Gold-standard Generation and LLM Evaluation (Palacios Palacios et al., RANLP 2025)
PDF:
https://aclanthology.org/2025.ranlp-1.98.pdf