Challenges in Technical Regulatory Text Variation Detection

Shriya Vaagdevi Chikati, Samuel Larkin, David Minicola, Chi-kiu Lo


Abstract
We present a preliminary study on the feasibility of using current natural language processing techniques to detect variations between the construction codes of different jurisdictions. We formulate the task as a sentence alignment problem and evaluate various sentence representation models on it. Our results show that embeddings trained specifically for the task perform marginally better than other models, but achieving high overall accuracy remains a challenge. We also show that domain-specific fine-tuning hurts task performance. These results highlight the challenges of developing NLP applications for technical regulatory texts.
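To make the sentence alignment formulation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy bag-of-words vector as a stand-in for a real sentence representation model, greedy best-match alignment by cosine similarity, and an arbitrary threshold of 0.5. The function names (`embed`, `cosine`, `align`) and the example sentences are hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy sentence representation: lowercase token counts.

    Assumption: stands in for a real sentence embedding model.
    """
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def align(source_sentences, target_sentences, threshold=0.5):
    """Greedily align each source sentence to its most similar target.

    Returns a list of (source_index, target_index_or_None, score).
    The threshold is an illustrative assumption, not a tuned value.
    """
    target_vecs = [embed(t) for t in target_sentences]
    alignments = []
    for i, sentence in enumerate(source_sentences):
        s_vec = embed(sentence)
        best_j, best_score = None, 0.0
        for j, t_vec in enumerate(target_vecs):
            score = cosine(s_vec, t_vec)
            if score > best_score:
                best_j, best_score = j, score
        if best_score < threshold:
            best_j = None  # no sufficiently similar sentence found
        alignments.append((i, best_j, best_score))
    return alignments


# Invented example sentences in the style of two construction codes.
code_a = [
    "Stairs shall have a minimum width of 900 mm.",
    "Guards shall be not less than 1070 mm high.",
    "Elevators require annual inspection.",
]
code_b = [
    "Guards shall be not less than 900 mm high.",
    "Stairs shall have a minimum width of 860 mm.",
    "Exit signs shall be illuminated.",
]

result = align(code_a, code_b)
for src, tgt, score in result:
    if tgt is None:
        print(f"A[{src}] has no counterpart in B")
    elif score < 1.0:
        print(f"A[{src}] ~ B[{tgt}] (score {score:.2f}): possible variation")
    else:
        print(f"A[{src}] = B[{tgt}]: identical wording")
```

In this sketch an aligned pair whose similarity is below 1.0 is flagged as a possible variation, while a sentence with no match above the threshold is reported as unaligned. A real system would swap `embed` for a trained sentence encoder, which is the component the study compares.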
Anthology ID:
2025.regnlp-1.2
Volume:
Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025)
Month:
January
Year:
2025
Address:
Abu Dhabi, UAE
Editors:
Tuba Gokhan, Kexin Wang, Iryna Gurevych, Ted Briscoe
Venues:
RegNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
5–9
URL:
https://aclanthology.org/2025.regnlp-1.2/
Cite (ACL):
Shriya Vaagdevi Chikati, Samuel Larkin, David Minicola, and Chi-kiu Lo. 2025. Challenges in Technical Regulatory Text Variation Detection. In Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025), pages 5–9, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
Challenges in Technical Regulatory Text Variation Detection (Chikati et al., RegNLP 2025)
PDF:
https://aclanthology.org/2025.regnlp-1.2.pdf