Fine-Grained Semantic Comparison of Legal Documents using LLMs

Elisei Rykov; Nikolay Ivanov; Maria Bandulevich; Kseniia Petrushina; Valentin Malykh; Vasily Konovalov; Alexander Panchenko; Ilseyar Alimova

Abstract

Frequent revisions of complex regulatory documents in large organizations often introduce inconsistencies and contradictions that are difficult for lawyers and auditors to detect manually. Existing tools rely on character-level diffs and therefore miss paraphrases and semantic shifts. We introduce LegDiff, a novel benchmark for evaluating span-aware semantic comparison of legal texts, and use it to investigate the ability of large language models to detect semantic changes beyond token- and character-level matching. LegDiff comprises manually annotated pairs of legal paragraphs drawn from different documents. In addition, we present a pipeline for generating synthetic training data that is aligned with the manual annotations, mirroring the structure and label distribution of the manually curated benchmark, and a visualization tool that displays detected differences and inconsistencies. The dataset, code, and visualization tool are publicly available to facilitate reproducibility and further research (https://github.com/s-nlp/SLeDoC).

Anthology ID: 2026.acl-srw.86
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 982–994
URL: https://aclanthology.org/2026.acl-srw.86/
Cite (ACL): Elisei Rykov, Nikolay Ivanov, Maria Bandulevich, Kseniia Petrushina, Valentin Malykh, Vasily Konovalov, Alexander Panchenko, and Ilseyar Alimova. 2026. Fine-Grained Semantic Comparison of Legal Documents using LLMs. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 982–994, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Fine-Grained Semantic Comparison of Legal Documents using LLMs (Rykov et al., ACL 2026)
PDF: https://aclanthology.org/2026.acl-srw.86.pdf