Detecting Paraphrases of Standard Clause Titles in Insurance Contracts

Frieda Josi, Christian Wartena, Ulrich Heid


Abstract
Validated model texts, such as model clauses, can be used in the analysis of contract texts to identify reused contract clauses. This paper investigates how to calculate the similarity between titles of model clauses and headings extracted from contracts, and which similarity measure is most suitable for this task. To calculate the similarity between title pairs, we tested several variants of string similarity and token-based similarity. We also compare two more semantically oriented similarity measures based on word embeddings, one using pretrained embeddings and one using embeddings trained on contract texts. Identifying the model clause title can serve as a starting point for mapping clauses found in contracts to verified clauses.
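To make the distinction between string similarity and token-based similarity concrete, the following is a minimal sketch rather than the measures evaluated in the paper. It assumes a character-level ratio from Python's `difflib` as a stand-in for string similarity and a Jaccard overlap on lowercased whitespace tokens as a stand-in for token-based similarity. The function names and the example titles are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (assumed stand-in for a string measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens (assumed token-based measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty titles are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def best_match(heading: str, model_titles: list[str], measure) -> str:
    """Return the model clause title that scores highest against the heading."""
    return max(model_titles, key=lambda title: measure(heading, title))


# Hypothetical titles, invented for this example.
model_titles = ["Premium payment", "Exclusion of liability"]
heading = "Exclusion of liability clause"
print(best_match(heading, model_titles, token_jaccard))
print(best_match(heading, model_titles, char_similarity))
```

A token-based measure ignores word order, so "Liability exclusion" and "Exclusion of liability" still share two of three distinct tokens, whereas a character-level measure is sensitive to reordering. Neither captures synonyms, which is the gap the embedding-based measures mentioned in the abstract are meant to address.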
Anthology ID:
W19-0803
Volume:
RELATIONS - Workshop on meaning relations between phrases and sentences
Month:
May
Year:
2019
Address:
Gothenburg, Sweden
Editors:
Venelin Kovatchev, Darina Gold, Torsten Zesch
Venue:
IWCS
SIG:
SIGSEM
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/W19-0803
DOI:
10.18653/v1/W19-0803
Cite (ACL):
Frieda Josi, Christian Wartena, and Ulrich Heid. 2019. Detecting Paraphrases of Standard Clause Titles in Insurance Contracts. In RELATIONS - Workshop on meaning relations between phrases and sentences, Gothenburg, Sweden. Association for Computational Linguistics.
Cite (Informal):
Detecting Paraphrases of Standard Clause Titles in Insurance Contracts (Josi et al., IWCS 2019)
PDF:
https://aclanthology.org/W19-0803.pdf