Automatic Detection of Difficulty of French Medical Sequences in Context

Anaïs Koptient, Natalia Grabar


Abstract
Medical documents use technical terms (single- or multi-word expressions) with very specific semantics. Patients may find these terms difficult to understand, which can lower their comprehension of medical information. Before such terms can be simplified, it is important to detect difficult-to-understand syntactic groups in medical documents, as they may correspond to or contain technical terms. We address this question as a categorization task: the goal is to predict which syntactic groups are difficult to understand within syntactically analyzed medical documents. We use three models for this task: one built with only internal (linguistic) features, one built with only external (contextual) features, and one built with both sets of features. Our results show an F-measure over 0.8. The use of contextual (external) features and of annotations from all annotators has a positive impact on the results. Ablation tests indicate that frequencies in large corpora and lexicon are relevant for this task.
Anthology ID:
2022.mwe-1.9
Volume:
Proceedings of the 18th Workshop on Multiword Expressions @LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
MWE
SIG:
SIGLEX
Publisher:
European Language Resources Association
Pages:
55–66
URL:
https://aclanthology.org/2022.mwe-1.9
Cite (ACL):
Anaïs Koptient and Natalia Grabar. 2022. Automatic Detection of Difficulty of French Medical Sequences in Context. In Proceedings of the 18th Workshop on Multiword Expressions @LREC2022, pages 55–66, Marseille, France. European Language Resources Association.
Cite (Informal):
Automatic Detection of Difficulty of French Medical Sequences in Context (Koptient & Grabar, MWE 2022)
PDF:
https://aclanthology.org/2022.mwe-1.9.pdf