Automatic Classification of Legal Violations in Cookie Banner Texts

Marieke Van Hofslot; Almila Akdag Salah; Albert Gatt; Cristiana Santos

doi:10.18653/v1/2022.nllp-1.27

Abstract

Cookie banners are designed to request consent from website visitors for the use of their personal data. Recent research suggests that a high percentage of cookie banners violate legal regulations as defined by the General Data Protection Regulation (GDPR) and the ePrivacy Directive. In this paper, we focus on the language used in these cookie banners and on whether these violations can be detected automatically. We use a small cookie banner dataset annotated by five experts for legal violations and test it with state-of-the-art classification models: BERT, LEGAL-BERT, BART in a zero-shot setting, and BERT with LIWC embeddings. Our results show that no single model outperforms the others in all classes, but in general BERT and LEGAL-BERT provide the highest accuracy (70%-97%). However, these results are influenced by the small size of the dataset and its unbalanced class distributions.
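The caveat about unbalanced distributions can be made concrete with a minimal sketch. The labels below are toy values chosen for illustration, not data from the paper, and the helper names `accuracy` and `balanced_accuracy` are hypothetical. The sketch shows how a classifier that always predicts the majority class can reach high accuracy while never detecting the minority class.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy labels, NOT taken from the paper: 9 of 10 banners carry no violation.
gold = ["ok"] * 9 + ["violation"]

# Baseline that ignores the text and always predicts the most frequent label.
majority = Counter(gold).most_common(1)[0][0]
always_majority = [majority] * len(gold)

print(accuracy(gold, always_majority))           # 0.9
print(balanced_accuracy(gold, always_majority))  # 0.5
```

On this toy split the baseline scores 0.9 accuracy but only 0.5 balanced accuracy, which is one reason accuracy figures on a small, skewed dataset may be worth reading alongside per-class metrics.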

Anthology ID: 2022.nllp-1.27
Volume: Proceedings of the Natural Legal Language Processing Workshop 2022
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Editors: Nikolaos Aletras, Ilias Chalkidis, Leslie Barrett, Cătălina Goanță, Daniel Preoțiuc-Pietro
Venue: NLLP
Publisher: Association for Computational Linguistics
Pages: 287–295
URL: https://aclanthology.org/2022.nllp-1.27/
DOI: 10.18653/v1/2022.nllp-1.27
Cite (ACL): Marieke Van Hofslot, Almila Akdag Salah, Albert Gatt, and Cristiana Santos. 2022. Automatic Classification of Legal Violations in Cookie Banner Texts. In Proceedings of the Natural Legal Language Processing Workshop 2022, pages 287–295, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal): Automatic Classification of Legal Violations in Cookie Banner Texts (Van Hofslot et al., NLLP 2022)
PDF: https://aclanthology.org/2022.nllp-1.27.pdf
