Olivier Salaün
2024
EUROPA: A Legal Multilingual Keyphrase Generation Dataset
Olivier Salaün | Frédéric Piedboeuf | Guillaume Le Berre | David Alfonso-Hermelo | Philippe Langlais
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Keyphrase generation has primarily been explored within the context of academic research articles, with a particular focus on scientific domains and the English language. In this work, we present EUROPA, a novel dataset for multilingual keyphrase generation in the legal domain. It is derived from legal judgments from the Court of Justice of the European Union (EU), and contains instances in all 24 EU official languages. We run multilingual models on our corpus and analyze the results, showing room for improvement on a domain-specific multilingual corpus such as the one we present.
2021
Exploiting Domain-Specific Knowledge for Judgment Prediction Is No Panacea
Olivier Salaün | Philippe Langlais | Karim Benyekhlef
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Legal judgment prediction (LJP) is usually framed as a text classification task that aims to predict the verdict from the fact description. The literature shows that using articles as input features helps improve classification performance. In this work, we designed a verdict prediction task based on landlord-tenant disputes and applied BERT-based models to which we fed different article-based features. Although the results are consistent with the literature, the improvements from the articles are mostly obtained on the most frequent labels. This suggests that pre-trained and fine-tuned transformer-based models are not scalable as is for legal reasoning in real-life scenarios, as they would only excel at predicting the most recurrent verdicts, to the detriment of other legal outcomes.