Taoufiq Dkaki

2024

pdf bib abs
Explanation Extraction from Hierarchical Classification Frameworks for Long Legal Documents
Nishchal Prasad | Taoufiq Dkaki | Mohand Boughanem
Findings of the Association for Computational Linguistics: NAACL 2024

Hierarchical classification frameworks have been widely used to process long sequences, especially in the legal domain for predictions from long legal documents. But being black-box models they are unable to explain their predictions making them less reliable for practical applications, more so in the legal domain. In this work, we develop an extractive explanation algorithm for hierarchical frameworks for long sequences based on the sensitivity of the trained model to its input perturbations. We perturb using occlusion and develop Ob-HEx; an Occlusion-based Hierarchical Explanation-extractor. We adapt Ob-HEx to Hierarchical Transformer models trained on long Indian legal texts. And use Ob-HEx to analyze them and extract their explanations for the ILDC-Expert dataset, achieving a minimum gain of 1 point over the previous benchmark on most of our performance evaluation metrics.

2023

pdf bib abs
IRIT_IRIS_C at SemEval-2023 Task 6: A Multi-level Encoder-based Architecture for Judgement Prediction of Legal Cases and their Explanation
Nishchal Prasad | Mohand Boughanem | Taoufiq Dkaki
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes our system used for sub-task C (1 & 2) in Task 6: LegalEval: Understanding Legal Texts. We propose a three-level encoder-based classification architecture that works by fine-tuning a BERT-based pre-trained encoder, and post-processing the embeddings extracted from its last layers, using transformer encoder layers and RNNs. We run ablation studies on the same and analyze itsperformance. To extract the explanations for the predicted class we develop an explanation extraction algorithm, exploiting the idea of a model’s occlusion sensitivity. We explored some training strategies with a detailed analysis of the dataset. Our system ranks 2nd (macro-F1 metric) for its sub-task C-1 and 7th (ROUGE-2 metric) for sub-task C-2.

Co-authors

Venues

findings1
semeval1

Fix data