2023
Legal Argument Extraction from Court Judgements using Integer Linear Programming
Basit Ali | Sachin Pawar | Girish Palshikar | Anindita Sinha Banerjee | Dhirendra Singh
Proceedings of the 10th Workshop on Argument Mining
Legal arguments are one of the key aspects of legal knowledge and are expressed in various ways in the unstructured text of court judgements. A large database of past legal arguments can be created by extracting arguments from court judgements, categorizing them, and storing them in a structured format. Such a database would be useful for suggesting suitable arguments for any new case. In this paper, we focus on extracting arguments from Indian Supreme Court judgements using minimal supervision. We first identify a set of sentence-level argument markers that are useful for argument extraction, such as whether a sentence contains a claim, whether a sentence is argumentative in nature, and whether two sentences are part of the same argument. We then model the legal argument extraction problem as a text segmentation problem in which we combine multiple pieces of weak evidence, in the form of argument markers, using Integer Linear Programming (ILP), arriving at a global document-level solution that gives the optimal legal arguments. We demonstrate the effectiveness of our technique by comparing it against several competitive baselines.
2016
Multiword Expressions Dataset for Indian Languages
Dhirendra Singh | Sudha Bhingardive | Pushpak Bhattacharyya
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Multiword Expressions (MWEs) are used frequently in natural languages, but understanding the diversity of MWEs is one of the open problems in Natural Language Processing. In the context of Indian languages, MWEs play an important role. In this paper, we present an MWE annotation dataset created for two Indian languages, Hindi and Marathi. We extract possible MWE candidates using two repositories: 1) the POS-tagged corpus and 2) the IndoWordNet synsets. Annotation is done for two types of MWEs: compound nouns and light verb constructions. During annotation, human annotators tag valid MWEs from these candidates based on the standard guidelines provided to them. We obtained 3178 compound nouns and 2556 light verb constructions in Hindi and 1003 compound nouns and 2416 light verb constructions in Marathi using the two repositories mentioned above. The resulting resource is publicly available and can be used as a gold standard for Hindi and Marathi MWE systems.
Synset Ranking of Hindi WordNet
Sudha Bhingardive | Rajita Shukla | Jaya Saraswati | Laxmi Kashyap | Dhirendra Singh | Pushpak Bhattacharyya
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Word Sense Disambiguation (WSD) is one of the open problems in natural language processing. Various supervised, unsupervised and knowledge-based approaches have been proposed for automatically determining the sense of a word in a particular context. It has been observed that such approaches often find it difficult to beat the WordNet First Sense (WFS) baseline, which assigns the first listed sense of a word irrespective of context. In this paper, we present our work on creating the WFS baseline for the Hindi language by manually ranking the synsets of Hindi WordNet. A ranking tool is developed in which human experts can see the frequency of the word senses in sense-tagged corpora; they are asked to rank the senses of a word using this information together with their own intuition. The accuracy of the WFS baseline is tested on several standard datasets. The F-score is found to be 60%, 65% and 55% on the Health, Tourism and News datasets respectively. The created rankings can also be used in other NLP applications such as Machine Translation, Information Retrieval and Text Summarization.
IndoWordNet::Similarity- Computing Semantic Similarity and Relatedness using IndoWordNet
Sudha Bhingardive | Hanumant Redkar | Prateek Sappadla | Dhirendra Singh | Pushpak Bhattacharyya
Proceedings of the 8th Global WordNet Conference (GWC)
Semantic similarity and relatedness measures play an important role in natural language processing applications. In this paper, we present the IndoWordNet::Similarity tool and interface, designed for computing the semantic similarity and relatedness between two words in IndoWordNet. A Java-based tool and a web interface have been developed to compute this semantic similarity and relatedness. A Java API has also been developed for this purpose. The tool, web interface and API are made available for research purposes.
Detection of Compound Nouns and Light Verb Constructions using IndoWordNet
Dhirendra Singh | Sudha Bhingardive | Pushpak Bhattacharyya
Proceedings of the 8th Global WordNet Conference (GWC)
Detection of Multiword Expressions (MWEs) is one of the fundamental problems in Natural Language Processing. In this paper, we focus on two categories of MWEs: Compound Nouns and Light Verb Constructions. These two categories can be tackled using knowledge bases, rather than pure statistics. We investigate the usability of IndoWordNet for the detection of MWEs. Our IndoWordNet-based approach uses semantic and ontological features of words that can be extracted from IndoWordNet. This approach has been tested on seven Indian languages: Assamese, Bengali, Hindi, Konkani, Marathi, Odia and Punjabi. Results show that ontological features are very useful for the detection of light verb constructions, while the use of semantic properties for the detection of compound nouns is satisfactory. This approach can be easily adapted to other Indian languages. Detected MWEs can be interpolated into WordNets, as they help in representing semantic knowledge.
2015
Unsupervised Most Frequent Sense Detection using Word Embeddings
Sudha Bhingardive | Dhirendra Singh | Rudramurthy V | Hanumant Redkar | Pushpak Bhattacharyya
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Using Word Embeddings for Bilingual Unsupervised WSD
Sudha Bhingardive | Dhirendra Singh | Rudramurthy V | Pushpak Bhattacharyya
Proceedings of the 12th International Conference on Natural Language Processing
Detection of Multiword Expressions for Hindi Language using Word Embeddings and WordNet-based Features
Dhirendra Singh | Sudha Bhingardive | Kevin Patel | Pushpak Bhattacharyya
Proceedings of the 12th International Conference on Natural Language Processing
2014
Merging Verb Senses of Hindi WordNet using Word Embeddings
Sudha Bhingardive | Ratish Puduppully | Dhirendra Singh | Pushpak Bhattacharyya
Proceedings of the 11th International Conference on Natural Language Processing