Narayan Choudhary


2018

A Treebank for the Healthcare Domain
Nganthoibi Oinam | Diwakar Mishra | Pinal Patel | Narayan Choudhary | Hitesh Desai
Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)

This paper presents a treebank for the healthcare domain developed at ezDI. The treebank is created from a wide array of clinical health record documents across hospitals. The data has been de-identified and annotated for constituent syntactic structure. The treebank contains a total of 52053 sentences that have been sampled for subdomains as well as linguistic variations. The paper outlines the sampling process followed to ensure a better domain representation in the corpus, the annotation process and challenges, and corpus statistics. The Penn Treebank tagset and guidelines were largely followed, but there were many syntactic contexts that warranted adaptation of the guidelines. The treebank was used to re-train the Berkeley parser and the Stanford parser. These parsers were also trained with the GENIA treebank for comparative quality assessment. Our treebank yielded greater accuracy on both parsers. The Berkeley parser performed better on our treebank, with an average F1 measure of 91 across 5 folds. This was a significant jump from the out-of-the-box F1 score of 70 with the Berkeley parser’s default grammar.

2015

ezDI: A Supervised NLP System for Clinical Narrative Analysis
Parth Pathak | Pinal Patel | Vishal Panchal | Sagar Soni | Kinjal Dani | Amrish Patel | Narayan Choudhary
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

ezDI: A Hybrid CRF and SVM based Model for Detecting and Encoding Disorder Mentions in Clinical Notes
Parth Pathak | Pinal Patel | Vishal Panchal | Narayan Choudhary | Amrish Patel | Gautam Joshi
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Annotating a Large Representative Corpus of Clinical Notes for Parts of Speech
Narayan Choudhary | Parth Pathak | Pinal Patel | Vishal Panchal
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

Evaluating Two Annotated Corpora of Hindi Using a Verb Class Identifier
Neha Dixit | Narayan Choudhary
Proceedings of the 11th International Conference on Natural Language Processing