Subir Saha


2023

IITD at SemEval-2023 Task 2: A Multi-Stage Information Retrieval Approach for Fine-Grained Named Entity Recognition
Shivani Choudhary | Niladri Chatterjee | Subir Saha
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

MultiCoNER-II is a fine-grained Named Entity Recognition (NER) task that aims to identify ambiguous and complex named entities in multiple languages when only a small amount of contextual information is available. To address this task, we propose a multi-stage information retrieval (IR) pipeline that improves the performance of language models for fine-grained NER. Our approach combines a BM25-based IR model with a language model to retrieve relevant passages from a corpus. These passages are then used to train a model with a weighted average of losses. The prediction is generated by a decoder stack that includes a projection layer and a conditional random field. To demonstrate the effectiveness of our approach, we participated in the English track of the MultiCoNER-II competition. Our approach yielded promising results, which we validated through detailed analysis.
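The BM25 retrieval stage mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the parameter values `k1=1.5` and `b=0.75`, the `[SEP]` separator, and the names `BM25`, `top_k`, and `augment` are all assumptions made for illustration. The sketch only shows how a short input sentence might be scored against a corpus and extended with the best-matching passages before being passed to a language model.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, chosen only for brevity.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer over a small in-memory corpus (illustrative)."""

    def __init__(self, corpus, k1=1.5, b=0.75):
        # k1 and b are common default values, not values taken from the paper.
        self.k1 = k1
        self.b = b
        self.docs = [tokenize(d) for d in corpus]
        self.doc_len = [len(d) for d in self.docs]
        self.avgdl = sum(self.doc_len) / len(self.docs)
        self.tf = [Counter(d) for d in self.docs]

        # Document frequency: number of documents containing each term.
        df = Counter()
        for d in self.docs:
            df.update(set(d))
        n = len(self.docs)
        # Smoothed IDF variant that stays non-negative.
        self.idf = {
            t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()
        }

    def score(self, query, idx):
        # Sum the BM25 contribution of every query term found in document idx.
        tf = self.tf[idx]
        norm = self.k1 * (1 - self.b + self.b * self.doc_len[idx] / self.avgdl)
        total = 0.0
        for t in tokenize(query):
            if t in tf:
                f = tf[t]
                total += self.idf[t] * f * (self.k1 + 1) / (f + norm)
        return total

    def top_k(self, query, k=2):
        # Return the indices of the k highest-scoring documents.
        ranked = sorted(
            range(len(self.docs)),
            key=lambda i: self.score(query, i),
            reverse=True,
        )
        return ranked[:k]


def augment(sentence, passages):
    # Hypothetical input format: sentence, separator, retrieved context.
    return sentence + " [SEP] " + " ".join(passages)


if __name__ == "__main__":
    corpus = [
        "harry potter is a novel series by j k rowling",
        "the eiffel tower is in paris",
        "paris is the capital of france",
    ]
    bm25 = BM25(corpus)
    query = "eiffel tower paris"
    hits = bm25.top_k(query, k=2)
    print(augment(query, [corpus[i] for i in hits]))
```

In a multi-stage setup such as the one the abstract describes, a lexical scorer like this would typically produce a candidate list that a language model then re-ranks; that second stage is not shown here because the abstract does not specify it.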