MEGAnno: Exploratory Labeling for NLP in Computational Notebooks

Dan Zhang, Hannah Kim, Rafael Li Chen, Eser Kandogan, Estevam Hruschka


Abstract
We present MEGAnno, a novel exploratory annotation framework designed for NLP researchers and practitioners. Unlike existing labeling tools that focus only on data labeling, our framework aims to support a broader, iterative ML workflow that includes data exploration and model development. With MEGAnno’s API, users can programmatically explore the data through sophisticated search and automated suggestion functions and incrementally update the task schema as their project evolves. Combined with our widget, users can interactively sort, filter, and assign labels to multiple items simultaneously in the same notebook where the rest of the NLP project resides. We demonstrate MEGAnno’s flexible, exploratory, efficient, and seamless labeling experience through a sentiment analysis use case.
Anthology ID:
2022.dash-1.1
Volume:
Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Eduard Dragut, Yunyao Li, Lucian Popa, Slobodan Vucetic, Shashank Srivastava
Venue:
DaSH
Publisher:
Association for Computational Linguistics
Pages:
1–7
URL:
https://aclanthology.org/2022.dash-1.1
Cite (ACL):
Dan Zhang, Hannah Kim, Rafael Li Chen, Eser Kandogan, and Estevam Hruschka. 2022. MEGAnno: Exploratory Labeling for NLP in Computational Notebooks. In Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances), pages 1–7, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
MEGAnno: Exploratory Labeling for NLP in Computational Notebooks (Zhang et al., DaSH 2022)
PDF:
https://aclanthology.org/2022.dash-1.1.pdf
Video:
https://aclanthology.org/2022.dash-1.1.mp4