Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction

Mihai Surdeanu, John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin Gyori, Andrew Zupon, Zheng Tang, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walter Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Steven Bethard, Rebecca Sharp, Egoitz Laparra


Abstract
An existing domain taxonomy for normalizing content is often assumed when discussing approaches to information extraction, yet often in real-world scenarios there is none. When one does exist, as the information needs shift, it must be continually extended. This is a slow and tedious task, and one which does not scale well. Here we propose an interactive tool that allows a taxonomy to be built or extended rapidly and with a human in the loop to control precision. We apply insights from text summarization and information extraction to reduce the search space dramatically, then leverage modern pretrained language models to perform contextualized clustering of the remaining concepts to yield candidate nodes for the user to review. We show this allows a user to consider as many as 200 taxonomy concept candidates an hour, to quickly build or extend a taxonomy to better fit information needs.
Anthology ID:
2022.hcinlp-1.1
Volume:
Proceedings of the Second Workshop on Bridging Human–Computer Interaction and Natural Language Processing
Month:
July
Year:
2022
Address:
Seattle, Washington
Editors:
Su Lin Blodgett, Hal Daumé III, Michael Madaio, Ani Nenkova, Brendan O'Connor, Hanna Wallach, Qian Yang
Venue:
HCINLP
Publisher:
Association for Computational Linguistics
Pages:
1–10
URL:
https://aclanthology.org/2022.hcinlp-1.1
DOI:
10.18653/v1/2022.hcinlp-1.1
Cite (ACL):
Mihai Surdeanu, John Hungerford, Yee Seng Chan, Jessica MacBride, Benjamin Gyori, Andrew Zupon, Zheng Tang, Haoling Qiu, Bonan Min, Yan Zverev, Caitlin Hilverman, Max Thomas, Walter Andrews, Keith Alcock, Zeyu Zhang, Michael Reynolds, Steven Bethard, Rebecca Sharp, and Egoitz Laparra. 2022. Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction. In Proceedings of the Second Workshop on Bridging Human–Computer Interaction and Natural Language Processing, pages 1–10, Seattle, Washington. Association for Computational Linguistics.
Cite (Informal):
Taxonomy Builder: a Data-driven and User-centric Tool for Streamlining Taxonomy Construction (Surdeanu et al., HCINLP 2022)
PDF:
https://aclanthology.org/2022.hcinlp-1.1.pdf
Video:
https://aclanthology.org/2022.hcinlp-1.1.mp4