%0 Conference Proceedings
%T MS-Mentions: Consistently Annotating Entity Mentions in Materials Science Procedural Text
%A O’Gorman, Tim
%A Jensen, Zach
%A Mysore, Sheshera
%A Huang, Kevin
%A Mahbub, Rubayyat
%A Olivetti, Elsa
%A McCallum, Andrew
%Y Moens, Marie-Francine
%Y Huang, Xuanjing
%Y Specia, Lucia
%Y Yih, Scott Wen-tau
%S Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
%D 2021
%8 November
%I Association for Computational Linguistics
%C Online and Punta Cana, Dominican Republic
%F ogorman-etal-2021-ms
%X Material science synthesis procedures are a promising domain for scientific NLP, as proper modeling of these recipes could provide insight into new ways of creating materials. However, a fundamental challenge in building information extraction models for material science synthesis procedures is getting accurate labels for the materials, operations, and other entities of those procedures. We present a new corpus of entity mention annotations over 595 Material Science synthesis procedural texts (157,488 tokens), which greatly expands the training data available for the Named Entity Recognition task. We outline a new label inventory designed to provide consistent annotations and a new annotation approach intended to maximize the consistency and annotation speed of domain experts. Inter-annotator agreement studies and baseline models trained upon the data suggest that the corpus provides high-quality annotations of these mention types. This corpus helps lay a foundation for future high-quality modeling of synthesis procedures.
%R 10.18653/v1/2021.emnlp-main.101
%U https://aclanthology.org/2021.emnlp-main.101
%U https://doi.org/10.18653/v1/2021.emnlp-main.101
%P 1337-1352