Automatically Cataloging Scholarly Articles using Library of Congress Subject Headings

Nazmul Kazi; Nathaniel Lane; Indika Kahanda

doi:10.18653/v1/2021.eacl-srw.7

Abstract

Institutes are required to catalog their articles with proper subject headings so that users can easily retrieve relevant articles from institutional repositories. However, because the number of articles in these repositories is growing rapidly, manually cataloging newly added articles at the same pace is becoming a challenge. To address this challenge, we explore the feasibility of automatically annotating articles with Library of Congress Subject Headings (LCSH). We first use web scraping to extract keywords for a collection of articles from the Repository Analytics and Metrics Portal (RAMP). Then, we map these keywords to LCSH names to develop a gold-standard dataset. As a case study, using the subset of Biology-related LCSH concepts, we develop predictive models by formulating this task as a multi-label classification problem. Our experimental results demonstrate the viability of this approach for predicting LCSH for scholarly articles.
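The keyword-to-LCSH mapping and the multi-label formulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how keywords are matched, so exact matching after normalization is an assumption, and the tiny vocabulary `LCSH_VOCAB` and the helper names `normalize`, `map_keywords`, and `to_indicator` are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical miniature vocabulary: normalized keyword -> LCSH heading.
# Illustrative entries only, not taken from the paper's dataset.
LCSH_VOCAB: Dict[str, str] = {
    "ecology": "Ecology",
    "genetics": "Genetics",
    "botany": "Botany",
}


def normalize(keyword: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(keyword.lower().split())


def map_keywords(keywords: List[str]) -> Set[str]:
    """Return the headings whose normalized name exactly matches a keyword.

    Exact matching is an assumption made for this sketch.
    """
    return {LCSH_VOCAB[k] for k in map(normalize, keywords) if k in LCSH_VOCAB}


def to_indicator(labels: Set[str], label_order: List[str]) -> List[int]:
    """Encode a label set as a 0/1 vector with one position per heading."""
    return [1 if name in labels else 0 for name in label_order]


# Fixed label order shared by every article: ['Botany', 'Ecology', 'Genetics']
label_order = sorted(set(LCSH_VOCAB.values()))

article_keywords = ["  Ecology", "GENETICS", "soil chemistry"]
labels = map_keywords(article_keywords)
vector = to_indicator(labels, label_order)

print(sorted(labels))  # ['Ecology', 'Genetics']
print(vector)          # [0, 1, 1]
```

Each article ends up with a binary vector that can hold several 1s at once, which is what distinguishes a multi-label target from a single-class one. Keywords with no match, such as "soil chemistry" here, contribute no label.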

Anthology ID: 2021.eacl-srw.7
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop
Month: April
Year: 2021
Address: Online
Editors: Ionut-Teodor Sorodoc, Madhumita Sushil, Ece Takmaz, Eneko Agirre
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 43–49
URL: https://aclanthology.org/2021.eacl-srw.7/
DOI: 10.18653/v1/2021.eacl-srw.7
Cite (ACL): Nazmul Kazi, Nathaniel Lane, and Indika Kahanda. 2021. Automatically Cataloging Scholarly Articles using Library of Congress Subject Headings. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop, pages 43–49, Online. Association for Computational Linguistics.
Cite (Informal): Automatically Cataloging Scholarly Articles using Library of Congress Subject Headings (Kazi et al., EACL 2021)
PDF: https://aclanthology.org/2021.eacl-srw.7.pdf
