An end-to-end entity recognition and disambiguation framework for identifying Author Affiliation from literature publications

Lianghong Lin, Wenxixie-c@my.cityu.edu.hk, Spczili@speed-polyu.edu.hk, Tianyong Hao


Abstract
Author affiliation information plays a key role in bibliometric analyses and is essential for evaluating studies. However, author affiliation information has not been standardized, which leads to difficulties such as synonym ambiguity and incomplete data during automated processing. To address this challenge, this paper proposes an end-to-end entity recognition and disambiguation framework for identifying author affiliations from literature publications. For entity disambiguation, an algorithm combining word embedding and spatial embedding is presented, considering that author affiliation texts often contain rich geographic information. The disambiguation algorithm utilizes both semantic and geographic information, which effectively improves entity recognition and disambiguation performance. In addition, the proposed framework facilitates the effective utilization of the extensive literature in the PubMed database for comprehensive bibliometric analysis. The experimental results verify the robustness and effectiveness of the algorithm.
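To make the idea of combining semantic and geographic signals concrete, the following is a minimal illustrative sketch, not the paper's actual algorithm. It assumes each affiliation mention and each candidate entity already has a text embedding (`vec`) and a latitude/longitude pair (`loc`); the field names, the weighting parameter `alpha`, the decay scale `scale_km`, and the toy candidates are all hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(a)))


def score(mention, candidate, alpha=0.5, scale_km=100.0):
    """Weighted sum of semantic similarity and geographic proximity.

    alpha and scale_km are assumed hyperparameters, not values from the paper.
    """
    semantic = cosine(mention["vec"], candidate["vec"])
    # Proximity decays from 1 (same place) towards 0 (far apart).
    geographic = math.exp(-haversine_km(mention["loc"], candidate["loc"]) / scale_km)
    return alpha * semantic + (1 - alpha) * geographic


def disambiguate(mention, candidates, **kwargs):
    """Return the candidate with the highest combined score."""
    return max(candidates, key=lambda c: score(mention, c, **kwargs))


# Toy data: two candidates with identical text embeddings but different locations.
mention = {"vec": [1.0, 0.0], "loc": (22.30, 114.20)}
candidates = [
    {"name": "candidate_far", "vec": [0.9, 0.1], "loc": (51.50, -0.12)},
    {"name": "candidate_near", "vec": [0.9, 0.1], "loc": (22.34, 114.17)},
]
best = disambiguate(mention, candidates)
print(best["name"])
```

In this toy case the text embeddings alone cannot separate the two candidates, so the geographic term decides the match. How the paper actually builds and fuses its word and spatial embeddings is described in the full text, not here.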
Anthology ID:
2024.sdp-1.11
Volume:
Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Tirthankar Ghosal, Amanpreet Singh, Anita de Waard, Philipp Mayr, Aakanksha Naik, Orion Weller, Yoonjoo Lee, Shannon Shen, Yanxia Qin
Venues:
sdp | WS
Publisher:
Association for Computational Linguistics
Pages:
120–129
URL:
https://aclanthology.org/2024.sdp-1.11
Cite (ACL):
Lianghong Lin, Wenxixie-c@my.cityu.edu.hk, Spczili@speed-polyu.edu.hk, and Tianyong Hao. 2024. An end-to-end entity recognition and disambiguation framework for identifying Author Affiliation from literature publications. In Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024), pages 120–129, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
An end-to-end entity recognition and disambiguation framework for identifying Author Affiliation from literature publications (Lin et al., sdp-WS 2024)
PDF:
https://aclanthology.org/2024.sdp-1.11.pdf