Grounded Multimodal Named Entity Recognition on Social Media

Jianfei Yu; Ziyan Li; Jieming Wang; Rui Xia

doi:10.18653/v1/2023.acl-long.508

Abstract

In recent years, Multimodal Named Entity Recognition (MNER) on social media has attracted considerable attention. However, existing MNER studies only extract entity-type pairs in text, which is useless for multimodal knowledge graph construction and insufficient for entity disambiguation. To solve these issues, in this work, we introduce a Grounded Multimodal Named Entity Recognition (GMNER) task. Given a text-image social post, GMNER aims to identify the named entities in the text, their entity types, and their bounding box groundings in the image (i.e., visual regions). To tackle the GMNER task, we construct a Twitter dataset based on two existing MNER datasets. Moreover, we extend four well-known MNER methods to establish a number of baseline systems and further propose a Hierarchical Index generation framework named H-Index, which generates the entity-type-region triples in a hierarchical manner with a sequence-to-sequence model. Experimental results on our annotated dataset demonstrate the superiority of our H-Index framework over baseline systems on the GMNER task.

Anthology ID: 2023.acl-long.508
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 9141–9154
URL: https://aclanthology.org/2023.acl-long.508/
DOI: 10.18653/v1/2023.acl-long.508
Cite (ACL): Jianfei Yu, Ziyan Li, Jieming Wang, and Rui Xia. 2023. Grounded Multimodal Named Entity Recognition on Social Media. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9141–9154, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Grounded Multimodal Named Entity Recognition on Social Media (Yu et al., ACL 2023)
PDF: https://aclanthology.org/2023.acl-long.508.pdf
Video: https://aclanthology.org/2023.acl-long.508.mp4
