TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking

Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu, Limin Sun


Abstract
Extracting entities and relations from unstructured text has attracted increasing attention in recent years but remains challenging, due to the intrinsic difficulty in identifying overlapping relations with shared entities. Prior works show that joint learning can result in a noticeable performance gain. However, they usually involve sequential interrelated steps and suffer from the problem of exposure bias. At training time, they predict with the ground truth conditions while at inference it has to make extraction from scratch. This discrepancy leads to error accumulation. To mitigate the issue, we propose in this paper a one-stage joint extraction model, namely, TPLinker, which is capable of discovering overlapping relations sharing one or both entities while being immune from the exposure bias. TPLinker formulates joint extraction as a token pair linking problem and introduces a novel handshaking tagging scheme that aligns the boundary tokens of entity pairs under each relation type. Experiment results show that TPLinker performs significantly better on overlapping and multiple relation extraction, and achieves state-of-the-art performance on two public datasets.
Anthology ID:
2020.coling-main.138
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Donia Scott, Nuria Bel, Chengqing Zong
Venue:
COLING
SIG:
Publisher:
International Committee on Computational Linguistics
Note:
Pages:
1572–1582
Language:
URL:
https://aclanthology.org/2020.coling-main.138
DOI:
10.18653/v1/2020.coling-main.138
Bibkey:
Cite (ACL):
Yucheng Wang, Bowen Yu, Yueyang Zhang, Tingwen Liu, Hongsong Zhu, and Limin Sun. 2020. TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1572–1582, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking (Wang et al., COLING 2020)
Copy Citation:
PDF:
https://aclanthology.org/2020.coling-main.138.pdf
Code
 131250208/TPlinker-joint-extraction
Data
NYT10-HRLNYT11-HRLNew York Times Annotated CorpusWebNLG