SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning

Daniel Daza, Michael Cochez, Paul Groth


Abstract
We present SlotGAN, a framework for training a mention detection model that requires only unlabeled text and a gazetteer. It consists of a generator trained to extract spans from an input sentence, and a discriminator trained to determine whether a span comes from the generator or from the gazetteer. We evaluate the method on English newswire data and compare it against supervised, weakly-supervised, and unsupervised methods. We find that its performance is lower than that of these baselines, because it tends to generate more and longer spans, and in some cases it relies only on capitalization. In other cases, it generates spans that are valid but differ from the benchmark annotations. When evaluated with overlap-based metrics, we find that SlotGAN performs within 95% of the precision of a supervised method, and 84% of its recall. Our results suggest that the model can generate spans that overlap well, but that an additional filtering mechanism is required.
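The adversarial setup described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch example of such a training loop: a generator produces span representations from sentence encodings, and a discriminator is trained to separate them from encodings of gazetteer entries. All module names, dimensions, and the use of random tensors as stand-ins for real encodings are illustrative assumptions, not the authors' implementation.

# Minimal sketch of SlotGAN-style adversarial training (assumed setup, not the paper's code).
import torch
import torch.nn as nn

EMB = 32  # toy span-representation size

class SpanGenerator(nn.Module):
    """Maps a sentence representation to a span representation (hypothetical)."""
    def __init__(self, dim=EMB):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
    def forward(self, sent_repr):
        return self.net(sent_repr)

class SpanDiscriminator(nn.Module):
    """Scores a span representation: high means it looks like a gazetteer entry."""
    def __init__(self, dim=EMB):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))
    def forward(self, span_repr):
        return self.net(span_repr)

gen, disc = SpanGenerator(), SpanDiscriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    # Stand-ins for real data: sentence encodings and encodings of gazetteer entries.
    sent_repr = torch.randn(8, EMB)
    gazetteer_spans = torch.randn(8, EMB)

    # Discriminator update: gazetteer spans are "real", generated spans are "fake".
    fake_spans = gen(sent_repr).detach()
    d_loss = bce(disc(gazetteer_spans), torch.ones(8, 1)) + \
             bce(disc(fake_spans), torch.zeros(8, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator update: make extracted spans indistinguishable from gazetteer entries.
    g_loss = bce(disc(gen(sent_repr)), torch.ones(8, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()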
Anthology ID: 2022.spnlp-1.4
Volume: Proceedings of the Sixth Workshop on Structured Prediction for NLP
Month: May
Year: 2022
Address: Dublin, Ireland
Editors: Andreas Vlachos, Priyanka Agrawal, André Martins, Gerasimos Lampouras, Chunchuan Lyu
Venue: spnlp
Publisher: Association for Computational Linguistics
Pages: 32–39
URL: https://aclanthology.org/2022.spnlp-1.4
DOI: 10.18653/v1/2022.spnlp-1.4
Cite (ACL): Daniel Daza, Michael Cochez, and Paul Groth. 2022. SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning. In Proceedings of the Sixth Workshop on Structured Prediction for NLP, pages 32–39, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal): SlotGAN: Detecting Mentions in Text via Adversarial Distant Learning (Daza et al., spnlp 2022)
PDF: https://aclanthology.org/2022.spnlp-1.4.pdf
Data: CoNLL 2003