Risk Minimization for Zero-shot Sequence Labeling

Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu


Abstract
Zero-shot sequence labeling aims to build a sequence labeler without human-annotated datasets. One straightforward approach is utilizing existing systems (source models) to generate pseudo-labeled datasets and train a target sequence labeler accordingly. However, due to the gap between the source and the target languages/domains, this approach may fail to recover the true labels. In this paper, we propose a novel unified framework for zero-shot sequence labeling with minimum risk training and design a new decomposable risk function that models the relations between the predicted labels from the source models and the true labels. By making the risk function trainable, we draw a connection between minimum risk training and latent variable model learning. We propose a unified learning algorithm based on the expectation maximization (EM) algorithm. We extensively evaluate our proposed approaches on cross-lingual/domain sequence labeling tasks over twenty-one datasets. The results show that our approaches outperform state-of-the-art baseline systems.
Anthology ID:
2021.acl-long.380
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
4909–4920
Language:
URL:
https://aclanthology.org/2021.acl-long.380
DOI:
10.18653/v1/2021.acl-long.380
Bibkey:
Copy Citation:
PDF:
https://aclanthology.org/2021.acl-long.380.pdf