Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach

Anup Anand Deshmukh, Qianqiu Zhang, Ming Li, Jimmy Lin, Lili Mou


Abstract
In this paper, we address unsupervised chunking as a new task of syntactic structure induction, which is helpful for understanding the linguistic structures of human languages as well as processing low-resource languages. We propose a knowledge-transfer approach that heuristically induces chunk labels from state-of-the-art unsupervised parsing models; a hierarchical recurrent neural network (HRNN) learns from such induced chunk labels to smooth out the noise of the heuristics. Experiments show that our approach largely bridges the gap between supervised and unsupervised chunking.
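As a rough illustration only (not taken from the paper), the sketch below shows one plausible way to heuristically induce chunk tags from an induced binary constituency parse: treat every lowest constituent whose children are all words as a chunk and tag its tokens in BI style. The tree encoding, the function name induce_chunk_tags, and this specific rule are assumptions made for the example, not the authors' exact heuristic.

# Illustrative sketch: inducing BI chunk tags from a binary parse tree
# given as nested tuples of tokens. This is an assumed heuristic for
# demonstration, not the paper's exact label-induction rule.

from typing import List, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]

def induce_chunk_tags(tree: Tree) -> List[Tuple[str, str]]:
    """Return (token, tag) pairs with BI-style chunk tags."""
    tags: List[Tuple[str, str]] = []

    def is_leaf_phrase(node: Tree) -> bool:
        # A constituent whose children are all plain word tokens.
        return isinstance(node, tuple) and all(isinstance(c, str) for c in node)

    def visit(node: Tree) -> None:
        if isinstance(node, str):
            # A word not grouped with sibling words starts its own chunk.
            tags.append((node, "B"))
        elif is_leaf_phrase(node):
            # All words under the lowest word-only constituent form one chunk.
            first, *rest = node
            tags.append((first, "B"))
            tags.extend((tok, "I") for tok in rest)
        else:
            for child in node:
                visit(child)

    visit(tree)
    return tags

# Example with a hypothetical induced parse:
parse = ((("the", "quick"), "fox"), ("jumped", ("over", ("the", "dog"))))
print(induce_chunk_tags(parse))
# [('the', 'B'), ('quick', 'I'), ('fox', 'B'), ('jumped', 'B'),
#  ('over', 'B'), ('the', 'B'), ('dog', 'I')]

In the paper's setup, tags induced along these lines would then serve as (noisy) training targets for the HRNN tagger, which smooths out errors of the heuristic.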
Anthology ID:
2021.findings-emnlp.307
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2021
Month:
November
Year:
2021
Address:
Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
Findings
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3626–3634
URL:
https://aclanthology.org/2021.findings-emnlp.307
DOI:
10.18653/v1/2021.findings-emnlp.307
Cite (ACL):
Anup Anand Deshmukh, Qianqiu Zhang, Ming Li, Jimmy Lin, and Lili Mou. 2021. Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 3626–3634, Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Chunking as Syntactic Structure Induction with a Knowledge-Transfer Approach (Deshmukh et al., Findings 2021)
PDF:
https://aclanthology.org/2021.findings-emnlp.307.pdf
Video:
https://aclanthology.org/2021.findings-emnlp.307.mp4
Code:
anup-deshmukh/unsupervised-chunking
Data:
English Web Treebank