Leveraging Grammar Induction for Language Understanding and Generation

Jushi Kai; Shengyuan Hou; Yusheng Huang; Zhouhan Lin

doi:10.18653/v1/2024.findings-emnlp.259

Abstract

Grammar induction has made significant progress in recent years. However, it is not clear how the application of induced grammar could enhance practical performance in downstream tasks. In this work, we introduce an unsupervised grammar induction method for language understanding and generation. We construct a grammar parser to induce constituency structures and dependency relations, which is simultaneously trained on downstream tasks without additional syntax annotations. The induced grammar features are subsequently incorporated into Transformer as a syntactic mask to guide self-attention. We evaluate and apply our method to multiple machine translation tasks and natural language understanding tasks. Our method demonstrates superior performance compared to the original Transformer and other models enhanced with external parsers. Experimental results indicate that our method is effective in both from-scratch and pre-trained scenarios. Additionally, our research highlights the contribution of explicitly modeling the grammatical structure of texts to neural network models.
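The abstract does not spell out how the syntactic mask enters self-attention, so the following is only a minimal sketch of one plausible reading: a soft mask with values in [0, 1] gates the attention probabilities, which are then renormalized. The function name `masked_self_attention`, the multiply-then-renormalize scheme, and the hand-written mask are all assumptions made for illustration; in the paper the mask comes from the induced parser rather than being fixed by hand.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(q, k, v, syntactic_mask, eps=1e-9):
    """Single-head attention gated by a soft syntactic mask (illustrative only).

    q, k, v: arrays of shape (n, d).
    syntactic_mask: array of shape (n, n) with values in [0, 1]; entry (i, j)
    says how strongly token i is allowed to attend to token j.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)           # standard scaled dot-product scores
    probs = softmax(scores, axis=-1)        # ordinary attention distribution
    gated = probs * syntactic_mask          # assumption: mask gates probabilities
    gated = gated / (gated.sum(axis=-1, keepdims=True) + eps)  # renormalize rows
    return gated @ v, gated

rng = np.random.default_rng(0)
n, d = 4, 8
x = rng.normal(size=(n, d))

# Hypothetical mask: tokens 0-1 and tokens 2-3 form two separate constituents.
mask = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 1, 1],
                 [0, 0, 1, 1]], dtype=float)

out, attn = masked_self_attention(x, x, x, mask)
```

With this toy mask, each token attends only within its own two-token block, and every row of `attn` still sums to approximately one. A learned mask would instead take graded values, so attention across constituents is damped rather than removed.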

Anthology ID: 2024.findings-emnlp.259
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4501–4513
URL: https://aclanthology.org/2024.findings-emnlp.259/
DOI: 10.18653/v1/2024.findings-emnlp.259
Cite (ACL): Jushi Kai, Shengyuan Hou, Yusheng Huang, and Zhouhan Lin. 2024. Leveraging Grammar Induction for Language Understanding and Generation. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 4501–4513, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Leveraging Grammar Induction for Language Understanding and Generation (Kai et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.259.pdf
