Graph-based Syntactic Word Embeddings

Ragheb Al-Ghezi, Mikko Kurimo


Abstract
We propose a simple and efficient framework to learn syntactic embeddings based on information derived from constituency parse trees. Using biased random walk methods, our embeddings not only encode syntactic information about words but also capture contextual information. We also propose a method to train the embeddings on multiple constituency parse trees to ensure the encoding of a global syntactic representation. Quantitative evaluation of the embeddings shows competitive performance on a POS tagging task when compared to other types of embeddings, and qualitative evaluation reveals interesting facts about the syntactic typology learned by these embeddings.
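
The abstract does not give the details of the walk procedure, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes node2vec-style second-order biased walks (return parameter `p`, in-out parameter `q`), parse trees written as nested tuples, and a merged graph in which nodes with the same label are shared across trees. The names `add_tree` and `biased_walk` and the toy sentences are hypothetical.

```python
import random
from collections import defaultdict

def add_tree(graph, tree):
    """Add one parse tree, given as (label, child, ...) tuples with string
    leaves, to an undirected graph. Nodes are keyed by label, so identical
    labels in different trees collapse into one node (an assumption)."""
    label = tree[0]
    for child in tree[1:]:
        child_label = child if isinstance(child, str) else child[0]
        graph[label].add(child_label)
        graph[child_label].add(label)
        if not isinstance(child, str):
            add_tree(graph, child)

def biased_walk(graph, start, length, p=1.0, q=1.0, rng=random):
    """One node2vec-style walk: weight 1/p for returning to the previous
    node, 1 for a neighbor of the previous node, 1/q for moving away."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break
        if len(walk) == 1:
            # First step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)
            elif nxt in graph[prev]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

# Two toy constituency trees (hypothetical data) merged into one graph.
trees = [
    ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBZ", "barks"))),
    ("S", ("NP", ("DT", "a"), ("NN", "cat")), ("VP", ("VBZ", "sleeps"))),
]
graph = defaultdict(set)
for t in trees:
    add_tree(graph, t)

rng = random.Random(0)
walks = [biased_walk(graph, w, 6, p=0.5, q=2.0, rng=rng) for w in ("dog", "cat")]
```

Under these assumptions, each walk is a sequence of word and constituent labels that could be passed to a skip-gram model as a "sentence". Because "dog" and "cat" share the `NN` node in the merged graph, walks starting from either word can reach the other's syntactic neighborhood.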
Anthology ID:
2020.textgraphs-1.8
Volume:
Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs)
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Dmitry Ustalov, Swapna Somasundaran, Alexander Panchenko, Fragkiskos D. Malliaros, Ioana Hulpuș, Peter Jansen, Abhik Jana
Venue:
TextGraphs
Publisher:
Association for Computational Linguistics
Pages:
72–78
URL:
https://aclanthology.org/2020.textgraphs-1.8
DOI:
10.18653/v1/2020.textgraphs-1.8
Cite (ACL):
Ragheb Al-Ghezi and Mikko Kurimo. 2020. Graph-based Syntactic Word Embeddings. In Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs), pages 72–78, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Graph-based Syntactic Word Embeddings (Al-Ghezi & Kurimo, TextGraphs 2020)
PDF:
https://aclanthology.org/2020.textgraphs-1.8.pdf