Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification

Konstantinos Skianis, Fragkiskos Malliaros, Michalis Vazirgiannis


Abstract
Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model, in which each document is represented by a graph that encodes relationships between the different terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes by enriching graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria.
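As a rough illustration of the document-level idea in the abstract, the sketch below builds a graph-of-words from a token list and weights each term by a node centrality score. It is a minimal sketch under stated assumptions, not the authors' implementation: the sliding-window size, the undirected and unweighted edges, the choice of degree centrality, and the function names `graph_of_words` and `degree_centrality_weights` are all illustrative. The collection graph, the label graph, and the word-embedding edge weights described in the abstract are not reproduced here.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected graph-of-words.

    Each distinct term is a node; an edge links two distinct terms that
    co-occur within `window` consecutive tokens (assumed window size).
    Returns a mapping from term to the set of its neighbours.
    """
    adjacency = defaultdict(set)
    for i, term in enumerate(tokens):
        adjacency[term]  # make sure terms without neighbours are still nodes
        for other in tokens[i + 1 : i + window]:
            if other != term:
                adjacency[term].add(other)
                adjacency[other].add(term)
    return adjacency


def degree_centrality_weights(adjacency):
    """Weight each term by degree centrality: degree / (n - 1).

    Degree centrality is one possible node centrality criterion; it is
    used here only as an assumption for illustration.
    """
    n = len(adjacency)
    if n <= 1:
        return {term: 0.0 for term in adjacency}
    return {term: len(neigh) / (n - 1) for term, neigh in adjacency.items()}


# Hypothetical toy document, tokenized on whitespace.
tokens = "graph based term weighting for text classification".split()
weights = degree_centrality_weights(graph_of_words(tokens, window=3))
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{weight:.3f}")
```

In a classifier, such centrality scores would take the place of raw term frequency in the feature vector; the repository listed under Code is the reference for how the paper actually computes them.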
Anthology ID:
W18-1707
Volume:
Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12)
Month:
June
Year:
2018
Address:
New Orleans, Louisiana, USA
Editors:
Goran Glavaš, Swapna Somasundaran, Martin Riedl, Eduard Hovy
Venue:
TextGraphs
Publisher:
Association for Computational Linguistics
Pages:
49–58
URL:
https://aclanthology.org/W18-1707/
DOI:
10.18653/v1/W18-1707
Cite (ACL):
Konstantinos Skianis, Fragkiskos Malliaros, and Michalis Vazirgiannis. 2018. Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification. In Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12), pages 49–58, New Orleans, Louisiana, USA. Association for Computational Linguistics.
Cite (Informal):
Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification (Skianis et al., TextGraphs 2018)
PDF:
https://aclanthology.org/W18-1707.pdf
Code
 y3nk0/Graph-Based-TC