Incorporating Metadata into Content-Based User Embeddings

Linzi Xing, Michael J. Paul


Abstract
Low-dimensional vector representations of social media users can benefit applications like recommendation systems and user attribute inference. Recent work has shown that user embeddings can be improved by combining different types of information, such as text and network data. We propose a data augmentation method that allows novel feature types to be used within off-the-shelf embedding models. Experimenting with the task of friend recommendation on a dataset of 5,019 Twitter users, we show that our approach can lead to substantial performance gains with the simple addition of network and geographic features.
Anthology ID:
W17-4406
Volume:
Proceedings of the 3rd Workshop on Noisy User-generated Text
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Leon Derczynski, Wei Xu, Alan Ritter, Tim Baldwin
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
45–49
URL:
https://aclanthology.org/W17-4406/
DOI:
10.18653/v1/W17-4406
Cite (ACL):
Linzi Xing and Michael J. Paul. 2017. Incorporating Metadata into Content-Based User Embeddings. In Proceedings of the 3rd Workshop on Noisy User-generated Text, pages 45–49, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Incorporating Metadata into Content-Based User Embeddings (Xing & Paul, WNUT 2017)
PDF:
https://aclanthology.org/W17-4406.pdf