Learning User Embeddings from Emails

Yan Song, Chia-Jung Lee


Abstract
Many important email-related tasks, such as email classification or search, rely heavily on quality document representations (e.g., bag-of-words or key phrases) to assist matching and understanding. Despite prior success in representing textual messages, creating quality user representations from emails has been overlooked. In this paper, we propose to represent users using embeddings that are trained to reflect the email communication network. Our experiments on the Enron dataset suggest that the resulting embeddings capture the semantic distance between users. To assess the quality of the embeddings in a real-world application, we carry out an auto-foldering task in which the lexical representation of an email is enriched with user embedding features. Our results show that, across multiple settings, folder prediction accuracy improves when embedding features are present.
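As an illustration of what enriching an email's lexical representation with user embedding features could look like, here is a minimal sketch. It is not the authors' implementation: the abstract does not say how the features are combined, so the sketch assumes plain concatenation of a bag-of-words vector with the sender's embedding. The function names, the toy vocabulary, the example address, and the embedding values are all hypothetical.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def enrich_with_user_embedding(text, sender, vocabulary, user_embeddings, dim):
    """Concatenate lexical features with the sender's embedding.

    Assumption of this sketch: senders without a learned embedding
    fall back to a zero vector of length `dim`.
    """
    lexical = bag_of_words(text, vocabulary)
    user = user_embeddings.get(sender, [0.0] * dim)
    return lexical + list(user)


# Hypothetical toy data; the values are illustrative, not from the paper.
vocabulary = ["meeting", "budget", "lunch"]
user_embeddings = {"alice@example.com": [0.2, -0.1]}

features = enrich_with_user_embedding(
    "Budget meeting moved; budget review at noon",
    "alice@example.com",
    vocabulary,
    user_embeddings,
    dim=2,
)
```

The resulting vector could then be passed to any standard classifier that predicts a folder label. Concatenation is only one possible way to combine the two feature groups.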
Anthology ID:
E17-2116
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
733–738
URL:
https://aclanthology.org/E17-2116/
Cite (ACL):
Yan Song and Chia-Jung Lee. 2017. Learning User Embeddings from Emails. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 733–738, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Learning User Embeddings from Emails (Song & Lee, EACL 2017)
PDF:
https://aclanthology.org/E17-2116.pdf