Continuous N-gram Representations for Authorship Attribution

Yunita Sari, Andreas Vlachos, Mark Stevenson


Abstract
This paper presents work on using continuous representations for authorship attribution. In contrast to previous work, which uses discrete feature representations, our model learns continuous representations for n-gram features via a neural network jointly with the classification layer. Experimental results demonstrate that the proposed model outperforms the state of the art on two datasets, while producing comparable results on the remaining two.
Anthology ID:
E17-2043
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
267–273
URL:
https://aclanthology.org/E17-2043
Cite (ACL):
Yunita Sari, Andreas Vlachos, and Mark Stevenson. 2017. Continuous N-gram Representations for Authorship Attribution. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 267–273, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Continuous N-gram Representations for Authorship Attribution (Sari et al., EACL 2017)
PDF:
https://aclanthology.org/E17-2043.pdf