Bernhard Liebl


2020

“Shakespeare in the Vectorian Age” – An evaluation of different word embeddings and NLP parameters for the detection of Shakespeare quotes
Bernhard Liebl | Manuel Burghardt
Proceedings of the 4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

In this paper we describe an approach for the computer-aided identification of Shakespearean intertextuality in a corpus of contemporary fiction. We present the Vectorian, a framework that implements different word embeddings and various NLP parameters. The Vectorian works like a search engine: a Shakespeare phrase is entered as a query, the underlying collection of fiction books is searched for that phrase, and the passages that are likely to contain it, either verbatim or as a paraphrase, are presented in a ranked results list. While the Vectorian can be used via a GUI, in which many different parameters can be set and combined manually, in this paper we present an ablation study that automatically evaluates different embedding and NLP parameter combinations against a ground truth. We investigate the behavior of different parameters during the evaluation and discuss how our results may be used for future studies on the detection of Shakespearean intertextuality.
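The search-engine idea in the abstract can be illustrated with a minimal sketch. It is not the Vectorian's actual implementation: the three-dimensional word vectors, the scoring rule (average best cosine match per query token), and the names `TOY_VECTORS`, `score`, and `rank` are all assumptions made up for this example. It only shows how embedding similarity lets a paraphrase rank close to a verbatim quote and above unrelated text.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would load pretrained embeddings instead.
TOY_VECTORS = {
    "to": (1.0, 0.0, 0.0),
    "be": (0.0, 1.0, 0.0),
    "exist": (0.1, 0.9, 0.1),
    "or": (0.0, 0.0, 1.0),
    "not": (0.7, 0.0, 0.7),
    "never": (0.6, 0.1, 0.7),
    "dinner": (0.5, 0.5, -0.7),
    "was": (0.3, 0.3, -0.9),
    "cold": (-0.8, 0.2, 0.1),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embed(text):
    """Look up vectors for the known tokens of a whitespace-tokenized text."""
    return [TOY_VECTORS[t] for t in text.lower().split() if t in TOY_VECTORS]


def score(query, passage):
    """Average, over query tokens, of the best cosine match in the passage."""
    q, p = embed(query), embed(passage)
    if not q or not p:
        return 0.0
    return sum(max(cosine(a, b) for b in p) for a in q) / len(q)


def rank(query, passages):
    """Return passages sorted from most to least similar to the query."""
    return sorted(passages, key=lambda p: score(query, p), reverse=True)


query = "to be or not to be"
passages = [
    "dinner was cold",             # unrelated text
    "to exist or never to exist",  # paraphrase
    "to be or not to be",          # verbatim quote
]
ranked = rank(query, passages)
```

In this toy setup the verbatim quote and the paraphrase both score far above the unrelated passage. Swapping the vectors or the scoring rule is the kind of parameter change an ablation study would compare against a ground truth.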