Hannah Alpert-Abrams


2017

Automatic Compositor Attribution in the First Folio of Shakespeare
Maria Ryskina | Hannah Alpert-Abrams | Dan Garrette | Taylor Berg-Kirkpatrick
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Compositor attribution, the clustering of pages in a historical printed document by the individual who set the type, is a bibliographic task that relies on analysis of orthographic variation and inspection of visual details of the printed page. In this paper, we introduce a novel unsupervised model that jointly describes the textual and visual features needed to distinguish compositors. Applied to images of Shakespeare’s First Folio, our model predicts attributions that agree with the manual judgements of bibliographers with an accuracy of 87%, even on text that is the output of OCR.

2016

An Unsupervised Model of Orthographic Variation for Historical Document Transcription
Dan Garrette | Hannah Alpert-Abrams
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2015

Unsupervised Code-Switching for Multilingual Historical Document Transcription
Dan Garrette | Hannah Alpert-Abrams | Taylor Berg-Kirkpatrick | Dan Klein
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies