Kasra Hosseini

2021

When Time Makes Sense: A Historically-Aware Approach to Targeted Sense Disambiguation
Kaspar Beelen | Federico Nanni | Mariona Coll Ardanuy | Kasra Hosseini | Giorgia Tolfo | Barbara McGillivray
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text. In particular, this work is focused on atypical animacy and examines the scenario in which typically inanimate objects, specifically machines, are given animate attributes. To address it, we have created the first dataset for atypical animacy detection, based on nineteenth-century sentences in English, with machines represented as either animate or inanimate. Our method builds on recent innovations in language modeling, specifically BERT contextualized word embeddings, to better capture fine-grained contextual properties of words. We present a fully unsupervised pipeline, which can be easily adapted to different contexts, and report its performance on an established animacy dataset and our newly introduced resource. We show that our method provides a substantially more accurate characterization of atypical animacy, especially when applied to highly complex forms of language use.

DeezyMatch: A Flexible Deep Learning Approach to Fuzzy String Matching
Kasra Hosseini | Federico Nanni | Mariona Coll Ardanuy
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present DeezyMatch, a free, open-source software library written in Python for fuzzy string matching and candidate ranking. Its pair classifier supports various deep neural network architectures for training new classifiers and for fine-tuning a pretrained model, which paves the way for transfer learning in fuzzy string matching. This approach is especially useful where only limited training examples are available. The learned DeezyMatch models can be used to generate rich vector representations from string inputs. The candidate ranker component in DeezyMatch uses these vector representations to find, for a given query, the best matching candidates in a knowledge base. It uses an adaptive searching algorithm applicable to large knowledge bases and query sets. We describe DeezyMatch’s functionality, design and implementation, accompanied by a use case in toponym matching and candidate ranking in realistic noisy datasets.

Co-authors

Ruth Ahnert 1

Jon Lawrence 1

Katherine McDonough 1

Daniel CS Wilson 1
