Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection

Shaurya Rawat, Mariano Rico, Oscar Corcho


Abstract
In this paper, we describe a new web-based corpus for hypernym detection. It consists of 32 GB of high-quality English paragraphs along with their part-of-speech-tagged and dependency-parsed versions. For hypernym detection, the current state of the art uses a corpus that is not freely available. We evaluate the state-of-the-art methods on our corpus and achieve similar results. The advantage of this corpus is that it is available under an open license. Our main contribution is the corpus, with POS tags and dependency tags, together with the code to extract and reproduce the results we achieved using it.
Anthology ID:
2020.wac-1.6
Volume:
Proceedings of the 12th Web as Corpus Workshop
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
WAC
Publisher:
European Language Resources Association
Pages:
42–49
Language:
English
URL:
https://aclanthology.org/2020.wac-1.6
Cite (ACL):
Shaurya Rawat, Mariano Rico, and Oscar Corcho. 2020. Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection. In Proceedings of the 12th Web as Corpus Workshop, pages 42–49, Marseille, France. European Language Resources Association.
Cite (Informal):
Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection (Rawat et al., WAC 2020)
PDF:
https://aclanthology.org/2020.wac-1.6.pdf