Linked Open Data and Web Corpus Data for noun compound bracketing

Pierre André Ménard, Caroline Barrière


Abstract
This research provides a comparison of a linked open data resource (DBpedia) and web corpus data resources (Google Web Ngrams and Google Books Ngrams) for noun compound bracketing. Large-corpus statistical analysis has often been used for noun compound bracketing, and our goal is to introduce a linked open data (LOD) resource for this task. We show its particularities and its performance on the task. Results obtained on resources tested individually are promising, showing the potential for DBpedia to be included in future hybrid systems.
Anthology ID:
L14-1242
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
702–709
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/263_Paper.pdf
Cite (ACL):
Pierre André Ménard and Caroline Barrière. 2014. Linked Open Data and Web Corpus Data for noun compound bracketing. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 702–709, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Linked Open Data and Web Corpus Data for noun compound bracketing (Ménard & Barrière, LREC 2014)
PDF:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/263_Paper.pdf