Features for Generic Corpus Querying

Thomas Eckart, Christoph Kuras, Uwe Quasthoff


Abstract
The availability of large corpora for more and more languages calls for generic querying and standard interfaces. This development is especially relevant in the context of integrated research environments like CLARIN or DARIAH. The paper focuses on several applications and implementation details on the basis of a unified corpus format, a unique POS tag set, and prepared data for word similarities. All described data and applications are already accessible, or will be in the near future, via well-documented RESTful Web services. The target group comprises all kinds of interested users with varying levels of experience in programming or corpus query languages.
Anthology ID:
L16-1444
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2794–2798
URL:
https://aclanthology.org/L16-1444
Cite (ACL):
Thomas Eckart, Christoph Kuras, and Uwe Quasthoff. 2016. Features for Generic Corpus Querying. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2794–2798, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Features for Generic Corpus Querying (Eckart et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1444.pdf