The Nordic Dialect Corpus

Janne Bondi Johannessen, Joel Priestley, Kristin Hagen, Anders Nøklestad, André Lynum


Abstract
In this paper, we describe the Nordic Dialect Corpus, which has recently been completed. The corpus has a variety of features that together make it an advanced tool for language researchers. These features include: linguistic content (dialects from five closely related languages), annotation (tagging and two types of transcription), a search interface (advanced possibilities for combining a large array of search criteria and results presentation in an intuitive and simple interface), many search variables (linguistics-based, informant-based, time-based), multimedia display (linking of sound and video to transcriptions), display of results in maps, display of informant details (number of words and other information on informants), and advanced results handling (concordances, collocations, counts and statistics shown in a variety of graphical modes, plus further processing). Finally, and importantly, the corpus is freely available for research on the web. We give examples of various kinds of searches, of displays of results, and of results handling.
Anthology ID:
L12-1453
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3387–3391
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/773_Paper.pdf
Cite (ACL):
Janne Bondi Johannessen, Joel Priestley, Kristin Hagen, Anders Nøklestad, and André Lynum. 2012. The Nordic Dialect Corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3387–3391, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
The Nordic Dialect Corpus (Johannessen et al., LREC 2012)
PDF:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/773_Paper.pdf