Constructing a Norwegian Academic Wordlist

Janne M Johannessen, Arash Saidi, Kristin Hagen


Abstract
We present the development of a Norwegian Academic Wordlist (AKA list) for the Norwegian Bokmål variety. To identify specific academic vocabulary we developed a 100-million-word academic corpus based on the University of Oslo archive of digital publications. Other corpora were used for testing and developing general word lists. We tried two different methods, those of Carlund et al. (2012) and Gardner & Davies (2013), and compared them. The resulting list is presented on a web site, where the words can be inspected in different ways, and freely downloaded.
Anthology ID:
L16-1232
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1457–1462
URL:
https://aclanthology.org/L16-1232
Cite (ACL):
Janne M Johannessen, Arash Saidi, and Kristin Hagen. 2016. Constructing a Norwegian Academic Wordlist. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1457–1462, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
Constructing a Norwegian Academic Wordlist (Johannessen et al., LREC 2016)
PDF:
https://aclanthology.org/L16-1232.pdf