2013
Nordic and Baltic Wordnets Aligned and Compared through “WordTies”
Bolette Sandford Pedersen | Lars Borin | Markus Forsberg | Neeme Kahusk | Krister Lindén | Jyrki Niemi | Niklas Nisbeth | Lars Nygaard | Heili Orav | Eirikur Rögnvaldsson | Mitchell Seaton | Kadri Vider | Kaarlo Voionmaa
Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013)
2008
Glossa: a Multilingual, Multimodal, Configurable User Interface
Lars Nygaard | Joel Priestley | Anders Nøklestad | Janne Bondi Johannessen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We describe a web-based corpus query system, Glossa, which combines the expressiveness of regular query languages with the user-friendliness of a graphical interface. Since corpus users are usually linguists with little interest in technical matters, we have developed a system where the user need not have any prior knowledge of the search system. Furthermore, no previous knowledge of abbreviations for metavariables such as part of speech and source text is needed. All searches are done using checkboxes, pull-down menus, or by typing words or other strings. Querying for more than one word is done by adding another query box, and querying for parts of words by choosing a feature such as start of word. The Glossa system also allows a wide range of viewing and post-processing options. Collocations can be viewed and counted in a number of ways, and displayed as different kinds of graphical charts. Annotating or deleting individual results for further processing is also easy. The Glossa system is already in use for a number of corpora. Corpus administrators can easily adapt the system to a wide range of corpora, including multilingual corpora and corpora with audio and video content.
Evaluation of Linguistics-Based Translation
Janne Bondi Johannessen | Torbjørn Nordgård | Lars Nygaard
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
We report on the evaluation of the Norwegian-English MT prototype system LOGON. The system is rule-based and makes use of well-established frameworks for analysis and generation (LFG and HPSG). Minimal Recursion Semantics is the glue which performs transfer from source to target language and serves as the information vehicle between LFG and HPSG. The project-internal testing uses material from the training data sources, in the domain of guidebooks for mountain hiking in the summer season in Southern Norway. This testing, involving eight external assessors, yielded translations for 57% of the sentences, with acceptable fidelity measures but less than acceptable fluency measures. Additional test 1: The LOGON system is sensitive to vocabulary, so we were interested to see to what extent the system would be able to carry over to new texts from the same narrow domain. With only 22% acceptable translations, this test had disappointing results. Additional test 2: Given the grammatical backbone of the system, we found it important to test it on a syntactic test-suite with only known vocabulary. Here, 55% of the sentences had good translations. The tests show that even within a very narrow semantic domain, vocabulary sensitivity is the most crucial obstacle for this approach.
2007
Using Danish as a CG Interlingua: A Wide-Coverage Norwegian-English Machine Translation System
Eckhard Bick | Lars Nygaard
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)
An Advanced Speech Corpus for Norwegian
Janne Bondi Johannessen | Kristin Hagen | Joel James Priestley | Lars Nygaard
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)
2006
Using a Bi-Lingual Dictionary in Lexical Transfer
Lars Nygaard | Jan Tore Lønning | Torbjørn Nordgård | Stephan Oepen
Proceedings of the 11th Annual Conference of the European Association for Machine Translation
Improbable morphological forms in a computational lexicon
Kristin Hagen | Lars Nygaard
Proceedings of the 15th Nordic Conference of Computational Linguistics (NODALIDA 2005)
2004
The OPUS Corpus - Parallel and Free: http://logos.uio.no/opus
Jörg Tiedemann | Lars Nygaard
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)
The OPUS corpus is a growing collection of translated documents collected from the internet. The current version contains about 30 million words in 60 languages. The entire corpus is sentence aligned and it also contains linguistic markup for certain languages.