Malin Ahlberg


2015

2014

In this paper we describe and evaluate a tool for paradigm induction and lexicon extraction that has been applied to Old Swedish. The tool is semi-supervised and uses a small seed lexicon and unannotated corpora to derive full inflection tables for input lemmata. In the work presented here, the tool has been modified to deal with the rich spelling variation found in Old Swedish texts. We also present some initial experiments, which are the first steps towards creating a large-scale morphology for Old Swedish.

2013

2012

This paper describes work on a rule-based, open-source parser for Swedish. The central component is a wide-coverage grammar implemented in the GF formalism (Grammatical Framework), a dependently typed grammar formalism based on Martin-Löf type theory. GF has strong support for multilinguality and has so far been used successfully for controlled languages and recent experiments have showed that it is also possible to use the framework for parsing unrestricted language. In addition to GF, we use two other main resources: the Swedish treebank Talbanken and the electronic lexicon SALDO. By combining the grammar with a lexicon extracted from SALDO we obtain a parser accepting all sentences described by the given rules. We develop and test this on examples from Talbanken. The resulting parser gives a full syntactic analysis of the input sentences. It will be highly reusable, freely available, and as GF provides libraries for compiling grammars to a number of programming languages, chosen parts of the the grammar may be used in various NLP applications.