Maria Holmqvist

2012

Alignment-based reordering for SMT
Maria Holmqvist | Sara Stymne | Lars Ahrenberg | Magnus Merkel
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a method for improving word alignment quality for phrase-based statistical machine translation by reordering the source text according to the target word order suggested by an initial word alignment. The reordered text is used to create a second word alignment, which can improve on the first because the word order of the reordered source is more similar to that of the target. The method requires no other pre-processing such as part-of-speech tagging or parsing. We report improved BLEU scores for English-to-German and English-to-Swedish translation. We also examined the effect on word alignment quality and found that the reordering method increased recall while lowering precision, which can partly explain the improved BLEU scores. A manual evaluation of the translation output was also performed to understand what effect our reordering method has on the translation system. We found that where the system employing reordering differed from the baseline by having more words or a different word order, this generally led to an improvement in translation quality.
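
The reordering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `reorder_source`, the use of the mean target position as a sort key, and the handling of unaligned words (they follow the nearest preceding aligned word) are all assumptions made for illustration.

```python
def reorder_source(source_tokens, alignment):
    """Reorder source tokens to follow the target order suggested by an alignment.

    alignment is an iterable of (source_index, target_index) pairs.
    Assumptions (not taken from the paper):
      - an aligned source word is keyed by the mean of its target positions;
      - an unaligned source word inherits the key of the nearest preceding
        aligned word, or -1.0 if there is none;
      - ties are broken by the original source position.
    """
    targets = {}
    for s, t in alignment:
        targets.setdefault(s, []).append(t)

    keys = []
    last_key = -1.0
    for i in range(len(source_tokens)):
        if i in targets:
            last_key = sum(targets[i]) / len(targets[i])
        keys.append(last_key)

    order = sorted(range(len(source_tokens)), key=lambda i: (keys[i], i))
    return [source_tokens[i] for i in order]


# English source with an alignment to German "er hat das Buch gelesen".
source = ["he", "has", "read", "the", "book"]
links = [(0, 0), (1, 1), (2, 4), (3, 2), (4, 3)]
reordered = reorder_source(source, links)
print(reordered)
```

In the pipeline the abstract outlines, the reordered source would then be aligned to the target a second time; that second alignment pass is outside the scope of this sketch.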

2011

A Gold Standard for English-Swedish Word Alignment
Maria Holmqvist | Lars Ahrenberg
Proceedings of the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011)

Experiments with word alignment, normalization and clause reordering for SMT between English and German
Maria Holmqvist | Sara Stymne | Lars Ahrenberg
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Heuristic Word Alignment with Parallel Phrases
Maria Holmqvist
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present a heuristic method for word alignment, which is the task of identifying corresponding words in parallel text. The heuristic method is based on parallel phrases extracted from manually word-aligned sentence pairs. Word alignment is performed by matching parallel phrases to new sentence pairs, and adding word links from the parallel phrase to words in the matching sentence segment. Experiments on an English-Swedish parallel corpus showed that the heuristic phrase-based method produced word alignments with high precision but low recall. In order to improve alignment recall, phrases were generalized by replacing words with part-of-speech categories. The generalization improved recall but at the expense of precision. Two filtering strategies were investigated to prune the large set of generalized phrases. Finally, the phrase-based method was compared to statistical word alignment with GIZA++. We found that although statistical alignments based on large datasets will outperform phrase-based word alignment, a combination of phrase-based and statistical word alignment outperformed pure statistical alignment in terms of Alignment Error Rate (AER).
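
For readers unfamiliar with the metrics named in this abstract, the sketch below computes precision, recall, and AER. It assumes the commonly used definition with sure (S) and possible (P) gold links; the abstract does not say which variant the paper uses. The function name `alignment_scores` and the example links are hypothetical.

```python
def alignment_scores(predicted, sure, possible=None):
    """Return (precision, recall, aer) for a predicted word alignment.

    Links are (source_index, target_index) pairs.
    Assumed definitions, with A = predicted, S = sure, P = possible (S is a subset of P):
      precision = |A & P| / |A|
      recall    = |A & S| / |S|
      AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible or ()) | s  # every sure link also counts as possible
    if not a or not s:
        raise ValueError("predicted and sure link sets must be non-empty")

    precision = len(a & p) / len(a)
    recall = len(a & s) / len(s)
    aer = 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
    return precision, recall, aer


# Hypothetical links, for illustration only.
predicted = {(0, 0), (1, 1), (2, 3)}
sure = {(0, 0), (1, 1), (2, 2)}
possible = {(2, 3)}
precision, recall, aer = alignment_scores(predicted, sure, possible)
print(precision, recall, aer)
```

Under this definition a high-precision, low-recall aligner such as the one described above finds few of the sure links, which keeps AER high even when most of its predicted links are correct.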

Vs and OOVs: Two Problems for Translation between German and English
Sara Stymne | Maria Holmqvist | Lars Ahrenberg
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Learning Dense Models of Query Similarity from User Click Logs
Fabio De Bona | Stefan Riezler | Keith Hall | Massimiliano Ciaramita | Amaç Herdaǧdelen | Maria Holmqvist
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics