Karolina Owczarzak


2018

The Alexa Meaning Representation Language
Thomas Kollar | Danielle Berry | Lauren Stuart | Karolina Owczarzak | Tagyoung Chung | Lambert Mathias | Michael Kayser | Bradford Snow | Spyros Matsoukas
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

This paper introduces a meaning representation for spoken language understanding. Unlike previous approaches, which factor spoken utterances into domains, the Alexa Meaning Representation Language (AMRL) provides a common representation for how people communicate in spoken language. AMRL is a rooted graph that links to a large-scale ontology and supports cross-domain queries, fine-grained types, complex utterances, and composition. A spoken language dataset containing ∼20k examples across eight domains has been collected for Alexa. A version of this meaning representation was released to developers at a trade show in 2016.

2014

Wordsyoudontknow: Evaluation of lexicon-based decompounding with unknown handling
Karolina Owczarzak | Ferdinand de Haan | George Krupka | Don Hindle
Proceedings of the First Workshop on Computational Approaches to Compound Analysis (ComAComA 2014)

2012

Assessing the Effect of Inconsistent Assessors on Summarization Evaluation
Karolina Owczarzak | Peter A. Rankel | Hoa Trang Dang | John M. Conroy
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization
John M. Conroy | Hoa Trang Dang | Ani Nenkova | Karolina Owczarzak
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

An Assessment of the Accuracy of Automatic Evaluation in Summarization
Karolina Owczarzak | John M. Conroy | Hoa Trang Dang | Ani Nenkova
Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization

2011

Who wrote What Where: Analyzing the content of human and automatic summaries
Karolina Owczarzak | Hoa Dang
Proceedings of the Workshop on Automatic Summarization for Different Genres, Media, and Languages

2009

Evaluation of Automatic Summaries: Metrics under Varying Data Conditions
Karolina Owczarzak | Hoa Trang Dang
Proceedings of the 2009 Workshop on Language Generation and Summarisation (UCNLG+Sum 2009)

DEPEVAL(summ): Dependency-based Evaluation for Automatic Summaries
Karolina Owczarzak
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2007

A cluster-based representation for multi-system MT evaluation
Nicolas Stroppa | Karolina Owczarzak
Proceedings of the 11th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

Dependency-Based Automatic Evaluation for Machine Translation
Karolina Owczarzak | Josef van Genabith | Andy Way
Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation

Labelled Dependencies in Machine Translation Evaluation
Karolina Owczarzak | Josef van Genabith | Andy Way
Proceedings of the Second Workshop on Statistical Machine Translation

2006

Multi-Engine Machine Translation by Recursive Sentence Decomposition
Bart Mellebeek | Karolina Owczarzak | Josef Van Genabith | Andy Way
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

In this paper, we present a novel approach to combining the outputs of multiple MT engines into a consensus translation. In contrast to previous Multi-Engine Machine Translation (MEMT) techniques, we do not rely on word alignments of output hypotheses, but prepare the input sentence for multi-engine processing. We do this by using a recursive decomposition algorithm that produces simple chunks as input to the MT engines. A consensus translation is produced by combining the best chunk translations, selected through majority voting, a trigram language model score, and a confidence score assigned to each MT engine. We report statistically significant relative improvements of up to 9% in BLEU score in experiments (English→Spanish) carried out on an 800-sentence test set extracted from the Penn-II Treebank.

Wrapper Syntax for Example-based Machine Translation
Karolina Owczarzak | Bart Mellebeek | Declan Groves | Josef Van Genabith | Andy Way
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

TransBooster is a wrapper technology designed to improve the performance of wide-coverage machine translation systems. Using linguistically motivated syntactic information, it automatically decomposes source language sentences into shorter and syntactically simpler chunks, and recomposes their translations to form target language sentences. This generally improves both the word order and lexical selection of the translation. To date, TransBooster has been successfully applied to rule-based MT, statistical MT, and multi-engine MT. This paper presents the application of TransBooster to Example-Based Machine Translation (EBMT). In an experiment conducted on test sets extracted from Europarl and the Penn-II Treebank, we show that our method can raise the BLEU score by up to 3.8% relative to the EBMT baseline. We also conduct a manual evaluation, showing that TransBooster-enhanced EBMT produces better output than the baseline EBMT in terms of fluency in 55% of the cases and in terms of accuracy in 53% of the cases.

Contextual Bitext-Derived Paraphrases in Automatic MT Evaluation
Karolina Owczarzak | Declan Groves | Josef Van Genabith | Andy Way
Proceedings on the Workshop on Statistical Machine Translation

A Syntactic Skeleton for Statistical Machine Translation
Bart Mellebeek | Karolina Owczarzak | Declan Groves | Josef Van Genabith | Andy Way
Proceedings of the 11th Annual Conference of the European Association for Machine Translation

2005

Improving Online Machine Translation Systems
Bart Mellebeek | Anna Khasin | Karolina Owczarzak | Josef Van Genabith | Andy Way
Proceedings of Machine Translation Summit X: Papers

In (Mellebeek et al., 2005), we proposed the design, implementation and evaluation of a novel and modular approach to boost the translation performance of existing, wide-coverage, freely available machine translation systems, based on reliable and fast automatic decomposition of the translation input and corresponding composition of translation output. Despite showing some initial promise, our method did not improve on the baseline Logomedia and Systran MT systems. In this paper, we improve on the algorithm presented in (Mellebeek et al., 2005), and on the same test data, show increased scores for a range of automatic evaluation metrics. Our algorithm now outperforms Logomedia, obtains similar results to SDL and falls tantalisingly short of the performance achieved by Systran.