Yuval Marton


2016

E-TIPSY: Search Query Corpus Annotated with Entities, Term Importance, POS Tags, and Syntactic Parses
Yuval Marton | Kristina Toutanova
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present E-TIPSY, a search query corpus annotated with named Entities, Term Importance, POS tags, and SYntactic parses. This corpus contains crowdsourced (gold) annotations of the three most important terms in each query. In addition, it contains automatically produced annotations of named entities, part-of-speech tags, and syntactic parses for the same queries. This corpus comes in two formats: (1) Sober Subset: annotations that two or more crowd workers agreed upon, and (2) Full Glass: all annotations. We analyze the strikingly low correlation between term importance and syntactic headedness, which invites research into effective ways of combining these different signals. Our corpus can serve as a benchmark for term importance methods aimed at improving search engine quality and as an initial step toward developing a dataset of gold linguistic analysis of web search queries. In addition, it can be used as a basis for linguistic inquiries into the kind of expressions used in search.

2014

A Unified Model for Soft Linguistic Reordering Constraints in Statistical Machine Translation
Junhui Li | Yuval Marton | Philip Resnik | Hal Daumé III
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages
Yoav Goldberg | Yuval Marton | Ines Rehbein | Yannick Versley | Özlem Çetinoğlu | Joel Tetreault
Proceedings of the First Joint Workshop on Statistical Parsing of Morphologically Rich Languages and Syntactic Analysis of Non-Canonical Languages

2013

Online Relative Margin Maximization for Statistical Machine Translation
Vladimir Eidelman | Yuval Marton | Philip Resnik
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages
Yoav Goldberg | Yuval Marton | Ines Rehbein | Yannick Versley
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages

SPMRL'13 Shared Task System: The CADIM Arabic Dependency Parser
Yuval Marton | Nizar Habash | Owen Rambow | Sarah Alkhulani
Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages

Dependency Parsing of Modern Standard Arabic with Lexical and Inflectional Features
Yuval Marton | Nizar Habash | Owen Rambow
Computational Linguistics, Volume 39, Issue 1 - March 2013

2012

Proceedings of the ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages
Marianna Apidianaki | Ido Dagan | Jennifer Foster | Yuval Marton | Djamé Seddah | Reut Tsarfaty
Proceedings of the ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages

On-Demand Distributional Semantic Distance and Paraphrasing
Yuval Marton
Tutorial Abstracts at the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)
Eneko Agirre | Johan Bos | Mona Diab | Suresh Manandhar | Yuval Marton | Deniz Yuret
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

2011

Improving Arabic Dependency Parsing with Form-based and Functional Morphological Features
Yuval Marton | Nizar Habash | Owen Rambow
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Filtering Antonymous, Trend-Contrasting, and Polarity-Dissimilar Distributional Paraphrases for Improving Statistical Machine Translation
Yuval Marton | Ahmed El Kholy | Nizar Habash
Proceedings of the Sixth Workshop on Statistical Machine Translation

2010

Improved Statistical Machine Translation with Hybrid Phrasal Paraphrases Derived from Monolingual Text and a Shallow Lexical Resource
Yuval Marton
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers

Paraphrase generation is useful for various NLP tasks. Pivoting techniques for paraphrasing benefit from the linguistic knowledge implicit in sentence alignment, but their reliance on parallel texts limits their applicability. Distributional paraphrasing has wider applicability but does not benefit from any linguistic knowledge. We combine a distributional semantic distance measure (based on a non-annotated corpus) with a shallow linguistic resource to create a hybrid semantic distance measure of words, which we extend to phrases. We embed this extended hybrid measure in a distributional paraphrasing technique, benefiting from both linguistic knowledge and independence from parallel texts. We evaluate the technique in statistical machine translation tasks by augmenting translation models with paraphrase-based translation rules, and show that it is superior to the non-augmented baseline and to both the distributional and pivot paraphrasing techniques. We train models on both a full-size dataset and a simulated “low density” small dataset.

Improving Arabic-to-English Statistical Machine Translation by Reordering Post-Verbal Subjects for Alignment
Marine Carpuat | Yuval Marton | Nizar Habash
Proceedings of the ACL 2010 Conference Short Papers

Reordering Matrix Post-verbal Subjects for Arabic-to-English SMT
Marine Carpuat | Yuval Marton | Nizar Habash
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

We improve our recently proposed technique for integrating Arabic verb-subject constructions in SMT word alignment (Carpuat et al., 2010) by distinguishing between matrix (or main clause) and non-matrix Arabic verb-subject constructions. In gold translations, most matrix VS (main clause verb-subject) constructions are translated in inverted SV order, while non-matrix (subordinate clause) VS constructions are inverted in only half the cases. In addition, while detecting verbs and their subjects is a hard task, our syntactic parser detects VS constructions better in matrix than in non-matrix clauses. As a result, reordering only matrix VS for word alignment consistently improves translation quality over a phrase-based SMT baseline, and over reordering all VS constructions, in both medium- and large-scale settings. In fact, the improvements obtained by reordering matrix VS in the medium-scale setting represent 44% of the gain in BLEU and 51% of the gain in TER obtained with a word alignment training bitext that is 5 times larger.

Domain-Independent Novel Event Discovery and Semi-Automatic Event Annotation
Hao Li | Xiang Li | Heng Ji | Yuval Marton
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation

Improving Arabic Dependency Parsing with Lexical and Inflectional Morphological Features
Yuval Marton | Nizar Habash | Owen Rambow
Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages

2009

Improved Statistical Machine Translation Using Monolingually-Derived Paraphrases
Yuval Marton | Chris Callison-Burch | Philip Resnik
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Estimating Semantic Distance Using Soft Semantic Constraints in Knowledge-Source – Corpus Hybrid Models
Yuval Marton | Saif Mohammad | Philip Resnik
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

The University of Maryland Statistical Machine Translation System for the Fourth Workshop on Machine Translation
Chris Dyer | Hendra Setiawan | Yuval Marton | Philip Resnik
Proceedings of the Fourth Workshop on Statistical Machine Translation

2008

Soft Syntactic Constraints for Hierarchical Phrased-Based Translation
Yuval Marton | Philip Resnik
Proceedings of ACL-08: HLT

Online Large-Margin Training of Syntactic and Structural Translation Features
David Chiang | Yuval Marton | Philip Resnik
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing