Jason Smith

Also published as: Jason R. Smith


2013

Dirt Cheap Web-Scale Parallel Text from the Common Crawl
Jason R. Smith | Herve Saint-Amand | Magdalena Plamada | Philipp Koehn | Chris Callison-Burch | Adam Lopez
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Unsupervised Learning on an Approximate Corpus
Jason Smith | Jason Eisner
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Review of Hypothesis Alignment Algorithms for MT System Combination via Confusion Network Decoding
Antti-Veikko Rosti | Xiaodong He | Damianos Karakos | Gregor Leusch | Yuan Cao | Markus Freitag | Spyros Matsoukas | Hermann Ney | Jason Smith | Bing Zhang
Proceedings of the Seventh Workshop on Statistical Machine Translation

2010

Extracting Parallel Sentences from Comparable Corpora using Document Level Alignment
Jason R. Smith | Chris Quirk | Kristina Toutanova
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

BART: A Modular Toolkit for Coreference Resolution
Yannick Versley | Simone Paolo Ponzetto | Massimo Poesio | Vladimir Eidelman | Alan Jern | Jason Smith | Xiaofeng Yang | Alessandro Moschitti
Proceedings of the ACL-08: HLT Demo Session

BART: A modular toolkit for coreference resolution
Yannick Versley | Simone Ponzetto | Massimo Poesio | Vladimir Eidelman | Alan Jern | Jason Smith | Xiaofeng Yang | Alessandro Moschitti
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Developing a full coreference system able to run all the way from raw text to semantic interpretation is a considerable engineering effort. Accordingly, there is very limited availability of off-the-shelf tools for researchers whose interests are not primarily in coreference or others who want to concentrate on a specific aspect of the problem. We present BART, a highly modular toolkit for developing coreference applications. In the Johns Hopkins workshop on using lexical and encyclopedic knowledge for entity disambiguation, the toolkit was used to extend a reimplementation of Soon et al.’s proposal with a variety of additional syntactic and knowledge-based features, and experiment with alternative resolution processes, preprocessing tools, and classifiers. BART has been released as open source software and is available from http://www.sfs.uni-tuebingen.de/~versley/BART

Latent-Variable Modeling of String Transductions with Finite-State Methods
Markus Dreyer | Jason Smith | Jason Eisner
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing