Valentin I. Spitkovsky

Also published as: Valentin Spitkovsky


2016

A comparison of Named-Entity Disambiguation and Word Sense Disambiguation
Angel Chang | Valentin I. Spitkovsky | Christopher D. Manning | Eneko Agirre
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge base, typically a Wikipedia-derived resource such as DBpedia. This task is closely related to word-sense disambiguation (WSD), where the mention of an open-class word is linked to a concept in a knowledge base, typically WordNet. This paper analyzes the relation between two annotated datasets on NED and WSD, highlighting their commonalities and differences. We detail the methods used to construct a NED system following the WSD word-expert approach, which requires a dictionary and builds one classifier for each target entity mention string. Constructing a dictionary for NED proved challenging, and although similarity and ambiguity are higher for NED, the results are also higher, owing to the larger amount of training data and to crisper, more skewed meaning differences.
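
The word-expert setup described in this abstract (a dictionary of candidates plus one classifier per mention string) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `MentionExpert`, the function `train_experts`, the toy training examples, and the scoring rule (context-word overlap with ties broken by candidate frequency) are hypothetical and are not the classifier used in the paper.

```python
from collections import Counter, defaultdict


class MentionExpert:
    """Hypothetical per-mention classifier: one instance per mention string."""

    def __init__(self):
        self.prior = Counter()             # entity -> number of training examples
        self.words = defaultdict(Counter)  # entity -> context word counts

    def add(self, context, entity):
        self.prior[entity] += 1
        self.words[entity].update(context.lower().split())

    def predict(self, context):
        tokens = context.lower().split()

        def score(entity):
            # Primary signal: overlap with words seen near this entity.
            # Tie-breaker: how often the entity was seen for this mention.
            overlap = sum(self.words[entity][t] for t in tokens)
            return (overlap, self.prior[entity])

        return max(self.prior, key=score)


def train_experts(examples):
    """examples: iterable of (mention string, context, entity) triples."""
    experts = defaultdict(MentionExpert)
    for mention, context, entity in examples:
        experts[mention].add(context, entity)
    return experts


# Toy data, invented for this sketch only.
toy_examples = [
    ("Paris", "flights to Paris France", "Paris"),
    ("Paris", "the Eiffel tower in Paris France", "Paris"),
    ("Paris", "Paris Hilton appeared on television", "Paris_Hilton"),
]
experts = train_experts(toy_examples)
city = experts["Paris"].predict("a hotel in France near Paris")
person = experts["Paris"].predict("Hilton on television")
```

The dictionary role is played by the keys of `experts` and the candidate sets stored in each expert's `prior`; a real system would plug a trained classifier into `predict`.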

2013

Breaking Out of Local Optima with Count Transforms and Model Recombination: A Study in Grammar Induction
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Three Dependency-and-Boundary Models for Grammar Induction
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

A Cross-Lingual Dictionary for English Wikipedia Concepts
Valentin I. Spitkovsky | Angel X. Chang
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a resource for automatically associating strings of text with English Wikipedia concepts. Our machinery is bi-directional, in the sense that it uses the same fundamental probabilistic methods to map strings to empirical distributions over Wikipedia articles as it does to map article URLs to distributions over short, language-independent strings of natural language text. For maximal inter-operability, we release our resource as a set of flat line-based text files, lexicographically sorted and encoded with UTF-8. These files capture joint probability distributions underlying concepts (we use the terms article, concept and Wikipedia URL interchangeably) and associated snippets of text, as well as other features that can come in handy when working with Wikipedia articles and related information.
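
As a rough illustration of the bi-directional mapping described in this abstract, the sketch below derives both conditional distributions (string to concepts, and concept to strings) from the same joint counts. The function name `build_distributions` and the toy observations are assumptions made for this example; the released resource's actual file format and estimation details are not reproduced here.

```python
from collections import Counter, defaultdict


def build_distributions(pairs):
    """pairs: iterable of (string, concept) observations, e.g. link text and target."""
    joint = Counter(pairs)
    string_totals = Counter()
    concept_totals = Counter()
    for (s, c), n in joint.items():
        string_totals[s] += n
        concept_totals[c] += n

    by_string = defaultdict(dict)   # string  -> {concept: P(concept | string)}
    by_concept = defaultdict(dict)  # concept -> {string:  P(string | concept)}
    for (s, c), n in joint.items():
        by_string[s][c] = n / string_totals[s]
        by_concept[c][s] = n / concept_totals[c]
    return by_string, by_concept


# Toy observations, invented for this sketch only.
observations = [
    ("NYC", "New_York_City"),
    ("NYC", "New_York_City"),
    ("Big Apple", "New_York_City"),
    ("Big Apple", "Big_Apple_(band)"),
]
by_string, by_concept = build_distributions(observations)
```

Both directions normalize the same joint counts, which is one simple way to read the abstract's point that a single probabilistic method serves both mappings.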

A Comparison of Chinese Parsers for Stanford Dependencies
Wanxiang Che | Valentin Spitkovsky | Ting Liu
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Capitalization Cues Improve Dependency Grammar Induction
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure

2011

Lateen EM: Unsupervised Training with Multiple Objectives, Applied to Dependency Grammar Induction
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Unsupervised Dependency Parsing without Gold Part-of-Speech Tags
Valentin I. Spitkovsky | Hiyan Alshawi | Angel X. Chang | Daniel Jurafsky
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Punctuation: Making a Point in Unsupervised Dependency Parsing
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

2010

From Baby Steps to Leapfrog: How “Less is More” in Unsupervised Dependency Parsing
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Profiting from Mark-Up: Hyper-Text Annotations for Guided Parsing
Valentin I. Spitkovsky | Daniel Jurafsky | Hiyan Alshawi
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Viterbi Training Improves Unsupervised Dependency Parsing
Valentin I. Spitkovsky | Hiyan Alshawi | Daniel Jurafsky | Christopher D. Manning
Proceedings of the Fourteenth Conference on Computational Natural Language Learning