Shane Bergsma


2014

I’m a Belieber: Social Roles via Self-identification and Conceptual Attributes
Charley Beller | Rebecca Knowles | Craig Harman | Shane Bergsma | Margaret Mitchell | Benjamin Van Durme
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2013

Learning Domain-Specific, L1-Specific Measures of Word Readability
Shane Bergsma | David Yarowsky
Traitement Automatique des Langues, Volume 54, Numéro 1 : Varia [Varia]

Broadly Improving User Classification via Communication-Based Name and Location Clustering on Twitter
Shane Bergsma | Mark Dredze | Benjamin Van Durme | Theresa Wilson | David Yarowsky
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Using Conceptual Class Attributes to Characterize Social Media Users
Shane Bergsma | Benjamin Van Durme
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Explicit and Implicit Syntactic Features for Text Classification
Matt Post | Shane Bergsma
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Stylometric Analysis of Scientific Articles
Shane Bergsma | Matt Post | David Yarowsky
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Language Identification for Creating Language-Specific Twitter Collections
Shane Bergsma | Paul McNamee | Mossaab Bagdouri | Clayton Fink | Theresa Wilson
Proceedings of the Second Workshop on Language in Social Media

2011

Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
Shane Bergsma | David Yarowsky | Kenneth Church
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Joint Training of Dependency Parsing Filters through Latent Support Vector Machines
Colin Cherry | Shane Bergsma
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Using Visual Information to Predict Lexical Preference
Shane Bergsma | Randy Goebel
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

2010

Fast and Accurate Arc Filtering for Dependency Parsing
Shane Bergsma | Colin Cherry
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Using Web-scale N-grams to Improve Base NP Parsing Performance
Emily Pitler | Shane Bergsma | Dekang Lin | Kenneth Church
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Predicting the Semantic Compositionality of Prefix Verbs
Shane Bergsma | Aditya Bhargava | Hua He | Grzegorz Kondrak
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

New Tools for Web-Scale N-grams
Dekang Lin | Kenneth Church | Heng Ji | Satoshi Sekine | David Yarowsky | Shane Bergsma | Kailash Patil | Emily Pitler | Rachel Lathbury | Vikram Rao | Kapil Dalwani | Sushant Narsale
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

While the web provides a fantastic linguistic resource, collecting and processing data at web-scale is beyond the reach of most academic laboratories. Previous research has relied on search engines to collect online information, but this is hopelessly inefficient for building large-scale linguistic resources, such as lists of named-entity types or clusters of distributionally similar words. An alternative to processing web-scale text directly is to use the information provided in an N-gram corpus. An N-gram corpus is an efficient compression of large amounts of text. An N-gram corpus states how often each sequence of words (up to length N) occurs. We propose tools for working with enhanced web-scale N-gram corpora that include richer levels of source annotation, such as part-of-speech tags. We describe a new set of search tools that make use of these tags, and collectively lower the barrier for lexical learning and ambiguity resolution at web-scale. They will allow novel sources of information to be applied to long-standing natural language challenges.
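As a rough illustration of what an N-gram corpus records, the sketch below counts every word sequence of length 1 to N in a toy corpus. It is not the toolkit described in the abstract: the function name `count_ngrams`, the whitespace tokenization, and the two-sentence corpus are assumptions made only for this example, and it omits the part-of-speech annotation the paper adds.

```python
from collections import Counter


def count_ngrams(sentences, max_n):
    """Count every word sequence of length 1..max_n (hypothetical helper)."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: plain whitespace tokenization, no normalization.
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            # Slide a window of width n across the token list.
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
    return counts


# Toy corpus, invented for illustration only.
toy_corpus = ["the cat sat", "the cat ran"]
ngram_counts = count_ngrams(toy_corpus, max_n=2)
```

Looking up `ngram_counts[("the", "cat")]` then gives the frequency of that bigram without rereading the text, which is the sense in which an N-gram corpus acts as a compressed summary of the source.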

Creating Robust Supervised Classifiers via Web-Scale N-Gram Data
Shane Bergsma | Emily Pitler | Dekang Lin
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Transliteration Generation and Mining with Limited Training Resources
Sittichai Jiampojamarn | Kenneth Dwyer | Shane Bergsma | Aditya Bhargava | Qing Dou | Mi-Young Kim | Grzegorz Kondrak
Proceedings of the 2010 Named Entities Workshop

Improved Natural Language Learning via Variance-Regularization Support Vector Machines
Shane Bergsma | Dekang Lin | Dale Schuurmans
Proceedings of the Fourteenth Conference on Computational Natural Language Learning

2009

A Ranking Approach to Stress Prediction for Letter-to-Phoneme Conversion
Qing Dou | Shane Bergsma | Sittichai Jiampojamarn | Grzegorz Kondrak
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

Glen, Glenda or Glendale: Unsupervised and Semi-supervised Learning of English Noun Gender
Shane Bergsma | Dekang Lin | Randy Goebel
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009)

2008

Discriminative Learning of Selectional Preference from Unlabeled Text
Shane Bergsma | Dekang Lin | Randy Goebel
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Distributional Identification of Non-Referential Pronouns
Shane Bergsma | Dekang Lin | Randy Goebel
Proceedings of ACL-08: HLT

2007

Learning Noun Phrase Query Segmentation
Shane Bergsma | Qin Iris Wang
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

Automatic Answer Typing for How-Questions
Christopher Pinchak | Shane Bergsma
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

Alignment-Based Discriminative String Similarity
Shane Bergsma | Grzegorz Kondrak
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Bootstrapping Path-Based Pronoun Resolution
Shane Bergsma | Dekang Lin
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

An Expectation Maximization Approach to Pronoun Resolution
Colin Cherry | Shane Bergsma
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)