Sharon Goldwater

Also published as: Sharon J. Goldwater


2024

Self-supervised speech representations display some human-like cross-linguistic perceptual abilities
Joselyn Rodriguez | Kamala Sreepada | Ruolan Leslie Famularo | Sharon Goldwater | Naomi Feldman
Proceedings of the 28th Conference on Computational Natural Language Learning

State-of-the-art models in automatic speech recognition have shown remarkable improvements due to modern self-supervised (SSL) transformer-based architectures such as wav2vec 2.0 (Baevski et al., 2020). However, how these models encode phonetic information is still not well understood. We explore whether SSL speech models display a linguistic property that characterizes human speech perception: language specificity. We show that while wav2vec 2.0 displays an overall language-specificity effect when tested on Hindi vs. English, it does not resemble human speech perception when tested on finer-grained differences in Hindi speech contrasts.
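
As a rough illustration of how such language-specificity probes are often run over frame-level speech representations (this sketch is ours, not the paper's protocol), an ABX-style trial checks whether a token X is closer in representation space to a same-category token A than to a different-category token B:

```python
# Illustrative ABX-style discrimination test over frame-level embeddings
# (e.g., features extracted from an SSL model such as wav2vec 2.0).
import numpy as np

def pooled(frames: np.ndarray) -> np.ndarray:
    """Mean-pool a (num_frames, dim) matrix into a single vector."""
    return frames.mean(axis=0)

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    return 1.0 - float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def abx_correct(a: np.ndarray, b: np.ndarray, x: np.ndarray) -> bool:
    """A and X share a phone category, B differs; the trial is 'correct'
    if X is closer to A than to B."""
    return cosine_distance(pooled(x), pooled(a)) < cosine_distance(pooled(x), pooled(b))

# Random stand-ins for real embeddings, just to show the call pattern:
rng = np.random.default_rng(0)
a, b, x = (rng.normal(size=(20, 768)) for _ in range(3))
print(abx_correct(a, b, x))
```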

Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets
Amr Keleg | Walid Magdy | Sharon Goldwater
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

When annotating multi-dialect Arabic datasets, it is common to randomly assign the samples across a pool of native Arabic speakers. Recent analyses recommended routing dialectal samples to native speakers of their respective dialects to build higher-quality datasets. However, automatically identifying the dialect of samples is hard, and annotators who are native speakers of specific Arabic dialects may be scarce. Arabic Level of Dialectness (ALDi) was recently introduced as a quantitative variable that measures how far a sentence diverges from Standard Arabic. We hypothesize that when samples are assigned to annotators at random, samples with higher ALDi scores are harder to label, especially if they are written in dialects that the annotators do not speak. We test this by analyzing the relation between ALDi scores and annotator agreement on 15 public datasets that provide raw individual annotations for various sentence-classification tasks, and we find strong evidence supporting our hypothesis for 11 of them. Consequently, we recommend prioritizing the routing of samples with high ALDi scores to native speakers of each sample's dialect, since the dialect of such samples can be automatically identified with higher accuracy.
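
A minimal sketch of the kind of analysis described, assuming a table with one label column per annotator and an ALDi score per sample (column names are illustrative, not the paper's):

```python
# Relate ALDi scores to inter-annotator agreement: per sample, check whether
# all annotators chose the same label, then average within ALDi bins.
import pandas as pd

df = pd.DataFrame({
    "ann1": ["pos", "neg", "pos", "neg"],
    "ann2": ["pos", "pos", "pos", "neg"],
    "ann3": ["pos", "neg", "neg", "neg"],
    "aldi": [0.1, 0.8, 0.9, 0.2],
})

label_cols = ["ann1", "ann2", "ann3"]
df["full_agreement"] = df[label_cols].nunique(axis=1).eq(1)
df["aldi_bin"] = pd.cut(df["aldi"], bins=[0.0, 1/3, 2/3, 1.0], include_lowest=True)
print(df.groupby("aldi_bin", observed=True)["full_agreement"].mean())
```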

2023

ALDi: Quantifying the Arabic Level of Dialectness of Text
Amr Keleg | Sharon Goldwater | Walid Magdy
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications. To handle this variation, previous work in Arabic NLP has focused on Dialect Identification (DI) on the sentence or the token level. However, DI treats the task as binary, whereas we argue that Arabic speakers perceive a spectrum of dialectness, which we operationalize at the sentence level as the Arabic Level of Dialectness (ALDi), a continuous linguistic variable. We introduce the AOC-ALDi dataset (derived from the AOC dataset), containing 127,835 sentences (17% from news articles and 83% from user comments on those articles) which are manually labeled with their level of dialectness. We provide a detailed analysis of AOC-ALDi and show that a model trained on it can effectively identify levels of dialectness on a range of other corpora (including dialects and genres not included in AOC-ALDi), providing a more nuanced picture than traditional DI systems. Through case studies, we illustrate how ALDi can reveal Arabic speakers’ stylistic choices in different situations, a useful property for sociolinguistic analyses.
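
As a purely illustrative baseline (not the paper's model), sentence-level dialectness can be treated as a regression target, here with a naive character n-gram regressor over AOC-ALDi-style (sentence, score) pairs:

```python
# Naive dialectness-regression baseline on (sentence, ALDi score) pairs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

sentences = ["...", "..."]      # Arabic sentences (placeholders here)
aldi_scores = [0.0, 1.0]        # gold dialectness scores in [0, 1]

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    Ridge(alpha=1.0),
)
model.fit(sentences, aldi_scores)
print(model.predict(["..."]))   # predicted level of dialectness
```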

Language-Agnostic Measures Discriminate Inflection and Derivation
Coleman Haley | Edoardo M. Ponti | Sharon Goldwater
Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP

In morphology, a distinction is commonly drawn between inflection and derivation. However, a precise definition of this distinction which captures the way the terms are used across languages remains elusive within linguistic theory, typically being based on subjective tests. In this study, we present four quantitative measures which use the statistics of a raw text corpus in a language to estimate how much and how variably a morphological construction changes aspects of the lexical entry, specifically the word's form and the word's semantic and syntactic properties (as operationalised by distributional word embeddings). Based on a sample of 26 languages, we find that we can reconstruct 90% of the classification of constructions into inflection and derivation in UniMorph using our four measures, providing large-scale cross-linguistic evidence that the concepts of inflection and derivation are associated with measurable signatures in form and distribution that behave consistently across a variety of languages. Critically, our measures and models are entirely language-agnostic, yet perform well across all languages studied. We find that while the terms inflection and derivation are used highly consistently in terms of our measures, many constructions still lie near the model's decision boundary between the two categories, indicating a gradient, rather than categorical, distinction.
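
A simplified sketch of measures in this spirit (the exact definitions below are ours, not the paper's): for one construction, score the mean and variability of form change and of distributional change across attested word pairs:

```python
# Corpus-based construction measures: how much, and how variably, a
# construction changes a word's form and its distributional embedding.
import difflib
import numpy as np

def form_change(base: str, derived: str) -> float:
    """Rough orthographic change: 1 - string similarity."""
    return 1.0 - difflib.SequenceMatcher(None, base, derived).ratio()

def distribution_change(vec_base: np.ndarray, vec_derived: np.ndarray) -> float:
    """Cosine distance between distributional embeddings of the two forms."""
    sim = np.dot(vec_base, vec_derived) / (np.linalg.norm(vec_base) * np.linalg.norm(vec_derived))
    return 1.0 - float(sim)

def construction_measures(pairs, embeddings):
    """pairs: list of (base, derived) strings; embeddings: dict word -> vector."""
    form = [form_change(b, d) for b, d in pairs]
    dist = [distribution_change(embeddings[b], embeddings[d]) for b, d in pairs]
    return {
        "mean_form_change": float(np.mean(form)),
        "sd_form_change": float(np.std(form)),
        "mean_dist_change": float(np.mean(dist)),
        "sd_dist_change": float(np.std(dist)),
    }
```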

2022

Universal Dependencies and Semantics for English and Hebrew Child-directed Speech
Ida Szubert | Omri Abend | Nathan Schneider | Samuel Gibbon | Sharon Goldwater | Mark Steedman
Proceedings of the Society for Computation in Linguistics 2022

2021

Adaptor Grammars for Unsupervised Paradigm Clustering
Kate McCurdy | Sharon Goldwater | Adam Lopez
Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

This work describes the Edinburgh submission to the SIGMORPHON 2021 Shared Task 2 on unsupervised morphological paradigm clustering. Given raw text input, the task was to assign each token to a cluster with other tokens from the same paradigm. We use Adaptor Grammar segmentations combined with frequency-based heuristics to predict paradigm clusters. Our system achieved the highest average F1 score across 9 test languages, placing first out of 15 submissions.
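
A minimal sketch of the clustering step under a simple shared-stem heuristic (our simplification, not the submitted system): once each token has a segmentation, group tokens that share a stem segment.

```python
# Group tokens into paradigm clusters by a crudely chosen stem morph.
from collections import defaultdict

def cluster_by_stem(segmentations: dict[str, list[str]]) -> dict[str, set[str]]:
    """segmentations maps a token to its morphs, e.g. 'walked' -> ['walk', 'ed']."""
    clusters = defaultdict(set)
    for token, morphs in segmentations.items():
        stem = max(morphs, key=len)          # crude stem choice: longest morph
        clusters[stem].add(token)
    return dict(clusters)

print(cluster_by_stem({
    "walked": ["walk", "ed"],
    "walking": ["walk", "ing"],
    "talked": ["talk", "ed"],
}))
```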

On the Difficulty of Segmenting Words with Attention
Ramon Sanabria | Hao Tang | Sharon Goldwater
Proceedings of the Second Workshop on Insights from Negative Results in NLP

Word segmentation, the problem of finding word boundaries in speech, is of interest for a range of tasks. Previous papers have suggested that for sequence-to-sequence models trained on tasks such as speech translation or speech recognition, attention can be used to locate and segment the words. We show, however, that even on monolingual data this approach is brittle. In our experiments with different input types, data sizes, and segmentation algorithms, only models trained to predict phones from words succeed in the task. Models trained to predict words from either phones or speech (i.e., the opposite direction, which is the one needed to generalize to new data) yield much worse results, suggesting that attention-based segmentation is only useful in limited scenarios.
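
A toy sketch of the attention-based segmentation idea being evaluated (our simplification): read boundaries off the movement of attention peaks as one steps through the output sequence.

```python
# Hypothesize a boundary wherever the attention peak moves forward between
# consecutive output words; attn has shape (num_output_words, num_input_frames).
import numpy as np

def boundaries_from_attention(attn: np.ndarray) -> list[int]:
    peaks = attn.argmax(axis=1)          # peak input frame for each output word
    bounds = []
    for prev, cur in zip(peaks, peaks[1:]):
        if cur > prev:                   # attention moved forward: place a boundary
            bounds.append(int((prev + cur) // 2))
    return bounds

attn = np.array([
    [0.7, 0.2, 0.05, 0.05],
    [0.1, 0.2, 0.6, 0.1],
    [0.05, 0.05, 0.2, 0.7],
])
print(boundaries_from_attention(attn))   # [1, 2]
```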

A phonetic model of non-native spoken word processing
Yevgen Matusevych | Herman Kamper | Thomas Schatz | Naomi Feldman | Sharon Goldwater
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Non-native speakers show difficulties with spoken word processing. Many studies attribute these difficulties to imprecise phonological encoding of words in the lexical memory. We test an alternative hypothesis: that some of these difficulties can arise from the non-native speakers’ phonetic perception. We train a computational model of phonetic learning, which has no access to phonology, on either one or two languages. We first show that the model exhibits predictable behaviors on phone-level and word-level discrimination tasks. We then test the model on a spoken word processing task, showing that phonology may not be necessary to explain some of the word processing effects observed in non-native speakers. We run an additional analysis of the model’s lexical representation space, showing that the two training languages are not fully separated in that space, similarly to the languages of a bilingual human speaker.

2020

Inflecting When There’s No Majority: Limitations of Encoder-Decoder Neural Networks as Cognitive Models for German Plurals
Kate McCurdy | Sharon Goldwater | Adam Lopez
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form /-(e)d/ to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class — and thus fail on tasks like German number inflection, where infrequent suffixes like /-s/ can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high variability, and two suffixes evince ‘regular’ behavior, appearing more often with phonologically atypical inputs. Encoder-decoder models do generalize the most frequently produced plural class, but do not show human-like variability or ‘regular’ extension of these other plural markers. We conclude that modern neural models may still struggle with minority-class generalization.

The role of context in neural pitch accent detection in English
Elizabeth Nielsen | Mark Steedman | Sharon Goldwater
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Prosody is a rich information source in natural language, serving as a marker for phenomena such as contrast. In order to make this information available to downstream tasks, we need a way to detect prosodic events in speech. We propose a new model for pitch accent detection, inspired by the work of Stehwien et al. (2018), who presented a CNN-based model for this task. Our model makes greater use of context by using full utterances as input and adding an LSTM layer. We find that these innovations lead to an improvement from 87.5% to 88.7% accuracy on pitch accent detection on American English speech in the Boston University Radio News Corpus, a state-of-the-art result. We also find that a simple baseline that just predicts a pitch accent on every content word yields 82.2% accuracy, and we suggest that this is the appropriate baseline for this task. Finally, we conduct ablation tests that show pitch is the most important acoustic feature for this task and this corpus.
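
A rough PyTorch sketch in the spirit of the described architecture (layer sizes and the frame-level output are placeholders, not the authors' exact model): a CNN over acoustic frames followed by an LSTM that sees the whole utterance.

```python
# CNN + LSTM pitch-accent detector over full-utterance acoustic features.
import torch
import torch.nn as nn

class PitchAccentDetector(nn.Module):
    def __init__(self, n_feats: int = 6, n_filters: int = 64, hidden: int = 128):
        super().__init__()
        self.conv = nn.Conv1d(n_feats, n_filters, kernel_size=5, padding=2)
        self.lstm = nn.LSTM(n_filters, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)   # accented vs. unaccented

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, n_feats) acoustic features for a full utterance
        x = self.conv(frames.transpose(1, 2)).relu().transpose(1, 2)
        x, _ = self.lstm(x)
        return self.out(x)                    # (batch, time, 2) frame-level logits

logits = PitchAccentDetector()(torch.randn(1, 200, 6))
print(logits.shape)   # torch.Size([1, 200, 2])
```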

Conditioning, but on Which Distribution? Grammatical Gender in German Plural Inflection
Kate McCurdy | Adam Lopez | Sharon Goldwater
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Grammatical gender is a consistent and informative cue to the plural class of German nouns. We find that neural encoder-decoder models learn to rely on this cue to predict plural class, but adult speakers are relatively insensitive to it. This suggests that the neural models are not an effective cognitive model of German plural formation.

2019

Are we there yet? Encoder-decoder neural networks as cognitive models of English past tense inflection
Maria Corkery | Yevgen Matusevych | Sharon Goldwater
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The cognitive mechanisms needed to account for the English past tense have long been a subject of debate in linguistics and cognitive science. Neural network models were proposed early on, but were shown to have clear flaws. Recently, however, Kirov and Cotterell (2018) showed that modern encoder-decoder (ED) models overcome many of these flaws. They also presented evidence that ED models demonstrate humanlike performance in a nonce-word task. Here, we look more closely at the behaviour of their model in this task. We find that (1) the model exhibits instability across multiple simulations in terms of its correlation with human data, and (2) even when results are aggregated across simulations (treating each simulation as an individual human participant), the fit to the human data is not strong—worse than an older rule-based model. These findings hold up through several alternative training regimes and evaluation measures. Although other neural architectures might do better, we conclude that there is still insufficient evidence to claim that neural nets are a good cognitive model for this task.

Pre-training on high-resource speech recognition improves low-resource speech-to-text translation
Sameer Bansal | Herman Kamper | Karen Livescu | Adam Lopez | Sharon Goldwater
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We present a simple approach to improve direct speech-to-text translation (ST) when the source language is low-resource: we pre-train the model on a high-resource automatic speech recognition (ASR) task, and then fine-tune its parameters for ST. We demonstrate that our approach is effective by pre-training on 300 hours of English ASR data to improve Spanish-English ST from 10.8 to 20.2 BLEU when only 20 hours of Spanish-English ST training data are available. Through an ablation study, we find that the pre-trained encoder (acoustic model) accounts for most of the improvement, despite the fact that the shared language in these tasks is the target language text, not the source language audio. Applying this insight, we show that pre-training on ASR helps ST even when the ASR language differs from both source and target ST languages: pre-training on French ASR also improves Spanish-English ST. Finally, we show that the approach improves performance on a true low-resource task: pre-training on a combination of English ASR and French ASR improves Mboshi-French ST, where only 4 hours of data are available, from 3.5 to 7.1 BLEU.
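
A hedged sketch of the transfer step (the toy seq2seq model below is ours, not the paper's architecture): pre-train on ASR, copy the encoder into the ST model, then fine-tune on the small ST corpus.

```python
# Transfer a pre-trained ASR encoder into a speech translation model.
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, n_feats=80, hidden=256, vocab=1000):
        super().__init__()
        self.encoder = nn.LSTM(n_feats, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, vocab)

asr_model = Seq2Seq(vocab=1000)   # imagine this was trained on English ASR
st_model = Seq2Seq(vocab=8000)    # target vocabulary of the ST task

# Transfer only the encoder (acoustic model); the decoder and projection are
# re-initialized, since the output language differs.
st_model.encoder.load_state_dict(asr_model.encoder.state_dict())
# ...then fine-tune st_model end-to-end on the small ST training data.
```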

Data Augmentation for Context-Sensitive Neural Lemmatization Using Inflection Tables and Raw Text
Toms Bergmanis | Sharon Goldwater
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Lemmatization aims to reduce the sparse data problem by relating the inflected forms of a word to its dictionary form. Using context can help, both for unseen and ambiguous words. Yet most context-sensitive approaches require full lemma-annotated sentences for training, which may be scarce or unavailable in low-resource languages. In addition (as shown here), in a low-resource setting, a lemmatizer can learn more from n labeled examples of distinct words (types) than from n (contiguous) labeled tokens, since the latter contain far fewer distinct types. To combine the efficiency of type-based learning with the benefits of context, we propose a way to train a context-sensitive lemmatizer with little or no labeled corpus data, using inflection tables from the UniMorph project and raw text examples from Wikipedia that provide sentence contexts for the unambiguous UniMorph examples. Despite these being unambiguous examples, the model successfully generalizes from them, leading to improved results (both overall, and especially on unseen words) in comparison to a baseline that does not use context.
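
An illustrative sketch of the augmentation idea (ours, with hypothetical data): attach raw-text sentence contexts to unambiguous inflection-table forms to create type-based, context-sensitive training examples.

```python
# Build (form-in-context, lemma) training examples from inflection tables
# plus raw text, keeping only forms that map to a single lemma.
from collections import defaultdict

unimorph = [            # (lemma, inflected form) pairs from inflection tables
    ("walk", "walked"),
    ("talk", "talking"),
]
raw_sentences = ["she walked home", "he was talking loudly"]

lemmas_of = defaultdict(set)
for lemma, form in unimorph:
    lemmas_of[form].add(lemma)

examples = []
for sent in raw_sentences:
    for i, tok in enumerate(sent.split()):
        if len(lemmas_of.get(tok, ())) == 1:     # unambiguous form
            (lemma,) = lemmas_of[tok]
            examples.append({"sentence": sent, "position": i, "lemma": lemma})

print(examples)
```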

2018

Context Sensitive Neural Lemmatization with Lematus
Toms Bergmanis | Sharon Goldwater
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

The main motivation for developing context-sensitive lemmatizers is to improve performance on unseen and ambiguous words. Yet previous systems have not carefully evaluated whether the use of context actually helps in these cases. We introduce Lematus, a lemmatizer based on a standard encoder-decoder architecture, which incorporates character-level sentence context. We evaluate its lemmatization accuracy across 20 languages in both a full data setting and a lower-resource setting with 10k training examples in each language. In both settings, we show that including context significantly improves results against a context-free version of the model. Context helps more for ambiguous words than for unseen words, though the latter has a greater effect on overall performance differences between languages. We also compare to three previous context-sensitive lemmatization systems, which all use pre-extracted edit trees as well as hand-selected features and/or additional sources of information such as tagged training data. Without using any of these, our context-sensitive model outperforms the best competitor system (Lemming) in the full-data setting, and performs on par in the lower-resource setting.
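
A rough sketch of the character-level context encoding idea (the markers and context size below are our choices, not necessarily the exact Lematus input format): the target word's characters plus a window of sentence context on each side.

```python
# Build a character-level input string for a target word with sentence context.
def char_context_input(sentence: str, start: int, end: int, n_context: int = 15) -> str:
    left = sentence[max(0, start - n_context):start]
    word = sentence[start:end]
    right = sentence[end:end + n_context]
    to_chars = lambda s: " ".join(s.replace(" ", "_"))   # spaces become '_' symbols
    return f"{to_chars(left)} <lc> {to_chars(word)} <rc> {to_chars(right)}"

s = "the cats sat on the mat"
print(char_context_input(s, s.index("cats"), s.index("cats") + len("cats")))
```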

Evaluating Historical Text Normalization Systems: How Well Do They Generalize?
Alexander Robertson | Sharon Goldwater
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We highlight several issues in the evaluation of historical text normalization systems that make it hard to tell how well these systems would actually work in practice—i.e., for new datasets or languages; in comparison to more naïve systems; or as a preprocessing step for downstream NLP tools. We illustrate these issues and exemplify our proposed evaluation practices by comparing two neural models against a naïve baseline system. We show that the neural models generalize well to unseen words in tests on five languages; nevertheless, they provide no clear benefit over the naïve baseline for downstream POS tagging of an English historical collection. We conclude that future work should include more rigorous evaluation, including both intrinsic and extrinsic measures where possible.

Inducing a lexicon of sociolinguistic variables from code-mixed text
Philippa Shoemark | James Kirby | Sharon Goldwater
Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on Noisy User-generated Text

Sociolinguistics is often concerned with how variants of a linguistic item (e.g., nothing vs. nothin’) are used by different groups or in different situations. We introduce the task of inducing lexical variables from code-mixed text: that is, identifying equivalence pairs such as (football, fitba) along with their linguistic code (football→British, fitba→Scottish). We adapt a framework for identifying gender-biased word pairs to this new task, and present results on three different pairs of English dialects, using tweets as the code-mixed text. Our system achieves precision of over 70% for two of these three datasets, and produces useful results even without extensive parameter tuning. Our success in adapting this framework from gender to language variety suggests that it could be used to discover other types of analogous pairs as well.
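
A hedged sketch of the pair-induction idea (our simplification of the adapted framework): learn a "code direction" in embedding space from a few seed pairs, then rank candidate pairs by how well their embedding difference aligns with it.

```python
# Rank candidate variant pairs by alignment with a code direction learned
# from seed pairs of equivalent words in the two codes.
import numpy as np

def unit(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

def code_direction(seed_pairs, emb):
    """seed_pairs: [(word_in_code_A, word_in_code_B), ...]; emb: word -> vector."""
    diffs = [emb[a] - emb[b] for a, b in seed_pairs]
    return unit(np.mean(diffs, axis=0))

def pair_score(a, b, direction, emb) -> float:
    return float(np.dot(unit(emb[a] - emb[b]), direction))

# With embeddings trained on the code-mixed tweets, one might do:
#   direction = code_direction([("yes", "aye"), ("no", "naw")], emb)
#   score = pair_score("football", "fitba", direction, emb)
```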

2017

Training Data Augmentation for Low-Resource Morphological Inflection
Toms Bergmanis | Katharina Kann | Hinrich Schütze | Sharon Goldwater
Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection

From Segmentation to Analyses: a Probabilistic Model for Unsupervised Morphology Induction
Toms Bergmanis | Sharon Goldwater
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

A major motivation for unsupervised morphological analysis is to reduce the sparse data problem in under-resourced languages. Most previous work focuses on segmenting surface forms into their constituent morphs (taking: tak +ing), but surface form segmentation does not solve the sparse data problem, as the analyses of take and taking are not connected to each other. We present a system that adapts the MorphoChains system (Narasimhan et al., 2015) to provide morphological analyses that aim to abstract over spelling differences in functionally similar morphs. This results in analyses that are not compelled to use all the orthographic material of a word (stopping: stop +ing) or limited to only that material (acidified: acid +ify +ed). On average across six typologically varied languages our system has a similar or better F-score on EMMA (a measure of underlying morpheme accuracy) than three strong baselines; moreover, the total number of distinct morphemes identified by our system is on average 12.8% lower than for Morfessor (Virpioja et al., 2013), a state-of-the-art surface segmentation system.

Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media
Philippa Shoemark | Debnil Sur | Luke Shrimpton | Iain Murray | Sharon Goldwater
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Political surveys have indicated a relationship between a sense of Scottish identity and voting decisions in the 2014 Scottish Independence Referendum. Identity is often reflected in language use, suggesting the intuitive hypothesis that individuals who support Scottish independence are more likely to use distinctively Scottish words than those who oppose it. In the first large-scale study of sociolinguistic variation on social media in the UK, we identify distinctively Scottish terms in a data-driven way, and find that these terms are indeed used at a higher rate by users of pro-independence hashtags than by users of anti-independence hashtags. However, we also find that in general people are less likely to use distinctively Scottish words in tweets with referendum-related hashtags than in their general Twitter activity. We attribute this difference to style shifting relative to audience, aligning with previous work showing that Twitter users tend to use fewer local variants when addressing a broader audience.

Towards speech-to-text translation without speech recognition
Sameer Bansal | Herman Kamper | Adam Lopez | Sharon Goldwater
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

We explore the problem of translating speech to text in low-resource scenarios where neither automatic speech recognition (ASR) nor machine translation (MT) are available, but we have training data in the form of audio paired with text translations. We present the first system for this problem applied to a realistic multi-speaker dataset, the CALLHOME Spanish-English speech translation corpus. Our approach uses unsupervised term discovery (UTD) to cluster repeated patterns in the audio, creating a pseudotext, which we pair with translations to create a parallel text and train a simple bag-of-words MT model. We identify the challenges faced by the system, finding that the difficulty of cross-speaker UTD results in low recall, but that our system is still able to correctly translate some content words in test data.

Spoken Term Discovery for Language Documentation using Translations
Antonios Anastasopoulos | Sameer Bansal | David Chiang | Sharon Goldwater | Adam Lopez
Proceedings of the Workshop on Speech-Centric Natural Language Processing

Vast amounts of speech data collected for language documentation and research remain untranscribed and unsearchable, but often a small amount of speech may have text translations available. We present a method for partially labeling additional speech with translations in this scenario. We modify an unsupervised speech-to-translation alignment model and obtain prototype speech segments that match the translation words, which are in turn used to discover terms in the unlabelled data. We evaluate our method on a Spanish-English speech translation corpus and on two corpora of endangered languages, Arapaho and Ainu, demonstrating its appropriateness and applicability in an actual very-low-resource scenario.

Topic and audience effects on distinctively Scottish vocabulary usage in Twitter data
Philippa Shoemark | James Kirby | Sharon Goldwater
Proceedings of the Workshop on Stylistic Variation

Sociolinguistic research suggests that speakers modulate their language style in response to their audience. Similar effects have recently been claimed to occur in the informal written context of Twitter, with users choosing less region-specific and non-standard vocabulary when addressing larger audiences. However, these studies have not carefully controlled for the possible confound of topic: that is, tweets addressed to a broad audience might also tend towards topics that engender a more formal style. In addition, it is not clear to what extent previous results generalize to different samples of users. Using mixed-effects models, we show that audience and topic have independent effects on the rate of distinctively Scottish usage in two demographically distinct Twitter user samples. However, not all effects are consistent between the two groups, underscoring the importance of replicating studies on distinct user samples before drawing strong conclusions from social media data.
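
A sketch of the kind of mixed-effects analysis described (toy data and illustrative variable names; the paper's actual response variable and predictors may differ): audience and topic as fixed effects with a random intercept per user.

```python
# Mixed-effects model of Scottish-variant usage with a per-user random intercept.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "scots_rate": [0.10, 0.02, 0.08, 0.01, 0.12, 0.03, 0.09, 0.04],
    "broad_audience": [0, 1, 0, 1, 0, 1, 0, 1],   # addressed to a wide audience?
    "topic_formal": [0, 1, 0, 0, 0, 1, 1, 0],     # topic engenders a formal style?
    "user": ["a", "a", "b", "b", "c", "c", "d", "d"],
})

model = smf.mixedlm("scots_rate ~ broad_audience + topic_formal", df, groups=df["user"])
print(model.fit().summary())
```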

2016

Towards robust cross-linguistic comparisons of phonological networks
Philippa Shoemark | Sharon Goldwater | James Kirby | Rik Sarkar
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

2014

Weak semantic context helps phonetic learning in a model of infant language acquisition
Stella Frank | Naomi H. Feldman | Sharon Goldwater
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

POS induction with distributional and morphological information using a distance-dependent Chinese restaurant process
Kairit Sirts | Jacob Eisenstein | Micha Elsner | Sharon Goldwater
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics
Shuly Wintner | Sharon Goldwater | Stefan Riezler
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
Shuly Wintner | Stefan Riezler | Sharon Goldwater
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

2013

Exploring the Utility of Joint Morphological and Syntactic Learning from Child-directed Speech
Stella Frank | Frank Keller | Sharon Goldwater
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

A Joint Learning Model of Word Segmentation, Lexical Acquisition, and Phonetic Variability
Micha Elsner | Sharon Goldwater | Naomi Feldman | Frank Wood
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Modeling Graph Languages with Grammars Extracted via Tree Decompositions
Bevan Keeley Jones | Sharon Goldwater | Mark Johnson
Proceedings of the 11th International Conference on Finite State Methods and Natural Language Processing

Unsupervised Dependency Parsing with Acoustic Cues
John K Pate | Sharon Goldwater
Transactions of the Association for Computational Linguistics, Volume 1

Unsupervised parsing is a difficult task that infants readily perform. Progress has been made on this task using text-based models, but few computational approaches have considered how infants might benefit from acoustic cues. This paper explores the hypothesis that word duration can help with learning syntax. We describe how duration information can be incorporated into an unsupervised Bayesian dependency parser whose only other source of information is the words themselves (without punctuation or parts of speech). Our results, evaluated on both adult-directed and child-directed utterances, show that using word duration can improve parse quality relative to words-only baselines. These results support the idea that acoustic cues provide useful evidence about syntactic structure for language-learning infants, and motivate the use of word duration cues in NLP tasks with speech.

Minimally-Supervised Morphological Segmentation using Adaptor Grammars
Kairit Sirts | Sharon Goldwater
Transactions of the Association for Computational Linguistics, Volume 1

This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labelled data set to select which potential morph boundaries identified by the metagrammar should be returned in the final output. We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and is competitive with state-of-the-art semi-supervised systems. Moreover, this method provides the potential to tune performance according to different evaluation metrics or downstream tasks.

2012

Bootstrapping a Unified Model of Lexical and Phonetic Acquisition
Micha Elsner | Sharon Goldwater | Jacob Eisenstein
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Semantic Parsing with Bayesian Tree Transducers
Bevan Jones | Mark Johnson | Sharon Goldwater
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Turning the pipeline into a loop: Iterated unsupervised dependency parsing and PoS induction
Christos Christodoulopoulos | Sharon Goldwater | Mark Steedman
Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure

A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings
Tom Kwiatkowski | Sharon Goldwater | Luke Zettlemoyer | Mark Steedman
Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

2011

Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Sharon Goldwater | Christopher Manning
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

Unsupervised Syntactic Chunking with Acoustic Cues: Computational Models for Prosodic Bootstrapping
John Pate | Sharon Goldwater
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics

Unsupervised NLP and Human Language Acquisition: Making Connections to Make Progress
Sharon Goldwater
Proceedings of the First workshop on Unsupervised Learning in NLP

Book Reviews: Computational Modeling of Human Language Acquisition by Afra Alishahi
Sharon Goldwater
Computational Linguistics, Volume 37, Issue 3 - September 2011

Formalizing Semantic Parsing with Tree Transducers
Bevan Jones | Mark Johnson | Sharon Goldwater
Proceedings of the Australasian Language Technology Association Workshop 2011

A Bayesian Mixture Model for PoS Induction Using Multiple Features
Christos Christodoulopoulos | Sharon Goldwater | Mark Steedman
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Lexical Generalization in CCG Grammar Induction for Semantic Parsing
Tom Kwiatkowski | Luke Zettlemoyer | Sharon Goldwater | Mark Steedman
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Two Decades of Unsupervised POS Induction: How Far Have We Come?
Christos Christodoulopoulos | Sharon Goldwater | Mark Steedman
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
Tom Kwiatkowski | Luke Zettlemoyer | Sharon Goldwater | Mark Steedman
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Using Sentence Type Information for Syntactic Category Acquisition
Stella Frank | Sharon Goldwater | Frank Keller
Proceedings of the 2010 Workshop on Cognitive Modeling and Computational Linguistics

2009

Improving nonparameteric Bayesian inference: experiments on unsupervised word segmentation with adaptor grammars
Mark Johnson | Sharon Goldwater
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Inducing Compact but Accurate Tree-Substitution Grammars
Trevor Cohn | Sharon Goldwater | Phil Blunsom
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

A Note on the Implementation of Hierarchical Dirichlet Processes
Phil Blunsom | Trevor Cohn | Sharon Goldwater | Mark Johnson
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Which Words Are Hard to Recognize? Prosodic, Lexical, and Disfluency Factors that Increase ASR Error Rates
Sharon Goldwater | Dan Jurafsky | Christopher D. Manning
Proceedings of ACL-08: HLT

2007

Bayesian Inference for PCFGs via Markov Chain Monte Carlo
Mark Johnson | Thomas Griffiths | Sharon Goldwater
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

A fully Bayesian approach to unsupervised part-of-speech tagging
Sharon Goldwater | Tom Griffiths
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Contextual Dependencies in Unsupervised Word Segmentation
Sharon Goldwater | Thomas L. Griffiths | Mark Johnson
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

Representational Bias in Unsupervised Learning of Syllable Structure
Sharon Goldwater | Mark Johnson
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

Improving Statistical MT through Morphological Analysis
Sharon Goldwater | David McClosky
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2004

Priors in Bayesian Learning of Phonological Rules
Sharon Goldwater | Mark Johnson
Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology

2000

Building a Robust Dialogue System with Limited Data
Sharon J. Goldwater | Elizabeth Owen Bratt | Jean Mark Gawron | John Dowding
ANLP-NAACL 2000 Workshop: Conversational Systems

Compiling Language Models from a Linguistically Motivated Unification Grammar
Manny Rayner | Beth Ann Hockey | Frankie James | Elizabeth Owen Bratt | Sharon Goldwater | Jean Mark Gawron
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1998

Edge-Based Best-First Chart Parsing
Eugene Charniak | Sharon Goldwater | Mark Johnson
Sixth Workshop on Very Large Corpora