2025
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge
Lian Remme | Kevin Tang
Findings of the Association for Computational Linguistics: NAACL 2025
This paper provides a proof of concept that audio of tabletop role-playing games (TTRPGs) could serve as a challenge for diarization systems. TTRPGs are carried out mostly by conversation. Participants often alter their voices to indicate that they are talking as a fictional character. Audio processing systems are susceptible to voice conversion, with or without technological assistance. TTRPGs present a conversational phenomenon in which voice conversion is an inherent characteristic of an immersive gaming experience. This could make it more challenging for diarizers to identify the real speaker and to recognize impersonation as such. We present the creation of a small TTRPG audio dataset and compare it against the AMI and ICSI corpora. The performance of two diarizers, pyannote.audio and wespeaker, was evaluated. We observed that the properties of TTRPGs result in a higher confusion rate for both diarizers. Additionally, wespeaker strongly underestimates the number of speakers in the TTRPG audio files. We propose TTRPG audio as a promising challenge for diarization systems.
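The paper evaluates off-the-shelf diarizers against manual annotation. As a rough illustration of that kind of evaluation (not the authors' actual pipeline), running the public pyannote.audio pipeline on a recording and scoring it with diarization error rate could look like the sketch below; the model name, file names, and access token are placeholders.

```python
# Minimal sketch of a diarization evaluation of the kind described above.
# Assumes pyannote.audio 3.x and a manually annotated RTTM reference file;
# the model name, file names, and token are placeholder assumptions.
from pyannote.audio import Pipeline
from pyannote.database.util import load_rttm
from pyannote.metrics.diarization import DiarizationErrorRate

# Pretrained diarization pipeline (requires a Hugging Face access token).
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token="HF_TOKEN"
)

# Hypothesis: diarize one (hypothetical) TTRPG session recording.
hypothesis = pipeline("ttrpg_session.wav")

# Reference: manually annotated speaker turns in RTTM format
# (load_rttm returns a dict keyed by the recording URI in the file).
reference = load_rttm("ttrpg_session.rttm")["ttrpg_session"]

# Diarization error rate = missed speech + false alarm + speaker confusion.
metric = DiarizationErrorRate()
der = metric(reference, hypothesis)
print(f"DER: {der:.3f}")
print(f"Predicted number of speakers: {len(hypothesis.labels())}")
```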
Analysis of LLM as a grammatical feature tagger for African American English
Rahul Porwal | Alice Rozet | Jotsna Gowda | Pryce Houck | Kevin Tang | Sarah Moeller
Findings of the Association for Computational Linguistics: NAACL 2025
African American English (AAE) presents unique challenges in natural language processing (NLP). This research systematically compares the performance of available rule-based, transformer-based, and large language models (LLMs) capable of identifying key grammatical features of AAE, namely Habitual Be and Multiple Negation. These features were selected for their distinct grammatical complexity and frequency of occurrence. The evaluation involved sentence-level binary classification tasks, using both zero-shot and few-shot strategies. The analysis reveals that while LLMs show promise compared to the baseline, they are influenced by biases such as recency and by unrelated properties of the text such as formality. This study highlights the necessity for improved model training and architectural adjustments to better accommodate AAE’s unique linguistic characteristics. Data and code are available.
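The abstract describes sentence-level binary classification with zero-shot and few-shot prompting; the actual prompts and models are in the paper's released code. A minimal zero-shot sketch, with an assumed model name and prompt wording, could look like this:

```python
# Rough sketch of zero-shot sentence-level classification of Habitual Be.
# The prompts, model, and label format used in the paper are not reproduced here;
# the model name and wording below are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify_habitual_be(sentence: str) -> str:
    prompt = (
        "In African American English, 'habitual be' marks a recurring action "
        "(e.g., 'She be working on weekends'). Does the following sentence "
        "contain a habitual 'be'? Answer only 'yes' or 'no'.\n\n"
        f"Sentence: {sentence}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower()

print(classify_habitual_be("He be late to practice every week."))
```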
2024
ParsText: A Digraphic Corpus for Tajik-Farsi Transliteration
Rayyan Merchant | Kevin Tang
Proceedings of the Second Workshop on Computation and Written Language (CAWL) @ LREC-COLING 2024
Despite speaking dialects of the same language, Persian speakers from Tajikistan cannot read Persian texts from Iran and Afghanistan. This is because Tajik Persian is written in the Tajik-Cyrillic script, while Iranian and Afghan Persian are written in the Perso-Arabic script. As the formal registers of these dialects all maintain high levels of mutual intelligibility with each other, machine transliteration has been proposed as a more practical and appropriate solution than machine translation. Unfortunately, Persian texts written in both scripts are much more common in print in Tajikistan than online. This paper introduces a novel corpus meant to remedy that gap: ParsText. ParsText contains 2,813 Persian sentences written in both Tajik-Cyrillic and Perso-Arabic, manually collected from blog pages and news articles online. This paper presents the need for such a corpus, previous and related work, the data collection and alignment procedures, and corpus statistics, and discusses directions for future work.
Leveraging Syntactic Dependencies in Disambiguation: The Case of African American English
Wilermine Previlon | Alice Rozet | Jotsna Gowda | Bill Dyer | Kevin Tang | Sarah Moeller
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
African American English (AAE) has received recent attention in the field of natural language processing (NLP). Efforts to address bias against AAE in NLP systems tend to focus on lexical differences. When the unique structures of AAE are considered, the solution is often to remove or neutralize the differences. This work leverages knowledge about those unique linguistic structures to improve automatic disambiguation of habitual and non-habitual meanings of “be” in naturally produced, transcribed AAE speech. Both meanings are employed in AAE, but examples of Habitual be are rare in already limited AAE data. Generally, representing additional syntactic information improves semantic disambiguation of habituality. Using an ensemble of classical machine learning models with a representation of the unique POS and dependency patterns of Habitual be, we show that integrating syntactic information improves the identification of habitual uses of “be” by about 65 F1 points over a simple n-gram baseline model, and by as much as 74 points. The success of this approach demonstrates the potential impact when we embrace, rather than neutralize, the structural uniqueness of African American English.
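To make the idea of POS and dependency features around “be” concrete, a minimal sketch follows. It is not the paper's feature set or ensemble; it assumes spaCy's standard English model and toy data purely for illustration.

```python
# Minimal sketch of combining POS/dependency context features with classical
# classifiers to disambiguate habitual vs. non-habitual "be".
# The paper's actual features, parser, and ensemble are not reproduced here;
# spaCy's English model and the toy data below are illustrative assumptions.
import spacy
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

nlp = spacy.load("en_core_web_sm")

def be_context_features(sentence: str) -> dict:
    """POS and dependency features around the first uninflected 'be'."""
    doc = nlp(sentence)
    for tok in doc:
        if tok.lower_ == "be":
            return {
                "prev_pos": doc[tok.i - 1].pos_ if tok.i > 0 else "BOS",
                "next_pos": doc[tok.i + 1].pos_ if tok.i + 1 < len(doc) else "EOS",
                "dep": tok.dep_,
                "head_pos": tok.head.pos_,
            }
    return {"prev_pos": "NONE", "next_pos": "NONE", "dep": "NONE", "head_pos": "NONE"}

# Toy labeled examples (1 = habitual, 0 = non-habitual).
sentences = ["She be working late every night.", "He will be there tomorrow."]
labels = [1, 0]

ensemble = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(n_estimators=100)),
])
model = make_pipeline(DictVectorizer(), ensemble)
model.fit([be_context_features(s) for s in sentences], labels)
```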
2023
Linear Discriminative Learning: a competitive non-neural baseline for morphological inflection
Cheonkam Jeong | Dominic Schmitz | Akhilesh Kakolu Ramarao | Anna Stein | Kevin Tang
Proceedings of the 20th SIGMORPHON workshop on Computational Research in Phonetics, Phonology, and Morphology
This paper presents our submission to the SIGMORPHON 2023 task 2 of Cognitively Plausible Morphophonological Generalization in Korean. We implemented both Linear Discriminative Learning and Transformer models and found that the Linear Discriminative Learning model trained on a combination of corpus and experimental data showed the best performance, with an overall accuracy of around 83%. We found that the best model must be trained on both the corpus data and the experimental data of one particular participant. Our examination of speaker variability and speaker-specific information did not explain why a particular participant combined well with the corpus data. We recommend Linear Discriminative Learning models as a future non-neural baseline system, owing to their training speed, accuracy, model interpretability, and cognitive plausibility. In order to improve model performance, we suggest using more data and/or performing data augmentation, and incorporating speaker- and item-specific information.
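At its core, Linear Discriminative Learning maps numeric form vectors to meaning vectors (and back) with linear mappings estimated in closed form, which is what makes it fast to train. The sketch below illustrates that core idea with random toy matrices; the cue and semantic representations used in the actual submission are not reproduced.

```python
# Minimal sketch of the core of Linear Discriminative Learning: closed-form
# linear mappings between form vectors and meaning vectors.
# The cue/semantic representations of the actual submission are not reproduced;
# the toy matrices below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# C: one row per word form, columns = form cues (e.g., binary trigram indicators).
C = rng.integers(0, 2, size=(50, 200)).astype(float)
# S: one row per word form, columns = semantic dimensions.
S = rng.normal(size=(50, 30))

# Comprehension mapping F (form -> meaning), solved by least squares: C @ F ≈ S.
F = np.linalg.pinv(C) @ S
# Production mapping G (meaning -> form), solved the same way: S @ G ≈ C.
G = np.linalg.pinv(S) @ C

# Predicted meanings for the training forms; accuracy can then be measured by
# checking whether each predicted row is closest (e.g., by correlation) to its
# own gold semantic vector.
S_hat = C @ F
print(S_hat.shape)  # (50, 30)
```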
2022
Disambiguation of morpho-syntactic features of African American English – the case of habitual be
Harrison Santiago | Joshua Martin | Sarah Moeller | Kevin Tang
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion
Recent research has highlighted that natural language processing (NLP) systems exhibit a bias against African American speakers. These errors are often caused by poor representation of linguistic features unique to African American English (AAE), which is due to the relatively low probability of occurrence for many such features. We present a workflow to overcome this issue in the case of habitual “be”. Habitual “be” is isomorphic, and therefore ambiguous, with other forms of uninflected “be” found in both AAE and General American English (GAE). This creates a clear challenge for bias in NLP technologies. To overcome the scarcity, we employ a combination of rule-based filters and data augmentation that generates a corpus balanced between habitual and non-habitual instances. This balanced corpus is used to train unbiased machine learning classifiers, as demonstrated on a corpus of transcribed AAE texts, achieving an F1 score of .65 at classifying habitual “be”.
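As a rough illustration of the two ingredients named above (rule-based filtering of uninflected “be” candidates and class balancing), a simplified sketch might look like the following; the actual filter rules and augmentation procedure in the paper are more sophisticated than this.

```python
# Illustrative sketch of a rule-based filter for candidate uninflected "be"
# sentences, followed by simple class balancing.
# The paper's actual filters and data augmentation are not reproduced; the
# regex and the downsampling step below are assumptions for illustration.
import random
import re

UNINFLECTED_BE = re.compile(r"\bbe\b", re.IGNORECASE)
# Crude rule: skip "be" preceded by a modal or infinitive marker ("will be", "to be").
NON_CANDIDATE = re.compile(
    r"\b(?:will|would|can|could|may|might|must|should|to)\s+be\b", re.IGNORECASE
)

def is_candidate(sentence: str) -> bool:
    """Flag sentences that may contain an uninflected (possibly habitual) 'be'."""
    return bool(UNINFLECTED_BE.search(sentence)) and not NON_CANDIDATE.search(sentence)

def balance(examples: list[tuple[str, int]], seed: int = 0) -> list[tuple[str, int]]:
    """Downsample the majority class so habitual/non-habitual counts match."""
    pos = [e for e in examples if e[1] == 1]
    neg = [e for e in examples if e[1] == 0]
    small, large = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    random.Random(seed).shuffle(large)
    return small + large[: len(small)]
```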
HeiMorph at SIGMORPHON 2022 Shared Task on Morphological Acquisition Trajectories
Akhilesh Kakolu Ramarao | Yulia Zinova | Kevin Tang | Ruben van de Vijver
Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
This paper presents the submission by the HeiMorph team to the SIGMORPHON 2022 task 2 of Morphological Acquisition Trajectories. Across all experimental conditions, we have found no evidence for the so-called U-shaped development trajectory. Our submitted systems achieve average test accuracies of 55.5% on Arabic, 67% on German, and 73.38% on English. We found that bigram hallucination provides better inferences only for English and Arabic, and only when the number of hallucinations remains low.