Kevin Tang


HeiMorph at SIGMORPHON 2022 Shared Task on Morphological Acquisition Trajectories
Akhilesh Kakolu Ramarao | Yulia Zinova | Kevin Tang | Ruben van de Vijver
Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the submission by the HeiMorph team to SIGMORPHON 2022 Task 2 on Morphological Acquisition Trajectories. Across all experimental conditions, we found no evidence for the so-called U-shaped development trajectory. Our submitted systems achieve average test accuracies of 55.5% on Arabic, 67% on German and 73.38% on English. We found that bigram hallucination provides better inferences only for English and Arabic, and only when the number of hallucinations remains low.

Disambiguation of morpho-syntactic features of African American English – the case of habitual be
Harrison Santiago | Joshua Martin | Sarah Moeller | Kevin Tang
Proceedings of the Second Workshop on Language Technology for Equality, Diversity and Inclusion

Recent research has highlighted that natural language processing (NLP) systems exhibit a bias against African American speakers. These errors are often caused by poor representation of linguistic features unique to African American English (AAE), which is due to the relatively low probability of occurrence for many such features. We present a workflow to overcome this issue in the case of habitual “be”. Habitual “be” is isomorphic, and therefore ambiguous, with other forms of uninflected “be” found in both AAE and General American English (GAE). This creates a clear challenge for bias in NLP technologies. To overcome the scarcity, we employ a combination of rule-based filters and data augmentation that generate a corpus balanced between habitual and non-habitual instances. This balanced corpus trains unbiased machine learning classifiers, as demonstrated on a corpus of AAE transcribed texts, achieving a .65 F1 score at classifying habitual “be”.