Alyssa Allen
2024
OSU CompLing at the GEM’24 Data-to-Text Task
Alyssa Allen | Ashley Lewis | Yi-Chien Lin | Tomiris Kaumenova | Michael White
Proceedings of the 17th International Natural Language Generation Conference: Generation Challenges
This paper details experiments conducted for completing the GEM 2024 Data-to-Text task for a WebNLG dataset (Gardent et al., 2017). We show that model performance varies greatly across English, Spanish, Chinese, and Russian. Data filtering was done with automatic model judgments via error detection, which performs differently per language. We report English and Spanish dev set results for a data filtering and knowledge distillation approach to generating natural language outputs for sets of triples across a variety of domains. Specifically, we compare three generation conditions: 1) few-shot prompting with ChatGPT (GPT4), 2) fine-tuning Llama2 on the unfiltered dataset, and 3) fine-tuning Llama2 on a filtered version of the dataset. Russian and Chinese efforts did not result in submissions due to inconsistent or incoherent translations being produced in either the data synthesis or final generation stages. We provide details on these shortcomings but largely focus on Spanish and English efforts that align with our task submissions. We ultimately submitted outputs in English and Spanish that were generated using a version of Llama2 fine-tuned on a filtered dataset.
2023
SQL Comment Generation and Additional Research Interests
Alyssa Allen
Proceedings of the 19th Annual Meeting of the Young Researchers' Roundtable on Spoken Dialogue Systems
My research interests focus on two areas: natural language generation (NLG), in particular how to make system outputs more intuitive and comprehensible for the human user, and conversational entrainment and alignment, in particular how dialogue systems could or should personalize their responses to the human user. As it relates to NLG, my current work focuses on training a system to auto-generate comments for SQL queries produced by a Text-to-SQL parser. The goal is to make the connection between technical SQL language and the user’s question more transparent. My linguistic training lies primarily at the intersection of computational linguistics and sociolinguistics. As such, my interests in conversational entrainment and alignment focus on the extent to which conversational agents can or should adjust their language based on human characteristics such as age, race, or gender.