Mollie Frances Shichman
2024
PropBank-Powered Data Creation: Utilizing Sense-Role Labelling to Generate Disaster Scenario Data
Mollie Frances Shichman
|
Claire Bonial
|
Taylor A. Hudson
|
Austin Blodgett
|
Francis Ferraro
|
Rachel Rudinger
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024
For human-robot dialogue in a search-and-rescue scenario, a strong knowledge of the conditions and objects a robot will face is essential for effective interpretation of natural language instructions. In order to utilize the power of large language models without overwhelming the limited storage capacity of a robot, we propose PropBank-Powered Data Creation. PropBank-Powered Data Creation is an expert-in-the-loop data generation pipeline which creates training data for disaster-specific language models. We leverage semantic role labeling and Rich Event Ontology resources to efficiently develop seed sentences for fine-tuning a smaller, targeted model that could operate onboard a robot for disaster relief. We developed 32 sentence templates, which we used to make 2 seed datasets of 175 instructions for earthquake search and rescue and train derailment response. We further leverage our seed datasets as evaluation data to test our baseline fine-tuned models.