Mollie Frances Shichman


2024

PropBank-Powered Data Creation: Utilizing Sense-Role Labelling to Generate Disaster Scenario Data
Mollie Frances Shichman | Claire Bonial | Taylor A. Hudson | Austin Blodgett | Francis Ferraro | Rachel Rudinger
Proceedings of the Fifth International Workshop on Designing Meaning Representations @ LREC-COLING 2024

For human-robot dialogue in a search-and-rescue scenario, a strong knowledge of the conditions and objects a robot will face is essential for effective interpretation of natural language instructions. To use the power of large language models without overwhelming a robot's limited storage capacity, we propose PropBank-Powered Data Creation, an expert-in-the-loop data generation pipeline that creates training data for disaster-specific language models. We leverage semantic role labeling and Rich Event Ontology resources to efficiently develop seed sentences for fine-tuning a smaller, targeted model that could operate onboard a robot for disaster relief. We developed 32 sentence templates, which we used to make two seed datasets of 175 instructions for earthquake search and rescue and train derailment response. We further use our seed datasets as evaluation data to test our baseline fine-tuned models.