Nima Nabizadeh
2020
MyFixit: An Annotated Dataset, Annotation Tool, and Baseline Methods for Information Extraction from Repair Manuals
Nima Nabizadeh | Dorothea Kolossa | Martin Heckmann
Proceedings of the Twelfth Language Resources and Evaluation Conference
Text instructions are among the most widely used media for learning and teaching. Hence, to create assistance systems that are capable of supporting humans autonomously in new tasks, it would be immensely productive if machines could extract task knowledge from such text instructions. In this paper, we therefore focus on information extraction (IE) from the instructional text in repair manuals. This brings with it the multiple challenges of information extraction from the situated and technical language in relatively long and often complex instructions. To tackle these challenges, we introduce a semi-structured dataset of repair manuals. The dataset is annotated in a large category of devices, with information that we consider most valuable for an automated repair assistant, including the required tools and the disassembled parts at each step of the repair progress. We then propose methods that can serve as baselines for this IE task: an unsupervised method based on a bags-of-n-grams similarity for extracting the needed tools in each repair step, and a deep-learning-based sequence labeling model for extracting the identity of disassembled parts. These baseline methods are integrated into a semi-automatic web-based annotator application that is also available along with the dataset.
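The abstract does not spell out the bags-of-n-grams similarity, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes character trigrams, cosine similarity, a sliding word window, and a hypothetical `threshold` of 0.5; the function names and the example toolbox are illustrative.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Bag (multiset) of character n-grams of a padded, lower-cased string."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(bag_a, bag_b):
    """Cosine similarity between two n-gram bags."""
    dot = sum(bag_a[g] * bag_b[g] for g in bag_a.keys() & bag_b.keys())
    norm = math.sqrt(sum(v * v for v in bag_a.values())) * math.sqrt(
        sum(v * v for v in bag_b.values())
    )
    return dot / norm if norm else 0.0


def best_tool(step_text, toolbox, n=3, threshold=0.5):
    """Return (tool, score) for the toolbox entry most similar to any word
    window in the step text, or (None, score) if below the threshold.
    The threshold value is an assumption chosen for illustration."""
    words = step_text.split()
    best_name, best_score = None, 0.0
    for tool in toolbox:
        width = len(tool.split())
        tool_bag = char_ngrams(tool, n)
        # Slide a window with as many words as the tool name has.
        for i in range(max(1, len(words) - width + 1)):
            window = " ".join(words[i:i + width])
            score = cosine(tool_bag, char_ngrams(window, n))
            if score > best_score:
                best_name, best_score = tool, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


toolbox = ["spudger", "Phillips #00 screwdriver", "tweezers"]
tool, score = best_tool("Use a spudgers to pry up the battery connector.", toolbox)
```

Because the comparison works on character n-grams rather than whole tokens, a surface variant such as "spudgers" can still be matched to the toolbox entry "spudger".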
Hierarchy-aware Learning of Sequential Tool Usage via Semi-automatically Constructed Taxonomies
Nima Nabizadeh | Martin Heckmann | Dorothea Kolossa
Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
When repairing a device, humans employ a series of tools that corresponds to the arrangement of the device components. Such sequences of tool usage can be learned from repair manuals, so that at each step, having observed the previously applied tools, a sequential model can predict the next required tool. In this paper, we improve the tool prediction performance of such methods by additionally taking the hierarchical relationships among the tools into account. To this end, we build a taxonomy of tools with hyponymy and hypernymy relations from the data by decomposing all multi-word expressions of tool names. We then develop a sequential model that performs a binary prediction for each node in the taxonomy. The evaluation of the method on a dataset of repair manuals shows that encoding the tools with the constructed taxonomy and using a top-down beam search for decoding increases the prediction accuracy and yields an interpretable taxonomy as a potentially valuable byproduct.
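One way to picture the decomposition of multi-word tool names is the sketch below. It is an illustration under a stated assumption, not the paper's procedure: it treats tool names as head-final, so that each shorter suffix of a name is taken as its hypernym. The `<root>` label, the function names, and the example tool names are hypothetical, and the manual part of the semi-automatic construction is not modeled.

```python
ROOT = "<root>"  # hypothetical label for the top node of the taxonomy


def build_taxonomy(tool_names):
    """Map every node to its hypernym by stripping the leading modifier.

    Assumes head-final names: 'phillips #00 screwdriver' is a hyponym of
    '#00 screwdriver', which is a hyponym of 'screwdriver'.
    """
    parent = {}
    for name in tool_names:
        tokens = name.lower().split()
        for i in range(len(tokens)):
            node = " ".join(tokens[i:])
            hypernym = " ".join(tokens[i + 1:]) or ROOT
            parent.setdefault(node, hypernym)
    return parent


def path_to_root(node, parent):
    """List the node followed by its hypernyms up to the root."""
    path = [node]
    while path[-1] != ROOT:
        path.append(parent[path[-1]])
    return path


taxonomy = build_taxonomy(
    ["Phillips #00 screwdriver", "flathead screwdriver", "spudger"]
)
print(path_to_root("phillips #00 screwdriver", taxonomy))
```

In such a structure, each tool corresponds to a root-to-node path, which is the kind of encoding over which a per-node binary prediction and a top-down search could operate.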
2019
Speaker-adapted neural-network-based fusion for multimodal reference resolution
Diana Kleingarn | Nima Nabizadeh | Martin Heckmann | Dorothea Kolossa
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue
Humans use a variety of approaches to reference objects in the external world, including verbal descriptions, hand and head gestures, eye gaze, or any combination of them. The amount of useful information from each modality, however, may vary depending on the specific person and on several other factors. For this reason, it is important to learn the correct combination of inputs for inferring the best-fitting reference. In this paper, we investigate appropriate speaker-dependent and speaker-independent fusion strategies in a multimodal reference resolution task. We show that, without any change in the modality models and solely through an optimized fusion technique, it is possible to reduce the error rate of the system on a reference resolution task by more than 50%.