Unsupervised Task Graph Generation from Instructional Video Transcripts

Lajanugen Logeswaran, Sungryull Sohn, Yunseok Jang, Moontae Lee, Honglak Lee


Abstract
This work explores the problem of generating task graphs of real-world activities. Unlike prior formulations, we consider a setting where text transcripts of instructional videos demonstrating a real-world activity (e.g., making coffee) are provided, and the goal is to identify the key steps relevant to the task as well as the dependency relationships between these key steps. We propose a novel task graph generation approach that combines the reasoning capabilities of instruction-tuned language models with clustering and ranking components to generate accurate task graphs in a completely unsupervised manner. We show that the proposed approach generates more accurate task graphs than a supervised learning approach on tasks from the ProceL and CrossTask datasets.
Anthology ID:
2023.findings-acl.210
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3392–3406
URL:
https://aclanthology.org/2023.findings-acl.210
DOI:
10.18653/v1/2023.findings-acl.210
Cite (ACL):
Lajanugen Logeswaran, Sungryull Sohn, Yunseok Jang, Moontae Lee, and Honglak Lee. 2023. Unsupervised Task Graph Generation from Instructional Video Transcripts. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3392–3406, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Task Graph Generation from Instructional Video Transcripts (Logeswaran et al., Findings 2023)
PDF:
https://aclanthology.org/2023.findings-acl.210.pdf