English Recipe Flow Graph Corpus

Yoko Yamakata, Shinsuke Mori, John Carroll


Abstract
We present an annotated corpus of English cooking recipe procedures, and describe and evaluate computational methods for learning these annotations. The corpus consists of 300 recipes written by members of the public, which we have annotated with domain-specific linguistic and semantic structure. Each recipe is annotated with (1) ‘recipe named entities’ (r-NEs) specific to the recipe domain, and (2) a flow graph representing in detail the sequencing of steps, and interactions between cooking tools, food ingredients and the products of intermediate steps. For these two kinds of annotations, inter-annotator agreement ranges from 82.3 to 90.5 F1, indicating that our annotation scheme is appropriate and consistent. We experiment with producing these annotations automatically. For r-NE tagging we train a deep neural network NER tool; to compute flow graphs we train a dependency-style parsing procedure which we apply to the entire sequence of r-NEs in a recipe. In evaluations, our systems achieve 71.1 to 87.5 F1, demonstrating that our annotation scheme is learnable.
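To make the two annotation layers and the F1 figures concrete, here is a minimal sketch in Python. It assumes a flow graph can be treated as a set of labeled edges between r-NE spans, and that agreement is scored as F1 over exactly matching edges. The class names, the tags (`ACTION`, `FOOD`, `TOOL`), the edge labels, and the toy recipe are all hypothetical illustrations, not the corpus's actual tag set or the authors' evaluation code; the abstract does not specify the matching criteria used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RNE:
    """A recipe named entity: a token span [start, end) with a tag (hypothetical tag names)."""
    start: int
    end: int
    tag: str


@dataclass(frozen=True)
class Edge:
    """A labeled flow-graph edge from a head r-NE to a dependent r-NE (hypothetical labels)."""
    head: RNE
    dependent: RNE
    label: str


def f1(reference: set, other: set) -> float:
    """F1 between two sets of annotations, counting only exact matches."""
    true_positives = len(reference & other)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(other)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy recipe, tokenized: indices 0..9
tokens = "Chop the onion . Fry it in a pan .".split()

chop = RNE(0, 1, "ACTION")
onion = RNE(2, 3, "FOOD")
fry = RNE(4, 5, "ACTION")
pan = RNE(8, 9, "TOOL")

# Two annotators agree on two edges but disagree on what is fried:
# the product of chopping (annotator A) or the raw onion (annotator B).
annotator_a = {
    Edge(chop, onion, "target"),
    Edge(fry, chop, "target"),
    Edge(fry, pan, "destination"),
}
annotator_b = {
    Edge(chop, onion, "target"),
    Edge(fry, onion, "target"),
    Edge(fry, pan, "destination"),
}

agreement = f1(annotator_a, annotator_b)
print(f"edge-level agreement F1: {agreement:.3f}")
```

The same `f1` function could be applied to sets of `RNE` objects to score tagging rather than graph structure. Treating edges as a set keeps the comparison order-independent, which suits a graph annotation where no single linear ordering of arcs is privileged.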
Anthology ID:
2020.lrec-1.638
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
5187–5194
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.638
Cite (ACL):
Yoko Yamakata, Shinsuke Mori, and John Carroll. 2020. English Recipe Flow Graph Corpus. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 5187–5194, Marseille, France. European Language Resources Association.
Cite (Informal):
English Recipe Flow Graph Corpus (Yamakata et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.638.pdf