Non-ingredient Detection in User-generated Recipes using the Sequence Tagging Approach

Yasuhiro Yamaguchi, Shintaro Inuzuka, Makoto Hiramatsu, Jun Harashima


Abstract
The number of user-generated recipes on the Internet has increased in recent years. In such recipes, users are expected to provide a title, an ingredient list, and the steps to prepare a dish. However, some items in the ingredient list of a user-generated recipe are not actually edible ingredients. For example, headings, comments, and kitchenware sometimes appear in the list because users can write it freely. Such noise makes it difficult for computers to use recipes in downstream tasks such as calorie estimation. To address this issue, we propose a non-ingredient detection method inspired by neural sequence tagging models. In our experiment, we annotated 6,675 ingredients in 600 user-generated recipes and showed that the proposed method achieved an F1 score of 93.3.
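
The task framing in the abstract can be pictured with a small sketch. It treats one recipe's ingredient list as a sequence, gives each item a tag, and scores predictions with F1. This is an illustration only, not the authors' neural model: the tag names ("I" for edible ingredient, "O" for non-ingredient), the example items, the predictions, and the choice to compute F1 on the non-ingredient class are all assumptions made here.

```python
from typing import List


def f1_for_tag(gold: List[str], pred: List[str], tag: str = "O") -> float:
    """F1 score for one tag over two aligned tag sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(g == tag and p == tag for g, p in zip(gold, pred))
    fp = sum(g != tag and p == tag for g, p in zip(gold, pred))
    fn = sum(g == tag and p != tag for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up ingredient list: a heading, two real ingredients,
# a piece of kitchenware, and a comment.
items = ["[Sauce]", "soy sauce", "sugar", "frying pan", "adjust to taste!"]

# Hypothetical tags: "I" = edible ingredient, "O" = non-ingredient.
gold = ["O", "I", "I", "O", "O"]
# Hypothetical tagger output that misses the kitchenware item.
pred = ["O", "I", "I", "I", "O"]

for item, g, p in zip(items, gold, pred):
    print(f"{item:20s} gold={g} pred={p}")
print("F1 (non-ingredient):", round(f1_for_tag(gold, pred, "O"), 3))
```

Tagging the list as a whole, rather than classifying each item in isolation, lets a model use neighboring items as context, for example an item that follows a heading.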
Anthology ID:
2020.wnut-1.11
Volume:
Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020)
Month:
November
Year:
2020
Address:
Online
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
76–80
URL:
https://aclanthology.org/2020.wnut-1.11
DOI:
10.18653/v1/2020.wnut-1.11
Cite (ACL):
Yasuhiro Yamaguchi, Shintaro Inuzuka, Makoto Hiramatsu, and Jun Harashima. 2020. Non-ingredient Detection in User-generated Recipes using the Sequence Tagging Approach. In Proceedings of the Sixth Workshop on Noisy User-generated Text (W-NUT 2020), pages 76–80, Online. Association for Computational Linguistics.
Cite (Informal):
Non-ingredient Detection in User-generated Recipes using the Sequence Tagging Approach (Yamaguchi et al., WNUT 2020)
PDF:
https://aclanthology.org/2020.wnut-1.11.pdf