%0 Conference Proceedings
%T Enriching the E2E dataset
%A Castro Ferreira, Thiago
%A Vaz, Helena
%A Davis, Brian
%A Pagano, Adriana
%Y Belz, Anya
%Y Fan, Angela
%Y Reiter, Ehud
%Y Sripada, Yaji
%S Proceedings of the 14th International Conference on Natural Language Generation
%D 2021
%8 August
%I Association for Computational Linguistics
%C Aberdeen, Scotland, UK
%F castro-ferreira-etal-2021-enriching
%X This study introduces an enriched version of the E2E dataset, one of the most popular language resources for data-to-text NLG. We extract intermediate representations for popular pipeline tasks such as discourse ordering, text structuring, lexicalization and referring expression generation, enabling researchers to rapidly develop and evaluate their data-to-text pipeline systems. The intermediate representations are extracted by aligning non-linguistic and text representations through a process called delexicalization, which consists in replacing input referring expressions to entities/attributes with placeholders. The enriched dataset is publicly available.
%R 10.18653/v1/2021.inlg-1.18
%U https://aclanthology.org/2021.inlg-1.18
%U https://doi.org/10.18653/v1/2021.inlg-1.18
%P 177-183