In this paper, we describe our reproduction effort of the paper: Towards Best Experiment Design for Evaluating Dialogue System Output by Santhanam and Shaikh (2019) for the 2022 ReproGen shared task. We aim to produce the same results, using different human evaluators, and a different implementation of the automatic metrics used in the original paper. Although overall the study posed some challenges to reproduce (e.g. difficulties with reproduction of automatic metrics and statistics), in the end we did find that the results generally replicate the findings of Santhanam and Shaikh (2019) and seem to follow similar trends.
In this paper, we present a novel data-to-text system for cancer patients, providing information on quality of life implications after treatment, which can be embedded in the context of shared decision making. Currently, information on quality of life implications is often not discussed, partly because (until recently) data has been lacking. In our work, we rely on a newly developed prediction model, which assigns patients to scenarios. Furthermore, we use data-to-text techniques to explain these scenario-based predictions in personalized and understandable language. We highlight the possibilities of NLG for personalization, discuss ethical implications and also present the outcomes of a first evaluation with clinicians.