Tianai Dong


DiscoGeM: A Crowdsourced Corpus of Genre-Mixed Implicit Discourse Relations
Merel Scholman | Tianai Dong | Frances Yung | Vera Demberg
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present DiscoGeM, a crowdsourced corpus of 6,505 implicit discourse relations from three genres: political speech, literature, and encyclopedic texts. Each instance was annotated by 10 crowd workers. We explored several label aggregation methods to determine how best to obtain a label that captures the meaning inferred by the crowd annotators. The results show that a significant proportion of discourse relations in DiscoGeM are ambiguous and can express multiple relation senses. Probability distribution labels capture these interpretations better than single labels. Further, the results emphasize that text genre crucially affects the distribution of discourse relations, suggesting that genre should be included as a factor in automatic relation classification. We make available the newly created DiscoGeM corpus, as well as the dataset with all annotator-level labels. Both the corpus and the dataset can support a range of applications and research purposes, for example serving as training data to improve the performance of automatic discourse relation parsers and facilitating research into non-connective signals of discourse relations.
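The contrast between single labels and probability distribution labels can be illustrated with a minimal sketch. The abstract does not specify which aggregation methods were used, so the two functions below (`majority_label`, `distribution_label`), the sense names, and the example annotations are all hypothetical, chosen only to show the general idea.

```python
from collections import Counter

def majority_label(annotations):
    """Collapse crowd annotations into the single most frequent label.

    Assumption: ties are broken by first occurrence in the input list.
    """
    counts = Counter(annotations)
    return counts.most_common(1)[0][0]

def distribution_label(annotations):
    """Keep the full picture: the share of annotators choosing each label."""
    counts = Counter(annotations)
    total = len(annotations)
    return {label: n / total for label, n in counts.items()}

# Hypothetical labels from 10 crowd workers for one relation instance.
annotations = ["cause"] * 5 + ["conjunction"] * 4 + ["concession"]

single = majority_label(annotations)
soft = distribution_label(annotations)
print(single)
print(soft)
```

In this made-up instance the single label keeps only the most frequent sense, while the distribution retains the fact that four of ten annotators inferred a different sense.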


Comparison of methods for explicit discourse connective identification across various domains
Merel Scholman | Tianai Dong | Frances Yung | Vera Demberg
Proceedings of the 2nd Workshop on Computational Approaches to Discourse

Existing parse methods use varying approaches to identify explicit discourse connectives, but their performance has not been consistently compared, nor have they been consistently evaluated on text other than newspaper articles. Here we assess the performance on explicit connective identification of three parse methods (PDTB e2e, Lin et al., 2014; the winner of CONLL2015, Wang et al., 2015; and DisSent, Nie et al., 2019), along with a simple heuristic. We also examine how well these systems generalize to different datasets, namely written newspaper text (PDTB), written scientific text (BioDRB), prepared spoken text (TED-MDB) and spontaneous spoken text (Disco-SPICE). The results show that the e2e parser outperforms the other parse methods on all datasets. However, performance drops significantly from the PDTB to all other datasets. We provide a more fine-grained analysis of domain differences and of connectives that prove difficult to parse, in order to highlight the areas where gains can be made.
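The abstract does not describe the simple heuristic it evaluates. As an illustration only, one plausible kind of baseline is lexicon matching, sketched below; the connective list `CONNECTIVES` and the function `find_connective_candidates` are assumptions, not the authors' method.

```python
import re

# Hypothetical, deliberately tiny lexicon of explicit connectives.
CONNECTIVES = ["because", "however", "although", "but", "as a result", "in addition"]

def find_connective_candidates(text, lexicon=CONNECTIVES):
    """Return (character offset, matched string) for every lexicon hit.

    This only proposes candidates: a word such as "but" is not always
    used as a discourse connective, which a real parser must disambiguate.
    """
    hits = []
    for conn in lexicon:
        # Word boundaries prevent matches inside longer words.
        pattern = r"\b" + re.escape(conn) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.start(), match.group(0)))
    return sorted(hits)

candidates = find_connective_candidates("However, we left because it rained.")
print(candidates)
```

A matcher like this tends to over-generate, since it cannot tell discourse from non-discourse uses of the same word.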

Visually Grounded Follow-up Questions: a Dataset of Spatial Questions Which Require Dialogue History
Tianai Dong | Alberto Testoni | Luciana Benotti | Raffaella Bernardi
Proceedings of Second International Combined Workshop on Spatial Language Understanding and Grounded Communication for Robotics

In this paper, we define and evaluate a methodology for extracting history-dependent spatial questions from visual dialogues. We say that a question is history-dependent if it requires (parts of) its dialogue history to be interpreted. We argue that some kinds of visual questions define a context upon which a follow-up spatial question relies. We call the question that restricts the context the trigger, and the spatial question that requires the trigger question in order to be answered the zoomer. We automatically extract different trigger and zoomer pairs based on the visual property that the questions rely on (e.g. color, number). We manually annotate the automatically extracted trigger and zoomer pairs to verify which zoomers require their trigger. We implement a simple baseline architecture based on a SOTA multimodal encoder. Our results reveal that there is much room for improvement in answering history-dependent questions.