Sandrine Zufferey


2023

Coherence relations between elements of discourse can be signaled by linguistic devices such as connectives and/or alternative signals. While the use and comprehension of connectives have been studied in different categories of speakers, less is known about the functioning of alternative signals of coherence relations, especially in younger populations. In the current study, we examine the sensitivity of French-speaking teenagers to alternative signals of the list relation (words such as plusieurs ‘several’ and différents ‘various’), combined with connectives varying in frequency and signaling two types of coherence relations (addition: en plus, en outre; consequence: donc, ainsi). Our results reveal that, as early as the teenage years, speakers are sensitive to alternative signals of the list relation (i.e., they produce list continuation sentences). Furthermore, the inference of the list relation is not significantly changed when an alternative signal is combined with the more frequent additive connective en plus. However, this inference is inhibited by the less frequent additive connective en outre, and is almost completely hindered by the consequence connectives donc and ainsi. Overall, these results show that alternative list signals are an important source for the inference of the list relation, even in the presence of more salient signals of coherence such as connectives.

2015

2013

The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to reach a reliable annotation of connectives, by making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation with an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid constitution of large annotated datasets.

2012

This paper describes methods and results for the annotation of two discourse-level phenomena, connectives and pronouns, over a multilingual parallel corpus. About 3600 tokens in excerpts from Europarl in English and French have been annotated with disambiguation information for connectives and pronouns. These data are then used in several ways: for cross-linguistic studies, for training automatic disambiguation software, and ultimately for training and testing discourse-aware statistical machine translation systems. The paper presents the annotation procedures and their results in detail, and gives an overview of the first systems trained on the annotated resources and their use for machine translation.

2011

2007

2004