Marta Andersson
2020
A Sentiment-annotated Dataset of English Causal Connectives
Marta Andersson
|
Murathan Kurfalı
|
Robert Östling
Proceedings of the 14th Linguistic Annotation Workshop
This paper investigates the semantic prosody of three causal connectives: due to, owing to and because of in seven varieties of the English language. While research in the domain of English causality exists, we are not aware of studies that would cover the domain of causal connectives in English. Our claim is that connectives such as because of link two arguments, (at least) one of which will include a phrase that contributes to the interpretation of the relation as positive or negative, and hence define the prosody of the connective used. As our results demonstrate, the majority of the prosodies identified are negative for all three connectives; the proportions are stable across the varieties of English studied, and contrary to our expectations, we find no significant differences between the functions of the connectives and discourse preferences. Further, we investigate whether automatizing the sentiment annotation procedure via a simple language-model based classifier is possible. The initial results highlights the complexity of the task and the need for complicated systems, probably aided with other related datasets to achieve reasonable performance.
2016
Annotating Topic Development in Information Seeking Queries
Marta Andersson
|
Adnan Öztürel
|
Silvia Pareti
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
This paper contributes to the limited body of empirical research in the domain of discourse structure of information seeking queries. We describe the development of an annotation schema for coding topic development in information seeking queries and the initial observations from a pilot sample of query sessions. The main idea that we explore is the relationship between constant and variable discourse entities and their role in tracking changes in the topic progression. We argue that the topicalized entities remain stable across development of the discourse and can be identified by a simple mechanism where anaphora resolution is a precursor. We also claim that a corpus annotated in this framework can be used as training data for dialogue management and computational semantics systems.