Shu-Chuan Tseng


2024

2023

Connectives convey discourse functions that provide textual and pragmatic information in speech communication on top of their canonical, sentential use. This paper proposes an applicable scheme, with illustrative examples, for distinguishing Sentential, Conclusion, Disfluency, Elaboration, and Resumption uses of Mandarin connectives, including conjunctions and adverbs. Quantitative results of our annotation work are presented to give an overview of connectives in a Mandarin conversational speech corpus. A fine-grained taxonomy is also discussed, but more empirical data are required to confirm its applicability. By fitting a multinomial logistic regression model, we illustrate that connectives exhibit consistent patterns in positional, phonetic, and contextual features that are oriented to their associated discourse functions. Our results confirm that the positions of Conclusion and Resumption connectives are oriented more toward semantically determined units than toward prosodically determined ones. We also found that connectives used for all four discourse functions tend to have a higher initial F0 value than those in sentential use. Resumption and Disfluency uses are expected to have the largest increase in initial F0 value, followed by Conclusion and Elaboration uses. Durational cues of the preceding context help distinguish Sentential use from the discourse uses of Conclusion, Elaboration, and Resumption.

2022

2014

Phone-aligned spoken corpora are indispensable language resources for quantitative linguistic analyses and automatic speech systems. However, producing this type of resource is not an easy task, owing to the high costs in time and manpower as well as the difficulty of applying valid annotation criteria and achieving reliable inter-labeler consistency. Among the different types of spoken corpora, conversational speech, which is often filled with extreme reduction and pronunciation variants, is particularly challenging. By adopting a combined verification procedure, we obtained reasonably good annotation results. Preliminary phone boundaries automatically generated by a phone aligner were provided to human labelers for verification. Instead of making use of the visualization of acoustic cues, the labelers were to rely solely on their perceptual judgments to locate the position that best separates two adjacent phones. Impressionistic judgments in cases of reduction and segment deletion were helpful and necessary, as they balanced the subtle nuances caused by differences in perception.

2013

2006

A dedicated resource, consisting of annotated speech, tools, and workflow design, was developed for the detailed investigation of discourse phenomena in Taiwan Mandarin. The discourse phenomena have functions that are associated with positions in utterances and with temporal properties, and include discourse markers (“NAGE”, “NA”, e.g. “hesitation”, “utterance initiation”), discourse particles (“A”, e.g. “utterance finality”, “utterance continuity”, “focus”, etc.), and fillers (“UHN”, “hesitation”). The distribution of particles in relation to their position in utterances and the temporal properties of particles are investigated. The results diverge considerably from claims in existing grammars of Mandarin with respect to utterance position, and show that these items are in general longer than regular syllables. These properties suggest the possibility of developing an automatic discourse item tagger.

2005

2001

2000