Gülşat Aygen
2024
Text vs. Transcription: A Study of Differences Between the Writing and Speeches of U.S. Presidents
Mina Rajaei Moghadam | Mosab Rezaei | Gülşat Aygen | Reva Freedman
Proceedings of the 4th International Conference on Natural Language Processing for Digital Humanities
Despite many years of research, the question of how spoken and written text differ remains open. This paper studies syntactic features that can serve as distinguishing factors, focusing on the transcribed speeches and written books of United States presidents. We conducted two experiments to analyze high-level syntactic features. In the first, we examine these features while controlling for the effect of sentence length. In the second, we compare the high-level syntactic features with low-level ones. The results indicate that adding high-level syntactic features improves model performance, particularly for longer sentences. Moreover, the importance of prepositional phrases in a sentence increases with sentence length. We also find that these longer sentences with more prepositional phrases are more likely to appear in speeches than in written books by U.S. presidents.