Parsed Corpus as a Source for Testing Generalizations in Japanese Syntax

Hideki Kishimoto, Prashant Pardeshi


Abstract
In this paper, we discuss constituent ordering generalizations in Japanese. Japanese has SOV as its basic order, but it permits a significant range of argument order variations brought about by ‘scrambling’. Although scrambling induces few semantic effects, it is conceivable that marked orders are derived from the unmarked order under pragmatic or other motivations. The difference between basic and derived orders is not reflected in native speakers’ grammaticality judgments, but we suggest that intuitions about the ordering of arguments may be attested in corpus data. Using the Keyaki treebank (a proper subset of which is the NINJAL Parsed Corpus of Modern Japanese (NPCMJ)), we show that naturally occurring corpus data confirm that marked orderings of arguments are less frequent than their unmarked counterparts. We suggest some possible motivations behind the argument order variations.
Anthology ID:
2019.lilt-18.3
Volume:
Linguistic Issues in Language Technology, Volume 18, 2019 - Exploiting Parsed Corpora: Applications in Research, Pedagogy, and Processing
Month:
Jul
Year:
2019
Venue:
LILT
Publisher:
CSLI Publications
URL:
https://aclanthology.org/2019.lilt-18.3
Cite (ACL):
Hideki Kishimoto and Prashant Pardeshi. 2019. Parsed Corpus as a Source for Testing Generalizations in Japanese Syntax. Linguistic Issues in Language Technology, 18.
Cite (Informal):
Parsed Corpus as a Source for Testing Generalizations in Japanese Syntax (Kishimoto & Pardeshi, LILT 2019)
PDF:
https://aclanthology.org/2019.lilt-18.3.pdf