Yohei Oseki


2024

pdf bib
Emergent Word Order Universals from Cognitively-Motivated Language Models
Tatsuki Kuribayashi | Ryo Ueda | Ryo Yoshida | Yohei Oseki | Ted Briscoe | Timothy Baldwin
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The world’s languages exhibit certain so-called typological or implicational universals; for example, Subject-Object-Verb (SOV) languages typically use postpositions. Explaining the source of such biases is a key goal of linguistics.We study word-order universals through a computational simulation with language models (LMs).Our experiments show that typologically-typical word orders tend to have lower perplexity estimated by LMs with cognitively plausible biases: syntactic biases, specific parsing strategies, and memory limitations. This suggests that the interplay of cognitive biases and predictability (perplexity) can explain many aspects of word-order universals.It also showcases the advantage of cognitively-motivated LMs, typically employed in cognitive modeling, in the simulation of language universals.

pdf bib
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Tatsuki Kuribayashi | Giulia Rambelli | Ece Takmaz | Philipp Wicke | Yohei Oseki
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

pdf bib
Is Structure Dependence Shaped for Efficient Communication?: A Case Study on Coordination
Kohei Kajikawa | Yusuke Kubota | Yohei Oseki
Proceedings of the 28th Conference on Computational Natural Language Learning

Natural language exhibits various universal properties.But why do these universals exist?One explanation is that they arise from functional pressures to achieve efficient communication, a view which attributes cross-linguistic properties to domain-general cognitive abilities.This hypothesis has successfully addressed some syntactic universal properties such as compositionality and Greenbergian word order universals.However, more abstract syntactic universals have not been explored from the perspective of efficient communication.Among such universals, the most notable one is structure dependence, that is, grammar-internal operations crucially depend on hierarchical representations.This property has traditionally been taken to be central to natural language and to involve domain-specific knowledge irreducible to communicative efficiency. In this paper, we challenge the conventional view by investigating whether structure dependence realizes efficient communication, focusing on coordinate structures.We design three types of artificial languages: (i) one with a structure-dependent reduction operation, which is similar to natural language, (ii) one without any reduction operations, and (iii) one with a linear (rather than structure-dependent) reduction operation.We quantify the communicative efficiency of these languages.The results demonstrate that the language with the structure-dependent reduction operation is significantly more communicatively efficient than the counterfactual languages.This suggests that the existence of structure-dependent properties can be explained from the perspective of efficient communication.

pdf bib
Can Language Models Induce Grammatical Knowledge from Indirect Evidence?
Miyu Oba | Yohei Oseki | Akiyo Fukatsu | Akari Haga | Hiroki Ouchi | Taro Watanabe | Saku Sugawara
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

What kinds of and how much data is necessary for language models to induce grammatical knowledge to judge sentence acceptability? Recent language models still have much room for improvement in their data efficiency compared to humans. This paper investigates whether language models efficiently use indirect data (indirect evidence), from which they infer sentence acceptability. In contrast, humans use indirect evidence efficiently, which is considered one of the inductive biases contributing to efficient language acquisition. To explore this question, we introduce the Wug InDirect Evidence Test (WIDET), a dataset consisting of training instances inserted into the pre-training data and evaluation instances. We inject synthetic instances with newly coined wug words into pretraining data and explore the model’s behavior on evaluation data that assesses grammatical acceptability regarding those words. We prepare the injected instances by varying their levels of indirectness and quantity. Our experiments surprisingly show that language models do not induce grammatical knowledge even after repeated exposure to instances with the same structure but differing only in lexical items from evaluation instances in certain language phenomena. Our findings suggest a potential direction for future research: developing models that use latent indirect evidence to induce grammatical knowledge.

pdf bib
Psychometric Predictive Power of Large Language Models
Tatsuki Kuribayashi | Yohei Oseki | Timothy Baldwin
Findings of the Association for Computational Linguistics: NAACL 2024

Instruction tuning aligns the response of large language models (LLMs) with human preferences.Despite such efforts in human–LLM alignment, we find that instruction tuning does not always make LLMs human-like from a cognitive modeling perspective. More specifically, next-word probabilities estimated by instruction-tuned LLMs are often worse at simulating human reading behavior than those estimated by base LLMs.In addition, we explore prompting methodologies for simulating human reading behavior with LLMs. Our results show that prompts reflecting a particular linguistic hypothesis improve psychometric predictive power, but are still inferior to small base models.These findings highlight that recent advancements in LLMs, i.e., instruction tuning and prompting, do not offer better estimates than direct probability measurements from base LLMs in cognitive modeling. In other words, pure next-word probability remains a strong predictor for human reading behavior, even in the age of LLMs.

pdf bib
Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision
Ryo Yoshida | Taiga Someya | Yohei Oseki
Findings of the Association for Computational Linguistics: ACL 2024

Syntactic Language Models (SLMs) can be trained efficiently to reach relatively high performance; however, they have trouble with inference efficiency due to the explicit generation of syntactic structures. In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we “plant” trees into attention weights of unidirectional Transformer LMs to implicitly reflect syntactic structures of natural language. Specifically, unidirectional Transformer LMs trained with tree-planting will be called Tree-Planted Transformers (TPT), which inherit the training efficiency from SLMs without changing the inference efficiency of their underlying Transformer LMs. Targeted syntactic evaluations on the SyntaxGym benchmark demonstrated that TPTs, despite the lack of explicit generation of syntactic structures, significantly outperformed not only vanilla Transformer LMs but also various SLMs that generate hundreds of syntactic structures in parallel. This result suggests that TPTs can learn human-like syntactic knowledge as data-efficiently as SLMs while maintaining the modeling space of Transformer LMs unchanged.

pdf bib
Modeling Overregularization in Children with Small Language Models
Akari Haga | Saku Sugawara | Akiyo Fukatsu | Miyu Oba | Hiroki Ouchi | Taro Watanabe | Yohei Oseki
Findings of the Association for Computational Linguistics: ACL 2024

The imitation of the children’s language acquisition process has been explored to make language models (LMs) more efficient.In particular, errors caused by children’s regularization (so-called overregularization, e.g., using wroted for the past tense of write) have been widely studied to reveal the mechanisms of language acquisition. Existing research has analyzed regularization in language acquisition only by modeling word inflection directly, which is unnatural in light of human language acquisition. In this paper, we hypothesize that language models that imitate the errors children make during language acquisition have a learning process more similar to humans. To verify this hypothesis, we analyzed the learning curve and error preferences of verb inflections in small-scale LMs using acceptability judgments. We analyze the differences in results by model architecture, data, and tokenization. Our model shows child-like U-shaped learning curves clearly for certain verbs, but the preferences for types of overgeneralization did not fully match the observations in children.

pdf bib
Cognitive Information Bottleneck: Extracting Minimal Sufficient Cognitive Language Processing Signals
Yuto Harada | Yohei Oseki
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In Reinforcement Learning from Human Feedback (RLHF), explicit human feedback, such as rankings, is employed to align Natural Language Processing (NLP) models with human preferences. In contrast, the potential of implicit human feedback, encompassing cognitive processing signals like eye-tracking and brain activity, remains underexplored. These signals capture unconscious human responses but are often marred by noise and redundancy, complicating their application to specific tasks. To address this issue, we introduce the Cognitive Information Bottleneck (CIB), a method that extracts only the task-relevant information from cognitive processing signals. Grounded in the principles of the information bottleneck, CIB aims to learn representations that maximize the mutual information between the representations and targets while minimizing the mutual information between inputs and representations. By employing CIB to filter out redundant information from cognitive processing signals, our goal is to provide representations that are both minimal and sufficient. This approach enables more efficient fitting of models to inputs. Our results show that the proposed method outperforms existing methods in efficiently compressing various cognitive processing signals and significantly enhances performance on downstream tasks. Evaluated on public datasets, our model surpasses contemporary state-of-the-art models. Furthermore, by analyzing these compressed representations, we offer insights into how cognitive processing signals can be leveraged to improve performance.

pdf bib
JCoLA: Japanese Corpus of Linguistic Acceptability
Taiga Someya | Yushi Sugimoto | Yohei Oseki
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Neural language models have exhibited outstanding performance in a range of downstream tasks. However, there is limited understanding regarding the extent to which these models internalize syntactic knowledge, so that various datasets have recently been constructed to facilitate syntactic evaluation of language models across languages. In this paper, we introduce JCoLA (Japanese Corpus of Linguistic Acceptability), which consists of 10,020 sentences annotated with binary acceptability judgments. Specifically, those sentences are manually extracted from linguistics textbooks, handbooks and journal articles, and split into in-domain data (86 %; relatively simple acceptability judgments extracted from textbooks and handbooks) and out-of-domain data (14 %; theoretically significant acceptability judgments extracted from journal articles), the latter of which is categorized by 12 linguistic phenomena. We then evaluate the syntactic knowledge of 9 different types of Japanese and multilingual language models on JCoLA. The results demonstrated that several models could surpass human performance for the in-domain data, while no models were able to exceed human performance for the out-of-domain data. Error analyses by linguistic phenomena further revealed that although neural language models are adept at handling local syntactic dependencies like argument structure, their performance wanes when confronted with long-distance syntactic dependencies like verbal agreement and NPI licensing.

pdf bib
Learning Bidirectional Morphological Inflection like Humans
Akiyo Fukatsu | Yuto Harada | Yohei Oseki
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

For nearly the past forty years, there has been discussion regarding whether symbolic representations are involved in morphological inflection, a debate commonly known as the Past Tense Debate. The previous literature has extensively explored whether neural models, which do not use symbolic representations can process morphological inflection like humans. However, current research interest has shifted towards whether neural models can acquire morphological inflection like humans. In this paper, we trained neural models, the recurrent neural network (RNN) with attention and the transformer, and a symbolic model, the Minimal Generalization Learner (MGL), under a human-like learning environment. Evaluating the models from the perspective of language acquisition, we found that while the transformer and the MGL exhibited some human-like characteristics, the RNN with attention did not demonstrate human-like behavior across all the evaluation metrics considered in this study. Furthermore, none of the models accurately inflected verbs in the same manner as humans in terms of morphological inflection direction. These results suggest that these models fall short as cognitive models of morphological inflection.

pdf bib
Targeted Syntactic Evaluation on the Chomsky Hierarchy
Taiga Someya | Ryo Yoshida | Yohei Oseki
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we propose a novel evaluation paradigm for Targeted Syntactic Evaluations, where we assess how well language models can recognize linguistic phenomena situated at different levels of the Chomsky hierarchy. Specifically, we create formal languages that abstract four syntactic phenomena in natural languages, each identified at a different level of the Chomsky hierarchy, and use these to evaluate the capabilities of language models: (1) (Adj)ˆn NP type, (2) NPˆn VPˆn type, (3) Nested Dependency type, and (4) Cross Serial Dependency type. We first train three different language models (LSTM, Transformer LM, and Stack-RNN) on language modeling tasks and then evaluate them using pairs of a positive and a negative sentence by investigating whether they can assign a higher probability to the positive sentence than the negative one. Our result demonstrated that all language models have the ability to capture the structural patterns of the (Adj)ˆn NP type formal language. However, LSTM and Transformer LM failed to capture NPˆn VPˆn type language and no architectures can recognize nested dependency and Cross Serial dependency correctly. Neural language models, especially Transformer LMs, have exhibited high performance across a multitude of downstream tasks, leading to the perception that they possess an understanding of natural languages. However, our findings suggest that these models may not necessarily comprehend the syntactic structures that underlie natural language phenomena such as dependency. Rather, it appears that they may extend grammatical rules equivalent to regular grammars to approximate the rules governing dependencies.

2023

pdf bib
BabyLM Challenge: Curriculum learning based on sentence complexity approximating language acquisition
Miyu Oba | Akari Haga | Akiyo Fukatsu | Yohei Oseki
Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning

pdf bib
JBLiMP: Japanese Benchmark of Linguistic Minimal Pairs
Taiga Someya | Yohei Oseki
Findings of the Association for Computational Linguistics: EACL 2023

In this paper, we introduce JBLiMP (Japanese Benchmark of Linguistic Minimal Pairs), a novel dataset for targeted syntactic evaluations of language models in Japanese. JBLiMP consists of 331 minimal pairs, which are created based on acceptability judgments extracted from journal articles in theoretical linguistics. These minimal pairs are grouped into 11 categories, each covering a different linguistic phenomenon. JBLiMP is unique in that it successfully combines two important features independently observed in existing datasets: (i) coverage of complex linguistic phenomena (cf. CoLA) and (ii) presentation of sentences as minimal pairs (cf. BLiMP). In addition, JBLiMP is the first dataset for targeted syntactic evaluations of language models in Japanese, thus allowing the comparison of syntactic knowledge of language models across different languages. We then evaluate the syntactic knowledge of several language models on JBLiMP: GPT-2, LSTM, and n-gram language models. The results demonstrated that all the architectures achieved comparable overall accuracies around 75%. Error analyses by linguistic phenomenon further revealed that these language models successfully captured local dependencies like nominal structures, but not long-distance dependencies such as verbal agreement and binding.

pdf bib
How Much Syntactic Supervision is “Good Enough”?
Hiroshi Noji | Yohei Oseki
Findings of the Association for Computational Linguistics: EACL 2023

In this paper, we explore how much syntactic supervision is “good enough” to make language models (LMs) more human-like. Specifically, we propose the new method called syntactic ablation, where syntactic LMs, namely Recurrent Neural Network Grammars (RNNGs), are gradually ablated from full syntactic supervision to zero syntactic supervision (≈ unidirectional LSTM) by preserving NP, VP, PP, SBAR nonterminal symbols and the combinations thereof. The 17 ablated grammars are then evaluated via targeted syntactic evaluation on the SyntaxGym benchmark. The results of our syntactic ablation demonstrated that (i) the RNNG with zero syntactic supervision underperformed the RNNGs with some syntactic supervision, (ii) the RNNG with full syntactic supervision underperformed the RNNGs with less syntactic supervision, and (iii) the RNNG with mild syntactic supervision achieved the best performance comparable to the state-of-the-art GPT-2-XL. Those results may suggest that the “good enough” approach to language processing seems to make LMs more human-like.

pdf bib
CANDS: A Computational Implementation of Collins and Stabler (2016)
Satoru Ozaki | Yohei Oseki
Proceedings of the Society for Computation in Linguistics 2023

2022

pdf bib
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Emmanuele Chersoni | Nora Hollenstein | Cassandra Jacobs | Yohei Oseki | Laurent Prévot | Enrico Santus
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

pdf bib
CMCL 2022 Shared Task on Multilingual and Crosslingual Prediction of Human Reading Behavior
Nora Hollenstein | Emmanuele Chersoni | Cassandra Jacobs | Yohei Oseki | Laurent Prévot | Enrico Santus
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

We present the second shared task on eye-tracking data prediction of the Cognitive Modeling and Computational Linguistics Workshop (CMCL). Differently from the previous edition, participating teams are asked to predict eye-tracking features from multiple languages, including a surprise language for which there were no available training data. Moreover, the task also included the prediction of standard deviations of feature values in order to account for individual differences between readers.A total of six teams registered to the task. For the first subtask on multilingual prediction, the winning team proposed a regression model based on lexical features, while for the second subtask on cross-lingual prediction, the winning team used a hybrid model based on a multilingual transformer embeddings as well as statistical features.

pdf bib
Context Limitations Make Neural Language Models More Human-Like
Tatsuki Kuribayashi | Yohei Oseki | Ana Brassard | Kentaro Inui
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Language models (LMs) have been used in cognitive modeling as well as engineering studies—they compute information-theoretic complexity metrics that simulate humans’ cognitive load during reading.This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and that of humans.Our results showed that constraining the LMs’ context access improved their simulation of human reading behavior.We also showed that LM-human gaps in context access were associated with specific syntactic constructions; incorporating syntactic biases into LMs’ context access might enhance their cognitive plausibility.

pdf bib
Composition, Attention, or Both?
Ryo Yoshida | Yohei Oseki
Findings of the Association for Computational Linguistics: EMNLP 2022

In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components—the composition function and the self-attention mechanism—can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role to make LMs more human-like, and closer inspection of linguistic phenomenon implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.

pdf bib
Learning Argument Structures with Recurrent Neural Network Grammars
Ryo Yoshida | Yohei Oseki
Proceedings of the Society for Computation in Linguistics 2022

2021

pdf bib
Lower Perplexity is Not Always Human-Like
Tatsuki Kuribayashi | Yohei Oseki | Takumi Ito | Ryo Yoshida | Masayuki Asahara | Kentaro Inui
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movement) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universal within the general community. In order to fill the gap, this paper investigates whether the established results in computational psycholinguistics can be generalized across languages. Specifically, we re-examine an established generalization —the lower perplexity a language model has, the more human-like the language model is— in Japanese with typologically different structures from English. Our experiments demonstrate that this established generalization exhibits a surprising lack of universality; namely, lower perplexity is not always human-like. Moreover, this discrepancy between English and Japanese is further explored from the perspective of (non-)uniform information density. Overall, our results suggest that a cross-lingual evaluation will be necessary to construct human-like computational models.

pdf bib
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Emmanuele Chersoni | Nora Hollenstein | Cassandra Jacobs | Yohei Oseki | Laurent Prévot | Enrico Santus
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

pdf bib
CMCL 2021 Shared Task on Eye-Tracking Prediction
Nora Hollenstein | Emmanuele Chersoni | Cassandra L. Jacobs | Yohei Oseki | Laurent Prévot | Enrico Santus
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Eye-tracking data from reading represent an important resource for both linguistics and natural language processing. The ability to accurately model gaze features is crucial to advance our understanding of language processing. This paper describes the Shared Task on Eye-Tracking Data Prediction, jointly organized with the eleventh edition of the Work- shop on Cognitive Modeling and Computational Linguistics (CMCL 2021). The goal of the task is to predict 5 different token- level eye-tracking metrics of the Zurich Cognitive Language Processing Corpus (ZuCo). Eye-tracking data were recorded during natural reading of English sentences. In total, we received submissions from 13 registered teams, whose systems include boosting algorithms with handcrafted features, neural models leveraging transformer language models, or hybrid approaches. The winning system used a range of linguistic and psychometric features in a gradient boosting framework.

pdf bib
Modeling Human Sentence Processing with Left-Corner Recurrent Neural Network Grammars
Ryo Yoshida | Hiroshi Noji | Yohei Oseki
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

In computational linguistics, it has been shown that hierarchical structures make language models (LMs) more human-like. However, the previous literature has been agnostic about a parsing strategy of the hierarchical models. In this paper, we investigated whether hierarchical structures make LMs more human-like, and if so, which parsing strategy is most cognitively plausible. In order to address this question, we evaluated three LMs against human reading times in Japanese with head-final left-branching structures: Long Short-Term Memory (LSTM) as a sequential model and Recurrent Neural Network Grammars (RNNGs) with top-down and left-corner parsing strategies as hierarchical models. Our computational modeling demonstrated that left-corner RNNGs outperformed top-down RNNGs and LSTM, suggesting that hierarchical and left-corner architectures are more cognitively plausible than top-down or sequential architectures. In addition, the relationships between the cognitive plausibility and (i) perplexity, (ii) parsing, and (iii) beam size will also be discussed.

pdf bib
Effective Batching for Recurrent Neural Network Grammars
Hiroshi Noji | Yohei Oseki
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2020

pdf bib
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics
Emmanuele Chersoni | Cassandra Jacobs | Yohei Oseki | Laurent Prévot | Enrico Santus
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

pdf bib
Design of BCCWJ-EEG: Balanced Corpus with Human Electroencephalography
Yohei Oseki | Masayuki Asahara
Proceedings of the Twelfth Language Resources and Evaluation Conference

The past decade has witnessed the happy marriage between natural language processing (NLP) and the cognitive science of language. Moreover, given the historical relationship between biological and artificial neural networks, the advent of deep learning has re-sparked strong interests in the fusion of NLP and the neuroscience of language. Importantly, this inter-fertilization between NLP, on one hand, and the cognitive (neuro)science of language, on the other, has been driven by the language resources annotated with human language processing data. However, there remain several limitations with those language resources on annotations, genres, languages, etc. In this paper, we describe the design of a novel language resource called BCCWJ-EEG, the Balanced Corpus of Contemporary Written Japanese (BCCWJ) experimentally annotated with human electroencephalography (EEG). Specifically, after extensively reviewing the language resources currently available in the literature with special focus on eye-tracking and EEG, we summarize the details concerning (i) participants, (ii) stimuli, (iii) procedure, (iv) data preprocessing, (v) corpus evaluation, (vi) resource release, and (vii) compilation schedule. In addition, potential applications of BCCWJ-EEG to neuroscience and NLP will also be discussed.

pdf bib
Modeling morphological processing in human magnetoencephalography
Yohei Oseki | Alec Marantz
Proceedings of the Society for Computation in Linguistics 2020

2019

pdf bib
Modeling Hierarchical Syntactic Structures in Morphological Processing
Yohei Oseki | Charles Yang | Alec Marantz
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Sentences are represented as hierarchical syntactic structures, which have been successfully modeled in sentence processing. In contrast, despite the theoretical agreement on hierarchical syntactic structures within words, words have been argued to be computationally less complex than sentences and implemented by finite-state models as linear strings of morphemes, and even the psychological reality of morphemes has been denied. In this paper, extending the computational models employed in sentence processing to morphological processing, we performed a computational simulation experiment where, given incremental surprisal as a linking hypothesis, five computational models with different representational assumptions were evaluated against human reaction times in visual lexical decision experiments available from the English Lexicon Project (ELP), a “shared task” in the morphological processing literature. The simulation experiment demonstrated that (i) “amorphous” models without morpheme units underperformed relative to “morphous” models, (ii) a computational model with hierarchical syntactic structures, Probabilistic Context-Free Grammar (PCFG), most accurately explained human reaction times, and (iii) this performance was achieved on top of surface frequency effects. These results strongly suggest that morphological processing tracks morphemes incrementally from left to right and parses them into hierarchical syntactic structures, contrary to “amorphous” and finite-state models of morphological processing.

pdf bib
Inverting and Modeling Morphological Inflection
Yohei Oseki | Yasutada Sudo | Hiromu Sakai | Alec Marantz
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

Previous “wug” tests (Berko, 1958) on Japanese verbal inflection have demonstrated that Japanese speakers, both adults and children, cannot inflect novel present tense forms to “correct” past tense forms predicted by rules of existent verbs (de Chene, 1982; Vance, 1987, 1991; Klafehn, 2003, 2013), indicating that Japanese verbs are merely stored in the mental lexicon. However, the implicit assumption that present tense forms are bases for verbal inflection should not be blindly extended to morphologically rich languages like Japanese in which both present and past tense forms are morphologically complex without inherent direction (Albright, 2002). Interestingly, there are also independent observations in the acquisition literature to suggest that past tense forms may be bases for verbal inflection in Japanese (Klafehn, 2003; Murasugi et al., 2010; Hirose, 2017; Tatsumi et al., 2018). In this paper, we computationally simulate two directions of verbal inflection in Japanese, Present → Past and Past → Present, with the rule-based computational model called Minimal Generalization Learner (MGL; Albright and Hayes, 2003) and experimentally evaluate the model with the bidirectional “wug” test where humans inflect novel verbs in two opposite directions. We conclude that Japanese verbs can be computed online via some generalizations and those generalizations do depend on the direction of morphological inflection.