Hajime Senuma


2024

Computational Complexity of Natural Morphology Revisited
Hajime Senuma | Akiko Aizawa
Transactions of the Association for Computational Linguistics, Volume 12

This paper revisits a classical, yet fundamental, discussion in theoretical computational linguistics: the computational complexity of natural languages. Past studies have revealed that syntax, as observed in Swiss-German, is not weakly context-free. Concerning morphology, Culy (1985) employed a construction in Bambara to show that morphology is not weakly context-free either; however, Manaster-Ramer (1988) pointed out that the Bambara case can be problematic because the wordhood of the construction relies on special tonal behaviors, and it is ambiguous whether those behaviors belong to the morphological domain, which raises doubts about whether the case is a genuine morphological phenomenon. In this paper, we argue that Classical Ainu also defies weak context-freeness at the morphological level. The construction we introduce is unambiguously morphological, because the language's valency-sensitive structure and valency-changing operations, such as noun incorporation, preclude its grammatical interpretation as syntactic.
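To make the formal-language backdrop concrete: Culy's Bambara construction (e.g., "wulu o wulu", roughly "whichever dog") has the shape of a copy language { x marker x : x nonempty }, the textbook example of a pattern no context-free grammar can generate. The sketch below is ours, not the paper's; the marker handling and function name are illustrative only, and it deliberately ignores the tonal behavior that Manaster-Ramer's objection turns on.

# Illustrative sketch only (not from the paper): membership test for the
# copy language { x + marker + x }, the non-context-free pattern that
# Culy's Bambara construction instantiates.

def is_copy_with_marker(word: str, marker: str = "o") -> bool:
    """Check membership in the copy language {x marker x : x nonempty}."""
    n = len(word)
    # A well-formed string is x + marker + x, so its length is odd
    # (2 * len(x) + 1) and its middle symbol is the marker.
    if n < 3 or n % 2 == 0:
        return False
    half = n // 2
    return word[half] == marker and word[:half] == word[half + 1:]

assert is_copy_with_marker("wuluowulu")        # "wulu o wulu": halves match
assert not is_copy_with_marker("wuluowuru")    # halves differ: rejected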

2018

Universal Dependencies for Ainu
Hajime Senuma | Akiko Aizawa
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

Seq2seq for Morphological Reinflection: When Deep Learning Fails
Hajime Senuma | Akiko Aizawa
Proceedings of the CoNLL SIGMORPHON 2017 Shared Task: Universal Morphological Reinflection

Toward Universal Dependencies for Ainu
Hajime Senuma | Akiko Aizawa
Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)

2016

Learning Succinct Models: Pipelined Compression with L1-Regularization, Hashing, Elias-Fano Indices, and Quantization
Hajime Senuma | Akiko Aizawa
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

The recent proliferation of smart devices necessitates methods to learn small-sized models. This paper demonstrates that if there are m features in total but only n = o(√m) features are required to distinguish examples, with Ω(log m) training examples and reasonable settings, it is possible to obtain a good model in a succinct representation using n log₂(m/n) + o(m) bits, by using a pipeline of existing compression methods: L1-regularized logistic regression, feature hashing, Elias–Fano indices, and randomized quantization. An experiment shows that a noun phrase chunking model for which an existing library requires 27 megabytes can be compressed to less than 13 kilobytes without notable loss of accuracy.
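Of the four pipeline stages, the Elias–Fano index is the one that realizes the n log₂(m/n) bound in the abstract: it stores a sorted set of n feature indices drawn from a universe of size m in roughly n log₂(m/n) low bits plus about 2n high bits. Below is a minimal sketch of plain Elias–Fano encoding; it is ours, not the paper's implementation, and the function names and example values are illustrative.

# Minimal Elias-Fano sketch (illustrative, not the paper's code).
import math

def elias_fano_encode(values, universe):
    """Encode sorted non-negative ints < universe.

    Uses ~ n*log2(universe/n) low bits plus ~ 2n high bits in total,
    matching the n log2(m/n) + o(m) bound quoted in the abstract.
    """
    n = len(values)
    l = max(0, int(math.log2(universe / n)))    # width of the low-bit part
    low = [v & ((1 << l) - 1) for v in values]  # l explicit bits per value
    high = [0] * (n + (universe >> l) + 1)      # unary-coded high parts
    for i, v in enumerate(values):
        high[(v >> l) + i] = 1                  # i-th 1-bit marks h_i = v >> l
    return l, low, high

def elias_fano_decode(l, low, high):
    values, i = [], 0
    for pos, bit in enumerate(high):
        if bit:                                 # i-th 1-bit at pos => h_i = pos - i
            values.append(((pos - i) << l) | low[i])
            i += 1
    return values

vals = [3, 4, 7, 13, 14, 15, 21, 43]            # e.g., surviving feature indices
l, low, high = elias_fano_encode(vals, universe=64)
assert elias_fano_decode(l, low, high) == vals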

2011

K-means Clustering with Feature Hashing
Hajime Senuma
Proceedings of the ACL 2011 Student Session
