Ryosuke Kohita


2021

pdf bib
Language-based General Action Template for Reinforcement Learning Agents
Ryosuke Kohita | Akifumi Wachi | Daiki Kimura | Subhajit Chaudhury | Michiaki Tatsubori | Asim Munawar
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf bib
Polar Embedding
Ran Iwamoto | Ryosuke Kohita | Akifumi Wachi
Proceedings of the 25th Conference on Computational Natural Language Learning

Hierarchical relationships are invaluable information for many natural language processing (NLP) tasks. Distributional representation has become a fundamental approach for encoding word relationships, particularly embeddings in hyperbolic space showed great performance in representing hierarchies by taking advantage of their spatial properties. However, most machine learning systems do not suppose to use in such complex non-Euclidean geometries. To achieve hierarchy representations in commonly used Euclidean space, we propose Polar Embedding that learns word embeddings with the polar coordinate system. Utilizing characteristics of polar coordinates, the hierarchy of words is expressed with two independent variables: radius (generality) and angles (similarity), and their variables are optimized separately. Polar embedding shows word hierarchies explicitly and allows us to use beneficial resources such as word frequencies or word generality annotations for computing radiuses. We introduce an optimization method for learning angles in limited ranges of polar coordinates, which combining a loss function controlling gradient and distribution uniformization. Experimental results on hypernymy datasets indicate that our approach outperforms other embeddings in low-dimensional Euclidean space and competitively performs even with hyperbolic embeddings, which possess a geometric advantage.

pdf bib
Neuro-Symbolic Reinforcement Learning with First-Order Logic
Daiki Kimura | Masaki Ono | Subhajit Chaudhury | Ryosuke Kohita | Akifumi Wachi | Don Joven Agravante | Michiaki Tatsubori | Asim Munawar | Alexander Gray
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Deep reinforcement learning (RL) methods often require many trials before convergence, and no direct interpretability of trained policies is provided. In order to achieve fast convergence and interpretability for the policy in RL, we propose a novel RL method for text-based games with a recent neuro-symbolic framework called Logical Neural Network, which can learn symbolic and interpretable rules in their differentiable network. The method is first to extract first-order logical facts from text observation and external word meaning network (ConceptNet), then train a policy in the network with directly interpretable logical operators. Our experimental results show RL training with the proposed method converges significantly faster than other state-of-the-art neuro-symbolic methods in a TextWorld benchmark.

pdf bib
LOA: Logical Optimal Actions for Text-based Interaction Games
Daiki Kimura | Subhajit Chaudhury | Masaki Ono | Michiaki Tatsubori | Don Joven Agravante | Asim Munawar | Akifumi Wachi | Ryosuke Kohita | Alexander Gray
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

We present Logical Optimal Actions (LOA), an action decision architecture of reinforcement learning applications with a neuro-symbolic framework which is a combination of neural network and symbolic knowledge acquisition approach for natural language interaction games. The demonstration for LOA experiments consists of a web-based interactive platform for text-based games and visualization for acquired knowledge for improving interpretability for trained rules. This demonstration also provides a comparison module with other neuro-symbolic approaches as well as non-symbolic state-of-the-art agent models on the same text-based games. Our LOA also provides open-sourced implementation in Python for the reinforcement learning environment to facilitate an experiment for studying neuro-symbolic agents. Demo site: https://ibm.biz/acl21-loa, Code: https://github.com/ibm/loa

2020

pdf bib
Q-learning with Language Model for Edit-based Unsupervised Summarization
Ryosuke Kohita | Akifumi Wachi | Yang Zhao | Ryuki Tachibana
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Unsupervised methods are promising for abstractive textsummarization in that the parallel corpora is not required. However, their performance is still far from being satisfied, therefore research on promising solutions is on-going. In this paper, we propose a new approach based on Q-learning with an edit-based summarization. The method combines two key modules to form an Editorial Agent and Language Model converter (EALM). The agent predicts edit actions (e.t., delete, keep, and replace), and then the LM converter deterministically generates a summary on the basis of the action signals. Q-learning is leveraged to train the agent to produce proper edit actions. Experimental results show that EALM delivered competitive performance compared with the previous encoder-decoder-based methods, even with truly zero paired data (i.e., no validation set). Defining the task as Q-learning enables us not only to develop a competitive method but also to make the latest techniques in reinforcement learning available for unsupervised summarization. We also conduct qualitative analysis, providing insights into future study on unsupervised summarizers.

pdf bib
Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models
Ethan Wilcox | Peng Qian | Richard Futrell | Ryosuke Kohita | Roger Levy | Miguel Ballesteros
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Humans can learn structural properties about a word from minimal experience, and deploy their learned syntactic representations uniformly in different grammatical contexts. We assess the ability of modern neural language models to reproduce this behavior in English and evaluate the effect of structural supervision on learning outcomes. First, we assess few-shot learning capabilities by developing controlled experiments that probe models’ syntactic nominal number and verbal argument structure generalizations for tokens seen as few as two times during training. Second, we assess invariance properties of learned representation: the ability of a model to transfer syntactic generalizations from a base context (e.g., a simple declarative active-voice sentence) to a transformed context (e.g., an interrogative sentence). We test four models trained on the same dataset: an n-gram baseline, an LSTM, and two LSTM-variants trained with explicit structural supervision. We find that in most cases, the neural models are able to induce the proper syntactic generalizations after minimal exposure, often from just two examples during training, and that the two structurally supervised models generalize more accurately than the LSTM model. All neural models are able to leverage information learned in base contexts to drive expectations in transformed contexts, indicating that they have learned some invariance properties of syntax.

pdf bib
Image Position Prediction in Multimodal Documents
Masayasu Muraoka | Ryosuke Kohita | Etsuko Ishii
Proceedings of the Twelfth Language Resources and Evaluation Conference

Conventional multimodal tasks, such as caption generation and visual question answering, have allowed machines to understand an image by describing or being asked about it in natural language, often via a sentence. Datasets for these tasks contain a large number of pairs of an image and the corresponding sentence as an instance. However, a real multimodal document such as a news article or Wikipedia page consists of multiple sentences with multiple images. Such documents require an advanced skill of jointly considering the multiple texts and multiple images, beyond a single sentence and image, for the interpretation. Therefore, aiming at building a system that can understand multimodal documents, we propose a task called image position prediction (IPP). In this task, a system learns plausible positions of images in a given document. To study this task, we automatically constructed a dataset of 66K multimodal documents with 320K images from Wikipedia articles. We conducted a preliminary experiment to evaluate the performance of a current multimodal system on our task. The experimental results show that the system outperformed simple baselines while the performance is still far from human performance, which thus poses new challenges in multimodal research.

pdf bib
Interactive Construction of User-Centric Dictionary for Text Analytics
Ryosuke Kohita | Issei Yoshida | Hiroshi Kanayama | Tetsuya Nasukawa
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We propose a methodology to construct a term dictionary for text analytics through an interactive process between a human and a machine, which helps the creation of flexible dictionaries with precise granularity required in typical text analysis. This paper introduces the first formulation of interactive dictionary construction to address this issue. To optimize the interaction, we propose a new algorithm that effectively captures an analyst’s intention starting from only a small number of sample terms. Along with the algorithm, we also design an automatic evaluation framework that provides a systematic assessment of any interactive method for the dictionary creation task. Experiments using real scenario based corpora and dictionaries show that our algorithm outperforms baseline methods, and works even with a small number of interactions.

2018

pdf bib
A neural parser as a direct classifier for head-final languages
Hiroshi Kanayama | Masayasu Muraoka | Ryosuke Kohita
Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP

This paper demonstrates a neural parser implementation suitable for consistently head-final languages such as Japanese. Unlike the transition- and graph-based algorithms in most state-of-the-art parsers, our parser directly selects the head word of a dependent from a limited number of candidates. This method drastically simplifies the model so that we can easily interpret the output of the neural model. Moreover, by exploiting grammatical knowledge to restrict possible modification types, we can control the output of the parser to reduce specific errors without adding annotated corpora. The neural parser performed well both on conventional Japanese corpora and the Japanese version of Universal Dependency corpus, and the advantages of distributed representations were observed in the comparison with the non-neural conventional model.

pdf bib
Dynamic Feature Selection with Attention in Incremental Parsing
Ryosuke Kohita | Hiroshi Noji | Yuji Matsumoto
Proceedings of the 27th International Conference on Computational Linguistics

One main challenge for incremental transition-based parsers, when future inputs are invisible, is to extract good features from a limited local context. In this work, we present a simple technique to maximally utilize the local features with an attention mechanism, which works as context- dependent dynamic feature selection. Our model learns, for example, which tokens should a parser focus on, to decide the next action. Our multilingual experiment shows its effectiveness across many languages. We also present an experiment with augmented test dataset and demon- strate it helps to understand the model’s behavior on locally ambiguous points.

2017

pdf bib
Effective Online Reordering with Arc-Eager Transitions
Ryosuke Kohita | Hiroshi Noji | Yuji Matsumoto
Proceedings of the 15th International Conference on Parsing Technologies

We present a new transition system with word reordering for unrestricted non-projective dependency parsing. Our system is based on decomposed arc-eager rather than arc-standard, which allows more flexible ambiguity resolution between a local projective and non-local crossing attachment. In our experiment on Universal Dependencies 2.0, we find our parser outperforms the ordinary swap-based parser particularly on languages with a large amount of non-projectivity.

pdf bib
Multilingual Back-and-Forth Conversion between Content and Function Head for Easy Dependency Parsing
Ryosuke Kohita | Hiroshi Noji | Yuji Matsumoto
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Universal Dependencies (UD) is becoming a standard annotation scheme cross-linguistically, but it is argued that this scheme centering on content words is harder to parse than the conventional one centering on function words. To improve the parsability of UD, we propose a back-and-forth conversion algorithm, in which we preprocess the training treebank to increase parsability, and reconvert the parser outputs to follow the UD scheme as a postprocess. We show that this technique consistently improves LAS across languages even with a state-of-the-art parser, in particular on core dependency arcs such as nominal modifier. We also provide an in-depth analysis to understand why our method increases parsability.