Yoshimasa Tsuruoka


2021

pdf bib
Zero-pronoun Data Augmentation for Japanese-to-English Translation
Ryokan Ri | Toshiaki Nakazawa | Yoshimasa Tsuruoka
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

For Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model needs to infer and produce the corresponding pronoun in the target side of the English sentence. However, although fully resolving zero pronouns often needs discourse context, in some cases, the local context within a sentence gives clues to the inference of the zero pronoun. In this study, we propose a data augmentation method that provides additional training signals for the translation model to learn correlations between local context and zero pronouns. We show that the proposed method significantly improves the accuracy of zero pronoun translation with machine translation experiments in the conversational domain.

pdf bib
Modeling Target-side Inflection in Placeholder Translation
Ryokan Ri | Toshiaki Nakazawa | Yoshimasa Tsuruoka
Proceedings of Machine Translation Summit XVIII: Research Track

Placeholder translation systems enable the users to specify how a specific phrase is translated in the output sentence. The system is trained to output special placeholder tokens and the user-specified term is injected into the output through the context-free replacement of the placeholder token. However and this approach could result in ungrammatical sentences because it is often the case that the specified term needs to be inflected according to the context of the output and which is unknown before the translation. To address this problem and we propose a novel method of placeholder translation that can inflect specified terms according to the grammatical construction of the output sentence. We extend the seq2seq architecture with a character-level decoder that takes the lemma of a user-specified term and the words generated from the word-level decoder to output a correct inflected form of the lemma. We evaluate our approach with a Japanese-to-English translation task in the scientific writing domain and and show our model can incorporate specified terms in a correct form more successfully than other comparable models.

pdf bib
Data Augmentation with Unsupervised Machine Translation Improves the Structural Similarity of Cross-lingual Word Embeddings
Sosuke Nishikawa | Ryokan Ri | Yoshimasa Tsuruoka
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop

Unsupervised cross-lingual word embedding(CLWE) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora. This method relies on the assumption that the two embedding spaces are structurally similar, which does not necessarily hold true in general. In this paper, we argue that using a pseudo-parallel corpus generated by an unsupervised machine translation model facilitates the structural similarity of the two embedding spaces and improves the quality of CLWEs in the unsupervised mapping method. We show that our approach outperforms other alternative approaches given the same amount of data, and, through detailed analysis, we show that data augmentation with the pseudo data from unsupervised machine translation is especially effective for mapping-based CLWEs because (1) the pseudo data makes the source and target corpora (partially) parallel; (2) the pseudo data contains information on the original language that helps to learn similar embedding spaces between the source and target languages.

2020

pdf bib
Revisiting the Context Window for Cross-lingual Word Embeddings
Ryokan Ri | Yoshimasa Tsuruoka
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Existing approaches to mapping-based cross-lingual word embeddings are based on the assumption that the source and target embedding spaces are structurally similar. The structures of embedding spaces largely depend on the co-occurrence statistics of each word, which the choice of context window determines. Despite this obvious connection between the context window and mapping-based cross-lingual embeddings, their relationship has been underexplored in prior work. In this work, we provide a thorough evaluation, in various languages, domains, and tasks, of bilingual embeddings trained with different context windows. The highlight of our findings is that increasing the size of both the source and target window sizes improves the performance of bilingual lexicon induction, especially the performance on frequent nouns.

2019

pdf bib
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction
Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

A major obstacle in reinforcement learning-based sentence generation is the large action space whose size is equal to the vocabulary size of the target-side language. To improve the efficiency of reinforcement learning, we present a novel approach for reducing the action space based on dynamic vocabulary prediction. Our method first predicts a fixed-size small vocabulary for each input to generate its target sentence. The input-specific vocabularies are then used at supervised and reinforcement learning steps, and also at test time. In our experiments on six machine translation and two image captioning datasets, our method achieves faster reinforcement learning (~2.7x faster) with less GPU memory (~2.3x less) than the full-vocabulary counterpart. We also show that our method more effectively receives rewards with fewer iterations of supervised pre-training.

pdf bib
Incorporating Source-Side Phrase Structures into Neural Machine Translation
Akiko Eriguchi | Kazuma Hashimoto | Yoshimasa Tsuruoka
Computational Linguistics, Volume 45, Issue 2 - June 2019

Neural machine translation (NMT) has shown great success as a new alternative to the traditional Statistical Machine Translation model in multiple languages. Early NMT models are based on sequence-to-sequence learning that encodes a sequence of source words into a vector space and generates another sequence of target words from the vector. In those NMT models, sentences are simply treated as sequences of words without any internal structure. In this article, we focus on the role of the syntactic structure of source sentences and propose a novel end-to-end syntactic NMT model, which we call a tree-to-sequence NMT model, extending a sequence-to-sequence model with the source-side phrase structure. Our proposed model has an attention mechanism that enables the decoder to generate a translated word while softly aligning it with phrases as well as words of the source sentence. We have empirically compared the proposed model with sequence-to-sequence models in various settings on Chinese-to-Japanese and English-to-Japanese translation tasks. Our experimental results suggest that the use of syntactic structure can be beneficial when the training data set is small, but is not as effective as using a bi-directional encoder. As the size of training data set increases, the benefits of using a syntactic tree tends to diminish.

pdf bib
Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation
Go Yasui | Yoshimasa Tsuruoka | Masaaki Nagata
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Traditional model training for sentence generation employs cross-entropy loss as the loss function. While cross-entropy loss has convenient properties for supervised learning, it is unable to evaluate sentences as a whole, and lacks flexibility. We present the approach of training the generation model using the estimated semantic similarity between the output and reference sentences to alleviate the problems faced by the training with cross-entropy loss. We use the BERT-based scorer fine-tuned to the Semantic Textual Similarity (STS) task for semantic similarity estimation, and train the model with the estimated scores through reinforcement learning (RL). Our experiments show that reinforcement learning with semantic similarity reward improves the BLEU scores from the baseline LSTM NMT model.

2017

pdf bib
Learning to Parse and Translate Improves Neural Machine Translation
Akiko Eriguchi | Yoshimasa Tsuruoka | Kyunghyun Cho
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

There has been relatively little attention to incorporating linguistic prior to neural machine translation. Much of the previous work was further constrained to considering linguistic prior on the source side. In this paper, we propose a hybrid model, called NMT+RNNG, that learns to parse and translate by combining the recurrent neural network grammar into the attention-based neural machine translation. Our approach encourages the neural machine translation model to incorporate linguistic prior during training, and lets it translate on its own afterward. Extensive experiments with four language pairs show the effectiveness of the proposed NMT+RNNG.

pdf bib
Neural Machine Translation with Source-Side Latent Graph Parsing
Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

This paper presents a novel neural machine translation model which jointly learns translation and source-side latent graph representations of sentences. Unlike existing pipelined approaches using syntactic parsers, our end-to-end model learns a latent graph parser as part of the encoder of an attention-based neural machine translation model, and thus the parser is optimized according to the translation objective. In experiments, we first show that our model compares favorably with state-of-the-art sequential and pipelined syntax-based NMT models. We also show that the performance of our model can be further improved by pre-training it with a small amount of treebank annotations. Our final ensemble model significantly outperforms the previous best models on the standard English-to-Japanese translation dataset.

pdf bib
A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks
Kazuma Hashimoto | Caiming Xiong | Yoshimasa Tsuruoka | Richard Socher
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Transfer and multi-task learning have traditionally focused on either a single source-target pair or very few, similar tasks. Ideally, the linguistic levels of morphology, syntax and semantics would benefit each other by being trained in a single model. We introduce a joint many-task model together with a strategy for successively growing its depth to solve increasingly complex tasks. Higher layers include shortcut connections to lower-level task predictions to reflect linguistic hierarchies. We use a simple regularization term to allow for optimizing all model weights to improve one task’s loss without exhibiting catastrophic interference of the other tasks. Our single end-to-end model obtains state-of-the-art or competitive results on five different tasks from tagging, parsing, relatedness, and entailment tasks.

2016

pdf bib
Adaptive Joint Learning of Compositional and Non-Compositional Phrase Embeddings
Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
Tree-to-Sequence Attentional Neural Machine Translation
Akiko Eriguchi | Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf bib
A Japanese Chess Commentary Corpus
Shinsuke Mori | John Richardson | Atsushi Ushiku | Tetsuro Sasada | Hirotaka Kameko | Yoshimasa Tsuruoka
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In recent years there has been a surge of interest in the natural language prosessing related to the real world, such as symbol grounding, language generation, and nonlinguistic data search by natural language queries. In order to concentrate on language ambiguities, we propose to use a well-defined “real world,” that is game states. We built a corpus consisting of pairs of sentences and a game state. The game we focus on is shogi (Japanese chess). We collected 742,286 commentary sentences in Japanese. They are spontaneously generated contrary to natural language annotations in many image datasets provided by human workers on Amazon Mechanical Turk. We defined domain specific named entities and we segmented 2,508 sentences into words manually and annotated each word with a named entity tag. We describe a detailed definition of named entities and show some statistics of our game commentary corpus. We also show the results of the experiments of word segmentation and named entity recognition. The accuracies are as high as those on general domain texts indicating that we are ready to tackle various new problems related to the real world.

pdf bib
Domain Adaptation for Neural Networks by Parameter Augmentation
Yusuke Watanabe | Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 1st Workshop on Representation Learning for NLP

pdf bib
Domain Adaptation and Attention-Based Unknown Word Replacement in Chinese-to-Japanese Neural Machine Translation
Kazuma Hashimoto | Akiko Eriguchi | Yoshimasa Tsuruoka
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper describes our UT-KAY system that participated in the Workshop on Asian Translation 2016. Based on an Attention-based Neural Machine Translation (ANMT) model, we build our system by incorporating a domain adaptation method for multiple domains and an attention-based unknown word replacement method. In experiments, we verify that the attention-based unknown word replacement method is effective in improving translation scores in Chinese-to-Japanese machine translation. We further show results of manual analysis on the replaced unknown words.

pdf bib
Character-based Decoding in Tree-to-Sequence Attention-based Neural Machine Translation
Akiko Eriguchi | Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper reports our systems (UT-AKY) submitted in the 3rd Workshop of Asian Translation 2016 (WAT’16) and their results in the English-to-Japanese translation task. Our model is based on the tree-to-sequence Attention-based NMT (ANMT) model proposed by Eriguchi et al. (2016). We submitted two ANMT systems: one with a word-based decoder and the other with a character-based decoder. Experimenting on the English-to-Japanese translation task, we have confirmed that the character-based decoder can cover almost the full vocabulary in the target language and generate translations much faster than the word-based model.

2015

pdf bib
Can Symbol Grounding Improve Low-Level NLP? Word Segmentation as a Case Study
Hirotaka Kameko | Shinsuke Mori | Yoshimasa Tsuruoka
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
Where Was Alexander the Great in 325 BC? Toward Understanding History Text with a World Model
Yuki Murakami | Yoshimasa Tsuruoka
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

pdf bib
Learning Embeddings for Transitive Verb Disambiguation by Implicit Tensor Factorization
Kazuma Hashimoto | Yoshimasa Tsuruoka
Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality

pdf bib
Task-Oriented Learning of Word Embeddings for Semantic Relation Classification
Kazuma Hashimoto | Pontus Stenetorp | Makoto Miwa | Yoshimasa Tsuruoka
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

2014

pdf bib
Exploiting Timegraphs in Temporal Relation Classification
Natsuda Laokulrat | Makoto Miwa | Yoshimasa Tsuruoka
Proceedings of TextGraphs-9: the workshop on Graph-based Methods for Natural Language Processing

pdf bib
Jointly Learning Word Representations and Composition Functions Using Predicate-Argument Structures
Kazuma Hashimoto | Pontus Stenetorp | Makoto Miwa | Yoshimasa Tsuruoka
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

pdf bib
Simple Customization of Recursive Neural Networks for Semantic Relation Classification
Kazuma Hashimoto | Makoto Miwa | Yoshimasa Tsuruoka | Takashi Chikayama
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

pdf bib
UTTime: Temporal Relation Classification using Deep Syntactic Features
Natsuda Laokulrat | Makoto Miwa | Yoshimasa Tsuruoka | Takashi Chikayama
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

2011

pdf bib
Learning with Lookahead: Can History-Based Models Rival Globally Optimized Models?
Yoshimasa Tsuruoka | Yusuke Miyao | Jun’ichi Kazama
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

pdf bib
Extracting Bacteria Biotopes with Semi-supervised Named Entity Recognition and Coreference Resolution
Nhung T. H. Nguyen | Yoshimasa Tsuruoka
Proceedings of BioNLP Shared Task 2011 Workshop

pdf bib
SMT Helps Bitext Dependency Parsing
Wenliang Chen | Jun’ichi Kazama | Min Zhang | Yoshimasa Tsuruoka | Yujie Zhang | Yiou Wang | Kentaro Torisawa | Haizhou Li
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

pdf bib
Improving Chinese Word Segmentation and POS Tagging with Semi-supervised Methods Using Large Auto-Analyzed Data
Yiou Wang | Jun’ichi Kazama | Yoshimasa Tsuruoka | Wenliang Chen | Yujie Zhang | Kentaro Torisawa
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

pdf bib
Improving Graph-based Dependency Parsing with Decision History
Wenliang Chen | Jun’ichi Kazama | Yoshimasa Tsuruoka | Kentaro Torisawa
Coling 2010: Posters

2009

pdf bib
Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

pdf bib
Fast Full Parsing by Linear-Chain Conditional Random Fields
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)

pdf bib
A Discriminative Latent Variable Chinese Segmenter with Hybrid Word/Character Information
Xu Sun | Yaozhong Zhang | Takuya Matsuzaki | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

2008

pdf bib
Accelerating the Annotation of Sparse Named Entities by Dynamic Sentence Selection
Yoshimasa Tsuruoka | Jun’ichi Tsujii | Sophia Ananiadou
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
How to Make the Most of NE Dictionaries in Statistical NER
Yutaka Sasaki | Yoshimasa Tsuruoka | John McNaught | Sophia Ananiadou
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

pdf bib
Modeling Latent-Dynamic in Shallow Parsing: A Latent Conditional Model with Improved Inference
Xu Sun | Louis-Philippe Morency | Daisuke Okanohara | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

pdf bib
A Discriminative Candidate Generator for String Transformations
Naoaki Okazaki | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

pdf bib
Towards Data and Goal Oriented Analysis: Tool Inter-operability and Combinatorial Comparison
Yoshinobu Kano | Ngan Nguyen | Rune Sætre | Kazuhiro Yoshida | Keiichiro Fukamachi | Yusuke Miyao | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

pdf bib
Connecting Text Mining and Pathways using the PathText Resource
Rune Sætre | Brian Kemper | Kanae Oda | Naoaki Okazaki | Yukiko Matsuoka | Norihiro Kikuchi | Hiroaki Kitano | Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Many systems have been developed in the past few years to assist researchers in the discovery of knowledge published as English text, for example in the PubMed database. At the same time, higher level collective knowledge is often published using a graphical notation representing all the entities in a pathway and their interactions. We believe that these pathway visualizations could serve as an effective user interface for knowledge discovery if they can be linked to the text in publications. Since the graphical elements in a Pathway are of a very different nature than their corresponding descriptions in English text, we developed a prototype system called PathText. The goal of PathText is to serve as a bridge between these two different representations. In this paper, we first describe the overall architecture and the interfaces of the PathText system, and then provide some details about the core Text Mining components.

2007

pdf bib
An Annotation Type System for a Data-Driven NLP Pipeline
Udo Hahn | Ekaterina Buyko | Katrin Tomanek | Scott Piao | John McNaught | Yoshimasa Tsuruoka | Sophia Ananiadou
Proceedings of the Linguistic Annotation Workshop

2006

pdf bib
Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition
Daisuke Okanohara | Yusuke Miyao | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
Semantic Retrieval for the Accurate Identification of Relational Concepts in Massive Textbases
Yusuke Miyao | Tomoko Ohta | Katsuya Masuda | Yoshimasa Tsuruoka | Kazuhiro Yoshida | Takashi Ninomiya | Jun’ichi Tsujii
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

pdf bib
An Intelligent Search Engine and GUI-based Efficient MEDLINE Search Tool Based on Deep Syntactic Parsing
Tomoko Ohta | Yusuke Miyao | Takashi Ninomiya | Yoshimasa Tsuruoka | Akane Yakushiji | Katsuya Masuda | Jumpei Takeuchi | Kazuhiro Yoshida | Tadayoshi Hara | Jin-Dong Kim | Yuka Tateisi | Jun’ichi Tsujii
Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions

pdf bib
Extremely Lexicalized Models for Accurate and Fast HPSG Parsing
Takashi Ninomiya | Takuya Matsuzaki | Yoshimasa Tsuruoka | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

pdf bib
Subdomain adaptation of a POS tagger with a small corpus
Yuka Tateisi | Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology

2005

pdf bib
Bidirectional Inference with the Easiest-First Strategy for Tagging Sequence Data
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

pdf bib
A Machine Learning Approach to Acronym Generation
Yoshimasa Tsuruoka | Sophia Ananiadou | Jun’ichi Tsujii
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics

pdf bib
Efficacy of Beam Thresholding, Unification Filtering and Hybrid Parsing in Probabilistic HPSG Parsing
Takashi Ninomiya | Yoshimasa Tsuruoka | Yusuke Miyao | Jun’ichi Tsujii
Proceedings of the Ninth International Workshop on Parsing Technology

pdf bib
Chunk Parsing Revisited
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the Ninth International Workshop on Parsing Technology

2003

pdf bib
Training a Naive Bayes Classifier via the EM Algorithm with a Class Distribution Constraint
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003

pdf bib
Boosting Precision and Recall of Dictionary-Based Protein Name Recognition
Yoshimasa Tsuruoka | Jun’ichi Tsujii
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine