Mamoru Komachi - ACL Anthology

Mamoru Komachi

2026

Constructing a Dataset for Hallucination Detection in Japanese Summarization with Fine-grained Faithfulness Labels
Hikari Tanaka | Atsushi Keyaki | Mamoru Komachi
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)

Large language models (LLMs) can generate fluent text, but the quality of generated content crucially depends on its consistency with the given input. This aspect is commonly referred to as faithfulness, which concerns whether the output is properly grounded in the input context. A major challenge related to faithfulness is that generated content may include information not supported by the input or may contradict it. This phenomenon is often referred to as hallucination, and increasing attention has been paid to automatic hallucination detection, which determines whether an LLM’s output is hallucinated. To evaluate the performance of hallucination detection systems, researchers use evaluation datasets with labels indicating the presence or absence of hallucinations. While such datasets have been developed for English and Chinese, Japanese evaluation resources for hallucination detection remain limited. Therefore, we constructed a Japanese evaluation dataset for hallucination detection in summarization by manually annotating sentence-level faithfulness labels in LLM-generated summaries of Japanese documents. We annotate 390 summaries (1,938 sentences) generated by three LLMs with sentence-level multi-label annotations for faithfulness with respect to the input document. The taxonomy extends a prior classification scheme and captures distinct patterns of model errors, enabling both binary hallucination detection and fine-grained error-type analysis of Japanese LLM summarization.

Statistical Semantic Change Detection via Usage Similarities
Taichi Aida | Daichi Mochihashi | Hiroya Takamura | Toshinobu Ogiso | Mamoru Komachi
The Proceedings for the 6th International Workshop on Computational Approaches to Language Change (LChange’26)

Semantic change detection comprises two subtasks: classification, which predicts whether a target word has undergone a semantic shift, and ranking, which orders words according to the degree of their semantic change. While most prior studies concentrated on the ranking subtask, the classification subtask plays an equally important role, since many practical scenarios require a yes/no decision on semantic change rather than a global ranking. In this work, we propose a novel statistical method that predicts the presence or absence of semantic change. While most existing approaches infer semantic change by comparing word embeddings across time periods or domains, our method directly models the diachronic/synchronic consistency of usage-level similarity scores. Our experiments on SemEval-2020 Task 1 and WUGS datasets demonstrate that the proposed formulation outperforms existing state-of-the-art embedding-based methods, and robustly detects semantic change across languages in both diachronic and synchronic settings.

2025

Analyzing Continuous Semantic Shifts with Diachronic Word Similarity Matrices
Hajime Kiyama | Taichi Aida | Mamoru Komachi | Toshinobu Ogiso | Hiroya Takamura | Daichi Mochihashi
Proceedings of the 31st International Conference on Computational Linguistics

The meanings and relationships of words shift over time. This phenomenon is referred to as semantic shift. Research focused on understanding how semantic shifts occur over multiple time periods is essential for gaining a detailed understanding of semantic shifts. However, detecting change points only between adjacent time periods is insufficient for analyzing detailed semantic shifts, and using BERT-based methods to examine word sense proportions incurs a high computational cost. To address these issues, we propose a simple yet intuitive framework for analyzing how semantic shifts occur over multiple time periods by utilizing similarity matrices based on word embeddings. We calculate diachronic word similarity matrices using fast and lightweight word embeddings across arbitrary time periods, enabling deeper analysis of continuous semantic shifts. Additionally, by clustering the resulting similarity matrices, we can categorize words that exhibit similar semantic shift behavior in an unsupervised manner.
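
As a rough illustration of the similarity-matrix idea in this abstract, the sketch below builds a period-by-period cosine similarity matrix for a single word. It assumes one embedding vector per time period, already mapped into a shared space. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def diachronic_similarity_matrix(vectors_by_period):
    """Return an n x n matrix whose (i, j) entry compares period i with period j.

    Assumption: vectors_by_period holds one embedding of the same word per
    time period, all in a shared (aligned) embedding space.
    """
    n = len(vectors_by_period)
    return [
        [cosine(vectors_by_period[i], vectors_by_period[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy data (hypothetical): the word keeps its meaning in periods 0 and 1,
# then shifts in period 2.
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
matrix = diachronic_similarity_matrix(toy_vectors)
```

Because every pair of periods is compared, not only adjacent ones, a block of high values followed by a drop suggests where a shift occurred. Clustering such matrices across words would be a separate step that is not shown here.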

HIT-YOU at TSAR 2025 Shared Task Leveraging Similarity-Based Few-Shot Prompting, Round-Trip Translation, and Self-Refinement for Readability-Controlled Text Simplification
Mao Shimada | Kexin Bian | Zhidong Ling | Mamoru Komachi
Proceedings of the Fourth Workshop on Text Simplification, Accessibility and Readability (TSAR 2025)

We describe our submission to the TSAR 2025 shared task on readability-controlled text simplification, which evaluates systems on their ability to adjust linguistic complexity to specified CEFR levels while preserving meaning and coherence. We explored two complementary frameworks leveraging the shared task CEFR classifier as feedback. The first is an ensemble approach generating diverse candidates using multiple LLMs under zero-shot prompting with level-specific instructions and vocabulary lists, one-shot prompting, and round-trip translation. Candidates were filtered by predicted CEFR level before an LLM judge selected the final output. The second framework is a self-refinement loop, where a single candidate is iteratively revised with classifier feedback until matching the target level or reaching a maximum number of iterations. This study is among the first to apply round-trip translation and iterative self-refinement to controlled simplification, broadening the toolkit for adapting linguistic complexity.
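
The self-refinement loop described in this abstract can be sketched as follows. This is a minimal sketch, not the submitted system: `predict_level` and `revise` are hypothetical callables standing in for the shared-task CEFR classifier and an LLM revision step, and the toy stand-ins below exist only to make the example runnable.

```python
def self_refine(text, target_level, predict_level, revise, max_iterations=5):
    """Revise a candidate until the classifier predicts the target level.

    predict_level(candidate) -> level label (hypothetical classifier wrapper)
    revise(candidate, predicted, target) -> new candidate (hypothetical LLM call)
    The loop stops at the target level or after max_iterations checks.
    """
    candidate = text
    for _ in range(max_iterations):
        predicted = predict_level(candidate)
        if predicted == target_level:
            break
        candidate = revise(candidate, predicted, target_level)
    return candidate


# Toy stand-ins (assumptions for illustration only): sentences with more
# than three words count as "B1", shorter ones as "A2"; revising drops
# the last word.
def toy_predict_level(candidate):
    return "B1" if len(candidate.split()) > 3 else "A2"


def toy_revise(candidate, predicted, target):
    return " ".join(candidate.split()[:-1])


simplified = self_refine("the cat sat on the mat", "A2", toy_predict_level, toy_revise)
capped = self_refine(
    "the cat sat on the mat", "A2", toy_predict_level, toy_revise, max_iterations=2
)
```

The iteration cap matters because the classifier may never predict the target level. In that case the loop returns the latest revision rather than running indefinitely.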

A Fair Comparison without Translationese: English vs. Target-language Instructions for Multilingual LLMs
Taisei Enomoto | Hwichan Kim | Zhousi Chen | Mamoru Komachi
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Most large language models are multilingual instruction executors. Prior studies suggested that English instructions are more effective than target-language instructions even for non-English tasks; however, these studies often use datasets and instructions translated from English, which introduce biases known as translationese, hindering an unbiased comparison. To address this issue, we conduct a fair comparison between English and target-language instructions by eliminating translationese effects. Contrary to previous studies, our experiments across several tasks reveal that the advantage of adopting English instructions is not overwhelming. Additionally, we report on the features of generated texts and the instruction-following abilities when using respective instructions.

Targeted Syntactic Evaluation for Grammatical Error Correction
Aomi Koyama | Masato Mita | Su-Youn Yoon | Yasufumi Takama | Mamoru Komachi
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Language learners encounter a wide range of grammar items across the beginner, intermediate, and advanced levels. To develop grammatical error correction (GEC) models effectively, it is crucial to identify which grammar items are easier or more challenging for models to correct. However, conventional benchmarks based on learner-produced texts are insufficient for conducting detailed evaluations of GEC model performance across a wide range of grammar items due to biases in their distribution. To address this issue, we propose a new evaluation paradigm that assesses GEC models using minimal pairs of ungrammatical and grammatical sentences for each grammar item. As the first benchmark within this paradigm, we introduce the CEFR-based Targeted Syntactic Evaluation Dataset for Grammatical Error Correction (CTSEG), which complements existing English benchmarks by enabling fine-grained analyses previously unattainable with conventional datasets. Using CTSEG, we evaluate three mainstream types of English GEC models: sequence-to-sequence models, sequence tagging models, and prompt-based models. The results indicate that while current models perform well on beginner-level grammar items, their performance deteriorates substantially for intermediate and advanced items.

2024

Large Language Models Are State-of-the-Art Evaluator for Grammatical Error Correction
Masamune Kobayashi | Masato Mita | Mamoru Komachi
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Large Language Models (LLMs) have been reported to outperform existing automatic evaluation metrics in some tasks, such as text summarization and machine translation. However, there has been a lack of research on LLMs as evaluators in grammatical error correction (GEC). In this study, we investigate the performance of LLMs in GEC evaluation by employing prompts designed to incorporate various evaluation criteria inspired by previous research. Our extensive experimental results demonstrate that GPT-4 achieved a Kendall’s rank correlation of 0.662 with human judgments, surpassing all existing methods. Furthermore, in recent GEC evaluations, we have underscored the significance of LLM scale and particularly emphasized the importance of fluency among evaluation criteria.

Token-length Bias in Minimal-pair Paradigm Datasets
Naoya Ueda | Masato Mita | Teruaki Oka | Mamoru Komachi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Minimal-pair paradigm (MPP) datasets have been used as benchmarks to evaluate the linguistic knowledge of models and provide an unsupervised method of acceptability judgment. Model performance is evaluated as the percentage of minimal pairs in the MPP dataset for which the model assigns a higher sentence log-likelihood to the acceptable sentence than to the unacceptable sentence. Each minimal pair in MPP datasets is controlled to align the number of words per sentence because the sentence length affects the sentence log-likelihood. However, aligning the number of words may be insufficient because recent language models tokenize sentences with subwords. Tokenization may cause a token length difference in minimal pairs, introducing token-length bias that skews the evaluation results. This study demonstrates that MPP datasets suffer from token-length bias and fail to evaluate the linguistic knowledge of a language model correctly. The results showed that sentences with a shorter token length are likely to be assigned a higher log-likelihood regardless of their acceptability, which becomes problematic when comparing models with different tokenizers. To address this issue, we propose a debiased minimal pair generation method, allowing MPP datasets to measure language ability correctly and provide comparable results for all models.
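
The evaluation rule stated in this abstract, together with a simple check for token-length mismatch, can be sketched as below. The function names and all numbers are hypothetical toy values, not results from the paper, and the debiased generation method itself is not reproduced here.

```python
def mpp_accuracy(loglik_pairs):
    """Fraction of pairs where the acceptable sentence scores higher.

    loglik_pairs: list of (loglik_acceptable, loglik_unacceptable) tuples.
    """
    if not loglik_pairs:
        raise ValueError("at least one minimal pair is required")
    correct = sum(1 for good, bad in loglik_pairs if good > bad)
    return correct / len(loglik_pairs)


def token_length_mismatch_rate(length_pairs):
    """Fraction of pairs whose two sentences differ in subword token count.

    length_pairs: list of (tokens_acceptable, tokens_unacceptable) tuples,
    as produced by whichever tokenizer the evaluated model uses.
    """
    if not length_pairs:
        raise ValueError("at least one minimal pair is required")
    mismatched = sum(1 for good, bad in length_pairs if good != bad)
    return mismatched / len(length_pairs)


# Toy values (hypothetical) for four minimal pairs.
toy_logliks = [(-10.0, -12.0), (-8.5, -8.0), (-20.0, -25.0), (-15.0, -15.5)]
toy_lengths = [(5, 5), (6, 5), (7, 7), (4, 6)]

accuracy = mpp_accuracy(toy_logliks)
mismatch = token_length_mismatch_rate(toy_lengths)
```

A high mismatch rate under a given tokenizer signals that the accuracy figure may partly reflect token length rather than linguistic knowledge, which is the concern the abstract raises.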

Pruning Multilingual Large Language Models for Multilingual Inference
Hwichan Kim | Jun Suzuki | Tosho Hirasawa | Mamoru Komachi
Findings of the Association for Computational Linguistics: EMNLP 2024

Multilingual large language models (MLLMs), trained on multilingual balanced data, demonstrate better zero-shot learning performance in non-English languages compared to large language models trained on English-dominant data. However, the disparity in performance between English and non-English languages remains a challenge yet to be fully addressed. This study introduces a promising direction for enhancing non-English performance through a specialized pruning approach. Specifically, we prune MLLMs using bilingual sentence pairs from English and other languages and empirically demonstrate that this pruning strategy can enhance the MLLMs’ performance in non-English languages.

Revisiting Meta-evaluation for Grammatical Error Correction
Masamune Kobayashi | Masato Mita | Mamoru Komachi
Transactions of the Association for Computational Linguistics, Volume 12

Metrics are the foundation for automatic evaluation in grammatical error correction (GEC), with the evaluation of the metrics themselves (meta-evaluation) relying on their correlation with human judgments. However, conventional meta-evaluations in English GEC encounter several challenges, including biases caused by inconsistencies in evaluation granularity and an outdated setup using classical systems. These problems can lead to misinterpretation of metrics and potentially hinder the applicability of GEC techniques. To address these issues, this paper proposes SEEDA, a new dataset for GEC meta-evaluation. SEEDA consists of corrections with human ratings along two different granularities: edit-based and sentence-based, covering 12 state-of-the-art systems including large language models, and two human corrections with different focuses. The results of improved correlations by aligning the granularity in the sentence-level meta-evaluation suggest that edit-based metrics may have been underestimated in existing studies. Furthermore, correlations of most metrics decrease when changing from classical to neural systems, indicating that traditional metrics are relatively poor at evaluating fluently corrected sentences with many edits.

A Document-Level Text Simplification Dataset for Japanese
Yoshinari Nagai | Teruaki Oka | Mamoru Komachi
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Document-level text simplification, a task that combines single-document summarization and intra-sentence simplification, has garnered significant attention. However, studies have primarily focused on languages such as English and German, leaving Japanese and similar languages underexplored because of a scarcity of linguistic resources. In this study, we devised JADOS, the first Japanese document-level text simplification dataset based on newspaper articles and Wikipedia. Our dataset focuses on simplification, to enhance readability by reducing the number of sentences and tokens in a document. We conducted investigations using our dataset. Firstly, we analyzed the characteristics of Japanese simplification by comparing it across different domains and with English counterparts. Moreover, we experimentally evaluated the performances of text summarization methods, transformer-based text simplification models, and large language models. In terms of D-SARI scores, the transformer-based models performed best across all domains. Finally, we manually evaluated several model outputs and target articles, demonstrating the need for document-level text simplification models in Japanese.

DejaVu: Disambiguation evaluation dataset for English-JApanese machine translation on VisUal information
Ayako Sato | Tosho Hirasawa | Hwichan Kim | Zhousi Chen | Teruaki Oka | Masato Mita | Mamoru Komachi
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

TMU-HIT’s Submission for the WMT24 Quality Estimation Shared Task: Is GPT-4 a Good Evaluator for Machine Translation?
Ayako Sato | Kyotaro Nakajima | Hwichan Kim | Zhousi Chen | Mamoru Komachi
Proceedings of the Ninth Conference on Machine Translation

In machine translation quality estimation (QE), translation quality is evaluated automatically without the need for reference translations. This paper describes our contribution to the sentence-level subtask of Task 1 at the Ninth Machine Translation Conference (WMT24), which predicts quality scores for neural MT outputs without reference translations. We fine-tune GPT-4o mini, a large-scale language model (LLM), with limited data for QE. We report results for the direct assessment (DA) method for four language pairs: English-Gujarati (En-Gu), English-Hindi (En-Hi), English-Tamil (En-Ta), and English-Telugu (En-Te). Experiments under zero-shot, few-shot prompting, and fine-tuning settings revealed significantly low performance in the zero-shot setting, while fine-tuning achieved accuracy comparable to last year’s best scores. Our system demonstrated the effectiveness of this approach in low-resource language QE, securing 1st place in both En-Gu and En-Hi, and 4th place in En-Ta and En-Te.

TMU-HIT at MLSP 2024: How Well Can GPT-4 Tackle Multilingual Lexical Simplification?
Taisei Enomoto | Hwichan Kim | Tosho Hirasawa | Yoshinari Nagai | Ayako Sato | Kyotaro Nakajima | Mamoru Komachi
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Lexical simplification (LS) is a process of replacing complex words with simpler alternatives to help readers understand sentences seamlessly. This process is divided into two primary subtasks: assessing word complexities and replacing high-complexity words with simpler alternatives. Employing task-specific supervised data to train models is a prevalent strategy for addressing these subtasks. However, such an approach cannot be employed for low-resource languages. Therefore, this paper introduces a multilingual LS pipeline system that does not rely on supervised data. Specifically, we have developed systems based on GPT-4 for each subtask. Our systems demonstrated top-class performance on both tasks in many languages. The results indicate that GPT-4 can effectively assess lexical complexity and simplify complex words in a multilingual context with high quality.

A Survey for LLM Tuning Methods: Classifying Approaches Based on Model Internal Accessibility
Kyotaro Nakajima | Hwichan Kim | Tosho Hirasawa | Taisei Enomoto | Zhousi Chen | Mamoru Komachi
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

2023

TMU Feedback Comment Generation System Using Pretrained Sequence-to-Sequence Language Models
Naoya Ueda | Mamoru Komachi
Proceedings of the 16th International Natural Language Generation Conference: Generation Challenges

In this paper, we introduce our Tokyo Metropolitan University Feedback Comment Generation system submitted to the feedback comment generation task for INLG 2023 Generation Challenge. In this task, a source sentence and offset range of preposition uses are given as the input. Then, a system generates hints or explanatory notes about preposition uses as the output. To tackle this generation task, we finetuned pretrained sequence-to-sequence language models. The models using BART and T5 showed significant improvement in BLEU score, demonstrating the effectiveness of the pretrained sequence-to-sequence language models in this task. We found that using part-of-speech tag information as an auxiliary input improves the generation quality of feedback comments. Furthermore, we adopt a simple postprocessing method that can enhance the reliability of the generation. As a result, our system achieved the F1 score of 47.4 points in BLEU-based evaluation and 60.9 points in manual evaluation, which ranked second and third on the leaderboard.

Visual Prediction Improves Zero-Shot Cross-Modal Machine Translation
Tosho Hirasawa | Emanuele Bugliarello | Desmond Elliott | Mamoru Komachi
Proceedings of the Eighth Conference on Machine Translation

Multimodal machine translation (MMT) systems have been successfully developed in recent years for a few language pairs. However, training such models usually requires tuples of a source language text, target language text, and images. Obtaining these data involves expensive human annotations, making it difficult to develop models for unseen text-only language pairs. In this work, we propose the task of zero-shot cross-modal machine translation aiming to transfer multimodal knowledge from an existing multimodal parallel corpus into a new translation direction. We also introduce a novel MMT model with a visual prediction network to learn visual features grounded on multimodal parallel data and provide pseudo-features for text-only language pairs. With this training paradigm, our MMT model outperforms its text-only counterpart. In our extensive analyses, we show that (i) the selection of visual features is important, and (ii) training on image-aware translations and being grounded on a similar language pair are mandatory.

Query Generation Using GPT-3 for CLIP-Based Word Sense Disambiguation for Image Retrieval
Xiaomeng Pan | Zhousi Chen | Mamoru Komachi
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

In this study, we propose using the GPT-3 as a query generator for the backend of CLIP as an implicit word sense disambiguation (WSD) component for the SemEval 2023 shared task Visual Word Sense Disambiguation (VWSD). We confirmed previous findings — human-like prompts adapted for WSD with quotes benefit both CLIP and GPT-3, whereas plain phrases or poorly templated prompts give the worst results.

Construction of Evaluation Dataset for Japanese Lexical Semantic Change Detection
Zhidong Ling | Taichi Aida | Teruaki Oka | Mamoru Komachi
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?
Hiroto Tamura | Tosho Hirasawa | Hwichan Kim | Mamoru Komachi
Findings of the Association for Computational Linguistics: EACL 2023

Pre-training masked language models (MLMs) with artificial data has been proven beneficial for several natural language processing tasks such as natural language understanding and summarization; however, it has been less explored for neural machine translation (NMT). A previous study revealed the benefit of transfer learning for NMT in a limited setup, which differs from MLM. In this study, we prepared two kinds of artificial data and compared the translation performance of NMT when pre-trained with MLM. In addition to the random sequences, we created artificial data mimicking token frequency information from the real world. Our results showed that pre-training the models with artificial data by MLM improves translation performance in low-resource situations. Additionally, we found that pre-training on artificial data created considering token frequency information facilitates improved performance.

Simultaneous Domain Adaptation of Tokenization and Machine Translation
Taisei Enomoto | Tosho Hirasawa | Hwichan Kim | Teruaki Oka | Mamoru Komachi
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

Discontinuous Combinatory Constituency Parsing
Zhousi Chen | Mamoru Komachi
Transactions of the Association for Computational Linguistics, Volume 11

We extend a pair of continuous combinator-based constituency parsers (one binary and one multi-branching) into a discontinuous pair. Our parsers iteratively compose constituent vectors from word embeddings without any grammar constraints. Their empirical complexities are subquadratic. Our extension includes 1) a swap action for the orientation-based binary model and 2) biaffine attention for the chunker-based multi-branching model. In tests conducted with the Discontinuous Penn Treebank and TIGER Treebank, we achieved state-of-the-art discontinuous accuracy with a significant speed advantage.

Cloze Quality Estimation for Language Assessment
Zizheng Zhang | Masato Mita | Mamoru Komachi
Findings of the Association for Computational Linguistics: EACL 2023

Cloze tests play an essential role in language assessment and help language learners improve their skills. In this paper, we propose a novel task called Cloze Quality Estimation (CQE), a zero-shot task of evaluating whether a cloze test is of sufficiently high quality for language assessment based on two important factors: reliability and validity. We have taken the first step by creating a new dataset named CELA for the CQE task, which includes English cloze tests and corresponding quality evaluations annotated by native English speakers, comprising 2,597 and 1,730 instances for reliability and validity, respectively. We have tested baseline evaluation methods on the dataset, showing that our method could contribute to the CQE task, but the task is still challenging.

Enhancing Few-shot Cross-lingual Transfer with Target Language Peculiar Examples
Hwichan Kim | Mamoru Komachi
Findings of the Association for Computational Linguistics: ACL 2023

Few-shot cross-lingual transfer, fine-tuning a Multilingual Masked Language Model (MMLM) with source language labeled data and a small amount of target language labeled data, provides excellent performance in the target language. However, if no labeled data in the target language are available, they need to be created through human annotations. In this study, we devise a metric to select annotation candidates from an unlabeled data pool that efficiently enhance accuracy for few-shot cross-lingual transfer. It is known that training a model with hard examples is important to improve the model’s performance. Therefore, we first identify examples that the MMLM cannot solve in a zero-shot cross-lingual transfer setting and demonstrate that it is hard to predict peculiar examples in the target language, i.e., the examples distant from the source language examples in the cross-lingual semantic space of the MMLM. We then choose high peculiarity examples as annotation candidates and perform few-shot cross-lingual transfer. In comprehensive experiments with 20 languages and 6 tasks, we demonstrate that the high peculiarity examples improve the target language accuracy compared to other candidate selection methods proposed in previous studies.

ClozEx: A Task toward Generation of English Cloze Explanation
Zizheng Zhang | Masato Mita | Mamoru Komachi
Findings of the Association for Computational Linguistics: EMNLP 2023

Providing explanations for cloze questions in language assessment (LA) has been recognized as a valuable approach to enhancing the language proficiency of learners. However, there is a noticeable absence of dedicated tasks and datasets specifically designed for generating language learner explanations. In response to this gap, this paper introduces a novel task ClozEx of generating explanations for cloze questions in LA, with a particular focus on English as a Second Language (ESL) learners. To support this task, we present a meticulously curated dataset comprising cloze questions paired with corresponding explanations. This dataset aims to assess language proficiency and facilitates language learning by offering informative and accurate explanations. To tackle the task, we fine-tuned various baseline models with our training data, including encoder-decoder and decoder-only architectures. We also explored whether large language models (LLMs) are able to generate good explanations without fine-tuning, just using pre-defined prompts. The evaluation results demonstrate that encoder-decoder models have the potential to deliver fluent and valid explanations when trained on our dataset.

2022

Towards Automatic Generation of Messages Countering Online Hate Speech and Microaggressions
Mana Ashida | Mamoru Komachi
Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH)

With the widespread use of social media, online hate is increasing, and microaggressions are receiving attention. We explore the potential for using pretrained language models to automatically generate messages that combat the associated offensive texts. Specifically, we focus on using prompting to steer model generation as it requires less data and computation than fine-tuning. We also propose three human evaluation perspectives: offensiveness, stance, and informativeness. After obtaining 306 counterspeech and 42 microintervention messages generated by GPT-2, GPT-3, and GPT-Neo, we conducted a human evaluation using Amazon Mechanical Turk. The results indicate the potential of using prompting in the proposed generation task. All the generated texts along with the annotation are published to encourage future research on countering hate and microaggressions online.

Infinite SCAN: An Infinite Model of Diachronic Semantic Change
Seiichi Inoue | Mamoru Komachi | Toshinobu Ogiso | Hiroya Takamura | Daichi Mochihashi
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

In this study, we propose a Bayesian model that can jointly estimate the number of senses of words and their changes through time. The model combines a dynamic topic model on Gaussian Markov random fields with a logistic stick-breaking process that realizes a Dirichlet process. In the experiments, we evaluated the proposed model in terms of interpretability, accuracy in estimating the number of senses, and tracking their changes using both artificial data and real data. We quantitatively verified that the model behaves as expected through evaluation using artificial data. Using the CCOHA corpus, we showed that our model outperforms the baseline model and investigated the semantic changes of several well-known target words.

TMU NMT System with Automatic Post-Editing by Multi-Source Levenshtein Transformer for the Restricted Translation Task of WAT 2022
Seiichiro Kondo | Mamoru Komachi
Proceedings of the 9th Workshop on Asian Translation

In this paper, we describe our TMU English–Japanese systems submitted to the restricted translation task at WAT 2022 (Nakazawa et al., 2022). In this task, we translate an input sentence with the constraint that certain words or phrases (called restricted target vocabularies (RTVs)) should be contained in the output sentence. To satisfy this constraint, we address this task using a combination of two techniques. One is lexical-constraint-aware neural machine translation (LeCA) (Chen et al., 2020), which is a method of adding RTVs at the end of input sentences. The other is multi-source Levenshtein transformer (MSLevT) (Wan et al., 2020), which is a non-autoregressive method for automatic post-editing. Our system generates translations in two steps. First, we generate the translation using LeCA. Subsequently, we filter the sentences that do not satisfy the constraints and post-edit them with MSLevT. Our experimental results reveal that 100% of the RTVs can be included in the generated sentences while maintaining the translation quality of the LeCA model on both English to Japanese (En→Ja) and Japanese to English (Ja→En) tasks. Furthermore, the method used in previous studies requires an increase in the beam size to satisfy the constraints, which is computationally expensive. In contrast, the proposed method does not require a similar increase and can generate translations faster.
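
The two-step flow described in this abstract (translate, then post-edit only the outputs that miss a constraint) can be sketched as below. This is a minimal sketch under stated assumptions: `translate` and `post_edit` are hypothetical callables standing in for the LeCA and MSLevT models, and constraint satisfaction is approximated by a plain substring check, which may differ from the authors' criterion.

```python
def satisfies_constraints(translation, rtvs):
    """True if every restricted target vocabulary item appears verbatim.

    Assumption: a substring match is an adequate test for illustration.
    """
    return all(rtv in translation for rtv in rtvs)


def translate_with_post_editing(source, rtvs, translate, post_edit):
    """Step 1: constrained translation. Step 2: post-edit only on failure.

    translate(source, rtvs) -> draft (hypothetical stand-in for LeCA)
    post_edit(source, draft, rtvs) -> revised (hypothetical stand-in for MSLevT)
    """
    draft = translate(source, rtvs)
    if satisfies_constraints(draft, rtvs):
        return draft
    return post_edit(source, draft, rtvs)


# Toy stand-ins (illustration only): the "translator" ignores the
# constraints, and the "post-editor" appends whatever is missing.
def toy_translate(source, rtvs):
    return "neural translation output"


def toy_post_edit(source, draft, rtvs):
    missing = [rtv for rtv in rtvs if rtv not in draft]
    return draft + " " + " ".join(missing)


kept = translate_with_post_editing("src", ["neural"], toy_translate, toy_post_edit)
edited = translate_with_post_editing("src", ["lexicon"], toy_translate, toy_post_edit)
```

Routing only the failing sentences to the post-editor leaves drafts that already satisfy the constraints unchanged.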

Learning How to Translate North Korean through South Korean
Hwichan Kim | Sangwhan Moon | Naoaki Okazaki | Mamoru Komachi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

South and North Korea both use the Korean language. However, Korean NLP research has focused on South Korean only, and existing NLP systems of the Korean language, such as neural machine translation (NMT) models, cannot properly handle North Korean inputs. Training a model using North Korean data is the most straightforward approach to solving this problem, but there is insufficient data to train NMT models. In this study, we create data for North Korean NMT models using a comparable corpus. First, we manually create evaluation data for automatic alignment and machine translation, and then, investigate automatic alignment methods suitable for North Korean. Finally, we show that a model trained by North Korean bilingual data without human annotation significantly boosts North Korean translation accuracy compared to existing South Korean models in zero-shot settings.

Construction of a Quality Estimation Dataset for Automatic Evaluation of Japanese Grammatical Error Correction
Daisuke Suzuki | Yujin Takahashi | Ikumi Yamashita | Taichi Aida | Tosho Hirasawa | Michitaka Nakatsuji | Masato Mita | Mamoru Komachi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In grammatical error correction (GEC), automatic evaluation is considered an important factor for research and development of GEC systems. Previous studies on automatic evaluation have shown that quality estimation models built from datasets with manual evaluation can achieve high performance in automatic evaluation of English GEC. However, quality estimation models have not yet been studied in Japanese, because there are no datasets for constructing quality estimation models. In this study, therefore, we created a quality estimation dataset with manual evaluation to build an automatic evaluation model for Japanese GEC. By building a quality estimation model using this dataset and conducting a meta-evaluation, we verified the usefulness of the quality estimation model for Japanese GEC.

Zuo Zhuan Ancient Chinese Dataset for Word Sense Disambiguation
Xiaomeng Pan | Hongfei Wang | Teruaki Oka | Mamoru Komachi
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop

Word Sense Disambiguation (WSD) is a core task in Natural Language Processing (NLP). Ancient Chinese has rarely been used in WSD tasks, however, as no public dataset for ancient Chinese WSD tasks exists. Creation of an ancient Chinese dataset is considered a significant challenge because determining the most appropriate sense in a context is difficult and time-consuming owing to the different usages in ancient and modern Chinese. To solve the problem of ancient Chinese WSD, we annotate part of the Pre-Qin (221 BC) text Zuo Zhuan using a copyright-free dictionary to create a public sense-tagged dataset. Then, we apply a simple Nearest Neighbors (k-NN) method using a pre-trained language model to the dataset. Our code and dataset will be available on GitHub.
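
A nearest-neighbour sense classifier of the kind mentioned above can be sketched in a few lines. This is a generic k-NN example, not the authors' code: it assumes that each sense-tagged occurrence has already been turned into a vector (for instance by a pre-trained language model), and the function names and cosine-similarity choice are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_sense(
    query: Sequence[float],
    labelled: List[Tuple[Sequence[float], str]],
    k: int = 1,
) -> str:
    """Predict a sense by majority vote among the k most similar labelled vectors."""
    ranked = sorted(labelled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]
```

With `k=1` the prediction is simply the sense of the single closest labelled occurrence.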

Japanese Named Entity Recognition from Automatic Speech Recognition Using Pre-trained Models
Seiichiro Kondo | Naoya Ueda | Teruaki Oka | Masakazu Sugiyama | Asahi Hentona | Mamoru Komachi
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

ProQE: Proficiency-wise Quality Estimation dataset for Grammatical Error Correction
Yujin Takahashi | Masahiro Kaneko | Masato Mita | Mamoru Komachi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This study investigates how supervised quality estimation (QE) models of grammatical error correction (GEC) are affected by the learners’ proficiency with the data. QE models for GEC evaluations in prior work have obtained a high correlation with manual evaluations. However, when functioning in a real-world context, the data used for the reported results have limitations because prior works were biased toward data by learners with relatively high proficiency levels. To address this issue, we created a QE dataset that includes multiple proficiency levels and explored the necessity of performing proficiency-wise evaluation for QE of GEC. Our experiments demonstrated that differences in evaluation dataset proficiency affect the performance of QE models, and proficiency-wise evaluation helps create more robust models.

2021

From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding
Rob van der Goot | Ibrahim Sharaf | Aizhan Imankulova | Ahmet Üstün | Marija Stepanović | Alan Ramponi | Siti Oryza Khairunnisa | Mamoru Komachi | Barbara Plank
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The lack of publicly available evaluation data for low-resource languages limits progress in Spoken Language Understanding (SLU). As key tasks like intent classification and slot filling require abundant training data, it is desirable to reuse existing data in high-resource languages to develop models for low-resource scenarios. We introduce xSID, a new benchmark for cross-lingual (x) Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect. To tackle the challenge, we propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer. We study two setups which differ by type and language coverage of the pre-trained embeddings. Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.

Can Monolingual Pre-trained Encoder-Decoder Improve NMT for Distant Language Pairs?
Hwichan Kim | Mamoru Komachi
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

Machine Translation with Pre-specified Target-side Words Using a Semi-autoregressive Model
Seiichiro Kondo | Aomi Koyama | Tomoshige Kiyuna | Tosho Hirasawa | Mamoru Komachi
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

We introduce our TMU Japanese-to-English system, which employs a semi-autoregressive model, to tackle the WAT 2021 restricted translation task. In this task, we translate an input sentence with the constraint that some words, called restricted target vocabularies (RTVs), must be contained in the output sentence. To satisfy this constraint, we use a semi-autoregressive model, namely, RecoverSAT, due to its ability (known as “forced translation”) to insert specified words into the output sentence. When using “forced translation,” the order of inserting RTVs is a critical problem. In this work, we aligned the source sentence and the corresponding RTVs using GIZA++. In our system, we obtain word alignment between a source sentence and the corresponding RTVs and then sort the RTVs in the order of their corresponding words or phrases in the source sentence. Using the model with sorted order RTVs, we succeeded in inserting all the RTVs into output sentences in more than 96% of the test sentences. Moreover, we confirmed that sorting RTVs improved the BLEU score compared with random order RTVs.
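
The RTV sorting step can be illustrated as below. This is a sketch under assumptions rather than the system itself: `aligned_position` is a hypothetical mapping from each RTV to the index of its aligned source token, supplied by hand here instead of being produced by a word aligner such as GIZA++.

```python
from typing import Dict, List


def sort_rtvs_by_source_order(rtvs: List[str], aligned_position: Dict[str, int]) -> List[str]:
    """Order RTVs by the position of their aligned source token.

    RTVs without an alignment are placed last; because Python's sort is stable,
    they keep their original relative order.
    """
    return sorted(rtvs, key=lambda rtv: aligned_position.get(rtv, float("inf")))


# Hypothetical example: "Tokyo" aligns to source token 2, "sushi" to token 4.
ordered = sort_rtvs_by_source_order(["sushi", "Tokyo"], {"Tokyo": 2, "sushi": 4})
```

The sorted list is what would then be handed to the model for insertion, in place of an arbitrary or random order.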

Sentence Concatenation Approach to Data Augmentation for Neural Machine Translation
Seiichiro Kondo | Kengo Hotate | Tosho Hirasawa | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Neural machine translation is now widely used for its high translation accuracy, but it is also known to perform poorly on long sentences. This tendency is especially prominent for low-resource languages. We assume that these problems arise because long sentences are scarce in the training data. Therefore, we propose a data augmentation method for handling long sentences. Our method is simple: we use only the given parallel corpora as training data and generate long sentences by concatenating two sentences. Our experiments confirm that the proposed data augmentation improves long-sentence translation despite its simplicity. Moreover, the proposed method improves translation quality further when combined with back-translation.
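
A minimal sketch of concatenation-based augmentation is given below. The details are assumptions, since the abstract does not specify them: pairs are sampled uniformly at random, the two sentences are joined with a single space, and the function name and `seed` parameter are hypothetical.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def augment_by_concatenation(corpus: List[Pair], n_new: int, seed: int = 0) -> List[Pair]:
    """Return the corpus plus n_new pseudo-long pairs built from two sampled pairs.

    Source sides and target sides are concatenated in the same order so that
    the new pair stays parallel. Requires at least two pairs in the corpus.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(n_new):
        (src1, tgt1), (src2, tgt2) = rng.sample(corpus, 2)
        augmented.append((f"{src1} {src2}", f"{tgt1} {tgt2}"))
    return corpus + augmented
```

No external data is needed: the longer training examples are built entirely from the existing parallel corpus.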

Comparison of Grammatical Error Correction Using Back-Translation Models
Aomi Koyama | Kengo Hotate | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Grammatical error correction (GEC) suffers from a lack of sufficient parallel data. Studies on GEC have proposed several methods to generate pseudo data, which comprise pairs of grammatical and artificially produced ungrammatical sentences. Currently, a mainstream approach to generate pseudo data is back-translation (BT). Most previous studies using BT have employed the same architecture for both the GEC and BT models. However, GEC models have different correction tendencies depending on the architecture of their models. Thus, in this study, we compare the correction tendencies of GEC models trained on pseudo data generated by three BT models with different architectures, namely, Transformer, CNN, and LSTM. The results confirm that the correction tendencies for each error type are different for every BT model. In addition, we investigate the correction tendencies when using a combination of pseudo data generated by different BT models. As a result, we find that the combination of different BT models improves or interpolates the performance of each error type compared with using a single BT model with different seeds.

Neural Combinatory Constituency Parsing
Zhousi Chen | Longtu Zhang | Aizhan Imankulova | Mamoru Komachi
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Modeling Text using the Continuous Space Topic Model with Pre-Trained Word Embeddings
Seiichi Inoue | Taichi Aida | Mamoru Komachi | Manabu Asai
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Student Research Workshop

In this study, we propose a model that extends the continuous space topic model (CSTM), which flexibly controls word probability in a document, using pre-trained word embeddings. To develop the proposed model, we pre-train word embeddings, which capture the semantics of words and plug them into the CSTM. Intrinsic experimental results show that the proposed model exhibits a superior performance over the CSTM in terms of perplexity and convergence speed. Furthermore, extrinsic experimental results show that the proposed model is useful for a document classification task when compared with the baseline model. We qualitatively show that the latent coordinates obtained by training the proposed model are better than those of the baseline model.

TMEKU System for the WAT2021 Multimodal Translation Task
Yuting Zhao | Mamoru Komachi | Tomoyuki Kajiwara | Chenhui Chu
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

We introduce our TMEKU system submitted to the English-Japanese Multimodal Translation Task for WAT 2021. We participated in the Flickr30kEnt-JP task and Ambiguous MSCOCO Multimodal task under the constrained condition using only the officially provided datasets. Our proposed system employs soft alignment of word-region for multimodal neural machine translation (MNMT). The experimental results evaluated on the BLEU metric provided by the WAT 2021 evaluation site show that the TMEKU system has achieved the best performance among all the participated systems. Further analysis of the case study demonstrates that leveraging word-region alignment between the textual and visual modalities is the key to performance enhancement in our TMEKU system, which leads to better visual information use.

TMU NMT System with Japanese BART for the Patent task of WAT 2021
Hwichan Kim | Mamoru Komachi
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

In this paper, we introduce our TMU Neural Machine Translation (NMT) system submitted for the Patent task (Korean-Japanese and English-Japanese) of the 8th Workshop on Asian Translation (Nakazawa et al., 2021). Recently, several studies proposed pre-trained encoder-decoder models using monolingual data. One of the pre-trained models, BART (Lewis et al., 2020), was shown to improve translation accuracy via fine-tuning with bilingual data. However, they experimented only with Romanian→English translation using English BART. In this paper, we examine the effectiveness of Japanese BART using Japan Patent Office Corpus 2.0. Our experiments indicate that Japanese BART can also improve translation accuracy in both Korean-Japanese and English-Japanese translations.

Analyzing Semantic Changes in Japanese Words Using BERT
Kazuma Kobayashi | Taichi Aida | Mamoru Komachi
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

A Comprehensive Analysis of PMI-based Models for Measuring Semantic Differences
Taichi Aida | Mamoru Komachi | Toshinobu Ogiso | Hiroya Takamura | Daichi Mochihashi
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

2020

Cross-lingual Transfer Learning for Grammatical Error Correction
Ikumi Yamashita | Satoru Katsumata | Masahiro Kaneko | Aizhan Imankulova | Mamoru Komachi
Proceedings of the 28th International Conference on Computational Linguistics

In this study, we explore cross-lingual transfer learning in grammatical error correction (GEC) tasks. Many languages lack the resources required to train GEC models. Cross-lingual transfer learning from high-resource languages (the source models) is effective for training models of low-resource languages (the target models) for various tasks. However, in GEC tasks, the possibility of transferring grammatical knowledge (e.g., grammatical functions) across languages is not evident. Therefore, we investigate cross-lingual transfer learning methods for GEC. Our results demonstrate that transfer learning from other languages can improve the accuracy of GEC. We also demonstrate that proximity to source languages has a significant impact on the accuracy of correcting certain types of errors.

Zero-shot North Korean to English Neural Machine Translation by Character Tokenization and Phoneme Decomposition
Hwichan Kim | Tosho Hirasawa | Mamoru Komachi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The primary limitation of North Korean to English translation is the lack of a parallel corpus; therefore, high translation accuracy cannot be achieved. To address this problem, we propose a zero-shot approach using South Korean data, which are remarkably similar to North Korean data. We train a neural machine translation model after tokenizing a South Korean text at the character level and decomposing characters into phonemes. We demonstrate that our method can effectively learn North Korean to English translation and improve the BLEU scores by +1.01 points in comparison with the baseline.
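
Character tokenization and phoneme decomposition of Korean text can be illustrated with the standard library alone. This sketch is not the authors' preprocessing: it assumes that Unicode NFD normalization (which splits a precomposed Hangul syllable into its conjoining jamo) is an acceptable stand-in for phoneme decomposition, and the `▁` space marker and function names are hypothetical.

```python
import unicodedata
from typing import List


def to_characters(text: str) -> List[str]:
    """Character-level tokenization; spaces become a visible boundary marker."""
    return ["▁" if ch == " " else ch for ch in text]


def to_phonemes(text: str) -> List[str]:
    """Split precomposed Hangul syllables into conjoining jamo via Unicode NFD."""
    return to_characters(unicodedata.normalize("NFD", text))


# The syllable U+D55C decomposes into U+1112, U+1161, U+11AB.
jamo = to_phonemes("\ud55c")
```

Decomposing to jamo shrinks the vocabulary to a few dozen symbols, which is one way in which spelling differences between the two varieties can end up sharing most of their subunits.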

Towards a Standardized Dataset on Indonesian Named Entity Recognition
Siti Oryza Khairunnisa | Aizhan Imankulova | Mamoru Komachi
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop

In recent years, named entity recognition (NER) tasks in the Indonesian language have undergone extensive development. There are only a few corpora for Indonesian NER; hence, recent Indonesian NER studies have used diverse datasets. Although an open dataset is available, it includes only approximately 2,000 sentences and contains inconsistent annotations, thereby preventing accurate training of NER models without reliance on pre-trained models. Therefore, we re-annotated the dataset and compared the two annotations’ performance using the Bidirectional Long Short-Term Memory and Conditional Random Field (BiLSTM-CRF) approach. Fixing the annotation yielded a more consistent result for the organization tag and improved the prediction score by a large margin. Moreover, to take full advantage of pre-trained models, we compared different feature embeddings to determine their impact on the NER task for the Indonesian language.

Automated Essay Scoring System for Nonnative Japanese Learners
Reo Hirao | Mio Arai | Hiroki Shimanaka | Satoru Katsumata | Mamoru Komachi
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this study, we created an automated essay scoring (AES) system for nonnative Japanese learners using an essay dataset with annotations for a holistic score and multiple trait scores, including content, organization, and language scores. In particular, we developed AES systems using two different approaches: a feature-based approach and a neural-network-based approach. In the former approach, we used Japanese-specific linguistic features, including character-type features such as “kanji” and “hiragana.” In the latter approach, we used two models: a long short-term memory (LSTM) model (Hochreiter and Schmidhuber, 1997) and a bidirectional encoder representations from transformers (BERT) model (Devlin et al., 2019), which achieved the highest accuracy in various natural language processing tasks in 2018. Overall, the BERT model achieved the best root mean squared error and quadratic weighted kappa scores. In addition, we analyzed the robustness of the outputs of the BERT model. We have released and shared this system to facilitate further research on AES for Japanese as a second language learners.
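
The two evaluation measures named above can be computed as in the following sketch. It is a generic implementation of root mean squared error and quadratic weighted kappa, not the authors' evaluation script, and it assumes that scores are integer labels in the range 0 to `n_labels - 1` and that the raters do not both assign a single identical label to every essay.

```python
import math
from typing import List


def rmse(gold: List[float], pred: List[float]) -> float:
    """Root mean squared error between gold and predicted scores."""
    return math.sqrt(sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold))


def quadratic_weighted_kappa(gold: List[int], pred: List[int], n_labels: int) -> float:
    """Quadratic weighted kappa for integer labels in [0, n_labels - 1]."""
    # Observed confusion matrix: rows are gold labels, columns are predictions.
    observed = [[0] * n_labels for _ in range(n_labels)]
    for g, p in zip(gold, pred):
        observed[g][p] += 1
    n = len(gold)
    gold_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_labels)) for j in range(n_labels)]
    num = den = 0.0
    for i in range(n_labels):
        for j in range(n_labels):
            weight = (i - j) ** 2 / (n_labels - 1) ** 2
            num += weight * observed[i][j]
            # Expected count under independence of the two raters.
            den += weight * gold_hist[i] * pred_hist[j] / n
    return 1.0 - num / den
```

Lower RMSE is better, while kappa reaches 1 for perfect agreement and is 0 when agreement is no better than chance.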

Grammatical Error Correction Using Pseudo Learner Corpus Considering Learner’s Error Tendency
Yujin Takahashi | Satoru Katsumata | Mamoru Komachi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Recently, several studies have focused on improving the performance of grammatical error correction (GEC) tasks using pseudo data. However, a large amount of pseudo data is required to train an accurate GEC model. To address the limitations of language and computational resources, we assume that introducing pseudo errors into sentences similar to those written by the language learners is more efficient than incorporating random pseudo errors into monolingual data. In this regard, we study the effect of pseudo data on GEC task performance using two approaches. First, we extract sentences that are similar to the learners’ sentences from monolingual data. Second, we generate realistic pseudo errors by considering error types that learners often make. Based on our comparative results, we observe that F0.5 scores for the Russian GEC task are significantly improved.

SOME: Reference-less Sub-Metrics Optimized for Manual Evaluations of Grammatical Error Correction
Ryoma Yoshimura | Masahiro Kaneko | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the 28th International Conference on Computational Linguistics

We propose a reference-less metric trained on manual evaluations of system outputs for grammatical error correction (GEC). Previous studies have shown that reference-less metrics are promising; however, existing metrics are not optimized for manual evaluations of the system outputs because no dataset of the system output exists with manual evaluation. This study manually evaluates outputs of GEC systems to optimize the metrics. Experimental results show that the proposed metric improves correlation with the manual evaluation in both system- and sentence-level meta-evaluation. Our dataset and metric will be made publicly available.

TMUOU Submission for WMT20 Quality Estimation Shared Task
Akifumi Nakamachi | Hiroki Shimanaka | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the Fifth Conference on Machine Translation

We introduce the TMUOU submission for the WMT20 Quality Estimation Shared Task 1: Sentence-Level Direct Assessment. Our system is an ensemble model of four regression models based on XLM-RoBERTa with language tags. We ranked 4th in Pearson and 2nd in MAE and RMSE on a multilingual track.

TMU Japanese-English Multimodal Machine Translation System for WAT 2020
Hiroto Tamura | Tosho Hirasawa | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 7th Workshop on Asian Translation

We introduce our TMU system submitted to the Japanese<->English Multimodal Task (constrained) for WAT 2020 (Nakazawa et al., 2020). This task aims to improve translation performance with the help of another modality (images) associated with the input sentences. In a multimodal translation task, the dataset is, by its nature, a low-resource one. Our method used herein augments the data by generating noisy translations and adding noise to existing training images. Subsequently, we pretrain a translation model on the augmented noisy data, and then fine-tune it on the clean data. We also examine the probabilistic dropping of either the textual or visual context vector in the decoder. This aims to regularize the network to make use of both features while training. The experimental results indicate that translation performance can be improved using our method of textual data augmentation with noising on the target side and probabilistic dropping of either context vector.

Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model
Satoru Katsumata | Mamoru Komachi
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

Studies on grammatical error correction (GEC) have reported on the effectiveness of pretraining a Seq2Seq model with a large amount of pseudodata. However, this approach requires time-consuming pretraining of GEC because of the size of the pseudodata. In this study, we explored the utility of bidirectional and auto-regressive transformers (BART) as a generic pretrained encoder-decoder model for GEC. With the use of this generic pretrained model for GEC, the time-consuming pretraining can be eliminated. We find that monolingual and multilingual BART models achieve high performance in GEC, with one of the results being comparable to the current strong results in English GEC.

Neural Machine Translation from Historical Japanese to Contemporary Japanese Using Diachronically Domain-Adapted Word Embeddings
Masashi Takaku | Tosho Hirasawa | Mamoru Komachi | Kanako Komiya
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

Translation of New Named Entities from English to Chinese
Zizheng Zhang | Tosho Hirasawa | Wei Houjing | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 7th Workshop on Asian Translation

New things are being created and new words are constantly being added to languages worldwide. However, it is not practical to translate them all manually into a new foreign language. When translating from an alphabetic language such as English to Chinese, appropriate Chinese characters must be assigned, which is particularly costly compared to other language pairs. Therefore, we propose a task of generating and evaluating new translations from English to Chinese focusing on named entities. We defined three criteria for human evaluation—fluency, adequacy of pronunciation, and adequacy of meaning—and constructed evaluation data based on these definitions. In addition, we built a baseline system and analyzed the output of the system.

TMU-NLP System Using BERT-based Pre-trained Model to the NLP-TEA CGED Shared Task 2020
Hongfei Wang | Mamoru Komachi
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

In this paper, we introduce our system for NLPTEA 2020 shared task of Chinese Grammatical Error Diagnosis (CGED). In recent years, pre-trained models have been extensively studied, and several downstream tasks have benefited from their utilization. In this study, we treat the grammar error diagnosis (GED) task as a grammatical error correction (GEC) problem and propose a method that incorporates a pre-trained model into an encoder-decoder model to solve this problem.

Towards Multimodal Simultaneous Neural Machine Translation
Aizhan Imankulova | Masahiro Kaneko | Tosho Hirasawa | Mamoru Komachi
Proceedings of the Fifth Conference on Machine Translation

Simultaneous translation involves translating a sentence before the speaker’s utterance is completed in order to realize real-time understanding in multiple languages. This task is significantly more challenging than the general full sentence translation because of the shortage of input information during decoding. To alleviate this shortage, we propose multimodal simultaneous neural machine translation (MSNMT), which leverages visual information as an additional modality. Our experiments with the Multi30k dataset showed that MSNMT significantly outperforms its text-only counterpart in more timely translation situations with low latency. Furthermore, we verified the importance of visual information during decoding by performing an adversarial evaluation of MSNMT, where we studied how models behaved with incongruent input modality and analyzed the effect of different word order between source and target languages.

Construction of an Evaluation Corpus for Grammatical Error Correction for Learners of Japanese as a Second Language
Aomi Koyama | Tomoshige Kiyuna | Kenji Kobayashi | Mio Arai | Mamoru Komachi
Proceedings of the Twelfth Language Resources and Evaluation Conference

The NAIST Lang-8 Learner Corpora (Lang-8 corpus) is one of the largest second-language learner corpora. The Lang-8 corpus is suitable as a training dataset for machine translation-based grammatical error correction systems. However, it is not suitable as an evaluation dataset because the corrected sentences sometimes include inappropriate sentences. Therefore, we created and released an evaluation corpus for correcting grammatical errors made by learners of Japanese as a Second Language (JSL). As our corpus has less noise and its annotation scheme reflects the characteristics of the dataset, it is ideal as an evaluation corpus for correcting grammatical errors in sentences written by JSL learners. In addition, we applied neural machine translation (NMT) and statistical machine translation (SMT) techniques to correct the grammar of the JSL learners’ sentences and evaluated their results using our corpus. We also compared the performance of the NMT system with that of the SMT system.

Generating Diverse Corrections with Local Beam Search for Grammatical Error Correction
Kengo Hotate | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 28th International Conference on Computational Linguistics

In this study, we propose a beam search method to obtain diverse outputs in a local sequence transduction task where most of the tokens in the source and target sentences overlap, such as in grammatical error correction (GEC). In GEC, it is advisable to rewrite only the local sequences that must be rewritten while leaving the correct sequences unchanged. However, existing methods of acquiring various outputs focus on revising all tokens of a sentence. Therefore, existing methods may either generate ungrammatical sentences because they force the entire sentence to be changed or produce non-diversified sentences by weakening the constraints to avoid generating ungrammatical sentences. Considering these issues, we propose a method that does not rewrite all the tokens in a text, but only rewrites those parts that need to be diversely corrected. Our beam search method adjusts the search token in the beam according to the probability that the prediction is copied from the source sentence. The experimental results show that our proposed method generates more diverse corrections than existing methods without losing accuracy in the GEC task.

Chinese Grammatical Correction Using BERT-based Pre-trained Model
Hongfei Wang | Michiki Kurosawa | Satoru Katsumata | Mamoru Komachi
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing

In recent years, pre-trained models have been extensively studied, and several downstream tasks have benefited from their utilization. In this study, we verify the effectiveness of two methods that incorporate a pre-trained model into an encoder-decoder model on Chinese grammatical error correction tasks. We also analyze the error type and conclude that sentence-level errors are yet to be addressed.

Double Attention-based Multimodal Neural Machine Translation with Semantic Image Regions
Yuting Zhao | Mamoru Komachi | Tomoyuki Kajiwara | Chenhui Chu
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation

Existing studies on multimodal neural machine translation (MNMT) have mainly focused on the effect of combining visual and textual modalities to improve translations. However, it has been suggested that the visual modality is only marginally beneficial. Conventional visual attention mechanisms have been used to select the visual features from equally-sized grids generated by convolutional neural networks (CNNs), and may have had modest effects on aligning the visual concepts associated with textual objects, because the grid visual features do not capture semantic information. In contrast, we propose the application of semantic image regions for MNMT by integrating visual and textual features using two individual attention mechanisms (double attention). We conducted experiments on the Multi30k dataset and achieved an improvement of 0.5 and 0.9 BLEU points for English-German and English-French translation tasks, compared with the MNMT with grid visual features. We also demonstrated concrete improvements in translation performance that result from using semantic image regions.

English-to-Japanese Diverse Translation by Combining Forward and Backward Outputs
Masahiro Kaneko | Aizhan Imankulova | Tosho Hirasawa | Mamoru Komachi
Proceedings of the Fourth Workshop on Neural Generation and Translation

We introduce our TMU system submitted to the English-to-Japanese (En→Ja) track of the Simultaneous Translation And Paraphrase for Language Education (STAPLE) shared task at the 4th Workshop on Neural Generation and Translation (WNGT2020). In most cases, machine translation systems generate a single output from the input sentence. However, to assist language learners with better and more diverse feedback, it is helpful to create a machine translation system that can produce diverse translations of each input sentence. Creating such systems would normally require complex modifications to a model to ensure the diversity of outputs. In this paper, we investigated whether it is possible to create such a system in a simple way and whether it can produce the desired diverse outputs. In particular, we combined the outputs from forward and backward neural machine translation (NMT) models. Our system achieved third place in the En→Ja track, despite adopting only a simple approach.

Non-Autoregressive Grammatical Error Correction Toward a Writing Support System
Hiroki Homma | Mamoru Komachi
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

There are several problems in applying grammatical error correction (GEC) to a writing support system. One of them is the handling of sentences in the middle of the input. To date, the performance of GEC on incomplete sentences is not well understood. Hence, we analyze the performance of each model for incomplete sentences. Another problem is the correction speed. When the speed is slow, the usability of the system is limited, and the user experience is degraded. Therefore, in this study, we also focus on the non-autoregressive (NAR) model, which is a widely studied fast decoding method. We perform GEC in Japanese with traditional autoregressive and recent NAR models and analyze their accuracy and speed.

Korean-to-Japanese Neural Machine Translation System using Hanja Information
Hwichan Kim | Tosho Hirasawa | Mamoru Komachi
Proceedings of the 7th Workshop on Asian Translation

In this paper, we describe our TMU neural machine translation (NMT) system submitted for the Patent task (Korean→Japanese) of the 7th Workshop on Asian Translation (WAT 2020, Nakazawa et al., 2020). We propose a novel method to train a Korean-to-Japanese translation model. Specifically, we focus on the vocabulary overlap of Korean Hanja words and Japanese Kanji words, and propose strategies to leverage Hanja information. Our experiment shows that Hanja information is effective within a specific domain, leading to an improvement in the BLEU scores by +1.09 points compared to the baseline.

2019

Multi-Task Learning for Japanese Predicate Argument Structure Analysis
Hikaru Omori | Mamoru Komachi
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

An event-noun is a noun that has an argument structure similar to a predicate. Recent works, including those considered state-of-the-art, ignore event-nouns or build a single model for solving both Japanese predicate argument structure analysis (PASA) and event-noun argument structure analysis (ENASA). However, because there are interactions between predicates and event-nouns, it is not sufficient to target only predicates. To address this problem, we present a multi-task learning method for PASA and ENASA. Our multi-task models improved the performance of both tasks compared to a single-task model by sharing knowledge from each task. Moreover, in PASA, our models achieved state-of-the-art results in overall F1 scores on the NAIST Text Corpus. In addition, this is the first work to employ neural networks in ENASA.

TMU Transformer System Using BERT for Re-ranking at BEA 2019 Grammatical Error Correction on Restricted Track
Masahiro Kaneko | Kengo Hotate | Satoru Katsumata | Mamoru Komachi
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce our system submitted to the restricted track of the BEA 2019 shared task on grammatical error correction (GEC). It is essential to select an appropriate hypothesis sentence from the candidate list generated by the GEC model. A re-ranker can evaluate the naturalness of a corrected sentence using language models trained on large corpora. On the other hand, these language models and language representations do not explicitly take into account the grammatical errors written by learners. Thus, it is not straightforward to utilize language representations trained from a large corpus, such as Bidirectional Encoder Representations from Transformers (BERT), in a form suitable for the learner’s grammatical errors. Therefore, we propose to fine-tune BERT on learner corpora with grammatical errors for re-ranking. The experimental results on the W&I+LOCNESS development dataset demonstrate that re-ranking using BERT can effectively improve the correction performance.

Debiasing Word Embeddings Improves Multimodal Machine Translation
Tosho Hirasawa | Mamoru Komachi
Proceedings of Machine Translation Summit XVII: Research Track

Sakura: Large-scale Incorrect Example Retrieval System for Learners of Japanese as a Second Language
Mio Arai | Tomonori Kodaira | Mamoru Komachi
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

This study develops an incorrect example retrieval system, called Sakura, using a large-scale Lang-8 dataset for Japanese language learners. Existing example retrieval systems do not include grammatically incorrect examples or present only a few examples, if any. If a retrieval system has a wide coverage of incorrect examples along with the correct counterpart, learners can revise their composition themselves. Considering the usability of retrieving incorrect examples, our proposed system uses a large-scale corpus to expand the coverage of incorrect examples and presents correct expressions along with incorrect expressions. Our intrinsic and extrinsic evaluations indicate that our system is more useful than a previous system.

Filtering Pseudo-References by Paraphrasing for Automatic Evaluation of Machine Translation
Ryoma Yoshimura | Hiroki Shimanaka | Yukio Matsumura | Hayahide Yamagishi | Mamoru Komachi
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

In this paper, we introduce our participation in the WMT 2019 Metric Shared Task. We propose an improved version of sentence BLEU using filtered pseudo-references. We propose a method to filter pseudo-references by paraphrasing for automatic evaluation of machine translation (MT). We use the outputs of off-the-shelf MT systems as pseudo-references filtered by paraphrasing in addition to a single human reference (gold reference). We use BERT fine-tuned with a paraphrase corpus to filter pseudo-references by checking the paraphrasability with the gold reference. Our experimental results on the WMT 2016 and 2017 datasets show that our method achieved higher correlation with human evaluation than the sentence BLEU (SentBLEU) baselines with a single reference and with unfiltered pseudo-references.

Controlling Grammatical Error Correction Using Word Edit Rate
Kengo Hotate | Masahiro Kaneko | Satoru Katsumata | Mamoru Komachi
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

When professional English teachers correct grammatically erroneous sentences written by English learners, they use various methods. The correction method depends on how much correction a learner requires. In this paper, we propose a method for neural grammatical error correction (GEC) that can control the degree of correction. We show that it is possible to control the degree of GEC by using new training data annotated with word edit rate. Thereby, diverse corrected sentences are obtained from a single erroneous sentence. Moreover, compared to a GEC model that does not use information on the degree of correction, the proposed method improves correction accuracy.
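As a rough illustration of the word edit rate mentioned above, the following sketch computes a word-level Levenshtein distance between a learner sentence and its correction and divides it by the number of source words. The normalization by source length and the function names `word_edit_distance` and `word_edit_rate` are assumptions made for this example; the abstract does not state the exact definition used in the paper.

```python
def word_edit_distance(source, corrected):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(corrected) + 1))
    for i, s in enumerate(source, 1):
        curr = [i]
        for j, c in enumerate(corrected, 1):
            cost = 0 if s == c else 1
            curr.append(min(prev[j] + 1,          # delete a source word
                            curr[j - 1] + 1,      # insert a corrected word
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def word_edit_rate(source, corrected):
    """Edit distance normalized by source length (assumed normalization)."""
    if not source:
        return 0.0
    return word_edit_distance(source, corrected) / len(source)


if __name__ == "__main__":
    src = "he go to school yesterday".split()
    cor = "he went to school yesterday".split()
    print(word_edit_rate(src, cor))
```

A rate computed this way could then be bucketed into a small number of labels and attached to each training pair, which is one plausible way to annotate data with the degree of correction.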

Multimodal Machine Translation with Embedding Prediction
Tosho Hirasawa | Hayahide Yamagishi | Yukio Matsumura | Mamoru Komachi
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Multimodal machine translation is an attractive application of neural machine translation (NMT). It helps computers to deeply understand visual objects and their relations with natural languages. However, multimodal NMT systems suffer from a shortage of available training data, resulting in poor performance for translating rare words. In NMT, pretrained word embeddings have been shown to improve NMT of low-resource domains, and a search-based approach is proposed to address the rare word problem. In this study, we effectively combine these two approaches in the context of multimodal NMT and explore how we can take full advantage of pretrained word embeddings to better translate rare words. We report overall performance improvements of 1.24 METEOR and 2.49 BLEU and achieve an improvement of 7.67 F-score for rare word translation.

Japanese-Russian TMU Neural Machine Translation System using Multilingual Model for WAT 2019
Aizhan Imankulova | Masahiro Kaneko | Mamoru Komachi
Proceedings of the 6th Workshop on Asian Translation

We introduce our system submitted to the News Commentary task (Japanese<->Russian) of the 6th Workshop on Asian Translation. The goal of this shared task is to study extremely low resource situations for distant language pairs. It is known that using parallel corpora of different language pairs as training data is effective for multilingual neural machine translation models in extremely low resource scenarios. Therefore, to improve the translation quality of the Japanese<->Russian language pair, our method leverages other in-domain Japanese-English and English-Russian parallel corpora as additional training data for our multilingual NMT model.

(Almost) Unsupervised Grammatical Error Correction using Synthetic Comparable Corpus
Satoru Katsumata | Mamoru Komachi
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce unsupervised techniques based on phrase-based statistical machine translation for grammatical error correction (GEC) trained on a pseudo learner corpus created by Google Translation. We verified our GEC system through experiments on a low resource track of the shared task at BEA2019. As a result, we achieved an F0.5 score of 28.31 points with the test data.

Grammatical-Error-Aware Incorrect Example Retrieval System for Learners of Japanese as a Second Language
Mio Arai | Masahiro Kaneko | Mamoru Komachi
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

Existing example retrieval systems do not include grammatically incorrect examples or present only a few examples, if any. Even if a retrieval system has a wide coverage of incorrect examples along with the correct counterpart, learners need to know whether their query includes errors or not. Considering the usability of retrieving incorrect examples, our proposed method uses a large-scale corpus and presents correct expressions along with incorrect expressions using a grammatical error detection system so that the learner does not need to be aware of how to search for the examples. Intrinsic and extrinsic evaluations indicate that our method improves the accuracy of example sentence retrieval and the quality of learners’ writing.

2018

Graph-based Filtering of Out-of-Vocabulary Words for Encoder-Decoder Models
Satoru Katsumata | Yukio Matsumura | Hayahide Yamagishi | Mamoru Komachi
Proceedings of ACL 2018, Student Research Workshop

Encoder-decoder models typically only employ words that are frequently used in the training corpus because of the computational costs and/or to exclude noisy words. However, this vocabulary set may still include words that interfere with learning in encoder-decoder models. This paper proposes a method for selecting more suitable words for learning encoders by utilizing not only frequency, but also co-occurrence information, which we capture using the HITS algorithm. The proposed method is applied to two tasks: machine translation and grammatical error correction. For Japanese-to-English translation, this method achieved a BLEU score that was 0.56 points more than that of a baseline. It also outperformed the baseline method for English grammatical error correction, with an F-measure that was 1.48 points higher.

Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications
Yuen-Hsien Tseng | Hsin-Hsi Chen | Vincent Ng | Mamoru Komachi
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

TMU Japanese-Chinese Unsupervised NMT System for WAT 2018 Translation Task
Longtu Zhang | Yuting Zhao | Mamoru Komachi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation: 5th Workshop on Asian Translation

Neural Machine Translation of Logographic Language Using Sub-character Level Information
Longtu Zhang | Mamoru Komachi
Proceedings of the Third Conference on Machine Translation: Research Papers

Recent neural machine translation (NMT) systems have been greatly improved by encoder-decoder models with attention mechanisms and sub-word units. However, important differences between languages with logographic and alphabetic writing systems have long been overlooked. This study focuses on these differences and uses a simple approach to improve the performance of NMT systems utilizing decomposed sub-character level information for logographic languages. Our results indicate that our approach not only improves the translation capabilities of NMT systems between Chinese and English, but also further improves NMT systems between Chinese and Japanese, because it utilizes the shared information brought by similar sub-character units.

Complex Word Identification Based on Frequency in a Learner Corpus
Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce the TMU systems for the Complex Word Identification (CWI) Shared Task 2018. TMU systems use random forest classifiers and regressors whose features are the number of characters, the number of words, and the frequency of target words in various corpora. Our simple systems performed best on 5 of the 12 tracks. Our ablation analysis revealed the usefulness of a learner corpus for the CWI task.

TMU System for SLAM-2018
Masahiro Kaneko | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We introduce the TMU systems for the second language acquisition modeling shared task 2018 (Settles et al., 2018). To model learner error patterns, it is necessary to maintain a considerable amount of information regarding the type of exercises learners have been learning in the past and the manner in which they answered them. Tracking a learner’s enormous learning history and their correct and mistaken answers is essential to predict the learner’s future mistakes. Therefore, we propose a model which tracks the learner’s learning history efficiently. Our systems ranked fourth in the English and Spanish subtasks, and fifth in the French subtask.

Japanese Predicate Conjugation for Neural Machine Translation
Michiki Kurosawa | Yukio Matsumura | Hayahide Yamagishi | Mamoru Komachi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Neural machine translation (NMT) has a drawback in that it can generate only high-frequency words owing to the computational costs of the softmax function in the output layer. In Japanese-English NMT, Japanese predicate conjugation causes an increase in vocabulary size. For example, one verb can have as many as 19 surface varieties. In this research, we focus on predicate conjugation for compressing the vocabulary size in Japanese. The vocabulary list is filled with the various forms of verbs. We propose methods using predicate conjugation information without discarding linguistic information. The proposed methods can generate low-frequency words and deal with unknown words. Two methods were considered to introduce conjugation information: the first considers it as a token (conjugation token) and the second considers it as an embedded vector (conjugation feature). The results using these methods demonstrate that the vocabulary size can be compressed by approximately 86.1% (Tanaka corpus) and the NMT models can output words not in the training data set. Furthermore, BLEU scores improved by 0.91 points in Japanese-to-English translation, and 0.32 points in English-to-Japanese translation with ASPEC.

Long Short-Term Memory for Japanese Word Segmentation
Yoshiaki Kitagawa | Mamoru Komachi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

RUSE: Regressor Using Sentence Embeddings for Automatic Machine Translation Evaluation
Hiroki Shimanaka | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

We introduce the RUSE metric for the WMT18 metrics shared task. Sentence embeddings can capture global information that cannot be captured by local features based on character or word N-grams. Although training sentence embeddings using small-scale translation datasets with manual evaluation is difficult, sentence embeddings trained from large-scale data in other tasks can improve the automatic evaluation of machine translation. We use a multi-layer perceptron regressor based on three types of sentence embeddings. The experimental results on the WMT16 and WMT17 datasets show that the RUSE metric achieves state-of-the-art performance in both segment- and system-level metrics tasks with embedding features only.

Japanese Sentiment Classification using a Tree-Structured Long Short-Term Memory with Attention
Ryosuke Miyazaki | Mamoru Komachi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

Construction of a Japanese Word Similarity Dataset
Yuya Sakaizawa | Mamoru Komachi
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

TMU Japanese-English Neural Machine Translation System using Generative Adversarial Network for WAT 2018
Yukio Matsumura | Satoru Katsumata | Mamoru Komachi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation: 5th Workshop on Asian Translation: 5th Workshop on Asian Translation

Metric for Automatic Machine Translation Evaluation based on Universal Sentence Representations
Hiroki Shimanaka | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

Sentence representations can capture a wide range of information that cannot be captured by local features based on character or word N-grams. This paper examines the usefulness of universal sentence representations for evaluating the quality of machine translation. Although it is difficult to train sentence representations using small-scale translation datasets with manual evaluation, sentence representations trained from large-scale data in other tasks can improve the automatic evaluation of machine translation. Experimental results on the WMT-2016 dataset show that the proposed method achieves state-of-the-art performance with sentence representation features only.

The Rule of Three: Abstractive Text Summarization in Three Bullet Points
Tomonori Kodaira | Mamoru Komachi
Proceedings of the 32nd Pacific Asia Conference on Language, Information and Computation

2017

Improving Low-Resource Neural Machine Translation with Filtered Pseudo-Parallel Corpus
Aizhan Imankulova | Takayuki Sato | Mamoru Komachi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Large-scale parallel corpora are indispensable to train highly accurate machine translators. However, manually constructed large-scale parallel corpora are not freely available in many language pairs. In previous studies, training data have been expanded using a pseudo-parallel corpus obtained using machine translation of the monolingual corpus in the target language. However, in low-resource language pairs in which only low-accuracy machine translation systems can be used, translation quality is reduced when a pseudo-parallel corpus is used naively. To improve machine translation performance with low-resource language pairs, we propose a method to expand the training data effectively via filtering the pseudo-parallel corpus using a quality estimation based on back-translation. In experiments with three language pairs using small, medium, and large parallel corpora, language pairs with less training data filtered out more sentence pairs and improved BLEU scores more significantly.

MIPA: Mutual Information Based Paraphrase Acquisition via Bilingual Pivoting
Tomoyuki Kajiwara | Mamoru Komachi | Daichi Mochihashi
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We present a pointwise mutual information (PMI)-based approach to formalize paraphrasability and propose a variant of PMI, called MIPA, for paraphrase acquisition. Our paraphrase acquisition method first acquires lexical paraphrase pairs by bilingual pivoting and then reranks them by PMI and distributional similarity. The complementary nature of information from bilingual corpora and from monolingual corpora makes the proposed method robust. Experimental results show that the proposed method substantially outperforms bilingual pivoting and distributional similarity alone in terms of metrics such as MRR, MAP, coverage, and Spearman’s correlation.

Grammatical Error Detection Using Error- and Grammaticality-Specific Word Embeddings
Masahiro Kaneko | Yuya Sakaizawa | Mamoru Komachi
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this study, we improve grammatical error detection by learning word embeddings that consider grammaticality and error patterns. Most existing algorithms for learning word embeddings usually model only the syntactic context of words so that classifiers treat erroneous and correct words as similar inputs. We address the problem of contextual information by considering learner errors. Specifically, we propose two models: one model that employs grammatical error patterns and another model that considers grammaticality of the target word. We determine the grammaticality of an n-gram sequence from the annotated error tags and extract grammatical error patterns for word embeddings from large-scale learner corpora. Experimental results show that a bidirectional long short-term memory model initialized by our word embeddings achieved state-of-the-art accuracy by a large margin in an English grammatical error detection task on the First Certificate in English dataset.

Improving Japanese-to-English Neural Machine Translation by Paraphrasing the Target Language
Yuuki Sekizawa | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

Neural machine translation (NMT) produces sentences that are more fluent than those produced by statistical machine translation (SMT). However, NMT has a very high computational cost because of the high dimensionality of the output layer. Generally, NMT restricts the size of vocabulary, which results in infrequent words being treated as out-of-vocabulary (OOV) and degrades the performance of the translation. In evaluation, we achieved a statistically significant BLEU score improvement of 0.55-0.77 over the baselines including the state-of-the-art method.

Suggesting Sentences for ESL using Kernel Embeddings
Kent Shioda | Mamoru Komachi | Rue Ikeya | Daichi Mochihashi
Proceedings of the 4th Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA 2017)

Sentence retrieval is an important NLP application for English as a Second Language (ESL) learners. ESL learners are familiar with web search engines, but generic web search results may not be adequate for composing documents in a specific domain. However, if we build our own search system specialized to a domain, it may be subject to the data sparseness problem. Recently proposed word2vec partially addresses the data sparseness problem, but fails to extract sentences relevant to queries owing to the modeling of the latent intent of the query. Thus, we propose a method of retrieving example sentences using kernel embeddings and N-gram windows. This method implicitly models latent intent of query and sentences, and alleviates the problem of noisy alignment. Our results show that our method achieved higher precision in sentence retrieval for ESL in the domain of a university press release corpus, as compared to a previous unsupervised method used for a semantic textual similarity task.

Improving Japanese-to-English Neural Machine Translation by Voice Prediction
Hayahide Yamagishi | Shin Kanouchi | Takayuki Sato | Mamoru Komachi
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

This study reports an attempt to predict the voice of the reference using the information from the input sentences or previous input/output sentences. Our previous study presented a voice controlling method to generate sentences for neural machine translation, wherein it was demonstrated that the BLEU score improved when the voice of the generated sentence was controlled relative to that of the reference. However, it is impractical to use the reference information because we cannot discern the voice of the correct translation in advance. Thus, this study presents a voice prediction method for generated sentences for neural machine translation. When evaluating on Japanese-to-English translation, we obtain a 0.70-point improvement in BLEU using the predicted voice.

Building a Non-Trivial Paraphrase Corpus Using Multiple Machine Translation Systems
Yui Suzuki | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of ACL 2017, Student Research Workshop

Tokyo Metropolitan University Neural Machine Translation System for WAT 2017
Yukio Matsumura | Mamoru Komachi
Proceedings of the 4th Workshop on Asian Translation (WAT2017)

In this paper, we describe our neural machine translation (NMT) system, which is based on the attention-based NMT and uses long short-term memories (LSTM) as RNN. We implemented beam search and ensemble decoding in the NMT system. The system was tested on the 4th Workshop on Asian Translation (WAT 2017) shared tasks. In our experiments, we participated in the scientific paper subtasks and attempted Japanese-English, English-Japanese, and Japanese-Chinese translation tasks. The experimental results showed that implementation of beam search and ensemble decoding can effectively improve the translation quality.

2016

Controlling the Voice of a Sentence in Japanese-to-English Neural Machine Translation
Hayahide Yamagishi | Shin Kanouchi | Takayuki Sato | Mamoru Komachi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

In machine translation, we must consider the difference in expression between languages. For example, the active/passive voice may change in Japanese-English translation. The same verb in Japanese may be translated into different voices at each translation because the voice of a generated sentence cannot be determined using only the information of the Japanese sentence. Machine translation systems should consider the information structure to improve the coherence of the output by using several topicalization techniques such as passivization. Therefore, this paper reports on our attempt to control the voice of the sentence generated by an encoder-decoder model. To control the voice of the generated sentence, we added the voice information of the target sentence to the source sentence during the training. We then generated sentences with a specified voice by appending the voice information to the source sentence. We observed experimentally whether the voice could be controlled. The results showed that we could control the voice of the generated sentence with 85.0% accuracy on average. In the evaluation of Japanese-English translation, we obtained a 0.73-point improvement in BLEU score by using gold voice labels.
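The labeling step described above can be sketched as follows, assuming the voice information is a single special token appended to the tokenized source sentence. The token format (`<active>`, `<passive>`) and the helper names `add_voice_label` and `build_training_pairs` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
ALLOWED_VOICES = {"active", "passive"}  # assumed label set for this sketch


def add_voice_label(source_tokens, voice):
    """Return a copy of the source tokens with a voice token appended."""
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"unknown voice label: {voice}")
    return list(source_tokens) + [f"<{voice}>"]


def build_training_pairs(examples):
    """examples: iterable of (source_tokens, target_tokens, target_voice).

    During training the label comes from the voice of the target sentence;
    at test time the same function is called with the desired voice.
    """
    return [(add_voice_label(src, voice), tgt) for src, tgt, voice in examples]


if __name__ == "__main__":
    data = [(["kare", "wa", "hon", "o", "kaita"],
             ["the", "book", "was", "written", "by", "him"],
             "passive")]
    print(build_training_pairs(data))
```

Because the label is just another input token, the encoder-decoder architecture itself needs no modification in this sketch.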

Neural Reordering Model Considering Phrase Translation and Word Alignment for Phrase-based Translation
Shin Kanouchi | Katsuhito Sudoh | Mamoru Komachi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

This paper presents an improved lexicalized reordering model for phrase-based statistical machine translation using a deep neural network. Lexicalized reordering suffers from reordering ambiguity, data sparseness, and noise in a phrase table. A previous neural reordering model successfully solves the first and second problems but fails to address the third one. Therefore, we propose new features using phrase translation and word alignment to construct phrase vectors to handle inherently noisy phrase translation pairs. The experimental results show that our proposed method improves the accuracy of phrase reordering. We confirm that the proposed method works well with phrase pairs including NULL alignments.

Analysis of English Spelling Errors in a Word-Typing Game
Ryuichi Tachibana | Mamoru Komachi
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The emergence of the web has created the need to detect and correct noisy consumer-generated texts. Most of the previous studies on English spelling-error extraction collected English spelling errors from web services such as Twitter by using the edit distance or from input logs utilizing crowdsourcing. However, in the former approach, it is not clear which word corresponds to the spelling error, and the latter approach requires an annotation cost for the crowdsourcing. One notable exception is Rodrigues and Rytting (2012), who proposed to extract English spelling errors by using a word-typing game. Their approach saves the cost of crowdsourcing, and guarantees an exact alignment between the word and the spelling error. However, they did not verify whether the extracted spelling error corpora reflect the usual writing process such as writing a document. Therefore, we propose a new correctable word-typing game that is more similar to the actual writing process. Experimental results showed that we can regard typing-game logs as a source of spelling errors.

Building a Monolingual Parallel Corpus for Text Simplification Using Sentence Similarity Based on Alignment between Word Embeddings
Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Methods for text simplification using the framework of statistical machine translation have been extensively studied in recent years. However, building the monolingual parallel corpus necessary for training the model requires costly human annotation. Monolingual parallel corpora for text simplification have therefore been built only for a limited number of languages, such as English and Portuguese. To obviate the need for human annotation, we propose an unsupervised method that automatically builds the monolingual parallel corpus for text simplification using sentence similarity based on word embeddings. For any sentence pair comprising a complex sentence and its simple counterpart, we employ a many-to-one method of aligning each word in the complex sentence with the most similar word in the simple sentence and compute sentence similarity by averaging these word similarities. The experimental results demonstrate the excellent performance of the proposed method in a monolingual parallel corpus construction task for English text simplification. The results also demonstrate that text simplification using the framework of statistical machine translation is more accurate when trained on the corpus built by the proposed method than when trained on the existing corpora.
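The many-to-one alignment described above can be sketched in a few lines. This is a minimal illustration, assuming cosine similarity as the word similarity and a plain dictionary of toy vectors as the embedding table; the function names and the `TOY_EMBEDDINGS` values are invented for the example and are not the paper's data or code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_alignment_similarity(complex_words, simple_words, embeddings):
    """Align each complex-sentence word to its most similar simple-sentence
    word and average the resulting word similarities."""
    if not complex_words or not simple_words:
        return 0.0
    total = 0.0
    for cw in complex_words:
        total += max(cosine(embeddings[cw], embeddings[sw]) for sw in simple_words)
    return total / len(complex_words)


# Toy two-dimensional vectors, purely illustrative.
TOY_EMBEDDINGS = {
    "purchase": [1.0, 0.0],
    "buy": [1.0, 0.0],
    "automobile": [0.0, 1.0],
    "car": [0.0, 1.0],
    "quickly": [1.0, 1.0],
}

if __name__ == "__main__":
    print(max_alignment_similarity(["purchase", "automobile"],
                                   ["buy", "car"], TOY_EMBEDDINGS))
```

Sentence pairs whose score exceeds a chosen threshold could then be kept as complex-simple pairs; the threshold itself would need to be tuned and is not specified here.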

Japanese-English Machine Translation of Recipe Texts
Takayuki Sato | Jun Harashima | Mamoru Komachi
Proceedings of the 3rd Workshop on Asian Translation (WAT2016)

Concomitant with the globalization of food culture, demand for the recipes of specialty dishes has been increasing. The recent growth in recipe sharing websites and food blogs has resulted in numerous recipe texts being available for diverse foods in various languages. However, little work has been done on machine translation of recipe texts. In this paper, we address the task of translating recipes and investigate the advantages and disadvantages of traditional phrase-based statistical machine translation and more recent neural machine translation. Specifically, we translate Japanese recipes into English, analyze errors in the translated recipes, and discuss available room for improvements.

Controlled and Balanced Dataset for Japanese Lexical Simplification
Tomonori Kodaira | Tomoyuki Kajiwara | Mamoru Komachi
Proceedings of the ACL 2016 Student Research Workshop

Disaster Analysis using User-Generated Weather Report
Yasunobu Asakura | Masatsugu Hangyo | Mamoru Komachi
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)

Information extraction from user-generated text has gained much attention with the growth of the Web. Disaster analysis using information from social media provides valuable, real-time, geolocation information for helping people caught up in these disasters. However, it is not convenient to analyze texts posted on social media because disaster keywords match any text that contains those words. For collecting posts about a disaster from social media, we need to develop a classifier to filter posts irrelevant to disasters. Moreover, because of the nature of social media, we can take advantage of posts that come with GPS information. However, a post does not always refer to an event occurring at the place where it has been posted. Therefore, we propose a new task of classifying whether a flood disaster occurred, in addition to predicting the geolocation of events from user-generated text. We report the annotation of the flood disaster corpus and develop a classifier to demonstrate the use of this corpus for disaster analysis.

2015

Japanese Sentiment Classification with Stacked Denoising Auto-Encoder using Distributed Word Representation
Peinan Zhang | Mamoru Komachi
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation

Who caught a cold ? - Identifying the subject of a symptom
Shin Kanouchi | Mamoru Komachi | Naoaki Okazaki | Eiji Aramaki | Hiroshi Ishikawa
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Improving Chinese Grammatical Error Correction with Corpus Augmentation and Hierarchical Phrase-based Statistical Machine Translation
Yinchen Zhao | Mamoru Komachi | Hiroshi Ishikawa
Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications

Source Phrase Segmentation and Translation for Japanese-English Translation Using Dependency Structure
Junki Matsuo | Kenichi Ohwada | Mamoru Komachi
Proceedings of the 2nd Workshop on Asian Translation (WAT2015)

Disease Event Detection based on Deep Modality Analysis
Yoshiaki Kitagawa | Mamoru Komachi | Eiji Aramaki | Naoaki Okazaki | Hiroshi Ishikawa
Proceedings of the ACL-IJCNLP 2015 Student Research Workshop

2014

Predicate-Argument Structure-based Preordering for Japanese-English Statistical Machine Translation of Scientific Papers
Kenichi Ohwada | Ryosuke Miyazaki | Mamoru Komachi
Proceedings of the 1st Workshop on Asian Translation (WAT2014)

2013

NAIST at 2013 CoNLL Grammatical Error Correction Shared Task
Ippei Yoshimoto | Tomoya Kose | Kensuke Mitsuzawa | Keisuke Sakaguchi | Tomoya Mizumoto | Yuta Hayashibe | Mamoru Komachi | Yuji Matsumoto
Proceedings of the Seventeenth Conference on Computational Natural Language Learning: Shared Task

NAIST at the NLI 2013 Shared Task
Tomoya Mizumoto | Yuta Hayashibe | Keisuke Sakaguchi | Mamoru Komachi | Yuji Matsumoto
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners
Keisuke Sakaguchi | Yuki Arase | Mamoru Komachi
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Learner Corpus-based Approach to Verb Suggestion for ESL
Yu Sawai | Mamoru Komachi | Yuji Matsumoto
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Towards Automatic Error Type Classification of Japanese Language Learners’ Writings
Hiromi Oyama | Mamoru Komachi | Yuji Matsumoto
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)

2012

UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese
Toshinobu Ogiso | Mamoru Komachi | Yasuharu Den | Yuji Matsumoto
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

In order to construct an annotated diachronic corpus of Japanese, we propose to create a new dictionary for morphological analysis of Early Middle Japanese (Classical Japanese) based on UniDic, a dictionary for Contemporary Japanese. Differences between Early Middle Japanese and Contemporary Japanese, which prevent a naïve adaptation of UniDic to Early Middle Japanese, are found at the levels of lexicon, morphology, grammar, orthography and pronunciation. In order to overcome these problems, we extended dictionary entries and created a training corpus of Early Middle Japanese to adapt UniDic for Contemporary Japanese to Early Middle Japanese. Experimental results show that the proposed UniDic-EMJ, a new dictionary for Early Middle Japanese, achieves accuracy (97%) high enough for linguistic research on lexicon and grammar in Japanese classical text analysis.

NAIST at the HOO 2012 Shared Task
Keisuke Sakaguchi | Yuta Hayashibe | Shuhei Kondo | Lis Kanashiro | Tomoya Mizumoto | Mamoru Komachi | Yuji Matsumoto
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP

The Effect of Learner Corpus Size in Grammatical Error Correction of ESL Writings
Tomoya Mizumoto | Yuta Hayashibe | Mamoru Komachi | Masaaki Nagata | Yuji Matsumoto
Proceedings of COLING 2012: Posters

Joint English Spelling Error Correction and POS Tagging for Language Learners Writing
Keisuke Sakaguchi | Tomoya Mizumoto | Mamoru Komachi | Yuji Matsumoto
Proceedings of COLING 2012

Tense and Aspect Error Correction for ESL Learners Using Global Context
Toshikazu Tajiri | Mamoru Komachi | Yuji Matsumoto
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Narrative Schema as World Knowledge for Coreference Resolution
Joseph Irwin | Mamoru Komachi | Yuji Matsumoto
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task

Mining Revision Log of Language Learning SNS for Automated Japanese Error Correction of Second Language Learners
Tomoya Mizumoto | Mamoru Komachi | Masaaki Nagata | Yuji Matsumoto
Proceedings of 5th International Joint Conference on Natural Language Processing

Using the Mutual k-Nearest Neighbor Graphs for Semi-supervised Classification on Natural Language Data
Kohei Ozaki | Masashi Shimbo | Mamoru Komachi | Yuji Matsumoto
Proceedings of the Fifteenth Conference on Computational Natural Language Learning

Error Correcting Romaji-kana Conversion for Japanese Language Education
Seiji Kasahara | Mamoru Komachi | Masaaki Nagata | Yuji Matsumoto
Proceedings of the Workshop on Advances in Text Input Methods (WTIM 2011)

Japanese Abbreviation Expansion with Query and Clickthrough Logs
Kei Uchiumi | Mamoru Komachi | Keigo Machinaga | Toshiyuki Maezawa | Toshinori Satou | Yoshinori Kobayashi
Proceedings of 5th International Joint Conference on Natural Language Processing

HITS-based Seed Selection and Stop List Construction for Bootstrapping
Tetsuo Kiso | Masashi Shimbo | Mamoru Komachi | Yuji Matsumoto
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Automatic Labeling of Voiced Consonants for Morphological Analysis of Modern Japanese Literature
Teruaki Oka | Mamoru Komachi | Toshinobu Ogiso | Yuji Matsumoto
Proceedings of 5th International Joint Conference on Natural Language Processing

Japanese Predicate Argument Structure Analysis Exploiting Argument Position and Type
Yuta Hayashibe | Mamoru Komachi | Yuji Matsumoto
Proceedings of 5th International Joint Conference on Natural Language Processing

2009

Learning Semantic Categories from Clickthrough Logs
Mamoru Komachi | Shimpei Makimoto | Kei Uchiumi | Manabu Sassano
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Graph-based Analysis of Semantic Drift in Espresso-like Bootstrapping Algorithms
Mamoru Komachi | Taku Kudo | Masashi Shimbo | Yuji Matsumoto
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Minimally Supervised Learning of Semantic Knowledge from Query Logs
Mamoru Komachi | Hisami Suzuki
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

2007

Annotating a Japanese Text Corpus with Predicate-Argument and Coreference Relations
Ryu Iida | Mamoru Komachi | Kentaro Inui | Yuji Matsumoto
Proceedings of the Linguistic Annotation Workshop

2006

Phrase reordering for statistical machine translation based on predicate-argument structure
Mamoru Komachi | Masaaki Nagata | Yuji Matsumoto
Proceedings of the Third International Workshop on Spoken Language Translation: Evaluation Campaign

Co-authors

Satoru Katsumata 10

Aizhan Imankulova 8

Yukio Matsumura 6

Tomoya Mizumoto 6

Daichi Mochihashi 6

Toshinobu Ogiso 6

Hayahide Yamagishi 6

Yuta Hayashibe 5

Keisuke Sakaguchi 5

Hiroki Shimanaka 5

Taisei Enomoto 4

Shin Kanouchi 4

Seiichiro Kondo 4

Masaaki Nagata 4

Takayuki Sato 4

Hiroya Takamura 4

Hiroshi Ishikawa 3

Tomonori Kodaira 3

Kyotaro Nakajima 3

Naoaki Okazaki 3

Masashi Shimbo 3

Yujin Takahashi 3

Zizheng Zhang 3

Seiichi Inoue 2

Siti Oryza Khairunnisa 2

Yoshiaki Kitagawa 2

Tomoshige Kiyuna 2

Masamune Kobayashi 2

Michiki Kurosawa 2

Ryosuke Miyazaki 2

Yoshinari Nagai 2

Kenichi Ohwada 2

Yuya Sakaizawa 2

Hiroto Tamura 2

Ikumi Yamashita 2

Ryoma Yoshimura 2

Yasunobu Asakura 1

Emanuele Bugliarello 1

Hsin-Hsi Chen 1

Desmond Elliott 1

Rob Van Der Goot 1

Masatsugu Hangyo 1

Jun Harashima 1

Asahi Hentona 1

Lis Kanashiro 1

Seiji Kasahara 1

Atsushi Keyaki 1

Hajime Kiyama 1

Yoshinori Kobayashi 1

Kenji Kobayashi 1

Kazuma Kobayashi 1

Kanako Komiya 1

Keigo Machinaga 1

Toshiyuki Maezawa 1

Shimpei Makimoto 1

Kensuke Mitsuzawa 1

Sangwhan Moon 1

Akifumi Nakamachi 1

Michitaka Nakatsuji 1

Barbara Plank 1

Manabu Sassano 1

Toshinori Satou 1

Yuuki Sekizawa 1

Ibrahim Sharaf 1

Marija Stepanović 1

Katsuhito Sudoh 1

Masakazu Sugiyama 1

Hisami Suzuki 1

Daisuke Suzuki 1

Ryuichi Tachibana 1

Toshikazu Tajiri 1

Masashi Takaku 1

Yasufumi Takama 1

Hikari Tanaka 1

Yuen-Hsien Tseng 1

Ippei Yoshimoto 1

Ahmet Üstün 1

Venues