2024
First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning
Yoichi Aoki | Keito Kudo | Tatsuki Kuribayashi | Shusaku Sone | Masaya Taniguchi | Keisuke Sakaguchi | Kentaro Inui
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Explicit multi-step reasoning, such as chain-of-thought, is widely adopted in the community to obtain better performance from language models (LMs). We report on the systematic strategy that LMs use in this process. Our controlled experiments reveal that LMs rely more heavily on heuristics, such as lexical overlap, in the earlier stages of reasoning when more steps are required to reach an answer. Conversely, their reliance on heuristics decreases as LMs progress closer to the final answer. This suggests that LMs track only a limited number of future steps and dynamically combine heuristic strategies with rational ones in solving tasks involving multi-step reasoning.
Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task
Keito Kudo | Hiroyuki Deguchi | Makoto Morishita | Ryo Fujii | Takumi Ito | Shintaro Ozaki | Koki Natsumi | Kai Sato | Kazuki Yano | Ryosuke Takahashi | Subaru Kimura | Tomomasa Hara | Yusuke Sakai | Jun Suzuki
Proceedings of the Ninth Conference on Machine Translation
We participated in the constrained track for English-Japanese and Japanese-Chinese translations at the WMT 2024 General Machine Translation Task. Our approach was to generate a large number of sentence-level translation candidates and select the most probable translation using minimum Bayes risk (MBR) decoding and document-level large language model (LLM) re-ranking. We first generated hundreds of translation candidates from multiple translation models and retained the top 30 candidates using MBR decoding. In addition, we continually pre-trained LLMs on the target language corpora to leverage document-level information. We utilized LLMs to select the most probable sentence sequentially in context from the beginning of the document.
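As a rough illustration of the candidate-selection step described above, the following is a minimal sketch of MBR decoding over a candidate pool. It is not the team's implementation: the function names `token_f1` and `mbr_top_k` are hypothetical, and the utility is a simple token-level F1 chosen only to keep the example self-contained, where a real system would normally plug in a learned translation metric.

```python
from collections import Counter


def token_f1(hyp: str, ref: str) -> float:
    """Toy utility (an assumption for this sketch): token-level F1 overlap."""
    h, r = hyp.split(), ref.split()
    if not h or not r:
        return 0.0
    overlap = sum((Counter(h) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(h)
    recall = overlap / len(r)
    return 2 * precision * recall / (precision + recall)


def mbr_top_k(candidates, k=1, utility=token_f1):
    """Rank candidates by expected utility, using the other candidates
    as pseudo-references, and return the k best."""
    scored = []
    for i, hyp in enumerate(candidates):
        others = [c for j, c in enumerate(candidates) if j != i]
        # Average utility against every other candidate in the pool.
        score = sum(utility(hyp, ref) for ref in others) / len(others) if others else 0.0
        scored.append((score, hyp))
    # Stable sort: candidates with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hyp for _, hyp in scored[:k]]


candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog ran away",
]
best = mbr_top_k(candidates, k=1)
```

Here the second candidate is selected because it agrees most, on average, with the rest of the pool. Calling `mbr_top_k` with a larger `k` would correspond to retaining a shortlist for a later re-ranking stage.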
2023
Empirical Investigation of Neural Symbolic Reasoning Strategies
Yoichi Aoki | Keito Kudo | Tatsuki Kuribayashi | Ana Brassard | Masashi Yoshikawa | Keisuke Sakaguchi | Kentaro Inui
Findings of the Association for Computational Linguistics: EACL 2023
Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement remains unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1, B=3, C=A+3, C?), we found that the choice of reasoning strategies significantly affects the performance, with the gap becoming even larger as the extrapolation length becomes longer. Surprisingly, we also found that certain configurations lead to nearly perfect performance, even in the case of length extrapolation. Our results indicate the importance of further exploring effective strategies for neural reasoning models.
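To make the example instance from the abstract (A=1, B=3, C=A+3, C?) concrete, the sketch below resolves such a problem and records one intermediate step per statement. It is a hypothetical illustration, not the authors' dataset or model code: the function name `solve` is invented here, only addition is supported, and every variable is assumed to be defined before it is used.

```python
def solve(problem: str):
    """Resolve a problem such as 'A=1, B=3, C=A+3, C?'.

    Returns the queried value and the list of intermediate steps.
    Assumptions of this sketch: only '+' appears, and variables are
    defined before they are referenced.
    """
    *statements, query = [part.strip() for part in problem.split(",")]
    target = query.rstrip("?")
    env = {}    # variable name -> resolved integer value
    steps = []  # one resolved assignment per statement
    for stmt in statements:
        name, expr = (side.strip() for side in stmt.split("="))
        total = 0
        for term in expr.split("+"):
            term = term.strip()
            # A term is either a known variable or an integer literal.
            total += env[term] if term in env else int(term)
        env[name] = total
        steps.append(f"{name}={total}")
    return env[target], steps


answer, steps = solve("A=1, B=3, C=A+3, C?")
```

For the abstract's example this yields the answer 4 with the steps `A=1`, `B=3`, `C=4`. Longer chains of such assignments are one way to picture the length-extrapolation setting the abstract mentions.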
A Challenging Multimodal Video Summary: Simultaneously Extracting and Generating Keyframe-Caption Pairs from Video
Keito Kudo | Haruki Nagasawa | Jun Suzuki | Nobuyuki Shimizu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
This paper proposes a practical multimodal video summarization task setting and a dataset to train and evaluate the task. The target task involves summarizing a given video into a predefined number of keyframe-caption pairs and displaying them in a listable format to grasp the video content quickly. This task aims to extract crucial scenes from the video in the form of images (keyframes) and generate corresponding captions explaining each keyframe’s situation. This task is useful as a practical application and presents a highly challenging problem worthy of study. Specifically, achieving simultaneous optimization of the keyframe selection performance and caption quality necessitates careful consideration of the mutual dependence on both preceding and subsequent keyframes and captions. To facilitate subsequent research in this field, we also construct a dataset by expanding upon existing datasets and propose an evaluation framework. Furthermore, we develop two baseline systems and report their respective performance.
Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?
Keito Kudo | Yoichi Aoki | Tatsuki Kuribayashi | Ana Brassard | Masashi Yoshikawa | Keisuke Sakaguchi | Kentaro Inui
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
Compositionality is a pivotal property of symbolic reasoning. However, how well recent neural models capture compositionality remains underexplored in symbolic reasoning tasks. This study empirically addresses this question by systematically examining recently published pre-trained seq2seq models with a carefully controlled dataset of multi-hop arithmetic symbolic reasoning. We introduce a skill tree on compositionality in arithmetic symbolic reasoning that defines the hierarchical levels of complexity along with three compositionality dimensions: systematicity, productivity, and substitutivity. Our experiments revealed that among the three types of composition, the models struggled most with systematicity, performing poorly even with relatively simple compositions. That difficulty was not resolved even after training the models with intermediate reasoning steps.
SKIM at WMT 2023 General Translation Task
Keito Kudo | Takumi Ito | Makoto Morishita | Jun Suzuki
Proceedings of the Eighth Conference on Machine Translation
The SKIM team’s submission used a standard procedure to build ensemble Transformer models, including base-model training, back-translation of base models for data augmentation, and retraining of several final models using back-translated training data. Each final model had its own architecture and configuration, including up to 10.5B parameters, and substituted self- and cross-sublayers in the decoder with a cross+self-attention sub-layer. We selected the best candidate from a large candidate pool, namely 70 translations generated from 13 distinct models for each sentence, with an MBR reranking method based on COMET and COMET-QE. We also applied data augmentation and selection techniques to the training data of the Transformer models.
2022
NT5 at WMT 2022 General Translation Task
Makoto Morishita | Keito Kudo | Yui Oka | Katsuki Chousa | Shun Kiyono | Sho Takase | Jun Suzuki
Proceedings of the Seventh Conference on Machine Translation (WMT)
This paper describes the NTT-Tohoku-TokyoTech-RIKEN (NT5) team’s submission system for the WMT’22 general translation task. This year, we focused on the English-to-Japanese and Japanese-to-English translation tracks. Our submission system consists of an ensemble of Transformer models with several extensions. We also applied data augmentation and selection techniques to obtain potentially effective training data for training individual Transformer models in the pre-training and fine-tuning scheme. Additionally, we report our trial of incorporating a reranking module and the reevaluated results of several techniques that have been recently developed and published.