Yotaro Watanabe


2024

Multilingual Sentence-T5: Scalable Sentence Encoders for Multilingual Applications
Chihiro Yano | Akihiko Fukuchi | Shoko Fukasawa | Hideyuki Tachibana | Yotaro Watanabe
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Prior work on multilingual sentence embedding has demonstrated that the efficient use of natural language inference (NLI) data to build high-performance models can outperform conventional methods. However, the potential benefits of the recent “exponential” growth of language models with billions of parameters have not yet been fully explored. In this paper, we introduce Multilingual Sentence T5 (m-ST5), a larger NLI-based multilingual sentence embedding model built by extending Sentence T5, an existing monolingual model. By employing the low-rank adaptation (LoRA) technique, we successfully scaled the model to 5.7 billion parameters. We evaluated the model’s sentence embedding performance and verified that it outperforms the prior NLI-based approach. Furthermore, we confirmed a positive correlation between model size and performance. Notably, languages with fewer resources or less linguistic similarity to English benefited more from the parameter increase. Our model is available at https://huggingface.co/pkshatech/m-ST5.

2021

Validity-Based Sampling and Smoothing Methods for Multiple Reference Image Captioning
Shunta Nagasawa | Yotaro Watanabe | Hitoshi Iyatomi
Proceedings of the Third Workshop on Multimodal Artificial Intelligence

In image captioning, multiple captions are often provided as ground truth, since a valid caption is not always uniquely determined. Conventional methods randomly select a single caption and treat it as correct, but few effective training methods make use of all the given captions. In this paper, we propose two training techniques for making effective use of multiple reference captions: 1) validity-based caption sampling (VBCS), which prioritizes captions estimated to be highly valid during training, and 2) weighted caption smoothing (WCS), which applies smoothing only to the relevant words of the reference caption so that multiple reference captions are reflected simultaneously. Experiments show that our proposed methods improve CIDEr by 2.6 points and BLEU4 by 0.9 points over the baseline on the MSCOCO dataset.

2014

Finding The Best Model Among Representative Compositional Models
Masayasu Muraoka | Sonse Shimaoka | Kazeto Yamamoto | Yotaro Watanabe | Naoaki Okazaki | Kentaro Inui
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

2013

Is a 204 cm Man Tall or Small ? Acquisition of Numerical Common Sense from the Web
Katsuma Narisawa | Yotaro Watanabe | Junta Mizuno | Naoaki Okazaki | Kentaro Inui
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Computer-assisted Structuring of Emergency Management Information: A Project Note
Yotaro Watanabe | Kentaro Inui | Shingo Suzuki | Hiroko Koumoto | Mitsuhiro Higashida | Yuji Maeda | Katsumi Iwatsuki
Proceedings of the Workshop on Language Processing and Crisis Information 2013

2012

A Latent Discriminative Model for Compositional Entailment Relation Recognition using Natural Logic
Yotaro Watanabe | Junta Mizuno | Eric Nichols | Naoaki Okazaki | Kentaro Inui
Proceedings of COLING 2012

2010

A Structured Model for Joint Learning of Argument Roles and Predicate Senses
Yotaro Watanabe | Masayuki Asahara | Yuji Matsumoto
Proceedings of the ACL 2010 Conference Short Papers

Automatic Classification of Semantic Relations between Facts and Opinions
Koji Murakami | Eric Nichols | Junta Mizuno | Yotaro Watanabe | Hayato Goto | Megumi Ohki | Suguru Matsuyoshi | Kentaro Inui | Yuji Matsumoto
Proceedings of the Second Workshop on NLP Challenges in the Information Explosion Era (NLPIX 2010)

2009

Multilingual Syntactic-Semantic Dependency Parsing with Three-Stage Approximate Max-Margin Linear Models
Yotaro Watanabe | Masayuki Asahara | Yuji Matsumoto
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task

2008

A Pipeline Approach for Syntactic and Semantic Dependency Parsing
Yotaro Watanabe | Masakazu Iwatate | Masayuki Asahara | Yuji Matsumoto
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning

2007

A Graph-Based Approach to Named Entity Categorization in Wikipedia Using Conditional Random Fields
Yotaro Watanabe | Masayuki Asahara | Yuji Matsumoto
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

2005

Combination of Machine Learning Methods for Optimum Chinese Word Segmentation
Masayuki Asahara | Kenta Fukuoka | Ai Azuma | Chooi-Ling Goh | Yotaro Watanabe | Yuji Matsumoto | Takashi Tsuzuki
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing