Zhishen Yang


2020

pdf bib
Image Caption Generation for News Articles
Zhishen Yang | Naoaki Okazaki
Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we address the task of news-image captioning, which generates a description of an image given the image and its article body as input. This task is more challenging than the conventional image captioning, because it requires a joint understanding of image and text. We present a Transformer model that integrates text and image modalities and attends to textual features from visual features in generating a caption. Experiments based on automatic evaluation metrics and human evaluation show that an article text provides primary information to reproduce news-image captions written by journalists. The results also demonstrate that the proposed model outperforms the state-of-the-art model. In addition, we also confirm that visual features contribute to improving the quality of news-image captions.

pdf bib
TextLearner at SemEval-2020 Task 10: A Contextualized Ranking System in Solving Emphasis Selection in Text
Zhishen Yang | Lars Wolfsteller | Naoaki Okazaki
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the emphasis selection system of the team TextLearner for SemEval 2020 Task 10: Emphasis Selection For Written Text in Visual Media. The system aims to learn the emphasis selection distribution using contextual representations extracted from pre-trained language models and a two-staged ranking model. The experimental results demonstrate the strong contextual representation power of the recent advanced transformer-based language model RoBERTa, which can be exploited using a simple but effective architecture on top.

2019

pdf bib
TokyoTech_NLP at SemEval-2019 Task 3: Emotion-related Symbols in Emotion Detection
Zhishen Yang | Sam Vijlbrief | Naoaki Okazaki
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper presents our contextual emotion detection system in approaching the SemEval2019 shared task 3: EmoContext: Contextual Emotion Detection in Text. This system cooperates with an emotion detection neural network method (Poria et al., 2017), emoji2vec (Eisner et al., 2016) embedding, word2vec embedding (Mikolov et al., 2013), and our proposed emoticon and emoji preprocessing method. The experimental results demonstrate the usefulness of our emoticon and emoji prepossessing method, and representations of emoticons and emoji contribute model’s emotion detection.