Jong-Hoon Oh

Also published as: Jong Hoon Oh

2021

BERTAC: Enhancing Transformer-based Language Models with Adversarially Pretrained Convolutional Neural Networks
Jong-Hoon Oh | Ryu Iida | Julien Kloetzer | Kentaro Torisawa
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Transformer-based language models (TLMs), such as BERT, ALBERT and GPT-3, have shown strong performance in a wide range of NLP tasks and currently dominate the field of NLP. However, many researchers wonder whether these models can maintain their dominance forever. Of course, we do not have answers now, but, as an attempt to find better neural architectures and training schemes, we pretrain a simple CNN using a GAN-style learning scheme and Wikipedia data, and then integrate it with standard TLMs. We show that on the GLUE tasks, the combination of our pretrained CNN with ALBERT outperforms the original ALBERT and achieves a similar performance to that of SOTA. Furthermore, on open-domain QA (Quasar-T and SearchQA), the combination of the CNN with ALBERT or RoBERTa achieved stronger performance than SOTA and the original TLMs. We hope that this work provides a hint for developing a novel strong network architecture along with its training scheme. Our source code and models are available at https://github.com/nict-wisdom/bertac.

2019

Event Causality Recognition Exploiting Multiple Annotators’ Judgments and Background Knowledge
Kazuma Kadowaki | Ryu Iida | Kentaro Torisawa | Jong-Hoon Oh | Julien Kloetzer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We propose new BERT-based methods for recognizing event causality such as “smoke cigarettes” --> “die of lung cancer” written in web texts. In our methods, we grasp each annotator’s policy by training multiple classifiers, each of which predicts the labels given by a single annotator, and combine the resulting classifiers’ outputs to predict the final labels determined by majority vote. Furthermore, we investigate the effect of supplying background knowledge to our classifiers. Since BERT models are pre-trained with a large corpus, some sort of background knowledge for event causality may be learned during pre-training. Our experiments with a Japanese dataset suggest that this is actually the case: Performance improved when we pre-trained the BERT models with web texts containing a large number of event causalities instead of Wikipedia articles or randomly sampled web texts. However, this effect was limited. Therefore, we further improved performance by simply adding texts related to an input causality candidate as background knowledge to the input of the BERT models. We believe these findings indicate a promising future research direction.

Open-Domain Why-Question Answering with Adversarial Learning to Encode Answer Texts
Jong-Hoon Oh | Kazuma Kadowaki | Julien Kloetzer | Ryu Iida | Kentaro Torisawa
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we propose a method for why-question answering (why-QA) that uses an adversarial learning framework. Existing why-QA methods retrieve “answer passages” that usually consist of several sentences. These multi-sentence passages contain not only the reason sought by a why-question and its connection to the why-question, but also redundant and/or unrelated parts. We use our proposed “Adversarial networks for Generating compact-answer Representation” (AGR) to generate from a passage a vector representation of the non-redundant reason sought by a why-question and exploit the representation for judging whether the passage actually answers the why-question. Through a series of experiments using Japanese why-QA datasets, we show that these representations improve the performance of our why-QA neural model as well as that of a BERT-based why-QA model. We show that they also improve a state-of-the-art distantly supervised open-domain QA (DS-QA) method on publicly available English datasets, even though the target task is not a why-QA.

2016

WISDOM X, DISAANA and D-SUMM: Large-scale NLP Systems for Analyzing Textual Big Data
Junta Mizuno | Masahiro Tanaka | Kiyonori Ohtake | Jong-Hoon Oh | Julien Kloetzer | Chikara Hashimoto | Kentaro Torisawa
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

We demonstrate our large-scale NLP systems: WISDOM X, DISAANA, and D-SUMM. WISDOM X provides numerous possible answers including unpredictable ones to widely diverse natural language questions to provide deep insights about a broad range of issues. DISAANA and D-SUMM enable us to assess the damage caused by large-scale disasters in real time using Twitter as an information source.
