Seunghak Yu


2023

pdf bib
Large-scale Lifelong Learning of In-context Instructions and How to Tackle It
Jisoo Mok | Jaeyoung Do | Sungjin Lee | Tara Taghavi | Seunghak Yu | Sungroh Yoon
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Jointly fine-tuning a Pre-trained Language Model (PLM) on a pre-defined set of tasks with in-context instructions has been proven to improve its generalization performance, allowing us to build a universal language model that can be deployed across task boundaries. In this work, we explore for the first time whether this attractive property of in-context instruction learning can be extended to a scenario in which tasks are fed to the target PLM in a sequential manner. The primary objective of so-called lifelong in-context instruction learning is to improve the target PLM’s instance- and task-level generalization performance as it observes more tasks. DynaInst, the proposed method to lifelong in-context instruction learning, achieves noticeable improvements in both types of generalization, nearly reaching the upper bound performance obtained through joint training.

2022

pdf bib
Cooperative Self-training of Machine Reading Comprehension
Hongyin Luo | Shang-Wen Li | Mingye Gao | Seunghak Yu | James Glass
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Pretrained language models have significantly improved the performance of downstream language understanding tasks, including extractive question answering, by providing high-quality contextualized word embeddings. However, training question answering models still requires large amounts of annotated data for specific domains. In this work, we propose a cooperative self-training framework, RGX, for automatically generating more non-trivial question-answer pairs to improve model performance. RGX is built upon a masked answer extraction task with an interactive learning environment containing an answer entity Recognizer, a question Generator, and an answer eXtractor. Given a passage with a masked entity, the generator generates a question around the entity, and the extractor is trained to extract the masked entity with the generated question and raw texts. The framework allows the training of question generation and answering models on any text corpora without annotation. We further leverage a self-training technique to improve the performance of both question generation and answer extraction models. Experiment results show that RGX outperforms the state-of-the-art (SOTA) pretrained language models and transfer learning approaches on standard question-answering benchmarks, and yields the new SOTA performance under given model size and transfer learning settings.

2021

pdf bib
Interpretable Propaganda Detection in News Articles
Seunghak Yu | Giovanni Da San Martino | Mitra Mohtarami | James Glass | Preslav Nakov
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

Online users today are exposed to misleading and propagandistic news articles and media posts on a daily basis. To counter thus, a number of approaches have been designed aiming to achieve a healthier and safer online news and media consumption. Automatic systems are able to support humans in detecting such content; yet, a major impediment to their broad adoption is that besides being accurate, the decisions of such systems need also to be interpretable in order to be trusted and widely adopted by users. Since misleading and propagandistic content influences readers through the use of a number of deception techniques, we propose to detect and to show the use of such techniques as a way to offer interpretability. In particular, we define qualitatively descriptive features and we analyze their suitability for detecting deception techniques. We further show that our interpretable features can be easily combined with pre-trained language models, yielding state-of-the-art results.

2020

pdf bib
Prta: A System to Support the Analysis of Propaganda Techniques in the News
Giovanni Da San Martino | Shaden Shaar | Yifan Zhang | Seunghak Yu | Alberto Barrón-Cedeño | Preslav Nakov
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Recent events, such as the 2016 US Presidential Campaign, Brexit and the COVID-19 “infodemic”, have brought into the spotlight the dangers of online disinformation. There has been a lot of research focusing on fact-checking and disinformation detection. However, little attention has been paid to the specific rhetorical and psychological techniques used to convey propaganda messages. Revealing the use of such techniques can help promote media literacy and critical thinking, and eventually contribute to limiting the impact of “fake news” and disinformation campaigns. Prta (Propaganda Persuasion Techniques Analyzer) allows users to explore the articles crawled on a regular basis by highlighting the spans in which propaganda techniques occur and to compare them on the basis of their use of propaganda techniques. The system further reports statistics about the use of such techniques, overall and over time, or according to filtering criteria specified by the user based on time interval, keywords, and/or political orientation of the media. Moreover, it allows users to analyze any text or URL through a dedicated interface or via an API. The system is available online: https://www.tanbih.org/prta.

2019

pdf bib
Fine-Grained Analysis of Propaganda in News Article
Giovanni Da San Martino | Seunghak Yu | Alberto Barrón-Cedeño | Rostislav Petrov | Preslav Nakov
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Propaganda aims at influencing people’s mindset with the purpose of advancing a specific agenda. Previous work has addressed propaganda detection at document level, typically labelling all articles from a propagandistic news outlet as propaganda. Such noisy gold labels inevitably affect the quality of any learning system trained on them. A further issue with most existing systems is the lack of explainability. To overcome these limitations, we propose a novel task: performing fine-grained analysis of texts by detecting all fragments that contain propaganda techniques as well as their type. In particular, we create a corpus of news articles manually annotated at fragment level with eighteen propaganda techniques and propose a suitable evaluation measure. We further design a novel multi-granularity neural network, and we show that it outperforms several strong BERT-based baselines.

2018

pdf bib
On-Device Neural Language Model Based Word Prediction
Seunghak Yu | Nilesh Kulkarni | Haejun Lee | Jihie Kim
Proceedings of the 27th International Conference on Computational Linguistics: System Demonstrations

Recent developments in deep learning with application to language modeling have led to success in tasks of text processing, summarizing and machine translation. However, deploying huge language models for the mobile device such as on-device keyboards poses computation as a bottle-neck due to their puny computation capacities. In this work, we propose an on-device neural language model based word prediction method that optimizes run-time memory and also provides a real-time prediction environment. Our model size is 7.40MB and has average prediction time of 6.47 ms. Our proposed model outperforms the existing methods for word prediction in terms of keystroke savings and word prediction rate and has been successfully commercialized.

pdf bib
A Multi-Stage Memory Augmented Neural Network for Machine Reading Comprehension
Seunghak Yu | Sathish Reddy Indurthi | Seohyun Back | Haejun Lee
Proceedings of the Workshop on Machine Reading for Question Answering

Reading Comprehension (RC) of text is one of the fundamental tasks in natural language processing. In recent years, several end-to-end neural network models have been proposed to solve RC tasks. However, most of these models suffer in reasoning over long documents. In this work, we propose a novel Memory Augmented Machine Comprehension Network (MAMCN) to address long-range dependencies present in machine reading comprehension. We perform extensive experiments to evaluate proposed method with the renowned benchmark datasets such as SQuAD, QUASAR-T, and TriviaQA. We achieve the state of the art performance on both the document-level (QUASAR-T, TriviaQA) and paragraph-level (SQuAD) datasets compared to all the previously published approaches.

pdf bib
Cut to the Chase: A Context Zoom-in Network for Reading Comprehension
Sathish Reddy Indurthi | Seunghak Yu | Seohyun Back | Heriberto Cuayáhuitl
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In recent years many deep neural networks have been proposed to solve Reading Comprehension (RC) tasks. Most of these models suffer from reasoning over long documents and do not trivially generalize to cases where the answer is not present as a span in a given document. We present a novel neural-based architecture that is capable of extracting relevant regions based on a given question-document pair and generating a well-formed answer. To show the effectiveness of our architecture, we conducted several experiments on the recently proposed and challenging RC dataset ‘NarrativeQA’. The proposed architecture outperforms state-of-the-art results by 12.62% (ROUGE-L) relative improvement.

pdf bib
MemoReader: Large-Scale Reading Comprehension through Neural Memory Controller
Seohyun Back | Seunghak Yu | Sathish Reddy Indurthi | Jihie Kim | Jaegul Choo
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Machine reading comprehension helps machines learn to utilize most of the human knowledge written in the form of text. Existing approaches made a significant progress comparable to human-level performance, but they are still limited in understanding, up to a few paragraphs, failing to properly comprehend lengthy document. In this paper, we propose a novel deep neural network architecture to handle a long-range dependency in RC tasks. In detail, our method has two novel aspects: (1) an advanced memory-augmented architecture and (2) an expanded gated recurrent unit with dense connections that mitigate potential information distortion occurring in the memory. Our proposed architecture is widely applicable to other models. We have performed extensive experiments with well-known benchmark datasets such as TriviaQA, QUASAR-T, and SQuAD. The experimental results demonstrate that the proposed method outperforms existing methods, especially for lengthy documents.

pdf bib
Supervised Clustering of Questions into Intents for Dialog System Applications
Iryna Haponchyk | Antonio Uva | Seunghak Yu | Olga Uryupina | Alessandro Moschitti
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Modern automated dialog systems require complex dialog managers able to deal with user intent triggered by high-level semantic questions. In this paper, we propose a model for automatically clustering questions into user intents to help the design tasks. Since questions are short texts, uncovering their semantics to group them together can be very challenging. We approach the problem by using powerful semantic classifiers from question duplicate/matching research along with a novel idea of supervised clustering methods based on structured output. We test our approach on two intent clustering corpora, showing an impressive improvement over previous methods for two languages/domains.

2017

pdf bib
Syllable-level Neural Language Model for Agglutinative Language
Seunghak Yu | Nilesh Kulkarni | Haejun Lee | Jihie Kim
Proceedings of the First Workshop on Subword and Character Level Models in NLP

We introduce a novel method to diminish the problem of out of vocabulary words by introducing an embedding method which leverages the agglutinative property of language. We propose additional embedding derived from syllables and morphemes for the words to improve the performance of language model. We apply the above method to input prediction tasks and achieve state of the art performance in terms of Key Stroke Saving (KSS) w.r.t. to existing device input prediction methods.