Kazunari Sugiyama

2021

Multi-TimeLine Summarization (MTLS): Improving Timeline Summarization by Generating Multiple Summaries
Yi Yu | Adam Jatowt | Antoine Doucet | Kazunari Sugiyama | Masatoshi Yoshikawa
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper, we address a novel task, Multiple TimeLine Summarization (MTLS), which extends the flexibility and versatility of TimeLine Summarization (TLS). Given any collection of time-stamped news articles, MTLS automatically discovers important yet different stories and generates a corresponding timeline for each story. To achieve this, we propose a novel unsupervised summarization framework based on two-stage affinity propagation. We also introduce a quantitative evaluation measure for MTLS based on previous TLS evaluation methods. Experimental results show that our MTLS framework is highly effective and that the MTLS task can give better results than TLS.

2019

Predicting Helpful Posts in Open-Ended Discussion Forums: A Neural Architecture
Kishaloy Halder | Min-Yen Kan | Kazunari Sugiyama
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Users participate in online discussion forums to learn from others and share their knowledge with the community. They often start a thread with a question or by sharing their new findings on a certain topic. We find that, unlike Community Question Answering, where questions are mostly factoid-based, the threads in a forum are often open-ended (e.g., asking for recommendations from others) without a single correct answer. In this paper, we address the task of identifying helpful posts in a forum thread to help users comprehend long-running discussion threads, which often contain repetitive or irrelevant posts. We propose a recurrent neural network-based architecture to model (i) the relevance of a post to the original post starting the thread and (ii) the novelty it brings to the discussion, compared to the previous posts in the thread. Experimental results on different types of online forum datasets show that our model significantly outperforms the state-of-the-art neural network models for text classification.

2018

Identifying Emergent Research Trends by Key Authors and Phrases
Shenhao Jiang | Animesh Prasad | Min-Yen Kan | Kazunari Sugiyama
Proceedings of the 27th International Conference on Computational Linguistics

Identifying emergent research trends is a key issue for both primary researchers and secondary research managers. Such processes can uncover the historical development of an area and yield insight into developing topics. We propose an embedded trend detection framework for this task, which incorporates our bijunctive hypothesis that important phrases are written by important authors within a field and vice versa. By ranking both author and phrase information in a multigraph, our method jointly determines key phrases and authoritative authors. We represent this intermediate output as phrasal embeddings and feed them to a recurrent neural network (RNN) to compute trend scores that identify research trends. Over two large datasets of scientific articles, we demonstrate that our approach successfully detects past trends from the field, outperforming baselines based solely on text centrality or citation.

Treatment Side Effect Prediction from Online User-generated Content
Van Hoang Nguyen | Kazunari Sugiyama | Min-Yen Kan | Kishaloy Halder
Proceedings of the Ninth International Workshop on Health Text Mining and Information Analysis

With Health 2.0, patients and caregivers increasingly seek information regarding possible drug side effects during their medical treatments in online health communities. These are helpful platforms for non-professional medical opinions, yet they pose the risk of being unreliable in quality and insufficient in quantity to cover the wide range of potential drug reactions. Existing approaches that analyze such user-generated content in online forums rely heavily on feature engineering of both documents and users, and often overlook the relationships between posts within a common discussion thread. Inspired by recent advancements, we propose a neural architecture that models the textual content of user-generated documents and user experiences in online communities to predict side effects during treatment. Experimental results show that our proposed architecture outperforms baseline models.