Seonhee Cho


2023

Open-WikiTable : Dataset for Open Domain Question Answering with Complex Reasoning over Table
Sunjun Kweon | Yeonsu Kwon | Seonhee Cho | Yohan Jo | Edward Choi
Findings of the Association for Computational Linguistics: ACL 2023

Despite recent interest in open domain question answering (ODQA) over tables, many studies still rely on datasets that are not well suited to the task because they do not exploit the structural nature of tables. These datasets assume that each answer resides in a single cell and do not require reasoning over multiple cells through operations such as aggregation, comparison, and sorting. Thus, we release Open-WikiTable, the first ODQA dataset that requires complex reasoning over tables. Open-WikiTable is built upon WikiSQL and WikiTableQuestions to be applicable in the open-domain setting. As each question is coupled with both textual answers and SQL queries, Open-WikiTable opens up a wide range of possibilities for future research, as both reader and parser methods can be applied. The dataset is publicly available.

2021

KOAS: Korean Text Offensiveness Analysis System
San-Hee Park | Kang-Min Kim | Seonhee Cho | Jun-Hyung Park | Hyuntae Park | Hyuna Kim | Seongwon Chung | SangKeun Lee
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Warning: This manuscript contains a certain level of offensive expression. As communication through social media platforms has grown immensely, the increasing prevalence of offensive language online has become a critical problem. Notably in Korea, one of the countries with the highest Internet usage, automatic detection of offensive expressions has recently attracted attention. However, the morphological richness and complex syntax of Korean cause difficulties in neural model training. Furthermore, most previous studies focus mainly on the detection of abusive language, disregarding implicit offensiveness and underestimating differing degrees of intensity. To tackle these problems, we present KOAS, a system that fully exploits both contextual and linguistic features and estimates an offensiveness score for a text. We carefully designed KOAS with a multi-task learning framework and constructed a Korean dataset for offensiveness analysis from various domains. Refer for a detailed demonstration.