Juyeon Kang


2024

Multi-Lingual ESG Impact Duration Inference
Chung-Chi Chen | Yu-Min Tseng | Juyeon Kang | Anais Lhuissier | Yohei Seki | Hanwool Lee | Min-Yuh Day | Teng-Tsai Tu | Hsin-Hsi Chen
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing

To accurately assess the dynamic impact of a company’s activities on its Environmental, Social, and Governance (ESG) scores, we have initiated a series of shared tasks, named ML-ESG. These tasks adhere to the MSCI guidelines for annotating news articles across various languages. This paper details the third iteration of our series, ML-ESG-3, with a focus on impact duration inference—a task that poses significant challenges in estimating the enduring influence of events, even for human analysts. In ML-ESG-3, we provide datasets in five languages (Chinese, English, French, Korean, and Japanese) and share insights from our experience in compiling such subjective datasets. Additionally, this paper reviews the methodologies proposed by ML-ESG-3 participants and offers a comparative analysis of the models’ performances. Concluding the paper, we introduce the concept for the forthcoming series of shared tasks, namely multi-lingual ESG promise verification, and discuss its potential contributions to the field.

2023

Multi-Lingual ESG Issue Identification
Chung-Chi Chen | Yu-Min Tseng | Juyeon Kang | Anaïs Lhuissier | Min-Yuh Day | Teng-Tsai Tu | Hsin-Hsi Chen
Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting

Multi-Lingual ESG Impact Type Identification
Chung-Chi Chen | Yu-Min Tseng | Juyeon Kang | Anaïs Lhuissier | Yohei Seki | Min-Yuh Day | Teng-Tsai Tu | Hsin-Hsi Chen
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing

Assessing a company’s sustainable development goes beyond just financial metrics; the inclusion of environmental, social, and governance (ESG) factors is becoming increasingly vital. The ML-ESG shared task series seeks to pioneer discussions on news-driven ESG ratings, drawing inspiration from the MSCI ESG rating guidelines. In its second edition, ML-ESG-2 emphasizes impact type identification, offering datasets in four languages: Chinese, English, French, and Japanese. Of the 28 teams registered, 8 participated in the official evaluation. This paper presents a comprehensive overview of ML-ESG-2, detailing the dataset specifics and summarizing the performance outcomes of the participating teams.

2022

FinSim4-ESG Shared Task: Learning Semantic Similarities for the Financial Domain. Extended edition to ESG insights
Juyeon Kang | Ismail El Maarouf
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

This paper describes the FinSim4-ESG shared task, organized at the 4th FinNLP workshop, which is held in conjunction with the IJCAI-ECAI-2022 conference. This year, FinSim4 is extended to Environmental, Social and Governance (ESG) insights and proposes two subtasks, one for ESG Taxonomy Enrichment and the other for Sustainable Sentence Prediction. Among the 28 teams registered for the shared task, a total of 8 teams submitted their system results and 6 teams also submitted a paper describing their method. The winner of each subtask shows good performance results of 0.85% and 0.95% in terms of accuracy, respectively.

The Financial Document Structure Extraction Shared Task (FinTOC 2022)
Juyeon Kang | Abderrahim Ait Azzi | Sandra Bellato | Blanca Carbajo Coronado | Mahmoud El-Haj | Ismail El Maarouf | Mei Gan | Ana Gisbert | Antonio Moreno Sandoval
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022

This paper describes the FinTOC-2022 Shared Task on structure extraction from financial documents, its participants' results and their findings. This shared task was organized as part of the 4th Financial Narrative Processing Workshop (FNP 2022), held jointly with the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France (El-Haj et al., 2022). The shared task aimed to stimulate research in systems for extracting a table of contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the fourth edition of this shared task, three subtasks were presented to the participants: one with English documents, one with French documents and one with Spanish documents. This year, we proposed a different and revised dataset for English and French compared to the previous editions of FinTOC, and a new dataset for Spanish documents was added. The task attracted 6 submissions for each language from 4 teams, and the most successful methods make use of textual, structural and visual features extracted from the documents and propose classification models for detecting titles and TOCs for all of the subtasks.

2021

FinSim-3: The 3rd Shared Task on Learning Semantic Similarities for the Financial Domain
Juyeon Kang | Ismail El Maarouf | Sandra Bellato | Mei Gan
Proceedings of the Third Workshop on Financial Technology and Natural Language Processing

The Financial Document Structure Extraction Shared Task (FinTOC2021)
Ismail El Maarouf | Juyeon Kang | Abderrahim Ait Azzi | Sandra Bellato | Mei Gan | Mahmoud El-Haj
Proceedings of the 3rd Financial Narrative Processing Workshop

2020

Extractive Summarization System for Annual Reports
Abderrahim Ait Azzi | Juyeon Kang
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation

In this paper, we report on our experiments in building a summarization system for generating summaries from annual reports. We adopt an “extractive” summarization approach in our hybrid system, which combines neural networks and rule-based algorithms, with the expectation that such a system may capture key sentences or paragraphs from the data. A rule-based TOC (Table Of Contents) extraction and a binary classifier of narrative section titles are the main components of our system; together they identify the narrative sections and the best candidates for extracting final summaries. As a result, we propose one to three summaries per document according to the classification score of narrative section titles.

2018

Data Anonymization for Requirements Quality Analysis: a Reproducible Automatic Error Detection Task
Juyeon Kang | Jungyeul Park
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

Generating a Linguistic Model for Requirement Quality Analysis
Juyeon Kang | Jungyeul Park
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Posters

2014

Requirement Mining in Technical Documents
Juyeon Kang | Patrick Saint-Dizier
Proceedings of the First Workshop on Argumentation Mining

2011

Système d’analyse catégorielle ACCG : adéquation au traitement de problèmes syntaxiques complexes (ACCG categorical analysis system: adequacy to the treatment of complex syntactic problems)
Juyeon Kang | Jean-Pierre Desclés
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

2008

Korean Parsing Based on the Applicative Combinatory Categorial Grammar
Juyeon Kang | Jean-Pierre Desclés
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation