Chung-Chi Chen


2021

pdf bib
Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis
Ting-Wei Hsu | Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Both the issues of data deficiencies and semantic consistency are important for data augmentation. Most of previous methods address the first issue, but ignore the second one. In the cases of aspect-based sentiment analysis, violation of the above issues may change the aspect and sentiment polarity. In this paper, we propose a semantics-preservation data augmentation approach by considering the importance of each word in a textual sequence according to the related aspects and sentiments. We then substitute the unimportant tokens with two replacement strategies without altering the aspect-level polarity. Our approach is evaluated on several publicly available sentiment analysis datasets and the real-world stock price/risk movement prediction scenarios. Experimental results show that our methodology achieves better performances in all datasets.

pdf bib
Financial Opinion Mining
Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

In this tutorial, we will show where we are and where we will be to those researchers interested in this topic. We divide this tutorial into three parts, including coarse-grained financial opinion mining, fine-grained financial opinion mining, and possible research directions. This tutorial starts by introducing the components in a financial opinion proposed in our research agenda and summarizes their related studies. We also highlight the task of mining customers’ opinions toward financial services in the FinTech industry, and compare them with usual opinions. Several potential research questions will be addressed. We hope the audiences of this tutorial will gain an overview of financial opinion mining and figure out their research directions.

pdf bib
Dynamic Graph Transformer for Implicit Tag Recognition
Yi-Ting Liou | Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Textual information extraction is a typical research topic in the NLP community. Several NLP tasks such as named entity recognition and relation extraction between entities have been well-studied in previous work. However, few works pay their attention to the implicit information. For example, a financial news article mentioned “Apple Inc.” may be also related to Samsung, even though Samsung is not explicitly mentioned in this article. This work presents a novel dynamic graph transformer that distills the textual information and the entity relations on the fly. Experimental results confirm the effectiveness of our approach to implicit tag recognition.

pdf bib
Proceedings of the Third Workshop on Financial Technology and Natural Language Processing
Chung-Chi Chen | Hen-Hsen Huang | Hiroya Takamura | Hsin-Hsi Chen
Proceedings of the Third Workshop on Financial Technology and Natural Language Processing

2020

pdf bib
Proceedings of the Second Workshop on Financial Technology and Natural Language Processing
Chung-Chi Chen | Hen-Hsen Huang | Hiroya Takamura | Hsin-Hsi Chen
Proceedings of the Second Workshop on Financial Technology and Natural Language Processing

pdf bib
NTUNLPL at FinCausal 2020, Task 2:Improving Causality Detection Using Viterbi Decoder
Pei-Wei Kao | Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation

In order to provide an explanation of machine learning models, causality detection attracts lots of attention in the artificial intelligence research community. In this paper, we explore the cause-effect detection in financial news and propose an approach, which combines the BIO scheme with the Viterbi decoder for addressing this challenge. Our approach is ranked the first in the official run of cause-effect detection (Task 2) of the FinCausal-2020 shared task. We not only report the implementation details and ablation analysis in this paper, but also publish our code for academic usage.

pdf bib
Issues and Perspectives from 10,000 Annotated Financial Social Media Data
Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 12th Language Resources and Evaluation Conference

In this paper, we investigate the annotation of financial social media data from several angles. We present Fin-SoMe, a dataset with 10,000 labeled financial tweets annotated by experts from both the front desk and the middle desk in a bank’s treasury. These annotated results reveal that (1) writer-labeled market sentiment may be a misleading label; (2) writer’s sentiment and market sentiment of an investor may be different; (3) most financial tweets provide unfounded analysis results; and (4) almost no investors write down the gain/loss results for their positions, which would otherwise greatly facilitate detailed evaluation of their performance. Based on these results, we address various open problems and suggest possible directions for future work on financial social media data. We also provide an experiment on the key snippet extraction task to compare the performance of using a general sentiment dictionary and using the domain-specific dictionary. The results echo our findings from the experts’ annotations.

2019

pdf bib
Proceedings of the First Workshop on Financial Technology and Natural Language Processing
Chung-Chi Chen | Hen-Hsen Huang | Hiroya Takamura | Hsin-Hsi Chen
Proceedings of the First Workshop on Financial Technology and Natural Language Processing

pdf bib
Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments
Chung-Chi Chen | Hen-Hsen Huang | Hiroya Takamura | Hsin-Hsi Chen
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

In this paper, we attempt to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a numeral at some specific position in a text description. A large benchmark dataset, called Numeracy-600K, is provided for the novel task. We explore several neural network models including CNN, GRU, BiGRU, CRNN, CNN-capsule, GRU-capsule, and BiGRU-capsule in the experiments. The results show that the BiGRU model gets the best micro-averaged F1 score of 80.16%, and the GRU-capsule model gets the best macro-averaged F1 score of 64.71%. Besides discussing the challenges through comprehensive experiments, we also present an important application scenario, i.e., detecting exaggerated information, for the task.

2017

pdf bib
NLG301 at SemEval-2017 Task 5: Fine-Grained Sentiment Analysis on Financial Microblogs and News
Chung-Chi Chen | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Short length, multi-targets, target relation-ship, monetary expressions, and outside reference are characteristics of financial tweets. This paper proposes methods to extract target spans from a tweet and its referencing web page. Total 15 publicly available sentiment dictionaries and one sentiment dictionary constructed from training set, containing sentiment scores in binary or real numbers, are used to compute the sentiment scores of text spans. Moreover, the correlation coeffi-cients of the price return between any two stocks are learned with the price data from Bloomberg. They are used to capture the relationships between the interesting tar-get and other stocks mentioned in a tweet. The best result of our method in both sub-task are 56.68% and 55.43%, evaluated by evaluation method 2.