Yada Zhu


2022

pdf bib
Stock Price Volatility Prediction: A Case Study with AutoML
Hilal Pataci | Yunyao Li | Yannis Katsis | Yada Zhu | Lucian Popa
Proceedings of the Fourth Workshop on Financial Technology and Natural Language Processing (FinNLP)

Accurate prediction of the stock price volatility, the rate at which the price of a stock increases or decreases over a particular period, is an important problem in finance. Inaccurate prediction of stock price volatility might lead to investment risk and financial loss, while accurate prediction might generate significant returns for investors. Several studies investigated stock price volatility prediction in a regression task by using the transcripts of earning calls (quarterly conference calls held by public companies) with Natural Language Processing (NLP) techniques. Existing studies use the entire transcript and this degrades the performance due to noise caused by irrelevant information that might not have a significant impact on stock price volatility. In order to overcome these limitations, by considering stock price volatility prediction as a classification task, we explore several denoising approaches, ranging from general-purpose approaches to techniques specific to finance to remove the noise, and leverage AutoML systems that enable auto-exploration of a wide variety of models. Our preliminary findings indicate that domain-specific denoising approaches provide better results than general-purpose approaches, moreover AutoML systems provide promising results.

2021

pdf bib
On Sample Based Explanation Methods for NLP: Faithfulness, Efficiency and Semantic Evaluation
Wei Zhang | Ziming Huang | Yada Zhu | Guangnan Ye | Xiaodong Cui | Fan Zhang
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In the recent advances of natural language processing, the scale of the state-of-the-art models and datasets is usually extensive, which challenges the application of sample-based explanation methods in many aspects, such as explanation interpretability, efficiency, and faithfulness. In this work, for the first time, we can improve the interpretability of explanations by allowing arbitrary text sequences as the explanation unit. On top of this, we implement a hessian-free method with a model faithfulness guarantee. Finally, to compare our method with the others, we propose a semantic-based evaluation metric that can better align with humans’ judgment of explanations than the widely adopted diagnostic or re-training measures. The empirical results on multiple real data sets demonstrate the proposed method’s superior performance to popular explanation techniques such as Influence Function or TracIn on semantic evaluation.