AbstractFinancial risk prediction is an essential task for risk management in capital markets. While traditional prediction models are built based on the hard information of numerical data, recent studies have shown that the soft information of verbal cues in earnings conference calls is significant for predicting market risk due to its less constrained fashion and direct interaction between managers and analysts. However, most existing models mainly focus on extracting useful semantic information from the textual conference call transcripts but ignore their subtle yet important information of dialogue structures. To bridge this gap, we develop a graph attention network called DialogueGAT for financial risk prediction by simultaneously modeling the speakers and their utterances in dialogues in conference calls. Different from previous studies, we propose a new method for constructing the graph of speakers and utterances in a dialogue, and design contextual attention at both speaker and utterance levels for disentangling their effects on the downstream prediction task. For model evaluation, we extend an existing dataset of conference call transcripts by adding the dialogue structure and speaker information. Empirical results on our dataset of S&P1500 companies demonstrate the superiority of our proposed model over competitive baselines from the extant literature.