Jia-Huei Ju

Also published as: Jia-huei Ju


2024

pdf bib
Relevance-aware Diverse Query Generation for Out-of-domain Text Ranking
Jia-Huei Ju | Chao-Han Yang | Szu-Wei Fu | Ming-Feng Tsai | Chuan-Ju Wang
Proceedings of the 9th Workshop on Representation Learning for NLP (RepL4NLP-2024)

Domain adaptation presents significant challenges for out-of-domain text ranking, especially when supervised data is limited. In this paper, we present ReadQG (Relevance-Aware Diverse Query Generation), a method to generate informative synthetic queries to facilitate the adaptation process of text ranking models. Unlike previous approaches focusing solely on relevant query generation, our ReadQG generates diverse queries with continuous relevance scores. Specifically, we propose leveraging soft-prompt tuning and diverse generation objectives to control query generation according to the given relevance. Our experiments show that integrating negative queries into the learning process enhances the effectiveness of text ranking models in out-of-domain information retrieval (IR) benchmarks. Furthermore, we measure the quality of query generation, highlighting the underlying beneficial characteristics of negative queries. Our empirical results and analysis also shed light on potential directions for more advanced data augmentation in IR. The data and code have been released.

2023

pdf bib
A Compare-and-contrast Multistage Pipeline for Uncovering Financial Signals in Financial Reports
Jia-Huei Ju | Yu-Shiang Huang | Cheng-Wei Lin | Che Lin | Chuan-Ju Wang
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

In this paper, we address the challenge of discovering financial signals in narrative financial reports. As these documents are often lengthy and tend to blend routine information with new information, it is challenging for professionals to discern critical financial signals. To this end, we leverage the inherent nature of the year-to-year structure of reports to define a novel signal-highlighting task; more importantly, we propose a compare-and-contrast multistage pipeline that recognizes different relationships between the reports and locates relevant rationales for these relationships. We also create and publicly release a human-annotated dataset for our task. Our experiments on the dataset validate the effectiveness of our pipeline, and we provide detailed analyses and ablation studies to support our findings.

pdf bib
FISH: A Financial Interactive System for Signal Highlighting
Ta-wei Huang | Jia-huei Ju | Yu-shiang Huang | Cheng-wei Lin | Yi-shyuan Chiang | Chuan-ju Wang
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

In this system demonstration, we seek to streamline the process of reviewing financial statements and provide insightful information for practitioners. We develop FISH, an interactive system that extracts and highlights crucial textual signals from financial statements efficiently and precisely. To achieve our goal, we integrate pre-trained BERT representations and a fine-tuned BERT highlighting model with a newly-proposed two-stage classify-then-highlight pipeline. We also conduct the human evaluation, showing FISH can provide accurate financial signals. FISH overcomes the limitations of existing research andmore importantly benefits both academics and practitioners in finance as they can leverage state-of-the-art contextualized language models with their newly gained insights. The system is available online at https://fish-web-fish.de.r.appspot.com/, and a short video for introduction is at https://youtu.be/ZbvZQ09i6aw.