Ming-Feng Tsai

Also published as: Meng-Feng Tsai


2024

Relevance-aware Diverse Query Generation for Out-of-domain Text Ranking
Jia-Huei Ju | Chao-Han Yang | Szu-Wei Fu | Ming-Feng Tsai | Chuan-Ju Wang
Proceedings of the 9th Workshop on Representation Learning for NLP (RepL4NLP-2024)

Domain adaptation presents significant challenges for out-of-domain text ranking, especially when supervised data is limited. In this paper, we present ReadQG (Relevance-Aware Diverse Query Generation), a method to generate informative synthetic queries to facilitate the adaptation process of text ranking models. Unlike previous approaches focusing solely on relevant query generation, our ReadQG generates diverse queries with continuous relevance scores. Specifically, we propose leveraging soft-prompt tuning and diverse generation objectives to control query generation according to the given relevance. Our experiments show that integrating negative queries into the learning process enhances the effectiveness of text ranking models in out-of-domain information retrieval (IR) benchmarks. Furthermore, we measure the quality of query generation, highlighting the underlying beneficial characteristics of negative queries. Our empirical results and analysis also shed light on potential directions for more advanced data augmentation in IR. The data and code have been released.

2020

Designing Templates for Eliciting Commonsense Knowledge from Pretrained Sequence-to-Sequence Models
Jheng-Hong Yang | Sheng-Chieh Lin | Rodrigo Nogueira | Ming-Feng Tsai | Chuan-Ju Wang | Jimmy Lin
Proceedings of the 28th International Conference on Computational Linguistics

While internalized “implicit knowledge” in pretrained transformers has led to fruitful progress in many natural language understanding tasks, how to most effectively elicit such knowledge remains an open question. Based on the text-to-text transfer transformer (T5) model, this work explores a template-based approach to extract implicit knowledge for commonsense reasoning on multiple-choice (MC) question answering tasks. Experiments on three representative MC datasets show the surprisingly good performance of our simple template, coupled with a logit normalization technique for disambiguation. Furthermore, we verify that our proposed template can be easily extended to other MC tasks with contexts such as supporting facts in open-book question answering settings. Starting from the MC task, this work initiates further research to find generic natural language templates that can effectively leverage stored knowledge in pretrained models.

2018

RiskFinder: A Sentence-level Risk Detector for Financial Reports
Yu-Wen Liu | Liang-Chih Liu | Chuan-Ju Wang | Ming-Feng Tsai
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

This paper presents a web-based information system, RiskFinder, for facilitating the analyses of soft and hard information in financial reports. In particular, the system broadens the analyses from the word level to the sentence level, which makes the system useful for practitioner communities and unprecedented among financial academics. The proposed system has four main components: 1) a Form 10-K risk-sentiment dataset, consisting of a set of risk-labeled financial sentences and pre-trained sentence embeddings; 2) metadata, including basic information on each company that published the Form 10-K financial report as well as several relevant financial measures; 3) an interface that highlights risk-related sentences in the financial reports based on the latest sentence embedding techniques; 4) a visualization of financial time-series data for a corresponding company. This paper also conducts some case studies to showcase that the system can be of great help in capturing valuable insight within large amounts of textual information. The system is now available online at https://cfda.csie.org/RiskFinder/.

Proceedings of the First Workshop on Economics and Natural Language Processing
Udo Hahn | Véronique Hoste | Ming-Feng Tsai
Proceedings of the First Workshop on Economics and Natural Language Processing

2014

Financial Keyword Expansion via Continuous Word Vector Representations
Ming-Feng Tsai | Chuan-Ju Wang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Financial Sentiment Analysis for Risk Prediction
Chuan-Ju Wang | Ming-Feng Tsai | Tse Liu | Chin-Ting Chang
Proceedings of the Sixth International Joint Conference on Natural Language Processing

主要漢字形聲字發音規則探勘與視覺化 (Primary Chinese Semantic-Phonetic Compounds Pronunciation Rules Mining and Visualization) [In Chinese]
Chien-Hui Hsu | Meng-Feng Tsai | Chia-Hui Chang | Hsiang-Mei Liao | Shu-Ping Li | Denise H. Wu
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing (ROCLING 2013)

2012

聲符部件排序與形聲字發音規則探勘 (Phonetic Component Ranking and Pronunciation Rules Discovery for Picto-Phonetic Chinese Characters) [In Chinese]
Chia-Hui Chang | Shu-Yen Lin | Meng-Feng Tsai | Shu-Ping Li | Hsiang-Mei Liao | Norden E. Huang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 17, Number 3, September 2012

Visualization on Financial Terms via Risk Ranking from Financial Reports
Ming-Feng Tsai | Chuan-Ju Wang
Proceedings of COLING 2012: Demonstration Papers

2010

以最佳化及機率分佈標記形聲字聲符之研究 (Annotating Phonetic Component of Chinese Characters Using Constrained Optimization and Pronunciation Distribution) [In Chinese]
Chia-Hui Chang | Shu-Yen Lin | Shu-Ying Li | Meng-Feng Tsai | Shu-Ping Li | Hsiang-Mei Liao | Chih-Wen Sun | Norden E. Huang
International Journal of Computational Linguistics & Chinese Language Processing, Volume 15, Number 2, June 2010

2009

Latent Prosody Model-Assisted Mandarin Accent Identification
Yuan-Fu Liao | Shuan-Chen Yeh | Ming-Feng Tsai | Wei-Hsiung Ting | Sen-Chia Chang
Proceedings of the 21st Conference on Computational Linguistics and Speech Processing