2024
Unlocking Black-Box Prompt Tuning Efficiency via Zeroth-Order Optimization
Heshen Zhan | Congliang Chen | Tian Ding | Ziniu Li | Ruoyu Sun
Findings of the Association for Computational Linguistics: EMNLP 2024
Prompt optimization has emerged as an important technique for adapting Large Language Models (LLMs) to specific tasks. Unfortunately, LLM proprietors often restrict access to models’ internal weights, confining users to inference API services. This restriction poses a significant challenge for prompt optimization, as conventional optimization-based algorithms rely heavily on gradient information, which is unavailable via inference APIs. To address this challenge, this paper presents the Zeroth-Order Tuning (ZOT) approach, which enables efficient prompt tuning solely via inference APIs. ZOT adopts the zeroth-order optimization framework, using finite differences to approximate gradient information. We further augment ZOT with gradient clipping and momentum to enhance tuning effectiveness. Experimental results show that ZOT outperforms existing black-box prompt tuning methods in both task-specific performance and convergence speed. Furthermore, we provide a theoretical explanation for the unexpectedly strong performance of zeroth-order methods on LLM prompt tuning: by introducing the concept of effective dimension, we establish a strong connection between the inherently low effective dimension of prompt spaces and the superior convergence speed of zeroth-order methods. Our code is available at https://github.com/ZhanHeshen/ZOT.
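A minimal sketch of the two-point zeroth-order update described in the abstract, with gradient clipping and momentum. This is an illustration under stated assumptions, not the authors' implementation: `loss_fn`, the hyperparameter values, and the toy usage are placeholders standing in for a black-box inference-API objective.

```python
import numpy as np

def zo_prompt_tune(loss_fn, prompt, steps=500, mu=1e-3, lr=1e-2,
                   momentum=0.9, clip_norm=1.0, seed=0):
    """Tune a continuous prompt vector using two-point zeroth-order estimates."""
    rng = np.random.default_rng(seed)
    velocity = np.zeros_like(prompt)
    for _ in range(steps):
        u = rng.standard_normal(prompt.shape)  # random probe direction
        # Finite differences approximate the directional derivative using
        # only black-box evaluations of loss_fn (no gradients required).
        g_hat = (loss_fn(prompt + mu * u) - loss_fn(prompt - mu * u)) / (2 * mu) * u
        # Clip the noisy estimate to stabilize updates.
        norm = np.linalg.norm(g_hat)
        if norm > clip_norm:
            g_hat *= clip_norm / norm
        # Momentum smooths estimates across iterations.
        velocity = momentum * velocity + g_hat
        prompt = prompt - lr * velocity
    return prompt

# Toy usage: a quadratic stands in for a loss obtained via an inference API.
tuned = zo_prompt_tune(lambda p: float(np.sum(p ** 2)), np.ones(8))
```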
AceGPT, Localizing Large Language Models in Arabic
Huang Huang | Fei Yu | Jianqing Zhu | Xuening Sun | Hao Cheng | Song Dingjie | Zhihong Chen | Mosen Alharthi | Bang An | Juncai He | Ziche Liu | Junying Chen | Jianquan Li | Benyou Wang | Lian Zhang | Ruoyu Sun | Xiang Wan | Haizhou Li | Jinchao Xu
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
This paper is devoted to developing a localized Large Language Model (LLM) specifically for Arabic, a language with unique cultural characteristics that current mainstream models address inadequately, raising significant concerns around cultural sensitivity and local values. To address this, the paper proposes a comprehensive solution that includes further pre-training on Arabic texts, Supervised Fine-Tuning (SFT) on native Arabic instructions and GPT-4 responses in Arabic, and Reinforcement Learning with AI Feedback (RLAIF) employing a reward model attuned to local culture and values. The goal is to cultivate culturally cognizant and value-aligned Arabic LLMs capable of accommodating the diverse, application-specific needs of Arabic-speaking communities. Comprehensive evaluations reveal that the resulting model, dubbed ‘AceGPT’, sets the state-of-the-art standard for open Arabic LLMs across various benchmarks. Code, data, and models are available at https://github.com/FreedomIntelligence/AceGPT.
FinBPM: A Framework for Portfolio Management-based Financial Investor Behavior Perception Model
Zhilu Zhang | Procheta Sen | Zimu Wang | Ruoyu Sun | Zhengyong Jiang | Jionglong Su
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
The goal of portfolio management is to maximize accumulated return while controlling risk. Over consecutive trading periods, the portfolio manager must continuously adjust the portfolio weights in response to the factors that cause price fluctuations in the market. In the stock market, these factors fall into two categories: price fluctuations caused by the irrational investments of speculators, and endogenous value changes caused by the operations of the company. In recent years, with the advancement of artificial intelligence, reinforcement learning (RL) algorithms have been increasingly employed to address financial problems, particularly in the area of portfolio management. However, prior deep RL models have focused mainly on the price changes caused by speculators’ investment behavior, as reflected in technical indicators of actual stock prices. In this research, we introduce an RL-based framework called FinBPM, which takes into account both the factor pertaining to the company’s operations and the factor of speculators’ irrational investment. For our experiments, we randomly selected twelve stocks from the Dow Jones Industrial Index to construct our portfolio. The experimental results reveal that our approach achieves at least a 13.26% improvement over the compared conventional reinforcement learning methods. It also achieved the best Sharpe ratio of 2.77, effectively maximizing the return per unit of risk.
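For context on the risk-adjusted metric cited above, the Sharpe ratio measures mean excess return per unit of return volatility. A standard annualized computation (an illustrative sketch, not the paper's evaluation code; the sample returns below are synthetic) looks like this:

```python
import numpy as np

def sharpe_ratio(returns, risk_free_annual=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return per unit of volatility."""
    excess = np.asarray(returns) - risk_free_annual / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

# Synthetic daily portfolio returns over one trading year.
rng = np.random.default_rng(7)
daily = rng.normal(loc=0.001, scale=0.01, size=252)
print(f"Sharpe ratio: {sharpe_ratio(daily):.2f}")
```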