2024
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Industry Systems
Yilun Kong | Jingqing Ruan | YiHong Chen | Bin Zhang | Tianpeng Bao | Shi Shiwei | du Guo Qing | Xiaoru Hu | Hangyu Mao | Ziyue Li | Xingyu Zeng | Rui Zhao | Xueqian Wang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track
Large Language Models (LLMs) have demonstrated proficiency in addressing tasks that require a combination of task planning and the use of external tools, such as weather and calculator APIs. However, real-world industrial systems present prevalent challenges in task planning and tool usage: the large number of APIs in a real system makes it intricate to invoke the appropriate one, while the inherent limitations of LLMs make it difficult to orchestrate an accurate sub-task sequence and API-calling order. This paper introduces a comprehensive framework aimed at enhancing the Task Planning and Tool Usage (TPTU) abilities of LLM-based agents in industry. Our framework comprises three key components designed to address these challenges: (1) the API Retriever selects the most pertinent APIs from the extensive API set; (2) the Demo Selector retrieves task-level demonstrations, which are then used for in-context learning to help LLMs accurately decompose sub-tasks and effectively invoke hard-to-distinguish APIs; (3) the LLM Finetuner tunes a base LLM to enhance its capability for task planning and API calling. We validate our methods on a real-world industry system and an open-source academic dataset, demonstrating the efficacy of each individual component as well as the integrated framework. The code is available here.
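As a rough, illustrative sketch of the kind of embedding-based retrieval an API Retriever might perform (not the paper's implementation), the snippet below ranks API descriptions by cosine similarity to the user query; the `embed()` helper and the toy API set are placeholders, not part of the original work.

```python
# Sketch: rank candidate APIs by similarity between the query and each API description.
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: deterministic random unit vector per text.
    In practice this would be any off-the-shelf sentence-embedding model."""
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=384)
    return v / np.linalg.norm(v)

def retrieve_apis(query: str, api_docs: dict[str, str], top_k: int = 5) -> list[str]:
    """Return the top_k API names whose descriptions are closest to the query."""
    q = embed(query)
    scores = {name: float(embed(doc) @ q) for name, doc in api_docs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

apis = {
    "get_weather": "Return the current weather for a given city.",
    "calculator": "Evaluate an arithmetic expression.",
    "send_email": "Send an email to a recipient with subject and body.",
}
print(retrieve_apis("What is 23 * 17?", apis, top_k=2))
```

The retrieved subset would then be placed into the agent's prompt, keeping the context small even when the full API catalogue is large.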
Guiding Large Language Models via External Attention Prompting for Scientific Extreme Summarization
Yuan Chang | Ziyue Li | Xiaoqiu Le
Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)
Scientific extreme summarization, the task of generating concise one-sentence summaries (TLDRs) for scientific papers, presents significant challenges due to the need for deep domain-specific understanding and the ability to distill salient information. Through quantitative analysis, this study identifies the critical role of titles and keywords in enhancing TLDR generation. We propose a novel method, External Attention Prompting (EAP), which guides large language models (LLMs) to focus on the most critical parts of the source text through varying degrees of attention signals. Our method employs Markdown emphasis syntax to annotate attention levels, enabling LLMs to prioritize salient information effectively. Extensive experiments demonstrate that EAP significantly outperforms baseline methods across various LLMs and metrics in both zero-shot and few-shot settings. Further evaluations by GPT-4 demonstrate that EAP enables LLMs to generate TLDRs of higher quality that align more closely with human judgments.
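The Markdown-emphasis idea can be illustrated with a small sketch: salient spans are wrapped in emphasis markers before being placed in the prompt. The attention levels, example phrases, and prompt wording below are illustrative assumptions, not the paper's exact annotation scheme.

```python
# Sketch: annotate attention levels with Markdown emphasis before prompting an LLM.
import re

def annotate(text: str, attention: dict[str, str]) -> str:
    """Wrap phrases with Markdown emphasis: 'high' -> **bold**, 'medium' -> *italic*."""
    marks = {"high": "**", "medium": "*"}
    for phrase, level in attention.items():
        mark = marks.get(level, "")
        if mark:
            text = re.sub(re.escape(phrase), f"{mark}{phrase}{mark}", text)
    return text

paper_text = "We introduce FooNet, a graph model for protein folding."
signals = {"FooNet": "high", "protein folding": "medium"}   # illustrative attention signals
prompt = ("Summarize the paper in one sentence (TLDR). "
          "Text marked with **bold** or *italic* is more important:\n\n"
          + annotate(paper_text, signals))
print(prompt)
```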
Simulating Expert Discussions with Multi-agent for Enhanced Scientific Problem Solving
Ziyue Li | Yuan Chang | Xiaoqiu Le
Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)
Large Language Models (LLMs) have shown remarkable potential across various domains, yet their application to complex scientific problems remains a formidable challenge. This paper presents a novel methodology to augment the problem-solving capabilities of LLMs by assigning them roles as domain-specific experts. We simulate a panel of experts in which each LLM is tasked with delivering professional and cautious responses to scientific inquiries. Our approach involves querying multiple LLMs and assessing the consistency of their responses. High agreement among the LLMs suggests greater confidence in the proposed solution, whereas discrepancies prompt a collaborative discussion among the LLMs to reach a consensus. This method emulates real-world scientific problem-solving processes, fostering a more reliable and robust mechanism for LLMs to tackle scientific questions. Our experimental results show that assigning roles to multiple LLMs as domain-specific experts significantly improves their accuracy and reliability in solving scientific problems. This framework has the potential to advance the application of AI in scientific research, enhancing its effectiveness and trustworthiness.
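A minimal sketch of this query-vote-discuss loop is given below. The `ask_llm` helper, the expert roles, and the canned answer are placeholders standing in for real chat-completion calls, and the loop structure is only an approximation of the paper's protocol.

```python
# Sketch: query several expert-role agents, accept a unanimous answer,
# otherwise feed all answers back for a discussion round.
from collections import Counter

def ask_llm(role: str, question: str, context: str = "") -> str:
    """Placeholder for a real chat-completion call with `role` as the system prompt."""
    return "the reaction is exothermic"   # canned stand-in answer for the demo

def solve(question: str, roles: list[str], max_rounds: int = 3) -> str:
    context = ""
    for _ in range(max_rounds):
        answers = {r: ask_llm(r, question, context) for r in roles}
        top, votes = Counter(answers.values()).most_common(1)[0]
        if votes == len(roles):           # full agreement -> high confidence
            return top
        # Disagreement: share every expert's answer and ask them to reconcile.
        context = "Previous expert answers:\n" + "\n".join(
            f"- {r}: {a}" for r, a in answers.items())
    return top                            # fall back to the majority answer

roles = ["You are a physical chemist.",
         "You are a thermodynamics expert.",
         "You are a chemical engineer."]
print(solve("Is the combustion of methane exothermic or endothermic?", roles))
```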