Puyang Xu


2024

pdf bib
Planning and Editing What You Retrieve for Enhanced Tool Learning
Tenghao Huang | Dongwon Jung | Vaibhav Kumar | Mohammad Kachuee | Xiang Li | Puyang Xu | Muhao Chen
Findings of the Association for Computational Linguistics: NAACL 2024

Recent advancements in integrating external tools with Large Language Models (LLMs) have opened new frontiers, with applications in mathematical reasoning, code generators, and smart assistants. However, existing methods, relying on simple one-time retrieval strategies, fall short on effectively and accurately shortlisting relevant tools. This paper introduces a novel PLUTO (Planning, Learning, and Understanding for TOols) approach, encompassing “Plan-and-Retrieve (P&R)” and “Edit-and-Ground (E&G)” paradigms. The P&R paradigm consists of a neural retrieval module for shortlisting relevant tools and an LLM-based query planner that decomposes complex queries into actionable tasks, enhancing the effectiveness of tool utilization. The E&G paradigm utilizes LLMs to enrich tool descriptions based on user scenarios, bridging the gap between user queries and tool functionalities. Experiment results demonstrate that these paradigms significantly improve the recall and NDCG in tool retrieval tasks, significantly surpassing current state-of-the-art models.

2023

pdf bib
Ranking-Enhanced Unsupervised Sentence Representation Learning
Yeon Seonwoo | Guoyin Wang | Changmin Seo | Sajal Choudhary | Jiwei Li | Xiang Li | Puyang Xu | Sunghyun Park | Alice Oh
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Unsupervised sentence representation learning has progressed through contrastive learning and data augmentation methods such as dropout masking. Despite this progress, sentence encoders are still limited to using only an input sentence when predicting its semantic vector. In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence. Based on this finding, we propose a novel unsupervised sentence encoder, RankEncoder. RankEncoder predicts the semantic vector of an input sentence by leveraging its relationship with other sentences in an external corpus, as well as the input sentence itself. We evaluate RankEncoder on semantic textual benchmark datasets. From the experimental results, we verify that 1) RankEncoder achieves 80.07% Spearman’s correlation, a 1.1% absolute improvement compared to the previous state-of-the-art performance, 2) RankEncoder is universally applicable to existing unsupervised sentence embedding methods, and 3) RankEncoder is specifically effective for predicting the similarity scores of similar sentence pairs.

2022

pdf bib
Open World Classification with Adaptive Negative Samples
Ke Bai | Guoyin Wang | Jiwei Li | Sunghyun Park | Sungjin Lee | Puyang Xu | Ricardo Henao | Lawrence Carin
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Open world classification is a task in natural language processing with key practical relevance and impact.Since the open or unknown category data only manifests in the inference phase, finding a model with a suitable decision boundary accommodating for the identification of known classes and discrimination of the open category is challenging.The performance of existing models is limited by the lack of effective open category data during the training stage or the lack of a good mechanism to learn appropriate decision boundaries.We propose an approach based on Adaptive Negative Samples (ANS) designed to generate effective synthetic open category samples in the training stage and without requiring any prior knowledge or external datasets.Empirically, we find a significant advantage in using auxiliary one-versus-rest binary classifiers, which effectively utilize the generated negative samples and avoid the complex threshold-seeking stage in previous works.Extensive experiments on three benchmark datasets show that ANS achieves significant improvements over state-of-the-art methods.

2018

pdf bib
An End-to-end Approach for Handling Unknown Slot Values in Dialogue State Tracking
Puyang Xu | Qi Hu
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We highlight a practical yet rarely discussed problem in dialogue state tracking (DST), namely handling unknown slot values. Previous approaches generally assume predefined candidate lists and thus are not designed to output unknown values, especially when the spoken language understanding (SLU) module is absent as in many end-to-end (E2E) systems. We describe in this paper an E2E architecture based on the pointer network (PtrNet) that can effectively extract unknown slot values while still obtains state-of-the-art accuracy on the standard DSTC2 benchmark. We also provide extensive empirical evidence to show that tracking unknown values can be challenging and our approach can bring significant improvement with the help of an effective feature dropout technique.

2011

pdf bib
Efficient Subsampling for Training Complex Language Models
Puyang Xu | Asela Gunawardana | Sanjeev Khudanpur
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing