Weizhi Fei
2024
Extending Context Window of Large Language Models via Semantic Compression
Weizhi Fei
|
Xueyan Niu
|
Pingyi Zhou
|
Lu Hou
|
Bo Bai
|
Lei Deng
|
Wei Han
Findings of the Association for Computational Linguistics: ACL 2024
Transformer based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses due to the quadratic complexity. These constraints restrict their applicability in long text scenarios. In this paper, we propose a novel semantic compression method that enables generalization to texts that are 6-8 times longer without incurring significant computational costs or requiring fine-tuning. Our proposed framework draws inspiration from source coding in information theory and employs a pre-trained model to reduce the semantic redundancy of long inputs before passing them to the LLMs for downstream tasks. Experimental results demonstrate that our method effectively extends the context window of LLMs across a range of tasks including question answering, summarization, few-shot learning, and information retrieval. Furthermore, the proposed semantic compression method exhibits consistent fluency in text generation while reducing the associated computational overhead.
2023
Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport
Zihao Wang
|
Weizhi Fei
|
Hang Yin
|
Yangqiu Song
|
Ginny Wong
|
Simon See
Findings of the Association for Computational Linguistics: ACL 2023
Answering complex queries on knowledge graphs is important but particularly challenging because of the data incompleteness. Query embedding methods address this issue by learningbased models and simulating logical reasoning with set operators. Previous works focus on specific forms of embeddings, but scoring functions between embeddings are underexplored. In contrast to existing scorning functions motivated by local comparison or global transport, this work investigates the local and global trade-off with unbalanced optimal transport theory. Specifically, we embed sets as bounded measures in R endowed with a scoring function motivated by the Wasserstein-Fisher-Rao metric. Such a design also facilitates closed-form set operators in the embedding space. Moreover, we introduce a convolution-based algorithm for linear time computation and a block diagonal kernel to enforce the trade-off. Results show that WFRE is capable of outperforming existing query embedding methods on standard datasets, evaluation sets with combinatorially complex queries, and hierarchical knowledge graphs. Ablation study shows that finding a better local and global trade-off is essential for performance improvement.