QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization

Main: 12 pages · Bibliography: 4 pages · Appendix: 3 pages · 10 figures · 8 tables
Abstract
This technical report presents QwenLong-CPRS, a context compression framework designed for explicit long-context optimization. It addresses two challenges that large language models (LLMs) face during long-sequence processing: the prohibitive computation overhead of the prefill stage and the "lost in the middle" performance degradation. Implemented through a novel dynamic context optimization mechanism, QwenLong-CPRS performs multi-granularity context compression guided by natural language instructions, achieving both efficiency gains and improved performance.
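To make the abstract's interface concrete, here is a minimal conceptual sketch of instruction-guided, multi-granularity context compression. It is not the paper's method: QwenLong-CPRS is a trained neural compressor, whereas this toy uses simple lexical overlap as a stand-in relevance score, and every name here (`compress_context`, `granularity`, `keep_ratio`) is a hypothetical illustration of the idea, not the actual API.

```python
# Conceptual sketch only: QwenLong-CPRS is a trained neural compressor;
# this toy function merely illustrates the shape of instruction-guided,
# multi-granularity compression. All names below are hypothetical.
import re


def compress_context(context: str, instruction: str,
                     granularity: str = "sentence",
                     keep_ratio: float = 0.3) -> str:
    """Keep the fraction of context units most relevant to the instruction.

    granularity: "sentence" or "word" -- the unit of compression.
    keep_ratio:  fraction of units retained (a crude stand-in for the
                 model's learned, dynamic token-selection policy).
    """
    # Split the context into units at the requested granularity.
    if granularity == "sentence":
        units = re.split(r"(?<=[.!?])\s+", context.strip())
    else:
        units = context.split()

    # Toy relevance score: lexical overlap with the instruction.
    # The real system scores spans with a trained language model instead.
    query_terms = set(instruction.lower().split())
    scored = [(sum(w.lower().strip(".,?!") in query_terms for w in u.split()),
               i, u)
              for i, u in enumerate(units)]

    # Retain the top-scoring units, then restore their original order.
    n_keep = max(1, int(len(units) * keep_ratio))
    kept = sorted(sorted(scored, reverse=True)[:n_keep], key=lambda t: t[1])
    return " ".join(u for _, _, u in kept)


if __name__ == "__main__":
    doc = ("The model was trained on long documents. Compression reduces "
           "prefill cost. The weather was pleasant that day. Dynamic "
           "selection keeps only instruction-relevant spans.")
    print(compress_context(doc, "How does compression reduce prefill cost?"))
```

The sketch keeps the two properties the abstract highlights: the unit of compression is adjustable (multi-granularity), and what survives depends on the natural language instruction rather than a fixed truncation rule.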