

Learning to Compress Prompts with Gist Tokens

17 April 2023
Jesse Mu, Xiang Lisa Li, Noah D. Goodman
VLM
arXiv:2304.08467

Papers citing "Learning to Compress Prompts with Gist Tokens"

50 / 164 papers shown
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang
ALM
27 Sep 2024

More Effective LLM Compressed Tokens with Uniformly Spread Position Identifiers and Compression Loss
Runsong Zhao, Pengcheng Huang, Xinyu Liu, Chunyang Xiao, Tong Xiao, Jingbo Zhu
22 Sep 2024

Contextual Compression in Retrieval-Augmented Generation for Large Language Models: A Survey
Sourav Verma
RALM, 3DV
20 Sep 2024

TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning
Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle
19 Sep 2024

Semformer: Transformer Language Models with Semantic Planning
Yongjing Yin, Junran Ding, Kai Song, Yue Zhang
17 Sep 2024

E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning
Zihan Liao, Jun Wang, Hang Yu, Lingxiao Wei, Jianguo Li, Jun Wang, Wei Zhang
10 Sep 2024

Squid: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Wei Chen, Zhiyuan Li, Shuo Xin, Yihao Wang
28 Aug 2024

Enhancing and Accelerating Large Language Models via Instruction-Aware Contextual Compression
Haowen Hou, Fei Ma, Binwen Bai, Xinxin Zhu, Fei Yu
28 Aug 2024

Writing in the Margins: Better Inference Pattern for Long Context Retrieval
M. Russak, Umar Jamil, Christopher Bryant, Kiran Kamble, Axel Magnuson, Mateusz Russak, Waseem Alshikh
27 Aug 2024

Personalizing Federated Instrument Segmentation with Visual Trait Priors in Robotic Surgery
Jialang Xu, Jiacheng Wang, Lequan Yu, Danail Stoyanov, Yueming Jin, E. Mazomenos
06 Aug 2024

500xCompressor: Generalized Prompt Compression for Large Language Models
Zongqian Li, Yixuan Su, Nigel Collier
MQ
06 Aug 2024

Finch: Prompt-guided Key-Value Cache Compression
Giulio Corallo, Paolo Papotti
31 Jul 2024

Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu, Benlin Liu, Jiahui Wang, Yuhao Dong, Guangyi Chen, Yongming Rao, Ranjay Krishna, Jiwen Lu
VLM
25 Jul 2024

Keep the Cost Down: A Review on Methods to Optimize LLM's KV-Cache Consumption
Shi Luohe, Hongyi Zhang, Yao Yao, Z. Li, Zhao Hai
25 Jul 2024

CompAct: Compressing Retrieved Documents Actively for Question Answering
Chanwoong Yoon, Taewhoo Lee, Hyeon Hwang, Minbyul Jeong, Jaewoo Kang
KELM, RALM, MQ
12 Jul 2024

PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning
Jiaru Zou, Mengyu Zhou, Tao Li, Shi Han, Dongmei Zhang
02 Jul 2024

KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches
Jiayi Yuan, Hongyi Liu, Shaochen Zhong, Yu-Neng Chuang, ..., Hongye Jin, V. Chaudhary, Zhaozhuo Xu, Zirui Liu, Xia Hu
01 Jul 2024

UIO-LLMs: Unbiased Incremental Optimization for Long-Context LLMs
Wenhao Li, Mingbao Lin, Yunshan Zhong, Shuicheng Yan, Rongrong Ji
26 Jun 2024

From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models
Sean Welleck, Amanda Bertsch, Matthew Finlayson, Hailey Schoelkopf, Alex Xie, Graham Neubig, Ilia Kulikov, Zaid Harchaoui
24 Jun 2024

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning
Brandon Huang, Chancharik Mitra, Assaf Arbelle, Leonid Karlinsky, Trevor Darrell, Roei Herzig
21 Jun 2024

Leveraging Passage Embeddings for Efficient Listwise Reranking with Large Language Models
Qi Liu, Bo Wang, Nan Wang, Jiaxin Mao
RALM
21 Jun 2024

DeciMamba: Exploring the Length Extrapolation Potential of Mamba
Assaf Ben-Kish, Itamar Zimerman, Shady Abu Hussein, Nadav Cohen, Amir Globerson, Lior Wolf, Raja Giryes
Mamba
20 Jun 2024

In-Context Former: Lightning-fast Compressing Context for Large Language Model
Xiangfeng Wang, Zaiyi Chen, Zheyong Xie, Tong Bill Xu, Yongyi He, Enhong Chen
19 Jun 2024

VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye, Yukang Gan, Xiaoke Huang, Yixiao Ge, Yansong Tang
MLLM, VLM
18 Jun 2024

SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Qianchao Zhu, Jiangfei Duan, Chang Chen, Siran Liu, Xiuhong Li, ..., Huanqi Cao, Xiao Chuanfu, Xingcheng Zhang, Dahua Lin, Chao Yang
17 Jun 2024

In-Context Editing: Learning Knowledge from Self-Induced Distributions
Siyuan Qi, Bangcheng Yang, Kailin Jiang, Xiaobo Wang, Jiaqi Li, Yifan Zhong, Yaodong Yang, Zilong Zheng
KELM
17 Jun 2024

Taking a Deep Breath: Enhancing Language Modeling of Large Language Models with Sentinel Tokens
Weiyao Luo, Suncong Zheng, Heming Xia, Weikang Wang, Yan Lei, Tianyu Liu, Shuang Chen, Zhifang Sui
16 Jun 2024

AIM: Let Any Multi-modal Large Language Models Embrace Efficient In-Context Learning
Jun Gao, Qian Qiao, Ziqiang Cao, Zili Wang, Wenjie Li
11 Jun 2024

Recurrent Context Compression: Efficiently Expanding the Context Window of LLM
Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang
10 Jun 2024

Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning
Alex Jinpeng Wang, Linjie Li, Yiqi Lin, Min Li, Lijuan Wang, Mike Zheng Shou
VLM
04 Jun 2024

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
Zhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su
04 Jun 2024

Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach
Sambaran Bandyopadhyay, Himanshu Maheshwari, Anandhavelu Natarajan, Apoorv Saxena
01 Jun 2024

Passage-specific Prompt Tuning for Passage Reranking in Question Answering with Large Language Models
Xuyang Wu, Zhiyuan Peng, Sravanthi Rajanala, Hsin-Tai Wu, Yi Fang
LRM, RALM
31 May 2024

Expert-Guided Extinction of Toxic Tokens for Debiased Generation
Xueyao Sun, Kaize Shi, Haoran Tang, Guandong Xu, Qing Li
MU
29 May 2024

Unifying Demonstration Selection and Compression for In-Context Learning
Jun Gao, Ziqiang Cao, Wenjie Li
27 May 2024

SelfCP: Compressing Over-Limit Prompt via the Frozen Large Language Model Itself
Jun Gao, Ziqiang Cao, Wenjie Li
27 May 2024

Compressing Lengthy Context With UltraGist
Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou
26 May 2024

xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token
Xin Cheng, Xun Wang, Xingxing Zhang, Tao Ge, Si-Qing Chen, Furu Wei, Huishuai Zhang, Dongyan Zhao
22 May 2024

The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving
Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan
LLMAG
18 May 2024

Layer-Condensed KV Cache for Efficient Inference of Large Language Models
Haoyi Wu, Kewei Tu
MQ
17 May 2024

Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation
Kaize Shi, Xueyao Sun, Qing Li, Guandong Xu
06 May 2024

Lifelong Knowledge Editing for LLMs with Retrieval-Augmented Continuous Prompt Learning
Qizhou Chen, Taolin Zhang, Xiaofeng He, Dongyang Li, Chengyu Wang, Longtao Huang, Hui Xue
CLL, KELM
06 May 2024

URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression
Zhuoqun Li, Hongyu Lin, Tianshu Wang, Boxi Cao, Yaojie Lu, Weixiang Zhou, Hao Wang, Zhenyu Zeng, Le Sun, Xianpei Han
24 Apr 2024

A Survey on Efficient Inference for Large Language Models
Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, ..., Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu-Xiang Wang
22 Apr 2024

ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval
Kelong Mao, Chenlong Deng, Haonan Chen, Fengran Mo, Zheng Liu, Tetsuya Sakai, Zhicheng Dou
KELM
21 Apr 2024

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment
Zhaofeng Wu, Ananth Balashankar, Yoon Kim, Jacob Eisenstein, Ahmad Beirami
18 Apr 2024

In-Context Learning State Vector with Inner and Momentum Optimization
Dongfang Li, Zhenyu Liu, Xinshuo Hu, Zetian Sun, Baotian Hu, Min Zhang
17 Apr 2024

TransformerFAM: Feedback attention is working memory
Dongseong Hwang, Weiran Wang, Zhuoyuan Huo, K. Sim, P. M. Mengibar
14 Apr 2024

LLoCO: Learning Long Contexts Offline
Sijun Tan, Xiuyu Li, Shishir G. Patil, Ziyang Wu, Tianjun Zhang, Kurt Keutzer, Joseph E. Gonzalez, Raluca A. Popa
RALM, OffRL, LLMAG
11 Apr 2024

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
Tsendsuren Munkhdalai, Manaal Faruqui, Siddharth Gopal
LRM, LLMAG, CLL
10 Apr 2024