ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Learning to Compress Prompts with Gist Tokens

17 April 2023
Jesse Mu
Xiang Lisa Li
Noah D. Goodman
    VLM

Papers citing "Learning to Compress Prompts with Gist Tokens"

50 / 164 papers shown
RAVU: Retrieval Augmented Video Understanding with Compositional Reasoning over Graph
Sameer Malik, Moyuru Yamada, Ayush Singh, Dishank Aggarwal. 06 May 2025.

A Survey of Foundation Model-Powered Recommender Systems: From Feature-Based, Generative to Agentic Paradigms
Chengkai Huang, Hongtao Huang, Tong Yu, Kaige Xie, Junda Wu, Shuai Zhang, Julian McAuley, Dietmar Jannach, Lina Yao. LRM, AI4CE. 23 Apr 2025.

MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
Fengwei Zhou, Jiafei Song, Wenjin Jason Li, Gengjian Xue, Zhikang Zhao, Yichao Lu, Bailin Na. 23 Apr 2025.

Dynamic Compressing Prompts for Efficient Inference of Large Language Models
Jinwu Hu, W. Zhang, Yufeng Wang, Yu Hu, Bin Xiao, Mingkui Tan, Qing Du. 15 Apr 2025.

Long Context In-Context Compression by Getting to the Gist of Gisting
Aleksandar Petrov, Mark Sandler, A. Zhmoginov, Nolan Miller, Max Vladymyrov. 11 Apr 2025.

InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation
Bowen Cao, Deng Cai, W. Lam. CLL. 02 Apr 2025.

Understanding and Improving Information Preservation in Prompt Compression for LLMs
Weronika Łajewska, Momchil Hardalov, Laura Aina, Neha Anna John, Hang Su, Lluís Marquez. 24 Mar 2025.

Growing a Twig to Accelerate Large Vision-Language Models
Zhenwei Shao, Mingyang Wang, Zhou Yu, Wenwen Pan, Yan Yang, Tao Wei, H. Zhang, Ning Mao, Wei Chen, Jun Yu. VLM. 18 Mar 2025.

A Survey on Transformer Context Extension: Approaches and Evaluation
Yijun Liu, Jinzheng Yu, Yang Xu, Zhongyang Li, Qingfu Zhu. LLMAG. 17 Mar 2025.

reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs
Zhaofeng Wu, Michihiro Yasunaga, Andrew Cohen, Yoon Kim, Asli Celikyilmaz, Marjan Ghazvininejad. 14 Mar 2025.

KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
Vivek Chari, Guanghui Qin, Benjamin Van Durme. VLM. 13 Mar 2025.

AttentionRAG: Attention-Guided Context Pruning in Retrieval-Augmented Generation
Yixiong Fang, Tianran Sun, Yuling Shi, Xiaodong Gu. 13 Mar 2025.

EFPC: Towards Efficient and Flexible Prompt Compression
Yun-Hao Cao, Yangsong Wang, Shuzheng Hao, Zhenxing Li, Chengjun Zhan, Sichao Liu, Yi-Qi Hu. 11 Mar 2025.

Context-aware Biases for Length Extrapolation
Ali Veisi, Amir Mansourian. 11 Mar 2025.

Uncovering Gaps in How Humans and LLMs Interpret Subjective Language
Erik Jones, Arjun Patrawala, Jacob Steinhardt. 06 Mar 2025.

Learning to Substitute Components for Compositional Generalization
Z. Li, Gangwei Jiang, Chenwang Wu, Ying Wei, Defu Lian, Enhong Chen. 28 Feb 2025.

RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts
Mingyan Wu, Zhenghao Liu, Yukun Yan, Xinze Li, S. Yu, Zheni Zeng, Yu Gu, Ge Yu. RALM, AI4TS, LRM. 25 Feb 2025.

LightThinker: Thinking Step-by-Step Compression
Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, H. Chen, N. Zhang. LRM, LLMAG. 24 Feb 2025.

Representation Engineering for Large-Language Models: Survey and Research Challenges
Lukasz Bartoszcze, Sarthak Munshi, Bryan Sukidi, Jennifer Yen, Zejia Yang, David Williams-King, Linh Le, Kosi Asuzu, Carsten Maple. 24 Feb 2025.

A generative approach to LLM harmfulness detection with special red flag tokens
Sophie Xhonneux, David Dobre, Mehrnaz Mohfakhami, Leo Schwinn, Gauthier Gidel. 22 Feb 2025.

Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering
Rongzhi Zhu, Xiangyu Liu, Zequn Sun, Yiwei Wang, Wei Hu. LRM, RALM, KELM. 21 Feb 2025.

Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
Yuri Kuratov, M. Arkhipov, Aydar Bulatov, Mikhail Burtsev. 18 Feb 2025.

Following the Autoregressive Nature of LLM Embeddings via Compression and Alignment
Jingcheng Deng, Zhongtao Jiang, Liang Pang, Liwei Chen, Kun Xu, Zihao Wei, Huawei Shen, Xueqi Cheng. 17 Feb 2025.

Ten Challenging Problems in Federated Foundation Models
Tao Fan, Hanlin Gu, Xuemei Cao, Chee Seng Chan, Qian Chen, ..., Y. Zhang, Xiaojin Zhang, Zhenzhe Zheng, Lixin Fan, Qiang Yang. FedML. 14 Feb 2025.

LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs
Sumin An, Junyoung Sung, Wonpyo Park, Chanjun Park, Paul Hongsuck Seo. 10 Feb 2025.

Learning Task Representations from In-Context Learning
Baturay Saglam, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi. 08 Feb 2025.

Vision-centric Token Compression in Large Language Model
Ling Xing, Alex Jinpeng Wang, Rui Yan, J. Tang. VLM. 02 Feb 2025.

Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference
WeiZhi Fei, Xueyan Niu, Guoqing Xie, Yingqing Liu, Bo Bai, Wei Han. 22 Jan 2025.

A Survey of Research in Large Language Models for Electronic Design Automation
Jingyu Pan, Guanglei Zhou, Chen-Chia Chang, Isaac Jacobson, Jiang Hu, Y. Chen. 17 Jan 2025.

Better Prompt Compression Without Multi-Layer Perceptrons
Edouardo Honig, Andrew Lizarraga, Zijun Zhang, Ying Nian Wu. MQ. 12 Jan 2025.

Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Zhi Qu, Yiran Wang, Jiannan Mao, Chenchen Ding, Hideki Tanaka, Masao Utiyama, Taro Watanabe. LRM. 06 Jan 2025.

From Reading to Compressing: Exploring the Multi-document Reader for Prompt Compression
Eunseong Choi, Sunkyung Lee, Minjin Choi, June Park, Jongwuk Lee. 03 Jan 2025.

A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou. 23 Dec 2024.

Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models
Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, H. Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu. 21 Dec 2024.

Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu, Jinyu Chen, Peirong Zheng, Xiaoquan Yi, Tianyi Tian, ..., Quan Wan, Haozhao Wang, Yunfeng Fan, Qinliang Su, Xuemin Shen. AI4CE. 18 Dec 2024.

C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness
Yu Kang, Xianghui Sun, Liangyu Chen, Wei Zou. LRM. 16 Dec 2024.

BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression
Yuankai Li, Jia-Chen Gu, Di Wu, Kai-Wei Chang, Nanyun Peng. RALM, MQ. 20 Oct 2024.

Prompt Compression for Large Language Models: A Survey
Zongqian Li, Yinhong Liu, Yixuan Su, Nigel Collier. MQ. 16 Oct 2024.

Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability
Tsz Ting Chung, Leyang Cui, Lemao Liu, Xinting Huang, Shuming Shi, Dit-Yan Yeung. 15 Oct 2024.

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
Haotian Tang, Yecheng Wu, Shang Yang, Enze Xie, Junsong Chen, Junyu Chen, Zhuoyang Zhang, Han Cai, Y. Lu, Song Han. 14 Oct 2024.

Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning
Chengsong Huang, Langlin Huang, Jiaxin Huang. MoMe. 14 Oct 2024.

CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device
Yicheng Fu, R. Anantha, Jianpeng Cheng. LRM, LLMAG. 12 Oct 2024.

ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang, Jianhao Yan, Yue Zhang, Tao Lin. 12 Oct 2024.

Generation with Dynamic Vocabulary
Yanting Liu, Tao Ji, Changzhi Sun, Yuanbin Wu, Xiaoling Wang. 11 Oct 2024.

Fast State Restoration in LLM Serving with HCache
Shiwei Gao, Youmin Chen, Jiwu Shu. 07 Oct 2024.

MELODI: Exploring Memory Compression for Long Contexts
Yinpeng Chen, DeLesley Hutchins, Aren Jansen, Andrey Zhmoginov, David Racz, Jesper Andersen. 04 Oct 2024.

LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy
Rongzhi Zhang, Kuang Wang, Liyuan Liu, Shuohang Wang, Hao Cheng, Chao Zhang, Yelong Shen. MQ. 04 Oct 2024.

Distilling an End-to-End Voice Assistant Without Instruction Training Data
William B. Held, Ella Li, Michael Joseph Ryan, Weiyan Shi, Yanzhe Zhang, Diyi Yang. AuLLM. 03 Oct 2024.

Selective Attention Improves Transformer
Yaniv Leviathan, Matan Kalman, Yossi Matias. 03 Oct 2024.

Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices
Yuxiang Huang, Binhang Yuan, Xu Han, Chaojun Xiao, Zhiyuan Liu. RALM. 02 Oct 2024.