ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2310.05736
  4. Cited By
LLMLingua: Compressing Prompts for Accelerated Inference of Large
  Language Models

LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models

9 October 2023
Huiqiang Jiang
Qianhui Wu
Chin-Yew Lin
Yuqing Yang
Lili Qiu
ArXivPDFHTML

Papers citing "LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models"

20 / 70 papers shown
Title
A Survey on Efficient Inference for Large Language Models
A Survey on Efficient Inference for Large Language Models
Zixuan Zhou
Xuefei Ning
Ke Hong
Tianyu Fu
Jiaming Xu
...
Shengen Yan
Guohao Dai
Xiao-Ping Zhang
Yuhan Dong
Yu-Xiang Wang
46
80
0
22 Apr 2024
LongEmbed: Extending Embedding Models for Long Context Retrieval
LongEmbed: Extending Embedding Models for Long Context Retrieval
Dawei Zhu
Liang Wang
Nan Yang
Yifan Song
Wenhao Wu
Furu Wei
Sujian Li
RALM
35
21
0
18 Apr 2024
Efficient Prompting Methods for Large Language Models: A Survey
Efficient Prompting Methods for Large Language Models: A Survey
Kaiyan Chang
Songcheng Xu
Chenglong Wang
Yingfeng Luo
Tong Xiao
Jingbo Zhu
LRM
30
32
0
01 Apr 2024
PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt
  Compression
PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression
Muhammad Asif Ali
Zhengping Li
Shu Yang
Keyuan Cheng
Yang Cao
Tianhao Huang
Lijie Hu
Lu Yu
Di Wang
VLM
RALM
32
9
0
30 Mar 2024
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient
  Task Adaptation
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Yizhe Xiong
Hui Chen
Tianxiang Hao
Zijia Lin
Jungong Han
Yuesong Zhang
Guoxin Wang
Yongjun Bao
Guiguang Ding
43
16
0
14 Mar 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Penghao Zhao
Hailin Zhang
Qinhan Yu
Zhengren Wang
Yunteng Geng
Fangcheng Fu
Ling Yang
Wentao Zhang
Jie Jiang
Bin Cui
3DV
110
220
0
29 Feb 2024
FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and
  Scientific Use
FhGenie: A Custom, Confidentiality-preserving Chat AI for Corporate and Scientific Use
Ingo Weber
Hendrik Linka
Daniel Mertens
Tamara Muryshkin
Heinrich Opgenoorth
Stefan Langer
SILM
19
3
0
29 Feb 2024
Beyond the Limits: A Survey of Techniques to Extend the Context Length
  in Large Language Models
Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models
Xindi Wang
Mahsa Salmani
Parsa Omidi
Xiangyu Ren
Mehdi Rezagholizadeh
A. Eshaghi
LRM
29
35
0
03 Feb 2024
LoMA: Lossless Compressed Memory Attention
LoMA: Lossless Compressed Memory Attention
Yumeng Wang
Zhenyang Xiao
14
3
0
16 Jan 2024
Flexibly Scaling Large Language Models Contexts Through Extensible
  Tokenization
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization
Ninglu Shao
Shitao Xiao
Zheng Liu
Peitian Zhang
23
4
0
15 Jan 2024
Retrieval-Augmented Generation for Large Language Models: A Survey
Retrieval-Augmented Generation for Large Language Models: A Survey
Yunfan Gao
Yun Xiong
Xinyu Gao
Kangxiang Jia
Jinliu Pan
Yuxi Bi
Yi Dai
Jiawei Sun
Meng Wang
Haofen Wang
3DV
RALM
52
1,499
1
18 Dec 2023
A Survey on Hallucination in Large Language Models: Principles,
  Taxonomy, Challenges, and Open Questions
A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
Lei Huang
Weijiang Yu
Weitao Ma
Weihong Zhong
Zhangyin Feng
...
Qianglong Chen
Weihua Peng
Xiaocheng Feng
Bing Qin
Ting Liu
LRM
HILM
31
714
0
09 Nov 2023
PromptAgent: Strategic Planning with Language Models Enables
  Expert-level Prompt Optimization
PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization
Xinyuan Wang
Chenxi Li
Zhen Wang
Fan Bai
Haotian Luo
Jiayou Zhang
Nebojsa Jojic
Eric P. Xing
Zhiting Hu
23
98
0
25 Oct 2023
CacheGen: KV Cache Compression and Streaming for Fast Language Model
  Serving
CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving
Yuhan Liu
Hanchen Li
Yihua Cheng
Siddhant Ray
Yuyang Huang
...
Ganesh Ananthanarayanan
Michael Maire
Henry Hoffmann
Ari Holtzman
Junchen Jiang
50
41
0
11 Oct 2023
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios
  via Prompt Compression
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
Huiqiang Jiang
Qianhui Wu
Xufang Luo
Dongsheng Li
Chin-Yew Lin
Yuqing Yang
Lili Qiu
RALM
101
182
0
10 Oct 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA
  Composition
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang
Qian Liu
Bill Yuchen Lin
Tianyu Pang
Chao Du
Min-Bin Lin
MoMe
26
182
0
25 Jul 2023
In-context Autoencoder for Context Compression in a Large Language Model
In-context Autoencoder for Context Compression in a Large Language Model
Tao Ge
Jing Hu
Lei Wang
Xun Wang
Si-Qing Chen
Furu Wei
RALM
32
66
0
13 Jul 2023
Complexity-Based Prompting for Multi-Step Reasoning
Complexity-Based Prompting for Multi-Step Reasoning
Yao Fu
Hao-Chun Peng
Ashish Sabharwal
Peter Clark
Tushar Khot
ReLM
LRM
162
411
0
03 Oct 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022
Types of Out-of-Distribution Texts and How to Detect Them
Types of Out-of-Distribution Texts and How to Detect Them
Udit Arora
William Huang
He He
OODD
207
97
0
14 Sep 2021
Previous
12