ResearchTrend.AI
arXiv: 2407.08223
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

11 July 2024
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
Vincent Perot
Yuwei Zhang
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
    RALM

Papers citing "Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting"

27 / 27 papers shown
AlignRAG: An Adaptable Framework for Resolving Misalignments in Retrieval-Aware Reasoning of RAG
Jiaqi Wei
Hao Zhou
Xiang Zhang
Di Zhang
Zijie Qiu
Wei Wei
Jinzhe Li
Wanli Ouyang
Siqi Sun
19
0
0
21 Apr 2025
The Other Side of the Coin: Exploring Fairness in Retrieval-Augmented Generation
Z. Zhang
Ning Li
Qi Liu
Rui Li
W. Gao
Qingyang Mao
Zhenya Huang
Baosheng Yu
Dacheng Tao
RALM
32
0
0
11 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Venkataramana Runkana
OffRL
33
1
0
02 Apr 2025
Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models
Haochen Liu
Song Wang
Chen Chen
J. Li
RALM
KELM
59
0
0
30 Mar 2025
Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
Zitian Wang
Yue Liao
Kang Rong
Fengyun Rao
Yibo Yang
Si Liu
70
0
0
26 Mar 2025
CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation
Nengbo Wang
Xiaotian Han
Jagdip Singh
Jing Ma
V. Chaudhary
40
0
0
25 Mar 2025
OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
Jiawei Zhou
Lei Chen
3DV
VLM
70
0
0
11 Mar 2025
Leveraging Approximate Caching for Faster Retrieval-Augmented Generation
Shai Bergman
Zhang Ji
Anne-Marie Kermarrec
Diana Petrescu
Rafael Pires
Mathis Randl
M. Vos
29
0
0
07 Mar 2025
Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
Heming Xia
Cunxiao Du
Y. Li
Qian Liu
Wenjie Li
34
0
0
01 Mar 2025
TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval
Chien-Yu Lin
Keisuke Kamahori
Yiyu Liu
Xiaoxiang Shi
Madhav Kashyap
...
Stephanie Wang
Arvind Krishnamurthy
Rohan Kadekodi
Luis Ceze
Baris Kasikci
3DV
VLM
50
0
0
28 Feb 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques
Y. Hu
Zining Liu
Zhenyuan Dong
Tianfan Peng
Bradley McDanel
S. Zhang
82
0
0
27 Feb 2025
URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT
Long Nguyen
Tho Quan
36
0
0
27 Jan 2025
An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
Hashmath Shaik
Alex Doboli
OffRL
ELM
44
0
0
31 Dec 2024
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Haozhao Wang
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
107
1
0
18 Dec 2024
SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference
Gabriele Oliaro
Zhihao Jia
Daniel F Campos
Aurick Qiao
LRM
24
0
0
07 Nov 2024
Rationale-Guided Retrieval Augmented Generation for Medical Question Answering
Jiwoong Sohn
Yein Park
Chanwoong Yoon
Sihyeon Park
Hyeon Hwang
Mujeen Sung
Hyunjae Kim
Jaewoo Kang
RALM
50
0
0
01 Nov 2024
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng
Zifeng Wang
Yike Wang
Sayna Ebrahimi
Hamid Palangi
...
Nathalie Rauschmayr
Yejin Choi
Yulia Tsvetkov
Chen-Yu Lee
Tomas Pfister
MoMe
25
3
0
15 Oct 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
Fei Wang
Xingchen Wan
Ruoxi Sun
Jiefeng Chen
Sercan Ö. Arık
RALM
25
6
0
09 Oct 2024
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints
Thomas Palmeira Ferraz
Kartik Mehta
Yu-Hsiang Lin
Haw-Shiuan Chang
Shereen Oraby
Sijia Liu
Vivek Subramanian
Tagyoung Chung
Mohit Bansal
Nanyun Peng
30
1
0
09 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao
Hongming Zhang
Tao Ge
Siru Ouyang
Vicente Ordonez
Dong Yu
28
1
0
08 Oct 2024
Answer is All You Need: Instruction-following Text Embedding via Answering the Question
Letian Peng
Yuwei Zhang
Zilong Wang
Jayanth Srinivasa
Gaowen Liu
Zihan Wang
Jingbo Shang
29
6
0
15 Feb 2024
Corrective Retrieval Augmented Generation
Shi-Qi Yan
Jia-Chen Gu
Yun Zhu
Zhen-Hua Ling
RALM
72
70
0
29 Jan 2024
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
W. Yu
Hongming Zhang
Xiaoman Pan
Kaixin Ma
Hongwei Wang
Dong Yu
KELM
RALM
LRM
51
101
0
15 Nov 2023
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai
Zeqiu Wu
Yizhong Wang
Avirup Sil
Hannaneh Hajishirzi
RALM
135
600
0
17 Oct 2023
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Jian Xie
Kai Zhang
Jiangjie Chen
Renze Lou
Yu-Chuan Su
RALM
198
75
0
22 May 2023
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
196
283
0
03 May 2023
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
291
2,712
0
24 May 2022