Scaling Retrieval-Based Language Models with a Trillion-Token Datastore

9 July 2024
Rulin Shao
Jacqueline He
Akari Asai
Weijia Shi
Tim Dettmers
Sewon Min
Luke Zettlemoyer
Pang Wei Koh
    RALM

Papers citing "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore"

25 / 25 papers shown
Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies
Massimiliano Pronesti
Joao Bettencourt-Silva
Paul Flanagan
Alessandra Pascale
Oisin Redmond
Anya Belz
Yufang Hou
32
0
0
09 May 2025
VeriDebug: A Unified LLM for Verilog Debugging via Contrastive Embedding and Guided Correction
N. Wang
Bingkun Yao
Jie Zhou
Yuchen Hu
Xi Wang
Nan Guan
Zhe Jiang
31
0
0
27 Apr 2025
Efficient Distributed Retrieval-Augmented Generation for Enhancing Language Model Performance
S. Liu
Zhenzhe Zheng
Xiaoyao Huang
Fan Wu
Guihai Chen
Jie Wu
25
0
0
15 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
36
1
0
03 Apr 2025
OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
Jiawei Zhou
Lei Chen
3DV
VLM
75
0
0
11 Mar 2025
Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence
Mohsen Fayyaz
Ali Modarressi
Hinrich Schuetze
Nanyun Peng
52
0
0
06 Mar 2025
TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval
Chien-Yu Lin
Keisuke Kamahori
Yiyu Liu
Xiaoxiang Shi
Madhav Kashyap
...
Stephanie Wang
Arvind Krishnamurthy
Rohan Kadekodi
Luis Ceze
Baris Kasikci
3DV
VLM
64
1
0
28 Feb 2025
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents
Qiuchen Wang
Ruixue Ding
Zehui Chen
Weiqi Wu
Shihang Wang
Pengjun Xie
Feng Zhao
56
1
0
25 Feb 2025
Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective
Chengyin Xu
Kaiyuan Chen
Xiao Li
Ke Shen
Chenggang Li
OffRL
41
0
0
24 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
48
2
0
21 Feb 2025
HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation
Hao Liu
Zhengren Wang
Xi Chen
Z. Li
Feiyu Xiong
Qinhan Yu
W. Zhang
LRM
49
3
0
18 Feb 2025
How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines
Ayan Sengupta
Yash Goel
Tanmoy Chakraborty
41
0
0
17 Feb 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM
LRM
76
207
0
03 Jan 2025
What External Knowledge is Preferred by LLMs? Characterizing and Exploring Chain of Evidence in Imperfect Context
Zhiyuan Chang
Mingyang Li
Xiaojun Jia
Junjie Wang
Yuekai Huang
Qing Wang
Yihao Huang
Yang Liu
99
0
0
17 Dec 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
43
7
0
11 Oct 2024
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
Fei Wang
Xingchen Wan
Ruoxi Sun
Jiefeng Chen
Sercan Ö. Arık
RALM
32
7
0
09 Oct 2024
Accelerating Inference of Networks in the Frequency Domain
Chenqiu Zhao
Guanfang Dong
Anup Basu
33
10
0
06 Oct 2024
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities
To Eun Kim
Alireza Salemi
Andrew Drozdov
Fernando Diaz
Hamed Zamani
48
7
0
17 Jul 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
103
40
0
13 Mar 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
130
349
0
01 Feb 2024
Paloma: A Benchmark for Evaluating Language Model Fit
Ian H. Magnusson
Akshita Bhagia
Valentin Hofmann
Luca Soldaini
A. Jha
...
Iz Beltagy
Hanna Hajishirzi
Noah A. Smith
Kyle Richardson
Jesse Dodge
132
21
0
16 Dec 2023
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai
Zeqiu Wu
Yizhong Wang
Avirup Sil
Hannaneh Hajishirzi
RALM
144
600
0
17 Oct 2023
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
237
588
0
14 Jul 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
245
1,977
0
31 Dec 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
226
4,424
0
23 Jan 2020