Improving language models by retrieving from trillions of tokens

8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM, RALM

Papers citing "Improving language models by retrieving from trillions of tokens"

50 / 893 papers shown
Copy Is All You Need
International Conference on Learning Representations (ICLR), 2023
Tian Lan
Deng Cai
Yan Wang
Heyan Huang
Xian-Ling Mao
244
32
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Lin Wang
OffRL
862
1,200
0
12 Jul 2023
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
International Conference on Language Resources and Evaluation (LREC), 2023
Yuzhuang Xu
Shuo Wang
Peng Li
Xuebo Liu
Xiaolong Wang
Weidong Liu
Yang Liu
342
1
0
12 Jul 2023
ReLoRA: High-Rank Training Through Low-Rank Updates
International Conference on Learning Representations (ICLR), 2023
Vladislav Lialin
Namrata Shivagunde
Sherin Muckatira
Anna Rumshisky
BDL
515
178
0
11 Jul 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP, VLM
486
2
0
10 Jul 2023
Focused Transformer: Contrastive Training for Context Scaling
Neural Information Processing Systems (NeurIPS), 2023
Szymon Tworkowski
Konrad Staniszewski
Mikolaj Pacek
Yuhuai Wu
Henryk Michalewski
Piotr Miłoś
235
165
0
06 Jul 2023
VerifAI: Verified Generative AI
Conference on Innovative Data Systems Research (CIDR), 2023
Nan Tang
Chenyu Yang
Ju Fan
Lei Cao
Yuyu Luo
Alon Halevy
226
27
0
06 Jul 2023
Citation: A Key to Building Responsible and Accountable Large Language Models
Jie Huang
Kevin Chen-Chuan Chang
HILM
325
28
0
05 Jul 2023
Trainable Transformer in Transformer
International Conference on Machine Learning (ICML), 2023
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
359
14
0
03 Jul 2023
Meta-training with Demonstration Retrieval for Efficient Few-shot Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aaron Mueller
Kanika Narang
Lambert Mathias
Qifan Wang
Hamed Firooz
RALM
223
4
0
30 Jun 2023
Query Understanding in the Age of Large Language Models
Avishek Anand
Venktesh V
Abhijit Anand
Vinay Setty
LRM
259
9
0
28 Jun 2023
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Neural Information Processing Systems (NeurIPS), 2023
Kaiyu Yang
Aidan M. Swope
Alex Gu
Rahul Chalamala
Peiyang Song
Shixing Yu
Saad Godil
R. Prenger
Anima Anandkumar
RALM
380
338
0
27 Jun 2023
Long-range Language Modeling with Self-retrieval
Transactions of the Association for Computational Linguistics (TACL), 2023
Ohad Rubin
Jonathan Berant
RALM, KELM
229
32
0
23 Jun 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Neural Information Processing Systems (NeurIPS), 2023
Yuchen Zhuang
Yue Yu
Kuan-Chieh Wang
Haotian Sun
Chao Zhang
ELM, LLMAG
325
344
0
23 Jun 2023
Resources and Evaluations for Multi-Distribution Dense Information Retrieval
Soumya Chatterjee
Omar Khattab
Simran Arora
195
0
0
21 Jun 2023
Co-design Hardware and Algorithm for Vector Search
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Wenqi Jiang
Shigang Li
Yu Zhu
Johannes de Fine Licht
Zhenhao He
...
Cédric Renggli
Shuai Zhang
Theodoros Rekatsinas
Torsten Hoefler
Gustavo Alonso
340
30
0
19 Jun 2023
Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset
Neural Information Processing Systems (NeurIPS), 2023
S. Naeini
Raeid Saqur
M. Saeidi
John Giorgi
Babak Taati
346
18
0
19 Jun 2023
RepoFusion: Training Code Models to Understand Your Repository
Disha Shrivastava
Denis Kocetkov
H. D. Vries
Dzmitry Bahdanau
Torsten Scholak
306
50
0
19 Jun 2023
GLIMMER: generalized late-interaction memory reranker
Michiel de Jong
Yury Zemlyanskiy
Nicholas FitzGerald
Sumit Sanghai
William W. Cohen
Joshua Ainslie
RALM
232
9
0
17 Jun 2023
Neural Priming for Sample-Efficient Adaptation
Neural Information Processing Systems (NeurIPS), 2023
Matthew Wallingford
Vivek Ramanujan
Alex Fang
Aditya Kusupati
Roozbeh Mottaghi
Aniruddha Kembhavi
Ludwig Schmidt
Ali Farhadi
VLM
486
19
0
16 Jun 2023
Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models
Junting Pan
Ziyi Lin
Yuying Ge
Xiatian Zhu
Renrui Zhang
Yi Wang
Yu Qiao
Jiaming Song
MLLM
180
35
0
15 Jun 2023
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
IEEE International Conference on Computer Vision (ICCV), 2023
Thomas Mensink
J. Uijlings
Lluis Castrejon
A. Goel
Felipe Cadar
Howard Zhou
Fei Sha
A. Araújo
V. Ferrari
281
82
0
15 Jun 2023
Retrieval-Enhanced Contrastive Vision-Text Models
International Conference on Learning Representations (ICLR), 2023
Ahmet Iscen
Mathilde Caron
Alireza Fathi
Cordelia Schmid
CLIP, VLM
292
39
0
12 Jun 2023
Augmenting Language Models with Long-Term Memory
Neural Information Processing Systems (NeurIPS), 2023
Weizhi Wang
Li Dong
Hao Cheng
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
KELM, RALM
241
142
0
12 Jun 2023
PoET: A generative model of protein families as sequences-of-sequences
Neural Information Processing Systems (NeurIPS), 2023
Timothy F. Truong
Tristan Bepler
SLR
211
69
0
09 Jun 2023
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding
Mingqiu Wang
Izhak Shafran
H. Soltau
Wei Han
Yuan Cao
Dian Yu
Laurent El Shafey
RALM, AuLLM
204
9
0
08 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
USENIX Security Symposium (USENIX Security), 2023
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
198
9
0
05 Jun 2023
SelfEvolve: A Code Evolution Framework via Large Language Models
Shuyang Jiang
Yuhao Wang
Yu Wang
264
50
0
05 Jun 2023
Taught by the Internet, Exploring Bias in OpenAIs GPT3
Ali Ayaz
Aditya Nawalgaria
Ruilian Yin
117
0
0
04 Jun 2023
Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification
Jintao Rong
Hao Chen
Tianrun Chen
Linlin Ou
Xinyi Yu
Yifan Liu
VLM, VPVLM
194
8
0
04 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
318
222
0
02 Jun 2023
KL-Divergence Guided Temperature Sampling
Chung-Ching Chang
David Reitter
Renat Aksitov
Yun-hsuan Sung
HILM
192
10
0
02 Jun 2023
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Matteo Pagliardini
Daniele Paliotta
Martin Jaggi
François Fleuret
LRM
176
29
0
01 Jun 2023
Reimagining Retrieval Augmented Language Models for Answering Queries
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
W. Tan
Yuliang Li
Pedro Rodriguez
Rich James
Xi Lin
A. Halevy
Scott Yih
KELM, LRM
308
13
0
01 Jun 2023
Vocabulary-free Image Classification
Neural Information Processing Systems (NeurIPS), 2023
Alessandro Conti
Enrico Fini
Goran Frehse
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
462
33
0
01 Jun 2023
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Chen Ling
Xujiang Zhao
Jiaying Lu
Chengyuan Deng
Can Zheng
...
Chris White
Quanquan Gu
Jian Pei
Carl Yang
Bo Pan
ALM
411
214
0
30 May 2023
Information Association for Language Model Updating by Mitigating LM-Logical Discrepancy
Conference on Computational Natural Language Learning (CoNLL), 2023
Pengfei Yu
Heng Ji
KELM
205
12
0
29 May 2023
Test-Time Training on Nearest Neighbors for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Moritz Hardt
Yu Sun
VLM, RALM
421
53
0
29 May 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Neural Information Processing Systems (NeurIPS), 2023
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM, LRM
291
96
0
28 May 2023
Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhicheng Guo
Sijie Cheng
Yile Wang
Peng Li
Yang Liu
RALM
147
28
0
28 May 2023
Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zichun Yu
Chenyan Xiong
S. Yu
Zhiyuan Liu
KELM, VLM
292
83
0
27 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
256
97
0
25 May 2023
Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ehsan Doostmohammadi
Tobias Norlund
Marco Kuhlmann
Richard Johansson
RALM
190
11
0
25 May 2023
SAIL: Search-Augmented Instruction Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hongyin Luo
Yung-Sung Chuang
Yuan Gong
Tianhua Zhang
Yoon Kim
Xixin Wu
D. Fox
Helen Meng
James R. Glass
ALM, LRM, RALM
237
35
0
24 May 2023
Privacy Implications of Retrieval-Based Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yangsibo Huang
Samyak Gupta
Zexuan Zhong
Keqin Li
Danqi Chen
RALM
181
42
0
24 May 2023
Adapting Language Models to Compress Contexts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
277
257
0
24 May 2023
Allies: Prompting Large Language Model with Beam Search
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hao Sun
Xiao Liu
Yeyun Gong
Yan Zhang
Daxin Jiang
Linjun Yang
Nan Duan
RALM
233
10
0
24 May 2023
Enabling Large Language Models to Generate Text with Citations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tianyu Gao
Howard Yen
Jiatong Yu
Danqi Chen
LM&MA, HILM
435
493
0
24 May 2023
KNN-LM Does Not Improve Open-ended Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shufan Wang
Yixiao Song
Andrew Drozdov
Aparna Garimella
Varun Manjunatha
Mohit Iyyer
RALM
197
13
0
24 May 2023
Think Before You Act: Decision Transformers with Working Memory
International Conference on Machine Learning (ICML), 2023
Jikun Kang
Romain Laroche
Xingdi Yuan
Adam Trischler
Xuefei Liu
Jie Fu
OffRL
281
0
0
24 May 2023