2112.04426
Improving language models by retrieving from trillions of tokens
8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM
RALM
Papers citing
"Improving language models by retrieving from trillions of tokens"
50 / 893 papers shown
Copy Is All You Need
International Conference on Learning Representations (ICLR), 2023
Tian Lan
Deng Cai
Yan Wang
Heyan Huang
Xian-Ling Mao
244
32
0
13 Jul 2023
A Comprehensive Overview of Large Language Models
ACM Transactions on Intelligent Systems and Technology (ACM TIST), 2023
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Lin Wang
OffRL
862
1,200
0
12 Jul 2023
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
International Conference on Language Resources and Evaluation (LREC), 2023
Yuzhuang Xu
Shuo Wang
Peng Li
Xuebo Liu
Xiaolong Wang
Weidong Liu
Yang Liu
342
1
0
12 Jul 2023
ReLoRA: High-Rank Training Through Low-Rank Updates
International Conference on Learning Representations (ICLR), 2023
Vladislav Lialin
Namrata Shivagunde
Sherin Muckatira
Anna Rumshisky
BDL
515
178
0
11 Jul 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP
VLM
486
2
0
10 Jul 2023
Focused Transformer: Contrastive Training for Context Scaling
Neural Information Processing Systems (NeurIPS), 2023
Szymon Tworkowski
Konrad Staniszewski
Mikolaj Pacek
Yuhuai Wu
Henryk Michalewski
Piotr Miłoś
235
165
0
06 Jul 2023
VerifAI: Verified Generative AI
Conference on Innovative Data Systems Research (CIDR), 2023
Nan Tang
Chenyu Yang
Ju Fan
Lei Cao
Yuyu Luo
Alon Halevy
226
27
0
06 Jul 2023
Citation: A Key to Building Responsible and Accountable Large Language Models
Jie Huang
Kevin Chen-Chuan Chang
HILM
325
28
0
05 Jul 2023
Trainable Transformer in Transformer
International Conference on Machine Learning (ICML), 2023
A. Panigrahi
Sadhika Malladi
Mengzhou Xia
Sanjeev Arora
VLM
359
14
0
03 Jul 2023
Meta-training with Demonstration Retrieval for Efficient Few-shot Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Aaron Mueller
Kanika Narang
Lambert Mathias
Qifan Wang
Hamed Firooz
RALM
223
4
0
30 Jun 2023
Query Understanding in the Age of Large Language Models
Avishek Anand
Venktesh V
Abhijit Anand
Vinay Setty
LRM
259
9
0
28 Jun 2023
LeanDojo: Theorem Proving with Retrieval-Augmented Language Models
Neural Information Processing Systems (NeurIPS), 2023
Kaiyu Yang
Aidan M. Swope
Alex Gu
Rahul Chalamala
Peiyang Song
Shixing Yu
Saad Godil
R. Prenger
Anima Anandkumar
RALM
380
338
0
27 Jun 2023
Long-range Language Modeling with Self-retrieval
Transactions of the Association for Computational Linguistics (TACL), 2023
Ohad Rubin
Jonathan Berant
RALM
KELM
229
32
0
23 Jun 2023
ToolQA: A Dataset for LLM Question Answering with External Tools
Neural Information Processing Systems (NeurIPS), 2023
Yuchen Zhuang
Yue Yu
Kuan-Chieh Wang
Haotian Sun
Chao Zhang
ELM
LLMAG
325
344
0
23 Jun 2023
Resources and Evaluations for Multi-Distribution Dense Information Retrieval
Soumya Chatterjee
Omar Khattab
Simran Arora
195
0
0
21 Jun 2023
Co-design Hardware and Algorithm for Vector Search
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2023
Wenqi Jiang
Shigang Li
Yu Zhu
Johannes de Fine Licht
Zhenhao He
...
Cédric Renggli
Shuai Zhang
Theodoros Rekatsinas
Torsten Hoefler
Gustavo Alonso
340
30
0
19 Jun 2023
Large Language Models are Fixated by Red Herrings: Exploring Creative Problem Solving and Einstellung Effect using the Only Connect Wall Dataset
Neural Information Processing Systems (NeurIPS), 2023
S. Naeini
Raeid Saqur
M. Saeidi
John Giorgi
Babak Taati
346
18
0
19 Jun 2023
RepoFusion: Training Code Models to Understand Your Repository
Disha Shrivastava
Denis Kocetkov
H. D. Vries
Dzmitry Bahdanau
Torsten Scholak
306
50
0
19 Jun 2023
GLIMMER: generalized late-interaction memory reranker
Michiel de Jong
Yury Zemlyanskiy
Nicholas FitzGerald
Sumit Sanghai
William W. Cohen
Joshua Ainslie
RALM
232
9
0
17 Jun 2023
Neural Priming for Sample-Efficient Adaptation
Neural Information Processing Systems (NeurIPS), 2023
Matthew Wallingford
Vivek Ramanujan
Alex Fang
Aditya Kusupati
Roozbeh Mottaghi
Aniruddha Kembhavi
Ludwig Schmidt
Ali Farhadi
VLM
486
19
0
16 Jun 2023
Retrieving-to-Answer: Zero-Shot Video Question Answering with Frozen Large Language Models
Junting Pan
Ziyi Lin
Yuying Ge
Xiatian Zhu
Renrui Zhang
Yi Wang
Yu Qiao
Jiaming Song
MLLM
180
35
0
15 Jun 2023
Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories
IEEE International Conference on Computer Vision (ICCV), 2023
Thomas Mensink
J. Uijlings
Lluis Castrejon
A. Goel
Felipe Cadar
Howard Zhou
Fei Sha
A. Araújo
V. Ferrari
281
82
0
15 Jun 2023
Retrieval-Enhanced Contrastive Vision-Text Models
International Conference on Learning Representations (ICLR), 2023
Ahmet Iscen
Mathilde Caron
Alireza Fathi
Cordelia Schmid
CLIP
VLM
292
39
0
12 Jun 2023
Augmenting Language Models with Long-Term Memory
Neural Information Processing Systems (NeurIPS), 2023
Weizhi Wang
Li Dong
Hao Cheng
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
KELM
RALM
241
142
0
12 Jun 2023
PoET: A generative model of protein families as sequences-of-sequences
Neural Information Processing Systems (NeurIPS), 2023
Timothy F. Truong
Tristan Bepler
SLR
211
69
0
09 Jun 2023
Speech-to-Text Adapter and Speech-to-Entity Retriever Augmented LLMs for Speech Understanding
Mingqiu Wang
Izhak Shafran
H. Soltau
Wei Han
Yuan Cao
Dian Yu
Laurent El Shafey
RALM
AuLLM
204
9
0
08 Jun 2023
Information Flow Control in Machine Learning through Modular Model Architecture
USENIX Security Symposium (USENIX Security), 2023
Trishita Tiwari
Suchin Gururangan
Chuan Guo
Weizhe Hua
Sanjay Kariyappa
Udit Gupta
Wenjie Xiong
Kiwan Maeng
Hsien-Hsin S. Lee
G. E. Suh
198
9
0
05 Jun 2023
SelfEvolve: A Code Evolution Framework via Large Language Models
Shuyang Jiang
Yuhao Wang
Yu Wang
264
50
0
05 Jun 2023
Taught by the Internet, Exploring Bias in OpenAIs GPT3
Ali Ayaz
Aditya Nawalgaria
Ruilian Yin
117
0
0
04 Jun 2023
Retrieval-Enhanced Visual Prompt Learning for Few-shot Classification
Jintao Rong
Hao Chen
Tianrun Chen
Linlin Ou
Xinyi Yu
Yifan Liu
VLM
VPVLM
194
8
0
04 Jun 2023
AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap
Q. V. Liao
J. Vaughan
318
222
0
02 Jun 2023
KL-Divergence Guided Temperature Sampling
Chung-Ching Chang
David Reitter
Renat Aksitov
Yun-hsuan Sung
HILM
192
10
0
02 Jun 2023
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention
Matteo Pagliardini
Daniele Paliotta
Martin Jaggi
François Fleuret
LRM
176
29
0
01 Jun 2023
Reimagining Retrieval Augmented Language Models for Answering Queries
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
W. Tan
Yuliang Li
Pedro Rodriguez
Rich James
Xi Lin
A. Halevy
Scott Yih
KELM
LRM
308
13
0
01 Jun 2023
Vocabulary-free Image Classification
Neural Information Processing Systems (NeurIPS), 2023
Alessandro Conti
Enrico Fini
Goran Frehse
Paolo Rota
Yiming Wang
Elisa Ricci
VLM
462
33
0
01 Jun 2023
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
ACM Computing Surveys (ACM Comput. Surv.), 2023
Chen Ling
Xujiang Zhao
Jiaying Lu
Chengyuan Deng
Can Zheng
...
Chris White
Quanquan Gu
Jian Pei
Carl Yang
Bo Pan
ALM
411
214
0
30 May 2023
Information Association for Language Model Updating by Mitigating LM-Logical Discrepancy
Conference on Computational Natural Language Learning (CoNLL), 2023
Pengfei Yu
Heng Ji
KELM
205
12
0
29 May 2023
Test-Time Training on Nearest Neighbors for Large Language Models
International Conference on Learning Representations (ICLR), 2023
Moritz Hardt
Yu Sun
VLM
RALM
421
53
0
29 May 2023
Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks
Neural Information Processing Systems (NeurIPS), 2023
Minki Kang
Seanie Lee
Jinheon Baek
Kenji Kawaguchi
Sung Ju Hwang
ALM
LRM
291
96
0
28 May 2023
Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhicheng Guo
Sijie Cheng
Yile Wang
Peng Li
Yang Liu
RALM
147
28
0
28 May 2023
Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zichun Yu
Chenyan Xiong
S. Yu
Zhiyuan Liu
KELM
VLM
292
83
0
27 May 2023
On the Tool Manipulation Capability of Open-source Large Language Models
Qiantong Xu
Fenglu Hong
Yangqiu Song
Changran Hu
Zheng Chen
Jian Zhang
LLMAG
256
97
0
25 May 2023
Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ehsan Doostmohammadi
Tobias Norlund
Marco Kuhlmann
Richard Johansson
RALM
190
11
0
25 May 2023
SAIL: Search-Augmented Instruction Learning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hongyin Luo
Yung-Sung Chuang
Yuan Gong
Tianhua Zhang
Yoon Kim
Xixin Wu
D. Fox
Helen Meng
James R. Glass
ALM
LRM
RALM
237
35
0
24 May 2023
Privacy Implications of Retrieval-Based Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yangsibo Huang
Samyak Gupta
Zexuan Zhong
Keqin Li
Danqi Chen
RALM
181
42
0
24 May 2023
Adapting Language Models to Compress Contexts
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Alexis Chevalier
Alexander Wettig
Anirudh Ajith
Danqi Chen
LLMAG
277
257
0
24 May 2023
Allies: Prompting Large Language Model with Beam Search
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Hao Sun
Xiao Liu
Yeyun Gong
Yan Zhang
Daxin Jiang
Linjun Yang
Nan Duan
RALM
233
10
0
24 May 2023
Enabling Large Language Models to Generate Text with Citations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tianyu Gao
Howard Yen
Jiatong Yu
Danqi Chen
LM&MA
HILM
435
493
0
24 May 2023
KNN-LM Does Not Improve Open-ended Text Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shufan Wang
Yixiao Song
Andrew Drozdov
Aparna Garimella
Varun Manjunatha
Mohit Iyyer
RALM
197
13
0
24 May 2023
Think Before You Act: Decision Transformers with Working Memory
International Conference on Machine Learning (ICML), 2023
Jikun Kang
Romain Laroche
Xingdi Yuan
Adam Trischler
Xuefei Liu
Jie Fu
OffRL
281
0
0
24 May 2023