2112.04426
Improving language models by retrieving from trillions of tokens
8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM
RALM
Papers citing
"Improving language models by retrieving from trillions of tokens"
43 / 893 papers shown
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
356
243
0
22 May 2022
Visually-Augmented Language Modeling
International Conference on Learning Representations (ICLR), 2022
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
232
22
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro
LLMAG
AI4CE
474
979
0
12 May 2022
Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language
International Conference on Machine Learning (ICML), 2022
Iou-Jen Liu
Xingdi Yuan
Marc-Alexandre Côté
Pierre-Yves Oudeyer
Alex Schwing
RALM
250
13
0
12 May 2022
Retrieval-Enhanced Machine Learning
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Hamed Zamani
Fernando Diaz
Mostafa Dehghani
Donald Metzler
Michael Bendersky
171
59
0
02 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
892
4,417
0
02 May 2022
TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Joel Jang
Seonghyeon Ye
Changho Lee
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Minjoon Seo
CLL
KELM
439
116
0
29 Apr 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM
OCL
231
0
0
27 Apr 2022
Semi-Parametric Neural Image Synthesis
A. Blattmann
Robin Rombach
Kaan Oktay
Jonas Muller
Bjorn Ommer
DiffM
304
32
0
25 Apr 2022
ChapterBreak: A Challenge Dataset for Long-Range Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Simeng Sun
Katherine Thai
Mohit Iyyer
189
20
0
22 Apr 2022
Standing on the Shoulders of Giant Frozen Language Models
Yoav Levine
Itay Dalmedigos
Ori Ram
Yoel Zeldes
Daniel Jannai
...
Barak Lenz
Shai Shalev-Shwartz
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
VLM
240
52
0
21 Apr 2022
K-LITE: Learning Transferable Visual Models with External Knowledge
Neural Information Processing Systems (NeurIPS), 2022
Sheng Shen
Chunyuan Li
Xiaowei Hu
Jianwei Yang
Yujia Xie
...
Ce Liu
Kurt Keutzer
Trevor Darrell
Anna Rohrbach
Jianfeng Gao
CLIP
VLM
197
96
0
20 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
208
34
0
13 Apr 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
960
3,520
0
12 Apr 2022
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Wenhu Chen
Pat Verga
Michiel de Jong
John Wieting
William W. Cohen
RALM
KELM
307
28
0
10 Apr 2022
Knowledge Base Index Compression via Dimensionality and Precision Reduction
Vilém Zouhar
Marius Mosbach
Miaoran Zhang
Dietrich Klakow
245
3
0
06 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
International Conference on Learning Representations (ICLR), 2022
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM
SyDa
DiffM
252
148
0
06 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Journal of Machine Learning Research (JMLR), 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM
LRM
1.2K
7,524
0
05 Apr 2022
Revisiting a kNN-based Image Classification System with High-capacity Storage
European Conference on Computer Vision (ECCV), 2022
K. Nakata
Youyang Ng
Daisuke Miyashita
A. Maki
Yu Lin
J. Deguchi
255
29
0
03 Apr 2022
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi
Yitong Li
Yulong Zeng
Jingyan Zhou
Yasheng Wang
Chuanfei Xu
Lifeng Shang
Xin Jiang
Shiqi Zhao
Qun Liu
ALM
348
17
0
31 Mar 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
798
2,684
0
29 Mar 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Neural Information Processing Systems (NeurIPS), 2022
Ankit Gupta
Albert Gu
Jonathan Berant
420
416
0
27 Mar 2022
Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kurt Shuster
M. Komeili
Leonard Adolphs
Stephen Roller
Arthur Szlam
Jason Weston
KELM
232
143
0
24 Mar 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM
RALM
530
308
0
21 Mar 2022
Reasoning over Public and Private Data in Retrieval-Based Systems
Transactions of the Association for Computational Linguistics (TACL), 2022
Simran Arora
Patrick Lewis
Angela Fan
Jacob Kahn
Christopher Ré
192
31
0
14 Mar 2022
Internet-augmented language models through few-shot prompting for open-domain question answering
Angeliki Lazaridou
E. Gribovskaya
Wojciech Stokowiec
N. Grigorev
KELM
LRM
244
159
0
10 Mar 2022
Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications
International Conference on Machine Learning (ICML), 2022
Bokun Wang
Tianbao Yang
546
35
0
24 Feb 2022
From Natural Language to Simulations: Applying GPT-3 Codex to Automate Simulation Modeling of Logistics Systems
Social Science Research Network (SSRN), 2022
I. Jackson
M. J. Sáenz
182
10
0
24 Feb 2022
Do Transformers know symbolic rules, and would we know if they did?
Tommi Gröndahl
Yu-Wen Guo
Nirmal Asokan
420
0
0
19 Feb 2022
Retrieval-Augmented Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Anirudh Goyal
A. Friesen
Andrea Banino
T. Weber
Nan Rosemary Ke
...
Michal Valko
Simon Osindero
Timothy Lillicrap
N. Heess
Charles Blundell
OffRL
406
66
0
17 Feb 2022
Transformer Memory as a Differentiable Search Index
Neural Information Processing Systems (NeurIPS), 2022
Yi Tay
Vinh Q. Tran
Mostafa Dehghani
Jianmo Ni
Dara Bahri
...
Zhe Zhao
Jai Gupta
Tal Schuster
William W. Cohen
Donald Metzler
434
368
0
14 Feb 2022
Semi-supervised New Event Type Induction and Description via Contrastive Loss-Enforced Batch Attention
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Carl Edwards
Heng Ji
153
12
0
12 Feb 2022
Competition-Level Code Generation with AlphaCode
Science, 2022
Yujia Li
David Choi
Junyoung Chung
Nate Kushman
Julian Schrittwieser
...
Esme Sutherland Robson
Pushmeet Kohli
Nando de
Koray Kavukcuoglu
Oriol Vinyals
684
1,883
0
08 Feb 2022
A Survey on Retrieval-Augmented Text Generation
Huayang Li
Yixuan Su
Deng Cai
Yan Wang
Lemao Liu
RALM
409
266
0
02 Feb 2022
Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Akshat Shrivastava
Shrey Desai
Anchit Gupta
A. Elkahky
Aleksandr Livshits
Alexander Zotov
Ahmed Aly
268
7
0
02 Feb 2022
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
International Conference on Machine Learning (ICML), 2022
Uri Alon
Frank F. Xu
Junxian He
Sudipta Sengupta
Dan Roth
Graham Neubig
RALM
295
75
0
28 Jan 2022
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
406
1,799
0
20 Jan 2022
Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings
Natural Language Processing and Chinese Computing (NLPCC), 2022
Peng Wang
Xin Xie
Xiaohan Wang
Ningyu Zhang
RALM
415
20
0
14 Jan 2022
Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
Akari Asai
Matt Gardner
Hannaneh Hajishirzi
RALM
296
55
0
16 Dec 2021
Learning To Retrieve Prompts for In-Context Learning
Ohad Rubin
Jonathan Herzig
Jonathan Berant
VPVLM
RALM
386
832
0
16 Dec 2021
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Keshav Santhanam
Omar Khattab
Jon Saad-Falcon
Christopher Potts
Matei A. Zaharia
485
583
0
02 Dec 2021
The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design
International Conference on Learning Representations (ICLR), 2021
Yoav Levine
Noam Wies
Daniel Jannai
D. Navon
Yedid Hoshen
Amnon Shashua
AI4CE
278
42
0
09 Oct 2021
Inductive Biases for Deep Learning of Higher-Level Cognition
Proceedings of the Royal Society A (Proc. R. Soc. A), 2020
Anirudh Goyal
Yoshua Bengio
AI4CE
533
419
0
30 Nov 2020