arXiv: 2112.04426
Improving language models by retrieving from trillions of tokens

8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM, RALM

Papers citing "Improving language models by retrieving from trillions of tokens"

43 / 893 papers shown
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Neural Information Processing Systems (NeurIPS), 2022
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
356
243
0
22 May 2022
Visually-Augmented Language Modeling
International Conference on Learning Representations (ICLR), 2022
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
232
22
0
20 May 2022
A Generalist Agent
Scott E. Reed
Konrad Zolna
Emilio Parisotto
Sergio Gomez Colmenarejo
Alexander Novikov
...
Yutian Chen
R. Hadsell
Oriol Vinyals
Mahyar Bordbar
Nando de Freitas
LM&Ro, LLMAG, AI4CE
474
979
0
12 May 2022
Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language
International Conference on Machine Learning (ICML), 2022
Iou-Jen Liu
Xingdi Yuan
Marc-Alexandre Côté
Pierre-Yves Oudeyer
Alex Schwing
RALM
250
13
0
12 May 2022
Retrieval-Enhanced Machine Learning
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Hamed Zamani
Fernando Diaz
Mostafa Dehghani
Donald Metzler
Michael Bendersky
171
59
0
02 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM, OSLM, AI4CE
892
4,417
0
02 May 2022
TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Joel Jang
Seonghyeon Ye
Changho Lee
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Minjoon Seo
CLL, KELM
439
116
0
29 Apr 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM, OCL
231
0
0
27 Apr 2022
Semi-Parametric Neural Image Synthesis
A. Blattmann
Robin Rombach
Kaan Oktay
Jonas Muller
Bjorn Ommer
DiffM
304
32
0
25 Apr 2022
ChapterBreak: A Challenge Dataset for Long-Range Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2022
Simeng Sun
Katherine Thai
Mohit Iyyer
189
20
0
22 Apr 2022
Standing on the Shoulders of Giant Frozen Language Models
Yoav Levine
Itay Dalmedigos
Ori Ram
Yoel Zeldes
Daniel Jannai
...
Barak Lenz
Shai Shalev-Shwartz
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
VLM
240
52
0
21 Apr 2022
K-LITE: Learning Transferable Visual Models with External Knowledge
Neural Information Processing Systems (NeurIPS), 2022
Sheng Shen
Chunyuan Li
Xiaowei Hu
Jianwei Yang
Yujia Xie
...
Ce Liu
Kurt Keutzer
Trevor Darrell
Anna Rohrbach
Jianfeng Gao
CLIP, VLM
197
96
0
20 Apr 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals
Payal Bajaj
Chenyan Xiong
Guolin Ke
Xiaodong Liu
Di He
Saurabh Tiwary
Tie-Yan Liu
Paul N. Bennett
Xia Song
Jianfeng Gao
208
34
0
13 Apr 2022
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
960
3,520
0
12 Apr 2022
Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Wenhu Chen
Pat Verga
Michiel de Jong
John Wieting
William W. Cohen
RALM, KELM
307
28
0
10 Apr 2022
Knowledge Base Index Compression via Dimensionality and Precision Reduction
Vilém Zouhar
Marius Mosbach
Miaoran Zhang
Dietrich Klakow
245
3
0
06 Apr 2022
KNN-Diffusion: Image Generation via Large-Scale Retrieval
International Conference on Learning Representations (ICLR), 2022
Shelly Sheynin
Oron Ashual
Adam Polyak
Uriel Singer
Oran Gafni
Eliya Nachmani
Yaniv Taigman
VLM, SyDa, DiffM
252
148
0
06 Apr 2022
PaLM: Scaling Language Modeling with Pathways
Journal of Machine Learning Research (JMLR), 2022
Aakanksha Chowdhery
Sharan Narang
Jacob Devlin
Maarten Bosma
Gaurav Mishra
...
Kathy Meier-Hellstern
Douglas Eck
J. Dean
Slav Petrov
Noah Fiedel
PILM, LRM
1.2K
7,524
0
05 Apr 2022
Revisiting a kNN-based Image Classification System with High-capacity Storage
European Conference on Computer Vision (ECCV), 2022
K. Nakata
Youyang Ng
Daisuke Miyashita
A. Maki
Yu Lin
J. Deguchi
255
29
0
03 Apr 2022
PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model
Fei Mi
Yitong Li
Yulong Zeng
Jingyan Zhou
Yasheng Wang
Chuanfei Xu
Lifeng Shang
Xin Jiang
Shiqi Zhao
Qun Liu
ALM
348
17
0
31 Mar 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
798
2,684
0
29 Mar 2022
Diagonal State Spaces are as Effective as Structured State Spaces
Neural Information Processing Systems (NeurIPS), 2022
Ankit Gupta
Albert Gu
Jonathan Berant
420
416
0
27 Mar 2022
Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Kurt Shuster
M. Komeili
Leonard Adolphs
Stephen Roller
Arthur Szlam
Jason Weston
KELM
232
143
0
24 Mar 2022
Teaching language models to support answers with verified quotes
Jacob Menick
Maja Trebacz
Vladimir Mikulik
John Aslanides
Francis Song
...
Mia Glaese
Susannah Young
Lucy Campbell-Gillingham
G. Irving
Nat McAleese
ELM, RALM
530
308
0
21 Mar 2022
Reasoning over Public and Private Data in Retrieval-Based Systems
Transactions of the Association for Computational Linguistics (TACL), 2022
Simran Arora
Patrick Lewis
Angela Fan
Jacob Kahn
Christopher Ré
192
31
0
14 Mar 2022
Internet-augmented language models through few-shot prompting for open-domain question answering
Angeliki Lazaridou
E. Gribovskaya
Wojciech Stokowiec
N. Grigorev
KELM, LRM
244
159
0
10 Mar 2022
Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications
International Conference on Machine Learning (ICML), 2022
Bokun Wang
Tianbao Yang
546
35
0
24 Feb 2022
From Natural Language to Simulations: Applying GPT-3 Codex to Automate Simulation Modeling of Logistics Systems
Social Science Research Network (SSRN), 2022
I. Jackson
M. J. Sáenz
182
10
0
24 Feb 2022
Do Transformers know symbolic rules, and would we know if they did?
Tommi Gröndahl
Yu-Wen Guo
Nirmal Asokan
420
0
0
19 Feb 2022
Retrieval-Augmented Reinforcement Learning
International Conference on Machine Learning (ICML), 2022
Anirudh Goyal
A. Friesen
Andrea Banino
T. Weber
Nan Rosemary Ke
...
Michal Valko
Simon Osindero
Timothy Lillicrap
N. Heess
Charles Blundell
OffRL
406
66
0
17 Feb 2022
Transformer Memory as a Differentiable Search Index
Neural Information Processing Systems (NeurIPS), 2022
Yi Tay
Vinh Q. Tran
Mostafa Dehghani
Jianmo Ni
Dara Bahri
...
Zhe Zhao
Jai Gupta
Tal Schuster
William W. Cohen
Donald Metzler
434
368
0
14 Feb 2022
Semi-supervised New Event Type Induction and Description via Contrastive Loss-Enforced Batch Attention
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Carl Edwards
Heng Ji
153
12
0
12 Feb 2022
Competition-Level Code Generation with AlphaCode
Science (Science), 2022
Yujia Li
David Choi
Junyoung Chung
Nate Kushman
Julian Schrittwieser
...
Esme Sutherland Robson
Pushmeet Kohli
Nando de Freitas
Koray Kavukcuoglu
Oriol Vinyals
684
1,883
0
08 Feb 2022
A Survey on Retrieval-Augmented Text Generation
Huayang Li
Yixuan Su
Deng Cai
Yan Wang
Lemao Liu
RALM
409
266
0
02 Feb 2022
Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Akshat Shrivastava
Shrey Desai
Anchit Gupta
A. Elkahky
Aleksandr Livshits
Alexander Zotov
Ahmed Aly
268
7
0
02 Feb 2022
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
International Conference on Machine Learning (ICML), 2022
Uri Alon
Frank F. Xu
Junxian He
Sudipta Sengupta
Dan Roth
Graham Neubig
RALM
295
75
0
28 Jan 2022
LaMDA: Language Models for Dialog Applications
R. Thoppilan
Daniel De Freitas
Jamie Hall
Noam M. Shazeer
Apoorv Kulshreshtha
...
Blaise Aguera-Arcas
Claire Cui
M. Croak
Ed H. Chi
Quoc Le
ALM
406
1,799
0
20 Jan 2022
Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings
Natural Language Processing and Chinese Computing (NLPCC), 2022
Peng Wang
Xin Xie
Xiaohan Wang
Ningyu Zhang
RALM
415
20
0
14 Jan 2022
Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks
Akari Asai
Matt Gardner
Hannaneh Hajishirzi
RALM
296
55
0
16 Dec 2021
Learning To Retrieve Prompts for In-Context Learning
Ohad Rubin
Jonathan Herzig
Jonathan Berant
VPVLM, RALM
386
832
0
16 Dec 2021
ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
Keshav Santhanam
Omar Khattab
Jon Saad-Falcon
Christopher Potts
Matei A. Zaharia
485
583
0
02 Dec 2021
The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design
International Conference on Learning Representations (ICLR), 2021
Yoav Levine
Noam Wies
Daniel Jannai
D. Navon
Yedid Hoshen
Amnon Shashua
AI4CE
278
42
0
09 Oct 2021
Inductive Biases for Deep Learning of Higher-Level Cognition
Proceedings of the Royal Society A (Proc. R. Soc. A), 2020
Anirudh Goyal
Yoshua Bengio
AI4CE
533
419
0
30 Nov 2020