Improving language models by retrieving from trillions of tokens

8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM RALM

Papers citing "Improving language models by retrieving from trillions of tokens"

50 / 893 papers shown
Generative Knowledge Graph Construction: A Review
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Hongbin Ye
Ningyu Zhang
Hui Chen
Huajun Chen
315
95
0
23 Oct 2022
Revision Transformers: Instructing Language Models to Change their Values
European Conference on Artificial Intelligence (ECAI), 2022
Felix Friedrich
Wolfgang Stammer
P. Schramowski
Kristian Kersting
KELM
254
11
0
19 Oct 2022
Deep Bidirectional Language-Knowledge Graph Pretraining
Neural Information Processing Systems (NeurIPS), 2022
Michihiro Yasunaga
Antoine Bosselut
Hongyu Ren
Xikun Zhang
Christopher D. Manning
Percy Liang
J. Leskovec
280
248
0
17 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
294
2
0
14 Oct 2022
MTEB: Massive Text Embedding Benchmark
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022
Niklas Muennighoff
Nouamane Tazi
L. Magne
Nils Reimers
1.0K
674
0
13 Oct 2022
Knowledge-grounded Dialog State Tracking
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Dian Yu
Mingqiu Wang
Yuan Cao
Izhak Shafran
Laurent El Shafey
H. Soltau
BDL
166
4
0
13 Oct 2022
Decoupled Context Processing for Context Augmented Language Modeling
Neural Information Processing Systems (NeurIPS), 2022
Shunyuan Zheng
Ruiqi Guo
Surinder Kumar
RALM KELM
196
30
0
11 Oct 2022
Mind's Eye: Grounded Language Model Reasoning through Simulation
International Conference on Learning Representations (ICLR), 2022
Ruibo Liu
Jason W. Wei
S. Gu
Te-Yen Wu
Soroush Vosoughi
Claire Cui
Denny Zhou
Andrew M. Dai
ReLM LRM
356
92
0
11 Oct 2022
Retrieval Augmentation for T5 Re-ranker using External Sources
Kai Hui
Tao Chen
Zhen Qin
Honglei Zhuang
Fernando Diaz
Michael Bendersky
Donald Metzler
RALM LRM
101
1
0
11 Oct 2022
Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks
Findings (Findings), 2022
Pedro Rodriguez
Mahmoud Azab
Becka Silvert
Renato Sanchez
Linzy Labson
Hardik Shah
Seungwhan Moon
212
2
0
10 Oct 2022
Noise-Robust De-Duplication at Scale
International Conference on Learning Representations (ICLR), 2022
Emily Silcock
Luca D'Amico-Wong
Jinglin Yang
Melissa Dell
SyDa
168
21
0
09 Oct 2022
Measuring and Narrowing the Compositionality Gap in Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM KELM LRM
719
936
0
07 Oct 2022
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Wenhu Chen
Hexiang Hu
Xi Chen
Pat Verga
William W. Cohen
RALM
336
234
0
06 Oct 2022
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering
Transactions of the Association for Computational Linguistics (TACL), 2022
Shamane Siriwardhana
Rivindu Weerasekera
Elliott Wen
Tharindu Kaluarachchi
R. Rana
Suranga Nanayakkara
VLM
275
272
0
06 Oct 2022
Memory in humans and deep language models: Linking hypotheses for model augmentation
Omri Raccah
Phoebe Chen
Ted Willke
David Poeppel
Vy A. Vo
RALM
262
1
0
04 Oct 2022
ASIF: Coupled Data Turns Unimodal Models to Multimodal Without Training
Neural Information Processing Systems (NeurIPS), 2022
Antonio Norelli
Marco Fumero
Valentino Maiorca
Luca Moschella
Emanuele Rodolà
Francesco Locatello
VLM
372
44
0
04 Oct 2022
When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment
Neural Information Processing Systems (NeurIPS), 2022
Zhijing Jin
Sydney Levine
Fernando Gonzalez
Ojasv Kamal
Maarten Sap
Mrinmaya Sachan
Amélie Reymond
J. Tenenbaum
Bernhard Schölkopf
ELM LRM
413
117
0
04 Oct 2022
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Zhenhailong Wang
Xiaoman Pan
Dian Yu
Dong Yu
Jianshu Chen
Heng Ji
VLM
264
10
0
01 Oct 2022
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
International Conference on Learning Representations (ICLR), 2022
Wenhu Chen
Hexiang Hu
Chitwan Saharia
William W. Cohen
VLM
565
229
0
29 Sep 2022
Improving alignment of dialogue agents via targeted human judgements
Amelia Glaese
Nat McAleese
Maja Trębacz
John Aslanides
Vlad Firoiu
...
John F. J. Mellor
Demis Hassabis
Koray Kavukcuoglu
Lisa Anne Hendricks
G. Irving
ALM AAML
533
630
0
28 Sep 2022
FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Sebastian Hofstatter
Jiecao Chen
K. Raman
Hamed Zamani
RALM
576
105
0
28 Sep 2022
Variational Open-Domain Question Answering
International Conference on Machine Learning (ICML), 2022
Valentin Liévin
Andreas Geert Motzfeldt
Ida Riis Jensen
Ole Winther
OOD BDL
206
11
0
23 Sep 2022
StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation
European Conference on Computer Vision (ECCV), 2022
A. Maharana
Darryl Hannan
Joey Tianyi Zhou
DiffM
269
101
0
13 Sep 2022
Diffusion Models in Vision: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Florinel-Alin Croitoru
Vlad Hondru
Radu Tudor Ionescu
M. Shah
DiffM VLM MedIm
1.2K
1,761
0
10 Sep 2022
A Review of Sparse Expert Models in Deep Learning
W. Fedus
J. Dean
Barret Zoph
MoE
248
193
0
04 Sep 2022
Petals: Collaborative Inference and Fine-tuning of Large Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alexander Borzunov
Dmitry Baranchuk
Tim Dettmers
Max Ryabinin
Younes Belkada
Artem Chumachenko
Pavel Samygin
Colin Raffel
VLM
222
95
0
02 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Transactions of the Association for Computational Linguistics (TACL), 2022
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
369
140
0
31 Aug 2022
Domain-Specific NER via Retrieving Correlated Samples
International Conference on Computational Linguistics (COLING), 2022
Xin Zhang
Yong Jiang
Xiaobin Wang
Xuming Hu
Yueheng Sun
Pengjun Xie
Meishan Zhang
351
18
0
27 Aug 2022
PEER: A Collaborative Language Model
International Conference on Learning Representations (ICLR), 2022
Timo Schick
Jane Dwivedi-Yu
Zhengbao Jiang
Fabio Petroni
Patrick Lewis
Gautier Izacard
Qingfei You
Christoforos Nalmpantis
Edouard Grave
Sebastian Riedel
ALM
262
104
0
24 Aug 2022
Retrieval-based Controllable Molecule Generation
International Conference on Learning Representations (ICLR), 2022
Zichao Wang
Weili Nie
Zhuoran Qiao
Chaowei Xiao
Richard Baraniuk
Anima Anandkumar
331
45
0
23 Aug 2022
Ered: Enhanced Text Representations with Entities and Descriptions
Qinghua Zhao
Shuai Ma
Yu Lei
155
1
0
18 Aug 2022
Retrieval-Augmented Transformer for Image Captioning
International Conference on Content-Based Multimedia Indexing (CBMI), 2022
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
188
68
0
26 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
256
90
0
26 Jul 2022
Discrete Key-Value Bottleneck
International Conference on Machine Learning (ICML), 2022
Frederik Trauble
Anirudh Goyal
Nasim Rahaman
Michael C. Mozer
Kenji Kawaguchi
Yoshua Bengio
Bernhard Schölkopf
CLL
302
23
0
22 Jul 2022
MQRetNN: Multi-Horizon Time Series Forecasting with Retrieval Augmentation
Sitan Yang
Carson Eisenach
Dhruv Madeka
AI4TS
116
10
0
21 Jul 2022
Can large language models reason about medical questions?
Patterns (Patterns), 2022
Valentin Liévin
C. Hother
Andreas Geert Motzfeldt
Ole Winther
ELM LM&MA AI4MH LRM
510
387
0
17 Jul 2022
Recent Developments in AI and USPTO Open Data
Scott Beliveau
Jerry Ma
107
1
0
12 Jul 2022
TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s
Neural Information Processing Systems (NeurIPS), 2022
Felix Chern
Blake A. Hechtman
Andy Davis
Ruiqi Guo
David Majnemer
Surinder Kumar
243
29
0
28 Jun 2022
ProGen2: Exploring the Boundaries of Protein Language Models
Cell Systems (Cell Syst.), 2022
Erik Nijkamp
Jeffrey A. Ruffolo
Eli N. Weinstein
Nikhil Naik
Ali Madani
AI4TS
165
412
0
27 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM ReLM LRM
532
3,107
0
15 Jun 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Neural Information Processing Systems (NeurIPS), 2022
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
555
167
0
14 Jun 2022
Large-Scale Retrieval for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Peter C. Humphreys
A. Guez
O. Tieleman
Laurent Sifre
T. Weber
Timothy Lillicrap
RALM OffRL
309
32
0
10 Jun 2022
Improving Contrastive Learning of Sentence Embeddings with Case-Augmented Positives and Retrieved Negatives
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022
Wei Wang
Liangzhu Ge
Jingqiao Zhang
Cheng Yang
165
25
0
06 Jun 2022
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning
Neural Information Processing Systems (NeurIPS), 2022
Xiang Chen
Lei Li
Ningyu Zhang
Xiaozhuan Liang
Shumin Deng
Chuanqi Tan
Fei Huang
Luo Si
Huajun Chen
VLM
369
60
0
29 May 2022
Learning to Automate Follow-up Question Generation using Process Knowledge for Depression Triage on Reddit Posts
Workshop on Computational Linguistics and Clinical Psychology (CLPsych), 2022
Shrey Gupta
Anmol Agarwal
Manas Gaur
Kaushik Roy
Vignesh Narayanan
Ponnurangam Kumaraguru
Amit P. Sheth
AI4MH
127
38
0
27 May 2022
kNN-Prompt: Nearest Neighbor Zero-Shot Inference
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Weijia Shi
Julian Michael
Suchin Gururangan
Luke Zettlemoyer
RALM VLM
188
40
0
27 May 2022
Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval
International Conference on Machine Learning (ICML), 2022
Pascal Notin
M. Dias
J. Frazer
Javier Marchena-Hurtado
Aidan Gomez
D. Marks
Y. Gal
213
226
0
27 May 2022
Training Language Models with Memory Augmentation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
722
144
0
25 May 2022
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM RALM LLMAG
274
183
0
24 May 2022
Chunk-based Nearest Neighbor Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Pedro Henrique Martins
Zita Marinho
André F.T. Martins
RALM
309
32
0
24 May 2022