2112.04426
Improving language models by retrieving from trillions of tokens
8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM
RALM
Papers citing "Improving language models by retrieving from trillions of tokens"
50 / 893 papers shown
Title
On the Generalization Ability of Retrieval-Enhanced Transformers
Findings (Findings), 2023
Tobias Norlund
Ehsan Doostmohammadi
Richard Johansson
Marco Kuhlmann
RALM
103
7
0
23 Feb 2023
kNN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
Yangsibo Huang
Daogao Liu
Zexuan Zhong
Weijia Shi
Y. Lee
RALM
ALM
154
20
0
21 Feb 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull
P. Bellot
Emmanuel Bruno
Vincent Martin
Elisabeth Murisasco
ELM
623
17
0
17 Feb 2023
Augmented Language Models: a Survey
Grégoire Mialon
Roberto Dessì
Maria Lomeli
Christoforos Nalmpantis
Ramakanth Pasunuru
...
Jane Dwivedi-Yu
Asli Celikyilmaz
Edouard Grave
Yann LeCun
Thomas Scialom
LRM
KELM
254
482
0
15 Feb 2023
Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models
Renat Aksitov
Chung-Ching Chang
David Reitter
Siamak Shakeri
Yun-hsuan Sung
RALM
178
20
0
11 Feb 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zhuolin Yang
Ming-Yu Liu
Zihan Liu
V. Korthikanti
Weili Nie
...
Yuke Zhu
Mohammad Shoeybi
Bryan Catanzaro
Chaowei Xiao
Anima Anandkumar
VLM
RALM
161
50
0
09 Feb 2023
Toolformer: Language Models Can Teach Themselves to Use Tools
Neural Information Processing Systems (NeurIPS), 2023
Timo Schick
Jane Dwivedi-Yu
Roberto Dessì
Roberta Raileanu
Maria Lomeli
Luke Zettlemoyer
Nicola Cancedda
Thomas Scialom
SyDa
RALM
397
2,567
0
09 Feb 2023
Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Suyu Ge
Chenyan Xiong
Corby Rosset
Arnold Overwijk
Jiawei Han
Paul N. Bennett
VLM
129
11
0
07 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
301
9
0
03 Feb 2023
ResMem: Learn what you can and memorize the rest
Neural Information Processing Systems (NeurIPS), 2023
Zitong Yang
Michal Lukasik
Vaishnavh Nagarajan
Zong-xiao Li
A. S. Rawat
Manzil Zaheer
A. Menon
Surinder Kumar
VLM
282
9
0
03 Feb 2023
QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time Reasoning
Weimin Shi
Mingchen Zhuge
D. Gao
Zhong Zhou
Ming-Ming Cheng
Deng-Ping Fan
LRM
VLM
205
0
0
02 Feb 2023
In-Context Retrieval-Augmented Language Models
Transactions of the Association for Computational Linguistics (TACL), 2023
Ori Ram
Yoav Levine
Itay Dalmedigos
Dor Muhlgay
Amnon Shashua
Kevin Leyton-Brown
Y. Shoham
KELM
RALM
LRM
485
821
0
31 Jan 2023
The Power of External Memory in Increasing Predictive Model Capacity
Cenk Baykal
D. Cutler
Nishanth Dikkala
Nikhil Ghosh
Rina Panigrahy
Xin Wang
KELM
147
0
0
31 Jan 2023
Alternating Updates for Efficient Transformers
Neural Information Processing Systems (NeurIPS), 2023
Cenk Baykal
D. Cutler
Nishanth Dikkala
Nikhil Ghosh
Rina Panigrahy
Xin Wang
MoE
172
8
0
30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Weijia Shi
Sewon Min
Michihiro Yasunaga
Minjoon Seo
Rich James
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM
VLM
KELM
651
832
0
30 Jan 2023
Semi-Parametric Video-Grounded Text Generation
Sungdong Kim
Jin-Hwa Kim
Jiyoung Lee
Minjoon Seo
VGen
220
17
0
27 Jan 2023
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute
International Conference on Machine Learning (ICML), 2023
Michiel de Jong
Yury Zemlyanskiy
Nicholas FitzGerald
Joshua Ainslie
Sumit Sanghai
Fei Sha
William W. Cohen
RALM
203
20
0
25 Jan 2023
Learning Customized Visual Models with Retrieval-Augmented Knowledge
Computer Vision and Pattern Recognition (CVPR), 2023
Haotian Liu
Kilho Son
Jianwei Yang
Ce Liu
Jianfeng Gao
Yong Jae Lee
Chunyuan Li
VLM
219
77
0
17 Jan 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
276
228
0
16 Jan 2023
Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding
Yunchang Zhu
Liang Pang
Kangxi Wu
Yanyan Lan
Huawei Shen
Xueqi Cheng
AAML
ELM
151
3
0
10 Jan 2023
Why do Nearest Neighbor Language Models Work?
International Conference on Machine Learning (ICML), 2023
Frank F. Xu
Uri Alon
Graham Neubig
RALM
163
29
0
07 Jan 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition
David M. Chan
Shalini Ghosh
Ariya Rastrow
Björn Hoffmeister
OffRL
183
7
0
06 Jan 2023
Tsetlin Machine Embedding: Representing Words Using Logical Expressions
Findings (Findings), 2023
Bimal Bhattarai
Ole-Christoffer Granmo
Lei Jiao
Rohan Kumar Yadav
Jivitesh Sharma
NAI
180
17
0
02 Jan 2023
Rethinking with Retrieval: Faithful Large Language Model Inference
Hangfeng He
Hongming Zhang
Dan Roth
KELM
LRM
443
201
0
31 Dec 2022
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2022
Woohyun Kang
Jonghwan Mun
Sungjun Lee
Byungseok Roh
VLM
217
26
0
27 Dec 2022
Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning
Christopher T. Lengerich
Gabriel Synnaeve
Amy Zhang
Hugh Leather
Kurt Shuster
François Charton
Charysse Redwood
SSL
OffRL
175
1
0
21 Dec 2022
Resolving Indirect Referring Expressions for Entity Selection
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mohammad Javad Hosseini
Filip Radlinski
Silvia Pareti
Annie Louis
167
2
0
21 Dec 2022
Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
John Giorgi
Luca Soldaini
Bo Wang
Gary D. Bader
Kyle Lo
Lucy Lu Wang
Arman Cohan
223
21
0
20 Dec 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Alex Troy Mallen
Akari Asai
Victor Zhong
Rajarshi Das
Daniel Khashabi
Hannaneh Hajishirzi
RALM
HILM
KELM
345
856
0
20 Dec 2022
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
H. Trivedi
Niranjan Balasubramanian
Tushar Khot
Ashish Sabharwal
KELM
RALM
LRM
413
712
0
20 Dec 2022
Training Trajectories of Language Models Across Scales
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin
LRM
231
70
0
19 Dec 2022
DSI++: Updating Transformer Memory with New Documents
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Sanket Vaibhav Mehta
Jai Gupta
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
J. Rao
Marc Najork
Emma Strubell
Donald Metzler
CLL
209
59
0
19 Dec 2022
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Michiel de Jong
Yury Zemlyanskiy
Joshua Ainslie
Nicholas FitzGerald
Sumit Sanghai
Fei Sha
William W. Cohen
VLM
257
37
0
15 Dec 2022
Faster Maximum Inner Product Search in High Dimensions
International Conference on Machine Learning (ICML), 2022
Mo Tiwari
Ryan Kang
Je-Yong Lee
Luke Lee
Chris Piech
Sebastian Thrun
Ilan Shomorony
Martin Jinye Zhang
346
6
0
14 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhongwei Wan
Yichun Yin
Wei Zhang
Jiaxin Shi
Lifeng Shang
Guangyong Chen
Xin Jiang
Qun Liu
VLM
CLL
339
20
0
07 Dec 2022
A Flexible Nadaraya-Watson Head Can Offer Explainable and Calibrated Classification
Alan Q. Wang
M. Sabuncu
215
6
0
07 Dec 2022
Document-Level Abstractive Summarization
Gonçalo Raposo
Afonso Raposo
Ana Sofia Carmo
118
3
0
06 Dec 2022
Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration
Feng Nie
Meixi Chen
Zhirui Zhang
Xuan Cheng
173
38
0
05 Dec 2022
Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Zhengbao Jiang
Luyu Gao
Jun Araki
Haibo Ding
Zhiruo Wang
Jamie Callan
Graham Neubig
RALM
260
49
0
05 Dec 2022
Named Entity and Relation Extraction with Multi-Modal Retrieval
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Xinyu Wang
Jiong Cai
Yong Jiang
Pengjun Xie
Kewei Tu
Wei Lu
199
67
0
03 Dec 2022
Nonparametric Masked Language Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Sewon Min
Weijia Shi
M. Lewis
Xilun Chen
Anuj Kumar
Hannaneh Hajishirzi
Luke Zettlemoyer
RALM
304
55
0
02 Dec 2022
Retrieval-Augmented Multimodal Language Modeling
International Conference on Machine Learning (ICML), 2023
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Abigail Z. Jacobs
M. Lewis
Luke Zettlemoyer
Anuj Kumar
RALM
217
130
0
22 Nov 2022
Token Turing Machines
Computer Vision and Pattern Recognition (CVPR), 2022
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
210
28
0
16 Nov 2022
Galactica: A Large Language Model for Science
Ross Taylor
Marcin Kardas
Guillem Cucurull
Thomas Scialom
Anthony Hartshorn
Elvis Saravia
Andrew Poulton
Viktor Kerkez
Robert Stojnic
ELM
ReLM
363
916
0
16 Nov 2022
A Survey of Knowledge Enhanced Pre-trained Language Models
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022
Linmei Hu
Zeyi Liu
Ziwang Zhao
Lei Hou
Liqiang Nie
Juanzi Li
KELM
VLM
418
193
0
11 Nov 2022
Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Yan Yang
Md Zakir Hossain
Eric A. Stone
Shafin Rahman
AI4TS
177
34
0
30 Oct 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
International Conference on Learning Representations (ICLR), 2022
Xiaoman Pan
Wenlin Yao
Hongming Zhang
Dian Yu
Dong Yu
Jianshu Chen
KELM
483
26
0
28 Oct 2022
You can't pick your neighbors, or can you? When and how to rely on retrieval in the kNN-LM
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Andrew Drozdov
Shufan Wang
Razieh Rahimi
Andrew McCallum
Hamed Zamani
Mohit Iyyer
RALM
346
21
0
28 Oct 2022
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Krishna Srinivasan
K. Raman
Anupam Samanta
Ling-Yen Liao
L. Bertelli
Michael Bendersky
RALM
LRM
145
25
0
27 Oct 2022
Broken Neural Scaling Laws
International Conference on Learning Representations (ICLR), 2022
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
968
98
0
26 Oct 2022