
Improving language models by retrieving from trillions of tokens

8 December 2021

George van den Driessche

Jean-Baptiste Lespiau

Saffron Huang


Papers citing "Improving language models by retrieving from trillions of tokens"

43 / 893 papers shown

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models

Neural Information Processing Systems (NeurIPS), 2022

Luke Zettlemoyer

356

243

22 May 2022

Visually-Augmented Language Modeling

International Conference on Learning Representations (ICLR), 2022

Xiaodong Liu

232

20 May 2022

Sergio Gomez Colmenarejo

...

474

979

12 May 2022

Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

International Conference on Machine Learning (ICML), 2022

250

12 May 2022

Retrieval-Enhanced Machine Learning

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022

171

02 May 2022

OPT: Open Pre-trained Transformer Language Models

...

Luke Zettlemoyer

892

4,417

02 May 2022

TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

439

116

29 Apr 2022

Can deep learning match the efficiency of human visual long-term memory in storing object details?

Emin Orhan

VLM OCL

231

27 Apr 2022

Semi-Parametric Neural Image Synthesis

304

25 Apr 2022

ChapterBreak: A Challenge Dataset for Long-Range Language Models

North American Chapter of the Association for Computational Linguistics (NAACL), 2022

Simeng Sun

Katherine Thai

Mohit Iyyer

189

22 Apr 2022

Standing on the Shoulders of Giant Frozen Language Models

...

240

21 Apr 2022

K-LITE: Learning Transferable Visual Models with External Knowledge

Neural Information Processing Systems (NeurIPS), 2022

Jianwei Yang

...

197

20 Apr 2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals

Xiaodong Liu

Xia Song

208

13 Apr 2022

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

...

960

3,520

12 Apr 2022

Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

307

10 Apr 2022

Knowledge Base Index Compression via Dimensionality and Precision Reduction

245

06 Apr 2022

KNN-Diffusion: Image Generation via Large-Scale Retrieval

International Conference on Learning Representations (ICLR), 2022

252

148

06 Apr 2022

PaLM: Scaling Language Modeling with Pathways

Journal of Machine Learning Research (JMLR), 2022

Sharan Narang

...

Kathy Meier-Hellstern

1.2K

7,524

05 Apr 2022

Revisiting a kNN-based Image Classification System with High-capacity Storage

European Conference on Computer Vision (ECCV), 2022

255

03 Apr 2022

PanGu-Bot: Efficient Generative Dialogue Pre-training from Pre-trained Language Model

Lifeng Shang

Xin Jiang

Shiqi Zhao

Qun Liu

ALM

348

31 Mar 2022

Training Compute-Optimal Large Language Models

...

798

2,684

29 Mar 2022

Diagonal State Spaces are as Effective as Structured State Spaces

Neural Information Processing Systems (NeurIPS), 2022

Ankit Gupta

Albert Gu

Jonathan Berant

420

416

27 Mar 2022

Language Models that Seek for Knowledge: Modular Search & Generation for Dialogue and Prompt Completion

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Jason Weston

232

143

24 Mar 2022

Teaching language models to support answers with verified quotes

...

Lucy Campbell-Gillingham

G. Irving

Nat McAleese

ELM RALM

530

308

21 Mar 2022

Reasoning over Public and Private Data in Retrieval-Based Systems

Transactions of the Association for Computational Linguistics (TACL), 2022

Simran Arora

Patrick Lewis

Angela Fan

Jacob Kahn

Christopher Ré

192

14 Mar 2022

Internet-augmented language models through few-shot prompting for open-domain question answering

Wojciech Stokowiec

244

159

10 Mar 2022

Finite-Sum Coupled Compositional Stochastic Optimization: Theory and Applications

International Conference on Machine Learning (ICML), 2022

Bokun Wang

Tianbao Yang

546

24 Feb 2022

From Natural Language to Simulations: Applying GPT-3 Codex to Automate Simulation Modeling of Logistics Systems

Social Science Research Network (SSRN), 2022

I. Jackson

M. J. Sáenz

182

24 Feb 2022

Do Transformers know symbolic rules, and would we know if they did?

Tommi Gröndahl

Yu-Wen Guo

Nirmal Asokan

420

19 Feb 2022

Retrieval-Augmented Reinforcement Learning

International Conference on Machine Learning (ICML), 2022

...

406

17 Feb 2022

Transformer Memory as a Differentiable Search Index

Neural Information Processing Systems (NeurIPS), 2022

...

434

368

14 Feb 2022

Semi-supervised New Event Type Induction and Description via Contrastive Loss-Enforced Batch Attention

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

Carl Edwards

Heng Ji

153

12 Feb 2022

Competition-Level Code Generation with AlphaCode

Science (Science), 2022

...

Esme Sutherland Robson

684

1,883

08 Feb 2022

A Survey on Retrieval-Augmented Text Generation

409

266

02 Feb 2022

Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2022

268

02 Feb 2022

Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

International Conference on Machine Learning (ICML), 2022

Graham Neubig

295

28 Jan 2022

LaMDA: Language Models for Dialog Applications

...

406

1,799

20 Jan 2022

Reasoning Through Memorization: Nearest Neighbor Knowledge Graph Embeddings

Natural Language Processing and Chinese Computing (NLPCC), 2022

Peng Wang

Xin Xie

Xiaohan Wang

Ningyu Zhang

RALM

415

14 Jan 2022

Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks

296

16 Dec 2021

Learning To Retrieve Prompts for In-Context Learning

386

832

16 Dec 2021

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

485

583

02 Dec 2021

The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design

International Conference on Learning Representations (ICLR), 2021

278

09 Oct 2021

Inductive Biases for Deep Learning of Higher-Level Cognition

Proceedings of the Royal Society A (Proc. R. Soc. A), 2020

Anirudh Goyal

Yoshua Bengio

AI4CE

533

419

30 Nov 2020