Improving language models by retrieving from trillions of tokens

8 December 2021
Sebastian Borgeaud
A. Mensch
Jordan Hoffmann
Trevor Cai
Eliza Rutherford
Katie Millican
George van den Driessche
Jean-Baptiste Lespiau
Bogdan Damoc
Aidan Clark
Diego de Las Casas
Aurelia Guy
Jacob Menick
Roman Ring
Tom Hennigan
Saffron Huang
Lorenzo Maggiore
Chris Jones
Albin Cassirer
Andy Brock
Michela Paganini
G. Irving
Oriol Vinyals
Simon Osindero
Karen Simonyan
Jack W. Rae
Erich Elsen
Laurent Sifre
KELM, RALM

Papers citing "Improving language models by retrieving from trillions of tokens"

50 / 893 papers shown
Query Rewriting for Retrieval-Augmented Large Language Models
Xinbei Ma
Yeyun Gong
Pengcheng He
Hai Zhao
Nan Duan
KELM, LRM
207
184
0
23 May 2023
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Leo Liu
Tim Dettmers
Xi Lin
Ves Stoyanov
Xian Li
MoE
166
12
0
23 May 2023
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
International Conference on Learning Representations (ICLR), 2023
Jian Xie
Kai Zhang
Jiangjie Chen
Renze Lou
Yu-Chuan Su
RALM
657
241
0
22 May 2023
Has It All Been Solved? Open NLP Research Questions Not Solved by Large Language Models
International Conference on Language Resources and Evaluation (LREC), 2023
Oana Ignat
Zhijing Jin
Artem Abzaliev
Laura Biester
Santiago Castro
...
Verónica Pérez-Rosas
Siqi Shen
Zekun Wang
Winston Wu
Amélie Reymond
LRM
304
8
0
21 May 2023
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
International Conference on Learning Representations (ICLR), 2023
Zhibin Gou
Zhihong Shao
Yeyun Gong
Yelong Shen
Yujiu Yang
Nan Duan
Weizhu Chen
KELM, LRM
369
560
0
19 May 2023
Decouple knowledge from parameters for plug-and-play language modeling
Xin Cheng
Yankai Lin
Preslav Nakov
Dongyan Zhao
Rui Yan
KELM
182
2
0
19 May 2023
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Neural Information Processing Systems (NeurIPS), 2023
Shibo Hao
Tianyang Liu
Zhen Wang
Zhiting Hu
RALM, LLMAG
445
232
0
19 May 2023
The Web Can Be Your Oyster for Improving Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Junyi Li
Tianyi Tang
Wayne Xin Zhao
Jingyuan Wang
Jian-Yun Nie
Ji-Rong Wen
RALM, KELM
363
7
0
18 May 2023
ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yue Yu
Yuchen Zhuang
Rongzhi Zhang
Yu Meng
Jiaming Shen
Chao Zhang
VLM
165
39
0
18 May 2023
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
International Conference on Learning Representations (ICLR), 2023
Shangbin Feng
Weijia Shi
Yuyang Bai
Vidhisha Balachandran
Tianxing He
Yulia Tsvetkov
KELM
301
48
0
17 May 2023
Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems
Lukas Stappen
J. Dillmann
S. Striegel
Hans-Jörg Vögel
Nicolas Flores-Herr
Björn W. Schuller
133
12
0
15 May 2023
Leveraging Large Language Models in Conversational Recommender Systems
Luke Friedman
Sameer Ahuja
David Allen
Zhenning Tan
Hakim Sidahmed
...
Ajay Patel
Harsh Lara
Brian Chu
Zexiang Chen
Manoj Kumar Tiwari
232
114
0
13 May 2023
Synergistic Interplay between Search and Large Language Models for Information Retrieval
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiazhan Feng
Chongyang Tao
Xiubo Geng
Tao Shen
Can Xu
Guodong Long
Dongyan Zhao
Daxin Jiang
KELM
247
19
0
12 May 2023
Active Retrieval Augmented Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Zhengbao Jiang
Frank F. Xu
Luyu Gao
Zhiqing Sun
Qian Liu
Jane Dwivedi-Yu
Yiming Yang
Jamie Callan
Graham Neubig
RALM
353
458
0
11 May 2023
Automatic Evaluation of Attribution by Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiang Yue
Boshi Wang
Ziru Chen
Kai Zhang
Yu-Chuan Su
Huan Sun
ALM, LRM, HILM
256
74
0
10 May 2023
DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition
International Workshop on Semantic Evaluation (SemEval), 2023
Zeqi Tan
Shen Huang
Zixia Jia
Jiong Cai
Hai-Tao Zheng
...
Yueting Zhuang
Kewei Tu
Pengjun Xie
Fei Huang
Yong Jiang
144
13
0
05 May 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Neural Information Processing Systems (NeurIPS), 2023
Amanda Bertsch
Uri Alon
Graham Neubig
Matthew R. Gormley
RALM
379
157
0
02 May 2023
UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models
Deming Ye
Yankai Lin
Zhengyan Zhang
Maosong Sun
KELM
211
0
0
02 May 2023
Why So Gullible? Enhancing the Robustness of Retrieval-Augmented Models against Counterfactual Noise
Giwon Hong
Jeonghwan Kim
Junmo Kang
Sung-Hyon Myaeng
Joyce Jiyoung Whang
RALM, AAML
325
36
0
02 May 2023
Representation Matters: The Game of Chess Poses a Challenge to Vision Transformers
European Conference on Artificial Intelligence (ECAI), 2023
Johannes Czech
Johannes Czech
Kristian Kersting
ViT
145
6
0
28 Apr 2023
Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks
The Web Conference (WWW), 2023
Shicheng Xu
Liang Pang
Huawei Shen
Xueqi Cheng
Tat-Seng Chua
RALM, KELM, LRM
433
82
0
28 Apr 2023
Analogy-Forming Transformers for Few-Shot 3D Parsing
International Conference on Learning Representations (ICLR), 2023
N. Gkanatsios
M. Singh
Zhaoyuan Fang
Shubham Tulsiani
Katerina Fragkiadaki
3DPC, 3DV
293
2
0
27 Apr 2023
q2d: Turning Questions into Dialogs to Teach Models How to Search
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yonatan Bitton
Shlomi Cohen-Ganor
Ido Hakimi
Yoad Lewenberg
Roee Aharoni
Enav Weinreb
233
5
0
27 Apr 2023
GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information
Qiao Jin
Yifan Yang
Qingyu Chen
Zhiyong Lu
LM&MA, LLMAG
175
201
0
19 Apr 2023
BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer
Nordic Conference of Computational Linguistics (NODALIDA), 2023
Lucas Georges Gabriel Charpentier
Sondre Wold
David Samuel
Egil Rønningstad
KELM, RALM
165
2
0
19 Apr 2023
An Evaluation on Large Language Model Outputs: Discourse and Memorization
Natural Language Processing Journal (JNLP), 2023
Adrian de Wynter
Xun Wang
Alex Sokolov
Qilong Gu
Si-Qing Chen
ELM
194
41
0
17 Apr 2023
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoE, ALM
308
63
0
17 Apr 2023
Tool Learning with Foundation Models
ACM Computing Surveys (ACM Comput. Surv.), 2023
Yujia Qin
Shengding Hu
Yankai Lin
Weize Chen
Ning Ding
...
Cheng Yang
Tongshuang Wu
Heng Ji
Zhiyuan Liu
Maosong Sun
334
306
0
17 Apr 2023
Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Wei Ping
Ming-Yu Liu
Peng Xu
Lawrence C. McAfee
Zihan Liu
...
Oleksii Kuchaiev
Yue Liu
Chaowei Xiao
Anima Anandkumar
Bryan Catanzaro
RALM
332
76
0
13 Apr 2023
chatClimate: Grounding Conversational AI in Climate Science
S. Vaghefi
Qian Wang
V. Muccione
Jingwei Ni
Mathias Kraus
...
Tobias Schimanski
Chiara Colesanti-Senni
Nicolas Webersinke
Christian Huggel
Markus Leippold
KELM, AI4MH, HILM
320
100
0
11 Apr 2023
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data
Computer Vision and Pattern Recognition (CVPR), 2023
Ahmet Iscen
Alireza Fathi
Cordelia Schmid
VLM, 3DV
231
29
0
11 Apr 2023
Similarity search in the blink of an eye with compressed indices
Proceedings of the VLDB Endowment (PVLDB), 2023
Cecilia Aguerrebere
Ishwar Bhati
Mark Hildebrand
Mariano Tepper
Ted Willke
201
45
0
07 Apr 2023
The Vector Grounding Problem
Dimitri Coelho Mollo
Raphael Milliere
355
41
0
04 Apr 2023
Querying Large Language Models with SQL
International Conference on Extending Database Technology (EDBT), 2023
Mohammed Saeed
Nicola De Cao
Paolo Papotti
222
41
0
02 Apr 2023
ConceptEVA: Concept-Based Interactive Exploration and Customization of Document Summaries
International Conference on Human Factors in Computing Systems (CHI), 2023
Xiaoyu Zhang
J. Li
Po-Wei Chi
Senthil K. Chandrasegaran
Kwan-Liu Ma
147
27
0
31 Mar 2023
Train/Test-Time Adaptation with Retrieval
Computer Vision and Pattern Recognition (CVPR), 2023
Luca Zancato
Alessandro Achille
Tian Yu Liu
Matthew Trager
Pramuditha Perera
Stefano Soatto
TTA, OOD
159
14
0
25 Mar 2023
$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference
International Conference on Learning Representations (ICLR), 2023
Benfeng Xu
Quan Wang
Zhendong Mao
Yajuan Lyu
Qiaoqiao She
Yongdong Zhang
277
66
0
24 Mar 2023
Large AI Models in Health Informatics: Applications, Challenges, and the Future
IEEE Journal of Biomedical and Health Informatics (IEEE JBHI), 2023
Jianing Qiu
Lin Li
Jiankai Sun
Jiachuan Peng
Peilun Shi
...
Bo Xiao
Wu Yuan
Ningli Wang
Dong Xu
Benny Lo
AI4MH, LM&MA
244
180
0
21 Mar 2023
Language Model Behavior: A Comprehensive Survey
International Conference on Computational Logic (ICCL), 2023
Tyler A. Chang
Benjamin Bergen
VLM, LRM, LM&MA
304
136
0
20 Mar 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq Joty
370
123
0
20 Mar 2023
TypeT5: Seq2seq Type Inference using Static Analysis
International Conference on Learning Representations (ICLR), 2023
Jiayi Wei
Greg Durrett
Işıl Dillig
204
23
0
16 Mar 2023
Text-to-image Diffusion Models in Generative AI: A Survey
Chenshuang Zhang
Chaoning Zhang
Mengchun Zhang
In So Kweon
VLM
247
370
0
14 Mar 2023
Magnushammer: A Transformer-Based Approach to Premise Selection
International Conference on Learning Representations (ICLR), 2023
Maciej Mikuła
Szymon Tworkowski
Szymon Antoniak
Bartosz Piotrowski
Albert Qiaochu Jiang
Jinyi Zhou
Christian Szegedy
Łukasz Kuciński
Piotr Miłoś
Yuhuai Wu
233
56
0
08 Mar 2023
Foundation Models for Decision Making: Problems, Methods, and Opportunities
Sherry Yang
Ofir Nachum
Yilun Du
Jason W. Wei
Pieter Abbeel
Dale Schuurmans
LM&Ro, OffRL, LRM, AI4CE
368
206
0
07 Mar 2023
Your representations are in the network: composable and parallel adaptation for large scale models
Neural Information Processing Systems (NeurIPS), 2023
Yonatan Dukler
Alessandro Achille
Hao Yang
Varsha Vivek
Luca Zancato
Benjamin Bowman
Avinash Ravichandran
Charless C. Fowlkes
A. Swaminathan
Stefano Soatto
261
3
0
07 Mar 2023
Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language
International Conference on Machine Learning (ICML), 2023
Philipp Seidl
Andreu Vall
Sepp Hochreiter
Günter Klambauer
279
59
0
06 Mar 2023
The Contribution of Knowledge in Visiolinguistic Learning: A Survey on Tasks and Challenges
Maria Lymperaiou
Giorgos Stamou
VLM
216
5
0
04 Mar 2023
Semiparametric Language Models Are Scalable Continual Learners
Guangyue Peng
Tao Ge
Si-Qing Chen
Furu Wei
Houfeng Wang
KELM
148
11
0
02 Mar 2023
Retrieved Sequence Augmentation for Protein Representation Learning
bioRxiv, 2023
Chang Ma
Haiteng Zhao
Lin Zheng
Jiayi Xin
Qintong Li
Lijun Wu
Zhihong Deng
Yang Lu
Qi Liu
Lingpeng Kong
AI4TS
166
14
0
24 Feb 2023
Less is More: Data Pruning for Faster Adversarial Training
Yize Li
Pu Zhao
Xinyu Lin
B. Kailkhura
Ryan Goldh
AAML
257
14
0
23 Feb 2023