On the Generalization Ability of Retrieval-Enhanced Transformers Findings (Findings), 2023 Tobias Norlund Ehsan Doostmohammadi Richard Johansson Marco Kuhlmann RALM 103 7 0 23 Feb 2023
$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models Yangsibo Huang Daogao Liu Zexuan Zhong Weijia Shi Y. Lee RALM ALM 154 20 0 21 Feb 2023
Complex QA and language models hybrid architectures, Survey Xavier Daull P. Bellot Emmanuel Bruno Vincent Martin Elisabeth Murisasco ELM 623 17 0 17 Feb 2023
Augmented Language Models: a Survey Grégoire Mialon Roberto Dessì Maria Lomeli Christoforos Nalmpantis Ramakanth Pasunuru ... Jane Dwivedi-Yu Asli Celikyilmaz Edouard Grave Yann LeCun Thomas Scialom LRM KELM 254 482 0 15 Feb 2023
Characterizing Attribution and Fluency Tradeoffs for Retrieval-Augmented Large Language Models Renat Aksitov Chung-Ching Chang David Reitter Siamak Shakeri Yun-hsuan Sung RALM 178 20 0 11 Feb 2023
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Zhuolin Yang Ming-Yu Liu Zihan Liu V. Korthikanti Weili Nie ... Yuke Zhu Mohammad Shoeybi Bryan Catanzaro Chaowei Xiao Anima Anandkumar VLM RALM 161 50 0 09 Feb 2023
Toolformer: Language Models Can Teach Themselves to Use Tools Neural Information Processing Systems (NeurIPS), 2023 Timo Schick Jane Dwivedi-Yu Roberto Dessì Roberta Raileanu Maria Lomeli Luke Zettlemoyer Nicola Cancedda Thomas Scialom SyDa RALM 397 2,567 0 09 Feb 2023
Augmenting Zero-Shot Dense Retrievers with Plug-in Mixture-of-Memories Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Suyu Ge Chenyan Xiong Corby Rosset Arnold Overwijk Jiawei Han Paul N. Bennett VLM 129 11 0 07 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers International Conference on Artificial Intelligence and Statistics (AISTATS), 2023 K. Choromanski Shanda Li Valerii Likhosherstov Kumar Avinava Dubey Shengjie Luo Di He Yiming Yang Tamás Sarlós Thomas Weingarten Adrian Weller 301 9 0 03 Feb 2023
ResMem: Learn what you can and memorize the rest Neural Information Processing Systems (NeurIPS), 2023 Zitong Yang Michal Lukasik Vaishnavh Nagarajan Zong-xiao Li A. S. Rawat Manzil Zaheer A. Menon Surinder Kumar VLM 282 9 0 03 Feb 2023
QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time Reasoning Weimin Shi Mingchen Zhuge D. Gao Zhong Zhou Ming-Ming Cheng Deng-Ping Fan LRM VLM 205 0 0 02 Feb 2023
In-Context Retrieval-Augmented Language Models Transactions of the Association for Computational Linguistics (TACL), 2023 Ori Ram Yoav Levine Itay Dalmedigos Dor Muhlgay Amnon Shashua Kevin Leyton-Brown Y. Shoham KELM RALM LRM 485 821 0 31 Jan 2023
The Power of External Memory in Increasing Predictive Model Capacity Cenk Baykal D. Cutler Nishanth Dikkala Nikhil Ghosh Rina Panigrahy Xin Wang KELM 147 0 0 31 Jan 2023
Alternating Updates for Efficient Transformers Neural Information Processing Systems (NeurIPS), 2023 Cenk Baykal D. Cutler Nishanth Dikkala Nikhil Ghosh Rina Panigrahy Xin Wang MoE 172 8 0 30 Jan 2023
REPLUG: Retrieval-Augmented Black-Box Language Models North American Chapter of the Association for Computational Linguistics (NAACL), 2023 Weijia Shi Sewon Min Michihiro Yasunaga Minjoon Seo Rich James M. Lewis Luke Zettlemoyer Anuj Kumar RALM VLM KELM 651 832 0 30 Jan 2023
Semi-Parametric Video-Grounded Text Generation Sungdong Kim Jin-Hwa Kim Jiyoung Lee Minjoon Seo VGen 220 17 0 27 Jan 2023
Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute International Conference on Machine Learning (ICML), 2023 Michiel de Jong Yury Zemlyanskiy Nicholas FitzGerald Joshua Ainslie Sumit Sanghai Fei Sha William W. Cohen RALM 203 20 0 25 Jan 2023
Learning Customized Visual Models with Retrieval-Augmented Knowledge Computer Vision and Pattern Recognition (CVPR), 2023 Haotian Liu Kilho Son Jianwei Yang Ce Liu Jianfeng Gao Yong Jae Lee Chunyuan Li VLM 219 77 0 17 Jan 2023
Dissociating language and thought in large language models Kyle Mahowald Anna A. Ivanova I. Blank Nancy Kanwisher J. Tenenbaum Evelina Fedorenko ELM ReLM 276 228 0 16 Jan 2023
Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding Yunchang Zhu Liang Pang Kangxi Wu Yanyan Lan Huawei Shen Xueqi Cheng AAML ELM 151 3 0 10 Jan 2023
Why do Nearest Neighbor Language Models Work? International Conference on Machine Learning (ICML), 2023 Frank F. Xu Uri Alon Graham Neubig RALM 163 29 0 07 Jan 2023
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition David M. Chan Shalini Ghosh Ariya Rastrow Björn Hoffmeister OffRL 183 7 0 06 Jan 2023
Tsetlin Machine Embedding: Representing Words Using Logical Expressions Findings (Findings), 2023 Bimal Bhattarai Ole-Christoffer Granmo Lei Jiao Rohan Kumar Yadav Jivitesh Sharma NAI 180 17 0 02 Jan 2023
Rethinking with Retrieval: Faithful Large Language Model Inference Hangfeng He Hongming Zhang Dan Roth KELM LRM 443 201 0 31 Dec 2022
Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning IEEE International Conference on Computer Vision (ICCV), 2022 Woohyun Kang Jonghwan Mun Sungjun Lee Byungseok Roh VLM 217 26 0 27 Dec 2022
Contrastive Distillation Is a Sample-Efficient Self-Supervised Loss Policy for Transfer Learning Christopher T. Lengerich Gabriel Synnaeve Amy Zhang Hugh Leather Kurt Shuster François Charton Charysse Redwood SSL OffRL 175 1 0 21 Dec 2022
Resolving Indirect Referring Expressions for Entity Selection Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Mohammad Javad Hosseini Filip Radlinski Silvia Pareti Annie Louis 167 2 0 21 Dec 2022
Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 John Giorgi Luca Soldaini Bo Wang Gary D. Bader Kyle Lo Lucy Lu Wang Arman Cohan 223 21 0 20 Dec 2022
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Alex Troy Mallen Akari Asai Victor Zhong Rajarshi Das Daniel Khashabi Hannaneh Hajishirzi RALM HILM KELM 345 856 0 20 Dec 2022
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions Annual Meeting of the Association for Computational Linguistics (ACL), 2022 H. Trivedi Niranjan Balasubramanian Tushar Khot Ashish Sabharwal KELM RALM LRM 413 712 0 20 Dec 2022
Training Trajectories of Language Models Across Scales Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Mengzhou Xia Mikel Artetxe Chunting Zhou Xi Lin Ramakanth Pasunuru Danqi Chen Luke Zettlemoyer Ves Stoyanov AIFin LRM 231 70 0 19 Dec 2022
DSI++: Updating Transformer Memory with New Documents Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Sanket Vaibhav Mehta Jai Gupta Yi Tay Mostafa Dehghani Vinh Q. Tran J. Rao Marc Najork Emma Strubell Donald Metzler CLL 209 59 0 19 Dec 2022
FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Michiel de Jong Yury Zemlyanskiy Joshua Ainslie Nicholas FitzGerald Sumit Sanghai Fei Sha William W. Cohen VLM 257 37 0 15 Dec 2022
Faster Maximum Inner Product Search in High Dimensions International Conference on Machine Learning (ICML), 2022 Mo Tiwari Ryan Kang Je-Yong Lee Luke Lee Chris Piech Sebastian Thrun Ilan Shomorony Martin Jinye Zhang 346 6 0 14 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Zhongwei Wan Yichun Yin Wei Zhang Jiaxin Shi Lifeng Shang Guangyong Chen Xin Jiang Qun Liu VLM CLL 339 20 0 07 Dec 2022
A Flexible Nadaraya-Watson Head Can Offer Explainable and Calibrated Classification Alan Q. Wang M. Sabuncu 215 6 0 07 Dec 2022
Document-Level Abstractive Summarization Gonçalo Raposo Afonso Raposo Ana Sofia Carmo 118 3 0 06 Dec 2022
Improving Few-Shot Performance of Language Models via Nearest Neighbor Calibration Feng Nie Meixi Chen Zhirui Zhang Xuan Cheng 173 38 0 05 Dec 2022
Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Zhengbao Jiang Luyu Gao Jun Araki Haibo Ding Zhiruo Wang Jamie Callan Graham Neubig RALM 260 49 0 05 Dec 2022
Named Entity and Relation Extraction with Multi-Modal Retrieval Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Xinyu Wang Jiong Cai Yong Jiang Pengjun Xie Kewei Tu Wei Lu 199 67 0 03 Dec 2022
Nonparametric Masked Language Modeling Annual Meeting of the Association for Computational Linguistics (ACL), 2022 Sewon Min Weijia Shi M. Lewis Xilun Chen Anuj Kumar Hannaneh Hajishirzi Luke Zettlemoyer RALM 304 55 0 02 Dec 2022
Retrieval-Augmented Multimodal Language Modeling International Conference on Machine Learning (ICML), 2023 Michihiro Yasunaga Armen Aghajanyan Weijia Shi Rich James J. Leskovec Abigail Z. Jacobs M. Lewis Luke Zettlemoyer Anuj Kumar RALM 217 130 0 22 Nov 2022
Token Turing Machines Computer Vision and Pattern Recognition (CVPR), 2022 Michael S. Ryoo K. Gopalakrishnan Kumara Kahatapitiya Ted Xiao Kanishka Rao Austin Stone Yao Lu Julian Ibarz Anurag Arnab 210 28 0 16 Nov 2022
Galactica: A Large Language Model for Science Ross Taylor Marcin Kardas Guillem Cucurull Thomas Scialom Anthony Hartshorn Elvis Saravia Andrew Poulton Viktor Kerkez Robert Stojnic ELM ReLM 363 916 0 16 Nov 2022
A Survey of Knowledge Enhanced Pre-trained Language Models IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022 Linmei Hu Zeyi Liu Ziwang Zhao Lei Hou Liqiang Nie Juanzi Li KELM VLM 418 193 0 11 Nov 2022
Exemplar Guided Deep Neural Network for Spatial Transcriptomics Analysis of Gene Expression Prediction IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022 Yan Yang Md Zakir Hossain Eric A. Stone Shafin Rahman AI4TS 177 34 0 30 Oct 2022
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models International Conference on Learning Representations (ICLR), 2022 Xiaoman Pan Wenlin Yao Hongming Zhang Dian Yu Dong Yu Jianshu Chen KELM 483 26 0 28 Oct 2022
You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Andrew Drozdov Shufan Wang Razieh Rahimi Andrew McCallum Hamed Zamani Mohit Iyyer RALM 346 21 0 28 Oct 2022
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 Krishna Srinivasan K. Raman Anupam Samanta Ling-Yen Liao L. Bertelli Michael Bendersky RALM LRM 145 25 0 27 Oct 2022
Broken Neural Scaling Laws International Conference on Learning Representations (ICLR), 2022 Ethan Caballero Kshitij Gupta Irina Rish David M. Krueger 968 98 0 26 Oct 2022

Papers citing "Improving language models by retrieving from trillions of tokens"