2306.13421
Long-range Language Modeling with Self-retrieval
23 June 2023
Ohad Rubin
Jonathan Berant
RALM
KELM
Papers citing
"Long-range Language Modeling with Self-retrieval"
25 / 25 papers shown
Title
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Mikhail Burtsev
58
2
0
17 Feb 2025
Retrieval Augmented Spelling Correction for E-Commerce Applications
Xuan Guo
Rohit Patki
Dante Everaert
Christopher Potts
14
0
0
15 Oct 2024
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Ivan Rodkin
Dmitry Sorokin
Artyom Sorokin
Mikhail Burtsev
RALM
ALM
LRM
ReLM
ELM
31
57
0
14 Jun 2024
Reliable, Adaptable, and Attributable Language Models with Retrieval
Akari Asai
Zexuan Zhong
Danqi Chen
Pang Wei Koh
Luke Zettlemoyer
Hanna Hajishirzi
Wen-tau Yih
KELM
RALM
33
53
0
05 Mar 2024
Analyzing and Adapting Large Language Models for Few-Shot Multilingual NLU: Are We There Yet?
E. Razumovskaia
Ivan Vulić
Anna Korhonen
16
5
0
04 Mar 2024
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
Soham De
Samuel L. Smith
Anushan Fernando
Aleksandar Botev
George-Christian Muraru
...
David Budden
Yee Whye Teh
Razvan Pascanu
Nando de Freitas
Çağlar Gülçehre
Mamba
51
116
0
29 Feb 2024
In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss
Yuri Kuratov
Aydar Bulatov
Petr Anokhin
Dmitry Sorokin
Artyom Sorokin
Mikhail Burtsev
RALM
109
32
0
16 Feb 2024
Accelerating Retrieval-Augmented Language Model Serving with Speculation
Zhihao Zhang
Alan Zhu
Lijie Yang
Yihua Xu
Lanting Li
P. Phothilimthana
Zhihao Jia
RALM
KELM
31
14
0
25 Jan 2024
UniMS-RAG: A Unified Multi-source Retrieval-Augmented Generation for Personalized Dialogue Systems
Hongru Wang
Wenyu Huang
Yang Deng
Rui Wang
Zezhong Wang
Yufei Wang
Fei Mi
Jeff Z. Pan
Kam-Fai Wong
RALM
34
26
0
24 Jan 2024
KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
Sehyun Choi
Tianqing Fang
Zhaowei Wang
Yangqiu Song
17
32
0
13 Oct 2023
CacheGen: KV Cache Compression and Streaming for Fast Language Model Serving
Yuhan Liu
Hanchen Li
Yihua Cheng
Siddhant Ray
Yuyang Huang
...
Ganesh Ananthanarayanan
Michael Maire
Henry Hoffmann
Ari Holtzman
Junchen Jiang
37
41
0
11 Oct 2023
Making Retrieval-Augmented Language Models Robust to Irrelevant Context
Ori Yoran
Tomer Wolfson
Ori Ram
Jonathan Berant
RALM
LRM
16
175
0
02 Oct 2023
Attention Sorting Combats Recency Bias In Long Context Language Models
A. Peysakhovich
Adam Lerer
LRM
RALM
26
40
0
28 Sep 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
Qingyue Wang
Y. Fu
Yanan Cao
Zhiliang Tian
Shi Wang
Dacheng Tao
LLMAG
KELM
RALM
34
22
0
29 Aug 2023
A Comprehensive Overview of Large Language Models
Humza Naveed
Asad Ullah Khan
Shi Qiu
Muhammad Saqib
Saeed Anwar
Muhammad Usman
Naveed Akhtar
Nick Barnes
Ajmal Saeed Mian
OffRL
30
499
0
12 Jul 2023
Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu
Kevin Lin
John Hewitt
Ashwin Paranjape
Michele Bevilacqua
Fabio Petroni
Percy Liang
RALM
10
1,380
0
06 Jul 2023
Unlimiformer: Long-Range Transformers with Unlimited Length Input
Amanda Bertsch
Uri Alon
Graham Neubig
Matthew R. Gormley
RALM
91
122
0
02 May 2023
Resurrecting Recurrent Neural Networks for Long Sequences
Antonio Orvieto
Samuel L. Smith
Albert Gu
Anushan Fernando
Çağlar Gülçehre
Razvan Pascanu
Soham De
83
258
0
11 Mar 2023
Training Language Models with Memory Augmentation
Zexuan Zhong
Tao Lei
Danqi Chen
RALM
221
126
0
25 May 2022
Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval
Luyu Gao
Jamie Callan
RALM
150
326
0
12 Aug 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
206
87
0
31 Dec 2020
Distilling Knowledge from Reader to Retriever for Question Answering
Gautier Izacard
Edouard Grave
RALM
165
249
0
08 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
246
1,982
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
228
502
0
12 Mar 2020