2109.09115
Do Long-Range Language Models Actually Use Long-Range Context?
19 September 2021
Simeng Sun
Kalpesh Krishna
Andrew Mattarella-Micke
Mohit Iyyer
RALM
Papers citing
"Do Long-Range Language Models Actually Use Long-Range Context?"
26 / 76 papers shown
Long-range Language Modeling with Self-retrieval
Ohad Rubin
Jonathan Berant
RALM
KELM
19
18
0
23 Jun 2023
Exposing Attention Glitches with Flip-Flop Language Modeling
Bingbin Liu
Jordan T. Ash
Surbhi Goel
A. Krishnamurthy
Cyril Zhang
LRM
27
46
0
01 Jun 2023
Type Prediction With Program Decomposition and Fill-in-the-Type Training
Federico Cassano
Ming-Ho Yee
Noah Shinn
Arjun Guha
Steven Holtzen
29
5
0
25 May 2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
Amirkeivan Mohtashami
Martin Jaggi
LLMAG
19
149
0
25 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurélien Lucchi
Thomas Hofmann
34
53
0
25 May 2023
Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions
Byung-Doh Oh
William Schuler
22
2
0
17 May 2023
MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
L. Yu
Daniel Simig
Colin Flaherty
Armen Aghajanyan
Luke Zettlemoyer
M. Lewis
21
84
0
12 May 2023
Black-box language model explanation by context length probing
Ondřej Cífka
Antoine Liutkus
MILM
LRM
6
6
0
30 Dec 2022
Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems
Songbo Hu
Ivan Vulić
Fangyu Liu
Anna Korhonen
30
0
0
07 Nov 2022
Model Criticism for Long-Form Text Generation
Yuntian Deng
Volodymyr Kuleshov
Alexander M. Rush
31
19
0
16 Oct 2022
Context Limitations Make Neural Language Models More Human-Like
Tatsuki Kuribayashi
Yohei Oseki
Ana Brassard
Kentaro Inui
44
29
0
23 May 2022
RankGen: Improving Text Generation with Large Ranking Models
Kalpesh Krishna
Yapei Chang
John Wieting
Mohit Iyyer
AIMat
16
68
0
19 May 2022
ChapterBreak: A Challenge Dataset for Long-Range Language Models
Simeng Sun
Katherine Thai
Mohit Iyyer
10
19
0
22 Apr 2022
LaMemo: Language Modeling with Look-Ahead Memory
Haozhe Ji
Rongsheng Zhang
Zhenyu Yang
Zhipeng Hu
Minlie Huang
KELM
RALM
CLL
11
3
0
15 Apr 2022
Uniform Complexity for Text Generation
Joseph Marvin Imperial
Harish Tayyar Madabushi
13
3
0
11 Apr 2022
Memorizing Transformers
Yuhuai Wu
M. Rabe
DeLesley S. Hutchins
Christian Szegedy
RALM
16
171
0
16 Mar 2022
Block-Recurrent Transformers
DeLesley S. Hutchins
Imanol Schlag
Yuhuai Wu
Ethan Dyer
Behnam Neyshabur
16
94
0
11 Mar 2022
The NLP Task Effectiveness of Long-Range Transformers
Guanghui Qin
Yukun Feng
Benjamin Van Durme
10
28
0
16 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
35
65
0
15 Feb 2022
SCROLLS: Standardized CompaRison Over Long Language Sequences
Uri Shaham
Elad Segal
Maor Ivgi
Avia Efrat
Ori Yoran
...
Ankit Gupta
Wenhan Xiong
Mor Geva
Jonathan Berant
Omer Levy
RALM
23
133
0
10 Jan 2022
Coherence boosting: When your pretrained language model is not paying enough attention
Nikolay Malkin
Zhen Wang
Nebojsa Jojic
RALM
19
35
0
15 Oct 2021
Predicting Attention Sparsity in Transformers
Marcos Vinícius Treviso
António Góis
Patrick Fernandes
E. Fonseca
André F. T. Martins
35
13
0
24 Sep 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
219
88
0
31 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
254
2,012
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
238
579
0
12 Mar 2020
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
216
7,923
0
17 Aug 2015