2202.07765
Cited By
General-purpose, long-context autoregressive modeling with Perceiver AR
15 February 2022
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
Mateusz Malinowski
Sander Dieleman
Oriol Vinyals
M. Botvinick
Ian Simon
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
Papers citing
"General-purpose, long-context autoregressive modeling with Perceiver AR"
27 / 27 papers shown
Title
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
73
2
0
25 Mar 2025
MALT Diffusion: Memory-Augmented Latent Transformers for Any-Length Video Generation
Sihyun Yu
Meera Hahn
Dan Kondratyuk
Jinwoo Shin
Agrim Gupta
José Lezama
Irfan Essa
David A. Ross
Jonathan Huang
DiffM
VGen
72
0
0
18 Feb 2025
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
Markus J. Buehler
AI4CE
35
1
0
04 Jan 2025
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models
Julie Kallini
Shikhar Murty
Christopher D. Manning
Christopher Potts
Róbert Csordás
30
2
0
28 Oct 2024
Adaptive Large Language Models By Layerwise Attention Shortcuts
Prateek Verma
Mert Pilanci
KELM
OffRL
42
0
0
17 Sep 2024
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Lucas D. Lingle
24
15
0
28 Sep 2023
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Peike Li
Bo-Yu Chen
Yao Yao
Yikai Wang
Allen Wang
Alex Jinpeng Wang
MGen
VLM
DiffM
57
37
0
09 Aug 2023
Hierarchical Attention Encoder Decoder
Asier Mujika
BDL
11
3
0
01 Jun 2023
A Multi-Scale Attentive Transformer for Multi-Instrument Symbolic Music Generation
Xipin Wei
Junhui Chen
Zirui Zheng
Li Guo
Lantian Li
Dong Wang
12
3
0
26 May 2023
FIT: Far-reaching Interleaved Transformers
Ting Chen
Lala Li
19
12
0
22 May 2023
SingSong: Generating musical accompaniments from singing
Chris Donahue
Antoine Caillon
Adam Roberts
Ethan Manilow
P. Esling
...
Mauro Verzetti
Ian Simon
Olivier Pietquin
Neil Zeghidour
Jesse Engel
25
52
0
30 Jan 2023
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Daniel Y. Fu
Tri Dao
Khaled Kamal Saab
A. Thomas
Atri Rudra
Christopher Ré
43
367
0
28 Dec 2022
A Survey on Artificial Intelligence for Music Generation: Agents, Domains and Perspectives
Carlos Hernandez-Olivan
Javier Hernandez-Olivan
J. R. Beltrán
MGen
27
6
0
25 Oct 2022
Neural Attentive Circuits
Nasim Rahaman
M. Weiß
Francesco Locatello
C. Pal
Yoshua Bengio
Bernhard Schölkopf
Erran L. Li
Nicolas Ballas
19
6
0
14 Oct 2022
Deep Generative Multimedia Children's Literature
Matthew Lyle Olson
9
0
0
27 Sep 2022
COPER: Continuous Patient State Perceiver
V. Chauhan
Anshul Thakur
Odhran O'Donoghue
David A. Clifton
AI4TS
OOD
24
5
0
05 Aug 2022
Long Range Language Modeling via Gated State Spaces
Harsh Mehta
Ankit Gupta
Ashok Cutkosky
Behnam Neyshabur
Mamba
26
231
0
27 Jun 2022
Block-Recurrent Transformers
DeLesley S. Hutchins
Imanol Schlag
Yuhuai Wu
Ethan Dyer
Behnam Neyshabur
16
94
0
11 Mar 2022
Self-attention Does Not Need O(n^2) Memory
M. Rabe
Charles Staats
LRM
18
139
0
10 Dec 2021
Primer: Searching for Efficient Transformers for Language Modeling
David R. So
Wojciech Mańke
Hanxiao Liu
Zihang Dai
Noam M. Shazeer
Quoc V. Le
VLM
83
151
0
17 Sep 2021
Combiner: Full Attention Transformer with Sparse Computation Cost
Hongyu Ren
H. Dai
Zihang Dai
Mengjiao Yang
J. Leskovec
Dale Schuurmans
Bo Dai
73
77
0
12 Jul 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,764
0
24 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
174
336
0
01 Feb 2021
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
219
88
0
31 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer
Guru Guruganesh
Kumar Avinava Dubey
Joshua Ainslie
Chris Alberti
...
Philip Pham
Anirudh Ravula
Qifan Wang
Li Yang
Amr Ahmed
VLM
249
2,009
0
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
238
578
0
12 Mar 2020
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
225
2,542
0
25 Jan 2016