H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences
Zhenhai Zhu, Radu Soricut
arXiv:2107.11906, 25 July 2021
Papers citing "H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences" (8 papers):
ZETA: Leveraging Z-order Curves for Efficient Top-k Attention
Qiuhao Zeng, Jerry Huang, Peng Lu, Gezheng Xu, Boxing Chen, Charles X. Ling, Boyu Wang
24 Jan 2025
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
28 Sep 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurélien Lucchi, Thomas Hofmann
25 May 2023
Diagonal State Spaces are as Effective as Structured State Spaces
Ankit Gupta, Albert Gu, Jonathan Berant
27 Mar 2022
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
Haoyi Zhou, Shanghang Zhang, J. Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wan Zhang [AI4TS]
14 Dec 2020
Big Bird: Transformers for Longer Sequences
Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed [VLM]
28 Jul 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier [MoE]
12 Mar 2020
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong, Hieu H. Pham, Christopher D. Manning
17 Aug 2015