ResearchTrend.AI
Reformer: The Efficient Transformer

13 January 2020
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
    VLM

Papers citing "Reformer: The Efficient Transformer"

50 / 375 papers shown
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Qichen Fu
Minsik Cho
Thomas Merth
Sachin Mehta
Mohammad Rastegari
Mahyar Najibi
38
25
0
19 Jul 2024
HDT: Hierarchical Document Transformer
Haoyu He
Markus Flicke
Jan Buchmann
Iryna Gurevych
Andreas Geiger
37
0
0
11 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Z. Zhang
Di He
KELM
33
0
0
03 Jul 2024
LPViT: Low-Power Semi-structured Pruning for Vision Transformers
Kaixin Xu
Zhe Wang
Chunyun Chen
Xue Geng
Jie Lin
Xulei Yang
Min-man Wu
Min Wu
Xiaoli Li
Weisi Lin
ViT
VLM
43
7
0
02 Jul 2024
Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation
Yuchen Yang
Yingdong Shi
Cheems Wang
Xiantong Zhen
Yuxuan Shi
Jun Xu
37
1
0
24 Jun 2024
Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity
Mucong Ding
Tahseen Rabbani
Bang An
Evan Z Wang
Furong Huang
28
20
0
21 Jun 2024
Language Modeling with Editable External Knowledge
Belinda Z. Li
Emmy Liu
Alexis Ross
Abbas Zeitoun
Graham Neubig
Jacob Andreas
KELM
32
4
0
17 Jun 2024
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren
Yang Liu
Yadong Lu
Yelong Shen
Chen Liang
Weizhu Chen
Mamba
68
55
0
11 Jun 2024
Symmetric Dot-Product Attention for Efficient Training of BERT Language Models
Martin Courtois
Malte Ostendorff
Leonhard Hennig
Georg Rehm
31
2
0
10 Jun 2024
TSCMamba: Mamba Meets Multi-View Learning for Time Series Classification
Md. Atik Ahamed
Qiang Cheng
Mamba
66
1
0
06 Jun 2024
Loki: Low-Rank Keys for Efficient Sparse Attention
Prajwal Singhania
Siddharth Singh
Shwai He
S. Feizi
A. Bhatele
32
13
0
04 Jun 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
25
0
0
27 May 2024
Foundational GPT Model for MEG
Richard Csaky
M. Es
Oiwi Parker Jones
M. Woolrich
32
2
0
14 Apr 2024
Hash3D: Training-free Acceleration for 3D Generation
Xingyi Yang
Xinchao Wang
3DGS
33
10
0
09 Apr 2024
ATFNet: Adaptive Time-Frequency Ensembled Network for Long-term Time Series Forecasting
Hengyu Ye
Jiadong Chen
Shijin Gong
Fuxin Jiang
Tieying Zhang
Jianjun Chen
Xiaofeng Gao
AI4TS
29
2
0
08 Apr 2024
On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers
Cai Zhou
Rose Yu
Yusu Wang
32
7
0
04 Apr 2024
DiJiang: Efficient Large Language Models through Compact Kernelization
Hanting Chen
Zhicheng Liu
Xutao Wang
Yuchuan Tian
Yunhe Wang
VLM
24
5
0
29 Mar 2024
UNITS: A Unified Multi-Task Time Series Model
Shanghua Gao
Teddy Koker
Owen Queen
Thomas Hartvigsen
Theodoros Tsiligkaridis
Marinka Zitnik
AI4TS
38
15
0
29 Feb 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
42
6
0
28 Feb 2024
Transformers are Expressive, But Are They Expressive Enough for Regression?
Swaroop Nath
H. Khadilkar
Pushpak Bhattacharyya
26
3
0
23 Feb 2024
Multimodal Transformer With a Low-Computational-Cost Guarantee
Sungjin Park
Edward Choi
49
1
0
23 Feb 2024
DeepLag: Discovering Deep Lagrangian Dynamics for Intuitive Fluid Prediction
Qilong Ma
Haixu Wu
Lanxiang Xing
Jianmin Wang
Mingsheng Long
AI4CE
26
0
0
04 Feb 2024
FreDF: Learning to Forecast in the Frequency Domain
Hao Wang
Licheng Pan
Zhichao Chen
Degui Yang
Sen Zhang
Yifei Yang
Xinggao Liu
Haoxuan Li
Dacheng Tao
AI4TS
57
13
0
04 Feb 2024
Efficiency-oriented approaches for self-supervised speech representation learning
Luis Lugo
Valentin Vielzeuf
SSL
26
1
0
18 Dec 2023
DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers
S. Chandy
Varun Gangal
Yi Yang
Gabriel Maggiotti
32
0
0
11 Dec 2023
Long-MIL: Scaling Long Contextual Multiple Instance Learning for Histopathology Whole Slide Image Analysis
Honglin Li
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Sunyi Zheng
Lin Yang
VLM
30
4
0
21 Nov 2023
BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis
Zelin Ni
Hang Yu
Shizhan Liu
Jianguo Li
Weiyao Lin
AI4TS
18
30
0
31 Oct 2023
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting
Xu Liu
Junfeng Hu
Yuan N. Li
Shizhe Diao
Yuxuan Liang
Bryan Hooi
Roger Zimmermann
AI4TS
27
75
0
15 Oct 2023
MemGPT: Towards LLMs as Operating Systems
Charles Packer
Sarah Wooders
Kevin Lin
Vivian Fang
Shishir G. Patil
Ion Stoica
Joseph E. Gonzalez
RALM
31
126
0
12 Oct 2023
Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis
Zezhi Shao
Fei Wang
Yongjun Xu
Wei Wei
Chengqing Yu
...
Guangyin Jin
Xin Cao
Gao Cong
Christian S. Jensen
Xueqi Cheng
AI4TS
18
57
0
09 Oct 2023
Training a Large Video Model on a Single Machine in a Day
Yue Zhao
Philipp Krahenbuhl
VLM
29
15
0
28 Sep 2023
Transformer-VQ: Linear-Time Transformers via Vector Quantization
Albert Mohwald
28
15
0
28 Sep 2023
Learning the Efficient Frontier
Philippe Chatigny
Ivan Sergienko
Ryan Ferguson
Jordan Weir
Maxime Bergeron
19
1
0
27 Sep 2023
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Yukang Chen
Shengju Qian
Haotian Tang
Xin Lai
Zhijian Liu
Song Han
Jiaya Jia
37
151
0
21 Sep 2023
Transformers versus LSTMs for electronic trading
Paul Bilokon
Yitao Qiu
AI4TS
AIFin
13
13
0
20 Sep 2023
Embed-Search-Align: DNA Sequence Alignment using Transformer Models
Pavan Holur
K. Enevoldsen
Shreyas Rajesh
L. Mboning
Thalia Georgiou
Louis-S. Bouchard
Matteo Pellegrini
V. Roychowdhury
18
0
0
20 Sep 2023
UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons
Sicheng Yang
Z. Wang
Zhiyong Wu
Minglei Li
Zhensong Zhang
...
Lei Hao
Songcen Xu
Xiaofei Wu
Changpeng Yang
Zonghong Dai
DiffM
39
14
0
13 Sep 2023
How to Protect Copyright Data in Optimization of Large Language Models?
T. Chu
Zhao-quan Song
Chiwun Yang
34
29
0
23 Aug 2023
Instruction Position Matters in Sequence Generation with Large Language Models
Yanjun Liu
Xianfeng Zeng
Fandong Meng
Jie Zhou
LRM
41
8
0
23 Aug 2023
RecycleGPT: An Autoregressive Language Model with Recyclable Module
Yu Jiang
Qiaozhi He
Xiaomin Zhuang
Zhihua Wu
Kunpeng Wang
Wenlai Zhao
Guangwen Yang
KELM
23
3
0
07 Aug 2023
Question Answering with Deep Neural Networks for Semi-Structured Heterogeneous Genealogical Knowledge Graphs
Omri Suissa
M. Zhitomirsky-Geffet
Avshalom Elmalech
GNN
BDL
29
8
0
30 Jul 2023
Attention over pre-trained Sentence Embeddings for Long Document Classification
Amine Abdaoui
Sourav Dutta
8
1
0
18 Jul 2023
LongNet: Scaling Transformers to 1,000,000,000 Tokens
Jiayu Ding
Shuming Ma
Li Dong
Xingxing Zhang
Shaohan Huang
Wenhui Wang
Nanning Zheng
Furu Wei
CLL
35
151
0
05 Jul 2023
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
M. Wong
Shangxin Guo
Ching Nam Hang
Siu-Wai Ho
C. Tan
37
78
0
04 Jul 2023
Learning Feature Matching via Matchable Keypoint-Assisted Graph Neural Network
Zizhuo Li
Jiayi Ma
27
2
0
04 Jul 2023
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading
Yujia Xiao
Shaofei Zhang
Xi Wang
Xuejiao Tan
Lei He
Sheng Zhao
Frank Soong
Tan Lee
17
5
0
03 Jul 2023
PaReprop: Fast Parallelized Reversible Backpropagation
Tyler Lixuan Zhu
K. Mangalam
17
1
0
15 Jun 2023
Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis
Zhiyu Jin
Xuli Shen
Bin Li
Xiangyang Xue
24
36
0
14 Jun 2023
Cross-LKTCN: Modern Convolution Utilizing Cross-Variable Dependency for Multivariate Time Series Forecasting
Donghao Luo
Xue Wang
BDL
AI4TS
11
2
0
04 Jun 2023
Koopa: Learning Non-stationary Time Series Dynamics with Koopman Predictors
Yong Liu
Chenyu Li
Jianmin Wang
Mingsheng Long
AI4TS
28
101
0
30 May 2023