Lite Transformer with Long-Short Range Attention (arXiv:2004.11886)
24 April 2020
Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
Papers citing "Lite Transformer with Long-Short Range Attention" (50 of 58 papers shown)
| Title | Authors | Tags | Citations | Date |
|---|---|---|---|---|
| Base-Detail Feature Learning Framework for Visible-Infrared Person Re-Identification | Zhihao Gong, Lian Wu, Yong Xu | | 0 | 06 May 2025 |
| ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition | Muhammad Waseem Akram, Stefano Dettori, V. Colla, Giorgio Buttazzo | | 0 | 17 Feb 2025 |
| Vision-Language Models for Edge Networks: A Comprehensive Survey | Ahmed Sharshar, Latif U. Khan, Waseem Ullah, Mohsen Guizani | VLM | 3 | 11 Feb 2025 |
| Parallel Sequence Modeling via Generalized Spatial Propagation Network | Hongjun Wang, Wonmin Byeon, Jiarui Xu, Jinwei Gu, Ka Chun Cheung, Xiaolong Wang, Kai Han, Jan Kautz, Sifei Liu | | 0 | 21 Jan 2025 |
| LPViT: Low-Power Semi-structured Pruning for Vision Transformers | Kaixin Xu, Zhe Wang, Chunyun Chen, Xue Geng, Jie Lin, Xulei Yang, Min-man Wu, Min Wu, Xiaoli Li, Weisi Lin | ViT, VLM | 7 | 02 Jul 2024 |
| Enhancing IoT Intelligence: A Transformer-based Reinforcement Learning Methodology | Gaith Rjoub, Saidul Islam, Jamal Bentahar, M. Almaiah, Rana Alrawashdeh | OffRL | 1 | 05 Apr 2024 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang R. Zhang | | 122 | 26 Jan 2024 |
| Efficiency-oriented approaches for self-supervised speech representation learning | Luis Lugo, Valentin Vielzeuf | SSL | 1 | 18 Dec 2023 |
| MobileNMT: Enabling Translation in 15MB and 30ms | Ye Lin, Xiaohui Wang, Zhexi Zhang, Mingxuan Wang, Tong Xiao, Jingbo Zhu | MQ | 1 | 07 Jun 2023 |
| EENED: End-to-End Neural Epilepsy Detection based on Convolutional Transformer | Chenyu Liu, Xin-qiu Zhou, Yang Liu | ViT, MedIm | 1 | 17 May 2023 |
| Training Large Language Models Efficiently with Sparsity and Dataflow | V. Srinivasan, Darshan Gandhi, Urmish Thakker, R. Prabhakar | MoE | 6 | 11 Apr 2023 |
| Pyramid Multi-branch Fusion DCNN with Multi-Head Self-Attention for Mandarin Speech Recognition | Kai Liu, Hailiang Xiong, Gangqiang Yang, Zhengfeng Du, Yewen Cao, D. Shah | | 0 | 23 Mar 2023 |
| Convolution-enhanced Evolving Attention Networks | Yujing Wang, Yaming Yang, Zhuowan Li, Jiangang Bai, Mingliang Zhang, Xiangtai Li, J. Yu, Ce Zhang, Gao Huang, Yu Tong | ViT | 6 | 16 Dec 2022 |
| CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion | Zixiang Zhao, Hao Bai, Jiangshe Zhang, Yulun Zhang, Shuang Xu, Zudi Lin, Radu Timofte, Luc Van Gool | | 309 | 26 Nov 2022 |
| ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design | Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Katie Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Lin | ViT | 76 | 18 Oct 2022 |
| E-Branchformer: Branchformer with Enhanced merging for speech recognition | Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu Jeong Han, Shinji Watanabe | | 105 | 30 Sep 2022 |
| Lightweight Transformers for Human Activity Recognition on Mobile Devices | Sannara Ek, François Portet, P. Lalanda | | 28 | 22 Sep 2022 |
| Efficient Methods for Natural Language Processing: A Survey | Marcos Vinícius Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, ..., Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz | | 109 | 31 Aug 2022 |
| MRL: Learning to Mix with Attention and Convolutions | Shlok Mohta, Hisahiro Suganuma, Yoshiki Tanaka | | 2 | 30 Aug 2022 |
| Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding | Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe | | 142 | 06 Jul 2022 |
| FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning | Yeonghyeon Lee, Kangwook Jang, Jahyun Goo, Youngmoon Jung, Hoi-Rim Kim | | 28 | 01 Jul 2022 |
| Knowledge Distillation of Transformer-based Language Models Revisited | Chengqiang Lu, Jianwei Zhang, Yunfei Chu, Zhengyu Chen, Jingren Zhou, Fei Wu, Haiqing Chen, Hongxia Yang | VLM | 10 | 29 Jun 2022 |
| Machine Learning for Microcontroller-Class Hardware: A Review | Swapnil Sayan Saha, S. Sandha, Mani B. Srivastava | | 118 | 29 May 2022 |
| Attention Mechanism in Neural Networks: Where it Comes and Where it Goes | Derya Soydaner | 3DV | 149 | 27 Apr 2022 |
| A Call for Clarity in Beam Search: How It Works and When It Stops | Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir R. Radev, Yejin Choi, Noah A. Smith | | 6 | 11 Apr 2022 |
| TubeDETR: Spatio-Temporal Video Grounding with Transformers | Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid | ViT | 94 | 30 Mar 2022 |
| A General Survey on Attention Mechanisms in Deep Learning | Gianni Brauwers, Flavius Frasincar | | 296 | 27 Mar 2022 |
| Deep Transformers Thirst for Comprehensive-Frequency Data | R. Xia, Chao Xue, Boyu Deng, Fang Wang, Jingchao Wang | ViT | 0 | 14 Mar 2022 |
| CP-ViT: Cascade Vision Transformer Pruning via Progressive Sparsity Prediction | Zhuoran Song, Yihong Xu, Zhezhi He, Li Jiang, Naifeng Jing, Xiaoyao Liang | ViT | 39 | 09 Mar 2022 |
| EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation | Tao Ge, Si-Qing Chen, Furu Wei | MoE | 21 | 16 Feb 2022 |
| Representing Long-Range Context for Graph Neural Networks with Global Attention | Zhanghao Wu, Paras Jain, Matthew A. Wright, Azalia Mirhoseini, Joseph E. Gonzalez, Ion Stoica | GNN | 258 | 21 Jan 2022 |
| 3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis | Jianhui Yu, Chaoyi Zhang, Heng Wang, Dingxin Zhang, Yang Song, Tiange Xiang, Dongnan Liu, Weidong (Tom) Cai | ViT, MedIm | 32 | 09 Dec 2021 |
| Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models | Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao-quan Song, Atri Rudra, Christopher Ré | | 75 | 30 Nov 2021 |
| Direct Multi-view Multi-person 3D Pose Estimation | Tao Wang, Jianfeng Zhang, Yujun Cai, Shuicheng Yan, Jiashi Feng | 3DH | 89 | 07 Nov 2021 |
| Vis-TOP: Visual Transformer Overlay Processor | Wei Hu, Dian Xu, Zimeng Fan, Fang Liu, Yanxiang He | BDL, ViT | 5 | 21 Oct 2021 |
| Language Modeling using LMUs: 10x Better Data Efficiency or Improved Scaling Compared to Transformers | Narsimha Chilkuri, Eric Hunsberger, Aaron R. Voelker, G. Malik, C. Eliasmith | | 7 | 05 Oct 2021 |
| Molformer: Motif-based Transformer on 3D Heterogeneous Molecular Graphs | Fang Wu, Dragomir R. Radev, Huabin Xing | ViT | 54 | 04 Oct 2021 |
| RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation | Md. Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart | | 22 | 21 Sep 2021 |
| Do Long-Range Language Models Actually Use Long-Range Context? | Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer | RALM | 80 | 19 Sep 2021 |
| Post-Training Quantization for Vision Transformer | Zhenhua Liu, Yunhe Wang, Kai Han, Siwei Ma, Wen Gao | ViT, MQ | 325 | 27 Jun 2021 |
| A Survey of Transformers | Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu | ViT | 1,087 | 08 Jun 2021 |
| Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency | Jinchuan Tian, Rongzhi Gu, Helin Wang, Yuexian Zou | | 0 | 08 Apr 2021 |
| Evaluating Neural Word Embeddings for Sanskrit | Kevin Qinghong Lin, Om Adideva, Digumarthi Komal, Laxmidhar Behera, Pawan Goyal | | 12 | 01 Apr 2021 |
| Exploiting Temporal Contexts with Strided Transformer for 3D Human Pose Estimation | Wenhao Li, Hong Liu, Runwei Ding, Mengyuan Liu, Pichao Wang, Wenming Yang | ViT | 189 | 26 Mar 2021 |
| The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods | Xian Shi, Fan Yu, Yizhou Lu, Yuhao Liang, Qiangze Feng, Daliang Wang, Y. Qian, Lei Xie | | 66 | 20 Feb 2021 |
| Bringing AI To Edge: From Deep Learning's Perspective | Di Liu, Hao Kong, Xiangzhong Luo, Weichen Liu, Ravi Subramaniam | | 116 | 25 Nov 2020 |
| ConvTransformer: A Convolutional Transformer Network for Video Frame Synthesis | Zhouyong Liu, S. Luo, Wubin Li, Jingben Lu, Yufan Wu, Shilei Sun, Chunguo Li, Luxi Yang | ViT | 79 | 20 Nov 2020 |
| Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention | Menglong Xu, Shengqiang Li, Xiao-Lei Zhang | | 31 | 23 Oct 2020 |
| Rethinking Document-level Neural Machine Translation | Zewei Sun, Mingxuan Wang, Hao Zhou, Chengqi Zhao, Shujian Huang, Jiajun Chen, Lei Li | VLM | 47 | 18 Oct 2020 |
| ConvBERT: Improving BERT with Span-based Dynamic Convolution | Zihang Jiang, Weihao Yu, Daquan Zhou, Yunpeng Chen, Jiashi Feng, Shuicheng Yan | | 156 | 06 Aug 2020 |