1901.10430
Pay Less Attention with Lightweight and Dynamic Convolutions
International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"
50 / 337 papers shown
Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction
Peter Atandoh
Jie Zou
Weikang Guo
Jiwei Wei
Zheng Wang
194
0
0
01 Nov 2025
Long Context Automated Essay Scoring with Language Models
Christopher Ormerod
Gitit Kehat
133
0
0
12 Sep 2025
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu
Qinghao Hu
Shang Yang
Haocheng Xi
Junyu Chen
Song Han
Han Cai
260
15
0
21 Aug 2025
Ensemble-Based Survival Models with the Self-Attended Beran Estimator Predictions
Computational Mathematics and Modeling (CMM), 2025
Lev V. Utkin
Semen P. Khomets
Vlada A. Efremenko
A. Konstantinov
Natalya M. Verbova
139
1
0
09 Jun 2025
Multi-Token Attention
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
350
4
0
01 Apr 2025
Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain
The Web Conference (WWW), 2025
Yuanmin Huang
Mi Zhang
Zhaoxiang Wang
Wenxuan Li
Min Yang
AAML
AI4TS
441
4
0
12 Mar 2025
The FFT Strikes Again: An Efficient Alternative to Self-Attention
Jacob Fein-Ashley
Rajgopal Kannan
Viktor Prasanna
614
1
0
25 Feb 2025
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
International Conference on Learning Representations (ICLR), 2024
Xianliang Li
Jun Luo
Zhiwei Zheng
Hanxiao Wang
Li Luo
Lingkun Wen
Linlong Wu
Sheng Xu
542
4
0
29 Nov 2024
Efficient Machine Translation with a BiLSTM-Attention Approach
Yuxu Wu
Yiren Xing
157
4
0
29 Oct 2024
big.LITTLE Vision Transformer for Efficient Visual Recognition
He Guo
Yulong Wang
Zixuan Ye
Jifeng Dai
Yuwen Xiong
ViT
262
2
0
14 Oct 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Aryan Garg
Raghav Mallampali
Akshat Joshi
Shrisudhan Govindarajan
Kaushik Mitra
290
1
0
20 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Computer Vision and Pattern Recognition (CVPR), 2024
Weihao Yu
Xinchao Wang
Mamba
355
185
0
13 May 2024
TransfoRhythm: A Transformer Architecture Conductive to Blood Pressure Estimation via Solo PPG Signal Capturing
Amir Arjomand
Amin Boudesh
Farnoush Bayatmakou
Kenneth B. Kent
Arash Mohammadi
376
3
0
15 Apr 2024
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation
Kamal Kumar
Yinhan Liu
Parth Patwa
Tanmoy
Mihir Adam Roberts
283
9
0
25 Mar 2024
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
Bohao Peng
Xiaoyang Wu
Li Jiang
Yukang Chen
Hengshuang Zhao
Zhuotao Tian
Jiaya Jia
284
57
0
21 Mar 2024
TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models
Junbing Yan
Chengyu Wang
Taolin Zhang
Xiao-Mei He
Junyuan Huang
Longtao Huang
Hui Xue
Wei Zhang
VLM
KELM
214
0
0
17 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
369
8
0
28 Feb 2024
Revisiting the Markov Property for Machine Translation
Cunxiao Du
Hao Zhou
Zhaopeng Tu
Jing Jiang
274
2
0
03 Feb 2024
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition
Lei Liu
Tianpeng Liu
Haizhou Li
266
12
0
31 Jan 2024
Topology-Aware Exploration of Energy-Based Models Equilibrium: Toric QC-LDPC Codes and Hyperbolic MET QC-LDPC Codes
V. Usatyuk
Denis Sapozhnikov
Sergey Egorov
255
0
0
26 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Computer Vision and Pattern Recognition (CVPR), 2024
Yuwen Xiong
Zhiqi Li
Yuntao Chen
Feng Wang
Xizhou Zhu
...
Jiaming Song
Yu Qiao
Lewei Lu
Jie Zhou
Jifeng Dai
165
144
0
11 Jan 2024
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation
J. Hu
Roberto Cavicchioli
Giulia Berardinelli
Alessandro Capotondi
197
3
0
26 Dec 2023
Gated Linear Attention Transformers with Hardware-Efficient Training
Aaron Courville
Bailin Wang
Songlin Yang
Yikang Shen
Yoon Kim
482
303
0
11 Dec 2023
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Computer Vision and Pattern Recognition (CVPR), 2023
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Raviteja Vemulapalli
Oncel Tuzel
CLIP
VLM
713
84
0
28 Nov 2023
Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches
Quazi Mishkatul Alam
Bilel Tarchoun
Ihsen Alouani
Nael B. Abu-Ghazaleh
AAML
ViT
229
1
0
21 Nov 2023
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge
Hongjian Zhou
Fenglin Liu
Boyang Gu
Xinyu Zou
Jinfa Huang
...
Yefeng Zheng
Lei A. Clifton
Zheng Li
Fenglin Liu
David Clifton
LM&MA
736
191
0
09 Nov 2023
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Florian Schmid
Khaled Koutini
Gerhard Widmer
222
26
0
24 Oct 2023
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review
Guanghua Wang
Weili Wu
AI4TS
AILaw
234
10
0
13 Oct 2023
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
International Conference on Learning Representations (ICLR), 2023
Ivan Lee
Nan Jiang
Taylor Berg-Kirkpatrick
425
15
0
12 Oct 2023
Sparse Universal Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shawn Tan
Songlin Yang
Zhenfang Chen
Aaron Courville
Chuang Gan
MoE
266
25
0
11 Oct 2023
Interpret Vision Transformers as ConvNets with Dynamic Convolutions
Chong Zhou
Chen Change Loy
Bo Dai
ViT
268
1
0
19 Sep 2023
Nonrigid Object Contact Estimation With Regional Unwrapping Transformer
IEEE International Conference on Computer Vision (ICCV), 2023
Wei Xie
Zimeng Zhao
Shiying Li
Binghui Zuo
Yangang Wang
196
4
0
27 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
214
17
0
10 Aug 2023
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning
V. Usatyuk
Sergey Egorov
Denis Sapozhnikov
330
3
0
28 Jul 2023
Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hao Tang
Guolei Sun
Andrii Zadaianchuk
Luc Van Gool
GAN
268
17
0
22 Jul 2023
EM-Network: Oracle Guided Self-distillation for Sequence Learning
International Conference on Machine Learning (ICML), 2023
J. Yoon
Sunghwan Ahn
Hyeon Seung Lee
Minchan Kim
Seokhwan Kim
N. Kim
VLM
285
2
0
14 Jun 2023
A Feature Reuse Framework with Texture-adaptive Aggregation for Reference-based Super-Resolution
Xiaoyong Mei
Yi Yang
Ming Li
Changqin Huang
Kai Zhang
Pietro Lio
166
4
0
02 Jun 2023
Monotonic Location Attention for Length Generalization
International Conference on Machine Learning (ICML), 2023
Jishnu Ray Chowdhury
Cornelia Caragea
LLMAG
177
11
0
31 May 2023
A Quantitative Review on Language Model Efficiency Research
Meng Jiang
Hy Dang
Lingbo Tong
206
0
0
28 May 2023
Parallel Data Helps Neural Entity Coreference Resolution
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Gongbo Tang
Christian Hardmeier
149
5
0
28 May 2023
Neural Machine Translation with Dynamic Graph Convolutional Decoder
Lei Li
Kai Fan
Ling Yang
Hongjian Li
Chun Yuan
159
5
0
28 May 2023
Neural Machine Translation for Mathematical Formulae
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Felix Petersen
M. Schubotz
André Greiner-Petter
Bela Gipp
196
11
0
25 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ziwei He
Meng Yang
Minwei Feng
Jingcheng Yin
Xiang Wang
Jingwen Leng
Zhouhan Lin
ViT
346
21
0
24 May 2023
Challenges in Context-Aware Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Linghao Jin
Jacqueline He
Jonathan May
Xuezhe Ma
219
12
0
23 May 2023
Finding the Pillars of Strength for Multi-Head Attention
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jinjie Ni
Rui Mao
Zonglin Yang
Han Lei
Xiaoshi Zhong
215
9
0
22 May 2023
VTPNet for 3D deep learning on point cloud
Wei Zhou
Weiwei Jin
Qian Wang
Yifan Wang
Dekui Wang
Xingxing Hao
Yong Yu
3DPC
ViT
165
1
0
10 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yanjun Liu
Xianfeng Zeng
Fandong Meng
Jie Zhou
180
4
0
04 May 2023
Sequence Modeling with Multiresolution Convolutional Memory
International Conference on Machine Learning (ICML), 2023
Jiaxin Shi
Ke Alexander Wang
E. Fox
302
22
0
02 May 2023
Detection of Pavement Cracks by Deep Learning Models of Transformer and UNet
Yu Zhang
Lin Zhang
UQCV
MedIm
ViT
169
35
0
25 Apr 2023
TransFlow: Transformer as Flow Learner
Computer Vision and Pattern Recognition (CVPR), 2023
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
289
64
0
23 Apr 2023