Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

50 / 337 papers shown
Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction
Peter Atandoh
Jie Zou
Weikang Guo
Jiwei Wei
Zheng Wang
106
0
0
01 Nov 2025
Long Context Automated Essay Scoring with Language Models
Christopher Ormerod
Gitit Kehat
88
0
0
12 Sep 2025
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu
Qinghao Hu
Shang Yang
Haocheng Xi
Junyu Chen
Song Han
Han Cai
172
10
0
21 Aug 2025
Ensemble-Based Survival Models with the Self-Attended Beran Estimator Predictions
Computational Mathematics and Modeling (CMM), 2025
Lev V. Utkin
Semen P. Khomets
Vlada A. Efremenko
A. Konstantinov
Natalya M. Verbova
118
0
0
09 Jun 2025
Multi-Token Attention
O. Yu. Golovneva
Tianlu Wang
Jason Weston
Sainbayar Sukhbaatar
279
3
0
01 Apr 2025
Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain
The Web Conference (WWW), 2025
Yuanmin Huang
Mi Zhang
Zhaoxiang Wang
Wenxuan Li
Min Yang
AAML, AI4TS
363
4
0
12 Mar 2025
The FFT Strikes Again: An Efficient Alternative to Self-Attention
Jacob Fein-Ashley
Rajgopal Kannan
Viktor Prasanna
546
1
0
25 Feb 2025
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective
International Conference on Learning Representations (ICLR), 2024
Xianliang Li
Jun Luo
Zhiwei Zheng
Hanxiao Wang
Li Luo
Lingkun Wen
Linlong Wu
Sheng Xu
449
4
0
29 Nov 2024
Efficient Machine Translation with a BiLSTM-Attention Approach
Yuxu Wu
Yiren Xing
132
3
0
29 Oct 2024
big.LITTLE Vision Transformer for Efficient Visual Recognition
He Guo
Yulong Wang
Zixuan Ye
Jifeng Dai
Yuwen Xiong
ViT
199
1
0
14 Oct 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Aryan Garg
Raghav Mallampali
Akshat Joshi
Shrisudhan Govindarajan
Kaushik Mitra
247
1
0
20 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Computer Vision and Pattern Recognition (CVPR), 2024
Weihao Yu
Xinchao Wang
Mamba
257
165
0
13 May 2024
TransfoRhythm: A Transformer Architecture Conductive to Blood Pressure Estimation via Solo PPG Signal Capturing
Amir Arjomand
Amin Boudesh
Farnoush Bayatmakou
Kenneth B. Kent
Arash Mohammadi
331
3
0
15 Apr 2024
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation
Kamal Kumar
Yinhan Liu
Parth Patwa
Tanmoy
Mihir Adam Roberts
238
6
0
25 Mar 2024
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
Bohao Peng
Xiaoyang Wu
Li Jiang
Yukang Chen
Hengshuang Zhao
Zhuotao Tian
Jiaya Jia
220
53
0
21 Mar 2024
TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models
Junbing Yan
Chengyu Wang
Taolin Zhang
Xiao-Mei He
Junyuan Huang
Longtao Huang
Hui Xue
Wei Zhang
VLM, KELM
139
0
0
17 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
256
7
0
28 Feb 2024
Revisiting the Markov Property for Machine Translation
Cunxiao Du
Hao Zhou
Zhaopeng Tu
Jing Jiang
223
2
0
03 Feb 2024
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition
Lei Liu
Tianpeng Liu
Haizhou Li
210
12
0
31 Jan 2024
Topology-Aware Exploration of Energy-Based Models Equilibrium: Toric QC-LDPC Codes and Hyperbolic MET QC-LDPC Codes
V. Usatyuk
Denis Sapozhnikov
Sergey Egorov
215
0
0
26 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
Computer Vision and Pattern Recognition (CVPR), 2024
Yuwen Xiong
Zhiqi Li
Yuntao Chen
Feng Wang
Xizhou Zhu
...
Jiaming Song
Yu Qiao
Lewei Lu
Jie Zhou
Jifeng Dai
137
125
0
11 Jan 2024
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation
J. Hu
Roberto Cavicchioli
Giulia Berardinelli
Alessandro Capotondi
175
3
0
26 Dec 2023
Gated Linear Attention Transformers with Hardware-Efficient Training
Aaron Courville
Bailin Wang
Songlin Yang
Yikang Shen
Yoon Kim
345
291
0
11 Dec 2023
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
Computer Vision and Pattern Recognition (CVPR), 2023
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Raviteja Vemulapalli
Oncel Tuzel
CLIP, VLM
467
81
0
28 Nov 2023
Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches
Quazi Mishkatul Alam
Bilel Tarchoun
Ihsen Alouani
Nael B. Abu-Ghazaleh
AAML, ViT
192
1
0
21 Nov 2023
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge
Hongjian Zhou
Fenglin Liu
Boyang Gu
Xinyu Zou
Jinfa Huang
...
Yefeng Zheng
Lei A. Clifton
Zheng Li
Fenglin Liu
David Clifton
LM&MA
556
177
0
09 Nov 2023
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Florian Schmid
Khaled Koutini
Gerhard Widmer
175
23
0
24 Oct 2023
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review
Guanghua Wang
Weili Wu
AI4TS, AILaw
200
8
0
13 Oct 2023
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability
International Conference on Learning Representations (ICLR), 2023
Ivan Lee
Nan Jiang
Taylor Berg-Kirkpatrick
381
15
0
12 Oct 2023
Sparse Universal Transformer
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Shawn Tan
Songlin Yang
Zhenfang Chen
Aaron Courville
Chuang Gan
MoE
220
23
0
11 Oct 2023
Interpret Vision Transformers as ConvNets with Dynamic Convolutions
Chong Zhou
Chen Change Loy
Bo Dai
ViT
233
1
0
19 Sep 2023
Nonrigid Object Contact Estimation With Regional Unwrapping Transformer
IEEE International Conference on Computer Vision (ICCV), 2023
Wei Xie
Zimeng Zhao
Shiying Li
Binghui Zuo
Yangang Wang
149
4
0
27 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
177
16
0
10 Aug 2023
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning
V. Usatyuk
Sergey Egorov
Denis Sapozhnikov
281
3
0
28 Jul 2023
Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Hao Tang
Guolei Sun
Andrii Zadaianchuk
Luc Van Gool
GAN
243
17
0
22 Jul 2023
EM-Network: Oracle Guided Self-distillation for Sequence Learning
International Conference on Machine Learning (ICML), 2023
J. Yoon
Sunghwan Ahn
Hyeon Seung Lee
Minchan Kim
Seokhwan Kim
N. Kim
VLM
248
2
0
14 Jun 2023
A Feature Reuse Framework with Texture-adaptive Aggregation for Reference-based Super-Resolution
Xiaoyong Mei
Yi Yang
Ming Li
Changqin Huang
Kai Zhang
Pietro Lio
137
4
0
02 Jun 2023
Monotonic Location Attention for Length Generalization
International Conference on Machine Learning (ICML), 2023
Jishnu Ray Chowdhury
Cornelia Caragea
LLMAG
160
10
0
31 May 2023
A Quantitative Review on Language Model Efficiency Research
Meng Jiang
Hy Dang
Lingbo Tong
159
0
0
28 May 2023
Parallel Data Helps Neural Entity Coreference Resolution
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Gongbo Tang
Christian Hardmeier
103
5
0
28 May 2023
Neural Machine Translation with Dynamic Graph Convolutional Decoder
Lei Li
Kai Fan
Ling Yang
Hongjian Li
Chun Yuan
114
5
0
28 May 2023
Neural Machine Translation for Mathematical Formulae
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Felix Petersen
M. Schubotz
André Greiner-Petter
Bela Gipp
157
10
0
25 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Ziwei He
Meng Yang
Minwei Feng
Jingcheng Yin
Xiang Wang
Jingwen Leng
Zhouhan Lin
ViT
271
19
0
24 May 2023
Challenges in Context-Aware Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Linghao Jin
Jacqueline He
Jonathan May
Xuezhe Ma
191
11
0
23 May 2023
Finding the Pillars of Strength for Multi-Head Attention
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jinjie Ni
Rui Mao
Zonglin Yang
Han Lei
Xiaoshi Zhong
182
7
0
22 May 2023
VTPNet for 3D deep learning on point cloud
Wei Zhou
Weiwei Jin
Qian Wang
Yifan Wang
Dekui Wang
Xingxing Hao
Yong Yu
3DPC, ViT
122
1
0
10 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Yanjun Liu
Xianfeng Zeng
Fandong Meng
Jie Zhou
139
4
0
04 May 2023
Sequence Modeling with Multiresolution Convolutional Memory
International Conference on Machine Learning (ICML), 2023
Jiaxin Shi
Ke Alexander Wang
E. Fox
254
21
0
02 May 2023
Detection of Pavement Cracks by Deep Learning Models of Transformer and UNet
Yu Zhang
Lin Zhang
UQCV, MedIm, ViT
133
31
0
25 Apr 2023
TransFlow: Transformer as Flow Learner
Computer Vision and Pattern Recognition (CVPR), 2023
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
237
62
0
23 Apr 2023