Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019

29 January 2019

Angela Fan

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

50 / 337 papers shown

Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction

194

01 Nov 2025

Long Context Automated Essay Scoring with Language Models

Christopher Ormerod

Gitit Kehat

133

12 Sep 2025

Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search

260

21 Aug 2025

Ensemble-Based Survival Models with the Self-Attended Beran Estimator Predictions

Computational Mathematics and Modeling (CMM), 2025

139

09 Jun 2025

Multi-Token Attention

350

01 Apr 2025

Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain

The Web Conference (WWW), 2025

441

12 Mar 2025

The FFT Strikes Again: An Efficient Alternative to Self-Attention

Jacob Fein-Ashley

Rajgopal Kannan

Viktor Prasanna

614

25 Feb 2025

On the Performance Analysis of Momentum Method: A Frequency Domain Perspective

International Conference on Learning Representations (ICLR), 2024

542

29 Nov 2024

Efficient Machine Translation with a BiLSTM-Attention Approach

Yuxu Wu

Yiren Xing

157

29 Oct 2024

big.LITTLE Vision Transformer for Efficient Visual Recognition

Yulong Wang

Jifeng Dai

262

14 Oct 2024

Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction

Aryan Garg

Raghav Mallampali

Akshat Joshi

Shrisudhan Govindarajan

Kaushik Mitra

290

20 May 2024

MambaOut: Do We Really Need Mamba for Vision?

Computer Vision and Pattern Recognition (CVPR), 2024

Weihao Yu

Xinchao Wang

Mamba

355

185

13 May 2024

TransfoRhythm: A Transformer Architecture Conductive to Blood Pressure Estimation via Solo PPG Signal Capturing

376

15 Apr 2024

Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation

283

25 Mar 2024

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

284

21 Mar 2024

TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models

Chengyu Wang

Wei Zhang

214

17 Mar 2024

Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling

Mahdi Karami

Ali Ghodsi

VLM

369

28 Feb 2024

Revisiting the Markov Property for Machine Translation

274

03 Feb 2024

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition

Lei Liu

Tianpeng Liu

Haizhou Li

266

31 Jan 2024

Topology-Aware Exploration of Energy-Based Models Equilibrium: Toric QC-LDPC Codes and Hyperbolic MET QC-LDPC Codes

V. Usatyuk

Denis Sapozhnikov

Sergey Egorov

255

26 Jan 2024

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

Computer Vision and Pattern Recognition (CVPR), 2024

...

Yu Qiao

165

144

11 Jan 2024

Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation

197

26 Dec 2023

Gated Linear Attention Transformers with Hardware-Efficient Training

Bailin Wang

482

303

11 Dec 2023

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training

Computer Vision and Pattern Recognition (CVPR), 2023

Pavan Kumar Anasosalu Vasu

713

28 Nov 2023

Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches

229

21 Nov 2023

A Survey of Large Language Models in Medicine: Progress, Application, and Challenge

...

736

191

09 Nov 2023

Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Florian Schmid

Khaled Koutini

Gerhard Widmer

222

24 Oct 2023

Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review

Guanghua Wang

Weili Wu

AI4TS AILaw

234

13 Oct 2023

Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability

International Conference on Learning Representations (ICLR), 2023

Ivan Lee

Nan Jiang

Taylor Berg-Kirkpatrick

425

12 Oct 2023

Sparse Universal Transformer

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Shawn Tan

Songlin Yang

Zhenfang Chen

Aaron Courville

Chuang Gan

MoE

266

11 Oct 2023

Interpret Vision Transformers as ConvNets with Dynamic Convolutions

268

19 Sep 2023

Nonrigid Object Contact Estimation With Regional Unwrapping Transformer

IEEE International Conference on Computer Vision (ICCV), 2023

196

27 Aug 2023

Temporally-Adaptive Models for Efficient Video Understanding

Ziwei Liu

214

10 Aug 2023

Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning

V. Usatyuk

Sergey Egorov

Denis Sapozhnikov

330

28 Jul 2023

Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023

Luc Van Gool

268

22 Jul 2023

EM-Network: Oracle Guided Self-distillation for Sequence Learning

International Conference on Machine Learning (ICML), 2023

285

14 Jun 2023

A Feature Reuse Framework with Texture-adaptive Aggregation for Reference-based Super-Resolution

166

02 Jun 2023

Monotonic Location Attention for Length Generalization

International Conference on Machine Learning (ICML), 2023

Jishnu Ray Chowdhury

Cornelia Caragea

LLMAG

177

31 May 2023

A Quantitative Review on Language Model Efficiency Research

Meng Jiang

Hy Dang

Lingbo Tong

206

28 May 2023

Parallel Data Helps Neural Entity Coreference Resolution

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Gongbo Tang

Christian Hardmeier

149

28 May 2023

Neural Machine Translation with Dynamic Graph Convolutional Decoder

Lei Li

159

28 May 2023

Neural Machine Translation for Mathematical Formulae

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

196

25 May 2023

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

346

24 May 2023

Challenges in Context-Aware Neural Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

219

23 May 2023

Finding the Pillars of Strength for Multi-Head Attention

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

215

22 May 2023

VTPNet for 3D deep learning on point cloud

165

10 May 2023

BranchNorm: Robustly Scaling Extremely Deep Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yanjun Liu

Xianfeng Zeng

Fandong Meng

Jie Zhou

180

04 May 2023

Sequence Modeling with Multiresolution Convolutional Memory

International Conference on Machine Learning (ICML), 2023

Jiaxin Shi

Ke Alexander Wang

E. Fox

302

02 May 2023

Detection of Pavement Cracks by Deep Learning Models of Transformer and UNet

Yu Zhang

Lin Zhang

UQCV MedIm ViT

169

25 Apr 2023

TransFlow: Transformer as Flow Learner

Computer Vision and Pattern Recognition (CVPR), 2023

289

23 Apr 2023