Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019

29 January 2019

Angela Fan

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

50 / 337 papers shown

Title
Multi-refined Feature Enhanced Sentiment Analysis Using Contextual Instruction Peter Atandoh Jie Zou Weikang Guo Jiwei Wei Zheng Wang 106 0 0 01 Nov 2025
Long Context Automated Essay Scoring with Language Models Christopher Ormerod Gitit Kehat 88 0 0 12 Sep 2025
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search Yuxian Gu Qinghao Hu Shang Yang Haocheng Xi Junyu Chen Song Han Han Cai 172 10 0 21 Aug 2025
Ensemble-Based Survival Models with the Self-Attended Beran Estimator Predictions Computational Mathematics and Modeling (CMM), 2025 Lev V. Utkin Semen P. Khomets Vlada A. Efremenko A. Konstantinov Natalya M. Verbova 118 0 0 09 Jun 2025
Multi-Token Attention O. Yu. Golovneva Tianlu Wang Jason Weston Sainbayar Sukhbaatar 279 3 0 01 Apr 2025
Revisiting Backdoor Attacks on Time Series Classification in the Frequency Domain The Web Conference (WWW), 2025 Yuanmin Huang Mi Zhang Zhaoxiang Wang Wenxuan Li Min Yang AAML AI4TS 363 4 0 12 Mar 2025
The FFT Strikes Again: An Efficient Alternative to Self-Attention Jacob Fein-Ashley Rajgopal Kannan Viktor Prasanna 546 1 0 25 Feb 2025
On the Performance Analysis of Momentum Method: A Frequency Domain Perspective International Conference on Learning Representations (ICLR), 2024 Xianliang Li Jun Luo Zhiwei Zheng Hanxiao Wang Li Luo Lingkun Wen Linlong Wu Sheng Xu 449 4 0 29 Nov 2024
Efficient Machine Translation with a BiLSTM-Attention Approach Yuxu Wu Yiren Xing 132 3 0 29 Oct 2024
big.LITTLE Vision Transformer for Efficient Visual Recognition He Guo Yulong Wang Zixuan Ye Jifeng Dai Yuwen Xiong ViT 199 1 0 14 Oct 2024
Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction Aryan Garg Raghav Mallampali Akshat Joshi Shrisudhan Govindarajan Kaushik Mitra 247 1 0 20 May 2024
MambaOut: Do We Really Need Mamba for Vision? Computer Vision and Pattern Recognition (CVPR), 2024 Weihao Yu Xinchao Wang Mamba 257 165 0 13 May 2024
TransfoRhythm: A Transformer Architecture Conductive to Blood Pressure Estimation via Solo PPG Signal Capturing Amir Arjomand Amin Boudesh Farnoush Bayatmakou Kenneth B. Kent Arash Mohammadi 331 3 0 15 Apr 2024
Synthetic Data Generation and Joint Learning for Robust Code-Mixed Translation Kamal Kumar Yinhan Liu Parth Patwa Tanmoy Mihir Adam Roberts 238 6 0 25 Mar 2024
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation Bohao Peng Xiaoyang Wu Li Jiang Yukang Chen Hengshuang Zhao Zhuotao Tian Jiaya Jia 220 53 0 21 Mar 2024
TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models Junbing Yan Chengyu Wang Taolin Zhang Xiao-Mei He Junyuan Huang Longtao Huang Hui Xue Wei Zhang VLM KELM 139 0 0 17 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling Mahdi Karami Ali Ghodsi VLM 256 7 0 28 Feb 2024
Revisiting the Markov Property for Machine Translation Cunxiao Du Hao Zhou Zhaopeng Tu Jing Jiang 223 2 0 03 Feb 2024
Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition Lei Liu Tianpeng Liu Haizhou Li 210 12 0 31 Jan 2024
Topology-Aware Exploration of Energy-Based Models Equilibrium: Toric QC-LDPC Codes and Hyperbolic MET QC-LDPC Codes V. Usatyuk Denis Sapozhnikov Sergey Egorov 215 0 0 26 Jan 2024
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications Computer Vision and Pattern Recognition (CVPR), 2024 Yuwen Xiong Zhiqi Li Yuntao Chen Feng Wang Xizhou Zhu ... Jiaming Song Yu Qiao Lewei Lu Jie Zhou Jifeng Dai 137 125 0 11 Jan 2024
Heterogeneous Encoders Scaling In The Transformer For Neural Machine Translation J. Hu Roberto Cavicchioli Giulia Berardinelli Alessandro Capotondi 175 3 0 26 Dec 2023
Gated Linear Attention Transformers with Hardware-Efficient Training Aaron Courville Bailin Wang Songlin Yang Yikang Shen Yoon Kim 345 291 0 11 Dec 2023
MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training Computer Vision and Pattern Recognition (CVPR), 2023 Pavan Kumar Anasosalu Vasu Hadi Pouransari Fartash Faghri Raviteja Vemulapalli Oncel Tuzel CLIP VLM 467 81 0 28 Nov 2023
Attention Deficit is Ordered! Fooling Deformable Vision Transformers with Collaborative Adversarial Patches Quazi Mishkatul Alam Bilel Tarchoun Ihsen Alouani Nael B. Abu-Ghazaleh AAML ViT 192 1 0 21 Nov 2023
A Survey of Large Language Models in Medicine: Progress, Application, and Challenge Hongjian Zhou Fenglin Liu Boyang Gu Xinyu Zou Jinfa Huang ... Yefeng Zheng Lei A. Clifton Zheng Li Fenglin Liu David Clifton LM&MA 556 177 0 09 Nov 2023
Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023 Florian Schmid Khaled Koutini Gerhard Widmer 175 23 0 24 Oct 2023
Surveying the Landscape of Text Summarization with Deep Learning: A Comprehensive Review Guanghua Wang Weili Wu AI4TS AILaw 200 8 0 13 Oct 2023
Is attention required for ICL? Exploring the Relationship Between Model Architecture and In-Context Learning Ability International Conference on Learning Representations (ICLR), 2023 Ivan Lee Nan Jiang Taylor Berg-Kirkpatrick 381 15 0 12 Oct 2023
Sparse Universal Transformer Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Shawn Tan Songlin Yang Zhenfang Chen Aaron Courville Chuang Gan MoE 220 23 0 11 Oct 2023
Interpret Vision Transformers as ConvNets with Dynamic Convolutions Chong Zhou Chen Change Loy Bo Dai ViT 233 1 0 19 Sep 2023
Nonrigid Object Contact Estimation With Regional Unwrapping Transformer IEEE International Conference on Computer Vision (ICCV), 2023 Wei Xie Zimeng Zhao Shiying Li Binghui Zuo Yangang Wang 149 4 0 27 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding Ziyuan Huang Shiwei Zhang Liang Pan Zhiwu Qing Yingya Zhang Ziwei Liu Marcelo H. Ang 177 16 0 10 Aug 2023
Spherical and Hyperbolic Toric Topology-Based Codes On Graph Embedding for Ising MRF Models: Classical and Quantum Topology Machine Learning V. Usatyuk Sergey Egorov Denis Sapozhnikov 281 3 0 28 Jul 2023
Edge Guided GANs with Multi-Scale Contrastive Learning for Semantic Image Synthesis IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023 Hao Tang Guolei Sun Andrii Zadaianchuk Luc Van Gool GAN 243 17 0 22 Jul 2023
EM-Network: Oracle Guided Self-distillation for Sequence Learning International Conference on Machine Learning (ICML), 2023 J. Yoon Sunghwan Ahn Hyeon Seung Lee Minchan Kim Seokhwan Kim N. Kim VLM 248 2 0 14 Jun 2023
A Feature Reuse Framework with Texture-adaptive Aggregation for Reference-based Super-Resolution Xiaoyong Mei Yi Yang Ming Li Changqin Huang Kai Zhang Pietro Lio 137 4 0 02 Jun 2023
Monotonic Location Attention for Length Generalization International Conference on Machine Learning (ICML), 2023 Jishnu Ray Chowdhury Cornelia Caragea LLMAG 160 10 0 31 May 2023
A Quantitative Review on Language Model Efficiency Research Meng Jiang Hy Dang Lingbo Tong 159 0 0 28 May 2023
Parallel Data Helps Neural Entity Coreference Resolution Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Gongbo Tang Christian Hardmeier 103 5 0 28 May 2023
Neural Machine Translation with Dynamic Graph Convolutional Decoder Lei Li Kai Fan Ling Yang Hongjian Li Chun Yuan 114 5 0 28 May 2023
Neural Machine Translation for Mathematical Formulae Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Felix Petersen M. Schubotz André Greiner-Petter Bela Gipp 157 10 0 25 May 2023
Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Ziwei He Meng Yang Minwei Feng Jingcheng Yin Xiang Wang Jingwen Leng Zhouhan Lin ViT 271 19 0 24 May 2023
Challenges in Context-Aware Neural Machine Translation Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 Linghao Jin Jacqueline He Jonathan May Xuezhe Ma 191 11 0 23 May 2023
Finding the Pillars of Strength for Multi-Head Attention Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Jinjie Ni Rui Mao Zonglin Yang Han Lei Xiaoshi Zhong 182 7 0 22 May 2023
VTPNet for 3D deep learning on point cloud Wei Zhou Weiwei Jin Qian Wang Yifan Wang Dekui Wang Xingxing Hao Yong Yu 3DPC ViT 122 1 0 10 May 2023
BranchNorm: Robustly Scaling Extremely Deep Transformers Annual Meeting of the Association for Computational Linguistics (ACL), 2023 Yanjun Liu Xianfeng Zeng Fandong Meng Jie Zhou 139 4 0 04 May 2023
Sequence Modeling with Multiresolution Convolutional Memory International Conference on Machine Learning (ICML), 2023 Jiaxin Shi Ke Alexander Wang E. Fox 254 21 0 02 May 2023
Detection of Pavement Cracks by Deep Learning Models of Transformer and UNet Yu Zhang Lin Zhang UQCV MedIm ViT 133 31 0 25 Apr 2023
TransFlow: Transformer as Flow Learner Computer Vision and Pattern Recognition (CVPR), 2023 Yawen Lu Qifan Wang Siqi Ma Tong Geng Victor Y. Chen Huaijin Chen Dongfang Liu ViT 237 62 0 23 Apr 2023