Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

50 / 337 papers shown
Probabilistic Attention for Interactive Segmentation
Neural Information Processing Systems (NeurIPS), 2021
Prasad Gabbur
Manjot Bilkhu
J. Movellan
183
13
0
23 Jun 2021
LV-BERT: Exploiting Layer Variety for BERT
Findings (Findings), 2021
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
153
0
0
22 Jun 2021
Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation
Srinadh Bhojanapalli
Ayan Chakrabarti
Himanshu Jain
Sanjiv Kumar
Michal Lukasik
Andreas Veit
110
10
0
16 Jun 2021
HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning
Hayeon Lee
Sewoong Lee
Song Chong
Sung Ju Hwang
258
31
0
16 Jun 2021
Coreference-Aware Dialogue Summarization
Zhengyuan Liu
Ke Shi
Nancy F. Chen
150
69
0
16 Jun 2021
GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures
Ivan Chelombiev
Daniel Justus
Douglas Orr
A. Dietrich
Frithjof Gressmann
A. Koliousis
Carlo Luschi
141
5
0
10 Jun 2021
Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Tyler A. Chang
Yifan Xu
Weijian Xu
Zhuowen Tu
ViT
105
16
0
10 Jun 2021
A Survey of Transformers
AI Open (AO), 2021
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
441
1,380
0
08 Jun 2021
On the Connection between Local Attention and Dynamic Depth-wise Convolution
International Conference on Learning Representations (ICLR), 2021
Qi Han
Zejia Fan
Jingdong Sun
Lei-huan Sun
Ming-Ming Cheng
Jiaying Liu
Jingdong Wang
ViT
356
132
0
08 Jun 2021
On the Language Coverage Bias for Neural Machine Translation
Findings (Findings), 2021
Shuo Wang
Zhaopeng Tu
Zhixing Tan
Shuming Shi
Maosong Sun
Yang Liu
116
21
0
07 Jun 2021
Neural Implicit 3D Shapes from Single Images with Spatial Patterns
International Conference on Image and Graphics (ICIG), 2021
Yixin Zhuang
Yunzhe Liu
Yujie Wang
Baoquan Chen
3DPC
159
0
0
06 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
224
56
0
03 Jun 2021
Container: Context Aggregation Network
Neural Information Processing Systems (NeurIPS), 2021
Peng Gao
Jiasen Lu
Jiaming Song
Roozbeh Mottaghi
Aniruddha Kembhavi
ViT
271
81
0
02 Jun 2021
Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Wenxiang Jiao
Xing Wang
Zhaopeng Tu
Shuming Shi
Michael R. Lyu
Irwin King
UQLM
147
39
0
02 Jun 2021
DLA-Net: Learning Dual Local Attention Features for Semantic Segmentation of Large-Scale Building Facade Point Clouds
Pattern Recognition (Pattern Recogn.), 2021
Yanfei Su
Weiquan Liu
Zhimin Yuan
Ming Cheng
Zhihong Zhang
Xuelun Shen
Cheng-Yu Wang
3DPC
274
49
0
01 Jun 2021
Memory-Efficient Differentiable Transformer Architecture Search
Findings (Findings), 2021
Yuekai Zhao
Li Dong
Yelong Shen
Zhihua Zhang
Furu Wei
Weizhu Chen
ViT
217
20
0
31 May 2021
Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Zhiyong Wu
Lingpeng Kong
W. Bi
Xiang Li
B. Kao
LRM
134
97
0
30 May 2021
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
Knowledge Discovery and Data Mining (KDD), 2021
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
139
89
0
30 May 2021
An Attention Free Transformer
Shuangfei Zhai
Walter A. Talbott
Nitish Srivastava
Chen Huang
Hanlin Goh
Ruixiang Zhang
J. Susskind
ViT
381
162
0
28 May 2021
Controllable Abstractive Dialogue Summarization with Sketch Supervision
Findings (Findings), 2021
Chien-Sheng Wu
Linqing Liu
Wenhao Liu
Pontus Stenetorp
Caiming Xiong
216
57
0
28 May 2021
TranSmart: A Practical Interactive Machine Translation System
Guoping Huang
Lemao Liu
Xing Wang
Longyue Wang
Huayang Li
Zhaopeng Tu
Chengyang Huang
Shuming Shi
170
36
0
27 May 2021
Learning Language Specific Sub-network for Multilingual Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Zehui Lin
Liwei Wu
Mingxuan Wang
Lei Li
274
87
0
19 May 2021
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
567
796
0
17 May 2021
Dynamic Pooling Improves Nanopore Base Calling Accuracy
IEEE/ACM Transactions on Computational Biology & Bioinformatics (TCBB), 2021
V. Boža
Peter Perešíni
Broňa Brejová
T. Vinař
160
4
0
16 May 2021
The Volctrans Neural Speech Translation System for IWSLT 2021
International Workshop on Spoken Language Translation (IWSLT), 2021
Chengqi Zhao
Zhicheng Liu
Jian-Fei Tong
Tao Wang
Mingxuan Wang
Rong Ye
Qianqian Dong
Jun Cao
Lei Li
207
9
0
16 May 2021
Not All Memories are Created Equal: Learning to Forget by Expiring
International Conference on Machine Learning (ICML), 2021
Sainbayar Sukhbaatar
Da Ju
Spencer Poff
Stephen Roller
Arthur Szlam
Jason Weston
Angela Fan
CLL
239
36
0
13 May 2021
Poolingformer: Long Document Modeling with Pooling Attention
International Conference on Machine Learning (ICML), 2021
Hang Zhang
Yeyun Gong
Yelong Shen
Weisheng Li
Jiancheng Lv
Nan Duan
Weizhu Chen
183
115
0
10 May 2021
Are Pre-trained Convolutions Better than Pre-trained Transformers?
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Yi Tay
Mostafa Dehghani
J. Gupta
Dara Bahri
V. Aribandi
Zhen Qin
Donald Metzler
AI4CE
177
51
0
07 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
Neural Information Processing Systems (NeurIPS), 2021
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
1.2K
3,284
0
04 May 2021
Review of end-to-end speech synthesis technology based on deep learning
Zhaoxi Mu
Xinyu Yang
Yizhuo Dong
AuLLM
ALM
202
31
0
20 Apr 2021
Knowledge Neurons in Pretrained Transformers
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELM
MU
538
574
0
18 Apr 2021
How to Train BERT with an Academic Budget
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Peter Izsak
Moshe Berchansky
Omer Levy
338
128
0
15 Apr 2021
UniDrop: A Simple yet Effective Technique to Improve Transformer without Extra Cost
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Zhen Wu
Lijun Wu
Qi Meng
Ziheng Lu
Shufang Xie
Tao Qin
Xinyu Dai
Tie-Yan Liu
201
25
0
11 Apr 2021
Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Arun Babu
Akshat Shrivastava
Armen Aghajanyan
Ahmed Aly
Angela Fan
Marjan Ghazvininejad
134
21
0
11 Apr 2021
Learning Graph Structures with Transformer for Multivariate Time Series Anomaly Detection in IoT
IEEE Internet of Things Journal (IEEE IoT Journal), 2021
Zekai Chen
Dingshuo Chen
Xiao Zhang
Zixuan Yuan
Xiuzhen Cheng
AI4TS
319
456
0
08 Apr 2021
Do We Need Anisotropic Graph Neural Networks?
International Conference on Learning Representations (ICLR), 2021
Shyam A. Tailor
Felix L. Opolka
Pietro Lio
Nicholas D. Lane
348
42
0
03 Apr 2021
Dual Contrastive Loss and Attention for GANs
IEEE International Conference on Computer Vision (ICCV), 2021
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
325
68
0
31 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
IEEE International Conference on Computer Vision (ICCV), 2021
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
495
2,273
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
IEEE International Conference on Computer Vision (ICCV), 2021
Anurag Arnab
Chen Sun
Cordelia Schmid
226
52
0
29 Mar 2021
Mask Attention Networks: Rethinking and Strengthen Transformer
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Zhihao Fan
Yeyun Gong
Dayiheng Liu
Zhongyu Wei
Siyuan Wang
Jian Jiao
Nan Duan
Ruofei Zhang
Xuanjing Huang
150
78
0
25 Mar 2021
Finetuning Pretrained Transformers into RNNs
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
317
81
0
24 Mar 2021
IOT: Instance-wise Layer Reordering for Transformer Structures
International Conference on Learning Representations (ICLR), 2021
Jinhua Zhu
Lijun Wu
Ziheng Lu
Shufang Xie
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
189
8
0
05 Mar 2021
Random Feature Attention
International Conference on Learning Representations (ICLR), 2021
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
329
401
0
03 Mar 2021
Do Transformer Modifications Transfer Across Implementations and Applications?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Sharan Narang
Hyung Won Chung
Yi Tay
W. Fedus
Thibault Févry
...
Wei Li
Nan Ding
Jake Marcus
Adam Roberts
Colin Raffel
215
134
0
23 Feb 2021
Axial Residual Networks for CycleGAN-based Voice Conversion
J. You
Gyuhyeon Nam
Dalhyun Kim
Gyeongsu Chae
170
3
0
16 Feb 2021
MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records
AAAI Conference on Artificial Intelligence (AAAI), 2021
Zhen Xu
David R. So
Andrew M. Dai
Mamba
338
65
0
03 Feb 2021
The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT
AAAI Conference on Artificial Intelligence (AAAI), 2021
Madhura Pande
Aakriti Budhraja
Preksha Nema
Pratyush Kumar
Mitesh M. Khapra
177
20
0
22 Jan 2021
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Machel Reid
Edison Marrese-Taylor
Y. Matsuo
MoE
340
57
0
01 Jan 2021
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Computer Vision and Pattern Recognition (CVPR), 2020
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Juil Sock
Li Zhang
ViT
534
3,400
0
31 Dec 2020
Neural Machine Translation: A Review of Methods, Resources, and Tools
AI Open (AO), 2020
Zhixing Tan
Shuo Wang
Zonghan Yang
Gang Chen
Xuancheng Huang
Maosong Sun
Yang Liu
3DV
AI4TS
256
123
0
31 Dec 2020