Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019

29 January 2019

Angela Fan

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

50 / 337 papers shown

Probabilistic Attention for Interactive Segmentation

Neural Information Processing Systems (NeurIPS), 2021

Prasad Gabbur

Manjot Bilkhu

J. Movellan

183

23 Jun 2021

LV-BERT: Exploiting Layer Variety for BERT

Findings (Findings), 2021

Weihao Yu

153

22 Jun 2021

Eigen Analysis of Self-Attention and its Reconstruction from Partial Computation

Srinadh Bhojanapalli

Sanjiv Kumar

110

16 Jun 2021

HELP: Hardware-Adaptive Efficient Latency Prediction for NAS via Meta-Learning

258

16 Jun 2021

Coreference-Aware Dialogue Summarization

Zhengyuan Liu

Ke Shi

Nancy F. Chen

150

16 Jun 2021

GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures

Carlo Luschi

141

10 Jun 2021

Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

105

10 Jun 2021

A Survey of Transformers

AI Open (AO), 2021

Tianyang Lin

Yuxin Wang

Xiangyang Liu

Xipeng Qiu

ViT

441

1,380

08 Jun 2021

On the Connection between Local Attention and Dynamic Depth-wise Convolution

International Conference on Learning Representations (ICLR), 2021

Ming-Ming Cheng

Jingdong Wang

356

132

08 Jun 2021

On the Language Coverage Bias for Neural Machine Translation

Findings (Findings), 2021

Shuo Wang

Zhaopeng Tu

Zhixing Tan

Shuming Shi

Maosong Sun

Yang Liu

116

07 Jun 2021

Neural Implicit 3D Shapes from Single Images with Spatial Patterns

International Conference on Image and Graphics (ICIG), 2021

Yujie Wang

159

06 Jun 2021

Attention mechanisms and deep learning for machine vision: A survey of the state of the art

A. M. Hafiz

S. A. Parah

R. A. Bhat

224

03 Jun 2021

Container: Context Aggregation Network

Neural Information Processing Systems (NeurIPS), 2021

271

02 Jun 2021

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Irwin King

147

02 Jun 2021

DLA-Net: Learning Dual Local Attention Features for Semantic Segmentation of Large-Scale Building Facade Point Clouds

Pattern Recognition (Pattern Recogn.), 2021

274

01 Jun 2021

Memory-Efficient Differentiable Transformer Architecture Search

Findings (Findings), 2021

217

31 May 2021

Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Zhiyong Wu

Lingpeng Kong

W. Bi

Xiang Li

B. Kao

LRM

134

30 May 2021

NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search

Knowledge Discovery and Data Mining (KDD), 2021

Xu Tan

139

30 May 2021

An Attention Free Transformer

381

162

28 May 2021

Controllable Abstractive Dialogue Summarization with Sketch Supervision

Findings (Findings), 2021

216

28 May 2021

TranSmart: A Practical Interactive Machine Translation System

170

27 May 2021

Learning Language Specific Sub-network for Multilingual Machine Translation

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Zehui Lin

Liwei Wu

Mingxuan Wang

Lei Li

274

19 May 2021

Pay Attention to MLPs

Neural Information Processing Systems (NeurIPS), 2021

567

796

17 May 2021

Dynamic Pooling Improves Nanopore Base Calling Accuracy

IEEE/ACM Transactions on Computational Biology & Bioinformatics (TCBB), 2021

160

16 May 2021

The Volctrans Neural Speech Translation System for IWSLT 2021

International Workshop on Spoken Language Translation (IWSLT), 2021

Lei Li

207

16 May 2021

Not All Memories are Created Equal: Learning to Forget by Expiring

International Conference on Machine Learning (ICML), 2021

Jason Weston

Angela Fan

CLL

239

13 May 2021

Poolingformer: Long Document Modeling with Pooling Attention

International Conference on Machine Learning (ICML), 2021

183

115

10 May 2021

Are Pre-trained Convolutions Better than Pre-trained Transformers?

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Zhen Qin

177

07 May 2021

MLP-Mixer: An all-MLP Architecture for Vision

Neural Information Processing Systems (NeurIPS), 2021

...

Alexey Dosovitskiy

1.2K

3,284

04 May 2021

Review of end-to-end speech synthesis technology based on deep learning

202

20 Apr 2021

Knowledge Neurons in Pretrained Transformers

Annual Meeting of the Association for Computational Linguistics (ACL), 2021

Damai Dai

Li Dong

Y. Hao

Zhifang Sui

Baobao Chang

Furu Wei

KELM MU

538

574

18 Apr 2021

How to Train BERT with an Academic Budget

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Peter Izsak

Moshe Berchansky

Omer Levy

338

128

15 Apr 2021

UniDrop: A Simple yet Effective Technique to Improve Transformer without Extra Cost

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

201

11 Apr 2021

Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Angela Fan

134

11 Apr 2021

Learning Graph Structures with Transformer for Multivariate Time Series Anomaly Detection in IoT

IEEE Internet of Things Journal (IEEE IoT Journal), 2021

Dingshuo Chen

319

456

08 Apr 2021

Do We Need Anisotropic Graph Neural Networks?

International Conference on Learning Representations (ICLR), 2021

348

03 Apr 2021

Dual Contrastive Loss and Attention for GANs

IEEE International Conference on Computer Vision (ICCV), 2021

Mario Fritz

325

31 Mar 2021

CvT: Introducing Convolutions to Vision Transformers

IEEE International Conference on Computer Vision (ICCV), 2021

Lu Yuan

Lei Zhang

ViT

495

2,273

29 Mar 2021

Unified Graph Structured Models for Video Understanding

IEEE International Conference on Computer Vision (ICCV), 2021

Anurag Arnab

Chen Sun

Cordelia Schmid

226

29 Mar 2021

Mask Attention Networks: Rethinking and Strengthen Transformer

North American Chapter of the Association for Computational Linguistics (NAACL), 2021

Dayiheng Liu

Xuanjing Huang

150

25 Mar 2021

Finetuning Pretrained Transformers into RNNs

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Hao Peng

317

24 Mar 2021

IOT: Instance-wise Layer Reordering for Transformer Structures

International Conference on Learning Representations (ICLR), 2021

189

05 Mar 2021

Random Feature Attention

International Conference on Learning Representations (ICLR), 2021

Hao Peng

Lingpeng Kong

329

401

03 Mar 2021

Do Transformer Modifications Transfer Across Implementations and Applications?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Sharan Narang

...

215

134

23 Feb 2021

Axial Residual Networks for CycleGAN-based Voice Conversion

170

16 Feb 2021

MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records

AAAI Conference on Artificial Intelligence (AAAI), 2021

338

03 Feb 2021

The heads hypothesis: A unifying statistical approach towards understanding multi-headed attention in BERT

AAAI Conference on Artificial Intelligence (AAAI), 2021

Mitesh M. Khapra

177

22 Jan 2021

Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

Machel Reid

Edison Marrese-Taylor

Y. Matsuo

MoE

340

01 Jan 2021

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers

Computer Vision and Pattern Recognition (CVPR), 2020

...

Li Zhang

534

3,400

31 Dec 2020

Neural Machine Translation: A Review of Methods, Resources, and Tools

AI Open (AO), 2020

Zhixing Tan

Shuo Wang

Zonghan Yang

Gang Chen

Xuancheng Huang

Maosong Sun

Yang Liu

3DV AI4TS

256

123

31 Dec 2020