Pay Less Attention with Lightweight and Dynamic Convolutions
International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

37 / 337 papers shown
Depth-Adaptive Transformer
International Conference on Learning Representations (ICLR), 2019
Maha Elbayad
Jiatao Gu
Edouard Grave
Michael Auli
397
235
0
22 Oct 2019
Reducing Transformer Depth on Demand with Structured Dropout
International Conference on Learning Representations (ICLR), 2019
Angela Fan
Edouard Grave
Armand Joulin
611
656
0
25 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Findings (Findings), 2019
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
600
2,155
0
23 Sep 2019
Multi-agent Learning for Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Tianchi Bi
Hao Xiong
Zhongjun He
Hua Wu
Haifeng Wang
AI4CE
105
12
0
03 Sep 2019
A Unified Neural Coherence Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Han Cheol Moon
Tasnim Mohiuddin
Shafiq Joty
Xu Chi
96
49
0
01 Sep 2019
Adaptively Sparse Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Gonçalo M. Correia
Vlad Niculae
André F. T. Martins
341
277
0
30 Aug 2019
Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Biao Zhang
Ivan Titov
Rico Sennrich
183
115
0
29 Aug 2019
Revealing the Dark Secrets of BERT
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Olga Kovaleva
Alexey Romanov
Anna Rogers
Anna Rumshisky
380
603
0
21 Aug 2019
Dynamic Graph Message Passing Networks
Computer Vision and Pattern Recognition (CVPR), 2019
Li Zhang
Dan Xu
Anurag Arnab
Juil Sock
GNN
381
147
0
19 Aug 2019
Recurrent Graph Syntax Encoder for Neural Machine Translation
Liang Ding
Dacheng Tao
146
6
0
19 Aug 2019
Multi-modality Latent Interaction Network for Visual Question Answering
IEEE International Conference on Computer Vision (ICCV), 2019
Shiyang Feng
Haoxuan You
Zhanpeng Zhang
Xiaogang Wang
Jiaming Song
164
86
0
10 Aug 2019
UdS Submission for the WMT 19 Automatic Post-Editing Task
Conference on Machine Translation (WMT), 2019
Hongfei Xu
Qiuhui Liu
Josef van Genabith
94
4
0
09 Aug 2019
Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning
Physical Review X (PRX), 2019
Peter Y. Lu
Samuel Kim
Marin Soljacic
AI4CE
214
66
0
13 Jul 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
260
447
0
11 Jul 2019
Positional Normalization
Neural Information Processing Systems (NeurIPS), 2019
Boyi Li
Felix Wu
Kilian Q. Weinberger
Serge J. Belongie
193
105
0
09 Jul 2019
The Indirect Convolution Algorithm
Marat Dukhan
139
45
0
03 Jul 2019
Augmenting Self-attention with Persistent Memory
Sainbayar Sukhbaatar
Edouard Grave
Guillaume Lample
Hervé Jégou
Armand Joulin
RALM, KELM
221
149
0
02 Jul 2019
The University of Sydney's Machine Translation System for WMT19
Conference on Machine Translation (WMT), 2019
Liang Ding
Dacheng Tao
93
13
0
30 Jun 2019
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
International Conference on Machine Learning (ICML), 2019
Marc Brockschmidt
375
169
0
28 Jun 2019
Stand-Alone Self-Attention in Vision Models
Neural Information Processing Systems (NeurIPS), 2019
Prajit Ramachandran
Niki Parmar
Ashish Vaswani
Irwan Bello
Anselm Levskaya
Jonathon Shlens
VLM, SLR, ViT
371
1,323
0
13 Jun 2019
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
Yiping Lu
Zhuohan Li
Di He
Zhiqing Sun
Bin Dong
Tao Qin
Liwei Wang
Tie-Yan Liu
AI4CE
237
203
0
06 Jun 2019
Revisiting Low-Resource Neural Machine Translation: A Case Study
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Rico Sennrich
Biao Zhang
158
227
0
28 May 2019
Joint Source-Target Self Attention with Locality Constraints
José A. R. Fonollosa
Noe Casas
Marta R. Costa-jussá
132
23
0
16 May 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
270
20
0
07 May 2019
Low-Memory Neural Network Training: A Technical Report
N. Sohoni
Christopher R. Aberger
Megan Leszczynski
Jian Zhang
Christopher Ré
253
110
0
24 Apr 2019
BERTScore: Evaluating Text Generation with BERT
Tianyi Zhang
Varsha Kishore
Felix Wu
Kilian Q. Weinberger
Yoav Artzi
2.4K
7,458
0
21 Apr 2019
An Empirical Study of Spatial Attention Mechanisms in Deep Networks
Xizhou Zhu
Dazhi Cheng
Zheng Zhang
Stephen Lin
Jifeng Dai
184
487
0
11 Apr 2019
CondConv: Conditionally Parameterized Convolutions for Efficient Inference
Brandon Yang
Gabriel Bender
Quoc V. Le
Jiquan Ngiam
MedIm, 3DV
383
753
0
10 Apr 2019
Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
Awni Y. Hannun
Ann Lee
Qiantong Xu
R. Collobert
183
105
0
04 Apr 2019
fairseq: A Fast, Extensible Toolkit for Sequence Modeling
Myle Ott
Sergey Edunov
Alexei Baevski
Angela Fan
Sam Gross
Nathan Ng
David Grangier
Michael Auli
VLM, FaML
540
3,317
0
01 Apr 2019
FastFusionNet: New State-of-the-Art for DAWNBench SQuAD
Felix Wu
Boyi Li
Lequn Wang
Ni Lao
John Blitzer
Kilian Q. Weinberger
FedML
116
5
0
28 Feb 2019
Synchronous Bidirectional Inference for Neural Sequence Generation
Jiajun Zhang
Long Zhou
Yang Zhao
Chengqing Zong
163
39
0
24 Feb 2019
Seven Myths in Machine Learning Research
Oscar Chang
Hod Lipson
53
0
0
18 Feb 2019
Strategies for Structuring Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
298
220
0
04 Feb 2019
The Evolved Transformer
International Conference on Machine Learning (ICML), 2019
David R. So
Chen Liang
Quoc V. Le
ViT
537
487
0
30 Jan 2019
Tensorized Embedding Layers for Efficient Model Compression
Oleksii Hrinchuk
Valentin Khrulkov
L. Mirvakhabova
Elena Orlova
Ivan Oseledets
229
75
0
30 Jan 2019
Higher-order Network for Action Recognition
International Conference on Pattern Recognition (ICPR), 2018
Jie Shao
Xiangyang Xue
270
0
0
19 Nov 2018