Pay Less Attention with Lightweight and Dynamic Convolutions

International Conference on Learning Representations (ICLR), 2019

29 January 2019

Angela Fan

Papers citing "Pay Less Attention with Lightweight and Dynamic Convolutions"

37 / 337 papers shown

Depth-Adaptive Transformer

International Conference on Learning Representations (ICLR), 2019

397

235

22 Oct 2019

Reducing Transformer Depth on Demand with Structured Dropout

International Conference on Learning Representations (ICLR), 2019

Angela Fan

Edouard Grave

Armand Joulin

611

656

25 Sep 2019

TinyBERT: Distilling BERT for Natural Language Understanding

Findings (Findings), 2019

Xiaoqi Jiao

Yichun Yin

Lifeng Shang

Xin Jiang

Xiao Chen

Linlin Li

F. Wang

Qun Liu

VLM

600

2,155

23 Sep 2019

Multi-agent Learning for Neural Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

105

03 Sep 2019

A Unified Neural Coherence Model

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

01 Sep 2019

Adaptively Sparse Transformers

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

Gonçalo M. Correia

Vlad Niculae

André F. T. Martins

341

277

30 Aug 2019

Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

Biao Zhang

Ivan Titov

Rico Sennrich

183

115

29 Aug 2019

Revealing the Dark Secrets of BERT

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

380

603

21 Aug 2019

Dynamic Graph Message Passing Networks

Computer Vision and Pattern Recognition (CVPR), 2019

Li Zhang

Dan Xu

Anurag Arnab

Juil Sock

GNN

381

147

19 Aug 2019

Recurrent Graph Syntax Encoder for Neural Machine Translation

Liang Ding

Dacheng Tao

146

19 Aug 2019

Multi-modality Latent Interaction Network for Visual Question Answering

IEEE International Conference on Computer Vision (ICCV), 2019

164

10 Aug 2019

UdS Submission for the WMT 19 Automatic Post-Editing Task

Conference on Machine Translation (WMT), 2019

Hongfei Xu

Qiuhui Liu

Josef van Genabith

09 Aug 2019

Extracting Interpretable Physical Parameters from Spatiotemporal Systems using Unsupervised Learning

Physical Review X (PRX), 2019

214

13 Jul 2019

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

...

Colin Cherry

260

447

11 Jul 2019

Positional Normalization

Neural Information Processing Systems (NeurIPS), 2019

Boyi Li

Felix Wu

Kilian Q. Weinberger

Serge J. Belongie

193

105

09 Jul 2019

The Indirect Convolution Algorithm

Marat Dukhan

139

03 Jul 2019

Augmenting Self-attention with Persistent Memory

221

149

02 Jul 2019

The University of Sydney's Machine Translation System for WMT19

Conference on Machine Translation (WMT), 2019

Liang Ding

Dacheng Tao

30 Jun 2019

GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation

International Conference on Machine Learning (ICML), 2019

Marc Brockschmidt

375

169

28 Jun 2019

Stand-Alone Self-Attention in Vision Models

Neural Information Processing Systems (NeurIPS), 2019

371

1,323

13 Jun 2019

Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View

237

203

06 Jun 2019

Revisiting Low-Resource Neural Machine Translation: A Case Study

Annual Meeting of the Association for Computational Linguistics (ACL), 2019

Rico Sennrich

Biao Zhang

158

227

28 May 2019

Joint Source-Target Self Attention with Locality Constraints

José A. R. Fonollosa

Noe Casas

Marta R. Costa-jussá

132

16 May 2019

Taming Pretrained Transformers for Extreme Multi-label Text Classification

Inderjit Dhillon

270

07 May 2019

Low-Memory Neural Network Training: A Technical Report

N. Sohoni

Christopher R. Aberger

Megan Leszczynski

Jian Zhang

Christopher Ré

253

110

24 Apr 2019

BERTScore: Evaluating Text Generation with BERT

2.4K

7,458

21 Apr 2019

An Empirical Study of Spatial Attention Mechanisms in Deep Networks

184

487

11 Apr 2019

CondConv: Conditionally Parameterized Convolutions for Efficient Inference

383

753

10 Apr 2019

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

183

105

04 Apr 2019

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

Sergey Edunov

Angela Fan

540

3,317

01 Apr 2019

FastFusionNet: New State-of-the-Art for DAWNBench SQuAD

Boyi Li

116

28 Feb 2019

Synchronous Bidirectional Inference for Neural Sequence Generation

163

24 Feb 2019

Seven Myths in Machine Learning Research

Oscar Chang

Hod Lipson

18 Feb 2019

Strategies for Structuring Story Generation

Angela Fan

M. Lewis

Yann N. Dauphin

298

220

04 Feb 2019

The Evolved Transformer

International Conference on Machine Learning (ICML), 2019

537

487

30 Jan 2019

Tensorized Embedding Layers for Efficient Model Compression

229

30 Jan 2019

Higher-order Network for Action Recognition

International Conference on Pattern Recognition (ICPR), 2018

Jie Shao

Xiangyang Xue

270

19 Nov 2018