1901.10430
Pay Less Attention with Lightweight and Dynamic Convolutions
International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
Papers citing
"Pay Less Attention with Lightweight and Dynamic Convolutions"
50 / 337 papers shown
Variational Neural Machine Translation with Normalizing Flows
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Hendra Setiawan
Matthias Sperber
Udhay Nallasamy
Matthias Paulik
DRL
121
13
0
28 May 2020
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
222
22
0
19 May 2020
Conformer: Convolution-augmented Transformer for Speech Recognition
Anmol Gulati
James Qin
Chung-Cheng Chiu
Niki Parmar
Yu Zhang
...
Wei Han
Shibo Wang
Zhengdong Zhang
Yonghui Wu
Ruoming Pang
650
3,739
0
16 May 2020
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding
Fenglin Liu
Xuancheng Ren
Guangxiang Zhao
Chenyu You
Xuewei Ma
Xian Wu
Xu Sun
398
2
0
16 May 2020
Hierarchical Attention Transformer Architecture For Syntactic Spell Correction
Abhishek Niranjan
B. Shaik
K. Verma
64
2
0
11 May 2020
Synthesizer: Rethinking Self-Attention in Transformer Models
International Conference on Machine Learning (ICML), 2020
Yi Tay
Dara Bahri
Donald Metzler
Da-Cheng Juan
Zhe Zhao
Che Zheng
249
379
0
02 May 2020
Hard-Coded Gaussian Attention for Neural Machine Translation
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Weiqiu You
Simeng Sun
Mohit Iyyer
225
71
0
02 May 2020
POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Yizhe Zhang
Guoyin Wang
Chunyuan Li
Zhe Gan
Chris Brockett
Bill Dolan
215
30
0
01 May 2020
Exploring Self-attention for Image Recognition
Computer Vision and Pattern Recognition (CVPR), 2020
Hengshuang Zhao
Jiaya Jia
V. Koltun
SSL
245
877
0
28 Apr 2020
Lite Transformer with Long-Short Range Attention
International Conference on Learning Representations (ICLR), 2020
Zhanghao Wu
Zhijian Liu
Ji Lin
Chengyue Wu
Song Han
156
357
0
24 Apr 2020
DyNet: Dynamic Convolution for Accelerating Convolutional Neural Networks
Yikang Zhang
Jian Zhang
Qiang-qiang Wang
Zhaobai Zhong
176
104
0
22 Apr 2020
Understanding the Difficulty of Training Transformers
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Liyuan Liu
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
Jiawei Han
AI4CE
227
282
0
17 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Yekun Chai
Jin Shuo
Xinwen Hou
252
21
0
17 Apr 2020
Transform and Tell: Entity-Aware News Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2020
Alasdair Tran
A. Mathews
Lexing Xie
VLM
169
108
0
17 Apr 2020
Training with Quantization Noise for Extreme Model Compression
International Conference on Learning Representations (ICLR), 2020
Angela Fan
Pierre Stock
Benjamin Graham
Edouard Grave
Remi Gribonval
Hervé Jégou
Armand Joulin
MQ
239
256
0
15 Apr 2020
Neural Machine Translation: Challenges, Progress and Future
Science China Technological Sciences (Sci China Technol Sci), 2020
Jiajun Zhang
Chengqing Zong
146
58
0
13 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
612
4,805
0
10 Apr 2020
Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning
Findings of the Association for Computational Linguistics: EMNLP (Findings), 2020
Zhaojiang Lin
Andrea Madotto
Pascale Fung
361
183
0
08 Apr 2020
Aligned Cross Entropy for Non-Autoregressive Machine Translation
International Conference on Machine Learning (ICML), 2020
Marjan Ghazvininejad
Vladimir Karpukhin
Luke Zettlemoyer
Omer Levy
166
120
0
03 Apr 2020
Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling
International Conference on Language Resources and Evaluation (LREC), 2020
Dmitrii Aksenov
J. Moreno-Schneider
Peter Bourgonje
Robert Schwarzenberg
Leonhard Hennig
Georg Rehm
174
31
0
29 Mar 2020
Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers
Hongfei Xu
Josef van Genabith
Qiuhui Liu
Deyi Xiong
115
3
0
21 Mar 2020
PowerNorm: Rethinking Batch Normalization in Transformers
International Conference on Machine Learning (ICML), 2020
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
234
17
0
17 Mar 2020
Revisit Systematic Generalization via Meaningful Learning
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2020
Ning Shi
Wei Ping
Wei Wang
Xiangyu Liu
Zhouhan Lin
428
2
0
14 Mar 2020
Meta-Embeddings Based On Self-Attention
Qichen Li
Xiaoke Jiang
Jun Xia
Jian Li
135
2
0
03 Mar 2020
Transformer++
Prakhar Thapak
P. Hore
98
0
0
02 Mar 2020
A Primer in BERTology: What we know about how BERT works
Transactions of the Association for Computational Linguistics (TACL), 2020
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
393
1,690
0
27 Feb 2020
On Feature Normalization and Data Augmentation
Computer Vision and Pattern Recognition (CVPR), 2020
Boyi Li
Felix Wu
Ser-Nam Lim
Serge J. Belongie
Kilian Q. Weinberger
205
154
0
25 Feb 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Findings of the Association for Computational Linguistics: EMNLP (Findings), 2020
Alessandro Raganato
Yves Scherrer
Jörg Tiedemann
331
96
0
24 Feb 2020
Tree-structured Attention with Hierarchical Accumulation
International Conference on Learning Representations (ICLR), 2020
Xuan-Phi Nguyen
Shafiq Joty
Guosheng Lin
R. Socher
137
77
0
19 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models
International Conference on Machine Learning (ICML), 2020
Srinadh Bhojanapalli
Chulhee Yun
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
155
121
0
17 Feb 2020
Incorporating BERT into Neural Machine Translation
International Conference on Learning Representations (ICLR), 2020
Jinhua Zhu
Ziheng Lu
Lijun Wu
Di He
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
FedML
AIMat
200
381
0
17 Feb 2020
Time-aware Large Kernel Convolutions
International Conference on Machine Learning (ICML), 2020
Vasileios Lioutas
Yuhong Guo
AI4TS
213
30
0
08 Feb 2020
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning
Peter Henderson
Jie Hu
Joshua Romoff
Emma Brunskill
Dan Jurafsky
Joelle Pineau
266
563
0
31 Jan 2020
Semi-Autoregressive Training Improves Mask-Predict Decoding
Marjan Ghazvininejad
Omer Levy
Luke Zettlemoyer
144
72
0
23 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
121
0
0
22 Jan 2020
Non-Autoregressive Machine Translation with Disentangled Context Transformer
International Conference on Machine Learning (ICML), 2020
Jungo Kasai
James Cross
Marjan Ghazvininejad
Jiatao Gu
260
35
0
15 Jan 2020
Is Attention All What You Need? -- An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
Thomas D. Dowdell
Hongyu Zhang
120
4
0
27 Dec 2019
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
154
133
0
25 Dec 2019
Tag-less Back-Translation
Machine Translation (MT), 2019
Idris Abdulmumin
B. Galadanci
Aliyu Dadan Garba
222
12
0
22 Dec 2019
Are Transformers universal approximators of sequence-to-sequence functions?
International Conference on Learning Representations (ICLR), 2019
Chulhee Yun
Srinadh Bhojanapalli
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
290
427
0
20 Dec 2019
Neural Machine Translation: A Review and Survey
Journal of Artificial Intelligence Research (JAIR), 2019
Felix Stahlberg
3DV
AI4TS
MedIm
327
373
0
04 Dec 2019
SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019
Bogdan Gliwa
Iwona Mochol
M. Biesek
A. Wawer
439
743
0
27 Nov 2019
Self-Attention Enhanced Selective Gate with Entity-Aware Embedding for Distantly Supervised Relation Extraction
AAAI Conference on Artificial Intelligence (AAAI), 2019
Yang Li
Guodong Long
Tao Shen
Wanrong Zhu
Lina Yao
Huan Huo
Jing Jiang
132
87
0
27 Nov 2019
Iterative Batch Back-Translation for Neural Machine Translation: A Conceptual Model
Idris Abdulmumin
B. Galadanci
Abubakar Isa
120
0
0
26 Nov 2019
MUSE: Parallel Multi-Scale Attention for Sequence to Sequence Learning
Guangxiang Zhao
Xu Sun
Jingjing Xu
Zhiyuan Zhang
Liangchen Luo
LRM
151
59
0
17 Nov 2019
Compressive Transformers for Long-Range Sequence Modelling
International Conference on Learning Representations (ICLR), 2019
Jack W. Rae
Anna Potapenko
Siddhant M. Jayakumar
Timothy Lillicrap
RALM
VLM
KELM
263
760
0
13 Nov 2019
Two-Headed Monster And Crossed Co-Attention Networks
Yaoyiran Li
Jing Jiang
130
0
0
10 Nov 2019
Distilling Knowledge Learned in BERT for Text Generation
Yen-Chun Chen
Zhe Gan
Yu Cheng
Jingzhou Liu
Jingjing Liu
225
31
0
10 Nov 2019
Data Diversification: A Simple Strategy For Neural Machine Translation
Xuan-Phi Nguyen
Shafiq Joty
Wu Kui
Ai Ti Aw
348
16
0
05 Nov 2019
Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification
International Workshop on Spoken Language Translation (IWSLT), 2019
Yingbo Gao
Christian Herold
Weiyue Wang
Hermann Ney
155
4
0
28 Oct 2019