1901.10430
Cited By
v1
v2 (latest)
Pay Less Attention with Lightweight and Dynamic Convolutions
International Conference on Learning Representations (ICLR), 2019
29 January 2019
Felix Wu
Angela Fan
Alexei Baevski
Yann N. Dauphin
Michael Auli
Papers citing
"Pay Less Attention with Lightweight and Dynamic Convolutions"
50 / 337 papers shown
Title
Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Han Cai
Ji Lin
Chengyue Wu
Zhijian Liu
Haotian Tang
Hanrui Wang
Ligeng Zhu
Song Han
226
132
0
25 Apr 2022
BLISS: Robust Sequence-to-Sequence Learning via Self-Supervised Input Representation
Zheng Zhang
Liang Ding
Dazhao Cheng
Xuebo Liu
Min Zhang
Dacheng Tao
139
11
0
16 Apr 2022
A Call for Clarity in Beam Search: How It Works and When It Stops
International Conference on Language Resources and Evaluation (LREC), 2022
Jungo Kasai
Keisuke Sakaguchi
Ronan Le Bras
Dragomir R. Radev
Yejin Choi
Noah A. Smith
253
9
0
11 Apr 2022
MAESTRO: Matched Speech Text Representations through Modality Matching
Interspeech (Interspeech), 2022
Zhehuai Chen
Yu Zhang
Andrew Rosenberg
Bhuvana Ramabhadran
Pedro J. Moreno
Ankur Bapna
Heiga Zen
213
119
0
07 Apr 2022
Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding
Shanshan Wang
Zhumin Chen
Zhaochun Ren
Huasheng Liang
Qiang Yan
Sudipta Singha Roy
116
10
0
06 Apr 2022
MaxViT: Multi-Axis Vision Transformer
European Conference on Computer Vision (ECCV), 2022
Zhengzhong Tu
Hossein Talebi
Han Zhang
Feng Yang
P. Milanfar
A. Bovik
Yinxiao Li
ViT
459
867
0
04 Apr 2022
COOL, a Context Outlooker, and its Application to Question Answering and other Natural Language Processing Tasks
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Fangyi Zhu
See-Kiong Ng
S. Bressan
LRM
133
1
0
01 Apr 2022
Logit Normalization for Long-tail Object Detection
International Journal of Computer Vision (IJCV), 2022
Liang Zhao
Yao Teng
Limin Wang
213
17
0
31 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Computer Vision and Pattern Recognition (CVPR), 2022
Xiaohan Ding
Xinming Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian Sun
VLM
313
662
0
13 Mar 2022
Look Backward and Forward: Self-Knowledge Distillation with Bidirectional Decoder for Neural Machine Translation
Xuan Zhang
Libin Shen
Disheng Pan
Liangguo Wang
Yanjun Miao
163
1
0
10 Mar 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Tao Ge
Si-Qing Chen
Furu Wei
MoE
284
28
0
16 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
International Conference on Machine Learning (ICML), 2022
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
208
74
0
15 Feb 2022
Improving Neural Machine Translation by Denoising Training
Liang Ding
Keqin Peng
Dacheng Tao
VLM
AI4CE
185
6
0
19 Jan 2022
PAEG: Phrase-level Adversarial Example Generation for Neural Machine Translation
International Conference on Computational Linguistics (COLING), 2022
Juncheng Wan
Jian Yang
Shuming Ma
Dongdong Zhang
Weinan Zhang
Yong Yu
Zhoujun Li
SILM
AAML
193
5
0
06 Jan 2022
GAT-CADNet: Graph Attention Network for Panoptic Symbol Spotting in CAD Drawings
Computer Vision and Pattern Recognition (CVPR), 2022
Zhaohua Zheng
Jianfang Li
Lingjie Zhu
Honghua Li
F. Petzold
Ping Tan
119
22
0
03 Jan 2022
Spatio-temporal Relation Modeling for Few-shot Action Recognition
Anirudh Thatipelli
Sanath Narayan
Salman Khan
Rao Muhammad Anwer
Fahad Shahbaz Khan
Guohao Li
ViT
235
109
0
09 Dec 2021
3D Medical Point Transformer: Introducing Convolution to Attention Networks for Medical Point Cloud Analysis
Jianhui Yu
Chaoyi Zhang
Heng Wang
Dingxin Zhang
Yang Song
Tiange Xiang
Dongnan Liu
Weidong (Tom) Cai
ViT
MedIm
179
43
0
09 Dec 2021
OW-DETR: Open-world Detection Transformer
Akshita Gupta
Sanath Narayan
K. J. Joseph
Salman Khan
Fahad Shahbaz Khan
M. Shah
ViT
215
231
0
02 Dec 2021
FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding
B. Pung
Alvin Chan
115
0
0
28 Nov 2021
Dynamic Parameterized Network for CTR Prediction
Jian Zhu
Congcong Liu
Pei Wang
Xiwei Zhao
Guangpeng Chen
Junsheng Jin
Changping Peng
Zhangang Lin
Jingping Shao
168
2
0
09 Nov 2021
Mixed Transformer U-Net For Medical Image Segmentation
Hongyi Wang
Shiao Xie
Lanfen Lin
Yutaro Iwamoto
X. Han
Yenwei Chen
Ruofeng Tong
ViT
MedIm
136
340
0
08 Nov 2021
Direct Multi-view Multi-person 3D Pose Estimation
Tao Wang
Jianfeng Zhang
Yujun Cai
Shuicheng Yan
Jiashi Feng
3DH
271
118
0
07 Nov 2021
Towards Building ASR Systems for the Next Billion Users
Tahir Javed
Sumanth Doddapaneni
A. Raman
Kaushal Bhogale
Gowtham Ramesh
Anoop Kunchukuttan
Pratyush Kumar
Mitesh M. Khapra
216
69
0
06 Nov 2021
Efficiently Modeling Long Sequences with Structured State Spaces
International Conference on Learning Representations (ICLR), 2021
Albert Gu
Karan Goel
Christopher Ré
902
2,771
0
31 Oct 2021
Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
Neural Information Processing Systems (NeurIPS), 2021
Beidi Chen
Tri Dao
Eric Winsor
Zhao Song
Atri Rudra
Christopher Ré
152
151
0
28 Oct 2021
Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC
Chanjun Park
Midan Shim
Sugyeong Eo
Seolhwa Lee
Jaehyung Seo
Hyeonseok Moon
Heuiseok Lim
104
8
0
28 Oct 2021
GNN-LM: Language Modeling based on Global Contexts via GNN
Yuxian Meng
Shi Zong
Xiaoya Li
Xiaofei Sun
Tianwei Zhang
Leilei Gan
Jiwei Li
LRM
467
45
0
17 Oct 2021
Taming Sparsely Activated Transformer with Stochastic Experts
International Conference on Learning Representations (ICLR), 2021
Simiao Zuo
Xiaodong Liu
Jian Jiao
Young Jin Kim
Hany Hassan
Ruofei Zhang
T. Zhao
Jianfeng Gao
MoE
223
133
0
08 Oct 2021
The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu
Bei Li
Ye Lin
Yinqiao Li
Yanyang Li
Chenglong Wang
Tong Xiao
Jingbo Zhu
78
7
0
16 Sep 2021
Improving Neural Machine Translation by Bidirectional Training
Liang Ding
Di Wu
Dacheng Tao
137
34
0
16 Sep 2021
RankNAS: Efficient Neural Architecture Search by Pairwise Ranking
Chi Hu
Chenglong Wang
Xiangnan Ma
Xia Meng
Yinqiao Li
Tong Xiao
Jingbo Zhu
Changliang Li
203
13
0
15 Sep 2021
Topic-Aware Contrastive Learning for Abstractive Dialogue Summarization
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Junpeng Liu
Yanyan Zou
Hainan Zhang
Hongshen Chen
Zhuoye Ding
Caixia Yuan
95
70
0
10 Sep 2021
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Haoran Xu
Benjamin Van Durme
Kenton W. Murray
238
73
0
09 Sep 2021
Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
Yicheng Zou
Bolin Zhu
Xingwu Hu
Tao Gui
214
33
0
09 Sep 2021
Survey of Low-Resource Machine Translation
Computational Linguistics (CL), 2021
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
444
194
0
01 Sep 2021
Lightweight Self-Attentive Sequential Recommendation
International Conference on Information and Knowledge Management (CIKM), 2021
Yang Li
Tong Chen
Pengfei Zhang
Hongzhi Yin
HAI
AI4TS
128
121
0
25 Aug 2021
Discriminative Region-based Multi-Label Zero-Shot Learning
Sanath Narayan
Akshita Gupta
Salman Khan
Fahad Shahbaz Khan
Ling Shao
M. Shah
VLM
187
55
0
20 Aug 2021
An Attention Module for Convolutional Neural Networks
Zhu Baozhou
P. Hofstee
Jinho Lee
Zaid Al-Ars
181
32
0
18 Aug 2021
Adaptive Graph Convolution for Point Cloud Analysis
Hao Zhou
Yidan Feng
Mingsheng Fang
Mingqiang Wei
J. Qin
Tong Lu
3DPC
186
166
0
18 Aug 2021
Fast Convergence of DETR with Spatially Modulated Co-Attention
IEEE International Conference on Computer Vision (ICCV), 2021
Shiyang Feng
Minghang Zheng
Xiaogang Wang
Jifeng Dai
Jiaming Song
ViT
229
363
0
05 Aug 2021
Dialogue Summarization with Supporting Utterance Flow Modeling and Fact Regularization
Wang Chen
Pijian Li
Hou Pong Chan
Irwin King
HILM
AI4TS
141
10
0
03 Aug 2021
Dynamic Convolution for 3D Point Cloud Instance Segmentation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Tong He
Chunhua Shen
Anton Van Den Hengel
3DPC
221
17
0
18 Jul 2021
A Survey on Deep Domain Adaptation and Tiny Object Detection Challenges, Techniques and Datasets
Muhammed Muzammul
Xi Li
ObjD
214
18
0
16 Jul 2021
AutoBERT-Zero: Evolving BERT Backbone from Scratch
AAAI Conference on Artificial Intelligence (AAAI), 2021
Jiahui Gao
Hang Xu
Han Shi
Xiaozhe Ren
Philip L. H. Yu
Xiaodan Liang
Xin Jiang
Zhenguo Li
141
39
0
15 Jul 2021
Visual Parser: Representing Part-whole Hierarchies with Transformers
Shuyang Sun
Xiaoyu Yue
S. Bai
Juil Sock
207
28
0
13 Jul 2021
ABD-Net: Attention Based Decomposition Network for 3D Point Cloud Decomposition
Siddharth Katageri
S. V. Kudari
Akshaykumar Gunari
R. Tabib
U. Mudenagudi
3DPC
128
5
0
09 Jul 2021
A Survey on Dialogue Summarization: Recent Advances and New Frontiers
Xiachong Feng
Xiaocheng Feng
Bing Qin
257
113
0
07 Jul 2021
Learning Geometric Combinatorial Optimization Problems using Self-attention and Domain Knowledge
Jaeseung Lee
Woojin Choi
Jibum Kim
3DPC
127
0
0
05 Jul 2021
Language Models are Good Translators
Shuo Wang
Zhaopeng Tu
Zhixing Tan
Wenxuan Wang
Maosong Sun
Yang Liu
148
22
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
359
372
0
24 Jun 2021