
Pay Attention to MLPs
Neural Information Processing Systems (NeurIPS), 2021
17 May 2021
Hanxiao Liu, Zihang Dai, David R. So, Quoc V. Le
AI4CE

Papers citing "Pay Attention to MLPs"

Showing 50 of 323 citing papers.
HEAR: Holistic Evaluation of Audio Representations
Neural Information Processing Systems (NeurIPS), 2022
Joseph P. Turian, Jordie Shier, H. Khan, Bhiksha Raj, Björn W. Schuller, ..., P. Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk
06 Mar 2022
Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2022
Kailai Li, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen
ViT
02 Mar 2022
Filter-enhanced MLP is All You Need for Sequential Recommendation
The Web Conference (WWW), 2022
Kun Zhou, Hui Yu, Wayne Xin Zhao, Ji-Rong Wen
28 Feb 2022
Transformers in Medical Image Analysis: A Review
Intelligent Medicine (IM), 2022
Kelei He, Chen Gan, Zhuoyuan Li, I. Rekik, Zihao Yin, Wen Ji, Yang Gao, Qian Wang, Junfeng Zhang, Dinggang Shen
ViT, MedIm
24 Feb 2022
Transformer Quality in Linear Time
International Conference on Machine Learning (ICML), 2022
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
21 Feb 2022
Visual Attention Network
Computational Visual Media (CVM), 2022
Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu
ViT, VLM
20 Feb 2022
MLP-ASR: Sequence-length agnostic all-MLP architectures for speech recognition
Jin Sakuma, Tatsuya Komatsu, Robin Scheibler
17 Feb 2022
The Quarks of Attention
Artificial Intelligence (AIJ), 2022
Pierre Baldi, Roman Vershynin
GNN
15 Feb 2022
Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs
Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou
14 Feb 2022
BViT: Broad Attention based Vision Transformer
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Nannan Li, Yaran Chen, Weifan Li, Zixiang Ding, Dong Zhao
ViT
13 Feb 2022
pNLP-Mixer: an Efficient all-MLP Architecture for Language
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Francesco Fusco, Damian Pascual, Peter W. J. Staar, Diego Antognini
09 Feb 2022
Image-to-Image MLP-mixer for Image Reconstruction
Youssef Mansour, Kang Lin, Reinhard Heckel
SupR
04 Feb 2022
Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations
International Conference on Machine Learning (ICML), 2022
Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein
31 Jan 2022
DynaMixer: A Vision MLP Architecture with Dynamic Mixing
International Conference on Machine Learning (ICML), 2022
Ziyu Wang, Wenhao Jiang, Yiming Zhu, Li Yuan, Yibing Song, Wei Liu
28 Jan 2022
When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism
AAAI Conference on Artificial Intelligence (AAAI), 2022
Guangting Wang, Yucheng Zhao, Chuanxin Tang, Chong Luo, Wenjun Zeng
26 Jan 2022
Patches Are All You Need?
Asher Trockman, J. Zico Kolter
ViT
24 Jan 2022
MAXIM: Multi-Axis MLP for Image Processing
Computer Vision and Pattern Recognition (CVPR), 2022
Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, P. Milanfar, A. Bovik, Yinxiao Li
09 Jan 2022
Beyond modeling: NLP Pipeline for efficient environmental policy analysis
J. Planas, Daniel Firebanks-Quevedo, G. Naydenova, Ramansh Sharma, Cristina Taylor, Kathleen Buckingham, Rong Fang
08 Jan 2022
The GatedTabTransformer. An enhanced deep learning architecture for tabular modeling
Radostin Cholakov, T. Kolev
LMTD
01 Jan 2022
RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality
Computer Vision and Pattern Recognition (CVPR), 2021
Xiaohan Ding, Honghao Chen, Xinming Zhang, Jungong Han, Guiguang Ding
21 Dec 2021
The King is Naked: on the Notion of Robustness for Natural Language Processing
Emanuele La Malfa, Marta Z. Kwiatkowska
13 Dec 2021
MLP Architectures for Vision-and-Language Modeling: An Empirical Study
Yi-Liang Nie, Linjie Li, Zhe Gan, Shuohang Wang, Chenguang Zhu, Michael Zeng, Zicheng Liu, Joey Tianyi Zhou, Lijuan Wang
08 Dec 2021
VIRT: Improving Representation-based Models for Text Matching through Virtual Interaction
Dan Li, Yang Yang, Hongyin Tang, Jingang Wang, Tong Xu, Wei Wu, Enhong Chen
08 Dec 2021
A Novel Deep Parallel Time-series Relation Network for Fault Diagnosis
Chun Yang
AI4TS, AI4CE
03 Dec 2021
SWAT: Spatial Structure Within and Among Tokens
International Joint Conference on Artificial Intelligence (IJCAI), 2021
Kumara Kahatapitiya, Michael S. Ryoo
26 Nov 2021
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning
David Junhao Zhang, Kunchang Li, Yali Wang, Yuxiang Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou
AI4TS
24 Nov 2021
An Image Patch is a Wave: Phase-Aware Vision MLP
Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang
24 Nov 2021
Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers
John Guibas, Morteza Mardani, Zong-Yi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro
24 Nov 2021
MetaFormer Is Actually What You Need for Vision
Computer Vision and Pattern Recognition (CVPR), 2021
Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan
22 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
European Conference on Computer Vision (ECCV), 2021
Jaesung Choe, Chunghyun Park, François Rameau, Jaesik Park, In So Kweon
3DPC
22 Nov 2021
Are Transformers More Robust Than CNNs?
Neural Information Processing Systems (NeurIPS), 2021
Yutong Bai, Jieru Mei, Alan Yuille, Cihang Xie
ViT, AAML
10 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu, Hai-Tao Zheng, Li Tao, Dun Liang, Haitao Zheng
07 Nov 2021
Convolutional Gated MLP: Combining Convolutions & gMLP
A. Rajagopal, V. Nirmala
06 Nov 2021
Arbitrary Distribution Modeling with Censorship in Real-Time Bidding Advertising
Xu Li, Michelle Ma Zhang, Youjun Tong, Zhenya Wang
26 Oct 2021
Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation
Shichang Zhang, Yozen Liu, Luke Huan, Neil Shah
17 Oct 2021
Attention-Free Keyword Spotting
Mashrur M. Morshed, Ahmad Omar Ahsan
14 Oct 2021
SuperShaper: Task-Agnostic Super Pre-training of BERT Models with Variable Hidden Dimensions
Vinod Ganesan, Gowtham Ramesh, Pratyush Kumar
10 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
European Conference on Computer Vision (ECCV), 2021
Jihao Liu, Jiaming Song, Guanglu Song, Xin Huang, Yu Liu
ViT
08 Oct 2021
Deep Instance Segmentation with Automotive Radar Detection Points
Tao Huang, Weiyi Xiong, Liping Bai, Yu Xia, Wei Chen, Wanli Ouyang, Bing Zhu
05 Oct 2021
General Cross-Architecture Distillation of Pretrained Language Models into Matrix Embeddings
Lukas Galke, Isabelle Cuber, Christophe Meyer, Henrik Ferdinand Nolscher, Angelina Sonderecker, A. Scherp
17 Sep 2021
Sparse MLP for Image Recognition: Is Self-Attention Really Necessary?
Chuanxin Tang, Yucheng Zhao, Guangting Wang, Chong Luo, Wenxuan Xie, Wenjun Zeng
MoE, ViT
12 Sep 2021
ConvMLP: Hierarchical Convolutional MLPs for Vision
Jiachen Li, Ali Hassani, Steven Walton, Humphrey Shi
09 Sep 2021
Cross-token Modeling with Conditional Computation
Yuxuan Lou, Fuzhao Xue, Zangwei Zheng, Yang You
MoE
05 Sep 2021
SANSformers: Self-Supervised Forecasting in Electronic Health Records with Attention-Free Models
Yogesh Kumar, Alexander Ilin, H. Salo, S. Kulathinal, M. Leinonen, Pekka Marttinen
AI4TS, MedIm
31 Aug 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Computer Vision and Pattern Recognition (CVPR), 2021
Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang
30 Aug 2021
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zhengjun Zha
30 Aug 2021
MOI-Mixer: Improving MLP-Mixer with Multi Order Interactions in Sequential Recommendation
Hojoon Lee, Dongyoon Hwang, Sunghwan Hong, Changyeon Kim, Seungryong Kim, Jaegul Choo
17 Aug 2021
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Asian Conference on Computer Vision (ACCV), 2021
Yuki Tatsunami, Masato Taki
09 Aug 2021
S$^2$-MLPv2: Improved Spatial-Shift MLP Architecture for Vision
Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li
02 Aug 2021
Structure and Performance of Fully Connected Neural Networks: Emerging Complex Network Properties
Leonardo F. S. Scabini, Odemir M. Bruno
GNN
29 Jul 2021