Pay Attention to MLPs

17 May 2021
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
    AI4CE

Papers citing "Pay Attention to MLPs"

50 / 303 papers shown
Title
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
21
32
0
07 Nov 2022
How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers
Michael Hassid
Hao Peng
Daniel Rotem
Jungo Kasai
Ivan Montero
Noah A. Smith
Roy Schwartz
24
24
0
07 Nov 2022
Neural Fourier Shift for Binaural Speech Rendering
Jinkyu Lee
Kyogu Lee
23
7
0
02 Nov 2022
Globally Gated Deep Linear Networks
Qianyi Li
H. Sompolinsky
AI4CE
14
10
0
31 Oct 2022
QNet: A Quantum-native Sequence Encoder Architecture
Wei-Yen Day
Hao-Sheng Chen
Min Sun
21
0
0
31 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
32
49
0
25 Oct 2022
MetaFormer Baselines for Vision
Weihao Yu
Chenyang Si
Pan Zhou
Mi Luo
Yichen Zhou
Jiashi Feng
Shuicheng Yan
Xinchao Wang
MoE
23
156
0
24 Oct 2022
Similarity of Neural Architectures using Adversarial Attack Transferability
Jaehui Hwang
Dongyoon Han
Byeongho Heo
Song Park
Sanghyuk Chun
Jong-Seok Lee
AAML
24
1
0
20 Oct 2022
Decoupling Features in Hierarchical Propagation for Video Object Segmentation
Zongxin Yang
Yi Yang
VOS
11
152
0
18 Oct 2022
SA-MLP: Distilling Graph Knowledge from GNNs into Structure-Aware MLP
Jie Chen
Shouzhen Chen
Mingyuan Bai
Junbin Gao
Junping Zhang
Jian Pu
32
10
0
18 Oct 2022
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation
Ganesh Jawahar
Subhabrata Mukherjee
Xiaodong Liu
Young Jin Kim
Muhammad Abdul-Mageed
L. Lakshmanan
Ahmed Hassan Awadallah
Sébastien Bubeck
Jianfeng Gao
MoE
22
5
0
14 Oct 2022
Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap
Mathias Lechner
Ramin Hasani
Alexander Amini
Tsun-Hsuan Wang
T. Henzinger
Daniela Rus
CML
OOD
21
7
0
09 Oct 2022
The Lie Derivative for Measuring Learned Equivariance
Nate Gruver
Marc Finzi
Micah Goldblum
A. Wilson
14
34
0
06 Oct 2022
Centralized Feature Pyramid for Object Detection
Yu Quan
Dong Zhang
Liyan Zhang
Jinhui Tang
ObjD
24
147
0
05 Oct 2022
Rethinking Performance Gains in Image Dehazing Networks
Yuda Song
Yang Zhou
Hui Qian
Xin Du
SSeg
28
48
0
23 Sep 2022
Mega: Moving Average Equipped Gated Attention
Xuezhe Ma
Chunting Zhou
Xiang Kong
Junxian He
Liangke Gui
Graham Neubig
Jonathan May
Luke Zettlemoyer
14
182
0
21 Sep 2022
Analysis of Quantization on MLP-based Vision Models
Lingran Zhao
Zhen Dong
Kurt Keutzer
MQ
19
7
0
14 Sep 2022
Pre-Training a Graph Recurrent Network for Language Representation
Yile Wang
Linyi Yang
Zhiyang Teng
M. Zhou
Yue Zhang
GNN
30
1
0
08 Sep 2022
LKD-Net: Large Kernel Convolution Network for Single Image Dehazing
Pinjun Luo
Guoqiang Xiao
Xinbo Gao
Song Wu
19
31
0
05 Sep 2022
gSwin: Gated MLP Vision Model with Hierarchical Structure of Shifted Window
Mocho Go
Hideyuki Tachibana
ViT
29
9
0
24 Aug 2022
Efficient Attention-free Video Shift Transformers
Adrian Bulat
Brais Martínez
Georgios Tzimiropoulos
ViT
27
1
0
23 Aug 2022
CM-MLP: Cascade Multi-scale MLP with Axial Context Relation Encoder for Edge Segmentation of Medical Image
Jinkai Lv
Yuyong Hu
Quanshui Fu
Zhiwang Zhang
Yuqiang Hu
Lin Lv
Guoqing Yang
Jinpeng Li
Yi Zhao
MedIm
15
9
0
23 Aug 2022
Enhancing Targeted Attack Transferability via Diversified Weight Pruning
Hung-Jui Wang
Yuehua Wu
Shang-Tse Chen
AAML
16
2
0
18 Aug 2022
giMLPs: Gate with Inhibition Mechanism in MLPs
Cheng Kang
Jindřich Prokop
Lei Tong
Huiyu Zhou
Yong Hu
Daniel Novak
14
0
0
01 Aug 2022
Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation
Zhitong Xiong
Haopeng Li
Xiao Xiang Zhu
35
35
0
30 Jul 2022
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
Yongming Rao
Wenliang Zhao
Yansong Tang
Jie Zhou
Ser-Nam Lim
Jiwen Lu
ViT
20
252
0
28 Jul 2022
TINYCD: A (Not So) Deep Learning Model For Change Detection
Andrea Codegoni
G. Lombardi
Alessandro Ferrari
23
72
0
26 Jul 2022
SplitMixer: Fat Trimmed From MLP-like Models
Ali Borji
Sikun Lin
21
3
0
21 Jul 2022
Assaying Out-Of-Distribution Generalization in Transfer Learning
F. Wenzel
Andrea Dittadi
Peter V. Gehler
Carl-Johann Simon-Gabriel
Max Horn
...
Chris Russell
Thomas Brox
Bernt Schiele
Bernhard Schölkopf
Francesco Locatello
OOD
OODD
AAML
49
71
0
19 Jul 2022
Research Trends and Applications of Data Augmentation Algorithms
João Fonseca
F. Bação
35
4
0
18 Jul 2022
MLP-GAN for Brain Vessel Image Segmentation
B. Xie
Hao Tang
Bin Duan
Dawen Cai
Yan Yan
MedIm
30
2
0
17 Jul 2022
Parameterization of Cross-Token Relations with Relative Positional Encoding for Vision MLP
Zhicai Wang
Y. Hao
Xingyu Gao
Hao Zhang
Shuo Wang
Tingting Mu
Xiangnan He
16
8
0
15 Jul 2022
Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers
Chang Chen
Jiaming Zhang
Kailun Yang
Kunyu Peng
Rainer Stiefelhagen
ViT
18
8
0
13 Jul 2022
Image and Model Transformation with Secret Key for Vision Transformer
Hitoshi Kiya
Ryota Iijima
Maungmaung Aprilpyone
Yuma Kinoshita
ViT
24
21
0
12 Jul 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Yifan Peng
Siddharth Dalmia
Ian Lane
Shinji Watanabe
19
142
0
06 Jul 2022
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Runsheng Xu
Zhengzhong Tu
Hao Xiang
Wei Shao
Bolei Zhou
Jiaqi Ma
42
218
0
05 Jul 2022
Less Is More: Fast Multivariate Time Series Forecasting with Light Sampling-oriented MLP Structures
T. Zhang
Yizhuo Zhang
Wei Cao
Jiang Bian
Xiaohan Yi
Shun Zheng
Jian Li
BDL
AI4TS
95
152
0
04 Jul 2022
Golfer: Trajectory Prediction with Masked Goal Conditioning MnM Network
Xiaocheng Tang
S. S. Eshkevari
Haoyu Chen
Wei Wu
Wei Qian
Xiaoming Wang
12
7
0
02 Jul 2022
QbyE-MLPMixer: Query-by-Example Open-Vocabulary Keyword Spotting using MLPMixer
Jinmiao Huang
W. Gharbieh
Qianhui Wan
Han Suk Shim
Chul Lee
17
9
0
23 Jun 2022
Time Gated Convolutional Neural Networks for Crop Classification
Longlong Weng
Yashu Kang
Kezhao Jiang
Chun-Tse Chen
16
2
0
20 Jun 2022
Replacing Labeled Real-image Datasets with Auto-generated Contours
Hirokatsu Kataoka
Ryo Hayamizu
Ryosuke Yamada
Kodai Nakashima
Sora Takashima
Xinyu Zhang
Edgar Josafat Martinez-Noriega
Nakamasa Inoue
Rio Yokota
12
23
0
18 Jun 2022
Peripheral Vision Transformer
Juhong Min
Yucheng Zhao
Chong Luo
Minsu Cho
ViT
MDE
24
30
0
14 Jun 2022
GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation
Wenhao Li
Hong Liu
Tianyu Guo
Runwei Ding
Haoling Tang
3DH
14
27
0
13 Jun 2022
IL-MCAM: An interactive learning and multi-channel attention mechanism-based weakly supervised colorectal histopathology image classification approach
Hao Chen
Chen Li
Xirong Li
M. Rahaman
Weiming Hu
...
Wanli Liu
Changhao Sun
Hongzan Sun
Xinyu Huang
M. Grzegorzek
HAI
27
99
0
07 Jun 2022
MDMLP: Image Classification from Scratch on Small Datasets with MLP
Tianxu Lv
Chongyang Bai
Chaojie Wang
22
5
0
28 May 2022
A Unified Weight Initialization Paradigm for Tensorial Convolutional Neural Networks
Y. Pan
Zeyong Su
Ao Liu
Jingquan Wang
Nannan Li
Zenglin Xu
19
11
0
28 May 2022
Transformers from an Optimization Perspective
Yongyi Yang
Zengfeng Huang
David Wipf
37
24
0
27 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
141
637
0
26 May 2022
Augmentation-induced Consistency Regularization for Classification
Jianguo Wu
Shijing Si
Jianzong Wang
Jing Xiao
15
2
0
25 May 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp
Joshua Ainslie
MoE
32
11
0
24 May 2022