1606.08415
Gaussian Error Linear Units (GELUs)
27 June 2016
Dan Hendrycks
Kevin Gimpel
Papers citing
"Gaussian Error Linear Units (GELUs)"
50 / 780 papers shown
Title
Video Frame Interpolation Transformer
Zhihao Shi
Xiangyu Xu
Xiaohong Liu
Jun Chen
Ming-Hsuan Yang
ViT
15
157
0
27 Nov 2021
Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations
Mehdi S. M. Sajjadi
H. Meyer
Etienne Pot
Urs M. Bergmann
Klaus Greff
...
Daniel Duckworth
Alexey Dosovitskiy
Jakob Uszkoreit
Thomas Funkhouser
Andrea Tagliasacchi
ViT
35
184
0
25 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
31
40
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollar
Kaiming He
Ross B. Girshick
17
165
0
22 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe
Chunghyun Park
François Rameau
Jaesik Park
In So Kweon
3DPC
39
98
0
22 Nov 2021
Mesa: A Memory-saving Training Framework for Transformers
Zizheng Pan
Peng Chen
Haoyu He
Jing Liu
Jianfei Cai
Bohan Zhuang
23
20
0
22 Nov 2021
Global and Local Alignment Networks for Unpaired Image-to-Image Translation
Guanglei Yang
H. Tang
Humphrey Shi
M. Ding
N. Sebe
Radu Timofte
Luc Van Gool
Elisa Ricci
13
1
0
19 Nov 2021
Restormer: Efficient Transformer for High-Resolution Image Restoration
Syed Waqas Zamir
Aditya Arora
Salman Khan
Munawar Hayat
F. Khan
Ming-Hsuan Yang
ViT
40
2,123
0
18 Nov 2021
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
192
257
0
10 Nov 2021
Data Augmentation Can Improve Robustness
Sylvestre-Alvise Rebuffi
Sven Gowal
D. A. Calian
Florian Stimberg
Olivia Wiles
Timothy A. Mann
AAML
17
269
0
09 Nov 2021
SMU: smooth activation function for deep networks using smoothing maximum technique
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
28
32
0
08 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
85
97
0
07 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
22
160
0
05 Nov 2021
Cross-Modality Fusion Transformer for Multispectral Object Detection
Q. Fang
D. Han
Zhaokui Wang
ViT
22
140
0
30 Oct 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021
Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
Jiarong Xing
Leyuan Wang
Shang Zhang
Jack H Chen
Ang Chen
Yibo Zhu
30
43
0
25 Oct 2021
ConformalLayers: A non-linear sequential neural network with associative layers
Zhen Wan
Zhuoyuan Mao
C. N. Vasconcelos
14
3
0
23 Oct 2021
Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators
S. Lowe
Robert C. Earle
Jason d'Eon
Thomas Trappenberg
Sageev Oore
23
1
0
22 Oct 2021
Vis-TOP: Visual Transformer Overlay Processor
Wei Hu
Dian Xu
Zimeng Fan
Fang Liu
Yanxiang He
BDL
ViT
20
5
0
21 Oct 2021
Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation
Bin Ren
Hao Tang
N. Sebe
32
30
0
19 Oct 2021
Improving Robustness using Generated Data
Sven Gowal
Sylvestre-Alvise Rebuffi
Olivia Wiles
Florian Stimberg
D. A. Calian
Timothy A. Mann
30
293
0
18 Oct 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
33
74
0
18 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
24
59
0
14 Oct 2021
Differentially Private Fine-tuning of Language Models
Da Yu
Saurabh Naik
A. Backurs
Sivakanth Gopi
Huseyin A. Inan
...
Y. Lee
Andre Manoel
Lukas Wutschitz
Sergey Yekhanin
Huishuai Zhang
134
346
0
13 Oct 2021
Dynamic Inference with Neural Interpreters
Nasim Rahaman
Muhammad Waleed Gondal
S. Joshi
Peter V. Gehler
Yoshua Bengio
Francesco Locatello
Bernhard Schölkopf
34
31
0
12 Oct 2021
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based Instance Representation Learning
Lu Zou
Zhangjin Huang
Naijie Gu
Guoping Wang
ViT
31
45
0
10 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu
Hongsheng Li
Guanglu Song
Xin Huang
Yu Liu
ViT
37
35
0
08 Oct 2021
Pathologies in priors and inference for Bayesian transformers
Tristan Cinquin
Alexander Immer
Max Horn
Vincent Fortuin
UQCV
BDL
MedIm
31
9
0
08 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang
Yankai Lin
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
MoE
27
117
0
05 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
36
107
0
30 Sep 2021
Introducing the DOME Activation Functions
Mohamed E. Hussein
Wael AbdAlmageed
27
1
0
30 Sep 2021
Activation Functions in Deep Learning: A Comprehensive Survey and Benchmark
S. Dubey
S. Singh
B. B. Chaudhuri
41
641
0
29 Sep 2021
UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT
114
20
0
29 Sep 2021
IGLU: Efficient GCN Training via Lazy Updates
S. Narayanan
Aditya Sinha
Prateek Jain
Purushottam Kar
Sundararajan Sellamanickam
BDL
52
9
0
28 Sep 2021
SAU: Smooth activation function using convolution with approximate identities
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
13
6
0
27 Sep 2021
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
45
4
0
20 Sep 2021
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran
Duong Minh Le
Dat Quoc Nguyen
19
51
0
20 Sep 2021
Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments
Viktor Zaverkin
David Holzmüller
Ingo Steinwart
Johannes Kastner
18
19
0
20 Sep 2021
Commonsense Knowledge in Word Associations and ConceptNet
Chunhua Liu
Trevor Cohn
Lea Frermann
30
7
0
20 Sep 2021
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
G. Bingham
Risto Miikkulainen
ODL
24
4
0
18 Sep 2021
Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios
Jingliang Duan
Yangang Ren
Fawang Zhang
Yang Guan
Dongjie Yu
Shengbo Eben Li
B. Cheng
Lin Zhao
21
6
0
12 Sep 2021
TEASEL: A Transformer-Based Speech-Prefixed Language Model
Mehdi Arjmand
M. Dousti
H. Moradi
30
18
0
12 Sep 2021
Multilingual Translation via Grafting Pre-trained Language Models
Zewei Sun
Mingxuan Wang
Lei Li
AI4CE
188
22
0
11 Sep 2021
ErfAct and Pserf: Non-monotonic Smooth Trainable Activation Functions
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
46
13
0
09 Sep 2021
Learning the Physics of Particle Transport via Transformers
O. Pastor-Serrano
Zoltán Perkó
MedIm
21
13
0
08 Sep 2021
nnFormer: Interleaved Transformer for Volumetric Segmentation
Hong-Yu Zhou
J. Guo
Yinghao Zhang
Lequan Yu
Liansheng Wang
Yizhou Yu
ViT
MedIm
27
307
0
07 Sep 2021
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization
Tiezheng Yu
Wenliang Dai
Zihan Liu
Pascale Fung
32
73
0
06 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
31
77
0
06 Sep 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Jianyuan Guo
Yehui Tang
Kai Han
Xinghao Chen
Han Wu
Chao Xu
Chang Xu
Yunhe Wang
46
105
0
30 Aug 2021