ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1606.08415
  4. Cited By
Gaussian Error Linear Units (GELUs)

Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel
ArXivPDFHTML

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 780 papers shown
Title
Video Frame Interpolation Transformer
Video Frame Interpolation Transformer
Zhihao Shi
Xiangyu Xu
Xiaohong Liu
Jun Chen
Ming-Hsuan Yang
ViT
15
157
0
27 Nov 2021
Scene Representation Transformer: Geometry-Free Novel View Synthesis
  Through Set-Latent Scene Representations
Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations
Mehdi S. M. Sajjadi
H. Meyer
Etienne Pot
Urs M. Bergmann
Klaus Greff
...
Daniel Duckworth
Alexey Dosovitskiy
Jakob Uszkoreit
Thomas Funkhouser
Andrea Tagliasacchi
ViT
35
184
0
25 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
31
40
0
23 Nov 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollar
Kaiming He
Ross B. Girshick
17
165
0
22 Nov 2021
PointMixer: MLP-Mixer for Point Cloud Understanding
PointMixer: MLP-Mixer for Point Cloud Understanding
Jaesung Choe
Chunghyun Park
François Rameau
Jaesik Park
In So Kweon
3DPC
39
98
0
22 Nov 2021
Mesa: A Memory-saving Training Framework for Transformers
Mesa: A Memory-saving Training Framework for Transformers
Zizheng Pan
Peng Chen
Haoyu He
Jing Liu
Jianfei Cai
Bohan Zhuang
23
20
0
22 Nov 2021
Global and Local Alignment Networks for Unpaired Image-to-Image
  Translation
Global and Local Alignment Networks for Unpaired Image-to-Image Translation
Guanglei Yang
H. Tang
Humphrey Shi
M. Ding
N. Sebe
Radu Timofte
Luc Van Gool
Elisa Ricci
13
1
0
19 Nov 2021
Restormer: Efficient Transformer for High-Resolution Image Restoration
Restormer: Efficient Transformer for High-Resolution Image Restoration
Syed Waqas Zamir
Aditya Arora
Salman Khan
Munawar Hayat
F. Khan
Ming-Hsuan Yang
ViT
40
2,123
0
18 Nov 2021
Are Transformers More Robust Than CNNs?
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
192
257
0
10 Nov 2021
Data Augmentation Can Improve Robustness
Data Augmentation Can Improve Robustness
Sylvestre-Alvise Rebuffi
Sven Gowal
D. A. Calian
Florian Stimberg
Olivia Wiles
Timothy A. Mann
AAML
17
269
0
09 Nov 2021
SMU: smooth activation function for deep networks using smoothing
  maximum technique
SMU: smooth activation function for deep networks using smoothing maximum technique
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
28
32
0
08 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
85
97
0
07 Nov 2021
Hybrid Spectrogram and Waveform Source Separation
Hybrid Spectrogram and Waveform Source Separation
Alexandre Défossez
22
160
0
05 Nov 2021
Cross-Modality Fusion Transformer for Multispectral Object Detection
Cross-Modality Fusion Transformer for Multispectral Object Detection
Q. Fang
D. Han
Zhaokui Wang
ViT
22
140
0
30 Oct 2021
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video
  Retrieval
BiC-Net: Learning Efficient Spatio-Temporal Relation for Text-Video Retrieval
Ning Han
Jingjing Chen
Chuhao Shi
Yawen Zeng
Guangyi Xiao
Hao Chen
22
10
0
29 Oct 2021
Bolt: Bridging the Gap between Auto-tuners and Hardware-native
  Performance
Bolt: Bridging the Gap between Auto-tuners and Hardware-native Performance
Jiarong Xing
Leyuan Wang
Shang Zhang
Jack H Chen
Ang Chen
Yibo Zhu
30
43
0
25 Oct 2021
ConformalLayers: A non-linear sequential neural network with associative
  layers
ConformalLayers: A non-linear sequential neural network with associative layers
Zhen Wan
Zhuoyuan Mao
C. N. Vasconcelos
14
3
0
23 Oct 2021
Logical Activation Functions: Logit-space equivalents of Probabilistic
  Boolean Operators
Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators
S. Lowe
Robert C. Earle
Jason dÉon
Thomas Trappenberg
Sageev Oore
23
1
0
22 Oct 2021
Vis-TOP: Visual Transformer Overlay Processor
Vis-TOP: Visual Transformer Overlay Processor
Wei Hu
Dian Xu
Zimeng Fan
Fang Liu
Yanxiang He
BDL
ViT
20
5
0
21 Oct 2021
Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation
Cascaded Cross MLP-Mixer GANs for Cross-View Image Translation
Bin Ren
Hao Tang
N. Sebe
32
30
0
19 Oct 2021
Improving Robustness using Generated Data
Improving Robustness using Generated Data
Sven Gowal
Sylvestre-Alvise Rebuffi
Olivia Wiles
Florian Stimberg
D. A. Calian
Timothy A. Mann
30
293
0
18 Oct 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
33
74
0
18 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
24
59
0
14 Oct 2021
Differentially Private Fine-tuning of Language Models
Differentially Private Fine-tuning of Language Models
Da Yu
Saurabh Naik
A. Backurs
Sivakanth Gopi
Huseyin A. Inan
...
Y. Lee
Andre Manoel
Lukas Wutschitz
Sergey Yekhanin
Huishuai Zhang
134
346
0
13 Oct 2021
Dynamic Inference with Neural Interpreters
Dynamic Inference with Neural Interpreters
Nasim Rahaman
Muhammad Waleed Gondal
S. Joshi
Peter V. Gehler
Yoshua Bengio
Francesco Locatello
Bernhard Schölkopf
34
31
0
12 Oct 2021
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based
  Instance Representation Learning
6D-ViT: Category-Level 6D Object Pose Estimation via Transformer-based Instance Representation Learning
Lu Zou
Zhangjin Huang
Naijie Gu
Guoping Wang
ViT
31
45
0
10 Oct 2021
UniNet: Unified Architecture Search with Convolution, Transformer, and
  MLP
UniNet: Unified Architecture Search with Convolution, Transformer, and MLP
Jihao Liu
Hongsheng Li
Guanglu Song
Xin Huang
Yu Liu
ViT
37
35
0
08 Oct 2021
Pathologies in priors and inference for Bayesian transformers
Pathologies in priors and inference for Bayesian transformers
Tristan Cinquin
Alexander Immer
Max Horn
Vincent Fortuin
UQCV
BDL
MedIm
31
9
0
08 Oct 2021
Style Equalization: Unsupervised Learning of Controllable Generative
  Sequence Models
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models
Jen-Hao Rick Chang
A. Shrivastava
H. Koppula
Xiaoshuai Zhang
Oncel Tuzel
DiffM
51
16
0
06 Oct 2021
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang
Yankai Lin
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
MoE
27
117
0
05 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
36
107
0
30 Sep 2021
Introducing the DOME Activation Functions
Introducing the DOME Activation Functions
Mohamed E. Hussein
Wael AbdAlmageed
27
1
0
30 Sep 2021
Activation Functions in Deep Learning: A Comprehensive Survey and
  Benchmark
Activation Functions in Deep Learning: A Comprehensive Survey and Benchmark
S. Dubey
S. Singh
B. B. Chaudhuri
41
641
0
29 Sep 2021
UFO-ViT: High Performance Linear Vision Transformer without Softmax
UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT
114
20
0
29 Sep 2021
IGLU: Efficient GCN Training via Lazy Updates
IGLU: Efficient GCN Training via Lazy Updates
S. Narayanan
Aditya Sinha
Prateek Jain
Purushottam Kar
Sundararajan Sellamanickam
BDL
52
9
0
28 Sep 2021
SAU: Smooth activation function using convolution with approximate
  identities
SAU: Smooth activation function using convolution with approximate identities
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
13
6
0
27 Sep 2021
iRNN: Integer-only Recurrent Neural Network
iRNN: Integer-only Recurrent Neural Network
Eyyub Sari
Vanessa Courville
V. Nia
MQ
45
4
0
20 Sep 2021
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
Nguyen Luong Tran
Duong Minh Le
Dat Quoc Nguyen
19
51
0
20 Sep 2021
Fast and Sample-Efficient Interatomic Neural Network Potentials for
  Molecules and Materials Based on Gaussian Moments
Fast and Sample-Efficient Interatomic Neural Network Potentials for Molecules and Materials Based on Gaussian Moments
Viktor Zaverkin
David Holzmüller
Ingo Steinwart
Johannes Kastner
18
19
0
20 Sep 2021
Commonsense Knowledge in Word Associations and ConceptNet
Commonsense Knowledge in Word Associations and ConceptNet
Chunhua Liu
Trevor Cohn
Lea Frermann
30
7
0
20 Sep 2021
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural
  Networks
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
G. Bingham
Risto Miikkulainen
ODL
24
4
0
18 Sep 2021
Encoding Distributional Soft Actor-Critic for Autonomous Driving in
  Multi-lane Scenarios
Encoding Distributional Soft Actor-Critic for Autonomous Driving in Multi-lane Scenarios
Jingliang Duan
Yangang Ren
Fawang Zhang
Yang Guan
Dongjie Yu
Shengbo Eben Li
B. Cheng
Lin Zhao
21
6
0
12 Sep 2021
TEASEL: A Transformer-Based Speech-Prefixed Language Model
TEASEL: A Transformer-Based Speech-Prefixed Language Model
Mehdi Arjmand
M. Dousti
H. Moradi
30
18
0
12 Sep 2021
Multilingual Translation via Grafting Pre-trained Language Models
Multilingual Translation via Grafting Pre-trained Language Models
Zewei Sun
Mingxuan Wang
Lei Li
AI4CE
188
22
0
11 Sep 2021
ErfAct and Pserf: Non-monotonic Smooth Trainable Activation Functions
ErfAct and Pserf: Non-monotonic Smooth Trainable Activation Functions
Koushik Biswas
Sandeep Kumar
Shilpak Banerjee
A. Pandey
46
13
0
09 Sep 2021
Learning the Physics of Particle Transport via Transformers
Learning the Physics of Particle Transport via Transformers
O. Pastor-Serrano
Zoltán Perkó
MedIm
21
13
0
08 Sep 2021
nnFormer: Interleaved Transformer for Volumetric Segmentation
nnFormer: Interleaved Transformer for Volumetric Segmentation
Hong-Yu Zhou
J. Guo
Yinghao Zhang
Lequan Yu
Liansheng Wang
Yizhou Yu
ViT
MedIm
27
307
0
07 Sep 2021
Vision Guided Generative Pre-trained Language Models for Multimodal
  Abstractive Summarization
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization
Tiezheng Yu
Wenliang Dai
Zihan Liu
Pascale Fung
32
73
0
06 Sep 2021
Learning to Generate Scene Graph from Natural Language Supervision
Learning to Generate Scene Graph from Natural Language Supervision
Yiwu Zhong
Jing Shi
Jianwei Yang
Chenliang Xu
Yin Li
SSL
31
77
0
06 Sep 2021
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Hire-MLP: Vision MLP via Hierarchical Rearrangement
Jianyuan Guo
Yehui Tang
Kai Han
Xinghao Chen
Han Wu
Chao Xu
Chang Xu
Yunhe Wang
46
105
0
30 Aug 2021
Previous
123...1213141516
Next