Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 783 papers shown
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
44
63
0
11 Dec 2023
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
Yujie Wei
Shiwei Zhang
Zhiwu Qing
Hangjie Yuan
Zhiheng Liu
Yu Liu
Yingya Zhang
Jingren Zhou
Hongming Shan
DiffM
VGen
17
89
0
07 Dec 2023
Defense Against Adversarial Attacks using Convolutional Auto-Encoders
Shreyasi Mandal
AAML
23
1
0
06 Dec 2023
C3: High-performance and low-complexity neural compression from a single image or video
Hyunjik Kim
Matthias Bauer
Lucas Theis
Jonathan Richard Schwarz
Emilien Dupont
VGen
22
23
0
05 Dec 2023
Analyzing and Improving the Training Dynamics of Diffusion Models
Tero Karras
M. Aittala
J. Lehtinen
Janne Hellsten
Timo Aila
S. Laine
28
155
0
05 Dec 2023
HUGS: Human Gaussian Splats
Muhammed Kocabas
Jen-Hao Rick Chang
J. Gabriel
Oncel Tuzel
Anurag Ranjan
3DGS
42
91
0
29 Nov 2023
Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context
Shashank Agnihotri
Julia Grabinski
M. Keuper
30
6
0
29 Nov 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Lin Zhang
Chen Zhao
Bernard Ghanem
33
25
0
28 Nov 2023
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
33
6
0
21 Nov 2023
Deep Learning-Based Real-Time Quality Control of Standard Video Compression for Live Streaming
Matin Mortaheb
M. A. Khojastepour
S. Chakradhar
S. Ulukus
13
1
0
21 Nov 2023
GRAM: An Interpretable Approach for Graph Anomaly Detection using Gradient Attention Maps
Yifei Yang
Peng Wang
Xiaofan He
Dongmian Zou
14
5
0
10 Nov 2023
Towards a Unified Framework of Contrastive Learning for Disentangled Representations
Stefan Matthes
Zhiwei Han
Hao Shen
31
4
0
08 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
27
64
0
07 Nov 2023
Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion
Lunjun Zhang
Yuwen Xiong
Ze Yang
Sergio Casas
Rui Hu
R. Urtasun
39
50
0
02 Nov 2023
End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation
Juan Pablo Zuluaga
Zhaocheng Huang
Xing Niu
Rohit Paturi
S. Srinivasan
Prashant Mathur
Brian Thompson
Marcello Federico
BDL
27
2
0
01 Nov 2023
Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery
Sarah Rastegar
Hazel Doughty
Cees G. M. Snoek
30
15
0
30 Oct 2023
Video Frame Interpolation with Many-to-many Splatting and Spatial Selective Refinement
Ping Hu
Simon Niklaus
Lu Zhang
Stan Sclaroff
Kate Saenko
25
6
0
29 Oct 2023
TorchDEQ: A Library for Deep Equilibrium Models
Zhengyang Geng
J. Zico Kolter
VLM
54
12
0
28 Oct 2023
Understanding the Effects of Projectors in Knowledge Distillation
Yudong Chen
Sen Wang
Jiajun Liu
Xuwei Xu
Frank de Hoog
Brano Kusy
Zi Huang
26
0
0
26 Oct 2023
Cross-attention Spatio-temporal Context Transformer for Semantic Segmentation of Historical Maps
Sidi Wu
Yizi Chen
Konrad Schindler
L. Hurni
21
2
0
19 Oct 2023
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot
I. Redko
Anton Mallasto
Charlotte Laclau
Karol Arndt
Oliver Struckmeier
Markus Heinonen
Ville Kyrki
Samuel Kaski
54
2
0
17 Oct 2023
SeUNet-Trans: A Simple yet Effective UNet-Transformer Model for Medical Image Segmentation
Tan-Hanh Pham
Xianqi Li
Kim-Doang Nguyen
MedIm
ViT
26
8
0
16 Oct 2023
Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers
Hosein Mohebbi
Grzegorz Chrupała
Willem H. Zuidema
A. Alishahi
28
12
0
15 Oct 2023
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration
Piyush Singh Pasi
Karthikeya Battepati
P. Jyothi
Ganesh Ramakrishnan
T. Mahapatra
Manoj Singh
51
0
0
10 Oct 2023
Understanding the Feature Norm for Out-of-Distribution Detection
Jaewoo Park
Jacky Chen Long Chai
Jaeho Yoon
Andrew Beng Jin Teoh
OODD
24
12
0
09 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023
Deep Learning Based Uplink Multi-User SIMO Beamforming Design
Cemil Vahapoglu
Tim O'Shea
Tamoghna Roy
S. Ulukus
23
7
0
28 Sep 2023
Deep Learning-Based Real-Time Rate Control for Live Streaming on Wireless Networks
Matin Mortaheb
M. A. Khojastepour
S. Chakradhar
S. Ulukus
13
0
0
27 Sep 2023
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification
Hee-Soo Heo
Ki-hyun Nam
Bong-Jin Lee
Youngki Kwon
Min-Ji Lee
You Jin Kim
Joon Son Chung
26
1
0
26 Sep 2023
Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew
Shaltiel Shmidman
Avi Shmidman
Amir DN Cohen
Moshe Koppel
25
0
0
25 Sep 2023
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman
Peter J. Liu
Lechao Xiao
Katie Everett
A. Alemi
...
Jascha Narain Sohl-Dickstein
Kelvin Xu
Jaehoon Lee
Justin Gilmer
Simon Kornblith
35
81
0
25 Sep 2023
On the Posterior Distribution in Denoising: Application to Uncertainty Quantification
Hila Manor
T. Michaeli
UQCV
23
17
0
24 Sep 2023
Large-scale Pretraining Improves Sample Efficiency of Active Learning based Molecule Virtual Screening
Zhonglin Cao
Simone Sciabola
Ye Wang
32
1
0
20 Sep 2023
PDPCRN: Parallel Dual-Path CRN with Bi-directional Inter-Branch Interactions for Multi-Channel Speech Enhancement
Jia-Yu Pan
Shulin He
Tianci Wu
Hui Zhang
Xueliang Zhang
19
0
0
19 Sep 2023
Limited-Angle Tomography Reconstruction via Deep End-To-End Learning on Synthetic Data
Thomas Germer
Jan Robine
S. Konietzny
Stefan Harmeling
Tobias Uelwer
MedIm
18
5
0
13 Sep 2023
Advancing Parsimonious Deep Learning Weather Prediction using the HEALPix Mesh
Matthias Karlbauer
Nathaniel Cresswell-Clay
Dale Durran
Raul A Moreno
Thorsten Kurth
Boris Bonev
Noah D. Brenowitz
Martin Volker Butz
MDE
25
20
0
11 Sep 2023
ImageBind-LLM: Multi-modality Instruction Tuning
Jiaming Han
Renrui Zhang
Wenqi Shao
Peng Gao
Peng-Tao Xu
...
Yafei Wen
Xiaoxin Chen
Xiangyu Yue
Hongsheng Li
Yu Qiao
MLLM
49
116
0
07 Sep 2023
3D Transformer based on deformable patch location for differential diagnosis between Alzheimer's disease and Frontotemporal dementia
H. Nguyen
Michael Clement
Boris Mansencal
Pierrick Coupé
MedIm
28
0
0
06 Sep 2023
Character Queries: A Transformer-based Approach to On-Line Handwritten Character Segmentation
Michael Jungo
Beat Wolf
Andrii Maksai
C. Musat
Andreas Fischer
24
2
0
06 Sep 2023
A Unified Masked Autoencoder with Patchified Skeletons for Motion Synthesis
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
CVBM
37
4
0
14 Aug 2023
Large-kernel Attention for Efficient and Robust Brain Lesion Segmentation
Liam Chalcroft
Ruben Lourenço Pereira
Mikael Brudfors
Andrew S. Kayser
M. D’Esposito
Cathy J. Price
Ioannis Pappas
John Ashburner
ViT
3DV
MedIm
26
8
0
14 Aug 2023
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
32
8
0
11 Aug 2023
Graph Embedding Dynamic Feature-based Supervised Contrastive Learning of Transient Stability for Changing Power Grid Topologies
Zijian Lv
X. Chen
Zijian Feng
22
0
0
01 Aug 2023
Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?
Ari Holtzman
Peter West
Luke Zettlemoyer
AI4CE
30
14
0
31 Jul 2023
Efficient Federated Learning via Local Adaptive Amended Optimizer with Linear Speedup
Yan Sun
Li Shen
Hao Sun
Liang Ding
Dacheng Tao
FedML
19
16
0
30 Jul 2023
BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering
Khiem Vinh Tran
Kiet Van Nguyen
N. Nguyen
ViT
23
2
0
28 Jul 2023
Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs
Or Sharir
Anima Anandkumar
27
0
0
27 Jul 2023
Unsupervised Deep Learning-based Pansharpening with Jointly-Enhanced Spectral and Spatial Fidelity
Matteo Ciotola
Giovanni Poggi
G. Scarpa
23
22
0
26 Jul 2023
On the unreasonable vulnerability of transformers for image restoration -- and an easy fix
Shashank Agnihotri
Kanchana Vaishnavi Gandikota
Julia Grabinski
Paramanand Chandramouli
M. Keuper
32
9
0
25 Jul 2023
Simultaneous temperature estimation and nonuniformity correction from multiple frames
N. Oz
O. Berman
N. Sochen
David Mendelovich
I. Klapp
22
1
0
23 Jul 2023