Gaussian Error Linear Units (GELUs)

27 June 2016
Dan Hendrycks
Kevin Gimpel

Papers citing "Gaussian Error Linear Units (GELUs)"

50 / 753 papers shown
Optimizing Job Allocation using Reinforcement Learning with Graph Neural Networks
Lars C.P.M. Quaedvlieg
61
0
0
31 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Erik Cambria
LM&MA
AILaw
93
153
0
28 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
149
0
0
21 Jan 2025
Enhancing Retrosynthesis with Conformer: A Template-Free Method
Jiaxi Zhuang
Qian Zhang
Ying Qian
125
0
0
21 Jan 2025
A generalizable 3D framework and model for self-supervised learning in medical imaging
Tony Xu
Sepehr Hosseini
Chris Anderson
Anthony Rinaldi
Rahul G. Krishnan
Anne L. Martel
Maged Goubran
MedIm
45
3
0
20 Jan 2025
MetaNeRV: Meta Neural Representations for Videos with Spatial-Temporal Guidance
Jialong Guo
Ke Liu
Jiangchao Yao
Zhihua Wang
Jiajun Bu
Haishuai Wang
AI4TS
44
0
0
20 Jan 2025
FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation
Jaekwon Im
Juhan Nam
DiffM
45
0
0
18 Jan 2025
SSD4Rec: A Structured State Space Duality Model for Efficient Sequential Recommendation
Haohao Qu
Yifeng Zhang
Liangbo Ning
Wenqi Fan
Qing Li
Mamba
96
7
0
17 Jan 2025
SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
Junyi Ao
Yuancheng Wang
Xiaohai Tian
Dekun Chen
J. Zhang
Lu Lu
Y. Wang
Haizhou Li
Z. Wu
AuLLM
80
17
0
17 Jan 2025
EmoNeXt: an Adapted ConvNeXt for Facial Emotion Recognition
Yassine El Boudouri
Amine Bohi
71
15
0
14 Jan 2025
EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models
Jaehoon Heo
Adiwena Putra
Jieon Yoon
Sungwoong Yune
Hangyeol Lee
Ji-Hoon Kim
Joo-Young Kim
DiffM
55
1
0
10 Jan 2025
Transformer-Driven Inverse Problem Transform for Fast Blind Hyperspectral Image Dehazing
Po-Wei Tang
Chia-Hsiang Lin
Yangrui Liu
45
6
0
03 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
99
48
0
03 Jan 2025
Hadamard Attention Recurrent Transformer: A Strong Baseline for Stereo Matching Transformer
Ziyang Chen
Yongjun Zhang
Wenting Li
Bingshu Wang
Yabo Wu
Yong Zhao
C. L. P. Chen
49
0
0
02 Jan 2025
Optical aberrations in autonomous driving: Physics-informed parameterized temperature scaling for neural network uncertainty calibration
D. Wolf
Alexander Braun
Markus Ulrich
89
0
0
18 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models
Yuxi Sun
Wei Gao
Jing Ma
Hongzhan Lin
Ziyang Luo
Wenxuan Zhang
ELM
82
0
0
17 Dec 2024
Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA
Haodong Lu
Chongyang Zhao
Jason Xue
Lina Yao
Kristen Moore
Dong Gong
VLM
KELM
CLL
85
3
0
01 Dec 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
103
0
0
28 Nov 2024
Understanding Galaxy Morphology Evolution Through Cosmic Time via Redshift Conditioned Diffusion Models
Andrew Lizarraga
Eric H. Jiang
Jacob Nowack
Yun Qi Li
Ying Nian Wu
Bernie Boscoe
Tuan Do
DiffM
90
0
0
27 Nov 2024
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
110
3
0
22 Nov 2024
The Sound of Water: Inferring Physical Properties from Pouring Liquids
Piyush Bagad
Makarand Tapaswi
Cees G. M. Snoek
Andrew Zisserman
45
0
0
18 Nov 2024
Layer-Adaptive State Pruning for Deep State Space Models
Minseon Gwak
Seongrok Moon
Joohwan Ko
PooGyeon Park
25
0
0
05 Nov 2024
Clinical Evaluation of Medical Image Synthesis: A Case Study in Wireless Capsule Endoscopy
Panagiota Gatoula
Dimitrios E. Diamantis
Anastasios Koulaouzidis
Cristina Carretero
Stefania Chetcuti-Zammit
...
John Plevris
Alexander Robertson
Bruno Rosa
Ervin Toth
D. Iakovidis
MedIm
52
0
0
31 Oct 2024
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
41
2
0
30 Oct 2024
OGBench: Benchmarking Offline Goal-Conditioned RL
Seohong Park
Kevin Frans
Benjamin Eysenbach
Sergey Levine
OffRL
48
8
0
26 Oct 2024
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training
Haocheng Xi
Han Cai
Ligeng Zhu
Y. Lu
Kurt Keutzer
Jianfei Chen
Song Han
MQ
65
9
0
25 Oct 2024
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-xiong Wang
49
1
0
18 Oct 2024
Cliqueformer: Model-Based Optimization with Structured Transformers
J. Kuba
Pieter Abbeel
Sergey Levine
OffRL
AI4CE
54
2
0
17 Oct 2024
Quadratic Gating Functions in Mixture of Experts: A Statistical Insight
Pedram Akbarian
Huy Le Nguyen
Xing Han
Nhat Ho
MoE
42
0
0
15 Oct 2024
Deep Optimal Sensor Placement for Black Box Stochastic Simulations
Paula Cordero-Encinar
Tobias Schröder
P. Yatsyshin
Andrew Duncan
45
0
0
15 Oct 2024
Liger Kernel: Efficient Triton Kernels for LLM Training
Pin-Lun Hsu
Yun Dai
Vignesh Kothapalli
Qingquan Song
Shao Tang
Siyu Zhu
Steven Shimizu
Shivam Sahni
Haowen Ning
Yanning Chen
42
26
0
14 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
C. L. P. Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
29
4
0
14 Oct 2024
SensorLLM: Aligning Large Language Models with Motion Sensors for Human Activity Recognition
Zechen Li
Shohreh Deldari
Linyao Chen
Hao Xue
Flora D. Salim
39
6
0
14 Oct 2024
Learning Equivariant Non-Local Electron Density Functionals
Nicholas Gao
Eike Eberhard
Stephan Günnemann
28
1
0
10 Oct 2024
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Haoyi Zhu
Honghui Yang
Yating Wang
Jiange Yang
Limin Wang
Tong He
3DH
51
6
0
10 Oct 2024
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu
Lingxuan Wu
Bangguo Li
Hengkai Tan
Huayu Chen
Zhengyi Wang
Ke Xu
Hang Su
Jun Zhu
31
75
0
10 Oct 2024
Deep Correlated Prompting for Visual Recognition with Missing Modalities
Lianyu Hu
Tongkai Shi
Wei Feng
Fanhua Shang
Liang Wan
VLM
29
1
0
09 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
15
0
06 Oct 2024
Cross Resolution Encoding-Decoding For Detection Transformers
Ashish Kumar
Jaesik Park
ViT
33
0
0
05 Oct 2024
Oscillatory State-Space Models
T. Konstantin Rusch
Daniela Rus
AI4TS
133
5
0
04 Oct 2024
Efficient Semantic Segmentation via Lightweight Multiple-Information Interaction Network
Yangyang Qiu
Guoan Xu
Guangwei Gao
Zhenhua Guo
Yi Yu
Chia-Wen Lin
30
0
0
03 Oct 2024
FAN: Fourier Analysis Networks
Yihong Dong
Ge Li
Yongding Tao
Xue Jiang
Kechi Zhang
Jia Li
Jing Su
Jun Zhang
Jingjing Xu
AI4TS
13
4
0
03 Oct 2024
Deep Learning Alternatives of the Kolmogorov Superposition Theorem
Leonardo Ferreira Guilhoto
P. Perdikaris
44
7
0
02 Oct 2024
AI Enabled Neutron Flux Measurement and Virtual Calibration in Boiling Water Reactors
Anirudh Tunga
Jordan Heim
Michael Mueterthies
Thomas Gruenwald
Jonathan Nistor
23
0
0
25 Sep 2024
Exploring Fine-Grained Image-Text Alignment for Referring Remote Sensing Image Segmentation
Sen Lei
Xinyu Xiao
Heng-Chao Li
Z. Shi
Qing Zhu
20
12
0
20 Sep 2024
Machine-learning-based multipoint optimization of fluidic injection parameters for improving nozzle performance
Yunjia Yang
Jiazhe Li
Yufei Zhang
Haixin Chen
21
0
0
19 Sep 2024
SOAP: Improving and Stabilizing Shampoo using Adam
Nikhil Vyas
Depen Morwani
Rosie Zhao
Itai Shapira
David Brandfonbrener
Lucas Janson
Sham Kakade
66
23
0
17 Sep 2024
Baking Relightable NeRF for Real-time Direct/Indirect Illumination Rendering
Euntae Choi
Vincent Carpentier
Seunghun Shin
Sungjoo Yoo
26
0
0
16 Sep 2024
TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition
Vlad Striletchi
Cosmin Striletchi
Adriana Stan
38
0
0
16 Sep 2024
Robust image representations with counterfactual contrastive learning
Mélanie Roschewitz
Fabio De Sousa Ribeiro
Tian Xia
G. Khara
Ben Glocker
OOD
MedIm
47
2
0
16 Sep 2024