On the Impact of the Activation Function on Deep Neural Networks Training
Soufiane Hayou, Arnaud Doucet, Judith Rousseau
arXiv:1902.06853 · 19 February 2019 · ODL
Papers citing "On the Impact of the Activation Function on Deep Neural Networks Training" (50 of 95 shown)
AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections
Xin Yu, Yujia Wang, Jinghui Chen, Lingzhou Xue · 18 May 2025

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks
Giyeong Oh, Woohyun Cho, Siyeol Kim, Suhwan Choi, Younjae Yu · 17 May 2025

How good is PAC-Bayes at explaining generalisation?
Antoine Picard-Weibel, Eugenio Clerico, Roman Moscoviz, Benjamin Guedj · 11 Mar 2025

REAct: Rational Exponential Activation for Better Learning and Generalization in PINNs
Sourav Mishra, Shreya Hallikeri, Suresh Sundaram · AI4CE · 04 Mar 2025

Feature Learning Beyond the Edge of Stability
Dávid Terjék · MLT · 18 Feb 2025

Physics of Skill Learning
Ziming Liu, Yizhou Liu, Eric J. Michaud, Jeff Gore, Max Tegmark · 21 Jan 2025
Emergence of Globally Attracting Fixed Points in Deep Neural Networks With Nonlinear Activations
Amir Joudaki, Thomas Hofmann · MLT · 26 Oct 2024

Generalized Probabilistic Attention Mechanism in Transformers
DongNyeong Heo, Heeyoul Choi · 21 Oct 2024

Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
Xinhao Yao, Hongjin Qian, Xiaolin Hu, Gengze Xu, Wei Liu, Jian Luan, Bin Wang, Yong-Jin Liu · 03 Oct 2024

Kolmogorov-Arnold Networks in Low-Data Regimes: A Comparative Study with Multilayer Perceptrons
Farhad Pourkamali-Anaraki · 16 Sep 2024

Back to the Continuous Attractor
Ábel Ságodi, Guillermo Martín-Sánchez, Piotr Sokól, Il Memming Park · 31 Jul 2024

The Impact of Initialization on LoRA Finetuning Dynamics
Soufiane Hayou, Nikhil Ghosh, Bin Yu · AI4CE · 12 Jun 2024
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann · 29 May 2024

Spectral complexity of deep neural networks
Simmaco Di Lillo, Domenico Marinucci, Michele Salvi, Stefano Vigogna · BDL · 15 May 2024

Rolling the dice for better deep learning performance: A study of randomness techniques in deep neural networks
Mohammed Ghaith Altarabichi, Sławomir Nowaczyk, Sepideh Pashami, Peyman Sheikholharam Mashhadi, Julia Handl · 05 Apr 2024

Disentangling the Causes of Plasticity Loss in Neural Networks
Clare Lyle, Zeyu Zheng, Khimya Khetarpal, H. V. Hasselt, Razvan Pascanu, James Martens, Will Dabney · AI4CE · 29 Feb 2024

LoRA+: Efficient Low Rank Adaptation of Large Models
Soufiane Hayou, Nikhil Ghosh, Bin Yu · AI4CE · 19 Feb 2024
Adaptive Activation Functions for Predictive Modeling with Sparse Experimental Data
Farhad Pourkamali-Anaraki, Tahamina Nasrin, Robert E. Jensen, Amy M. Peterson, Christopher J. Hansen · 08 Feb 2024

Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models
Fangzhao Zhang, Mert Pilanci · AI4CE · 04 Feb 2024

Task structure and nonlinearity jointly determine learned representational geometry
Matteo Alleman, Jack W. Lindsey, Stefano Fusi · 24 Jan 2024

Tuning the activation function to optimize the forecast horizon of a reservoir computer
Lauren A. Hurley, Juan G. Restrepo, Sean E. Shaheen · 20 Dec 2023

The Feature Speed Formula: a flexible approach to scale hyper-parameters of deep neural networks
Lénaic Chizat, Praneeth Netrapalli · 30 Nov 2023

Simplifying Transformer Blocks
Bobby He, Thomas Hofmann · 03 Nov 2023
From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport
Quentin Bouniot, I. Redko, Anton Mallasto, Charlotte Laclau, Karol Arndt, Oliver Struckmeier, Markus Heinonen, Ville Kyrki, Samuel Kaski · 17 Oct 2023

Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models
Tianxiang Gao, Xiaokai Huo, Hailiang Liu, Hongyang Gao · BDL · 16 Oct 2023

Commutative Width and Depth Scaling in Deep Neural Networks
Soufiane Hayou · 02 Oct 2023

Leave-one-out Distinguishability in Machine Learning
Jiayuan Ye, Anastasia Borovykh, Soufiane Hayou, Reza Shokri · 29 Sep 2023

A Primer on Bayesian Neural Networks: Review and Debates
Federico Danieli, Konstantinos Pitas, M. Vladimirova, Vincent Fortuin · BDL, AAML · 28 Sep 2023

Quantitative CLTs in Deep Neural Networks
Stefano Favaro, Boris Hanin, Domenico Marinucci, I. Nourdin, G. Peccati · BDL · 12 Jul 2023
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
Lorenzo Noci, Chuning Li, Mufan Li, Bobby He, Thomas Hofmann, Chris J. Maddison, Daniel M. Roy · 30 Jun 2023

Network Degeneracy as an Indicator of Training Performance: Comparing Finite and Infinite Width Angle Predictions
Cameron Jakub, Mihai Nica · 02 Jun 2023

Learning Activation Functions for Sparse Neural Networks
Mohammad Loni, Aditya Mohan, Mehdi Asadi, Marius Lindauer · 18 May 2023

On the Importance of Exploration for Real Life Learned Algorithms
Steffen Gracla, C. Bockelmann, Armin Dekorsy · 21 Apr 2023

Criticality versus uniformity in deep neural networks
A. Bukva, Jurriaan de Gier, Kevin T. Grosvenor, R. Jefferson, K. Schalm, Eliot Schwander · 10 Apr 2023
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation
Bobby He, James Martens, Guodong Zhang, Aleksandar Botev, Andy Brock, Samuel L. Smith, Yee Whye Teh · 20 Feb 2023

Depth Degeneracy in Neural Networks: Vanishing Angles in Fully Connected ReLU Networks on Initialization
Cameron Jakub, Mihai Nica · ODL · 20 Feb 2023

Width and Depth Limits Commute in Residual Networks
Soufiane Hayou, Greg Yang · 01 Feb 2023

On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada, Jared Tanner · 31 Jan 2023

NEON: Enabling Efficient Support for Nonlinear Operations in Resistive RAM-based Neural Network Accelerators
Aditya Manglik, Minesh Patel, Haiyu Mao, Behzad Salami, Jisung Park, Lois Orosa, O. Mutlu · 10 Nov 2022

Stochastic Adaptive Activation Function
Kyungsu Lee, Jaeseung Yang, Haeyun Lee, J. Y. Hwang · 21 Oct 2022
Analysis of Convolutions, Non-linearity and Depth in Graph Neural Networks using Neural Tangent Kernel
Mahalakshmi Sabanayagam, P. Esser, D. Ghoshdastidar · 18 Oct 2022

On the infinite-depth limit of finite-width neural networks
Soufiane Hayou · 03 Oct 2022

Deep Learning Models for Detecting Malware Attacks
Pascal Maniriho, A. N. Mahmood, M. Chowdhury · AAML · 08 Sep 2022

Evaluating the Susceptibility of Pre-Trained Language Models via Handcrafted Adversarial Examples
Hezekiah J. Branch, Jonathan Rodriguez Cefalu, Jeremy McHugh, Leyla Hujer, Aditya Bahl, Daniel del Castillo Iglesias, Ron Heichman, Ramesh Darwishi · ELM, SILM, AAML · 05 Sep 2022

The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Li, Mihai Nica, Daniel M. Roy · 06 Jun 2022

Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning
Yuandong Tian · MLT · 02 Jun 2022
Gaussian Pre-Activations in Neural Networks: Myth or Reality?
Pierre Wolinski, Julyan Arbel · AI4CE · 24 May 2022

Explainable and Optimally Configured Artificial Neural Networks for Attack Detection in Smart Homes
S. Sohail, Zongwen Fan, Xin Gu, Fariza Sabrina · AAML · 17 May 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
Wuyang Chen, Wei Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang · 11 May 2022
Self-scalable Tanh (Stan): Faster Convergence and Better Generalization in Physics-informed Neural Networks
Raghav Gnanasambandam, Bo Shen, Jihoon Chung, Xubo Yue, Zhenyu Kong · LRM · 26 Apr 2022