Towards Understanding the Importance of Shortcut Connections in Residual Networks
arXiv:1909.04653 (v3, latest)
Neural Information Processing Systems (NeurIPS), 2019
10 September 2019
Tianyi Liu, Minshuo Chen, Mo Zhou, S. Du, Enlu Zhou, T. Zhao

Papers citing "Towards Understanding the Importance of Shortcut Connections in Residual Networks"

22 / 22 papers shown
Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime
Yuqing Wang, Shangding Gu
30 Jun 2025
Cross-Layer Cache Aggregation for Token Reduction in Ultra-Fine-Grained Image Recognition
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Edwin Arkel Rios, Jansen Christopher Yuanda, Vincent Leon Ghanz, Cheng-Wei Yu, Bo-Cheng Lai, Min-Chun Hu
03 Jan 2025
Theoretical characterisation of the Gauss-Newton conditioning in Neural Networks
Neural Information Processing Systems (NeurIPS), 2024
Jim Zhao, Sidak Pal Singh, Aurelien Lucchi
04 Nov 2024
Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs
Michael Scholkemper, Xinyi Wu, Ali Jadbabaie, Michael T. Schaub
05 Jun 2024
Progressive Feedforward Collapse of ResNet Training
Sicong Wang, Kuo Gai, Shihua Zhang
02 May 2024
JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention
International Conference on Learning Representations (ICLR), 2023
Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Shaolei Du
01 Oct 2023
Generalization Ability of Wide Residual Networks
Jianfa Lai, Zixiong Yu, Songtao Tian, Qian Lin
29 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Neural Information Processing Systems (NeurIPS), 2023
Yuandong Tian, Yiping Wang, Beidi Chen, S. Du
25 May 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Annual Conference on Computational Learning Theory (COLT), 2023
Weihang Xu, S. Du
20 Feb 2023
SML: Enhance the Network Smoothness with Skip Meta Logit for CTR Prediction
Wenlong Deng, Lang Lang, Ziqiang Liu, B. Liu
09 Oct 2022
Nearly Minimax Algorithms for Linear Bandits with Shared Representation
Jiaqi Yang, Qi Lei, Jason D. Lee, S. Du
29 Mar 2022
ResNEsts and DenseNEsts: Block-based DNN Models with Improved Representation Guarantees
Neural Information Processing Systems (NeurIPS), 2021
Kuan-Lin Chen, Ching-Hua Lee, H. Garudadri, Bhaskar D. Rao
10 Nov 2021
Augmented Shortcuts for Vision Transformers
Neural Information Processing Systems (NeurIPS), 2021
Yehui Tang, Kai Han, Chang Xu, An Xiao, Yiping Deng, Chao Xu, Yunhe Wang
30 Jun 2021
SurvNAM: The machine learning survival model explanation
Neural Networks (NN), 2021
Lev V. Utkin, Egor D. Satyukov, A. Konstantinov
18 Apr 2021
Spectral Analysis of the Neural Tangent Kernel for Deep Residual Networks
Journal of Machine Learning Research (JMLR), 2021
Yuval Belfer, Amnon Geifman, Meirav Galun, Ronen Basri
07 Apr 2021
Learning Frequency Domain Approximation for Binary Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Yixing Xu, Kai Han, Chang Xu, Yehui Tang, Chunjing Xu, Yunhe Wang
01 Mar 2021
Continuous-in-Depth Neural Networks
A. Queiruga, N. Benjamin Erichson, D. Taylor, Michael W. Mahoney
05 Aug 2020
Proactive Network Maintenance using Fast, Accurate Anomaly Localization and Classification on 1-D Data Series
International Conference on Prognostics and Health Management (PHM), 2020
J. Zhu, K. Sundaresan, J. Rupe
17 Jul 2020
On the Demystification of Knowledge Distillation: A Residual Network Perspective
N. Jha, Rajat Saini, Sparsh Mittal
30 Jun 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
International Conference on Machine Learning (ICML), 2020
Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying
11 Mar 2020
Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
Neural Information Processing Systems (NeurIPS), 2020
Kaixuan Huang, Yuqing Wang, Molei Tao, T. Zhao
14 Feb 2020
On a Sparse Shortcut Topology of Artificial Neural Networks
IEEE Transactions on Artificial Intelligence (IEEE TAI), 2018
Fenglei Fan, Dayang Wang, Hengtao Guo, Qikui Zhu, Pingkun Yan, Ge Wang, Hengyong Yu
22 Nov 2018