Wide Neural Networks Forget Less Catastrophically

21 October 2021

Papers citing "Wide Neural Networks Forget Less Catastrophically"

Title
Efficient Continual Learning through Frequency Decomposition and Integration Ruiqi Liu Boyu Diao Libo Huang Hangda Liu Chuanguang Yang Zhulin An Y. Xu CLL 35 0 0 28 Mar 2025
Continual Pre-training of MoEs: How robust is your router? Benjamin Thérien Charles-Étienne Joseph Zain Sarwar Ashwinee Panda Anirban Das Shi-Xiong Zhang Stephen Rawls S. Eugene Belilovsky Irina Rish MoE 73 0 0 06 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training Paul Janson Vaibhav Singh Paria Mehrbod Adam Ibrahim Irina Rish Eugene Belilovsky Benjamin Thérien CLL 78 0 0 04 Mar 2025
Analysis of Overparameterization in Continual Learning under a Linear Model Daniel Goldfarb Paul Hand CLL 39 0 0 11 Feb 2025
TSVD: Bridging Theory and Practice in Continual Learning with Pre-trained Models Liangzu Peng Juan Elenter Joshua Agterberg Alejandro Ribeiro René Vidal VLM CLL 46 1 0 01 Oct 2024
Continual learning with the neural tangent ensemble Ari S. Benjamin Christian Pehle Kyle Daruwalla UQCV 67 0 0 30 Aug 2024
Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation Malvina Nikandrou Georgios Pantazopoulos Ioannis Konstas Alessandro Suglia 24 0 0 27 Jun 2024
Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Frontier AI Models Sunny Duan Mikail Khona Abhiram Iyer Rylan Schaeffer Ila R Fiete 45 3 0 20 Jun 2024
Continual Learning of Large Language Models: A Comprehensive Survey Haizhou Shi Zihao Xu Hengyi Wang Weiyi Qin Wenyuan Wang Yibin Wang Zifeng Wang Sayna Ebrahimi Hao Wang CLL KELM LRM 46 63 0 25 Apr 2024
Revisiting Neural Networks for Continual Learning: An Architectural Perspective Aojun Lu Tao Feng Hangjie Yuan Xiaotian Song Yanan Sun 45 7 0 23 Apr 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models Adam Ibrahim Benjamin Thérien Kshitij Gupta Mats L. Richter Quentin Anthony Timothée Lesort Eugene Belilovsky Irina Rish KELM CLL 44 51 0 13 Mar 2024
FOCIL: Finetune-and-Freeze for Online Class Incremental Learning by Training Randomly Pruned Sparse Experts Murat Onur Yildirim Elif Ceren Gok Yildirim D. Mocanu Joaquin Vanschoren CLL 38 0 0 13 Mar 2024
On the Diminishing Returns of Width for Continual Learning E. Guha V. Lakshman CLL 31 4 0 11 Mar 2024
Efficient Parameter Mining and Freezing for Continual Object Detection Angelo G. Menezes Augusto J. Peterlevitz Mateus A. Chinelatto André C. P. L. F. de Carvalho 35 0 0 20 Feb 2024
Learning and Forgetting Unsafe Examples in Large Language Models Jiachen Zhao Zhun Deng David Madras James Zou Mengye Ren MU KELM CLL 80 16 0 20 Dec 2023
Continual Learning Under Language Shift Evangelia Gogoulou Timothée Lesort Magnus Boman Joakim Nivre KELM CLL 27 3 0 02 Nov 2023
Diagnosing Catastrophe: Large parts of accuracy loss in continual learning can be accounted for by readout misalignment Daniel Anthes Sushrut Thorat Peter König Tim C Kietzmann 17 2 0 09 Oct 2023
Elephant Neural Networks: Born to Be a Continual Learner Qingfeng Lan A. R. Mahmood CLL 42 9 0 02 Oct 2023
On the Disconnect Between Theory and Practice of Neural Networks: Limits of the NTK Perspective Jonathan Wenger Felix Dangel Agustinus Kristiadi 25 0 0 29 Sep 2023
An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning Grégoire Petit Michael Soumm Eva Feillet Adrian Daniel Popescu Bertrand Delezoide David Picard Céline Hudelot CLL 24 7 0 22 Aug 2023
Continual Learning as Computationally Constrained Reinforcement Learning Saurabh Kumar Henrik Marklund Anand Srinivasa Rao Yifan Zhu Hong Jun Jeon Yueyang Liu Benjamin Van Roy CLL 27 22 0 10 Jul 2023
Predicting Grokking Long Before it Happens: A look into the loss landscape of models which grok Pascal Junior Tikeng Notsawo Hattie Zhou Mohammad Pezeshki Irina Rish G. Dumas 19 23 0 23 Jun 2023
Heterogeneous Continual Learning Divyam Madaan Hongxu Yin Wonmin Byeon Jan Kautz Pavlo Molchanov CLL 29 5 0 14 Jun 2023
The Tunnel Effect: Building Data Representations in Deep Neural Networks Wojciech Masarczyk M. Ostaszewski Ehsan Imani Razvan Pascanu Piotr Miłoś Tomasz Trzciński 28 18 0 31 May 2023
The Ideal Continual Learner: An Agent That Never Forgets Liangzu Peng Paris V. Giampouras René Vidal CLL 106 26 0 29 Apr 2023
DeepReShape: Redesigning Neural Networks for Efficient Private Inference N. Jha Brandon Reagen 28 10 0 20 Apr 2023
Knowledge Accumulation in Continually Learned Representations and the Issue of Feature Forgetting Timm Hess Eli Verwimp Gido M. van de Ven Tinne Tuytelaars CLL 18 7 0 03 Apr 2023
Mind the Backbone: Minimizing Backbone Distortion for Robust Object Detection Kuniaki Saito Donghyun Kim Piotr Teterwak Rogerio Feris Kate Saenko 36 1 0 26 Mar 2023
Is forgetting less a good inductive bias for forward transfer? Jiefeng Chen Timothy Nguyen Dilan Görür Arslan Chaudhry CLL 57 14 0 14 Mar 2023
Efficient Parametric Approximations of Neural Network Function Space Distance Nikita Dhawan Sicong Huang Juhan Bae Roger C. Grosse 14 5 0 07 Feb 2023
A Comprehensive Survey of Continual Learning: Theory, Method and Application Liyuan Wang Xingxing Zhang Hang Su Jun Zhu KELM CLL 36 598 0 31 Jan 2023
CoSCL: Cooperation of Small Continual Learners is Stronger than a Big One Liyuan Wang Xingxing Zhang Qian Li Jun Zhu Yi Zhong CLL 25 46 0 13 Jul 2022
Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime Daniel Goldfarb Paul Hand CLL 20 10 0 01 Jun 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models Kushal Tirumala Aram H. Markosyan Luke Zettlemoyer Armen Aghajanyan TDI 26 185 0 22 May 2022
How catastrophic can catastrophic forgetting be in linear regression? Itay Evron E. Moroshko Rachel A. Ward Nati Srebro Daniel Soudry CLL 22 48 0 19 May 2022
Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation Sebastian Lee Stefano Sarao Mannelli Claudia Clopath Sebastian Goldt Andrew M. Saxe CLL 42 12 0 18 May 2022
Architecture Matters in Continual Learning Seyed Iman Mirzadeh Arslan Chaudhry Dong Yin Timothy Nguyen Razvan Pascanu Dilan Görür Mehrdad Farajtabar OOD KELM 114 58 0 01 Feb 2022
Scaling Laws for Neural Language Models Jared Kaplan Sam McCandlish T. Henighan Tom B. Brown B. Chess R. Child Scott Gray Alec Radford Jeff Wu Dario Amodei 231 4,469 0 23 Jan 2020