1511.05641
Net2Net: Accelerating Learning via Knowledge Transfer
18 November 2015
Tianqi Chen
Ian Goodfellow
Jonathon Shlens
Papers citing
"Net2Net: Accelerating Learning via Knowledge Transfer"
50 / 135 papers shown
Title
Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
Xiyuan Wei
Ming Lin
Fanjiang Ye
Fengguang Song
Liangliang Cao
My T. Thai
Tianbao Yang
LLMSV
34
0
0
10 May 2025
FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures
Jiacheng Wang
Hongtao Lv
Lei Liu
FedML
25
0
0
10 May 2025
A Framework for Elastic Adaptation of User Multiple Intents in Sequential Recommendation
Zhikai Wang
Yanyan Shen
AI4TS
35
0
0
30 Apr 2025
A multilevel approach to accelerate the training of Transformers
Guillaume Lauga
Maël Chaumette
Edgar Desainte-Maréville
Étienne Lasalle
Arthur Lebeurrier
AI4CE
45
0
0
24 Apr 2025
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
49
2
0
30 Oct 2024
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Chenze Shao
Fandong Meng
Jie Zhou
53
1
0
17 Jul 2024
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li
Lingzhi Gao
Chao Wu
AI4CE
DiffM
55
3
0
23 May 2024
A Multi-Level Framework for Accelerating Training Transformer Models
Longwei Zou
Han Zhang
Yangdong Deng
AI4CE
40
1
0
07 Apr 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
32
0
0
19 Feb 2024
Initializing Models with Larger Ones
Zhiqiu Xu
Yanjie Chen
Kirill Vishniakov
Yida Yin
Zhiqiang Shen
Trevor Darrell
Lingjie Liu
Zhuang Liu
38
17
0
30 Nov 2023
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation
Li Ding
M. Zoghi
Guy Tennenholtz
Maryam Karimzadehgan
26
0
0
29 Oct 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
40
8
0
11 Aug 2023
Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search
A. Chebykin
A. Dushatskiy
Tanja Alderliesten
Peter A. N. Bosman
44
0
0
28 Jul 2023
TransformerG2G: Adaptive time-stepping for learning temporal graph embeddings using transformers
Alan John Varghese
Aniruddha Bora
Mengjia Xu
George Karniadakis
36
5
0
05 Jul 2023
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan
Pedro H. P. Savarese
Michael Maire
13
5
0
22 Jun 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
39
17
0
07 Mar 2023
Multipath agents for modular multitask ML systems
Andrea Gesmundo
28
1
0
06 Feb 2023
Efficient Evaluation Methods for Neural Architecture Search: A Survey
Xiangning Xie
Xiaotian Song
Zeqiong Lv
Gary G. Yen
Weiping Ding
Yizhou Sun
38
12
0
14 Jan 2023
RMM: Reinforced Memory Management for Class-Incremental Learning
Yaoyao Liu
Bernt Schiele
Qianru Sun
CLL
37
93
0
14 Jan 2023
Deep Residual Axial Networks
Nazmul Shahadat
Anthony Maida
3DPC
48
4
0
11 Jan 2023
Reversible Column Networks
Yuxuan Cai
Yi Zhou
Qi Han
Jianjian Sun
Xiangwen Kong
Jun Yu Li
Xiangyu Zhang
VLM
31
53
0
22 Dec 2022
On-device Training: A First Overview on Existing Systems
Shuai Zhu
Thiemo Voigt
Jeonggil Ko
Fatemeh Rahimian
34
14
0
01 Dec 2022
Learning Label Modular Prompts for Text Classification in the Wild
Hailin Chen
Amrita Saha
Shafiq Joty
Steven C. H. Hoi
OOD
VLM
26
5
0
30 Nov 2022
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
35
1
0
14 Nov 2022
MiNL: Micro-images based Neural Representation for Light Fields
Ziru Xu
Henan Wang
Zhibo Chen
33
1
0
17 Sep 2022
On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
Rohan Anil
S. Gadanho
Danya Huang
Nijith Jacob
Zhuoshu Li
...
Cristina Pop
Kevin Regan
G. Shamir
Rakesh Shivanna
Qiqi Yan
3DV
29
41
0
12 Sep 2022
Continuous QA Learning with Structured Prompts
Yinhe Zheng
CLL
33
1
0
31 Aug 2022
A Survey of Open Source Automation Tools for Data Science Predictions
Nicholas Hoell
30
0
0
24 Aug 2022
Anomaly Detection and Inter-Sensor Transfer Learning on Smart Manufacturing Datasets
Mustafa Abdallah
B. Joung
Wo Jae Lee
C. Mousoulis
J. Sutherland
S. Bagchi
34
20
0
13 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRL
OnRL
37
63
0
03 Jun 2022
A Survey on Computationally Efficient Neural Architecture Search
Shiqing Liu
Haoyu Zhang
Yaochu Jin
38
41
0
03 Jun 2022
Online Deep Learning from Doubly-Streaming Data
H. Lian
John Scovil Atwood
Bo-Jian Hou
Jian Wu
Yi He
26
10
0
25 Apr 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
33
46
0
28 Mar 2022
Lifelong Adaptive Machine Learning for Sensor-based Human Activity Recognition Using Prototypical Networks
Rebecca Adaimi
Edison Thomaz
CLL
29
13
0
11 Mar 2022
Update Compression for Deep Neural Networks on the Edge
Bo Chen
A. Bakhshi
Gustavo E. A. P. A. Batista
Brian Ng
Tat-Jun Chin
31
17
0
09 Mar 2022
Consistent Representation Learning for Continual Relation Extraction
Kang Zhao
Hua Xu
Jian Yang
Kai Gao
CLL
31
51
0
05 Mar 2022
Continual Few-shot Relation Learning via Embedding Space Regularization and Data Augmentation
Chengwei Qin
Shafiq Joty
BDL
CLL
20
33
0
04 Mar 2022
DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models
A. Treacher
A. Montillo
27
0
0
24 Feb 2022
An Automated Question-Answering Framework Based on Evolution Algorithm
Sinan Tan
Hui Xue
Qiyu Ren
Huaping Liu
Jing Bai
21
0
0
26 Jan 2022
Bilevel Online Deep Learning in Non-stationary Environment
Ya-nan Han
Jian-wei Liu
Bing-biao Xiao
Xin-Tan Wang
Xiong-lin Luo
30
3
0
25 Jan 2022
Automated Deep Learning: Neural Architecture Search Is Not the End
Xuanyi Dong
D. Kedziora
Katarzyna Musial
Bogdan Gabrys
31
26
0
16 Dec 2021
CoMPS: Continual Meta Policy Search
Glen Berseth
Zhiwei Zhang
Grace Zhang
Chelsea Finn
Sergey Levine
CLL
OffRL
28
16
0
08 Dec 2021
Manas: Mining Software Repositories to Assist AutoML
Giang Nguyen
Johir Islam
Rangeet Pan
Hridesh Rajan
51
15
0
06 Dec 2021
On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su
Xiaozhi Wang
Yujia Qin
Chi-Min Chan
Yankai Lin
...
Peng Li
Juanzi Li
Lei Hou
Maosong Sun
Jie Zhou
AAML
VLM
31
98
0
12 Nov 2021
On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks
Dang Nguyen
T. Nguyen
Khai Nguyen
D.Q. Phung
Hung Bui
Nhat Ho
MoMe
24
9
0
29 Oct 2021
LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5
Chengwei Qin
Shafiq Joty
CLL
178
98
0
14 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
24
59
0
14 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
110
9
0
08 Oct 2021
Knowledge Transfer based Evolutionary Deep Neural Network for Intelligent Fault Diagnosis
Arun K. Sharma
N. Verma
49
2
0
28 Sep 2021