ResearchTrend.AI
Net2Net: Accelerating Learning via Knowledge Transfer


18 November 2015
Tianqi Chen
Ian Goodfellow
Jonathon Shlens

Papers citing "Net2Net: Accelerating Learning via Knowledge Transfer"

50 / 135 papers shown
Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws
Xiyuan Wei
Ming Lin
Fanjiang Ye
Fengguang Song
Liangliang Cao
My T. Thai
Tianbao Yang
LLMSV
34
0
0
10 May 2025
FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures
Jiacheng Wang
Hongtao Lv
Lei Liu
FedML
25
0
0
10 May 2025
A Framework for Elastic Adaptation of User Multiple Intents in Sequential Recommendation
Zhikai Wang
Yanyan Shen
AI4TS
35
0
0
30 Apr 2025
A multilevel approach to accelerate the training of Transformers
Guillaume Lauga
Maël Chaumette
Edgar Desainte-Maréville
Étienne Lasalle
Arthur Lebeurrier
AI4CE
45
0
0
24 Apr 2025
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters
Haiyang Wang
Yue Fan
Muhammad Ferjad Naeem
Yongqin Xian
J. E. Lenssen
Liwei Wang
F. Tombari
Bernt Schiele
49
2
0
30 Oct 2024
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Chenze Shao
Fandong Meng
Jie Zhou
53
1
0
17 Jul 2024
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li
Lingzhi Gao
Chao Wu
AI4CE
DiffM
55
3
0
23 May 2024
A Multi-Level Framework for Accelerating Training Transformer Models
Longwei Zou
Han Zhang
Yangdong Deng
AI4CE
40
1
0
07 Apr 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T
Arnav Chavan
Deepak Gupta
MDE
32
0
0
19 Feb 2024
Initializing Models with Larger Ones
Zhiqiu Xu
Yanjie Chen
Kirill Vishniakov
Yida Yin
Zhiqiang Shen
Trevor Darrell
Lingjie Liu
Zhuang Liu
38
17
0
30 Nov 2023
Ever Evolving Evaluator (EV3): Towards Flexible and Reliable Meta-Optimization for Knowledge Distillation
Li Ding
M. Zoghi
Guy Tennenholtz
Maryam Karimzadehgan
26
0
0
29 Oct 2023
FLM-101B: An Open LLM and How to Train It with $100K Budget
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
Li Du
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
60
21
0
07 Sep 2023
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
40
8
0
11 Aug 2023
Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search
A. Chebykin
A. Dushatskiy
Tanja Alderliesten
Peter A. N. Bosman
44
0
0
28 Jul 2023
TransformerG2G: Adaptive time-stepping for learning temporal graph embeddings using transformers
Alan John Varghese
Aniruddha Bora
Mengjia Xu
George Karniadakis
36
5
0
05 Jul 2023
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan
Pedro H. P. Savarese
Michael Maire
13
5
0
22 Jun 2023
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?
Boris Knyazev
Doha Hwang
Simon Lacoste-Julien
AI4CE
39
17
0
07 Mar 2023
Multipath agents for modular multitask ML systems
Andrea Gesmundo
28
1
0
06 Feb 2023
Efficient Evaluation Methods for Neural Architecture Search: A Survey
Xiangning Xie
Xiaotian Song
Zeqiong Lv
Gary G. Yen
Weiping Ding
Yizhou Sun
38
12
0
14 Jan 2023
RMM: Reinforced Memory Management for Class-Incremental Learning
Yaoyao Liu
Bernt Schiele
Qianru Sun
CLL
37
93
0
14 Jan 2023
Deep Residual Axial Networks
Nazmul Shahadat
Anthony Maida
3DPC
48
4
0
11 Jan 2023
Reversible Column Networks
Yuxuan Cai
Yi Zhou
Qi Han
Jianjian Sun
Xiangwen Kong
Jun Yu Li
Xiangyu Zhang
VLM
31
53
0
22 Dec 2022
On-device Training: A First Overview on Existing Systems
Shuai Zhu
Thiemo Voigt
Jeonggil Ko
Fatemeh Rahimian
34
14
0
01 Dec 2022
Learning Label Modular Prompts for Text Classification in the Wild
Hailin Chen
Amrita Saha
Shafiq Joty
Steven C. H. Hoi
OOD
VLM
26
5
0
30 Nov 2022
Pruning Very Deep Neural Network Channels for Efficient Inference
Yihui He
35
1
0
14 Nov 2022
MiNL: Micro-images based Neural Representation for Light Fields
Ziru Xu
Henan Wang
Zhibo Chen
33
1
0
17 Sep 2022
On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
Rohan Anil
S. Gadanho
Danya Huang
Nijith Jacob
Zhuoshu Li
...
Cristina Pop
Kevin Regan
G. Shamir
Rakesh Shivanna
Qiqi Yan
3DV
29
41
0
12 Sep 2022
Continuous QA Learning with Structured Prompts
Yinhe Zheng
CLL
33
1
0
31 Aug 2022
A Survey of Open Source Automation Tools for Data Science Predictions
Nicholas Hoell
30
0
0
24 Aug 2022
Anomaly Detection and Inter-Sensor Transfer Learning on Smart Manufacturing Datasets
Mustafa Abdallah
B. Joung
Wo Jae Lee
C. Mousoulis
J. Sutherland
S. Bagchi
34
20
0
13 Jun 2022
Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress
Rishabh Agarwal
Max Schwarzer
Pablo Samuel Castro
Rameswar Panda
Marc G. Bellemare
OffRL
OnRL
37
63
0
03 Jun 2022
A Survey on Computationally Efficient Neural Architecture Search
Shiqing Liu
Haoyu Zhang
Yaochu Jin
38
41
0
03 Jun 2022
Online Deep Learning from Doubly-Streaming Data
H. Lian
John Scovil Atwood
Bo-Jian Hou
Jian Wu
Yi He
26
10
0
25 Apr 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
33
46
0
28 Mar 2022
Lifelong Adaptive Machine Learning for Sensor-based Human Activity Recognition Using Prototypical Networks
Rebecca Adaimi
Edison Thomaz
CLL
29
13
0
11 Mar 2022
Update Compression for Deep Neural Networks on the Edge
Bo Chen
A. Bakhshi
Gustavo E. A. P. A. Batista
Brian Ng
Tat-Jun Chin
31
17
0
09 Mar 2022
Consistent Representation Learning for Continual Relation Extraction
Kang Zhao
Hua Xu
Jian Yang
Kai Gao
CLL
31
51
0
05 Mar 2022
Continual Few-shot Relation Learning via Embedding Space Regularization and Data Augmentation
Chengwei Qin
Shafiq Joty
BDL
CLL
20
33
0
04 Mar 2022
DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models
A. Treacher
A. Montillo
27
0
0
24 Feb 2022
An Automated Question-Answering Framework Based on Evolution Algorithm
Sinan Tan
Hui Xue
Qiyu Ren
Huaping Liu
Jing Bai
21
0
0
26 Jan 2022
Bilevel Online Deep Learning in Non-stationary Environment
Ya-nan Han
Jian-wei Liu
Bing-biao Xiao
Xin-Tan Wang
Xiong-lin Luo
30
3
0
25 Jan 2022
Automated Deep Learning: Neural Architecture Search Is Not the End
Xuanyi Dong
D. Kedziora
Katarzyna Musial
Bogdan Gabrys
31
26
0
16 Dec 2021
CoMPS: Continual Meta Policy Search
Glen Berseth
Zhiwei Zhang
Grace Zhang
Chelsea Finn
Sergey Levine
CLL
OffRL
28
16
0
08 Dec 2021
Manas: Mining Software Repositories to Assist AutoML
Giang Nguyen
Johir Islam
Rangeet Pan
Hridesh Rajan
51
15
0
06 Dec 2021
On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su
Xiaozhi Wang
Yujia Qin
Chi-Min Chan
Yankai Lin
...
Peng Li
Juanzi Li
Lei Hou
Maosong Sun
Jie Zhou
AAML
VLM
31
98
0
12 Nov 2021
On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks
Dang Nguyen
T. Nguyen
Khai Nguyen
D.Q. Phung
Hung Bui
Nhat Ho
MoMe
24
9
0
29 Oct 2021
LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5
Chengwei Qin
Shafiq Joty
CLL
178
98
0
14 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
24
59
0
14 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
110
9
0
08 Oct 2021
Knowledge Transfer based Evolutionary Deep Neural Network for Intelligent Fault Diagnosis
Arun K. Sharma
N. Verma
49
2
0
28 Sep 2021