ResearchTrend.AI
arXiv: 2110.08190
Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

15 October 2021
Shaoyi Huang
Dongkuan Xu
Ian En-Hsu Yen
Yijue Wang
Sung-En Chang
Bingbing Li
Shiyang Chen
Mimi Xie
Sanguthevar Rajasekaran
Hang Liu
Caiwen Ding
    CLL
    VLM

Papers citing "Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm"

9 / 9 papers shown
Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks
Xiaoru Xie
Hongwu Peng
Amit Hasan
Shaoyi Huang
Jiahui Zhao
Haowen Fang
Wei Zhang
Tong Geng
O. Khan
Caiwen Ding
GNN
30
30
0
22 Aug 2023
AutoReP: Automatic ReLU Replacement for Fast Private Network Inference
Hongwu Peng
Shaoyi Huang
Tong Zhou
Yukui Luo
Chenghong Wang
...
Tony Geng
Kaleel Mahmood
Wujie Wen
Xiaolin Xu
Caiwen Ding
OffRL
32
38
0
20 Aug 2023
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Shaoyi Huang
Bowen Lei
Dongkuan Xu
Hongwu Peng
Yue Sun
Mimi Xie
Caiwen Ding
13
19
0
30 Nov 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin
Yeyun Gong
Xiao Liu
Hang Zhang
Chen Lin
...
Jian Jiao
Jing Lu
Daxin Jiang
Rangan Majumder
Nan Duan
35
27
0
27 Sep 2022
Towards Sparsification of Graph Neural Networks
Hongwu Peng
Deniz Gurevin
Shaoyi Huang
Tong Geng
Weiwen Jiang
O. Khan
Caiwen Ding
GNN
30
24
0
11 Sep 2022
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
148
376
0
23 Jul 2020
What is the State of Neural Network Pruning?
Davis W. Blalock
Jose Javier Gonzalez Ortiz
Jonathan Frankle
John Guttag
181
1,027
0
06 Mar 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
197
0
07 Feb 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,950
0
20 Apr 2018