Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models

3 January 2022
Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji
arXiv: 2201.00558 (abs · PDF · HTML) · GitHub (20,693★)

Papers citing "Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models"

Showing 6 of 6 citing papers.

On Multilingual Encoder Language Model Compression for Low-Resource Languages
Daniil Gurgurov, Michal Gregor, Josef van Genabith, Simon Ostermann
22 May 2025

Small Language Models in the Real World: Insights from Industrial Text Classification
Lujun Li, Lama Sleem, Niccolo Gentile, Geoffrey Nichil, Radu State
Topics: LLMAG
21 May 2025

The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation
Haryo Akbarianto Wibowo, Thamar Solorio, Alham Fikri Aji
24 Jun 2024

Simple Hack for Transformers against Heavy Long-Text Classification on a Time- and Memory-Limited GPU Service
Mirza Alim Mutasodirin, Radityo Eko Prasojo, Achmad F. Abka, Hanif Rasyidi
Topics: VLM
19 Mar 2024

Improving Neural Topic Models with Wasserstein Knowledge Distillation
European Conference on Information Retrieval (ECIR), 2023
Suman Adhya, Debarshi Kumar Sanyal
Topics: BDL
27 Mar 2023

Xception: Deep Learning with Depthwise Separable Convolutions
Computer Vision and Pattern Recognition (CVPR), 2017
François Chollet
Topics: MDE, BDL, PINN
07 Oct 2016