ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.01353
  4. Cited By
Efficiently Distilling LLMs for Edge Applications

Efficiently Distilling LLMs for Edge Applications

1 April 2024
Achintya Kundu
Fabian Lim
Aaron Chew
L. Wynter
Penny Chong
Rhui Dih Lee
ArXivPDFHTML

Papers citing "Efficiently Distilling LLMs for Edge Applications"

8 / 8 papers shown
Title
SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on
  Associative Processors
SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors
M. Rakka
J. Li
Guohao Dai
A. Eltawil
M. Fouda
Fadi J. Kurdahi
60
1
0
26 Nov 2024
CE-CoLLM: Efficient and Adaptive Large Language Models Through
  Cloud-Edge Collaboration
CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration
Hongpeng Jin
Yanzhao Wu
42
4
0
05 Nov 2024
New Solutions on LLM Acceleration, Optimization, and Application
New Solutions on LLM Acceleration, Optimization, and Application
Yingbing Huang
Lily Jiaxin Wan
Hanchen Ye
Manvi Jha
Jinghua Wang
Yuhong Li
Xiaofan Zhang
Deming Chen
37
12
0
16 Jun 2024
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge
  Devices
Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices
Ruiyang Qin
Dancheng Liu
Zheyu Yan
Zhaoxuan Tan
Zixuan Pan
Zhenge Jia
Meng-Long Jiang
Ahmed Abbasi
Jinjun Xiong
Yiyu Shi
51
10
0
06 Jun 2024
Distilling Step-by-Step! Outperforming Larger Language Models with Less
  Training Data and Smaller Model Sizes
Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
Lokesh Nagalapatti
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alexander Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister
ALM
206
499
0
03 May 2023
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
225
574
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,943
0
20 Apr 2018
Neural Architecture Search with Reinforcement Learning
Neural Architecture Search with Reinforcement Learning
Barret Zoph
Quoc V. Le
264
5,326
0
05 Nov 2016
1