ResearchTrend.AI
Not All Layers of LLMs Are Necessary During Inference

4 March 2024
Siqi Fan
Xin Jiang
Xiang Li
Xuying Meng
Peng Han
Shuo Shang
Aixin Sun
Yequan Wang
Zhongyuan Wang

Papers citing "Not All Layers of LLMs Are Necessary During Inference"

6 / 6 papers shown
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model
Qiyuan Deng
X. Bai
Kehai Chen
Yaowei Wang
Liqiang Nie
Min Zhang
OffRL
50
0
0
13 Mar 2025
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
25
13
0
06 Oct 2024
SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
Jiwon Song
Kyungseok Oh
Taesu Kim
Hyungjun Kim
Yulhwa Kim
Jae-Joon Kim
47
5
0
14 Feb 2024
One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space
Raghav Addanki
Chenyang Li
Zhao Song
Chiwun Yang
34
2
0
24 Nov 2023
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
137
203
0
18 Feb 2022
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
128
526
0
31 Jan 2021