ResearchTrend.AI

arXiv: 2505.17420

DASH: Input-Aware Dynamic Layer Skipping for Efficient LLM Inference with Markov Decision Policies

23 May 2025
Ning Yang
Fangxin Liu
Junjie Wang
Tao Yang
Kan Liu
Haibing Guan
Li Jiang
    AI4CE

Papers citing "DASH: Input-Aware Dynamic Layer Skipping for Efficient LLM Inference with Markov Decision Policies"

12 / 12 papers shown
AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference
Zhuomin He
Yizhen Yao
Pengfei Zuo
Bin Gao
Qinya Li
Zhenzhe Zheng
Fan Wu
100
1
0
04 Jan 2025
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
Mostafa Elhoushi
Akshat Shrivastava
Diana Liskovich
Basil Hosmer
Bram Wasti
...
Saurabh Agarwal
Ahmed Roman
Ahmed Aly
Beidi Chen
Carole-Jean Wu
LRM
107
110
0
25 Apr 2024
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
Yijin Liu
Fandong Meng
Jie Zhou
AI4CE
81
9
0
10 Apr 2024
FFN-SkipLLM: A Hidden Gem for Autoregressive Decoding with Adaptive Feed Forward Skipping
Ajay Jaiswal
Bodun Hu
Lu Yin
Yeonju Ro
Shiwei Liu
Tianlong Chen
Aditya Akella
132
17
0
05 Apr 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men
Mingyu Xu
Qingyu Zhang
Bingning Wang
Hongyu Lin
Yaojie Lu
Xianpei Han
Weipeng Chen
117
141
0
06 Mar 2024
Not All Layers of LLMs Are Necessary During Inference
Siqi Fan
Xin Jiang
Xiang Li
Xuying Meng
Peng Han
Shuo Shang
Aixin Sun
Yequan Wang
Zhongyuan Wang
121
44
0
04 Mar 2024
SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
Luciano Del Corro
Allison Del Giorno
Sahaj Agarwal
Ting Yu
Ahmed Hassan Awadallah
Subhabrata Mukherjee
123
61
0
05 Jul 2023
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM, RALM
204
4,580
0
07 Sep 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.0K
42,651
0
28 May 2020
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM, RALM, LRM
237
2,676
0
14 Mar 2018
Get To The Point: Summarization with Pointer-Generator Networks
A. See
Peter J. Liu
Christopher D. Manning
3DPC
318
4,032
0
14 Apr 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
378
2,903
0
26 Sep 2016