Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy

Papers citing "Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy"