On Speculative Decoding for Multimodal Large Language Models

13 April 2024

Christopher Lott

Papers citing "On Speculative Decoding for Multimodal Large Language Models"

11 / 11 papers shown

Thinking with Drafts: Speculative Temporal Reasoning for Efficient Long Video Understanding

190

1

0

30 Nov 2025

HiViS: Hiding Visual Tokens from the Drafter for Speculative Decoding in Vision-Language Models

204

1

0

28 Sep 2025

ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding

465

9

0

17 Sep 2025

SpecVLM: Fast Speculative Decoding in Vision-Language Models

294

1

0

15 Sep 2025

VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

Sudhanshu Agrawal

...

186

6

0

28 Jun 2025

SpecFLASH: A Latent-Guided Semi-autoregressive Speculative Decoding Framework for Efficient Multimodal Generation

Joey Tianyi Zhou

479

1

0

19 May 2025

MASSV: Multimodal Adaptation and Self-Data Distillation for Speculative Decoding of Vision-Language Models

Mugilan Ganesan

Nish Sinnadurai

Vithursan Thangarasa

500

0

0

15 May 2025

Growing a Multi-head Twig via Distillation and Reinforcement Learning to Accelerate Large Vision-Language Models

417

17

0

18 Mar 2025

DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models

Computer Vision and Pattern Recognition (CVPR), 2025

Saeed Ranjbar Alvar

Gursimran Singh

Mohammad Akbari

706

92

0

04 Mar 2025

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding

International Conference on Learning Representations (ICLR), 2024

Eunho Yang

587

40

0

04 Oct 2024

Fast Transformer Decoding: One Write-Head is All You Need

Noam M. Shazeer

823

716

0

06 Nov 2019
