
arXiv:2402.10076
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference

15 February 2024
Taesu Kim, Jongho Lee, Daehyun Ahn, Sarang Kim, Jiwoong Choi, Minkyu Kim, Hyungjun Kim

Papers citing "QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference"

1 / 1 papers shown

SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos, Maximilian L. Croci, Marcelo Gennari do Nascimento, Torsten Hoefler, James Hensman
VLM
26 Jan 2024