ResearchTrend.AI

New Solutions on LLM Acceleration, Optimization, and Application
arXiv:2406.10903 · 16 June 2024
Authors: Yingbing Huang, Lily Jiaxin Wan, Hanchen Ye, Manvi Jha, Jinghua Wang, Yuhong Li, Xiaofan Zhang, Deming Chen

Papers citing "New Solutions on LLM Acceleration, Optimization, and Application"

10 papers shown.

  • ML For Hardware Design Interpretability: Challenges and Opportunities
    Raymond Baartmans, Andrew Ensinger, Victor Agostinelli, Lizhong Chen
    11 Apr 2025
  • From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
    Gopi Krishnan Rajbahadur, G. Oliva, Dayi Lin, Ahmed E. Hassan
    28 Jan 2025
  • SleepCoT: A Lightweight Personalized Sleep Health Model via Chain-of-Thought Distillation [VLM]
    Huimin Zheng, Xiaofeng Xing, Xiangmin Xu
    22 Oct 2024
  • Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
    Zuoning Zhang, Dhruv Parikh, Youning Zhang, Viktor Prasanna
    30 Aug 2024
  • On-Device Language Models: A Comprehensive Review
    Jiajun Xu, Zhiyuan Li, Wei Chen, Qun Wang, Xin Gao, Qi Cai, Ziyuan Ling
    26 Aug 2024
  • SnapKV: LLM Knows What You are Looking for Before Generation [VLM]
    Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr F. Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen
    22 Apr 2024
  • Hydragen: High-Throughput LLM Inference with Shared Prefixes
    Jordan Juravsky, Bradley Brown, Ryan Ehrlich, Daniel Y. Fu, Christopher Ré, Azalia Mirhoseini
    07 Feb 2024
  • Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
    Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang
    03 Feb 2024
  • Automated Code generation for Information Technology Tasks in YAML through Large Language Models
    Saurabh Pujar, Luca Buratti, Xiaojie Guo, Nicolas Dupuis, B. Lewis, ..., Atin Sood, Ganesh Nalawade, Matt Jones, Alessandro Morari, Ruchi Puri
    02 May 2023
  • DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
    Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim
    22 Sep 2022