ResearchTrend.AI

New Solutions on LLM Acceleration, Optimization, and Application
arXiv:2406.10903 · 16 June 2024
Authors: Yingbing Huang, Lily Jiaxin Wan, Hanchen Ye, Manvi Jha, Jinghua Wang, Yuhong Li, Xiaofan Zhang, Deming Chen

Papers citing "New Solutions on LLM Acceleration, Optimization, and Application"

10 papers shown.

  • ML For Hardware Design Interpretability: Challenges and Opportunities
    Raymond Baartmans, Andrew Ensinger, Victor Agostinelli, Lizhong Chen
    11 Apr 2025
  • From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
    Gopi Krishnan Rajbahadur, G. Oliva, Dayi Lin, Ahmed E. Hassan
    28 Jan 2025
  • SleepCoT: A Lightweight Personalized Sleep Health Model via Chain-of-Thought Distillation [VLM]
    Huimin Zheng, Xiaofeng Xing, Xiangmin Xu
    22 Oct 2024
  • Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
    Zuoning Zhang, Dhruv Parikh, Youning Zhang, Viktor Prasanna
    30 Aug 2024
  • On-Device Language Models: A Comprehensive Review
    Jiajun Xu, Zhiyuan Li, Wei Chen, Qun Wang, Xin Gao, Qi Cai, Ziyuan Ling
    26 Aug 2024
  • SnapKV: LLM Knows What You are Looking for Before Generation [VLM]
    Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr F. Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen
    22 Apr 2024
  • Hydragen: High-Throughput LLM Inference with Shared Prefixes
    Jordan Juravsky, Bradley Brown, Ryan Ehrlich, Daniel Y. Fu, Christopher Ré, Azalia Mirhoseini
    07 Feb 2024
  • Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
    Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang
    03 Feb 2024
  • Automated Code generation for Information Technology Tasks in YAML through Large Language Models
    Saurabh Pujar, Luca Buratti, Xiaojie Guo, Nicolas Dupuis, B. Lewis, ..., Atin Sood, Ganesh Nalawade, Matt Jones, Alessandro Morari, Ruchi Puri
    02 May 2023
  • DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
    Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim
    22 Sep 2022