Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2401.09890
Cited By
A Survey on Hardware Accelerators for Large Language Models
18 January 2024
C. Kachris
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Survey on Hardware Accelerators for Large Language Models"
10 / 10 papers shown
Title
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Haozhao Wang
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
112
1
0
18 Dec 2024
PIM-AI: A Novel Architecture for High-Efficiency LLM Inference
Cristobal Ortega
Yann Falevoz
Renaud Ayrignac
76
1
0
26 Nov 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
57
15
0
06 Oct 2024
CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs
Junlin Lv
Yuan Feng
Xike Xie
Xin Jia
Qirong Peng
Guiming Xie
21
3
0
19 Sep 2024
On-Device Language Models: A Comprehensive Review
Jiajun Xu
Zhiyuan Li
Wei Chen
Qun Wang
Xin Gao
Qi Cai
Ziyuan Ling
32
24
0
26 Aug 2024
Designing Efficient LLM Accelerators for Edge Devices
Jude Haris
Rappy Saha
Wenhao Hu
José Cano
16
7
0
01 Aug 2024
Mobile Edge Intelligence for Large Language Models: A Contemporary Survey
Guanqiao Qu
Qiyuan Chen
Wei Wei
Zheng Lin
Xianhao Chen
Kaibin Huang
33
41
0
09 Jul 2024
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Keivan Alizadeh-Vahid
Iman Mirzadeh
Dmitry Belenko
Karen Khatamifard
Minsik Cho
C. C. D. Mundo
Mohammad Rastegari
Mehrdad Farajtabar
70
104
0
12 Dec 2023
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
Seongmin Hong
Seungjae Moon
Junsoo Kim
Sungjae Lee
Minsub Kim
Dongsoo Lee
Joo-Young Kim
64
74
0
22 Sep 2022
Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention
Zhe Zhou
Junling Liu
Zhenyu Gu
Guangyu Sun
56
39
0
18 Oct 2021
1