2407.00326
Cited By
Teola: Towards End-to-End Optimization of LLM-based Applications
29 June 2024
Xin Tan
Yimin Jiang
Yitao Yang
Hong-Yu Xu
Papers citing
"Teola: Towards End-to-End Optimization of LLM-based Applications"
8 / 8 papers shown
Title
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
69
0
0
01 May 2025
Taming the Titans: A Survey of Efficient LLM Inference Serving
Ranran Zhen
J. Li
Yixin Ji
Z. Yang
Tong Liu
Qingrong Xia
Xinyu Duan
Z. Wang
Baoxing Huai
M. Zhang
LLMAG
77
0
0
28 Apr 2025
Software Performance Engineering for Foundation Model-Powered Software (FMware)
Haoxiang Zhang
Shi Chang
Arthur Leung
Kishanthan Thangarajah
Boyuan Chen
Hanan Lutfiyya
Ahmed E. Hassan
45
0
0
14 Nov 2024
LLMProxy: Reducing Cost to Access Large Language Models
Noah Martin
Abdullah Bin Faisal
Hiba Eltigani
Rukhshan Haroon
Swaminathan Lamelas
Fahad Dogar
31
1
0
04 Oct 2024
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
Chaofan Lin
Zhenhua Han
Chengruidong Zhang
Yuqing Yang
Fan Yang
Chen Chen
Lili Qiu
71
35
0
30 May 2024
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding
Jie Ou
Yueming Chen
Wenhong Tian
42
10
0
10 Apr 2024
Optimizing LLM Queries in Relational Data Analytics Workloads
Shu Liu
Asim Biswal
Audrey Cheng
Xiangxi Mo
Shiyi Cao
...
Ion Stoica
Matei A. Zaharia
Joseph E. Gonzalez
51
19
0
09 Mar 2024
Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines
Francisco Romero
Mark Zhao
N. Yadwadkar
Christos Kozyrakis
31
100
0
03 Feb 2021