Online Cascade Learning for Efficient Inference over Streams
arXiv:2402.04513 (v3, latest)

7 February 2024
Lunyiu Nie
Zhimin Ding
Erdong Hu
Christopher M. Jermaine
Swarat Chaudhuri

Papers citing "Online Cascade Learning for Efficient Inference over Streams"

C3PO: Optimized Large Language Model Cascades with Probabilistic Cost Constraints for Reasoning
Antonios Valkanas
Soumyasundar Pal
Pavel Rumiantsev
Yingxue Zhang
Mark Coates
220
0
0
10 Nov 2025
From Deferral to Learning: Online In-Context Knowledge Distillation for LLM Cascades
Yu Wu
Shuo Wu
Ye Tao
Yansong Li
Anand Sarwate
RALM
254
0
0
26 Sep 2025
T-TAMER: Provably Taming Trade-offs in ML Serving
Yuanyuan Yang
Ruimin Zhang
Jamie Morgenstern
Haifeng Xu
121
0
0
26 Sep 2025
RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing
Ruihan Jin
Pengpeng Shao
Zhengqi Wen
Jinyang Wu
Mingkuan Feng
Shuai Zhang
Jianhua Tao
373
4
0
04 Jun 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
302
1
0
27 Apr 2025
Toward Super Agent System with Hybrid AI Routers
Yuhang Yao
Haixin Wang
Yibo Chen
Jiawen Wang
Min Chang Jordan Ren
Bosheng Ding
Salman Avestimehr
Chaoyang He
LM&Ro, LLMAG
531
4
0
11 Apr 2025
Resource-efficient Inference with Foundation Model Programs
Lunyiu Nie
Zhimin Ding
Kevin Yu
Marco Cheung
C. Jermaine
S. Chaudhuri
332
1
0
09 Apr 2025
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck
Maximilian Baader
Martin Vechev
464
25
0
14 Oct 2024
RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
Neural Information Processing Systems (NeurIPS), 2024
Shuhao Chen
Weisen Jiang
Xiaoyuan Zhang
James T. Kwok
Yu Zhang
RALM, MQ
287
44
0
30 Sep 2024
MODL: Multilearner Online Deep Learning
Antonios Valkanas
Boris N. Oreshkin
Mark Coates
419
2
0
28 May 2024