Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production
20 December 2023
Chandra Irugalbandara, Ashish Mahendra, Roland Daynauth, T. Arachchige, Jayanaka L. Dantanarayana, K. Flautner, Lingjia Tang, Yiping Kang, Jason Mars [ELM]
arXiv: 2312.14972
Papers citing "Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production" (10 papers shown)
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
Thomas F Burns, Letitia Parcalabescu, Stephan Wäldchen, Michael Barlow, Gregor Ziegltrum, Volker Stampa, Bastian Harren, Björn Deiseroth [SyDa]
24 Apr 2025
Beyond Release: Access Considerations for Generative AI Systems
Irene Solaiman, Rishi Bommasani, Dan Hendrycks, Ariel Herbert-Voss, Yacine Jernite, Aviya Skowron, Andrew Trask
23 Feb 2025
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur, G. Oliva, Dayi Lin, Ahmed E. Hassan
28 Jan 2025
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan, Agam Goyal, Yilun Chen, Eshwar Chandrasekharan, Koustuv Saha [AI4MH]
17 Oct 2024
Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data
Bastián González-Bustamante
15 Sep 2024
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè, Alireza Ilami, Babak Heydari [LRM]
05 Aug 2024
Aligning Model Evaluations with Human Preferences: Mitigating Token Count Bias in Language Model Assessments
Roland Daynauth, Jason Mars [ALM]
05 Jul 2024
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, S. Srinivasan, T. Krishna
24 Mar 2023
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim
22 Sep 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro [MoE]
17 Sep 2019