Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production
20 December 2023
Chandra Irugalbandara, Ashish Mahendra, Roland Daynauth, T. Arachchige, Jayanaka L. Dantanarayana, K. Flautner, Lingjia Tang, Yiping Kang, Jason Mars [ELM]
arXiv: 2312.14972
Papers citing "Scaling Down to Scale Up: A Cost-Benefit Analysis of Replacing OpenAI's LLM with Open Source SLMs in Production" (10 papers shown)
Aleph-Alpha-GermanWeb: Improving German-language LLM pre-training with model-based data curation and synthetic data generation
Thomas F Burns, Letitia Parcalabescu, Stephan Wäldchen, Michael Barlow, Gregor Ziegltrum, Volker Stampa, Bastian Harren, Björn Deiseroth [SyDa]
24 Apr 2025
Beyond Release: Access Considerations for Generative AI Systems
Irene Solaiman, Rishi Bommasani, Dan Hendrycks, Ariel Herbert-Voss, Yacine Jernite, Aviya Skowron, Andrew Trask
23 Feb 2025
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
Gopi Krishnan Rajbahadur, G. Oliva, Dayi Lin, Ahmed E. Hassan
28 Jan 2025
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan, Agam Goyal, Yilun Chen, Eshwar Chandrasekharan, Koustuv Saha [AI4MH]
17 Oct 2024
Benchmarking LLMs in Political Content Text-Annotation: Proof-of-Concept with Toxicity and Incivility Data
Bastián González-Bustamante
15 Sep 2024
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models
Nunzio Lorè, Alireza Ilami, Babak Heydari [LRM]
05 Aug 2024
Aligning Model Evaluations with Human Preferences: Mitigating Token Count Bias in Language Model Assessments
Roland Daynauth, Jason Mars [ALM]
05 Jul 2024
ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale
William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, S. Srinivasan, T. Krishna
24 Mar 2023
DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
Seongmin Hong, Seungjae Moon, Junsoo Kim, Sungjae Lee, Minsub Kim, Dongsoo Lee, Joo-Young Kim
22 Sep 2022
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro [MoE]
17 Sep 2019