FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance

9 May 2023

Papers citing "FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance"

37 / 37 papers shown

Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering Jihao Zhao Chunlai Zhou Biao Qin 52 0 0 05 May 2025
Beyond the model: Key differentiators in large language models and multi-agent services Muskaan Goyal Pranav Bhasin LLMAG ELM 143 0 0 05 May 2025
COSMOS: Predictable and Cost-Effective Adaptation of LLMs Jiayu Wang Aws Albarghouthi Frederic Sala 47 0 0 30 Apr 2025
DNB-AI-Project at SemEval-2025 Task 5: An LLM-Ensemble Approach for Automated Subject Indexing Lisa Kluge Maximilian Kähler 102 1 0 30 Apr 2025
Bi-directional Model Cascading with Proxy Confidence David Warren Mark Dras 44 0 0 27 Apr 2025
From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System Rohan Surana Junda Wu Zhouhang Xie Yu Xia Harald Steck Dawen Liang Nathan Kallus Julian McAuley 26 0 0 21 Apr 2025
Harnessing Multiple Large Language Models: A Survey on LLM Ensemble Zhijun Chen Jingzheng Li Pengpeng Chen Zhuoran Li Kai Sun Yuankai Luo Qianren Mao Dingqi Yang Hailong Sun Philip S. Yu ELM 50 4 0 25 Feb 2025
Beyond Release: Access Considerations for Generative AI Systems Irene Solaiman Rishi Bommasani Dan Hendrycks Ariel Herbert-Voss Yacine Jernite Aviya Skowron Andrew Trask 60 1 0 23 Feb 2025
Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation SeongYeub Chu JongWoo Kim MunYong Yi 57 3 0 21 Feb 2025
A Unified Approach to Routing and Cascading for LLMs Jasper Dekoninck Maximilian Baader Martin Vechev 60 2 0 17 Feb 2025
Leveraging Uncertainty Estimation for Efficient LLM Routing Tuo Zhang Asal Mehradfar Dimitrios Dimitriadis Salman Avestimehr 51 1 0 16 Feb 2025
Cost-Saving LLM Cascades with Early Abstention Michael J. Zellinger Rex Liu Matt Thomson 102 0 0 13 Feb 2025
EvoFlow: Evolving Diverse Agentic Workflows On The Fly Guibin Zhang Kaijie Chen Guancheng Wan Heng Chang Hong Cheng K. Wang Shuyue Hu Lei Bai 77 2 0 11 Feb 2025
Towards Sustainable NLP: Insights from Benchmarking Inference Energy in Large Language Models S. Poddar Paramita Koley Janardan Misra Niloy Ganguly Saptarshi Ghosh 61 0 0 08 Feb 2025
Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents Chenyang Shao Xinyuan Hu Yutang Lin Fengli Xu LLMAG LRM 65 4 0 06 Feb 2025
Multi-Objective Hyperparameter Selection via Hypothesis Testing on Reliability Graphs Amirmohammad Farzaneh Osvaldo Simeone 86 0 0 22 Jan 2025
Towards Optimizing SQL Generation via LLM Routing Mohammadhossein Malekpour Nour Shaheen Foutse Khomh Amine Mhedhbi AI4TS 31 2 0 06 Nov 2024
BlabberSeg: Real-Time Embedded Open-Vocabulary Aerial Segmentation Haechan Mark Bong Ricardo de Azambuja Giovanni Beltrame VLM 31 0 0 16 Oct 2024
GraphRouter: A Graph-based Router for LLM Selections Tao Feng Yanzhen Shen Jiaxuan You 79 10 0 04 Oct 2024
What is the Role of Small Models in the LLM Era: A Survey Lihu Chen Gaël Varoquaux ALM 60 23 0 10 Sep 2024
MetaLLM: A High-performant and Cost-efficient Dynamic Framework for Wrapping LLMs Quang H. Nguyen Duy C. Hoang Juliette Decugis Saurav Manchanda Nitesh V. Chawla Khoa D. Doan 37 6 0 15 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models Jinliang Lu Ziliang Pang Min Xiao Yaochen Zhu Rui Xia Jiajun Zhang MoMe 38 18 0 08 Jul 2024
RouteLLM: Learning to Route LLMs with Preference Data Isaac Ong Amjad Almahairi Vincent Wu Wei-Lin Chiang Tianhao Wu Joseph E. Gonzalez M. W. Kadous Ion Stoica 70 71 0 26 Jun 2024
Mixture-of-Agents Enhances Large Language Model Capabilities Junlin Wang Jue Wang Ben Athiwaratkun Ce Zhang James Zou LLMAG AIFin 41 97 0 07 Jun 2024
Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection Guillem Ramírez Alexandra Birch Ivan Titov 38 8 0 03 May 2024
Model Callers for Transforming Predictive and Generative AI Applications Mukesh Dalal 21 0 0 17 Apr 2024
Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models Keming Lu Hongyi Yuan Runji Lin Junyang Lin Zheng Yuan Chang Zhou Jingren Zhou MoE LRM 40 52 0 15 Nov 2023
Tree Prompting: Efficient Task Adaptation without Fine-Tuning John X. Morris Chandan Singh Alexander M. Rush Jianfeng Gao Yuntian Deng VLM LRM 19 17 0 21 Oct 2023
AutoMix: Automatically Mixing Language Models Pranjal Aggarwal Aman Madaan Ankit Anand Srividya Pranavi Potharaju Swaroop Mishra ... Karthik Kappaganthu Yiming Yang Shyam Upadhyay Manaal Faruqui Mausam 40 17 0 19 Oct 2023
The Costly Dilemma: Generalization, Evaluation and Cost-Optimal Deployment of Large Language Models Abi Aryan Aakash Kumar Nain Andrew McMahon Lucas Augusto Meyer Harpreet Sahota 22 6 0 15 Aug 2023
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs Pranjal Aggarwal Aman Madaan Yiming Yang Mausam LRM 28 34 0 19 May 2023
Ask Me Anything: A simple strategy for prompting language models Simran Arora A. Narayan Mayee F. Chen Laurel J. Orr Neel Guha Kush S. Bhatia Ines Chami Frederic Sala Christopher Ré ReLM LRM 211 206 0 05 Oct 2022
Toward Trustworthy Neural Program Synthesis Darren Key Wen-Ding Li Kevin Ellis NAI 83 5 0 29 Sep 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Jason W. Wei Xuezhi Wang Dale Schuurmans Maarten Bosma Brian Ichter F. Xia Ed H. Chi Quoc Le Denny Zhou LM&Ro LRM AI4CE ReLM 347 8,457 0 28 Jan 2022
Towards Efficient Post-training Quantization of Pre-trained Language Models Haoli Bai Lu Hou Lifeng Shang Xin Jiang Irwin King M. Lyu MQ 71 47 0 30 Sep 2021
Efficient Online ML API Selection for Multi-Label Classification Tasks Lingjiao Chen Matei A. Zaharia James Y. Zou 32 16 0 18 Feb 2021
What Makes Good In-Context Examples for GPT-3? Jiachang Liu Dinghan Shen Yizhe Zhang Bill Dolan Lawrence Carin Weizhu Chen AAML RALM 275 1,312 0 17 Jan 2021