LADs: Leveraging LLMs for AI-Driven DevOps

Automating cloud configuration and deployment remains a critical challenge due to evolving infrastructures, heterogeneous hardware, and fluctuating workloads. Existing solutions lack adaptability and require extensive manual tuning, leading to inefficiencies and misconfigurations. We introduce LADs, the first LLM-driven framework designed to tackle these challenges by ensuring robustness, adaptability, and efficiency in automated cloud management. Rather than merely applying existing techniques, LADs provides a principled approach to configuration optimization through an in-depth analysis of which optimizations work under which conditions. By leveraging Retrieval-Augmented Generation, Few-Shot Learning, Chain-of-Thought, and Feedback-Based Prompt Chaining, LADs generates accurate configurations and learns from deployment failures to iteratively refine system settings. Our findings reveal key insights into the trade-offs between performance, cost, and scalability, helping practitioners determine the right strategies for different deployment scenarios. For instance, we demonstrate how adaptive feedback loops based on prompt chaining enhance fault tolerance in multi-tenant environments and how structured log analysis with example shots improves configuration accuracy. Through extensive evaluations, we show that LADs reduces manual effort, optimizes resource utilization, and improves system reliability. By open-sourcing LADs, we aim to drive further innovation in AI-powered DevOps automation.
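The feedback-based prompt chaining described in the abstract can be illustrated with a minimal sketch. This is not the LADs implementation: the names `Attempt`, `build_prompt`, `refine_config`, `fake_llm`, and `fake_deploy` are hypothetical, the LLM and the deployment step are stubbed with plain functions, and retrieval (RAG) is left out. The sketch only shows the general loop: generate a configuration from few-shot examples, deploy it, and feed any failure log into the next prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One generate-and-deploy round (hypothetical structure)."""
    config: str
    error_log: Optional[str]  # None means the deployment succeeded


def build_prompt(task: str,
                 examples: List[Tuple[str, str]],
                 history: List[Attempt]) -> str:
    """Assemble a prompt from few-shot examples and earlier failed attempts."""
    parts = ["You generate cloud deployment configurations."]
    # Few-shot examples: (task description, known-good config) pairs.
    for ex_task, ex_config in examples:
        parts.append(f"Example task: {ex_task}\nExample config:\n{ex_config}")
    # Prompt chaining: each failed attempt and its log feed the next prompt.
    for i, attempt in enumerate(history, start=1):
        parts.append(
            f"Attempt {i} config:\n{attempt.config}\n"
            f"Deployment log:\n{attempt.error_log}"
        )
    # A simple chain-of-thought style instruction.
    parts.append(f"Task: {task}\nThink step by step, then output only the config.")
    return "\n\n".join(parts)


def refine_config(task: str,
                  examples: List[Tuple[str, str]],
                  llm: Callable[[str], str],
                  deploy: Callable[[str], Optional[str]],
                  max_rounds: int = 3) -> List[Attempt]:
    """Generate, deploy, and retry with failure feedback until success or budget."""
    history: List[Attempt] = []
    for _ in range(max_rounds):
        config = llm(build_prompt(task, examples, history))
        error_log = deploy(config)  # assumed to return None on success
        history.append(Attempt(config, error_log))
        if error_log is None:
            break
    return history


# Stubs standing in for a real model and a real deployment target.
def fake_llm(prompt: str) -> str:
    # Raises the memory limit only after seeing an out-of-memory log.
    if "OOMKilled" in prompt:
        return "replicas: 2\nmemory: 512Mi"
    return "replicas: 2\nmemory: 128Mi"


def fake_deploy(config: str) -> Optional[str]:
    if "512Mi" in config:
        return None
    return "OOMKilled: container exceeded memory limit"


history = refine_config(
    task="Deploy a web service",
    examples=[("Deploy a cache", "replicas: 1\nmemory: 256Mi")],
    llm=fake_llm,
    deploy=fake_deploy,
)
```

With these stubs, the first attempt fails and its log is carried into the second prompt, which is the point of the chaining. In a real setting, `llm` would wrap a model call and `deploy` would wrap the actual deployment and log collection.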
@article{khan2025_2502.20825,
  title={LADs: Leveraging LLMs for AI-Driven DevOps},
  author={Ahmad Faraz Khan and Azal Ahmad Khan and Anas Mohamed and Haider Ali and Suchithra Moolinti and Sabaat Haroon and Usman Tahir and Mattia Fazzini and Ali R. Butt and Ali Anwar},
  journal={arXiv preprint arXiv:2502.20825},
  year={2025}
}