Scientific workflows from a variety of applications are increasingly run in the cloud. Under the pay-as-you-go pricing scheme, performance and monetary cost are two important optimization metrics. Although previous studies have addressed minimizing the monetary cost of scientific workflows, most of them assume static execution times and a static pricing scheme, and define QoS as meeting a static deadline. However, the cloud environment is dynamic: performance varies because of interference from concurrent executions, and prices vary as well, as with the spot prices offered by Amazon EC2. We therefore propose offering probabilistic performance guarantees for individual workflows, a notion that captures both performance and price dynamics. We further develop a probabilistic scheduling framework called Dyna to minimize the monetary cost while offering probabilistic deadline guarantees. The framework includes runtime refinement to handle performance dynamics, and a hybrid instance configuration approach that captures the best of both worlds in price dynamics: on-demand instances, which offer a reliability guarantee, and spot instances, which usually cost much less. We have developed a simulator calibrated against real cloud providers. Experimental results with Amazon EC2 settings demonstrate (1) the accuracy of our simulations in capturing the distributions of cost and execution time of workflows running in a real cloud environment; and (2) the effectiveness of our framework in reducing monetary cost relative to existing approaches while offering probabilistic guarantees on the deadline requirement.
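To make the notion of a probabilistic deadline guarantee concrete, the following is a minimal sketch, not Dyna's actual scheduling algorithm. It assumes that each candidate instance configuration comes with a set of sampled workflow execution times and an expected monetary cost. The function names, the configuration names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple


def deadline_hit_probability(samples: Sequence[float], deadline: float) -> float:
    """Empirical probability that a workflow finishes within the deadline."""
    if not samples:
        raise ValueError("need at least one execution-time sample")
    return sum(1 for t in samples if t <= deadline) / len(samples)


def meets_guarantee(samples: Sequence[float], deadline: float,
                    required_probability: float) -> bool:
    """True if the sampled execution times satisfy the probabilistic deadline."""
    return deadline_hit_probability(samples, deadline) >= required_probability


def cheapest_feasible(configs: Dict[str, Tuple[Sequence[float], float]],
                      deadline: float,
                      required_probability: float) -> Optional[str]:
    """Return the lowest-cost configuration that satisfies the guarantee.

    configs maps a (hypothetical) configuration name to a pair of
    (execution-time samples, expected monetary cost).
    Returns None when no configuration is feasible.
    """
    feasible = [
        (cost, name)
        for name, (samples, cost) in configs.items()
        if meets_guarantee(samples, deadline, required_probability)
    ]
    return min(feasible)[1] if feasible else None


# Hypothetical data: execution times in minutes, costs in dollars.
configs = {
    "on_demand": ([50, 52, 55, 58, 60], 10.0),
    "spot_with_on_demand_fallback": ([48, 55, 57, 59, 75], 4.0),
    "spot_only": ([45, 60, 80, 90, 120], 2.0),
}

print(cheapest_feasible(configs, deadline=60, required_probability=0.8))
print(cheapest_feasible(configs, deadline=60, required_probability=0.9))
```

With these made-up samples, a requirement of 80% lets the cheaper hybrid configuration qualify, while tightening it to 90% leaves only the on-demand configuration feasible. This mirrors the trade-off between the lower cost of spot instances and the reliability of on-demand instances described above.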