“How do we control our cloud costs and reduce the bill?” is a very frequent meeting agenda item for teams hosting applications in the cloud. A typical cloud cost management plan involves:
- Monitor the spend
- Identify the waste in spend
- Act on the waste
Gartner estimates that organizations lacking any coherent plan for cloud cost management may be overspending by as much as 70 percent or more.
There are different ways to monitor the spend, such as AWS Cost Explorer or Azure Billing. Once you know which resources you are paying for, identifying waste is one of the most time-consuming activities: it involves talking to users, identifying priorities, and so on.
Once waste in spend is identified, different techniques apply depending on the type of waste. For example, if the resources are “ghost” resources, you can delete them; if they are sitting idle, you can stop them. In this post, we are not going to go over the entire cost management plan. Instead, we would like to discuss one action in depth: “stop idle virtual machines or resources”.
When it comes to “turn off (or stop) idle cloud resources”, one blanket recommendation you come across all over the internet is:
“Not all instances need to be used 24/7. Scheduling non-essential instances to shut down overnight or on weekends is more cost effective than keeping them running constantly.”
This recommendation is usually followed by suggestions such as using AWS Instance Scheduler or Azure Automation runbooks to define schedules that shut down and/or start the resources. What is the problem with this? When so many cloud experts all over the internet recommend it, isn’t it a good solution?
This is indeed a good recommendation, and we don’t deny it. Schedulers have their place in use cases like “run a task every day at hour X”. But use cases like these account for only 10% of the total possible cost-savings use cases. For the remaining 90%, schedulers are NOT great cost solutions. So, what would be the solution? It should be more dynamic, not time bound, and based on application usage patterns.
Though this is not a comprehensive list of use cases, let us walk through a few examples to understand where “schedulers” fit and which use cases fit “application usage pattern based cost optimization”.
Email-sending application, every day from 1 AM to 2 AM
Assume your application has a requirement to send emails to users between 1 AM and 2 AM every day. The key thing to observe in this use case is the window itself: 1 AM to 2 AM. This requirement is “time bound” rather than “application usage bound”.
This is a classic use case for “schedulers”, because the uptime requirement is time bound.
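To make “time bound” concrete, here is a minimal sketch of the decision a scheduler makes: the resource should be up if and only if the clock is inside the window. The function name `should_be_running` and the window constants are hypothetical, chosen only for illustration; real schedulers such as AWS Instance Scheduler have their own configuration formats, which are not shown here.

```python
from datetime import time

# Hypothetical schedule window for the email job (1 AM to 2 AM, local time).
WINDOW_START = time(1, 0)
WINDOW_END = time(2, 0)


def should_be_running(now: time, start: time = WINDOW_START, end: time = WINDOW_END) -> bool:
    """Return True if `now` falls inside the [start, end) schedule window."""
    if start <= end:
        return start <= now < end
    # The window wraps past midnight, e.g. 23:00 to 01:00.
    return now >= start or now < end
```

Notice that the decision depends only on the time of day. Nothing about how the application is actually being used enters into it, which is exactly why it fits this use case.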
QA team testing application
This use case is not “time bound”. Here, schedulers are NOT optimal for cost savings, because every second you leave a cloud resource in an idle state, you are spending money unnecessarily. With cloud resources, you get billed for most resources for every second they are in the running state. We should reduce this idle time as much as possible.
A few details make this case “non time bound”. For example, users may not use the cloud resources for the whole time the resources are up, or they might not use the resources on a given day at all. All of these possibilities create waste in spend.
Another dimension of these “non time bound” use cases is the availability of cloud resources outside the schedule window. Users can’t access the AWS instances, Azure VMs, or any other cloud resources outside the scheduled windows. Balancing availability against cost optimization becomes a challenge when the use is not time bound, whether the use case involves human users or automated test cases.
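As a rough illustration of how this waste adds up under per-second billing, here is a small sketch. All numbers are hypothetical (a $0.10 per hour rate, a 12-hour schedule window, 3 hours of real use) and are not taken from any provider's price list; `idle_waste` is an illustrative helper, not part of any cloud SDK.

```python
def idle_waste(hourly_rate: float, running_seconds: int, used_seconds: int) -> float:
    """Cost of the time a resource was running but not in use, billed per second."""
    if used_seconds > running_seconds:
        raise ValueError("used_seconds cannot exceed running_seconds")
    idle_seconds = running_seconds - used_seconds
    return idle_seconds * hourly_rate / 3600


# Hypothetical day: the schedule keeps the VM up for 12 hours,
# but the QA team only uses it for 3 of them.
waste = idle_waste(hourly_rate=0.10, running_seconds=12 * 3600, used_seconds=3 * 3600)
print(f"Idle spend for the day: ${waste:.2f}")
```

With these assumed numbers, three quarters of the day's spend on this one VM goes to idle time, even though the schedule was followed perfectly.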
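By contrast, a usage-based policy decides from activity rather than from the clock. The sketch below is only one possible shape for such a rule, under stated assumptions: the 30-minute threshold is hypothetical, and how “last activity” is measured (logins, requests, test runs) as well as the actual stop call to the cloud provider are left out on purpose.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: stop a resource after 30 minutes without activity.
IDLE_THRESHOLD = timedelta(minutes=30)


def should_stop(last_activity: datetime, now: datetime,
                threshold: timedelta = IDLE_THRESHOLD) -> bool:
    """Return True once the resource has been idle for at least `threshold`."""
    return now - last_activity >= threshold
```

For example, with a last activity at 10:00, this rule says “keep running” at 10:10 and “stop” at 10:45, regardless of what any schedule window says.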