Cloud providers give us more and more ways to precisely control the resources behind our applications' infrastructure, but we still need to define (and closely monitor and control) how much CPU, RAM, and I/O our system needs, so that we pay for just enough and no more. And what about resilience? How do we avoid overpaying for stability? How do we control the budget without breaching the SLA? What defines scalability bottlenecks, and how do we find them in advance?

Before we dive deeper into the problem of application scaling, we should understand where this problem comes from. Otherwise, we won’t reach the right conclusions!
Reasons to autoscale are economic
The load profile of any Internet-facing application is almost certainly not constant over any significant period (an hour, a day, a week), and it also shifts from time to time (after updates or changes to the underlying infrastructure). With this in mind, everyone wants to save money whenever the cloud “hardware” is doing less work than it does at the peak of the incoming-requests chart.
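
To make the economic argument concrete, here is a minimal sketch that compares a fleet sized for the peak hour with a fleet resized every hour. The load profile, the per-instance capacity, and the function name `instance_hours` are all made up for illustration and do not come from any real system or provider API.

```python
import math


def instance_hours(hourly_rps, capacity_per_instance, autoscale):
    """Instance-hours needed to serve a load profile (one RPS sample per hour)."""
    if autoscale:
        # Resize the fleet every hour to match that hour's load.
        return sum(math.ceil(rps / capacity_per_instance) for rps in hourly_rps)
    # Fixed fleet: provision for the busiest hour and keep it all day.
    peak_instances = math.ceil(max(hourly_rps) / capacity_per_instance)
    return peak_instances * len(hourly_rps)


# Hypothetical 24-hour profile: quiet night, working day, evening peak, wind-down.
hourly_rps = [100] * 8 + [400] * 10 + [900] * 4 + [300] * 2
CAPACITY = 100  # assumed requests per second that one instance can handle

fixed = instance_hours(hourly_rps, CAPACITY, autoscale=False)
scaled = instance_hours(hourly_rps, CAPACITY, autoscale=True)
savings = 1 - scaled / fixed

print(f"fixed fleet:  {fixed} instance-hours")
print(f"autoscaled:   {scaled} instance-hours")
print(f"savings:      {savings:.0%}")
```

With these invented numbers the fixed fleet uses 216 instance-hours and the autoscaled one uses 90. A real comparison would also have to account for scale-up delay, headroom for spikes, and the provider's billing granularity, all of which this sketch ignores.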