¶ Applications Run and Scale on Demand — Step-by-Step Guide
The user deploys the application on a cloud compute service such as:
- Virtual Machines
- Containers
- Platform-as-a-Service (PaaS)
The cloud system continuously monitors:
- CPU usage
- Memory usage
- Network traffic
- Number of user requests
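As a minimal sketch of what such monitoring data might look like, the snippet below aggregates per-instance samples into fleet-wide figures. The `InstanceMetrics` class, its field names, and the sample values are hypothetical and are not tied to any particular cloud provider's API.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class InstanceMetrics:
    """One monitoring sample for a single instance (hypothetical shape)."""
    cpu_percent: float
    memory_percent: float
    network_mbps: float
    requests_per_second: float


def fleet_summary(samples: list[InstanceMetrics]) -> dict[str, float]:
    """Average the utilization metrics and total the traffic metrics."""
    if not samples:
        raise ValueError("need at least one sample")
    return {
        "avg_cpu_percent": mean(s.cpu_percent for s in samples),
        "avg_memory_percent": mean(s.memory_percent for s in samples),
        "total_network_mbps": sum(s.network_mbps for s in samples),
        "total_requests_per_second": sum(s.requests_per_second for s in samples),
    }


# Illustrative samples from two instances.
samples = [
    InstanceMetrics(80.0, 60.0, 120.0, 450.0),
    InstanceMetrics(60.0, 40.0, 80.0, 350.0),
]
summary = fleet_summary(samples)
```

Utilization figures are averaged because scaling rules usually reason about the fleet as a whole, while traffic figures are summed because they describe total demand.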
The user sets auto-scaling rules based on:
- CPU utilization
- Request count
- Response time
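The rules above could be represented as data, roughly as in the sketch below. The `ScalingRule` class, the metric names, and every threshold value are assumptions chosen for illustration; real values depend on the workload and the provider.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRule:
    """A hypothetical rule: scale out when `metric` exceeds `threshold`."""
    metric: str
    threshold: float


# Illustrative thresholds only.
RULES = [
    ScalingRule("cpu_percent", 70.0),
    ScalingRule("requests_per_second", 1000.0),
    ScalingRule("response_time_ms", 500.0),
]


def breached(rules: list[ScalingRule], observed: dict[str, float]) -> list[str]:
    """Return the metrics whose observed value exceeds the rule's threshold."""
    return [
        rule.metric
        for rule in rules
        if rule.metric in observed and observed[rule.metric] > rule.threshold
    ]


observed = {
    "cpu_percent": 82.0,
    "requests_per_second": 640.0,
    "response_time_ms": 730.0,
}
triggered = breached(RULES, observed)
```

Here CPU utilization and response time exceed their thresholds while the request count does not, so `triggered` names only those two metrics.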
When traffic increases:
- New instances are automatically created
- Load is distributed across multiple servers
When traffic decreases:
- Extra instances are automatically removed
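One simple way to think about scaling in both directions is to compute how many instances the current traffic needs and clamp that number between a minimum and a maximum. The sketch below assumes each instance can serve a fixed number of requests per second; the function name, the capacity figure, and the bounds are all hypothetical.

```python
import math


def desired_instances(
    requests_per_second: float,
    capacity_per_instance: float,
    min_instances: int = 1,
    max_instances: int = 10,
) -> int:
    """Instances needed for the current load, clamped to the allowed range."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Assumed capacity: 200 requests per second per instance.
busy = desired_instances(950, 200)      # traffic increases
quiet = desired_instances(50, 200)      # traffic decreases
capped = desired_instances(5000, 200)   # demand exceeds the upper bound
```

The minimum keeps the application reachable when traffic is low, and the maximum caps spending when traffic spikes.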
A load balancer automatically:
- Routes user requests to available instances
- Stops sending traffic to instances that fail health checks, which keeps the application available when a server goes down
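The routing behavior can be illustrated with a small round-robin sketch. Round-robin is only one of several possible strategies and is assumed here for simplicity; the `RoundRobinBalancer` class and the instance names are hypothetical.

```python
from __future__ import annotations


class RoundRobinBalancer:
    """Hands out instances in turn, skipping any that are not healthy."""

    def __init__(self, instances: list[str]) -> None:
        self._instances = list(instances)
        self._next = 0

    def route(self, healthy: set[str]) -> str:
        """Return the next healthy instance, or raise if none is available."""
        for _ in range(len(self._instances)):
            candidate = self._instances[self._next]
            self._next = (self._next + 1) % len(self._instances)
            if candidate in healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
# Assume server-b is currently failing its health check.
healthy = {"server-a", "server-c"}
routed = [balancer.route(healthy) for _ in range(3)]
```

Requests alternate between the two healthy servers, and the unhealthy one receives no traffic until it is marked healthy again.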
The user is billed only for the resources actually used, so cost follows demand.
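A short worked calculation shows why this matters. The hourly price and the instance counts below are made-up numbers, used only to compare paying for actual usage with keeping peak capacity running all the time.

```python
from __future__ import annotations

# Hypothetical price: 10 cents per instance-hour.
RATE_CENTS_PER_INSTANCE_HOUR = 10

# Hypothetical number of running instances in each of six hours.
hourly_instance_counts = [2, 2, 6, 6, 2, 2]


def on_demand_cost_cents(counts: list[int], rate: int) -> int:
    """Pay only for the instance-hours that actually ran."""
    return sum(counts) * rate


def fixed_peak_cost_cents(counts: list[int], rate: int) -> int:
    """Pay for peak capacity during every hour, whether it is needed or not."""
    return max(counts) * len(counts) * rate


on_demand = on_demand_cost_cents(hourly_instance_counts, RATE_CENTS_PER_INSTANCE_HOUR)
fixed = fixed_peak_cost_cents(hourly_instance_counts, RATE_CENTS_PER_INSTANCE_HOUR)
```

With these numbers, scaling on demand uses 20 instance-hours (200 cents), while provisioning for the peak uses 36 instance-hours (360 cents).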
Example: an online shopping website hosted on AWS experiences high traffic during sale events.
Setup:
- The website runs on EC2 instances
- Traffic is managed by an Elastic Load Balancer (ELB)
Auto Scaling rules:
- Add new server if CPU > 70%
- Remove server if CPU < 30%
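These two thresholds can be written as a small decision function. This is a sketch of the rule as stated, not AWS's actual implementation; the function name and the returned action labels are hypothetical.

```python
SCALE_OUT_ABOVE = 70.0  # add a server above this CPU percentage
SCALE_IN_BELOW = 30.0   # remove a server below this CPU percentage


def scaling_action(cpu_percent: float) -> str:
    """Map a CPU reading to 'add', 'remove', or 'hold'."""
    if cpu_percent > SCALE_OUT_ABOVE:
        return "add"
    if cpu_percent < SCALE_IN_BELOW:
        return "remove"
    return "hold"


actions = [scaling_action(cpu) for cpu in (85.0, 50.0, 20.0)]
```

The gap between 30% and 70% means that moderate fluctuations in CPU usage trigger no action, which avoids repeatedly adding and removing servers.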
During a sale event:
- Thousands of users access the website
- CPU usage increases
AWS automatically:
- Launches new EC2 instances
- Registers them with the load balancer
ELB then distributes traffic across all running servers.
When traffic drops:
- Extra servers are automatically terminated
- Costs are reduced
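The sale-event walkthrough above can be simulated in a few lines. The sketch below assumes total load is expressed as a percentage of one server's CPU, that load is spread evenly across servers, and that at most one server is added or removed per monitoring step; the load trace and the server limits are invented for illustration.

```python
from __future__ import annotations


def simulate(
    load_trace: list[float],
    servers: int = 2,
    min_servers: int = 1,
    max_servers: int = 10,
) -> list[int]:
    """Apply the 70% / 30% CPU rule at each step; return the server count per step."""
    history = []
    for total_load in load_trace:
        avg_cpu = total_load / servers  # load is assumed to be evenly balanced
        if avg_cpu > 70 and servers < max_servers:
            servers += 1  # scale out
        elif avg_cpu < 30 and servers > min_servers:
            servers -= 1  # scale in
        history.append(servers)
    return history


# Quiet period, sale spike, then traffic drops again.
trace = [80, 180, 300, 300, 120, 40]
history = simulate(trace)
```

The server count climbs from 2 to 5 during the spike and falls back as the load subsides, mirroring the launch and termination steps described in the example.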
Result: the application stays responsive during high traffic and scales back down when demand decreases.
In summary, on-demand scaling means:
- The application is deployed on the cloud
- Performance is continuously monitored
- Resources scale automatically
- Load is balanced across instances
- Cost follows actual usage
Together, these steps support high performance, availability, and cost efficiency.