Scaling and Elasticity in Cloud Computing
Overview
Elasticity is the capability to automatically adjust compute resources to match demand.
This includes:
- Scaling up/down: More or fewer resources per instance.
- Scaling out/in: More or fewer instances serving traffic.
Effective elasticity improves:
- Performance
- Cost efficiency
- Resilience
Scaling Types
Vertical Scaling
-
Definition: Adjusting the resources of a single instance (CPU, RAM, instance size/type).
-
Terminology:
- Scale up: Add more resources.
- Scale down: Reduce resources.
-
Notes:
- Typically incurs downtime (instance must be stopped/restarted or replaced).
- Limited by the largest available instance type.
Horizontal Scaling
-
Definition: Adjusting the number of instances serving traffic.
-
Terminology:
- Scale out: Add instances.
- Scale in: Remove instances.
-
Notes:
- Can be performed without downtime when combined with a load balancer and health checks.
- Best suited for stateless or externally state-managed services.
