When your app experiences high traffic or resource utilization, you need to scale your service to handle the load.

Scaling can be done vertically, horizontally, or both. Vertical scaling means making a single resource bigger or more powerful. For example, you might add more CPU or RAM to your server. Horizontal scaling means running multiple instances of the same service. For example, you might deploy three copies of your server instead of one and place them all behind a load balancer that routes traffic to each of them. Both types of scaling can be done manually or automatically (autoscaling).
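
To make the horizontal case more concrete, here is a minimal sketch of how a load balancer might spread requests across three copies of a server. It assumes a simple round-robin strategy, which is only one of several common routing strategies. The instance addresses and the `RoundRobinBalancer` class are hypothetical names made up for illustration, not part of any particular product.

```python
from itertools import cycle

# Hypothetical addresses for three copies of the same server.
INSTANCES = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]


class RoundRobinBalancer:
    """Hands out instances in a fixed rotation, one per request."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        # cycle() repeats the list forever: 1, 2, 3, 1, 2, 3, ...
        self._rotation = cycle(list(instances))

    def next_instance(self):
        """Return the instance that should handle the next request."""
        return next(self._rotation)


balancer = RoundRobinBalancer(INSTANCES)

# Simulate six incoming requests and record where each one is sent.
assignments = [balancer.next_instance() for _ in range(6)]
print(assignments)
```

With six requests and three instances, each instance ends up handling two requests. A real load balancer would typically also run health checks and stop routing to instances that fail them, which this sketch leaves out.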

