Cloud rightsizing: how to do it without breaking production
7 min read · June 16, 2026 · TurboFinOps
Rightsizing (matching resource size to actual demand) is often the highest-ROI FinOps lever because over-provisioning is so common. It is also the one teams hesitate on, because a bad downsize can degrade production. Done with real utilization data and a safe process, the risk is low and the savings start as soon as the change lands.
Size from p95, not averages
Averages lie: a VM at 8% average CPU might spike to 70% at peak. Size from p95 (or p99 for latency-sensitive workloads) over a representative window, one long enough to include the workload's regular peaks, so you keep headroom while removing slack.
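To make the gap between average and p95 concrete, here is a minimal sketch that summarizes a list of CPU utilization samples. The function names and the nearest-rank percentile method are assumptions chosen for illustration; your monitoring tool may compute percentiles with a different interpolation.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def utilization_summary(cpu_percent_samples):
    """Average, p95 and p99 of CPU utilization samples (in percent)."""
    return {
        "avg": mean(cpu_percent_samples),
        "p95": percentile(cpu_percent_samples, 95),
        "p99": percentile(cpu_percent_samples, 99),
    }


# Illustrative data: mostly idle, with a peak in 10% of the samples.
samples = [5] * 90 + [70] * 10
summary = utilization_summary(samples)
print(summary)
```

With this made-up data the average is 11.5% while p95 is 70%, so sizing from the average would leave no room for the peak.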
Look at the binding constraint, the resource that runs out first. A workload may be memory-bound rather than CPU-bound; in that case downsizing CPU is safe, but downsizing memory is not.
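The binding-constraint check can be sketched as follows. This is an illustration under stated assumptions: the `Utilization` type, the 80% ceiling and the idea that halving capacity roughly doubles utilization are all simplifications introduced here, not rules from any provider.

```python
from dataclasses import dataclass


@dataclass
class Utilization:
    p95_cpu_pct: float  # p95 CPU utilization, percent of provisioned
    p95_mem_pct: float  # p95 memory utilization, percent of provisioned


def binding_constraint(u):
    """Return the dimension that is closest to running out."""
    return "memory" if u.p95_mem_pct >= u.p95_cpu_pct else "cpu"


def shrinkable_dimensions(u, shrink_factor=0.5, ceiling_pct=80.0):
    """Dimensions whose projected p95 stays at or under ceiling_pct
    after capacity is multiplied by shrink_factor.

    Assumes utilization scales inversely with capacity, which is only
    a rough approximation for real workloads.
    """
    projected = {
        "cpu": u.p95_cpu_pct / shrink_factor,
        "memory": u.p95_mem_pct / shrink_factor,
    }
    return [dim for dim, pct in projected.items() if pct <= ceiling_pct]


workload = Utilization(p95_cpu_pct=20.0, p95_mem_pct=65.0)
print(binding_constraint(workload))     # memory
print(shrinkable_dimensions(workload))  # ['cpu']
```

For this example workload, halving CPU projects to 40% at p95, but halving memory projects to 130%, so only the CPU side is a candidate.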
Pick the right target, not just smaller
Rightsizing is not only "go down a size". Sometimes the win is a different family: burstable instances for spiky workloads, ARM-based instances such as AWS Graviton for price-performance, or a custom machine type (where the provider offers one) to match the exact vCPU/memory ratio.
A good recommendation names a concrete target size with the observed utilization behind it — so the owner can judge the trade-off.
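One way to picture such a recommendation is as a small record that carries both the target and the evidence. The sketch below is hypothetical throughout: the instance names, sizes and prices are invented, the 25% headroom is an assumption, and 730 is used as an approximate number of hours in a month.

```python
from dataclasses import dataclass
from typing import Optional

HOURS_PER_MONTH = 730  # approximation


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    hourly_usd: float


@dataclass(frozen=True)
class Recommendation:
    current: str
    target: str
    p95_vcpus_used: float
    p95_memory_gib_used: float
    est_monthly_saving_usd: float


def recommend(current, catalog, p95_vcpus_used, p95_memory_gib_used,
              headroom=1.25) -> Optional[Recommendation]:
    """Pick the cheapest type that covers p95 demand plus headroom,
    regardless of family. Returns None if nothing cheaper fits."""
    need_cpu = p95_vcpus_used * headroom
    need_mem = p95_memory_gib_used * headroom
    fits = [t for t in catalog
            if t.vcpus >= need_cpu
            and t.memory_gib >= need_mem
            and t.hourly_usd < current.hourly_usd]
    if not fits:
        return None
    target = min(fits, key=lambda t: t.hourly_usd)
    saving = (current.hourly_usd - target.hourly_usd) * HOURS_PER_MONTH
    return Recommendation(current.name, target.name, p95_vcpus_used,
                          p95_memory_gib_used, round(saving, 2))


# Invented catalog and prices, for illustration only.
current = InstanceType("gp.2xlarge", 8, 32, 0.40)
catalog = [
    InstanceType("gp.large", 2, 8, 0.10),
    InstanceType("gp.xlarge", 4, 16, 0.20),
    InstanceType("mem.xlarge", 4, 32, 0.26),
]
rec = recommend(current, catalog, p95_vcpus_used=1.6, p95_memory_gib_used=14)
print(rec)
```

Here the next size down in the same family does not fit, because 14 GiB at p95 plus headroom exceeds its 16 GiB, so the sketch lands on the memory-heavy type instead. The record keeps the observed p95 figures next to the target so the owner can see why.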
Apply changes safely
Stage it: apply the change in non-production first and watch the results, then apply it in production during a low-traffic window with a rollback plan.
Respect change controls: check IaC ownership, freeze windows and policy protections before any change — the same guardrails a good action engine enforces automatically.
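The checks above can be expressed as a pre-flight function that returns every reason a change should not proceed. This is a sketch only: the `Resource` fields, the `FreezeWindow` type and the blocker messages are hypothetical, and a real setup would read them from your IaC state, tags and change calendar.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Resource:
    resource_id: str
    environment: str            # e.g. "production" or "staging"
    managed_by_iac: bool = False
    policy_protected: bool = False


@dataclass
class FreezeWindow:
    start: datetime
    end: datetime

    def covers(self, moment):
        return self.start <= moment < self.end


def preflight_blockers(resource, when, freeze_windows, nonprod_validated):
    """Return a list of reasons to hold the change; empty means go."""
    blockers = []
    if resource.managed_by_iac:
        blockers.append("managed by IaC: change the size in code")
    if resource.policy_protected:
        blockers.append("policy-protected resource")
    if any(w.covers(when) for w in freeze_windows):
        blockers.append("inside a change-freeze window")
    if resource.environment == "production" and not nonprod_validated:
        blockers.append("not yet validated in non-production")
    return blockers


freeze = [FreezeWindow(datetime(2026, 12, 20, tzinfo=timezone.utc),
                       datetime(2027, 1, 2, tzinfo=timezone.utc))]
vm = Resource("vm-1", "production")
print(preflight_blockers(vm, datetime(2026, 12, 24, tzinfo=timezone.utc),
                         freeze, nonprod_validated=True))
```

Returning all blockers rather than stopping at the first one lets the owner clear them in a single pass.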
Prove it and repeat
Verify the saving after the change — confirm the new run-rate at 7/14/30-day checkpoints rather than trusting the estimate. Then move to the next candidate.
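A checkpoint verification could look like the sketch below. It assumes you can export daily cost for the resource; the function name, the 10% tolerance and the example figures are illustrative choices, not a standard.

```python
from statistics import mean

CHECKPOINT_DAYS = (7, 14, 30)


def verify_saving(baseline_daily_usd, daily_costs_after,
                  estimated_daily_saving_usd, tolerance=0.10):
    """Compare the realized run-rate with the estimate at each
    checkpoint that has been reached so far."""
    results = {}
    for days in CHECKPOINT_DAYS:
        if len(daily_costs_after) < days:
            continue  # checkpoint not reached yet
        run_rate = mean(daily_costs_after[:days])
        realized = baseline_daily_usd - run_rate
        results[days] = {
            "run_rate_usd": round(run_rate, 2),
            "realized_daily_saving_usd": round(realized, 2),
            "meets_estimate":
                realized >= estimated_daily_saving_usd * (1 - tolerance),
        }
    return results


# Illustrative figures: 14 days of cost data after the change.
report = verify_saving(baseline_daily_usd=10.0,
                       daily_costs_after=[6.5] * 14,
                       estimated_daily_saving_usd=3.4)
print(report)
```

With 14 days of data, only the 7-day and 14-day checkpoints appear in the report; the 30-day entry shows up once enough days have accumulated.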
Rightsizing is continuous: workloads drift, and new resources tend to be over-provisioned at first. A standing rightsizing queue keeps the estate efficient.
Frequently asked questions
- Why use p95 instead of average utilization?
- Average utilization hides peaks. A resource at low average can still spike high; sizing to p95 keeps enough headroom for those peaks while removing the slack that average-based sizing leaves in place.
- Is rightsizing risky?
- Only if done blind. Sizing from real utilization, respecting the binding constraint (CPU vs memory), staging changes, and verifying the result make it one of the safest, highest-ROI FinOps actions.