MAPE — Mean Absolute Percentage Error
The headline accuracy number. Averaged over holdout days with non-zero actuals: mean of |forecast − actual| / actual. Lower is better; a MAPE of 8% means forecasts were, on average, within 8% of realized spend.
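The definition above can be sketched in a few lines of Python. This is a minimal illustration, not TurboFinOps code: the function name `mape` and the sample numbers are made up, and dividing by the absolute value of the actual is an assumption added so that a negative actual (for example a credit) cannot flip the sign.

```python
from typing import Sequence


def mape(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of |forecast - actual| / |actual| over days with non-zero actuals."""
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    # Days with a zero actual are skipped because the ratio is undefined there.
    ratios = [abs(f - a) / abs(a) for f, a in zip(forecast, actual) if a != 0]
    if not ratios:
        raise ValueError("no days with non-zero actuals to score")
    return sum(ratios) / len(ratios)


# Illustrative holdout window; the third day has zero actual spend and is ignored.
actual = [100.0, 200.0, 0.0, 400.0]
forecast = [110.0, 180.0, 5.0, 400.0]
print(f"MAPE: {mape(forecast, actual):.1%}")
```

The three scored days have errors of 10%, 10% and 0%, so the result is their mean.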
Methodology
Most cloud tools show a forecast line and never tell you how often it was right. TurboFinOps backtests every forecast against your real billing history and publishes four metrics — MAPE, bias, 95% interval coverage and skill versus a persistence baseline — so you know exactly how much to trust the number.
Bias
Mean signed error (forecast − actual), expressed as a percentage. Positive bias means the model systematically over-forecasts; negative bias means it under-forecasts. A model can have a low MAPE and still carry meaningful bias, which matters for budget planning.
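A short sketch shows how two forecasts with the same MAPE can differ in bias. The function `bias_pct` is a hypothetical name, and the choice to express each day's signed error as a fraction of that day's actual is an assumption, since the text does not state the denominator.

```python
from typing import Sequence


def bias_pct(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean signed error (forecast - actual) as a fraction of actual spend.

    Assumption: each day's error is divided by that day's actual, and
    zero-actual days are skipped. Other normalizations are possible.
    """
    if len(forecast) != len(actual):
        raise ValueError("forecast and actual must have the same length")
    errors = [(f - a) / abs(a) for f, a in zip(forecast, actual) if a != 0]
    if not errors:
        raise ValueError("no days with non-zero actuals to score")
    return sum(errors) / len(errors)


actual = [100.0, 100.0, 100.0, 100.0]
always_high = [105.0, 105.0, 105.0, 105.0]  # misses by 5% every day, same direction
alternating = [105.0, 95.0, 105.0, 95.0]    # misses by 5% every day, both directions

print(f"always_high bias: {bias_pct(always_high, actual):+.1%}")
print(f"alternating bias: {bias_pct(alternating, actual):+.1%}")
```

Both forecasts are off by 5% in absolute terms each day, but only the first one would push a budget consistently in one direction.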
95% interval coverage
The share of holdout days whose actual spend fell inside the forecast’s 95% prediction interval. Well-calibrated intervals cover close to 95% of outcomes; coverage materially below that means the bands are too tight to trust.
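Coverage is a simple count, sketched below. The function name `interval_coverage`, the inclusive bounds and the sample values are assumptions for illustration only.

```python
from typing import Sequence


def interval_coverage(
    actual: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> float:
    """Share of days whose actual falls inside [lower, upper] (bounds inclusive)."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("actual, lower and upper must have the same length")
    if not actual:
        raise ValueError("no days to score")
    inside = [lo <= a <= hi for a, lo, hi in zip(actual, lower, upper)]
    return sum(inside) / len(inside)


# Illustrative holdout: the last two actuals fall outside their bands.
actual = [100.0, 120.0, 90.0, 150.0]
lower = [90.0, 100.0, 95.0, 110.0]
upper = [110.0, 130.0, 105.0, 140.0]
print(f"coverage: {interval_coverage(actual, lower, upper):.0%}")
```

Here only two of four days are covered, far below the nominal 95%, which is the "bands too tight" case the text describes.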
Skill versus persistence
We compare model MAPE against a naive persistence baseline (tomorrow = today). Skill is the relative improvement in MAPE over that baseline. A forecast that cannot beat persistence is not adding value, and we say so rather than hide it.
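One common way to express relative improvement is skill = 1 − MAPE(model) / MAPE(persistence), so that 0 means no better than the baseline and negative values mean worse. The sketch below assumes that formula and a one-day-ahead persistence baseline. The exact definition TurboFinOps uses is not stated in the text, and all names and numbers here are hypothetical.

```python
from typing import List, Sequence


def mape(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute percentage error over days with non-zero actuals."""
    ratios = [abs(f - a) / abs(a) for f, a in zip(forecast, actual) if a != 0]
    if not ratios:
        raise ValueError("no days with non-zero actuals to score")
    return sum(ratios) / len(ratios)


def persistence_forecast(last_observed: float, actual: Sequence[float]) -> List[float]:
    """Tomorrow = today: each day's forecast is the previous day's actual."""
    return [last_observed] + list(actual[:-1])


def skill(model_mape: float, baseline_mape: float) -> float:
    """Relative improvement over the baseline (assumed form: 1 - model / baseline)."""
    if baseline_mape == 0:
        raise ValueError("baseline MAPE is zero; skill is undefined")
    return 1.0 - model_mape / baseline_mape


last_training_day = 100.0
actual = [110.0, 120.0, 130.0, 140.0]        # steadily growing spend
model = [108.0, 121.0, 128.0, 141.0]         # hypothetical model output

baseline = persistence_forecast(last_training_day, actual)
score = skill(mape(model, actual), mape(baseline, actual))
print(f"skill vs persistence: {score:.2f}")
```

On a steadily growing series, persistence lags by one day, so a model that tracks the trend earns positive skill.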
The same fitted model that serves your live forecast is the one scored — there is no separate “demo” model tuned to look good in a backtest.
Accuracy is measured by holdout backtesting. TurboFinOps holds out the most recent N days of billing data, refits the forecast model on everything before the holdout, then scores the model’s predictions against what actually happened over the held-out window. The backtest runs the exact same model that serves live forecasts.
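The holdout procedure described above can be sketched as follows. This is an illustration under stated assumptions: `trailing_mean_forecaster` is a hypothetical stand-in for whatever model is actually in use, and the function names and sample data are invented for the example.

```python
from typing import Callable, List, Sequence

# A forecaster takes training history and a horizon, and returns one value per day.
Forecaster = Callable[[Sequence[float], int], List[float]]


def trailing_mean_forecaster(history: Sequence[float], horizon: int) -> List[float]:
    """Hypothetical stand-in model: repeat the mean of the last 7 training days."""
    window = history[-7:]
    level = sum(window) / len(window)
    return [level] * horizon


def holdout_backtest(
    daily_spend: Sequence[float], holdout_days: int, forecaster: Forecaster
) -> float:
    """Hold out the most recent days, fit on the rest, return MAPE on the holdout."""
    if not 0 < holdout_days < len(daily_spend):
        raise ValueError("holdout_days must leave at least one training day")
    train = daily_spend[:-holdout_days]      # everything before the holdout
    held_out = daily_spend[-holdout_days:]   # the most recent N days
    predictions = forecaster(train, holdout_days)
    ratios = [abs(p - a) / abs(a) for p, a in zip(predictions, held_out) if a != 0]
    if not ratios:
        raise ValueError("no held-out days with non-zero actuals")
    return sum(ratios) / len(ratios)


# Two flat weeks at 100, then a step up to 110 during the held-out week.
spend = [100.0] * 14 + [110.0] * 7
result = holdout_backtest(spend, holdout_days=7, forecaster=trailing_mean_forecaster)
print(f"holdout MAPE: {result:.1%}")
```

Because the forecaster only ever sees data from before the holdout window, the score reflects how the model would have performed on days it had not yet observed.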
What counts as a good MAPE depends on spend volatility. For stable workloads a MAPE under ~10% is strong, and 10–20% is typical for mixed estimated and metered spend. TurboFinOps always reports the measured value rather than a marketing figure.
Persistence (assuming spend stays flat) is a hard-to-beat baseline for short horizons. Reporting skill versus persistence prevents over-claiming: a model is only credited when it genuinely improves on the naive forecast.
The measurement does not change with the model. Whether the underlying model is linear regression, ARIMA or a seasonality-aware fit, accuracy is always measured the same way (holdout backtest, MAPE, bias and interval coverage), so numbers stay comparable across models.
The Forecasts area of the dashboard surfaces these metrics for your organization’s spend, computed on your real billing history through the GET /forecasts/accuracy endpoint.