Methodology

How Skyversus scores and ranks providers

Skyversus compares forecast performance against observed outcomes. The goal is to show which providers have actually been strongest for a given ranking window, with enough context to judge how stable that result is.

Official: settled data.
Preliminary: may still shift.
Thin sample: limited data.

What we rank

We compare next-day temperature forecasts (D+1) against observed conditions. Each provider is scored by mean absolute error (MAE) — the average distance between what was forecast and what actually happened.
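The MAE calculation described above can be sketched in a few lines. This is an illustrative implementation, not Skyversus's internal code; the function name and sample values are made up for the example.

```python
def mean_absolute_error(forecasts, observations):
    """Average absolute gap between forecast and observed values."""
    pairs = list(zip(forecasts, observations))
    return sum(abs(f - o) for f, o in pairs) / len(pairs)

# Example: a provider forecast 21, 19, 25 deg C on three days;
# observed temperatures were 20, 22, 24 deg C.
mae = mean_absolute_error([21.0, 19.0, 25.0], [20.0, 22.0, 24.0])
# errors are 1, 3, 1 -> MAE = 5 / 3, roughly 1.67 deg C
```

A provider with an MAE of 1.67 deg C was, on average, about 1.67 degrees off on its next-day forecasts over that window.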

Why sample size matters

A ranking built on five days of data can swing wildly with one unusual day. We show sample counts so you can judge how stable the comparison is. Larger samples produce more reliable rankings.
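The swing effect is easy to see with made-up numbers. The daily error values below are invented for illustration; the point is only how much one bad day moves a 5-day average compared with a 28-day one.

```python
# One badly missed day among four good ones (absolute errors, deg C).
errors_small = [1.0, 0.5, 1.0, 0.5, 8.0]
mae_small = sum(errors_small) / len(errors_small)   # 11.0 / 5 = 2.2

# The same bad day diluted across a 28-day window of otherwise
# similar performance.
errors_large = [0.75] * 27 + [8.0]
mae_large = sum(errors_large) / len(errors_large)   # 28.25 / 28, about 1.01
```

One outlier more than doubles the small-sample score while barely moving the large-sample one, which is why the sample count matters when comparing providers.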

What "official" means

An official ranking is based on at least 28 days of settled observation data. The underlying numbers have passed through observation refreshes and are unlikely to change.

What "preliminary" means

A preliminary ranking uses recent observation data that may still be updated by upstream sources. The scores are real but could shift slightly as observations settle.

What "thin sample" means

A thin-sample ranking is based on too little settled data to treat as stable. It can still be informative, but it should be read as an early signal rather than a mature comparison.
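The three statuses above can be sketched as a simple classification over the number of settled observation days in the window. Only the 28-day floor for "official" comes from the text; the thin-sample cutoff below is an assumed placeholder, not a Skyversus rule.

```python
OFFICIAL_MIN_DAYS = 28     # from the methodology: at least 28 settled days
THIN_SAMPLE_MAX_DAYS = 7   # assumed cutoff for illustration only

def ranking_status(settled_days: int) -> str:
    """Label a ranking window by how much settled data backs it."""
    if settled_days >= OFFICIAL_MIN_DAYS:
        return "official"
    if settled_days <= THIN_SAMPLE_MAX_DAYS:
        return "thin sample"
    return "preliminary"
```

For example, a window with 30 settled days would read as official, one with 15 as preliminary, and one with 4 as a thin sample.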

How to read the score

Scores are shown as mean absolute error (MAE). Lower values are better because they mean the forecast stayed closer to what actually happened.
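Ranking then reduces to sorting providers by MAE in ascending order. The provider names and scores below are invented for the example.

```python
# Illustrative per-provider MAE scores in deg C (not real results).
providers = {
    "Provider A": 1.4,
    "Provider B": 1.1,
    "Provider C": 2.0,
}

# Lower MAE is better, so an ascending sort puts the best provider first.
ranking = sorted(providers.items(), key=lambda item: item[1])
# ranking[0] is the top-ranked provider for this window
```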