Measuring forecast skill: is it real skill or is it the varying climatology?

Journal

QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY
Volume 132, Issue 621, Pages 2905-2923

Publisher

JOHN WILEY & SONS LTD
DOI: 10.1256/qj.06.25

Keywords

brier skill score; contingency tables; ensemble forecasting; equitable threat score; forecast verification; probabilistic weather forecasts; relative operating characteristic

It is common practice to summarize the skill of weather forecasts from an accumulation of samples spanning many locations and dates. In calculating many of these scores, there is an implicit assumption that the climatological frequency of event occurrence is approximately invariant over all samples. If the event frequency actually varies among the samples, the metrics may report a skill that is different from that expected. Many common deterministic verification metrics, such as threat scores, are prone to mis-reporting skill, and probabilistic forecast metrics such as the Brier skill score and relative operating characteristic skill score can also be affected. Three examples are provided that demonstrate unexpected skill, two from synthetic data and one with actual forecast data. In the first example, positive skill was reported in a situation where metrics were calculated from a composite of forecasts that were composed of random draws from the climatology of two distinct locations. As the difference in climatological event frequency between the two locations was increased, the reported skill also increased. A second example demonstrates that when the climatological event frequency varies among samples, the metrics may excessively weight samples with the greatest observational uncertainty. A final example demonstrates unexpectedly large skill in the equitable threat score of deterministic precipitation forecasts. Guidelines are suggested for how to adjust skill computations to minimize these effects.
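
The effect described in the first example can be illustrated with a minimal sketch. It is not the paper's experimental setup: the function names, the equal weighting of the two locations, and the choice of the equitable threat score (in its standard contingency-table form) are assumptions made for illustration. Each forecast is treated as a random draw from the local climatology, independent of the observation, so the expected contingency table at a location with event frequency p can be written down directly.

```python
def expected_table(p):
    """Expected 2x2 contingency fractions at one location (hypothetical helper).

    The forecast is a random draw from climatology with event probability p,
    independent of the observation, which has the same event probability.
    """
    hits = p * p
    false_alarms = p * (1 - p)      # forecast yes, observed no
    misses = (1 - p) * p            # forecast no, observed yes
    correct_negatives = (1 - p) * (1 - p)
    return hits, false_alarms, misses, correct_negatives


def ets(hits, false_alarms, misses, correct_negatives):
    """Equitable threat score from contingency-table counts or fractions."""
    n = hits + false_alarms + misses + correct_negatives
    # Hits expected by chance, given the marginal forecast and observed totals.
    hits_random = (hits + false_alarms) * (hits + misses) / n
    return (hits - hits_random) / (hits + false_alarms + misses - hits_random)


def pooled_ets(p1, p2):
    """ETS of the table pooled over two equally weighted locations (assumption)."""
    t1 = expected_table(p1)
    t2 = expected_table(p2)
    pooled = [(x + y) / 2 for x, y in zip(t1, t2)]
    return ets(*pooled)


if __name__ == "__main__":
    # Each location scored alone has zero expected skill, but the pooled
    # score grows as the two climatological frequencies move apart.
    for p1, p2 in [(0.3, 0.3), (0.2, 0.4), (0.1, 0.5)]:
        print(p1, p2, ets(*expected_table(p1)), pooled_ets(p1, p2))
```

Under these assumptions the pooled numerator reduces to (p1 - p2)^2 / 4, so the reported skill is zero only when the two climatologies coincide and increases with their difference, which is consistent with the behaviour the abstract describes.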