Best Rainfall Prediction Model Based on Skill Score

This article describes techniques for determining the best rainfall prediction model using the skill score method.

Eggy Pandiangan
2024-10-10

1. INTRODUCTION

Definition and Goals of Climate Seasonal Forecast Assessment

A Climate Seasonal Forecast Assessment refers to the systematic evaluation of seasonal climate forecasts, which typically predict weather patterns over several months. This process involves analyzing the accuracy, skill, and reliability of these forecasts to determine how closely they match the actual outcomes (observations). The primary goal of this assessment is to verify whether the forecasted seasonal conditions, such as temperature, precipitation, and wind patterns, align with the observed climate data for the given period.

The verification aspect plays a crucial role in this assessment, as it measures the forecast’s skill and consistency. Verification involves comparing the predicted values with real-world data, often using statistical techniques to determine the forecast’s accuracy. The results of the verification help meteorologists refine forecasting models, identify areas for improvement, and better understand the underlying climatic variables that influence seasonal patterns. Validation is the confirmation, through testing and the provision of objective evidence, that a model meets the requirements for a specific intended use. Verification, on the other hand, is the process of comparing model calculations (forecasts) with actual values (observations), and is generally equated with validation (Jolliffe and Stephenson 2011).

The type of verification depends on the type of forecast; the table below gives some examples of forecast types and the verification methods suited to them (WWRP 2017):

| Forecast type | Example(s) | Verification methods |
|---|---|---|
| Nature of forecast: | | |
| deterministic (non-probabilistic) | quantitative precipitation forecast | visual, dichotomous, multi-category, continuous, spatial |
| probabilistic | probability of precipitation, ensemble forecast | visual, probabilistic, ensemble |
| qualitative (worded) | 5-day outlook | visual, dichotomous, multi-category |
| Space-time domain: | | |
| time series | daily maximum temperature forecasts for a city | visual, dichotomous, multi-category, continuous, probabilistic |
| spatial distribution | map of geopotential height, rainfall chart | visual, dichotomous, multi-category, continuous, probabilistic, spatial, ensemble |
| pooled space and time | monthly average global temperature anomaly | dichotomous, multi-category, continuous, probabilistic, ensemble |
| Specificity of forecast: | | |
| dichotomous (yes/no) | occurrence of fog | visual, dichotomous, probabilistic, spatial, ensemble |
| multi-category | cold, normal, or warm conditions | visual, multi-category, probabilistic, spatial, ensemble |
| continuous | maximum temperature | visual, continuous, probabilistic, spatial, ensemble |
| object- or event-oriented | tropical cyclone motion and intensity | visual, dichotomous, multi-category, continuous, probabilistic, spatial |

Allan Murphy (Murphy 1993), a pioneer in the field of forecast verification, distinguishes three types of “goodness” of a prediction:

- Consistency: the correspondence between the forecaster’s judgment and the forecast actually issued;
- Quality: the correspondence between the forecast and the matching observation;
- Value: the incremental benefit (economic or otherwise) realized by users who act on the forecast.

In addition, Murphy also mentioned that there are 8 aspects (attributes) that show the quality of a forecast, including:

| No. | Aspect/Attribute | Description | Scores/metrics that can be used |
|---|---|---|---|
| 1 | Bias | Difference between the predicted and observed means | Mean Error (ME), Mean Square Error (MSE), scatter plot |
| 2 | Accuracy | The degree of agreement between prediction and observation | Continuous Ranked Probability Score (CRPS), Taylor Diagram |
| 3 | Uncertainty | The variability of the observed values; the greater the uncertainty of the observations, the more difficult they are to predict | Verification Rank Histogram (VRH) |
| 4 | Sharpness | The ability of the forecast to predict extreme values. Sharpness is a property of the forecast only (not of the observations); even poor forecasts still have the attribute of sharpness | Ensemble Spread (SPRD), Spread-Skill Relationship (SSR) |
| 5 | Resolution | The ability of a forecast to differ from the climatological (probabilistic) mean of an event, and whether the prediction system predicts those differences correctly; typically measured using the mean square of the probabilistic prediction error | Brier Score (BS), Relative Operating Characteristic (ROC) |
| 6 | Discrimination | The forecast’s ability to clearly distinguish situations that lead to the occurrence of an event from those that do not | Relative Operating Characteristic (ROC) |
| 7 | Reliability | Statistical consistency between the probabilistic prediction of an event and the actual frequency of occurrence | Reliability Diagram (RD), Brier Score (BS) |
| 8 | Skill | The accuracy of a prediction model relative to a reference (e.g., climatology), i.e., the improvement in accuracy attributable to the prediction system; skill measures the superiority of a prediction system over a baseline built from past observations | Skill Scores (SS): BSS, CRPSS, ROCSS |

Examples of Evaluations for Climate Seasonal Forecasts

Some meteorological agencies in other countries have evaluated their predictions using several verification metrics, for example:


2. HOW IS IT EVALUATED?

Approaches for Assessing Climate Seasonal Forecast

There are several approaches that are commonly used to assess seasonal climate forecasts (Wilks 2011; Jolliffe and Stephenson 2011; WWRP 2017), such as:

| No. | Metric | Formulation | Description |
|---|---|---|---|
| 1 | Mean Error (ME) | \(\text{ME}=\frac{\sum_{i=1}^{n}\left(f_i-o_i\right)}{n}\) | \(n\) = number of data pairs (forecast & observation); \(f_i\) = forecast value; \(o_i\) = observation value; best score = 0; verification method = continuous |
| 2 | Mean Absolute Error (MAE) | \(\text{MAE}=\frac{\sum_{i=1}^{n}\left\lvert f_i-o_i\right\rvert}{n}\) | \(n\) = number of data pairs (forecast & observation); \(f_i\) = forecast value; \(o_i\) = observation value; best score = 0; verification method = continuous |
| 3 | Root Mean Square Error (RMSE) | \(\text{RMSE}=\sqrt{\frac{\sum_{i=1}^{n} (f_i - o_i)^2}{n}}\) | \(n\) = number of data pairs (forecast & observation); \(f_i\) = forecast value; \(o_i\) = observation value; best score = 0; verification method = continuous |
| 4 | Boxplot | (boxplot figure) | \(Q_1=(n+1)\times 0.25\); \(Q_2=(n+1)\times 0.5\); \(Q_3=(n+1)\times 0.75\); \(IQR=Q_3-Q_1\); best score = distribution closest to the observations; verification method = continuous, visual |
| 5 | Scatter plot | (scatter plot figure) | best score = points gathered around the diagonal; verification method = continuous, visual |
| 6 | Correlation Coefficient \((r)\) | \(r=\frac{\sum{(f_i-\overline{f})(o_i-\overline{o})}}{\sqrt{\sum{(f_i-\overline{f})^2}}\sqrt{\sum{(o_i-\overline{o})^2}}}\) | \(f_i\) = \(i\)-th forecast value; \(\overline{f}\) = mean of the forecast values; \(o_i\) = \(i\)-th observation value; \(\overline{o}\) = mean of the observation values; best score = 1; verification method = continuous |
| 7 | Dichotomous Contingency Table | (2×2 contingency table figure) | \(\text{HR} = \frac{a}{a + c} = \hat{p}(\hat{x} = 1 \mid x = 1)\); \(\text{PC or Accuracy} = \frac{a + d}{n} = \hat{p}\left[(\hat{x} = 1, x = 1) \ \text{or} \ (\hat{x} = 0, x = 0)\right]\); \(\text{FAR} = \frac{b}{a + b} = \hat{p}(x = 0 \mid \hat{x} = 1)\); \(\text{CSI} = \frac{a}{a + b + c}\); ROC |
| 8 | Multi-Category Contingency Table | (\(K \times K\) contingency table figure) | \(\text{Accuracy} = \frac{1}{N} \sum_{i=1}^{K}n(F_i, O_i)\); \(\text{HSS} = \frac{\frac{1}{N} \sum_{i=1}^{K} n(F_i, O_i)-\frac{1}{N^2} \sum_{i=1}^{K} N(F_i) N(O_i)}{1 - \frac{1}{N^2} \sum_{i=1}^{K} N(F_i) N(O_i)}\); best score (Accuracy) = 1; best score (HSS) = 1 |
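As a quick illustration of how the continuous and dichotomous scores in the table can be computed, here is a minimal base-R sketch (not part of the original workflow; the function names and example numbers are hypothetical):

# Minimal sketch: continuous verification metrics for paired forecast (fcst)
# and observation (obs) vectors
verify_continuous <- function(fcst, obs) {
  stopifnot(length(fcst) == length(obs))
  c(ME   = mean(fcst - obs),            # bias; best score 0
    MAE  = mean(abs(fcst - obs)),       # best score 0
    RMSE = sqrt(mean((fcst - obs)^2)),  # best score 0
    r    = cor(fcst, obs))              # best score 1
}

# Dichotomous (yes/no) scores from the 2x2 contingency counts:
# a = hits, b = false alarms, c = misses, d = correct negatives
verify_dichotomous <- function(a, b, c, d) {
  n <- a + b + c + d
  c(HR  = a / (a + c),      # hit rate
    PC  = (a + d) / n,      # proportion correct (accuracy)
    FAR = b / (a + b),      # false alarm ratio
    CSI = a / (a + b + c))  # critical success index
}

# Example with made-up numbers:
# verify_continuous(fcst = c(10, 25, 40), obs = c(12, 20, 50))
# verify_dichotomous(a = 30, b = 10, c = 5, d = 55)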

Skill Score Method

BMKG’s prediction of the onset of the Indonesian wet and dry seasons is based on rainfall values obtained from the output of several models. These models include both dynamical and statistical models, each with its own advantages and disadvantages. Using many models inevitably adds uncertainty to the prediction of season onset. To overcome this, several things can be done:

Some of the metrics described above can be used to evaluate the rainfall output of the models used in seasonal prediction. Here we use an evaluation metric called the Taylor Skill Score (TSS) (Yang et al. 2020), commonly referred to simply as the Skill Score (SS), which is based on the Taylor Diagram. The Taylor Diagram combines three statistics (the correlation coefficient, the RMSE, and \({NSD}_m\)) in a single graph. Judged on any one of these outputs alone, it would be difficult to determine the best model; the TSS combines the three into a single value that expresses the skill of the rainfall prediction model being evaluated.

\[\begin{equation} \tag{1} SS = \frac{4(1 + CC)^4}{\left( \text{NSD}_m + \frac{1}{\text{NSD}_m} \right)^2 (1 + CC_0)^4} \end{equation}\] \[\begin{equation} \tag{2} \text{NSD}_m = \frac{\sigma_{\text{mod}}}{\sigma_{\text{obs}}} \end{equation}\]

Where:

- \(\text{NSD}_m\) = normalized standard deviation of the simulation (model)
- \(\sigma_{\text{mod}}\) = standard deviation of the simulation (model)
- \(\sigma_{\text{obs}}\) = standard deviation of the observations
- \(m\) = the model (simulation) being evaluated
- \(\text{CC}_0\) = maximum attainable correlation coefficient (usually taken as 1)
- \(\text{CC}\) = correlation coefficient between the simulated (model) and observed data

The closer SS is to 1, the better the individual model represents the observations.
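To make the formula concrete, here is a minimal R sketch (a hypothetical helper, not the article’s code) that computes CC, \(\text{NSD}_m\), and SS from paired model and observation vectors, assuming \(\text{CC}_0 = 1\):

# Minimal sketch: Taylor Skill Score (Eq. 1-2) for one model vs. observations.
# `mod` and `obs` are numeric vectors of the same length; CC0 is taken as 1.
taylor_skill_score <- function(mod, obs, cc0 = 1) {
  cc  <- cor(mod, obs)       # correlation coefficient (CC)
  nsd <- sd(mod) / sd(obs)   # normalized standard deviation (NSD_m), Eq. (2)
  ss  <- 4 * (1 + cc)^4 / ((nsd + 1 / nsd)^2 * (1 + cc0)^4)  # Eq. (1)
  c(CC = cc, NSDm = nsd, SS = ss)
}

# Example with made-up dekadal rainfall values:
# taylor_skill_score(mod = c(120, 80, 60, 30, 10), obs = c(100, 90, 50, 20, 15))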


3. RAINFALL PREDICTION EVALUATION (FOR SEASON ONSET DETERMINATION)

Assessment at ZOM 9120

Here is an example of rainfall forecast data from several models (January 2011 initial) and the corresponding observations, covering 21 dekads (dasarian) for ZOM Aceh_01:

CODE TO CREATE A TAYLOR DIAGRAM

library(openair)   # TaylorDiagram()
library(Metrics)   # error metrics (rmse, mae), if needed
library(tidyr)     # pivot_longer()
# data: a data frame with the dekad id column 'Jan_2011', the observation column 'OBS', and one rainfall column per model
#data <- read.csv('path/to/your/data.csv')
# Reshape to long format: one row per (dekad, model) pair
t_dt3 <- pivot_longer(data = data, cols = colnames(data)[!colnames(data) %in% c('Jan_2011','OBS')],
                      names_to = 'MODEL', values_to = 'CH')
t_dt3$MODEL <- factor(t_dt3$MODEL, levels = unique(t_dt3$MODEL))
# Normalised Taylor Diagram comparing every model against the observations
p2 <- TaylorDiagram(t_dt3, obs = "OBS", mod = "CH", group = "MODEL", normalise = TRUE,
                    cols = c('cornflowerblue','blue','yellow2','deeppink','deeppink4','orange1','orange4'),
                    text.obs = 'Observation', key = TRUE, main = 'Taylor Diagram')

ECMWF_RAW ECMWF_COR MME1 CFSv2_RAW CFSv2_COR ARIMA WARIMA
RMSE 42.9393894 32.8570424 38.5247122 46.4539831 31.4249754 43.5126742 42.0122118
CC 0.6972082 0.7312276 0.5420013 0.4109492 0.7422360 0.3165167 0.4026214
NSDm 0.6112191 0.5841298 0.6217545 0.6297243 0.7395452 0.3083050 0.3535605
SS 0.4107342 0.4259743 0.2842024 0.2014521 0.5264507 0.0595302 0.0955697
W 0.2049660 0.2125712 0.1418237 0.1005293 0.2627113 0.0297070 0.0476915
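A sketch of how such a summary table could be produced in R, reusing the taylor_skill_score() helper sketched earlier (the data frame layout, with an 'OBS' column plus one column per model, is an assumption based on the snippet above):

# Compute RMSE, CC, NSDm, and SS for every model column against OBS
model_cols <- setdiff(colnames(data), c('Jan_2011', 'OBS'))
summary_tbl <- sapply(model_cols, function(m) {
  stats <- taylor_skill_score(mod = data[[m]], obs = data[['OBS']])
  c(RMSE = sqrt(mean((data[[m]] - data[['OBS']])^2)), stats)
})
# Weight each model by its share of the total SS (see Eq. 3 in Section 4)
summary_tbl <- rbind(summary_tbl, W = summary_tbl['SS', ] / sum(summary_tbl['SS', ]))
round(summary_tbl, 7)                  # summary table as in the example above
names(which.max(summary_tbl['SS', ]))  # best model by SS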

Based on the example above, the best model at ZOM ACEH_01 for the January 2011 initial is the CFSv2_COR model. SS calculations should be carried out over the whole period for which data are available (in this example 2011 - 2020) and for every initial month (January - December), so that an SS is obtained for each initial month in each of the 10 years. From all of these SS values, the 10-year average SS can be calculated and the model with the highest average SS taken as the best model.
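As a rough sketch of this aggregation step (it assumes a long data frame named ss_all with columns ZOM, INIT, YEAR, MODEL, and SS collected from the per-year calculations; the name and layout are assumptions, not the article’s actual code):

library(dplyr)

# Average SS over 2011-2020, then keep the model with the highest
# average SS for each ZOM and each initial month
best_models <- ss_all %>%
  group_by(ZOM, INIT, MODEL) %>%
  summarise(mean_SS = mean(SS), .groups = 'drop') %>%
  group_by(ZOM, INIT) %>%
  slice_max(mean_SS, n = 1) %>%
  ungroup()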

BEST MODEL FOR EACH ZOM BASED ON AVERAGE SS 2011-2020

Using the Skill Score method to Interpret the Evaluation

Based on the highest Skill Score (maximum 1), we obtain the best model in each ZOM for each initial (forecast release) month. From the tabular results above, we can also display the Skill Score value and the best model for each ZOM spatially.

BEST MODEL FOR EACH ZOM BASED ON AVERAGE SS 2011-2020 ON INITIAL JANUARY

Then, we can also examine the distribution of Skill Scores over the 10 years (2011-2020) using a boxplot, to see which skill score values dominate.


EXAMPLE OF SKILL SCORE DISTRIBUTION IN SEVERAL ZOMS IN A GIVEN MONTH (2011-2020)

SS FOR 2011-2020 AT ZOM ACEH_01

CODE TO CREATE A BOXPLOT

# ss_aceh01: one row per year (2011-2020); columns 1-2 are identifiers (e.g. ZOM, INIT), columns 3+ hold each model's SS
#ss_aceh01 <- read.csv('path/to/your/data.csv')
plot.new()
# One box per model, showing the spread of SS over the 10 years
boxplot(x = ss_aceh01[,3:ncol(ss_aceh01)], lwd = 2, yaxt='n', ylim=c(0,1), col = 'white', medlwd=2, cex.axis=0.5)
# Blue circles mark the mean SS of each model
points(x = 1:ncol(ss_aceh01[,3:ncol(ss_aceh01)]), apply(X = ss_aceh01[,3:ncol(ss_aceh01)], MARGIN = 2, FUN=mean), pch=21, col='blue3', lwd=2)
# Red dots are the individual yearly SS values, jittered horizontally
for (i in 1:(ncol(ss_aceh01)-2)) {
  points(jitter(rep(i, nrow(ss_aceh01))), ss_aceh01[[i+2]], col = "red", pch = 16, cex=0.45)
}
# Label each box with its mean SS
text(x=1:ncol(ss_aceh01[,3:ncol(ss_aceh01)]), y=apply(X = ss_aceh01[,3:ncol(ss_aceh01)], MARGIN = 2, FUN=mean)+0.02, labels= paste0('Mean = ', round(apply(X = ss_aceh01[,3:ncol(ss_aceh01)], MARGIN = 2, FUN=mean),2)), cex=0.5)
axis(side=2, at=seq(0,1,0.1), labels=seq(0,1,0.1), cex.axis=0.8)
# Light grey gridlines
segments(x0 = seq(0,7,0.5), y0 = 0, y1 = 1, lty=2, lwd=0.5, col = 'lightgrey')
segments(x0 = 0, x1 = 7, y0 = seq(0,1,0.1), lty=2, lwd=0.5, col = 'lightgrey')
mtext(~ italic('Skill Score (SS)'), side = 2, line = 2, cex = 0.8)
title(main = 'Boxplot Skill Score ZOM : ACEH_01', line = 0.5)
mtext(~ italic('Period 2011 - 2020'), side = 1, line = par('usr')[3]-1, col = "black", cex = 0.6, at=par('usr')[2]-0.7)
mtext('Init : JAN', side = 3, font=3, line = par('usr')[3]-1, col = "darkblue", cex = 0.8, at=par('usr')[2]-0.4)

4. RAINFALL WEIGHTING

Seasonal Rainfall Prediction Weighting based on Skill Score

From the highest Skill Score results, we know the best model for each initial month in each ZOM. For season onset prediction, we can simply use the rainfall predicted by that best model for the initial month in question, and then determine the season onset from those predicted rainfall values.

Another approach, if we still want to use all of the seasonal rainfall model outputs, is to weight each model’s rainfall output. The weight can be obtained from the Skill Score of each model that we have already calculated. Mathematically, it can be formulated as follows:

\[\begin{equation} \tag{3} W_i = \frac{SS_i}{\sum_{j=1}^n SS_j} \end{equation}\]

\[\begin{equation} \tag{4} \sum_{i=1}^n W_i = 1 \end{equation}\]

Where:

- \(SS_i\) = Skill Score of the \(i\)-th model, where \(i = 1, 2, \dots, n\)
- \(W_i\) = weight of the \(i\)-th model, where \(i = 1, 2, \dots, n\)
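A minimal sketch of this weight calculation in R (the vector name ss and the rounded numbers, taken from the BALI_11 case in the next subsection, are illustrative assumptions):

# Convert per-model Skill Scores into weights (Eq. 3); by construction
# the weights sum to 1 (Eq. 4)
ss <- c(MME1 = 0.479, ECMWF_RAW = 0.433, ECMWF_COR = 0.539,
        CFSv2_RAW = 0.070, CFSv2_COR = 0.510, ARIMA = 0.539, WARIMA = 0.473)
w  <- ss / sum(ss)   # W_i = SS_i / sum_j(SS_j)
sum(w)               # equals 1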

Seasonal Rainfall Prediction ZOM 9120 using Skill Score-based weighting

As practice, we will carry out the weight calculation for ZOM BALI_11 with the February initial. If we add up the resulting weights (W), the total is 1.

ZOM TYPE INIT MME1 ECMWF_RAW ECMWF_COR CFSv2_RAW CFSv2_COR ARIMA WARIMA
25896 BALI_11 SS Feb 0.4786296 0.4334377 0.5389604 0.0695950 0.5097606 0.5386813 0.4729669
34284 BALI_11 W Feb 0.1543724 0.1430077 0.1771860 0.0249932 0.1750009 0.1737069 0.1517329

Next, we will weight the seasonal rainfall prediction for the February 2024 initial (generally used for dry season prediction). We multiply each model’s prediction by the weight (W) obtained in the previous calculation.

SEASONAL RAINFALL PREDICTION DATA INITIAL FEBRUARY 2024

DAS MME1 ECMWF_RAW ECMWF_COR CFSv2_RAW CFSv2_COR ARIMA WARIMA
Feb I 163.8225 96.9621 131.9956 55.9008 202.7789 107.4265 92.3780
Feb II 72.1478 83.7558 114.0584 29.5072 107.0710 99.2311 92.3780
Feb III 81.4403 75.8253 103.3301 30.6811 111.3555 87.2150 85.6466
Mar I 25.0803 97.1974 107.0056 49.4504 151.6484 75.1727 84.5403
Mar II 120.4728 93.7277 103.1808 75.5555 231.6704 74.0638 80.1041
Mar III 169.5966 85.7359 94.4199 83.6646 256.5225 63.1690 44.2504
Apr I 29.1713 74.2389 60.4546 68.8359 74.1464 50.7099 40.5221
Apr II 38.4456 63.8937 52.0237 70.8249 76.2927 50.6072 8.5760
Apr III 60.3156 55.0750 44.8474 75.9045 81.7458 38.5234 38.8828
May I 23.8206 44.4667 28.0870 72.5172 34.3506 28.6324 10.1355
May II 31.2781 39.5156 24.9615 66.9831 31.7315 24.5754 24.9866
May III 32.9556 29.2346 18.4672 123.3038 58.4087 28.1022 7.3104
Jun I 19.5028 24.1109 15.9519 60.1589 27.3728 13.4558 7.2317
Jun II 8.7416 22.2295 14.7116 39.5339 17.9883 12.1809 0.0000
Jun III 5.8922 14.4046 9.5367 36.8783 16.7778 12.9941 2.9549
Jul I 10.4091 9.8209 9.3945 32.3297 22.2831 8.1527 2.2763
Jul II 15.2450 10.0609 9.6385 35.1482 24.2216 5.2905 14.9037
Jul III 19.0328 10.5429 10.0975 27.4801 18.9454 0.0000 8.1212
Aug I 1.3834 6.5196 3.7593 22.5320 8.8758 14.0602 13.5241
Aug II 3.0947 6.9658 4.0088 17.2212 6.7787 12.4973 10.8369
Aug III 7.1116 6.9975 4.0227 25.4537 9.9893 12.2050 3.6793

WEIGHTED FOR INITIAL FEBRUARY 2024

DAS MME1 ECMWF_RAW ECMWF_COR CFSv2_RAW CFSv2_COR ARIMA WARIMA RRw
Feb I 25.2896762 13.8663244 23.3877699 1.3971397 35.486489 18.6607283 14.0167813 132.104908
Feb II 11.1376307 11.9777222 20.2095495 0.7374793 18.737521 17.2371305 14.0167813 94.053814
Feb III 12.5721364 10.8435998 18.3086451 0.7668188 19.487312 15.1498505 12.9954065 90.123769
Mar I 3.8717067 13.8999741 18.9598922 1.2359236 26.538606 13.0580195 12.8275444 90.391666
Mar II 18.5976780 13.4037804 18.2821913 1.8883735 40.542527 12.8653959 12.1544269 117.734373
Mar III 26.1810380 12.2608917 16.7298826 2.0910458 44.891667 10.9728935 6.7142412 119.841660
Apr I 4.5032442 10.6167324 10.7117076 1.7204292 12.975686 8.8086614 6.1485355 55.484997
Apr II 5.9349404 9.1372895 9.2178703 1.7701407 13.351291 8.7908217 1.3012613 49.503615
Apr III 9.3110653 7.8761477 7.9463306 1.8970961 14.305588 6.6917818 5.8997998 53.927809
May I 3.6772437 6.3590794 4.9766226 1.8124367 6.011386 4.9736465 1.5378887 29.348303
May II 4.8284761 5.6510341 4.4228279 1.6741218 5.553041 4.2689175 3.7912891 30.189707
May III 5.0874358 4.1807722 3.2721289 3.0817562 10.221575 4.8815471 1.1092281 31.834443
Jun I 3.0106945 3.4480437 2.8264530 1.5035632 4.790265 2.3373658 1.0972868 19.013672
Jun II 1.3494620 3.1789891 2.6066893 0.9880786 3.147969 2.1159068 0.0000000 13.387094
Jun III 0.9095932 2.0599683 1.6897695 0.9217066 2.936130 2.2571653 0.4483555 11.222689
Jul I 1.6068780 1.4044641 1.6645737 0.8080226 3.899562 1.4161805 0.3453896 11.145071
Jul II 2.3534076 1.4387859 1.7078071 0.8784659 4.238802 0.9189966 2.2613815 13.797646
Jul III 2.9381394 1.5077156 1.7891354 0.6868156 3.315462 0.0000000 1.2322532 11.469521
Aug I 0.2135588 0.9323528 0.6660953 0.5631467 1.553273 2.4423543 2.0520508 8.422832
Aug II 0.4777363 0.9961629 0.7103032 0.4304128 1.186279 2.1708677 1.6443142 7.616076
Aug III 1.0978349 1.0006962 0.7127660 0.6361693 1.748136 2.1200932 0.5582708 7.873967

The seasonal rainfall prediction value that we will use is the sum of the weighted model values in each dasarian, shown in the RRw column above.
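As a sketch of this final step (assuming pred is a data frame shaped like the prediction table above, with a DAS column followed by one column per model, and w is the named weight vector from the earlier sketch; these names are assumptions, not the article’s code):

# Multiply each model column by its weight, then sum across models per dasarian
model_cols <- names(w)                                   # MME1, ECMWF_RAW, ...
weighted   <- sweep(pred[, model_cols], 2, w[model_cols], `*`)
pred$RRw   <- rowSums(weighted)                          # weighted rainfall prediction (RRw)
head(cbind(pred['DAS'], round(weighted, 4), RRw = round(pred$RRw, 4)))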

References

Jolliffe, I. T., and D. B. Stephenson. 2011. Forecast Verification: A Practitioner’s Guide in Atmospheric Science. Wiley. https://books.google.co.id/books?id=sgwIEAAAQBAJ.
Murphy, Allan H. 1993. “What Is a Good Forecast? An Essay on the Nature of Goodness in Weather Forecasting.” Weather and Forecasting 8 (2): 281–93. https://doi.org/10.1175/1520-0434(1993)008<0281:WIAGFA>2.0.CO;2.
Wilks, Daniel S. 2011. Statistical Methods in the Atmospheric Sciences. Amsterdam; Boston: Elsevier Academic Press.
WWRP. 2017. “WWRP/WGNE Joint Working Group on Forecast Verification Research.” https://www.cawcr.gov.au/projects/verification.
Yang, Xiaoli, Yuqian Wang, Xiaohan Yu, and Justin Sheffield. 2020. “The Optimal Multimodel Ensemble of Bias-Corrected CMIP5 Climate Models over China.” Journal of Hydrometeorology 21 (4): 845–63. https://doi.org/10.1175/JHM-D-19-0141.1.