Part3 Plotting verification scores

The plot_point_verif function

All plots of verfication scores are done using the plot_point_verif function from the harpVis package. So, let’s begin by attaching the harpVis package, and if you are running in a new session than the one you did Part 2 of this tutorial in, load the verification data from Part 2 - the data was saved in .rds format, which is an R format so can be loaded with readRDS. The filename is a bit difficult to remember, but when you saved it a message was printed to the screen telling you what it was saving.

library(harpVis)
verif_T2m <- readRDS("/scratch/ms/no/fa1m/harp_output/verification/harpPointVerif.harp.T2m.harp.2017053000-2017053100.harp.model1.model.model2.model.model3.rds")

When we run harp_point_verif the minumum information we need to give it is the verification data and the score we wish to plot. The score is typically a column name in the data, though there are some derived scores as well. For this reason, the score is not quoted. Let’s start with something simple, like the mean bias of the ensemble. If we look at the verification data…

verif_T2m

## $ens_summary_scores
## # A tibble: 36 x 11
##    mname leadtime num_cases mean_bias  rmse  stde spread  crps
##    <chr>    <int>     <int>     <dbl> <dbl> <dbl>  <dbl> <dbl>
##  1 mode…        3      1566  -0.348    1.29  1.24  0.423 0.850
##  2 mode…        6      1649  -0.311    1.19  1.15  0.459 0.776
##  3 mode…        9       770   0.00976  1.68  1.68  0.813 1.05 
##  4 mode…       12       797  -0.0880   1.91  1.91  0.989 1.21 
##  5 mode…       15       770  -0.357    1.91  1.88  1.05  1.19 
##  6 mode…       18       829  -0.392    1.67  1.62  0.848 1.00 
##  7 mode…       21       774  -0.299    1.45  1.42  0.710 0.897
##  8 mode…       24       810  -0.451    1.55  1.48  0.781 0.971
##  9 mode…       27       789  -0.478    1.40  1.32  0.742 0.881
## 10 mode…       30       831  -0.456    1.25  1.17  0.782 0.777
## # … with 26 more rows, and 3 more variables: crps_potential <dbl>,
## #   crps_reliability <dbl>, rank_histogram <list>
## 
## $ens_threshold_scores
## # A tibble: 180 x 17
##    mname threshold leadtime economic_value sample_climatol…
##    <chr>     <dbl>    <int> <list>                    <dbl>
##  1 mode…      284.        3 <tibble [20 ×…            0.334
##  2 mode…      284.        6 <tibble [20 ×…            0.415
##  3 mode…      284.        9 <tibble [20 ×…            0.588
##  4 mode…      284.       12 <tibble [20 ×…            0.645
##  5 mode…      284.       15 <tibble [20 ×…            0.640
##  6 mode…      284.       18 <tibble [20 ×…            0.546
##  7 mode…      284.       21 <tibble [20 ×…            0.412
##  8 mode…      284.       24 <tibble [20 ×…            0.340
##  9 mode…      284.       27 <tibble [20 ×…            0.332
## 10 mode…      284.       30 <tibble [20 ×…            0.391
## # … with 170 more rows, and 12 more variables: bss_ref_climatology <dbl>,
## #   num_cases_for_threshold_total <int>,
## #   num_cases_for_threshold_observed <int>,
## #   num_cases_for_threshold_forecast <int>, brier_score <dbl>,
## #   brier_skill_score <dbl>, brier_score_reliability <dbl>,
## #   brier_score_resolution <dbl>, brier_score_uncertainty <dbl>,
## #   reliability <list>, roc <list>, roc_area <dbl>
## 
## $det_summary_scores
## # A tibble: 252 x 8
##    mname  leadtime member num_cases     bias  rmse   mae  stde
##    <chr>     <int> <chr>      <int>    <dbl> <dbl> <dbl> <dbl>
##  1 model1        3 mbr000      1566 -0.369    1.33 1.00   1.28
##  2 model1        6 mbr000      1649 -0.315    1.22 0.925  1.18
##  3 model1        9 mbr000       770 -0.00320  1.71 1.25   1.71
##  4 model1       12 mbr000       797 -0.0778   1.97 1.48   1.97
##  5 model1       15 mbr000       770 -0.348    1.93 1.45   1.90
##  6 model1       18 mbr000       829 -0.360    1.67 1.20   1.63
##  7 model1       21 mbr000       774 -0.316    1.45 1.07   1.42
##  8 model1       24 mbr000       810 -0.462    1.56 1.19   1.49
##  9 model1       27 mbr000       789 -0.549    1.45 1.10   1.34
## 10 model1       30 mbr000       831 -0.538    1.42 1.06   1.31
## # … with 242 more rows
## 
## attr(,"parameter")
## [1] "T2m"
## attr(,"start_date")
## [1] "2017053000"
## attr(,"end_date")
## [1] "2017053100"
## attr(,"num_stations")
## [1] 842

… we see that there is a column mean_bias in the ens_summary_scores table. So all we need to do is call plot_point_verif with mean_bias as the score

plot_point_verif(verif_T2m, mean_bias)

By defaulty the number of cases is plotted below - this can be switched of by setting plot_num_cases = FALSE

plot_point_verif(verif_T2m, mean_bias, plot_num_cases = FALSE)

We can also change the colours by supplying a colour table. This colour table is in the form of a data frame with a column for the data you want to control the colour for and a column for the colour. By default, the colours are controlled by the mname column, which is the model name. In our case we have model1, model2, and model3, so let’s colour them red, green and blue respectively.

my_colours <- data.frame(
  mname  = c("model1", "model2", "model3"),
  colour = c("red", "green", "blue")
)

plot_point_verif(verif_T2m, mean_bias, plot_num_cases = FALSE, colour_table = my_colours)

Colours can also be specified as hex codes (e.g. “#AF469B”). It is recommended to avoid primary colours such as red, green and blue and rather make use of some professional colour palettes. See here for some useful guidance.

Of the derived scores that do not appear in the verification data, spread_skill and spread_skill_ratio are perhaps the most useful. Simply set these as the score and run the function as normal.

plot_point_verif(verif_T2m, spread_skill, plot_num_cases = FALSE)

plot_point_verif(verif_T2m, spread_skill_ratio, plot_num_cases = FALSE)

Faceting and filtering

If we want to plot a threshold score, we run into a problem of how to separate the plots. If we simply try to plot the Brier score we will get a rather odd looking plot…

plot_point_verif(verif_T2m, brier_score)

This is because it’s trying to plot every threshold in one plot. There are two ways around this - faceting, which is splitting the plot into panels based on the value in a particular column of the data, or filtering where we filter the data down to just what we want to plot. Let’s start by faceting the plot based on the threshold - we do this with the facet_by argument. Anything that is supplied to facet_by must be wrapped in the vars function - this is because any number of column names can be used for faceting and vars splices the information together for the plotting function to understand.

plot_point_verif(verif_T2m, brier_score, facet_by = vars(threshold))

We can also set the number of columns in our faceted plot using the num_facet_cols argument.

plot_point_verif(verif_T2m, brier_score, facet_by = vars(threshold), num_facet_cols = 2)

If we just want to plot for one threshold we can use the filter_by argument.

plot_point_verif(verif_T2m, brier_score, filter_by = vars(threshold == 283.8))

Faceting is also useful for inspecting rank histograms for each lead time.

plot_point_verif(verif_T2m, rank_histogram, facet_by = vars(leadtime), num_facet_cols = 4)

Plotting deterministic scores for ensembles.

For plotting deterministic scores we need to use what we have learned about faceting as well as introduce a new functionality - colour_by. We do not want to plot all members for each model on the same plot as this would clutter it greatly. We also need to tell the function to separate the data based on member - we do this with colour_by. We also need to tell the plot function that we want the deterministic scores, which we can do by settingverif_type = "det"

plot_point_verif(verif_T2m, rmse, facet_by = vars(mname), colour_by = member, verif_type = "det")

The legend is too wide dor the plot, so we have a number of options - increase the number of rows available to the legend with the num_legend_rows argument, or move or delete the legend with the legend_position argument.

plot_point_verif(
  verif_T2m, rmse, facet_by = vars(mname), colour_by = member, 
  verif_type = "det", num_legend_rows = 3
)

plot_point_verif(
  verif_T2m, rmse, facet_by = vars(mname), colour_by = member, 
  verif_type = "det", legend_position = "right", num_legend_rows = 8
)

plot_point_verif(
  verif_T2m, rmse, facet_by = vars(mname), colour_by = member, 
  verif_type = "det", legend_position = "none"
)

It may also be useful to have all of the members in one colour and only highlight the control member, for example. This can be done by setting the colour table with a member column. We’ll use the R functions rep, seq, paste0 and formatC to make this easier.

member_colours = data.frame(
  member = paste0("mbr", formatC(seq(0, 7), width = 3, flag = "0")),
  colour = c("#EC5581", rep("grey", 7))
)
plot_point_verif(
  verif_T2m, rmse, facet_by = vars(mname), colour_by = member, 
  verif_type = "det", legend_position = "none", colour_table = member_colours
)

## Warning: plot_num_cases = TRUE cannot be used with facet_by. plot_num_cases
## set to FALSE.

There are many more things you can do with plots. They are generated with the ggplot2 package, so it would pay to learn how to make best use of ggplot2, so that you can manipulate the plots in any way you’d like. Perhaps the biggest advantage of this is that the your plots can be saved with ggsave. By default it will save the most recent plot in the plot window.

The harp verification process

Andrew Singleton

02/15/2019

Part3 Plotting verification scores

The plot_point_verif function

Faceting and filtering

Plotting deterministic scores for ensembles.