Jamie was at the HWT for the last week (Week 5) of the testbed (31 May - 3 June)
Notes from Day 1 overview:
- CAPS radar assimilation ended up not running, so there will be no GSD vs. CAPS comparison
 - Plan to continue the controlled ensemble experiment (CLUE, the Community Leveraged Unified Ensemble) in future HWTs - it has been successful but also had a few "lessons learned" in this inaugural year that can be improved upon next year
 - CAM hail size evaluation was a focus for this year
- Three hail algorithms: HAILCAST, direct output from the microphysics (mp) scheme (developed by G. Thompson), and a machine learning (statistical) technique (Gagne)
 
 - Ensemble sensitivity (Texas Tech work - Brian Ancell)
- Features in the flow early in the forecast that impact the ensemble response later (predictability)
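A minimal sketch of how an ensemble-sensitivity calculation like this can be done (my own illustration, not the Texas Tech code; the toy ensemble, variable names, and numbers are assumptions): regress the later response function J onto an early-forecast state variable x across the members, so dJ/dx ~ cov(J, x) / var(x). In practice this is repeated at every grid point of the early field to map where initial differences project onto the later response.

```python
import numpy as np

# Toy ensemble: one early-forecast variable per member (e.g., 500-hPa height at a point)
# and a later scalar response function J per member (e.g., area-average precipitation).
# These numbers are placeholders, not SFE2016 output.
rng = np.random.default_rng(0)
n_members = 40
x_early = rng.normal(5700.0, 30.0, size=n_members)
j_response = 2.0 + 0.05 * (x_early - 5700.0) + rng.normal(0.0, 0.5, size=n_members)

# Ensemble sensitivity: regression slope of J onto x across members,
# dJ/dx ~ cov(J, x) / var(x)
sensitivity = np.cov(j_response, x_early, ddof=1)[0, 1] / np.var(x_early, ddof=1)
print(f"estimated dJ/dx = {sensitivity:.3f}")
```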
 
 - So far they have noticed that if the CAMs don't handle overnight convection well, they have problems the next day
- There is a wide range of solutions among the CAMs in this type of pattern
 
 - Week 5 had more of a weaker-shear/multi-cell storm pattern - this is a challenge for CAMs
 - This blog entry is a good overview of what we did each day (http://springexperiment.blogspot.com/2016/05/data-driven.html#more)
 
Notes from sitting with forecasters each day
- opHRRR tends to have
- PBL too warm/dry
 - too much convection
 
 - parallel HRRR (going operational in early July)
- has been decent during HWT
 
 - 5-day MPAS
- performance is region dependent
 - strongly forced systems easier
 - general temporal/spatial coverage OK but not specific storm location
 
 - Thompson mp
- less aggressive cold pools to slow propagation (this was an intentional design choice based on feedback from previous experiments)
 - you can see the result of this in the statistics
 
 - For verification (subjective during the experiment) they used LSRs (local storm reports), WFO warnings (especially in rural areas where no reports are received), and MESH (maximum estimated size of hail - MRMS hail product)
- Forecasters generally like the MESH - seems to be pretty accurate
 
 - If we draw a 5% poly, we would want 5 reports for every 100 grid boxes (at 80-km resolution) within the area (see the rough arithmetic sketch after this group of notes)
 - General comment from HWT coordinators over the past 4 weeks
- ARW (HRRR) generally has (incrementally) better performance than NAMRR - but on cases when NAMRR is better, it tends to be much better
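Rough arithmetic behind the 5%-poly rule of thumb a few lines up (my own back-of-the-envelope sketch; the poly area below is made up, not from the experiment):

```python
# Back-of-the-envelope check of the "5 reports per 100 grid boxes" rule of thumb.
grid_spacing_km = 80.0
box_area_km2 = grid_spacing_km ** 2          # 6,400 km^2 per grid box
poly_area_km2 = 200_000.0                    # hypothetical 5% poly area
n_boxes = poly_area_km2 / box_area_km2       # ~31 grid boxes
expected_reports = 0.05 * n_boxes            # -> roughly 1.6 reports to verify
print(f"{n_boxes:.0f} boxes, ~{expected_reports:.1f} reports expected")
```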
 
 - When evaluating probabilities of 40 dBZ or greater they used observed reflectivity > 40 dBZ as the comparison field
 - In operations, NAM has poor sounding structure near convective initiation
 - Forecasters need to be aware of the CWAs (county warning areas) covered by what they issue
- Don't want to change their poly just enough to include a CWA if it wasn't in there previously (unless warranted)
 - They joke that they could put so-and-so's house in a slight risk!
 
 - There is no reward to the forecaster for keeping the poly smaller (to reduce FAR), but they are punished if the area is too small and they miss reports
- Every bust makes them draw larger polys
 - Only need a handful of reports to verify
 
 - Don't care so much about FARs
 - Hard to decrease probabilities once they are issued to the public ("Thou shalt not downgrade...")
- They tend to err on the side of too low early on to avoid this problem
 
 - How do you evaluate a hail forecast if the storms are in the wrong spot?!
 - At the start of SFE2016, this post talks a bit about CLUE (http://springexperiment.blogspot.com/2016/05/the-2016-spring-forecasting-experiment.html); the final blog entry wrapping up SFE2016 is here (http://springexperiment.blogspot.com/2016/06/sfe-2016-wrap-up.html)
 
On a few of the days they took ~2-5 minutes to show some objective statistics from the experiment
- Aggregated ROC for SFE2016 to-date (3-hrly ROC area by forecast lead time)
- Assess mixed core vs. single core - In general, the mixed (ARW+NMMB) core beats any single (ARW or NMMB) core; for single core, ARW generally beats NMMB
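A minimal sketch of how a ROC area like this can be computed from forecast probabilities and a 0/1 observed field (my own illustration, not the SFE2016 verification code; the probability thresholds and toy data are assumptions):

```python
import numpy as np

def roc_area(prob_fcst, obs_binary, thresholds=np.arange(0.05, 1.0, 0.05)):
    """Trapezoidal ROC area from forecast probabilities and 0/1 observations."""
    pod = [1.0]    # threshold -> 0: everything forecast "yes"
    pofd = [1.0]
    for t in thresholds:
        yes = prob_fcst >= t
        hits = np.sum(yes & (obs_binary == 1))
        misses = np.sum(~yes & (obs_binary == 1))
        fals = np.sum(yes & (obs_binary == 0))
        nulls = np.sum(~yes & (obs_binary == 0))
        pod.append(hits / max(hits + misses, 1))
        pofd.append(fals / max(fals + nulls, 1))
    pod.append(0.0)    # threshold -> 1: everything forecast "no"
    pofd.append(0.0)
    x = np.array(pofd)[::-1]   # POFD increasing from 0 to 1
    y = np.array(pod)[::-1]
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Toy data: probabilities loosely tied to a 0/1 "observed >= 40 dBZ" field
rng = np.random.default_rng(1)
probs = rng.uniform(0.0, 1.0, size=5000)
obs = (rng.uniform(0.0, 1.0, size=5000) < probs).astype(int)
print(f"ROC area = {roc_area(probs, obs):.2f}")
```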
 
 - When looking at FSS: the mixed (ARW+NMMB) core beats any single (ARW or NMMB) core; for single core, NMMB generally beats ARW at shorter lead times and ARW beats NMMB at longer lead times
- When they compute FSS they do the following:
- Make the obs 0/1 and apply a smoother to get continuous values between 0 and 1 in the obs
 - Apply a 40-km radius to the forecast field
 - Difference the forecast probabilities from the observations and look at the squared difference
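A minimal sketch of that FSS recipe (my own reading of the steps above: the 40-km radius comes from the notes, but the 3-km grid spacing, the square uniform filter standing in for the smoother/radius, and the toy fields are assumptions; the 1 - FBS/FBS_worst normalization is the standard Roberts and Lean definition, not spelled out in the notes):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(fcst_prob, obs_binary, radius_km=40.0, dx_km=3.0):
    """Fractions Skill Score following the steps in the notes above."""
    width = max(int(round(2 * radius_km / dx_km)) + 1, 1)  # neighborhood width in grid points

    # Step 1: turn the 0/1 obs into continuous 0-1 fractions with a smoother
    obs_frac = uniform_filter(obs_binary.astype(float), size=width)

    # Step 2: apply the same ~40-km neighborhood to the forecast field
    fcst_frac = uniform_filter(fcst_prob.astype(float), size=width)

    # Step 3: mean squared difference between forecast and observed fractions (FBS),
    # then normalize to get FSS
    fbs = np.mean((fcst_frac - obs_frac) ** 2)
    fbs_worst = np.mean(fcst_frac ** 2) + np.mean(obs_frac ** 2)
    return 1.0 - fbs / fbs_worst if fbs_worst > 0 else np.nan

# Toy 0/1 "obs" and a forecast probability field on a made-up 3-km grid
rng = np.random.default_rng(2)
obs = (rng.uniform(size=(200, 200)) > 0.97).astype(int)
fcst = uniform_filter(obs.astype(float), size=9)   # a smoothed-obs stand-in for a forecast
print(f"FSS = {fss(fcst, obs):.2f}")
```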
 
 
 - Does the influence of DA extend longer when looking at probabilities rather than deterministic forecasts?
 - They compared PQPF to observations by applying the same threshold to both for a single case (see the sketch at the end of these notes)
 - This blog entry has an example of the ROC curves and PQPF comparison that we looked at (http://springexperiment.blogspot.com/2016/05/clue-comparisons.html#more). I can't seem to find a link to these plots on the testbed webpage (http://hwt.nssl.noaa.gov/Spring_2016/), however.
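Finally, a minimal sketch of the kind of PQPF-vs-observations comparison mentioned above (my own illustration; the ensemble, the QPE field, and the 0.5-inch threshold are all assumptions, not SFE2016 output — in the experiment this was shown graphically for one case):

```python
import numpy as np

# Hypothetical data: ensemble QPF (members x ny x nx) and an observed QPE field (ny x nx),
# both in inches.
rng = np.random.default_rng(3)
ens_qpf = rng.gamma(shape=0.5, scale=0.4, size=(10, 120, 120))
obs_qpe = rng.gamma(shape=0.5, scale=0.4, size=(120, 120))

threshold = 0.5  # inches; apply the same threshold to forecast and obs

# PQPF: fraction of members exceeding the threshold at each grid point
pqpf = np.mean(ens_qpf >= threshold, axis=0)

# Observed exceedance field with the same threshold
obs_exceed = (obs_qpe >= threshold).astype(float)

print(f"mean PQPF           = {pqpf.mean():.3f}")
print(f"obs exceedance freq = {obs_exceed.mean():.3f}")
print(f"Brier score         = {np.mean((pqpf - obs_exceed) ** 2):.3f}")
```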