Storm Identification, Tracking and Nowcasting

Storm Identification, Tracking and Nowcasting VA L L I A P PA L A K S H M A N A N N AT I O N A L S E V E R E S T O R M S L A B O R AT O R Y / U N I V E R S I T Y O F O K L A H O M A M A R C H 2 0 1 3 L A K S H M A N @ O U. E D U

Tuning an Algorithm for Identifying and Tracking Cells Introduction Storm Identification Storm Association Extracting Storm characteristics Estimating Motion Extrapolating Images WDSS-II w2segmotionll 2

Motivation? Why would you want to identify and track storms from radar images? 3

Why tracking or nowcast storms? Two key reasons: Extract information related to storm characteristics Useful for severe weather diagnosis and warnings Create short-term forecasts of radar images Useful for flash flood warnings Also useful as visualization for non-meteorologists Need to be clear which application you are interested in Nowcasts may be either of images or of storm characteristics Different techniques work better for these scenarios 4

Example: SCAN in United States 5

Example: COTREC in Switzerland 6

Storm Characteristics To track storm characteristics: Identify storms in radar images Associate storms in one image with storms in the previous image Pull out characteristics of the storms at both time steps Could involve non-radar data also Do statistical analysis or provide graphical output of trends Operational way to do this in the US NWS Storm Cell Identification and Tracking (SCIT) Johnson et. Al System for Convective Analysis and Nowcasting (SCAN) Part of AWIPS 7

Nowcasting Images To create extrapolation forecasts Estimate motion vector at each pixel of image Apply motion vector to extrapolate most current image Not done operationally by the NWS Extrapolation provides very little value to an expert forecaster But extrapolation forecasts are part of many systems: AutoNowCaster (NCAR, widely used around the world) Growth and Decay Tracker (MIT Lincoln Lab, used by FAA) WDSS-II (NSSL, used in US TV station graphics and internationally) CoTREC (Czech Cross-Correlation Tracker, used widely in Europe) Constantly reinvented Optical flow is a collection of well-known image processing techniques that are generally applicable to this problem e.g. Dark Sky, Rain Aware, etc. targeting mobile phones 8

Outline of Talk Discuss different methods of: Identifying storms Associating storms Extracting storm characteristics Estimating motion Extrapolating images The devil is in the details Real-time, quasi-operational systems involve specific choices of each of the above pieces Self-study slides on w2segmotionll, the WDSS-II storm identification/tracking/extrapolation algorithm Encourage you to download and try out (http://www.wdssii.org/) 9

Tuning an Algorithm for Identifying and Tracking Cells Introduction Storm Identification Storm Association Extracting Storm characteristics Estimating Motion Extrapolating Images WDSS-II w2segmotionll [self-study] 10

Identifying Storms Simple idea: Areas of high reflectivity Area = contiguous range gates (or pixels, if Cartesian) But before that: What field to use? Base reflectivity? Composite reflectivity? Should we use reflectivity? How about VIL? Reflectivity at -10C Need to quality control the data first Clutter, sun-spikes, biological echoes, etc. Smooth the data Which smoothing filter? 11

Single Threshold A single threshold is problematic Could be too high, and miss initiating storms Could be too low and storms end up being unrealistically large 50 dbz 20 dbz There is no perfect threshold Threshold just right Threshold too low 30 dbz Threshold too high

Hysteresis Storm = contiguous pixels above t2 that have at least one pixel above t1 Solves problem of initiating cells But not problem of overly large areas 13

Enhanced Watershed Storm = contiguous pixels above t2 that have at least one pixel above t1 t2 = highest value that will allow objects be at least N pixels 40 dbz 10 px 40 dbz 5 px 30 dbz 10 px 30 dbz 5 px 14

Multiscale Storms Can use this idea to get storms at multiple scales Storms found in this manner are hierarchical 15

Associating Storms Suppose you identify a set of storms in one image And have the set of storms identified in previous image Need to associate the two sets of storms Methods: Find centroids and associate the closest centroid Project the centroid at previous time to its new expected location Associate the closest centroid (within some sanity limits) Use cost function that weights distance and storm characteristics Minimize cost function: this is a linear optimization problem called the Munkres or Hungarian algorithm Greedy optimization Rank storms based on importance (so that stronger, longer-lived storms get associated first, for example) 17

Difficulties in storm association Misidentification Storm splits and merges 18

Storm characteristics Identified storm cells have an area Can extract statistics of other spatial grids within those areas Track properties over time Alternately, extract stats in neighborhood of centroid Things to think about: Is outline of object exact? Impact of using watershed approach on spatial properties Impact of using thresholds on initiation/intensification properties Problem of core vs. periphery Use properties + data mining to make predictions 20

21 Initiation Forecast

Estimating Motion Three ways to estimate motion: Object based: use object association to find velocity of each object Interpolate motion field between objects Works best for small objects; size changes problematic Optical flow Consider a rectangular window centered around each pixel Move window around in previous frame and maximize cross-correlation or some other measure of matching skill Can also be done using phase correlation Works well for large objects; subject to aperture problem Hybrid Identify objects and move object around in previous frame to minimize mean absolute error (does not depend on associating objects) 23

Extrapolating images Extrapolation of radar echoes Intuitive approach: put Take current echoes Move them forward in time (advect) by their motion field put is subject to holes in the created images get results in the wrong motion field being used Simple put followed by get Move echoes forward, fill in gaps via interpolation Results in ghost echoes Smarter approach: Advect the motion field, interpolating within gaps Now, do get using advected motion field 25

Example 26

Tuning an Algorithm for Identifying and Tracking Cells Introduction Storm Identification Storm Association Extracting Storm characteristics Estimating Motion Extrapolating Images WDSS-II w2segmotionll 27

segmotion is a general-purpose algorithm Segmentation + Motion Estimation Segmentation --> identifying parts ( segments ) of an image Here, the parts to be identified are storm cells segmotion consists of image processing steps for: Identifying cells Estimating motion Associating cells across time Extracting cell properties Advecting grids based on motion field segmotion is part of WDSS-II Can be downloaded from http://www.wdssii.org/ Can be applied to any uniform spatial grid Has been applied to radar, satellite, model grids, 2D/3D lightning Some applications interested only in advection; others only on attribute extraction 28

Try things out Download WDSS-II from http://www.wdssii.org/ Follow Quick Start to learn how to run algorithms Download radar data from NCDC Convert to netcdf using ldm2netcdf Use c option to create composite Run w2segmotioncg on ReflectivityComposite Experiment with various options as described on the following slides Try applying QC/smoothing to data first w2qcnn w2smooth (or k option to w2segmotioncg) 29

Vector quantization via K-Means clustering [1] Quantize the image into bands using K-Means Vector quantization because pixel value could be many channels Like contouring based on a cost function (pixel value & discontiguity) 30

Mathematical Description: Clustering Each pixel is moved among every available cluster and the cost function E(k) for cluster k for pixel (x,y) is computed as d E m xy Distance in measurement space (how similar are they?) ( k) d ( k) (1 ) d ( k) m, xy c, Discontiguity measure (how physically close are they?) n, xy ( k) k I d ( ) (1 ( )) xy c, xy k Sij k ij N xy xy Weight of distance vs. discontiguity (0 λ 1) Mean intensity value for cluster k Pixel intensity value Courtesy: Bob Kuligowski, NESDIS Number of pixels neighboring (x,y) that do NOT belong to cluster k 31

Tuning vector quantization (-d) The K in K-means is set by the data increment Large increments result in fatter bands Size of identified clusters will jump around more (addition/removal of bands to meet size threshold) Subsequent processing is faster Limiting case: single, global threshold Smaller increments result in thinner bands Size of identified clusters more consistent Subsequent processing is slower Extremely local maxima The minimum value determines probability of detection Local maxima less intense than the minimum will not be identified 32

Storm Cell Identification: Characteristics Cells grow until they reach a specific size threshold Cells are local maxima (not based on a global threshold) Optional: cells combined to reach size threshold 33

Enhanced Watershed Algorithm [2] Starting at a maximum, flood image Until specific size threshold is met: resulting basin is a storm cell Multiple (typically 3) size thresholds to create a multiscale algorithm 34

Tuning watershed transform (-d,-p) The watershed transform is driven from maximum until size threshold is reached up to a maximum depth 35

Cluster-to-image cross correlation [1] Pixels in each cluster overlaid on previous image and shifted The mean absolute error (MAE) is computed for each pixel shift Lowest MAE -> motion vector at cluster centroid Motion vectors objectively analyzed Forms a field of motion vectors u(x,y) Field smoothed over time using Kalman filters 36

Cluster-to-image cross correlation [1] The pixels in each cluster are overlaid on the previous image and shifted, and the mean absolute error (MAE) is computed for each pixel shift: Number of pixels in cluster k Summation over all pixels in cluster k MAE k ( x x, y y) 1 n k xy k I t ( x, y) I t t ( x x, y y) Intensity of pixel (x,y) at current time Intensity of pixel (x,y) at previous time To reduce noise, the centroid of the offsets with MAE values within 20% of the minimum is used as the basis for the motion vector. Courtesy: Bob Kuligowski, NESDIS 37

Motion Estimation: Characteristics Because of interpolation, motion field covers most places Optionally, can default to model wind field far away from storms The field is smooth in space and time Not tied too closely to storm centroids Storm cells do cause local perturbation in field 38

Unique matches; size-based radius; longevity; cost [4] Project cells identified at t n-1 to expected location at t n Sort cells at t n-1 by track length so that longer-lived tracks are considered first For each projected centroid, find all centroids that are within sqrt(a/pi) kms of centroid where A is area of storm If unique, then associate the two storms Repeat until no changes Resolve ties using cost fn. based on size, intensity or 39

Resolve ties using cost function Define a cost function to associate candidate cell i at t n and cell j projected forward from t n-1 as: Location (x,y) of centroid Area of cluster Peak value of cluster Max Magnitude For each unassociated centroid at t n, associate the cell for which the cost function is minimum or call it a new cell 40

Geometric, spatial and temporal attributes [3] Geometric: Number of pixels -> area of cell Fit each cluster to an ellipse: estimate orientation and aspect ratio Spatial: remap other spatial grids (model, radar, etc.) Find pixel values on remapped grids Compute scalar statistics (min, max, count, etc.) within each cell Temporal can be done in one of two ways: Using association of cells: find change in spatial/geometric property Assumes no split/merge Project pixels backward using motion estimate: compute scalar statistics on older image Assumes no growth/decay 41

Specifying attributes to extract (-X) Attributes should fall inside the cluster boundary C-G lightning in anvil won t be picked up if only cores are identified May need to smooth/dilate spatial fields before attribute extraction Should consider what statistic to extract Average VIL? Maximum VIL? Area with VIL > 20? Fraction of area with VIL > 20? Should choose method of computing temporal properties Maximum hail? Project clusters backward Hail tends to be in core of storm, so storm growth/decay not problem Maximum shear? Use cell association Tends to be at extremity of core 42

Interpolate spatially and temporally After computing the motion vectors for each cluster (which are assigned to its centroid, a field of motion vectors u(x,y) is created via interpolation: Motion vector for cluster k Sum over all motion vectors u( x, y) k k u k w w k k ( x, ( x, The motion vectors are smoothed over time using a Kalman filter (constant-acceleration model) y) y) w k ( x, y) Nk xy c Number of pixels in cluster k Euclidean distance between point (x,y) and centroid of cluster k k

Tuning motion estimation (-O) Motion estimates are more robust if movement is on the order of several pixels If time elapsed is too short, may get zero motion If time elapsed is too long, storm evolution may cause flat crosscorrelation function Finding peaks of flat functions is error-prone! 44

Preprocessing (-k) affects everything The degree of pre-smoothing has tremendous impact Affects scale of cells that can be found More smoothing -> less cells, larger cells only Less smoothing -> smaller cells, more time to process image Affects quality of cross-correlation and hence motion estimates More smoothing -> flatter cross-correlation function, harder to find best match between images 45

Evaluate advected field using motion estimate [1] Use motion estimate to project entire field forward Compare with actual observed field at the later time Caveat: much of the error is due to storm evolution But can still ensure that speed/direction are reasonable 46

Evaluate tracks on mismatches, jumps & duration Better cell tracks: Exhibit less variability in consistent properties such as VIL Are more linear Are longer Can use these criteria to choose best parameters for identification and tracking algorithm 47

WDSS-II w2segmotionll References Algorithm for identifying and tracking cells from spatial grids Identify cells using vector quantization and watershed transform [2] Estimate motion using cross-correlation and interpolation [1] Track cells: longevity, size-based search radius and cost function [4] Extract attributes: geometric, spatial and temporal [3] How to evaluate algorithm: Evaluate cell tracks using mismatches, jumps & duration [4] Evaluate motion field using advected products [1] 1) V. Lakshmanan, R. Rabin, and V. DeBrunner, ``Multiscale storm identification and forecast,'' J. Atm. Res., vol. 67, pp. 367-380, July 2003. 2) V. Lakshmanan, K. Hondl, and R. Rabin, ``An efficient, general-purpose technique for identifying storm cells in geospatial images,'' J. Ocean. Atmos. Tech., vol. 26, no. 3, pp. 523-37, 2009. 3) V. Lakshmanan and T. Smith, ``Data mining storm attributes from spatial grids,'' J. Ocea. and Atmos. Tech., vol. 26, no. 11, pp. 2353-2365, 2009 4) V. Lakshmanan and T. Smith, ``An objective method of evaluating and devising storm tracking algorithms,'' Wea. and Forecasting, vol. 25, no. 2, pp. 721-729, 2010. 48

WDSS-II Programs w2segmotionll Multiscale cell identification and tracking: this is the program that much of this talk refers to. w2advectorll w2scoreforecast Uses the motion estimates produced by w2segmotionll (or any other motion estimate, such as that from a model) to project a spatial field forward The program used to evaluate a motion field. This is how the MAE and CSI charts were created w2scoretrack The program used to evaluate a cell track. This is how the mismatch, jump and duration bar plots were created. 49