The STC for Event Analysis: Scalability Issues Georg Fuchs Gennady Andrienko http://geoanalytics.net
Events Something [significant] happened somewhere, sometime Analysis goal and domain dependent, e.g. Object starts/stops moving, Object property changes, Earthquake with magnitude > 2 on Richter scale Visualization methods Animated and dynamic query maps Space-Time Cube (STC)
The Scalability Challenge 53,000 events
Analysis of Spatially Distributed Events: Major Questions How are the events distributed in space? At one particular time moment, or for all events that occurred over a time period. How are the event occurrences distributed over time? E.g., how does the overall event frequency vary? How does the pattern of spatial distribution of the events change over time? How are the events distributed in space + time? Are there any spatio-temporal clusters?
Example: Earthquakes in Marmara region (western Turkey and around) Data structure: <event identifier, position, time, {other attributes}>
Addressing the Scalability Challenge: Optimized Rendering? Full Opacity
Addressing the Scalability Challenge: Optimized Rendering? 50% Transparency
Addressing the Scalability Challenge: Optimized Rendering? 70% Transparency
Events Addressing the Scalability Challenge Something [significant] happened somewhere, sometime Analysis goal and domain dependent, e.g. Object starts/stops moving, Object property changes, Earthquake with magnitude > 2 on Richter scale Visualization methods Animated and dynamic query maps Space-Time Cube (STC) Analysis methods Spatio-Temporal Aggregation
Spatio-temporal aggregation Reduction of object/rendering primitive count Spatial aggregation: by units of any territory division E.g., cells of a regular grid Temporal aggregation: by time intervals Occlusion is still a problem since ST-aggregates typically use larger glyphs (e.g., spheres) to convey the aggregated region + time interval!
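The aggregation described above can be sketched in a few lines. This is a minimal illustration, not the tool used for the slides: the function name `aggregate_events`, the 0.5 degree cell size, the 30-day interval, and the sample events are all assumptions made up for the example.

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def aggregate_events(events, cell_size_deg, interval, t0):
    """Count events per (grid column, grid row, time bin).

    events: iterable of (lon, lat, timestamp) tuples.
    cell_size_deg: edge length of a regular grid cell, in degrees (assumed).
    interval: length of one time bin as a timedelta (assumed).
    t0: start of the first time bin.
    """
    counts = Counter()
    for lon, lat, t in events:
        col = math.floor(lon / cell_size_deg)   # spatial aggregation
        row = math.floor(lat / cell_size_deg)
        tbin = (t - t0) // interval             # temporal aggregation
        counts[(col, row, tbin)] += 1
    return counts


# Made-up sample events, for illustration only.
events = [
    (28.95, 40.72, datetime(2000, 1, 5)),
    (28.97, 40.74, datetime(2000, 1, 20)),
    (29.60, 40.10, datetime(2000, 1, 10)),
    (28.96, 40.73, datetime(2000, 3, 15)),
]
counts = aggregate_events(events, 0.5, timedelta(days=30), datetime(2000, 1, 1))
```

Each key of `counts` identifies one space-time aggregate, so the number of primitives to render is bounded by the number of occupied cells rather than by the number of events. The occlusion caveat still applies to the glyphs drawn for these aggregates.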
Events Addressing the Scalability Challenge Something [significant] happened somewhere, sometime Analysis goal and domain dependent, e.g. Object starts/stops moving, Object property changes, Earthquake with magnitude > 2 on Richter scale Visualization methods Animated and dynamic query maps Space-Time Cube (STC) Analysis methods Spatio-Temporal Aggregation Event Density Calculation
Event Density Calculations In case of 2D maps: compute density surfaces 1976 1977 1978 Disclaimer: There are far more polished tools than the one used for these illustrations...
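One common way to compute such a density surface is a Gaussian kernel density estimate evaluated on a regular grid, repeated once per time slice (e.g., per year). The sketch below assumes planar coordinates and a fixed bandwidth; the function name `density_surface` and the sample points are hypothetical and not taken from the tool behind the illustrations.

```python
import math


def density_surface(points, xs, ys, bandwidth):
    """Gaussian kernel density estimate at every grid node.

    points: list of (x, y) event positions in planar coordinates (assumed).
    xs, ys: grid node coordinates along each axis.
    bandwidth: kernel standard deviation, same unit as the coordinates.
    Returns a list of rows, one row per value in ys.
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    two_h2 = 2.0 * bandwidth ** 2
    surface = []
    for y in ys:
        row = []
        for x in xs:
            total = sum(
                math.exp(-((x - px) ** 2 + (y - py) ** 2) / two_h2)
                for px, py in points
            )
            row.append(norm * total)
        surface.append(row)
    return surface


# Made-up positions: three events close together and one isolated event.
points = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.3), (5.0, 5.0)]
surface = density_surface(points, xs=[0.0, 5.0], ys=[0.0, 5.0], bandwidth=1.0)
```

The same idea extends to the space-time cube by adding a temporal kernel term, which yields a 3D density volume instead of a 2D surface.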
Event Density Calculations In case of 3D STC: worthwhile looking at volume visualization??? MathWorks
Events Addressing the Scalability Challenge Something [significant] happened somewhere, sometime Analysis goal and domain dependent, e.g. Object starts/stops moving, Object property changes, Earthquake with magnitude > 2 on Richter scale Visualization methods Animated and dynamic query maps Space-Time Cube (STC) Analysis methods Spatio-Temporal Aggregation Event Density Calculation Spatio-Temporal Clustering
Event Distribution in Space-Time Finding clusters in Space-Time This is what we are interested in!
Event Distribution in Space-Time Finding clusters in Space-Time We see that all but one of the events occurred very close to each other. We can conclude that this is indeed a spatiotemporal cluster and, hence, there may be a relationship between these events.
Event Distribution in Space-Time Finding clusters in Space-Time We see that the events seem to split into two sequences with a certain time lapse between them
Event Distribution in Space-Time Automated Detection of ST Event Clusters The number of clusters must be known in advance. Returns convex-shaped clusters. Connection between events with a certain distance threshold. Difficult to parametrize. Extracts arbitrarily shaped clusters. Doesn't require a priori specification of the number of clusters.
Density based Clustering Algorithm
Event Distribution in Space-Time Automated Detection of ST Event Clusters Cluster detection using density-based clustering Parameters: spatial distance threshold = 10 km, temporal distance threshold = 30 days
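A density-based clustering with separate spatial and temporal thresholds can be sketched as plain DBSCAN with a spatio-temporal neighborhood test. The version below is an illustrative sketch, not the implementation behind the slides: the haversine distance, the brute-force neighbor search, and the `min_pts` value are assumptions (the slide states only the 10 km and 30 day thresholds), and the sample events are made up.

```python
import math
from collections import deque

NOISE = -1


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km (assumed distance measure)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def st_neighbors(events, i, eps_km, eps_days):
    """Indices of events within both thresholds of event i (includes i)."""
    lon, lat, t = events[i]
    return [
        j for j, (lo, la, tj) in enumerate(events)
        if abs(tj - t) <= eps_days and haversine_km(lon, lat, lo, la) <= eps_km
    ]


def st_dbscan(events, eps_km=10.0, eps_days=30.0, min_pts=3):
    """events: list of (lon, lat, t_days). Returns one label per event."""
    labels = [None] * len(events)
    cluster = 0
    for i in range(len(events)):
        if labels[i] is not None:
            continue
        neigh = st_neighbors(events, i, eps_km, eps_days)
        if len(neigh) < min_pts:          # min_pts is an assumed parameter
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = deque(neigh)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster       # border point: joins, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = st_neighbors(events, j, eps_km, eps_days)
            if len(nj) >= min_pts:        # core point: keep expanding
                queue.extend(nj)
        cluster += 1
    return labels


# Made-up events: (lon, lat, day number).
events = [
    (29.00, 40.70, 0), (29.02, 40.71, 5), (29.03, 40.69, 12),
    (29.01, 40.70, 200),   # same place, much later
    (31.00, 39.00, 3),     # same time, far away
]
labels = st_dbscan(events)
```

An event must be close in both space and time to count as a neighbor, so an event at the same location months later, or a simultaneous event far away, stays outside the cluster. The brute-force neighbor search is quadratic; a spatial or temporal index would be needed for data of the size discussed here.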
Event Distribution in Space-Time Automated Detection of ST Event Clusters Cluster detection using density-based clustering Observations and caveats: The space-time cube reveals an interesting pattern: a west-east shift of cluster locations over the studied time period. The number of detected clusters (108) exceeds the number of discernible colors, so different clusters are often colored very similarly.
Automated Detection of ST Event Clusters Scaling to extremely large event data Extended DBSCAN Density-based algorithms typically assume the entire data set fits into RAM at once. This might not hold during initial explorative analysis, e.g., Flickr photo-taking: ~100,000,000 events. Proposed scalability extension to DBSCAN (EuroVA '12): Scalable to large datasets not fitting in RAM. Accounts for the spatiotemporal nature of the data. Improved execution time compared to DBSCAN.
Extended DBSCAN Spatio-temporal neighborhood parameters
Extended DBSCAN Principal algorithm steps Data is successively loaded into RAM in partially overlapping frames Database
Extended DBSCAN Principal algorithm steps DBSCAN is applied to each frame independently using ST-neighborhood criterion Database Main Memory: RAM
Extended DBSCAN Principal algorithm steps When clustering is completed, the clusters of consecutive frames are merged. Database Main Memory: RAM
Extended DBSCAN Principal algorithm steps After merging, RAM occupied by old frames is released. Database Main Memory: RAM Database
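The steps above (load overlapping frames, cluster each frame independently, merge clusters of consecutive frames, release old frames) can be sketched as follows. This is a rough illustration under several assumptions that the slides do not state: frames are taken to be time windows, clusters are merged when they share at least one event in the overlap, and a simple time-gap grouping (`cluster_by_time_gap`, hypothetical) stands in for the per-frame DBSCAN run. The in-memory list also stands in for the database.

```python
def iter_frames(events, frame_len, overlap):
    """Yield partially overlapping time windows over (event_id, t) pairs.

    events must be non-empty and sorted by t. The list comprehension
    stands in for a range query against the database.
    """
    step = frame_len - overlap
    t_start, t_end = events[0][1], events[-1][1]
    while True:
        yield [e for e in events if t_start <= e[1] < t_start + frame_len]
        if t_start + frame_len > t_end:
            break
        t_start += step


def cluster_by_time_gap(frame, eps, min_size=2):
    """Placeholder for per-frame DBSCAN: chain events whose gap is <= eps."""
    clusters, current = [], []
    for eid, t in frame:
        if current and t - current[-1][1] > eps:
            if len(current) >= min_size:
                clusters.append({e for e, _ in current})
            current = []
        current.append((eid, t))
    if len(current) >= min_size:
        clusters.append({e for e, _ in current})
    return clusters


def cluster_in_frames(events, frame_len, overlap, eps):
    """Cluster frame by frame and merge clusters that share an event."""
    merged = []                               # list of sets of event ids
    for frame in iter_frames(events, frame_len, overlap):
        for c in cluster_by_time_gap(frame, eps):
            for m in [m for m in merged if m & c]:
                c |= m                        # assumed merge criterion
                merged.remove(m)
            merged.append(c)
        # 'frame' is dropped on the next iteration, releasing its memory
    return merged


# Made-up event times; ids are list positions.
times = [0, 1, 2, 8, 9, 10, 11, 12, 30, 31]
events = list(enumerate(times))
clusters = cluster_in_frames(events, frame_len=10, overlap=4, eps=2)
result = sorted(sorted(c) for c in clusters)
```

In this example the events at times 8 to 12 straddle the boundary between the first two frames; the partial cluster found in the first frame is merged with the fuller one found in the second because both contain the events in the overlap. Choosing the overlap at least as large as the temporal neighborhood threshold is a plausible requirement for this to work, but that is an assumption here as well.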
Extended DBSCAN Merging process
Extended DBSCAN Use for visual exploration The proposed algorithm can be used for visual analysis of large datasets. 2 million points / 17,200 GPS tracks, collected in one week. Objective: Detect traffic jams in the city. Investigate the properties of the clusters.
Extended DBSCAN Use for visual exploration Detection: Spatio-temporal clusters of slow movement events Remove noise (i.e., spurious slow movements) Investigation: Temporal distribution of these traffic jams Convex hulls/prism representation Fewer objects/glyphs to visualize Spatial and/or temporal zooming can be applied
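A convex hull/prism representation replaces all events of a cluster with one footprint polygon extruded over the cluster's time span. The sketch below uses Andrew's monotone chain algorithm for the hull; the function names, the planar coordinates, and the sample cluster are assumptions for illustration, not details taken from the presented system.

```python
def convex_hull(points):
    """Convex hull of 2D points (monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def cluster_prism(events):
    """events: list of (x, y, t) belonging to one cluster.

    Returns (footprint polygon, t_min, t_max): the polygon extruded
    from t_min to t_max is the prism drawn in the space-time cube.
    """
    hull = convex_hull([(x, y) for x, y, _ in events])
    times = [t for _, _, t in events]
    return hull, min(times), max(times)


# Made-up cluster of five slow-movement events.
cluster = [(0, 0, 1), (2, 0, 3), (2, 2, 2), (0, 2, 5), (1, 1, 4)]
footprint, t_min, t_max = cluster_prism(cluster)
```

One prism per cluster means the number of rendered objects depends on the number of clusters, not on the number of events, which is what makes the representation attractive for the large datasets mentioned above.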
Extended DBSCAN Use for visual exploration convex hull cluster representation
Extended DBSCAN Use for visual exploration temporal zooming
Extended DBSCAN Future Work Combine temporal with spatial framing Dynamic frame sizes according to local density distribution Exploit inherent parallelism of independent frame clustering
Executive Summary Or: Why is that guy at this workshop? STC useful tool for event analysis One focus of interest: scalability of STC visualization and backing analysis methods Improved rendering, data reduction (clustering), volume rendering(?) Strong interest in software engineering & rendering: would also like to exchange experiences on architectures, data structures, shader-based graphics pipelines + rendering engines!