The STC for Event Analysis: Scalability Issues


Georg Fuchs, Gennady Andrienko
http://geoanalytics.net

Events
- Something [significant] happened somewhere, sometime
- Analysis goal and domain dependent, e.g.:
  - An object starts/stops moving
  - An object property changes
  - An earthquake with magnitude > 2 on the Richter scale

Visualization methods
- Animated and dynamic query maps
- Space-Time Cube (STC)

The Scalability Challenge: 53,000 events

Analysis of Spatially Distributed Events: Major Questions
- How are the events distributed in space?
  - at one particular moment in time, or
  - for all events that occurred over a time period
- How are the event occurrences distributed over time? E.g., how does the overall event frequency vary?
- How does the pattern of spatial distribution of the events change over time?
- How are the events distributed in space + time? Are there any spatio-temporal clusters?

Example: Earthquakes in the Marmara region (western Turkey and surroundings)
Data structure: <event identifier, position, time, {other attributes}>

Addressing the Scalability Challenge: Optimized Rendering?
[Figures: rendering with full opacity, with 50% transparency, and with 70% transparency]

Addressing the Scalability Challenge
Analysis methods
- Spatio-Temporal Aggregation

Spatio-temporal aggregation
- Reduces the object/rendering primitive count
- Spatial aggregation: by units of any territory division, e.g., cells of a regular grid
- Temporal aggregation: by time intervals
- Occlusion is still a problem, since ST-aggregates typically use larger glyphs (e.g., spheres) to convey the aggregated region + time interval!
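The aggregation idea can be sketched in a few lines of Python. This is only an illustration, not the tool used for the slides: the function name `aggregate_events`, the cell size, the interval length, and the sample coordinates are all assumptions made up for the example. Events are counted per regular grid cell and per time interval, so one glyph per non-empty (cell, interval) pair replaces many point primitives.

```python
from collections import Counter
from datetime import datetime, timedelta


def aggregate_events(events, cell_size_deg, interval, t0):
    """Count events per (grid column, grid row, time bin).

    events: iterable of (lon, lat, timestamp) tuples.
    cell_size_deg, interval and t0 are illustrative parameters.
    """
    counts = Counter()
    for lon, lat, t in events:
        col = int(lon // cell_size_deg)   # spatial aggregation: regular grid
        row = int(lat // cell_size_deg)
        tbin = (t - t0) // interval       # temporal aggregation: fixed intervals
        counts[(col, row, tbin)] += 1
    return counts


# Invented sample events (not taken from the earthquake data set).
sample_events = [
    (28.91, 40.72, datetime(1999, 8, 17)),
    (28.95, 40.74, datetime(1999, 8, 18)),
    (29.60, 40.10, datetime(1999, 8, 20)),
    (28.93, 40.71, datetime(1999, 11, 12)),
]
cells = aggregate_events(
    sample_events, 0.5, timedelta(days=30), datetime(1999, 1, 1)
)
```

Here four events collapse into three aggregates; the count stored per aggregate could drive glyph size or color in the cube.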

Addressing the Scalability Challenge
Analysis methods
- Spatio-Temporal Aggregation
- Event Density Calculation

Event Density Calculations
- In case of 2D maps: compute density surfaces
- [Figures: density surfaces for 1976, 1977, and 1978]
- Disclaimer: There are far more polished tools than the one used for these illustrations...

Event Density Calculations
- In case of 3D STC: is it worthwhile looking at volume visualization?
- [Figure: volume visualization example, credit: MathWorks]
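For the 3D case, the input to any volume visualization would be a scalar density field over (x, y, t). The following is a minimal sketch of how such a field could be computed, assuming an unnormalized Gaussian kernel with separate spatial and temporal bandwidths; the function name, the bandwidths, and the sample points are hypothetical, and the slides do not say which density estimator was used.

```python
import math


def density_volume(events, xs, ys, ts, h_space, h_time):
    """Evaluate an unnormalized Gaussian kernel density on an (x, y, t) grid.

    events: list of (x, y, t) tuples.
    xs, ys, ts: grid node coordinates along each axis.
    h_space, h_time: assumed kernel bandwidths in space and in time.
    Returns a dict mapping grid indices (i, j, k) to a density value.
    """
    volume = {}
    for i, gx in enumerate(xs):
        for j, gy in enumerate(ys):
            for k, gt in enumerate(ts):
                total = 0.0
                for ex, ey, et in events:
                    d2 = ((gx - ex) ** 2 + (gy - ey) ** 2) / h_space ** 2
                    d2 += (gt - et) ** 2 / h_time ** 2
                    total += math.exp(-0.5 * d2)
                volume[(i, j, k)] = total
    return volume


# Invented sample: two events close together in space-time, one far away.
sample_events = [(1.0, 1.0, 1.0), (1.1, 0.9, 1.2), (4.0, 4.0, 4.0)]
axis = [0, 1, 2, 3, 4]
volume = density_volume(sample_events, axis, axis, axis, h_space=1.0, h_time=1.0)
peak = max(volume, key=volume.get)
```

The brute-force loop is quadratic in grid size times event count, so for data of the size discussed here a binned or GPU-based variant would be needed; the sketch only shows what the voxel values mean.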

Addressing the Scalability Challenge
Analysis methods
- Spatio-Temporal Aggregation
- Event Density Calculation
- Spatio-Temporal Clustering

Event Distribution in Space-Time: Finding clusters in Space-Time
This is what we are interested in!

Event Distribution in Space-Time: Finding clusters in Space-Time
We see that all but one of the events occurred very close to each other. We can conclude that this is indeed a spatio-temporal cluster and, hence, there may be a relationship between these events.

Event Distribution in Space-Time: Finding clusters in Space-Time
We see that the events seem to split into two sequences with a certain time lapse between them.

Event Distribution in Space-Time: Automated Detection of ST Event Clusters
- The number of clusters must be known in advance.
- Returns convex-shaped clusters.
- Connects events within a certain distance threshold.
- Difficult to parametrize.
- Extracts arbitrarily shaped clusters.
- Doesn't require a priori specification of the number of clusters.

Density-based clustering algorithm

Event Distribution in Space-Time: Automated Detection of ST Event Clusters
Cluster detection using density-based clustering
Parameters:
- Spatial distance threshold = 10 km
- Temporal distance threshold = 30 days
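How the two thresholds interact can be illustrated with a small DBSCAN variant in which two events are neighbors only if they are within both the spatial and the temporal threshold. This is a sketch under assumptions: the slides give the 10 km and 30 day thresholds but not the minimum neighbor count, so `min_pts=3` is invented, as are the function names and the sample events. Great-circle distance is computed with the haversine formula.

```python
import math

NOISE = -1


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def st_neighbors(events, i, eps_km, eps_days):
    """Indices of events within both thresholds of event i (including i)."""
    lon, lat, day = events[i]
    return [
        j for j, (lo, la, d) in enumerate(events)
        if abs(d - day) <= eps_days and haversine_km(lon, lat, lo, la) <= eps_km
    ]


def st_dbscan(events, eps_km=10.0, eps_days=30.0, min_pts=3):
    """DBSCAN with a spatio-temporal neighborhood; min_pts is an assumed value."""
    labels = [None] * len(events)          # None = not visited yet
    cluster = 0
    for i in range(len(events)):
        if labels[i] is not None:
            continue
        seeds = st_neighbors(events, i, eps_km, eps_days)
        if len(seeds) < min_pts:
            labels[i] = NOISE               # may become a border point later
            continue
        labels[i] = cluster
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster         # border point: joins, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = st_neighbors(events, j, eps_km, eps_days)
            if len(more) >= min_pts:        # core point: expand the cluster
                seeds.extend(more)
        cluster += 1
    return labels


# Invented events as (lon, lat, day number); not real earthquake records.
sample_events = [
    (29.00, 40.70, 0), (29.02, 40.71, 5), (29.05, 40.72, 20),
    (29.01, 40.70, 200),   # same place, much later
    (31.00, 39.00, 3),     # same time, far away
]
labels = st_dbscan(sample_events)
```

The fourth and fifth events show why both thresholds matter: each is close to the cluster in one dimension only, so neither joins it.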

Event Distribution in Space-Time: Automated Detection of ST Event Clusters
Cluster detection using density-based clustering
Observations and caveats:
- The space-time cube reveals an interesting pattern: a west-east shift of cluster locations over the studied time period
- The number of detected clusters (108) exceeds the number of discernible colors, so different clusters are often colored very similarly

Automated Detection of ST Event Clusters: Scaling to extremely large event data (Extended DBSCAN)
- Density-based algorithms typically assume the entire data set fits into RAM at once
- This might not hold during initial explorative analysis, e.g., Flickr photo-taking: ~100,000,000 events
- Proposed scalability extension to DBSCAN (EuroVA '12):
  - Scalable to large datasets not fitting in RAM
  - Accounts for the spatio-temporal nature of the data
  - Improved execution time compared to DBSCAN

Extended DBSCAN: Spatio-temporal neighborhood parameters

Extended DBSCAN: Principal algorithm steps
1. Data is successively loaded from the database into main memory (RAM) in partially overlapping frames.
2. DBSCAN is applied to each frame independently, using the ST-neighborhood criterion.
3. When clustering is completed, the clusters of consecutive frames are merged.
4. After merging, the RAM occupied by old frames is released.
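The frame-wise steps can be sketched as follows. This is not the published algorithm, only an illustration under stated assumptions: frames are cut along the time axis only, `load_frame` stands in for a database query, clusters of consecutive frames are merged when they share at least one event in the overlap (the slides do not state the merge rule), and planar coordinates replace geographic ones. All names and parameter values are hypothetical.

```python
def neighbors(pts, i, eps_s, eps_t):
    """Indices of points within both thresholds of point i (including i)."""
    x, y, t = pts[i]
    return [j for j, (u, v, w) in enumerate(pts)
            if abs(w - t) <= eps_t and (u - x) ** 2 + (v - y) ** 2 <= eps_s ** 2]


def dbscan(pts, eps_s, eps_t, min_pts):
    """Plain in-memory DBSCAN on one frame; -1 marks noise."""
    label = [None] * len(pts)
    n_clusters = 0
    for i in range(len(pts)):
        if label[i] is not None:
            continue
        seeds = neighbors(pts, i, eps_s, eps_t)
        if len(seeds) < min_pts:
            label[i] = -1
            continue
        label[i] = n_clusters
        while seeds:
            j = seeds.pop()
            if label[j] == -1:
                label[j] = n_clusters       # border point
            if label[j] is not None:
                continue
            label[j] = n_clusters
            more = neighbors(pts, j, eps_s, eps_t)
            if len(more) >= min_pts:
                seeds.extend(more)
        n_clusters += 1
    return label


def find(parent, k):
    """Union-find root lookup with path halving."""
    while parent[k] != k:
        parent[k] = parent[parent[k]]
        k = parent[k]
    return k


def framed_dbscan(load_frame, t_min, t_max, frame_len, overlap,
                  eps_s, eps_t, min_pts):
    """Cluster frame by frame and merge clusters that share overlap events."""
    if overlap >= frame_len:
        raise ValueError("overlap must be smaller than frame_len")
    parent, assigned, prev = {}, {}, {}
    start, frame_no = t_min, 0
    while start <= t_max:
        frame = load_frame(start, start + frame_len)      # step 1: load frame
        labels = dbscan([(x, y, t) for _, x, y, t in frame],
                        eps_s, eps_t, min_pts)            # step 2: cluster it
        current = {}
        for (eid, _, _, _), lab in zip(frame, labels):
            if lab == -1:
                continue
            key = (frame_no, lab)
            parent.setdefault(key, key)
            current[eid] = assigned[eid] = key
            if eid in prev:                               # step 3: merge
                ra, rb = find(parent, prev[eid]), find(parent, key)
                if ra != rb:
                    parent[rb] = ra
        prev = current                                    # step 4: drop old frame
        start += frame_len - overlap
        frame_no += 1
    return {eid: find(parent, key) for eid, key in assigned.items()}


# Invented data as (id, x, y, t): a long chain, a compact cluster, one outlier.
DB = [(i, 0.0, 0.0, 2.0 * i) for i in range(11)]
DB += [(11, 50.0, 50.0, 1.0), (12, 50.0, 50.0, 2.0), (13, 50.0, 50.0, 3.0)]
DB += [(14, 100.0, 100.0, 10.0)]


def load_frame(t_start, t_end):
    """Stand-in for a database range query on the time attribute."""
    return [e for e in DB if t_start <= e[3] < t_end]


result = framed_dbscan(load_frame, 0.0, 20.0, frame_len=8.0, overlap=4.0,
                       eps_s=1.0, eps_t=3.0, min_pts=3)
```

In the sample, the chain of eleven events spans several frames, yet ends up as one cluster because each pair of consecutive frames shares clustered events in the overlap. The `assigned` dictionary keeps one entry per clustered event purely for the sake of a short example; a version meant for data that does not fit in RAM would write assignments back to the database instead.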

Extended DBSCAN: Merging process

Extended DBSCAN: Use for visual exploration
- The proposed algorithm can be used for visual analysis of large datasets.
- Example: 2 million points / 17,200 GPS tracks, collected in one week.
- Objective: detect traffic jams in the city and investigate the properties of the clusters.

Extended DBSCAN: Use for visual exploration
- Detection:
  - Spatio-temporal clusters of slow movement events
  - Remove noise (i.e., spurious slow movements)
- Investigation:
  - Temporal distribution of these traffic jams
  - Convex hulls/prism representation
  - Fewer objects/glyphs to visualize
  - Spatial and/or temporal zooming can be applied
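One way to obtain the convex hull/prism representation is to take the 2D convex hull of a cluster's event positions as the footprint and the cluster's time span as the prism height. The sketch below uses Andrew's monotone chain algorithm as an assumed choice (the slides do not name a hull algorithm); the function names and sample cluster are invented for illustration.

```python
def cross(o, a, b):
    """Z component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull by Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]      # drop duplicated end points


def cluster_prism(events):
    """Summarize a cluster of (x, y, t) events as (footprint, t_min, t_max)."""
    footprint = convex_hull([(x, y) for x, y, _ in events])
    times = [t for _, _, t in events]
    return footprint, min(times), max(times)


# Invented cluster: four corner events and one interior event.
sample_cluster = [(0, 0, 5), (4, 0, 7), (4, 3, 6), (0, 3, 9), (2, 1, 8)]
footprint, t_start, t_end = cluster_prism(sample_cluster)
```

A single prism with four footprint vertices replaces the five events here; for a traffic jam with thousands of slow-movement events the reduction in rendered primitives is correspondingly larger.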

Extended DBSCAN: Use for visual exploration
[Figure: convex hull cluster representation]

Extended DBSCAN: Use for visual exploration
[Figure: temporal zooming]

Extended DBSCAN: Future Work
- Combine temporal with spatial framing
- Dynamic frame sizes according to local density distribution
- Exploit the inherent parallelism of independent frame clustering

Executive Summary (Or: Why is that guy at this workshop?)
- The STC is a useful tool for event analysis
- One focus of interest: scalability of STC visualization and of the backing analysis methods
  - Improved rendering, data reduction (clustering), volume rendering(?)
- Strong interest in software engineering & rendering: would also like to exchange experiences on architectures, data structures, shader-based graphics pipelines + rendering engines!