NetSurv & Data Viewer



Similar documents
New Tools for Spatial Data Analysis in the Social Sciences

Introduction to Exploratory Data Analysis

Spatial Analysis with GeoDa Spatial Autocorrelation

Geostatistics Exploratory Analysis

Spatial Data Analysis Using GeoDa. Workshop Goals

Data Preparation and Statistical Displays

Diagrams and Graphs of Statistical Data

Norwegian Satellite Earth Observation Database for Marine and Polar Research USE CASES

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

The Comparisons. Grade Levels Comparisons. Focal PSSM K-8. Points PSSM CCSS 9-12 PSSM CCSS. Color Coding Legend. Not Identified in the Grade Band

Using multiple models: Bagging, Boosting, Ensembles, Forests

EXPLORING SPATIAL PATTERNS IN YOUR DATA

NC Public Health and Cancer - Trends for 2014

GIS Initiative: Developing an atmospheric data model for GIS. Olga Wilhelmi (ESIG), Jennifer Boehnert (RAP/ESIG) and Terri Betancourt (RAP)

430 Statistics and Financial Mathematics for Business

Advances in the Application of Geographic Information Systems (GIS) Carmelle J. Terborgh, Ph.D. ESRI Federal/Global Affairs

Information visualization examples

Introduction to Data Visualization

STAT355 - Probability & Statistics

A Correlation of. to the. South Carolina Data Analysis and Probability Standards

How To Write A Data Analysis

Shuming Bao. Spatial Explorer of Religions and Society - Data, Methodology and Technology. China Data Center University of Michigan

COMMON CORE STATE STANDARDS FOR

IC05 Introduction on Networks &Visualization Nov

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

HIGH PERFORMANCE ANALYTICS FOR TERADATA

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

A Tutorial on Color Symbolization and Data Classification for Mapping and Visualization

Today's Topics. COMP 388/441: Human-Computer Interaction. simple 2D plotting. 1D techniques. Ancient plotting techniques. Data Visualization:

The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA

ANALYTICS IN BIG DATA ERA

39 TerraFly GeoCloud: An Online Spatial Data Analysis and Visualization System

Monday 28 January 2013 Morning

5 Correlation and Data Exploration

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Data Visualization. Scientific Principles, Design Choices and Implementation in LabKey. Cory Nathe Software Engineer, LabKey

South Carolina College- and Career-Ready (SCCCR) Probability and Statistics

Create Cool Lumira Visualization Extensions with SAP Web IDE Dong Pan SAP PM and RIG Analytics Henry Kam Senior Product Manager, Developer Ecosystem

VISUALIZING SPACE-TIME UNCERTAINTY OF DENGUE FEVER OUTBREAKS. Dr. Eric Delmelle Geography & Earth Sciences University of North Carolina at Charlotte

Data exploration with Microsoft Excel: analysing more than one variable

Simple linear regression

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Data Exploration Data Visualization

Utilizing Public Data to Successfully Target Population for Prevention

Univariate Regression

How To Understand The History Of Navigation In French Marine Science

Visualizing Data from Government Census and Surveys: Plans for the Future

Time series analysis in loan management information systems

Exercise 1.12 (Pg )

GeoDa 0.9 User s Guide

Data Mining mit der JMSL Numerical Library for Java Applications

Correlation and Regression

Visualisation in the Google Cloud

Session 15 OF, Unpacking the Actuary's Technical Toolkit. Moderator: Albert Jeffrey Moore, ASA, MAAA

Harvard Data Visualization Project

CHARTS AND GRAPHS INTRODUCTION USING SPSS TO DRAW GRAPHS SPSS GRAPH OPTIONS CAG08

Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures

Competency 1 Describe the role of epidemiology in public health

Workflow Requirements (Dec. 12, 2006)

Regression and Correlation

Interactive Data Mining and Visualization

Epidemiological Data Analysis in TerraFly Geo-Spatial Cloud

Iris Sample Data Set. Basic Visualization Techniques: Charts, Graphs and Maps. Summary Statistics. Frequency and Mode

Lab 7. Exploratory Data Analysis

Visibility optimization for data visualization: A Survey of Issues and Techniques

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

Bangladesh Water Supply and Sanitation Open data Monitoring Platform

Data Mining Applications in Higher Education

Brett Gaines Senior Consultant, CGI Federal Geospatial and Data Analytics Lead Developer

Extracting correlation structure from large random matrices

Learning outcomes. Knowledge and understanding. Competence and skills

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs

Data Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining

Information & Data Visualization. Yasufumi TAKAMA Tokyo Metropolitan University, JAPAN ytakama@sd.tmu.ac.jp

Common Tools for Displaying and Communicating Data for Process Improvement

Data Visualization Handbook

TerraLib as an Open Source Platform for Public Health Applications. Karine Reis Ferreira

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

3D Interactive Information Visualization: Guidelines from experience and analysis of applications

What Does a Personality Look Like?

Visual Analytics Law Enforcement Toolkit

Large Data Visualization using Shared Distributed Resources

Predictive Analytics Powered by SAP HANA. Cary Bourgeois Principal Solution Advisor Platform and Analytics

PROGRAM DIRECTOR: Arthur O Connor Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements

Improved metrics collection and correlation for the CERN cloud storage test framework

Transcription:

NetSurv & Data Viewer Prototype space-time analysis and visualization software from TerraSeer Dunrie Greiling, TerraSeer Inc. TerraSeer Software sales BoundarySeer for boundary detection and analysis ClusterSeer for disease cluster detection SpaceStat for spatial regression modeling Training Short courses Custom development June 2003 1

BioMedware TerraSeer s R&D partner developed BoundarySeer and ClusterSeer NIH/NCI SBIR funding Selection from current projects NetSurv distributed disease surveillance software Cancer Atlas Viewer spatio-temporal visualization of the National Cancer Mortality Atlas DataViewer under construction BioMedware TerraSeer s R&D partner developed BoundarySeer and ClusterSeer SBIR funding Selection from current projects NetSurv distributed disease surveillance software Cancer Atlas Viewer spatio-temporal visualization of the National Cancer Mortality Atlas DataViewer under construction June 2003 2

NetSurv Project Provide decision support and monitoring tools that will enhance existing disease surveillance systems and support timely analysis, policy formulation, and public health actions Surveillance Continuous and systematic process of collection, analysis, and interpretation of information for monitoring health problems Ongoing monitoring of temporal and spatial disease trends June 2003 3

NIH SBIR grants Small Business Innovation Research Phase I Evaluate scientific and technical merit and feasibility of an idea (6 months) Phase II Expand on the results and further pursue the development of Phase 1 (2 years) NetSurv: Phase I Provide CuSum technique (Hutwagner et al 1997) for monitoring temporal trends, providing direct access to a surveillance database and graphical display of results access to single dataset June 2003 4

CuSum Technique Cumulative sum over time, of the differences between observed case counts and a reference/baseline value Differences are added together and plotted on graph over time Magnifies small, abrupt change which are too small to be visible in conventional graphical plots of a fluctuating series of data NetSurv: Phase I CuSum technique (Hutwagner et al 1997) Distributed system Web browser interface thin client June 2003 5

Multi-Tier Distributed Apps Windows Mac Unix Browser Thin Clients NetSurv Components CuSum Method Security Manager Database Broker Application Server surveillance DBMS Server census Web Server Interface screenshot June 2003 6

NetSurv phase I results Web-based interface difficult, not user friendly difficult: interface complex, difficult to implement not user friendly: mapping, graphing slow, interface static not dynamic BioMedware TerraSeer s R&D partner developed BoundarySeer and ClusterSeer SBIR funding Selection from current projects NetSurv distributed disease surveillance software Cancer Atlas Viewer spatio-temporal visualization of the National Cancer Mortality Atlas DataViewer under construction June 2003 7

Motivation for Cancer Atlas Viewer Provide real-time visualization of the National Cancer Mortality Atlas Data Provide statistics for spatial, temporal, and space-time evaluation of Atlas data Explore general STIS specifications with a specific example June 2003 8

Real Time Interaction Avoid the world wide wait June 2003 9

Real Time Interaction Provide more flexible access to the data. Concurrency issues June 2003 10

Downloading Data Real Time Interaction Provide linked views that you can brush for interactive data exploration Map Scatterplot Box plot Histogram Table June 2003 11

Space-Time Viz Slideshow Group of maps with a common legend Provide Statistics Standardization Z-score LISA Univariate spatial contagion Bivariate space-time contagion Cluster persistence June 2003 12

Moran s I Global statistic 1 value for entire dataset Spatially weighted correlation coefficient Range ~ (-1, 1) Moran, P.A.P. 1950. Notes on continuous stochastic phenomena. Biometrika 37: 17-23. Calculation of LISA s 1. Standardize data as z-score = (x i µ x )/ var(x) ½ z i 2. Calculate LISA statistics (Anselin, 1995) local statistic, 1 value for every location I i = z i Σ w ij z j 3. Evaluate significance of LISA statistics via Monte Carlo randomization June 2003 13

The Moran Scatter Plot Graphs the values (z i ) of each area versus the average of its neighbors Σ w ij z j Has four quadrants that display high-high and low-low clusters, and high-low and low-high outliers Local Clustering (LISA) June 2003 14

Mask Sparse Data Count < 6 Analyze Masked Datasets June 2003 15

Provide Statistics Standardization Z-score LISA Univariate spatial contagion Bivariate space-time contagion Cluster persistence Long Term Include other statistics ClusterSeer temporal, spatial, spatio-temp, & surveillance methods BoundarySeer edge detection (wombling), classification (fuzzy, spatiallyconstrained) Other change detection Provide open interface for user-scripted methods Python June 2003 16

Long Term Open to other data (more general product) Currently - Adding visualization of points moving through time modeling individuals movements Interested in applying to infectious disease spread humans plant pathogen amphibians Back to NetSurv Replace static web-based interface with more interactive Atlas/Data Viewer like interface June 2003 17

NetSurv phase II Retain attention to data concurrency web access to download data check for updates Retain attention to permissions/privacy concerns Pull down data and then do analysis on local machine avoids world-wide-wait for mapping, graphing Long term plans for NetSurv Atlas-like interface Custom statistics for surveillance applications User-programmed in Python Interact with existing web data repositories DataWeb Census Geographic data plus provide room for custom/non-public data repositories June 2003 18

Acknowledgments NetSurv was funded by a grant from the National Cancer Institute and the National Library of Medicine to BioMedware, Inc. The Cancer Atlas software was funded by a grant from the National Cancer Institute to BioMedware, Inc. June 2003 19