Highly Scalable Tile-Based Visualization for Exploratory Data Analysis

Similar documents
Create Mobile, Compelling Dashboards with Trusted Business Warehouse Data

BIG Data Analytics Move to Competitive Advantage

Data Visualization Techniques

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

How To Create A Data Science System

VisualCalc AdWords Dashboard Indicator Whitepaper Rev 3.2

Beyond Web Application Log Analysis using Apache TM Hadoop. A Whitepaper by Orzota, Inc.

Esri Maps for Salesforce and Microsoft Dynamics CRM

Data Visualization Techniques

A Workflow for Creating and Sharing Maps

From Spark to Ignition:

Making the Most of Your Enterprise Reporting Investment 10 Tips to Avoid Costly Mistakes

User Guide. Sales Performance App

Better Business Analytics with Powerful Business Intelligence Tools

<no narration for this slide>

Oracle Big Data Discovery The Visual Face of Hadoop

IBM Cognos Insight. Independently explore, visualize, model and share insights without IT assistance. Highlights. IBM Software Business Analytics

Copyright 2013 Splunk Inc. Introducing Splunk 6

2.0 COMMON FORMS OF DATA VISUALIZATION

Big Data and Analytics: Getting Started with ArcGIS. Mike Park Erik Hoel

Ernesto Ongaro BI Consultant February 19, The 5 Levels of Embedded BI

Developing Business Intelligence and Data Visualization Applications with Web Maps

CRGroup Whitepaper: Digging through the Data. Reporting Options in Microsoft Dynamics GP

Location Analytics for Financial Services. An Esri White Paper October 2013

Consuming Real Time Analytics and KPI powered by leveraging SAP Lumira and SAP Smart Business in Fiori SESSION CODE: 0611 Draft!!!

Delivering secure, real-time business insights for the Industrial world

BIRT ihub Actuate Customer Days. Wow that looks good! Jeff Morris & Mark Gamble

Banking Industry Performance Management

Oracle Big Data Discovery Unlock Potential in Big Data Reservoir

The Future of Business Analytics is Now! 2013 IBM Corporation

Big Data Visualization and Dashboards

P6 Analytics Reference Manual

GeoManitoba Spatial Data Infrastructure Update. Presented by: Jim Aberdeen Shawn Cruise

Ad Hoc Analysis of Big Data Visualization

How To Create A Graph On An African Performance Monitor (Dvt)

NStreamAware: Real-Time Visual Analytics for Data Streams to Enhance Situational Awareness

XpoLog Competitive Comparison Sheet

SkySpark Tools for Visualizing and Understanding Your Data

Business Intelligence Solutions for Gaming and Hospitality

Embedding Customized Data Visualization and Analysis

ATLAS CARTOGRAPHIC TECHNOLOGIES LTD. (ATLASCT) Dedicated Geo-Server. Business Proposal

From Higgs to Healthcare Big Data for better business and society! Egge van der Poel Parttime Clinical Data Erasmus MC, Parttime

CHAPTER. Monitoring and Diagnosing

Adobe Insight, powered by Omniture

The Cisco CMX Analytics Service

How To Use Big Data For Telco (For A Telco)

Spotfire and Tableau Positioning. Summary

TABLEAU COURSE CONTENT. Presented By 3S Business Corporation Inc Call us at : Mail us at : info@3sbc.com

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Izenda & SQL Server Reporting Services

XpoLog Center Suite Log Management & Analysis platform

Working with The Matrix Portal: Auto

Why Big Data Analytics?

SAP Best Practices for SAP S/4HANA Scope 1511 Cloud Marketing Edition Business Priority view. last update:

Apigee Insights Increase marketing effectiveness and customer satisfaction with API-driven adaptive apps

PROVIDING INSIGHT FOR OPERATIONAL SUCCESS

Big Data and Analytics: A Conceptual Overview. Mike Park Erik Hoel

TOP New Features of Oracle Business Intelligence 11g

Business Intelligence & Product Analytics

Google Analytics in the Dept. of Medicine

ALTAIR SOFTWARE ASSET OPTIMIZATION USER GUIDE

Big Data for Investment Research Management

Understanding BI. The Top 10 Business Questions That Drive Your BI Technology Requirements

Reporting. Understanding Advanced Reporting Features for Managers

Transforming the Telecoms Business using Big Data and Analytics

Safe Harbor Statement

SYNTHESIO PRODUCT GUIDE THE EXPERIENCE RELEASE

Developing Fleet and Asset Tracking Solutions with Web Maps

Prepare Data ArcMap Open Trees.mpk Chrome ArcGIS Online New Map Chrome Esri Blogs Chrome Local Government Resources Reflector Phone Connected

Solving big data problems in real-time with CEP and Dashboards - patterns and tips

What s New in JReport 13.1

Oracle Big Data Spatial and Graph

Intelligent Process Management & Process Visualization. TAProViz 2014 workshop. Presenter: Dafna Levy

Information-Driven Apps. The evolution of the dashboard.

BI SURVEY. The world s largest survey of business intelligence software users

Where is... How do I get to...

Oracle BI 11g R1: Create Analyses and Dashboards

Geo Analysis, Visualization and Performance with JReport 13

Microsoft Business Intelligence

How To Create An Insight Analysis For Cyber Security

How to build Dashboard - Step by Step tutorial/recipe

Accelerate Business Intelligence Adoption with Interactive, Mobile Dashboards

Why does Location Analytics give you a competitive advantage? WHITE PAPER

The Big Data Paradigm Shift. Insight Through Automation

Microsoft 311 Service Center accelerator Demo Script

DRIVING INNOVATION THROUGH DATA ACCELERATING BIG DATA APPLICATION DEVELOPMENT WITH CASCADING

How To Turn Big Data Into An Insight

Process Portal Hands-on Exercise

Embedded BI made easy

Leveraging Machine Data to Deliver New Insights for Business Analytics

Hunting for the Undefined Threat: Advanced Analytics & Visualization

SuperViz: An Interactive Visualization of Super-Peer P2P Network

With business intelligence, we create a learning organization that adapts quickly to market changes and stays one step ahead of the competition.

Table of Contents Find the story within your data

Data Isn't Everything

GAZETRACKERrM: SOFTWARE DESIGNED TO FACILITATE EYE MOVEMENT ANALYSIS

Tax Fraud in Increasing

Session 805 -End-to-End SAP Lumira: Desktop to On-Premise, Cloud, and Mobile

SAS BI Dashboard 4.4. User's Guide Second Edition. SAS Documentation

General overview, and sources and uses of Big Data for urban and regional analysis

Transcription:

Highly Scalable Tile-Based Visualization for Exploratory Data Analysis Strata NY: Hadoop and Beyond, 10/17/2014 David Jonker, Rob Harper 2014 OCULUS INFO INC.

Making Sense of Big Data Big Data for us is in complexity. We have complex data. We have big complexity. Analyst No one wants more data, everyone wants better data. - Analyst Oversimplified Less Time Required to acquire insights Dashboard Basic charts assembled through drag and drop Report Static Snapshot?" Slice and Dice Unstructured toolbox of widgets for data exploration More Information Available To make an effective decision Overwhelming 2014 OCULUS INFO INC. 2"

Challenges of Effective Visual Analytics ESSENTIAL o Richly informative and true to reality. o Answers easily arrived at, easily understood. ALSO, every business problem is unique, but cannot afford an entirely unique solution for every problem. o Need relatively universal, repeatable technical approaches. 2014 OCULUS INFO INC. 3"

Tile-Based Visualization for Big Data WEB MAPPING APPS have an established model for intuitively navigating and understanding big data, based on zoomable multiresolution tiles and layers. BUT geospatial data is relatively static. How can a tiled, layered approach be applied with dynamic data, and non-geospatial problems, at scale? 2014 OCULUS INFO INC. 4"

Aperture Tiles TILE-BASED VISUAL ANALYTICS o Hierarchical data tiling using cluster computing. o Interactive on-demand image tile generation. o Layers of raw data and derivative analytics. OPEN SOURCE o Oculus research product. o Built on Apache Spark and Hadoop. o Preparing for version one product release. 2014 OCULUS INFO INC. 5"

Example Applications Geospatial Social Media Biomedical Financial 2014 OCULUS INFO INC. 6"

187,000,000 NYC Taxi Trips Photo by Sakeeb Sabakka, CC

NYC Taxi Data FOIL request by Chris Whong, March 2014 Taxi ID Hack, Medallion, Vendor Origin Location, Time Dest Location, Time Trip Duration, Distance, # Passengers Fare $ Base, Tip, Tolls, Taxes, Payment method 2014 OCULUS INFO INC. 8"

Visualizing NYC Taxi Data Eric Fischer, MapBox www.mapbox.com/blog/nyc-taxi/

Demo Continue presentation

Location data errors GPS blurring likely caused by tall buildings Pick-up/drop-off GPS fixes enroute Visualizing all of the data including bad data helps identify source of errors, understand how errors manifest, and how they may affect analysis

Spanish Harlem Brooklyn Brooklyn Comparing Pick-up and Drop-off Locations Red Pick-up Locations Blue Drop-off Locations Multi-scale visualization allows exploration of macro and micro trends.

Average Tip % by Pick-up Location* Red ~15% Green ~20% Yellow ~25% LaGuardia passengers tip more than those from JFK? *CC payments only Downtown bankers tip less than mid-town shoppers? Using a spectral color ramp helps us compare differences across the data (luminance better for multi-series). Tiles uses aggregate data to generate rasters, not precomputed images

Easily switching between fully zoomable multi-scale layers makes the task of understanding the data easier. Very short ride anomaly in upper east side exposed by range filtering hospital sending patients on a short ride to clinic? Roles reversed Distance Red = 1m Yellow = 18m

Average $/hr by Pick-up Location Red $30/hr Yellow $75/hr Calculated using: (fare + tip) / (duration of trip + time until next fare) Hot zones at ends of tunnels due to GPS fix lag observed earlier? Profitability correlates with pickup likelihood on one-way streets Brooklyn

Up until now have focused on pick-up location, what about drop-off? Average $/hr by Drop-off Location Red $30/hr Yellow $60/hr Calculated using: (fare + tip) / (duration of trip + time until next fare)

Average annual taxi income by pick-up location Red $80k Yellow $100k

Fare Total vs Time Feb 2013 Nor`easter storm Expensive fares drop off by low fares unaffected Tiles approach isn t limited to geographic plots X-Y crossplots can show 3 arbitrary dimensions and expose macro and micro patterns. e.g. Time + Fare Total + Frequency

World Cup Tweets player @mentions Photo by paulisson miura, CC

100M geolocated tweets during the World Cup

Analytic overlays showing @mention summaries

Team and Player @mention sentiment Interactive overlays

Tiles from 10,000

Three Tiers Source Data Tile Generation Tiled Data Views Tile Service Query results Tile Client 2014 OCULUS INFO INC. 24"

Tile Generation Source Data Tile Generation Tiled Data Views 2014 OCULUS INFO INC. 25"

Tiled Data Views Aggregate views of source data designed for answering analytic questions Optimized for query speed Indexed by tile key - one view/row per tile AVRO / Thrift 2014 OCULUS INFO INC. 26"

Tiled Data Views Taxi Tip % 16% 17% 14% 10% 18% 16% 17% 20% 14% 17% 10% 15% 16% 12% 17% 14% 18% 20% 17% 15% 12% 21% 22% 21% 19% 14% 20% 20% 21% 17% 17% 15% 15% 20% 21% 17% 15% 12% 21% 19% 18% 16% 17% 14% 10% 21% 18% 20% 17% 15% 21% 22% 21% 19% 14% 21% 19% 18% 21% 22% 21% 19% 21% 19% 18% 14% 13% 16% 17% 14% 10% 14% 18% 20% 17% 15% 20% 21% 22% 21% 19% 16% 20% 21% 17% 15% 18% 21% 19% 18% 18% 19% 18% 20% 17% 15% 21% 22% 21% 20% 21% 21% 19% 2014 OCULUS INFO INC. 27"

Tiled Data Views World Cup Sentiment Top Mentions: [ ] Sentiment Values: [ ] Top Mentions: [ ] Sentiment Values: [ ] Tweets per minute: [ ] Top Mentions: [ ] Tweets per minute: [ ] Top Mentions: [ ] Sentiment Values: [ ] Tweets per minute: [ ] Sentiment Values: [ ] Tweets per minute: [ ] 2014 OCULUS INFO INC. 28"

Tile Generation 16% 17% 14% 10% query 18% 20% 17% 15% 12% 21% 22% 21% 19% 14% 20% 21% 17% 15% Tile Service 21% 19% 18% 2014 OCULUS INFO INC. 29"

Tile Client Tile Map Service (TMS) OpenLayers http://{rooturl}/{layer}/{zoom}/{x}/{y}.png ESRI 2014 OCULUS INFO INC. 30"

Tile Client Ready-to-go Tiled Data Browser 2014 OCULUS INFO INC. 31"

http://tiles.oculusinfo.com https://github.com/oculusinfo/aperture-tiles

Key Points of Value PLOTTING ALL of the data for richer, truer insights. o Reveals truths in the data that summaries cannot. o Provides important context and evidence for analytics. TILED, LAYERED user interface for natural, scalable exploration. o Intuitive navigation and understanding. o Analytic layers at levels of detail, down to raw provenance. INTERACTIVE analysis with data tiling and analytics. o Rapidly repeatable tailored solutions, with filtering + selection. 2014 OCULUS INFO INC. 33"

50,000,000 link interactive graph 1 trillion+ pixels of resolution What s Next?

What s Next? 01010100110 > Interactive Tiled Graphs Streaming Data and live query More advanced filtering and analytics Questions? Oculus gratefully acknowledges the support of DARPA in research and development of this work. 2014 OCULUS INFO INC. 35"