Data Isn't Everything



Similar documents
Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Information Visualization WS 2013/14 11 Visual Analytics

Data Mining Applications in Higher Education

Statistics for BIG data

Predictive Analytics

The Informatica Solution for Improper Payments

From Raw Data to. Actionable Insights with. MATLAB Analytics. Learn more. Develop predictive models. 1Access and explore data

ANALYTICS CENTER LEARNING PROGRAM

Why Big Data Analytics?

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Demystifying Big Data Government Agencies & The Big Data Phenomenon

An Overview of Knowledge Discovery Database and Data mining Techniques

An Introduction to Advanced Analytics and Data Mining

OPTIMUS SBR. Optimizing Results with Business Intelligence Governance CHOICE TOOLS. PRECISION AIM. BOLD ATTITUDE.

Using Artificial Intelligence to Manage Big Data for Litigation

Technology and Trends for Smarter Business Analytics

SAP Solution Brief SAP HANA. Transform Your Future with Better Business Insight Using Predictive Analytics

Delivering Smart Answers!

Introduction to Data Mining

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Demonstration of SAP Predictive Analysis 1.0, consumption from SAP BI clients and best practices

Our Engine allows you to connect and sync your data. In a flash, connect the data that matters and is relevant to you, simply and easily.

Machine Learning: Overview

ANALYTICS & CHANGE. Keys to Building Buy-In

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

MEDICAL DATA MINING. Timothy Hays, PhD. Health IT Strategy Executive Dynamics Research Corporation (DRC) December 13, 2012

Advanced Big Data Analytics with R and Hadoop

International Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: Vol. 1, Issue 6, October Big Data and Hadoop

Pentaho Data Mining Last Modified on January 22, 2007

Improving Decision Making and Managing Knowledge

BIG Data Analytics Move to Competitive Advantage

The Future of Business Analytics is Now! 2013 IBM Corporation

The Rise of Industrial Big Data

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

ANALYTICS STRATEGY: creating a roadmap for success

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

Foundations of Business Intelligence: Databases and Information Management

Reimagining Business with SAP HANA Cloud Platform for the Internet of Things

Data mining and official statistics

AV-24 Advanced Analytics for Predictive Maintenance

BIG DATA IN THE CLOUD : CHALLENGES AND OPPORTUNITIES MARY- JANE SULE & PROF. MAOZHEN LI BRUNEL UNIVERSITY, LONDON

Session 805 -End-to-End SAP Lumira: Desktop to On-Premise, Cloud, and Mobile

BUSINESS INTELLIGENCE COMPETENCY CENTER

Sanjeev Kumar. contribute

Learning outcomes. Knowledge and understanding. Competence and skills

not possible or was possible at a high cost for collecting the data.

DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support

ANALYTICS & CHANGE KEYS TO BUILDING BUY-IN

Transforming the Telecoms Business using Big Data and Analytics

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Fluency With Information Technology CSE100/IMT100

Transforming big data into supply chain analytics

Explore the Possibilities

Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff

USING DATA DISCOVERY TO MANAGE AND MITIGATE RISK: INSIGHT IS EVERYONE S JOB

MSCA Introduction to Statistical Concepts

Performing a data mining tool evaluation

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

This Symposium brought to you by

Information Management course

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

DATA MANAGEMENT FOR THE INTERNET OF THINGS

Miracle Integrating Knowledge Management and Business Intelligence

PROGRAM DIRECTOR: Arthur O Connor Contact: URL : THE PROGRAM Careers in Data Analytics Admissions Criteria CURRICULUM Program Requirements

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

DATA MINING AND WAREHOUSING CONCEPTS

Proven Testing Techniques in Large Data Warehousing Projects

Fight fire with fire when protecting sensitive data

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

Predicting the future of predictive analytics. December 2013

Data Mining + Business Intelligence. Integration, Design and Implementation

Learning is a very general term denoting the way in which agents:

White Paper. Data Mining for Business

BIG DATA STRATEGY. Rama Kattunga Chair at American institute of Big Data Professionals. Building Big Data Strategy For Your Organization

A Review of Data Mining Techniques

Melissa Coates. Tools & Techniques for Implementing Corporate and Self-Service BI. Triad SQL BI User Group 6/25/2013. BI Architect, Intellinet

Big Data Analytics Best Practices

DATA EXPERTS MINE ANALYZE VISUALIZE. We accelerate research and transform data to help you create actionable insights

Global Headquarters: 5 Speen Street Framingham, MA USA P F

Maximize Social Media Effectiveness with Data Science. An Insurance Industry White Paper from Saama Technologies, Inc.

locuz.com Big Data Services

TEXT ANALYTICS INTEGRATION

The following is intended to outline our general product direction. It is intended for informational purposes only, and may not be incorporated into

Anatomy of a Decision

A GENERAL TAXONOMY FOR VISUALIZATION OF PREDICTIVE SOCIAL MEDIA ANALYTICS

Master s Program in Information Systems

ORACLE UTILITIES ANALYTICS

Consumption of OData Services of Open Items Analytics Dashboard using SAP Predictive Analysis

Traffic Prediction and Analysis using a Big Data and Visualisation Approach

Information and Decision Sciences (IDS)

Transcription:

June 17, 2015 Innovate Forward Data Isn't Everything The Challenges of Big Data, Advanced Analytics, and Advance Computation Devices for Transportation Agencies. Using Data to Support Mission, Administration, and Operations

Purpose + This presentation explores the challenge of Big Data and advanced analytics for transportation agencies + Specifically it: Provide background to the Big Data and advanced analytics revolution Identifies specific challenges Discuss approaches to addressing these challenges Booz Allen Hamilton and Client proprietary and business confidential 1

Background: The Big Data Revolution Over the next decade we will see an explosion in data from omnipresent sensors (>50 billion passive and active sensors by 2020) and other data collection technologies. Combined with a vast increase in processing power and a continuing decline in cost, this offers the possibility for dramatic growth Exabytes 40,000 30,000 20,000 10,000 92 percent of the world s data was produced in the last two years 130 exabytes in 2000 40,000 exabytes, or 40 trillion gigabytes (more than 5,200 gigabytes for every man, woman, and child in 2020) 0 2000 2005 2010 2015 2020 1 Exabyte = 1,000,000.000 gigabytes 2

Background: Will the Big Data and advanced analytics live up to its promise? The explosion of Big Data and processing power has generated a lot of hype and offers the potential that advanced analytics techniques could be easily accessible to everyone- generating substantial value The key questions are how should these technologies be used to maximize their value and what are the key factors (technological, organizational, cultural) that lead to organizations being able to realize their promise 3

The Challenge of Big Data and Advanced Analytics: We have found that organizations gain the most from advanced analytics and Big Data when they consider a number of factors Business Needs Technology Increasing value from Big Data requires the correct alignment of numerous factors Domain/ Disciplinary Expertise Organization and Process Data On the following slides we will provide some illustrations of this concept 4

Example: There s No Substitute for Domain SMEs and a Carefully Defined Question Too often organizations implement Big Data and advanced analytics initiatives without integrating existing domain expertise into the initiatives or defining the key questions or mission/business needs they need to have answered The Wrong Way: Big Data Advanced Analytics The Right Way: Business/ mission need Defined question or problem Domain SMEs Big Data Advanced analytics Analyze Data no direction, no goal, no domain SMEs Analyze Data Directed, goal driven, domain SMEs What did we get? What did we get? No Way Spurious Correlations So What Well-Know Correlations Meaningful results Useable findings New insights and knowledge 5

Example: How can you organize analytics? There are many ways to organize analytics each has different advantages and disadvantages CENTRALIZED ANALYTICS MODEL BUSINESS UNIT ANALYTICS MODEL Centralized Analytics Organization Analytics Capability People, Data, Tools, Organization Business Unit Business Unit Business Unit Business Unit Analytics Capability Analytics Capability Analytics Capability Analytics Capability Business Unit Business Unit Business Unit Business Unit Critical mass of resources to catalyze performance Access to cross-functional data to conduct high-value analytics Avoid duplication Reduces costs and increases analytics output Organizational resistance Needs to work with domain experts Need to resolve data ownership/stewardship Insufficient resources in each business unit to catalyze performance Lack cross-functional data to conduct high-value analytics Duplication of capacity competition for scare resources Increases costs and lowers analytics output Domain expertise and analytics combined in single organization Data ownership/stewardship is simplified 6

Example: How can you organize analytics? Description Pros Cons Centralized Centralized analytics group providing analytical services to the organization Critical mass of resources to catalyze performance Access to cross-functional data to conduct highvalue analytics Avoid duplication Reduces costs and increases analytics output Organizational resistance from program offices Needs to develop means to work with domain SMEs Need to resolve data ownership/ stewardship Requires charge-back mechanism Center of Excellence Analysts are distributed through the organization but are supported by a small group responsible for training, technology tracking, disseminating best practices, and facilitating communication Analysts and domain SMEs develop natural, organic cooperative relationship Allows organization to share best practices and core analytics function Consumer driven minimal organizational resistance Data ownership/stewardship is simplified Lack cross-functional data to conduct high-value analytics Duplication of capacity competition for scare resources Decentralized Analysts are dispersed among the organization and aligned with business units not centralized control or coordination Analysts and domain SMEs develop natural, organic cooperative relationship Data ownership/stewardship is simplified Each organization get the analytics it perceived it needs Insufficient resources in each program office catalyze performance Lack cross-functional data/platform to conduct high-value analytics Duplication of capacity competition for scare resources 7

Example: Data Governance Proper Analytics governance can prevent major problems from emerging. For example: Problem: Cowboy Analytics Models and techniques are used by users who do not understand their purpose, requirements or the limitation of their data. The result is that resources may be wasted, quality stuffers and the reputation of analytics organizations can be damaged when substandard products are produced Solution: With Great Power Comes Great Responsibility Governance needs to establish training requirements, recommended technology tools, verification and validation protocols, and risk management processes to keep enterprise-wide analytics refreshed and reliable 8

Example: Big Data vs. Smart Data Data and advanced analytics are meaningless unless business needs and domain expertise provide context and identify problems of interest. Smart Data mean something to domain SMEs, where key messages can be communicated to non-smes through data visualization and messaging Data Transformation DUMB DATA Difficult to acquire Hard to integrate Numerous errors Difficult to understand Business needs define the key questions to answer (data don t tell you what you need to ask) Domain expertise adds context and meaning Data cleansing and preliminary analytics create usable data Intelligent integration creates meaningful cross functional data sets SMART DATA Contextual Easy to access Integrated Accurate Easy for SMEs to understand 9

Example: Don t Use a Wheel to Break a Butterfly Technology is making numerous advanced analytical techniques available, but they need to be combined with the appropriate data, domain expertise and skilled users not all problems require complex solutions Individualized Data Complexity of Data Generalized Data MapReduce Big Data Processing (e.g. Hadoop ) Big Data Regression Text Mining Natural Language Processing Genetic Algorithms Artificial Neural Networks Relational Databases Spreadsheets, Reports Filtering Descriptive Analytics Sensitivity Analysis Regression Predictive Analytics Image Processing, Computer Vision Monte -Carlo Simulation Agent -Based Models Differential Equations Hidden Markov Models Bayesian Belief Networks Discrete Choice Models Support Vector Machines Probit, Logit Models Time Series Analysis Geospatial Modeling Classification Trees Non-Linear Programming, Optimization Prescriptive Analytics Stochastic Processes Simulation Linear/Integer Programming, Optimization Reporting Optimization 10

Key takeaways from our experience implementing Big Data and advanced analytics initiatives Context is King: Big Data and advanced analytics are nothing without domain expertise No Substitute for a Passionate Question: Business and mission needs and a desire to understand a critical phenomena determine value data doesn t speak for itself Organization and Culture Matter: How advanced analytics is integrated into an organization matters analyze and plan your adoption of new technologies Data Ownership/Stewardship is Critical: Who owns the data, the access rights that analysts have and the responsibility for managing data are critical Merging Expertise: Merging domain and analytics expertise is difficult finding the right people and the organizational processes that work are often the biggest problem Don t Pay for What you Don t Need: Big Data and advanced analytics are not for every organization or every problem match the tool to the challenge 11

Advanced Analytics & Data Dashboard (A 2 D 2 ) Can Help Organizations Understand Advanced Analytics + Booz Allen has developed the Advanced Analytics & Data Dashboard (A 2 D 2 ) approach to assess the readiness of an organization to implement advanced analytics and to help organization leaders decide the path forward in developing advanced analytics + It identifies the gap between current capabilities and goals that are developed from business needs by: Identifying technology and data requirements to support business needs Assessing organizational and cultural issues that impact on an organization s ability to support advanced analytics + A 2 D 2 provides a rapid, stakeholder-focused diagnostic that gives organization map of how to move from their current as-is state to a envisioned to-be state where Big Data and advanced analytics can be used to support business needs 12

Advanced Analytics & Data Dashboard (A 2 D 2 ) Framework A 2 D 2 s Analytical Framework is the Lens through which all activities are evaluated The right framework is the one that facilitates communication and understanding across all stakeholders. The final model captures the iterative, highly fragmented, and personal nature of analytical activity and identifies key types of analyses: Exploratory Operational Descriptive Deductive Predictive Inductive Decision Abductive The operational perspective of the analytical activity is defined in the Operational Framework Exposes organizational boundaries and structures and their impact on analysis Technology and associated products can be mapped onto Operational Model Capability gaps are identified Identifies capabilities that enable the IT platforms to interact with operational components to support analytical tasks Produce Author Control Review Publish Collaborate (Drill Down/Drill Back) Archive Visualize Link Analyze Geo/Locate/ Explore Proximity Filter/Focus Timeline Hypothesize Trends Model Anomalies Forecast Annotate/ Enhance Peer Review/ Validation Plan & Manage Manage Analysis Manage Collection Control Quality Collaboration and Knowledge Management Access Capture Harvest ETL/Cleanses Code: Geo, Temporal, etc. Translate Unstructured Structured Data Data Associate Categorize Characterize Cluster Classify Extract Predict Summarize Cluster Discover Patterns Analyze Exceptions Organize Load/Persist Aggregate Match Search & Select Attach Attributes 13

A 2 D 2 illustrative work product: Demonstrates multiple dimensions of competency Summary Chart indicating current capabilities and areas of focus (in Blue), and the target capability level denoted by the red dashed circle

Questions 15