Integrating a Big Data Platform into Government:

Transcription:

Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes
John Haddad, Senior Director Product Marketing, Informatica
Digital Government Institute's Government Big Data Conference, October 9, Washington, DC

Regulatory Data Climate Data Web Logs Social Data Sensor Data Energy Consumption GPS Insurance Claims EMR Flight Data Network Monitoring

Big Data is a balancing act
- Keeping the lights on (KOL)
- Complying with regulations
- Reducing costs
- Adopting new technologies
- Acquiring new analytic skills
- Increasing agility

How do you get value from Big Data?
Agency Goals
Analyst, Data Scientist, Developer, Business
Prioritize Goals, Generate Insights, Validate Hypothesis, Make Operational, Take Action
Big Data Supply Chain: Acquire & Store, Refine & Enrich, Explore & Curate, Distribute & Manage
Big Data, Data Management & Analytic Systems, Business Value

The Big Data Journey
- Data Warehouse Optimization: Optimize infrastructure for performance, cost, & scalability
- Managed Data Lake: A single place to manage the supply and demand of data
- Real-time Operational Intelligence: Proactively respond to threats and opportunities in real-time
IT driven, Business driven
Healthcare, Security, Transportation, Treasury, Energy, Public Safety

Data Warehouse Optimization EHR Business Intelligence / Data Warehouse Web & App Log Files EMR Claims Data Reduce IT Costs ERP Batch Load Near Real-Time Increase Op Efficiencies Changed Data Staging Data Integration Data Quality

Managed Data Lake EHR Business Intelligence / Data Warehouse Patient / Provider Master Visualization / Analytics Web & App Log Files EMR Claims Data Master Data Management Reduce IT Costs ERP Healthcare & Patient Forums Social Data / Signals Patient / Provider Mobile Devices Batch Load Real-Time Ingestion Changed Data Staging Sandbox Reservoir Data Integration Data Matching Near Real-Time Pub / Sub Increase Op Efficiencies Improve Fraud Detection Reduce Readmissions RFID, Patient Monitoring Data Quality Data Security Improve Outcomes

Real-time Operational Intelligence EHR Business Intelligence / Data Warehouse Patient / Provider Master Visualization / Analytics Web & App Log Files EMR Claims Data Master Data Management Reduce IT Costs ERP Healthcare & Patient Forums Social Data / Signals Patient / Provider Mobile Devices Batch Load Real-Time Ingestion Changed Data Staging Sandbox Reservoir Data Integration Data Matching Event Based Processing Near Real-Time Pub / Sub Real-Time Delivery Increase Op Efficiencies Improve Fraud Detection Reduce Readmissions RFID, Patient Monitoring Data Quality Data Security Streaming Analytics Improve Outcomes

Cyber Security Business Intelligence / Data Warehouse Person of Interest Master Visualization / Analytics Access Monitors, Honeypots System & Network Monitors, Log Files Master Data Management Reduce IT Costs Social Data / Signals RDBMS, Flat Files OSINT (Security Bulletins, Internet Events) DoD/Intel Security Messages & Alerts Batch Load Real-Time Ingestion Changed Data Staging Sandbox Reservoir Data Integration Data Quality Data Matching Data Security Event Based Processing Streaming Analytics Near Real-Time Pub / Sub Real-Time Delivery Increase Op Efficiencies Stop & Predict Cyber Threats Share Threat Information

Transportation Service Records Business Intelligence / Data Warehouse Person of Interest Master Visualization / Analytics Image & Video Master Data Management Reduce IT Costs GPS Scheduled Routes Weather & Climate Batch Load Real-Time Ingestion Changed Data Near Real-Time Pub / Sub Real-Time Delivery Optimize Routes Reduce Delays & Disruptions Social Data / Signals Sensors & Radar Staging Sandbox Reservoir Data Integration Data Quality Data Matching Data Security Event Based Processing Streaming Analytics Improve Public Safety Reduce Fuel Consumption

Does your data platform support Big Data requirements?
- DEPLOY your data pipeline (i.e. access, integrate, and prepare data) from pilot to production quickly
- STAFF projects with affordable and readily available skills (e.g. analytics, ETL, data quality)
- ADOPT new Big Data technologies (e.g. Hadoop, NoSQL, IoT) without major disruption to your production environment
- TRUST (i.e. certify, secure, master) your data to make the right decisions faster with minimal risk
- REAL TIME processing (i.e. ingest, correlate, alert) to proactively respond to business situations (e.g. events, threats, opportunities)
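
The "ingest, correlate, alert" pattern named in the REAL TIME requirement can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and does not come from the presentation: the `detect_bursts` function, the `(timestamp, source)` event shape, the 60-second window, and the threshold of three events. A production system would more likely use a streaming engine than an in-memory loop.

```python
from collections import defaultdict, deque


def detect_bursts(events, window_seconds=60, threshold=3):
    """Return (source, timestamp) alerts for sources with a burst of events.

    `events` is an iterable of (timestamp_in_seconds, source) pairs,
    assumed to arrive in time order. All names are hypothetical.
    """
    recent = defaultdict(deque)  # source -> timestamps still inside the window
    alerts = []
    for timestamp, source in events:  # ingest: consume events one at a time
        window = recent[source]
        window.append(timestamp)
        # correlate: keep only this source's events within the time window
        while timestamp - window[0] > window_seconds:
            window.popleft()
        # alert: flag the source once enough events fall inside the window
        if len(window) >= threshold:
            alerts.append((source, timestamp))
    return alerts


# Hypothetical sample data: three events from one source within 60 seconds.
sample_events = [
    (0, "10.0.0.5"),
    (10, "10.0.0.7"),
    (20, "10.0.0.5"),
    (45, "10.0.0.5"),
    (200, "10.0.0.5"),
]
sample_alerts = detect_bursts(sample_events)
```

With the sample data, the only alert is raised at the third event from the same source, because the later event at 200 seconds falls outside the window of the earlier ones.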

Do you have the right skills?
Enterprise Architect, Data Steward, Data Analyst, Data Scientist, Business Analyst, Data Engineer, Domain Expert, Application Developer, ETL Developer, Data Architect, Database Admin, Solution Architect

Qualifications for a Data Scientist (source: job posting on Dice.com)
- A background in data mining, machine learning and distributed computing is desired
- Bachelor's Degree or Master's Degree in a quantitative discipline such as Mathematics, Statistics, Finance, Accounting, Economics, Operational Research or a related discipline
- 6 plus years experience in a decision support analytic function is a must
- 4 plus years of quantitative analysis experience
- Knowledge of Hadoop, Pig, Hive and MapReduce
- 8 plus years experience in a decision support analytic function
- Experience on a Hadoop Platform
- Experience with Python, Perl or other scripting language
- Familiarity with object-oriented programming concepts
- Experience in Java or C++ is a plus
- Demonstrated proficiency with statistical computing languages such as R, MATLAB, etc.
- Experience with integrating large-scale heterogeneous datasets
- Expertise with statistical research techniques, including modeling, data mining, clustering and segmentation
- Strong analytical and problem solving skills
- Excellent interpersonal skills and ability to communicate effectively with third parties and internal staff at all levels of the organization
- Excellent organization and time management skills
- A proven record as a team player plus the ability to work independently given general direction
- Knowledge of streamline program analysis procedures in SQL, SAS, R, Pig, Python, Apache Mahout or other chosen languages
- Ability to create, deploy, maintain and refine decision management models
- Perform study and discovery of new data sources or new uses for existing data sources
- Participate in the design and implementation of statistical data quality procedures
- Interpret and implement data findings creatively in a variety of formats
- Ability to work closely across an array of various teams and organizations in the company to champion Big Data technologies and advanced analytics
- Ability to work with cross-departmental teams to define metrics, guidelines and strategies for effective use of algorithms and data

Increase Productivity

Ensure Trust

Provide Self-Service

Increase Intelligence