Are You Big Data Ready?

ACS 2015 Annual Canberra Conference
Are You Big Data Ready?
Vladimir Videnovic, Business Solutions Director, Oracle Big Data and Analytics

Introduction

Introduction - What is Big Data?
"If you can't explain it simply, you don't understand it well enough." (Albert Einstein)

Big Data
A term commonly used to describe the phenomenon of large and/or complex datasets, caused by the exponential growth and availability of data, that traditional data processing approaches cannot handle effectively.

Big Data - Big Vs
Volume, Velocity, Variety, Visibility, Vulnerability, Veracity, Variability, Value
Complexity, Immaturity, Scalability, Performance, Availability, Manageability, Trustworthiness, Cost

Organisation
Volume, Velocity, Variety, Visibility, Vulnerability, Veracity, Value

Big Vs - Business Perspective
It's just Data, nothing has changed...
Value
- Enable smarter decisions, faster actions and optimised results
Valuation
- Evaluation of outcomes and continuous optimisation
- Measure the impact on key performance indicators
- Assess the ability to forecast future outcomes
Visibility
- Discover important insights
- Demand for good quality analytics & insight
- Study patterns, behaviours, trends and relationships
- The demand for data is highlighting the IT bottleneck
- The cash data economy

Big Vs - Information Management Perspective
It's Information, and many things have changed!
Variety
- Data Integration
- New data sources (Social Media, Data streams, Internet of Things)
- Lack of standards for managing various data types
Velocity
- Data streaming, machine/sensor generated data
- Real-time insights
Vulnerability
- Information Security
- Fraud discovery

Big Vs - Data Science Perspective
Value
- Understanding what is possible
- Prioritising/sizing up the opportunities
- Analytics that do not support decision making do not add value
Variety
- Internal document/text/audio sources may be inaccessible
- Need to prototype/test the value added by new unstructured data from social media and other external sources
Velocity
- Move to real-time services
- Fraud detection

Big Data Early Days

BIG Data Large Data

Big Vs - Survey
Source: Ernst & Young, Global Forensic Data Analytics Survey 2014

BIG Data - Better Information Governance

BIG Information
The enterprise capability required to acquire relevant Data from multiple sources, organise it in a repository, convert it into Information by analysing it, and use it to derive Knowledge for smarter business Decisions and Outcomes.

Promise of Analytics
An Information-Powered Organisation is equipped with timely insights and knowledge to continuously optimise the way it:
- derives answers, decisions and actions
- conducts its business
- streamlines its operations
- delivers services and/or products
- adapts to new realities

Information Management Today - Maturity
Copyright 2015, Oracle and/or its affiliates. All rights reserved.
2012 Oracle Corporation. Proprietary and Confidential.

Information Management Today - Common Challenges
- Access to Information
- Information Quality
- Efficient Use of Information
- Information support for Business Evolution

Barriers to Mature Use of Information - Survey
Source: Deloitte, The Analytics Advantage Survey 2013

Information Management Transition

Modern Information Environment - People

Need Analytic Value at Fast Pace
Data Uncertainty
- Not familiar and overwhelming
- Potential value not obvious
- Requires significant manipulation
Tool Complexity
- Early Hadoop tools only for experts
- Existing BI tools not designed for Hadoop
- Emerging solutions lack broad capabilities
80% of effort is typically spent on evaluating and preparing data
Overly dependent on scarce and highly skilled resources

Modern Information Environment - Roles
- Data Engineer (Data Wrangler, Data Janitor)
- Data Scientist (Data Analyst, Data Miner)
- Business Intelligence Specialist (Report Designer, Data Specialist)
- Information Consumer

Data Engineer
Role: Prepare data for easier access, analytical processing and efficient interpretation
Skills
- Ability to quickly understand data
- Transform and enrich data
- Programming
- Data visualisation
- Statistics
- Effective presentation of results

Data Scientist
Role: Create insights from available data
Skills
- Data pattern discovery
- Ability to tell the story
- Curiosity
- Attention to detail
- Research skills
- Question everything
- Strive to make a difference

Modern Information Environment - People
Everyone on the Journey
- Change
- Empowered people
- Easy Access to ALL Information
- Engineered Processes
- Technology support
- Information-powered

Modern Information Environment - Business Processes

Modern Information Environment - Business Processes
Digitally-enabled Business Process
Paper-based process in digital form

Modern Information Environment - Business Processes in the Digital Era
- Engineered
- Automated
- Fit-for-purpose
- Easy to (re)configure
- Easy to use
- Easy to integrate

Modern Information Environment - Technology

Data Reality - Existing Sources + Emerging Sources
[Diagram labels: Data Warehouse, Data Reservoir, Existing Sources, Emerging Sources]

Information Management - Logical View
Information Warehousing

Big Data Eco-System
[Diagram labels: Data, Fast Data, Apps, Outputs; Streams, Events, Actions; Custom, Packaged; Data Services; Information Warehouse; Reservoir, Factory, Warehouse; Business Analytics; Smart Things; Data Lab; Reports; People; Data Sets; Discovery; Data Science; Visualisation]
Oracle Confidential, Internal. Business Analytics Product Group.

Cloudera Hadoop Distribution
- HDFS (Storage): distributed file system that provides high-throughput access to data
- MapReduce (Process): framework for performing distributed data processing
- Hive: data warehouse system for Hadoop
- HCatalog: metadata abstraction layer for referencing data
- HBase (Hadoop DataBase): distributed columnar database
- Hue: user interface for managing and using Hadoop components
- Spark: parallel data processing framework for developing Big Data applications
- YARN: cluster management; reliable distributed processing of very large data sets
- Oozie: workflow coordination system for managing Hadoop jobs
- ZooKeeper: centralised service for maintaining configuration information
- Sqoop: efficient transfer of bulk data between Hadoop and structured datastores
- Impala: analytic database designed to leverage the flexibility and scalability of Hadoop
- Solr: reliable, scalable and fault-tolerant search based on Lucene
- Pig: platform for analysing large data sets
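The MapReduce component listed above is described only as a framework for distributed data processing. As a rough illustration of the programming model (not of the Hadoop API itself), the sketch below runs the map, shuffle and reduce steps in a single Python process on a word-count task. The function names and the sample input are hypothetical and are not taken from the presentation; a real Hadoop job would distribute these steps across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key before reducing."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of text records."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    print(word_count(["Big Data", "big insight"]))
    # prints {'big': 2, 'data': 1, 'insight': 1}
```

The map and reduce functions hold no shared state, which is what allows a framework to run many copies of them in parallel on different slices of the data.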

Information Management Capabilities
Focus: Organisation-wide Information Management
Goal: Information Capability Design Patterns enabling an Information-Powered organisation
Addressing Barriers

Modern Information Environment - Governance

Unified Information Architecture - Maturity Phases
Silos
- One-off tools/solutions
- Bottom-up
- Local business unit driven
- Many versions of truth
- Independent data marts
Standardise
- Enterprise standard tools/solutions
- Data warehouse and dependent data marts
- IT & LOB partnership
- Secure, consistent access to all data
Optimise
- Agile, flexible architecture
- Analyse just in time structured & unstructured data together
- Advanced analytics & real-time recommendations
Information on Demand
- All stage 3 capabilities delivered as a platform (service/cloud)
- Access to tools and data among a broad subscriber group
[Diagram axis: Maturity & Capability]

Data Processing - Exploration
Definition
- Understand Data and Data Value
- Explore attributes/metadata
- Understand data quality (needs/reality)
- Prioritise
- Identify custodians
Addressing Barriers
IM Capabilities: Data Staging, Data Discovery
Enterprise Capabilities: Planning, Experience, Continuous Improvement, Data and Information, Enablers, Tools
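The exploration step above (explore attributes/metadata, understand data quality) can be illustrated with a small profiling routine. This is a minimal sketch and not part of the original presentation: the function name `profile`, the record layout and the sample values are hypothetical, and it assumes that `None` and empty strings count as missing values.

```python
def profile(records):
    """Return completeness and distinct-value counts for each attribute.

    `records` is a list of dicts; attributes absent from a record are
    treated the same as None or an empty string (an assumption).
    """
    columns = sorted({key for record in records for key in record})
    total = len(records)
    report = {}
    for column in columns:
        values = [record.get(column) for record in records]
        present = [value for value in values if value not in (None, "")]
        report[column] = {
            # Share of records that carry a usable value for this attribute.
            "completeness": len(present) / total if total else 0.0,
            # Number of different usable values seen.
            "distinct": len(set(present)),
        }
    return report


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        {"id": 1, "city": "Canberra"},
        {"id": 2, "city": ""},
        {"id": 3},
    ]
    for attribute, stats in profile(sample).items():
        print(attribute, stats)
```

A report like this gives a first view of the gap between data quality needs and reality, and helps prioritise which attributes need attention before enrichment.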

Data Processing - Collection
Definition
- Acquisition/Ingestion/Extraction of data
- Quality audit
- Change Data Capture
- High-Volume Data Acquisition
- Stream/Event Data Capture
Addressing Barriers
IM Capabilities: Extract Data, Ingest Content, Change Data Capture
Enterprise Capabilities: Resource Management, Outcomes, Skills, Tools, Enablers
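Change Data Capture, listed above, is about picking up only what changed since the last load. The sketch below is an illustrative assumption rather than the method described in the slides: it compares two in-memory snapshots keyed by a primary key, whereas production CDC tools commonly read the database transaction log instead. The name `capture_changes` and the sample rows are hypothetical.

```python
def capture_changes(previous, current):
    """Diff two snapshots (dicts keyed by primary key) into change sets."""
    # Keys that exist only in the new snapshot are inserts.
    inserts = {key: row for key, row in current.items() if key not in previous}
    # Keys that exist only in the old snapshot are deletes.
    deletes = {key: row for key, row in previous.items() if key not in current}
    # Keys in both snapshots whose row content differs are updates.
    updates = {
        key: row
        for key, row in current.items()
        if key in previous and previous[key] != row
    }
    return {"insert": inserts, "update": updates, "delete": deletes}


if __name__ == "__main__":
    # Hypothetical snapshots, for illustration only.
    yesterday = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
    today = {2: {"name": "Robert"}, 3: {"name": "Cy"}}
    print(capture_changes(yesterday, today))
```

Only the three change sets need to travel downstream, which keeps acquisition cost proportional to the amount of change rather than to the size of the source.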

Data Processing - Enrichment
Definition
- Data Organisation
- Enrichment
- Processing
- Transformation
- Protection
- Data Lineage
Addressing Barriers
IM Capabilities: Transform Data, Integrate Data, Enrich Content
Enterprise Capabilities: Procedures, Outcomes, Hygiene Factors, Tools, Intellectual Property, Security, Enablers

Data Processing - Storage
Definition
- Structured Data
- Content/Documents/Records
- Non-structured Data
- Storage Strategy
- Data availability
Addressing Barriers
IM Capabilities: Enterprise Data Warehouse (Curated Data), Data Reservoir (Exploratory Data)
Enterprise Capabilities: Resource Management, Business Continuity, Knowledge, Data and Information, Intellectual Property

Data Processing - Data Use/Information Creation
Definition
- Generate Information Assets
- Derive intelligence
- Actionable Insights
- Support Decisions
- Enable Innovation
- Support Business Processes
Addressing Barriers
IM Capabilities: Transactional Reporting, Ad-Hoc Reporting, Structured Analysis, Sentiment Analysis, Advanced Analysis, Enterprise Search
Enterprise Capabilities: Decision Making, Performance Management, Change Management, Strategy and Plans, Resource Management, Intellectual Property, Policies, Innovation, Experience, Skills, Templates, Creativity

Data Processing - Data/Information Sharing and Collaboration
Definition
- Publish Information Assets
- Action on Intelligence
- Communicate Decisions
- Operationalise Innovation
Addressing Barriers
IM Capabilities: Enterprise Information Portal
Enterprise Capabilities: Outcomes, Decision Making, Communication, Collaboration, Change Management, Shared Vision, Tools, Experience, Structure, Relationships

Data Processing - Optimisation
Definition
- Optimise Information Assets
- Optimise Business Processes
- Optimise Decisions
Addressing Barriers
IM Capabilities: Unknown Data Analysis, Evidence-based Decision-Making
Enterprise Capabilities: Performance Management, Business Continuity, Outcomes, Shared Vision, Data and Information, Continuous Improvement

Data Processing - Information Archiving/Disposal
Definition
- Define and apply disposal authorities
- Preserve information of interest
Addressing Barriers
IM Capabilities: Enterprise Content Management/Archiving
Enterprise Capabilities: Procedures, Policies, Security, Resource Management

Core Information Governance Capabilities
IM Capabilities: Change Management, Data Quality, Continuous Optimisation, Data Lineage, Master Data, Information Security
Addressing Barriers
Enterprise Capabilities: Strategy and Planning, Procedures, Policies, Data and Information, Security, Resource Management, Change Management, Collaboration

Modern Information Environment - Summary

Modern Information Environment
- People
- Business Processes
- Technology
- Governance

Information Management Capability Modernisation - Benefits
ACCESS, QUALITY, EFFICIENCY, BUSINESS EVOLUTION
Information-powered users, Right Information, Informed Decisions, Operationalised Innovations
- Data available for analysis and value-add processes
- Information transparent, useable and actionable to a larger audience
- Improved utilisation of existing information assets
- Information presented consistently and accurately to all authorised users via appropriate channels
- Standardised information platform ensures reuse of business patterns and information consistency
- Consolidated information platform can reduce costs of maintenance and increase staff efficiency
- Minimise support issues from inadequate delivery of information
- Real-time analytic environment to support forward-looking analysis and evidence-based decision-making
- Support Innovation to continuously improve business processes and/or delivery of services
- On-time insights available to a larger population of information consumers, enabling better and consistent decision-making

Q&A
