How To Use Big Data For Business

Big Data Maturity - The Photo and The Movie
Mike Ferguson, Managing Director, Intelligent Business Strategies
BA4ALL Big Data & Analytics Insight Conference, Stockholm, May 2015

About Mike Ferguson
Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant he specializes in business intelligence, analytics, data management and big data. With over 33 years of IT experience, Mike has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited (the inventors of the Relational Model), a Chief Architect at Teradata on the Teradata DBMS, and European Managing Director of DataBase Associates.
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax (+44)1625 520700

New Data Sources Have Emerged Inside And Outside The Enterprise That Business Now Wants To Analyse
[Diagram: the enterprise value chain, with customers and front-office functions (Service, Sales, Marketing, Credit Verification), product/service lines 1 to n, and back-office functions (Finance, Procurement, HR, Supply Chain, Planning, Operations) linked to suppliers. It is surrounded by new data sources such as RFID tags, sensor networks and weather data. Axes: data volume, data velocity, data variety, number of sources.]

Popular Types of Data That Businesses Now Want to Analyse
- Web data: clickstream data, e-commerce logs
- Social networks data, e.g. Twitter
- Semi-structured data, e.g. e-mail
- Unstructured content: how much is TEXT worth to you?
- Sensor data: temperature, light, vibration, location, liquid flow, pressure, RFIDs
- Vertical industries' structured transaction data, e.g. telecom call data records, retail

Big Data Analytics Has Taken Us Beyond The Traditional DW
New Big Data Analytical Workloads
1. Analysis of data in motion
2. Complex analysis of structured data
3. Exploratory analysis of un-modeled multi-structured data
4. Graph analysis, e.g. social networks
5. Accelerating ETL and analytical processing of un-modeled data to enrich data in a data warehouse or analytical appliance
6. Data warehouse optimisation: offload ETL processing
7. The storage and re-processing of archived data

The Changing Landscape - We Now Have Different Platforms Optimised For Different Analytical Workloads
Big Data workloads result in multiple platforms now being needed for analytical processing.
[Diagram: analytical workloads shown above the platforms that serve them. Workloads: real-time stream processing and decision management; graph analysis; investigative analysis and data refinery; data mining and model development; advanced analytics on multi-structured data; advanced analytics on structured data; traditional query, reporting and analysis on the DW and marts. Platforms: streaming data; NoSQL DBMS (e.g. graph DB); Hadoop data store; analytical RDBMS / DW appliance; data warehouse RDBMS (EDW and marts); MDM (CRUD on product, customer and asset data).]

The Photo - Big Data Workloads Mean Multiple Platforms Are Now Needed
[Diagram: IBS Enterprise Analytical Ecosystem. Data sources (sensors, clickstream, web logs, XML, JSON, web services, RDBMS, feeds, social, cloud, files, office docs) pass through Enterprise Information Management and stream processing (filtered data, actions) into the analytical platforms: a NoSQL DB (e.g. graph DB), a platform for advanced analytics on multi-structured data, the EDW and marts, a DW appliance for advanced analytics on structured data, and an MDM system (CRUD on product, customer and asset data). A data virtualisation and optimisation layer sits above them and serves graph analytics tools, custom analytic applications (MR & Spark), BI tools, search-based BI tools, and BI platform and data visualisation tools.]

Making The Movie - How Do You Bring This To Life? - Key Questions When Building A Big Data Roadmap
- Why do you need Big Data? What is the business purpose?
- What kinds of data do you need to analyse to achieve your business goals, and where is that data?
- What kinds of characteristics does that data have?
- What kinds of analytical workload do you need to support to derive insight from that data?
- What skills do you need to do this?
- What technology choices best fit your needs?
- How will you deploy big data technologies?
- How will you organise and manage implementation?
- How will you integrate with your existing analytical investment?

What Are You Trying to Achieve And What Data Do You Need? - Industry Use Case Examples And Data Required
[Figure: industry use case examples and the data each requires. Source: Hortonworks]

Do You Need Real-Time Streaming Analytics? - Popular Streaming Analytic Applications
[Figure: use cases for extreme real-time analytics. Source: Hortonworks]

Business Example - Oil and Gas
[Diagram: the enterprise analytical ecosystem (stream processing, NoSQL/graph DB, multi-structured advanced analytics platform, EDW and marts, DW appliance, MDM system, data virtualisation, Enterprise Information Management Tool Suite, BI and analytics tools) annotated with oil and gas use cases:
- Real-time sensor data analysis of drilling, equipment monitoring, well integrity, pipeline flows, market trade monitoring
- Seismic analysis, well integrity, pipeline analysis, data refinery
- Financial planning, financial reporting, spend analysis, production reporting, field service maintenance
- Production forecasting, equipment failure prediction, model development
- Simplified access, virtual marts, information services]

Identify Candidate Big Data Projects, Categorise Them and Align With Priorities in Your Business Strategy
[Diagram: the business strategy (objectives, KPIs, KPI targets, priorities, initiatives, budgets) is broken down into Customer, Operations, Risk, Finance and Sustainability categories. Candidate big data projects are aligned with these categories, and each project enriches master data (customer, product, asset; CRUD) and the DW and marts.]

Why New Data And Big Data Analytics? Example: Enrich Customer Data and Insight
[Figure. Source: IBM Redbook - Information Governance Principles and Practices for a Big Data Landscape]

Closing the Loop - Feeding Insight Back Into MDM to Create Competitive Customer Data
[Diagram: new insights from big data (e.g. social intelligence, online behaviour, customer intelligence), drawn from sources such as social networks, enrich customer records in an MDM system with master data services (CRUD). The enriched customer data feeds the DW/BI system (historical data in the DW and marts), which supports reporting, CPM, alerts and OLAP/mining.]

Assessing New Sources To Enrich Master Data Has To Be A Collaborative Process - You Need Business In The Loop
We need all relevant people to help determine high value data sources.
Goal: Enrich CUSTOMER data for better marketing.
[Diagram: business data experts, IT data architects, a data scientist, a data steward, a business analyst and an IT developer collaborating around shared sandboxes.]
We need to capture discussions, share exploratory results, rate data, and prioritise projects.

Once You Have Assessed The Value You Can Recruit & Start Project(s) To Acquire And Analyse New Data
For example, additional data about customers could come from:
- Social media data
  - Professional life
  - Lifestyle
  - Relationships
  - Likes/dislikes
  - Sentiment: positive or negative opinion
  - Intent: wants to buy, travel etc.
  - Ownership: products owned (could be from competitors)
  - Interests: could be short-lived
- In-bound customer email
  - Sentiment
- Call centre notes

Going Beyond Basic Customer Identity - E.g. Extending / Enriching Customer Insight
[Diagram: four kinds of data enrich the customer record held in an MDM system with master data services (CRUD):
- Customer interaction data: email, chat / transcripts, call centre notes, click stream, person-to-person dialogue
- Customer attitude data: opinions, preferences, needs and desires
- Customer behaviour data: orders, payments, transaction history, usage history
- Customer descriptive data: attributes, characteristics, relationships, demographics]
The objective is to create the best Customer dimension possible using additional internal and external data sources.
Source: MDM

Enriching Customer MDM - Which Data Sources Potentially Require Big Data Analytics to Derive Insight?
[Diagram: the four kinds of customer data that enrich the MDM system with master data services (CRUD), each annotated with potential big data sources:
- Customer interaction data (email, chat / transcripts, call centre notes, click stream, person-to-person dialogue): CRM, web logs
- Customer attitude data (opinions, preferences, needs and desires): CRM, social media data, review web sites
- Customer behaviour data (orders, payments, transaction history, usage history): sensor data, web logs
- Customer descriptive data (attributes, characteristics, relationships, demographics): social media data, SEC filings]
The objective is to create the best Customer dimension possible using additional internal and external data sources.
Source: MDM

Enriching Customer MDM - Need to Consider Volume, Variety and Velocity of Valuable New Data Sources
[Diagram: the same four kinds of customer data (interaction, attitude, behaviour, descriptive) and their potential big data sources (CRM, email, web logs, social media data, review web sites, sensor data), annotated with data characteristics: high volume undiscovered structured data; unstructured data; semi-structured data; high velocity, high volume semi-structured data. All feed the enriched customer record in an MDM system with master data services (CRUD).]
The objective is to create the best Customer dimension possible using additional internal and external data sources.
Source: MDM

New Data Sources Example - What Are We Looking To Extract From Social Media Data Sources?
Data to extract from social data platforms (landed as HDFS files, analysed, then used to enrich the customer record in the MDM system):
- Additional organisation data
- Additional person data, e.g. hobbies, interests, desires
- Professional data, e.g. employers
- Product ownership data
- Intent
- Sentiment
- Unknown relationships
Requires several techniques:
1. JSON schema extraction
2. Text analytics for entity extraction
3. Clickstream analysis
4. Graph analytics for relationship discovery
Do you have people with these skills? Do you have all the technology needed?
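As an illustration of the first technique (JSON schema extraction), here is a minimal Python sketch that walks one JSON document and records which value types occur at each path. The sample record, its field names and the `extract_schema` helper are hypothetical, made up for this example; real social media payloads will differ.

```python
import json


def extract_schema(value, path="$"):
    """Return a dict mapping each JSON path to the set of Python type names seen there."""
    schema = {}

    def walk(node, current_path):
        schema.setdefault(current_path, set()).add(type(node).__name__)
        if isinstance(node, dict):
            for key, child in node.items():
                walk(child, f"{current_path}.{key}")
        elif isinstance(node, list):
            # All list elements share one path, so mixed element types show up as a set.
            for item in node:
                walk(item, f"{current_path}[]")

    walk(value, path)
    return schema


# Hypothetical social profile record (illustrative field names only).
SAMPLE = """
{"user": {"name": "A. Example", "employer": "Acme Ltd",
          "interests": ["sailing", "jazz"]},
 "text": "Thinking of buying a new phone",
 "followers": 42}
"""

schema = extract_schema(json.loads(SAMPLE))
for json_path in sorted(schema):
    print(json_path, sorted(schema[json_path]))
```

Running the same function over many records and merging the results would show which fields are always present and which are optional, which helps decide what can be mapped onto customer master data attributes.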

Social Media Data Challenges - A Person Could Have Multiple Social Personas

Enriching Customer Data - Extracting LinkedIn Social Profile Data Via Their REST API
- Most social media sites have APIs to access information
- Additional person data, e.g. education, interests
- Professional data, e.g. employers, skills
- LinkedIn returns data in JSON or XML formats
Source: LinkedIn

Key Point! Several Different Types Of Big Data Analytic Workloads Can Be Used to Produce New Insights
- Text analytics to get new structured data attributes from millions of documents, e.g. SEC filings, tweets, reviews
- Sentiment analytics for customer opinion
- Graph analytics for discovery of new customer relationships
- Clickstream analytics for customer interaction behaviour
You can also combine these to find new data, e.g. text analytics to extract new data that feeds graph analytics to find relationships in the extracted data.
Do you know what types of analytical workloads you need to implement?
Do you know what technologies you already have and what you need?

Increased Data and Analytical Complexity Has Created A Need For A New Role - The Data Scientist
Data Science is the process of investigative / exploratory analysis of multi-structured data to discover and produce new business insights.
Image source: www.computing.co.uk
Source: Wikipedia

Ensure People In Different Roles In The Analytical Landscape Work Together To Deliver Business Value
Business strategy: strategic objectives and targets, including sustainability targets.
[Table: for each strategic business objective (priority 1 to 4): KPI, current KPI value, what +1% is worth ($$$), KPI target, executive accountable, business initiatives (projects), budget allocation (x million), action plan.]
Roles:
- Data Scientist: exploratory analysis; predictive / statistical model producer
- Business Analyst: model consumer; data visualisation; information producer; builds reports; builds and publishes dashboards
- Business Manager / Operations worker / Customer: information consumer; decision maker; action taker

Chaos Is NOT An Option - Business Alignment of Information Being Produced is Critical To Success
Projects without alignment run a high risk of failure or cancellation.
For every project, tie it back to the strategic objectives in the business strategy:
- What problem are you trying to solve?
- What data do you need?
- What kind(s) of analytic workload are needed?

Organising Your Data Science Projects In A Data Reservoir - This Needs To Be Done Incrementally
[Diagram: a data reservoir organised into zones:
- Data ingest zone
- Trusted data, e.g. master data, DW
- Archive zone (archived transactions, archived DW data)
- Exploratory analysis zone (prepare & analyse data) with sandboxes
- New insights zone
The reservoir is connected to a NoSQL DB, graph DBMS, analytical DBMS, DW appliance, MDM (CRUD), and enterprise and local data marts.]
Do you have this already planned and organised?
How will you know what data you have?

Governance Policies Will Apply More To Refined Data In A Reservoir
[Diagram: inside the corporate firewall, data moves through a data refinery from raw data (untrusted), through in-progress data in sandboxes, to refined data (trusted, fit for use).]
1. Rate/classify as sensitive
2. Define privacy policies
3. Define access policies
Classifying data in a catalog helps determine what governance policies to apply.
What governance have you got in place?

Technology Options for Refining (Preparing) Data
- IT-developed ELT processing on Hadoop and analysis by data scientists
- Self-service data integration and analysis by data scientists
- Multi-role data management platforms with analytics
- A combination of the above

Hadoop As A Data Reservoir and Data Refinery
[Diagram: an ELT workflow in Hadoop:
1. Load data into Hadoop (data reservoir of raw data)
2. Discover data in Hadoop
3. Parse & prepare data in Hadoop (MapReduce)
4. Transform & cleanse data in Hadoop (MapReduce)
The data refinery, working in sandboxes, produces clean, high value data. New high value insights are published (pub/sub) to the EDW, DW appliance, graph DBMS and other data stores.]

Exploratory Analysis of Clickstream Data in Hadoop - E.g. Weblog Data in Hortonworks

Sentiment Analytics - Deriving Structure From Unstructured Content
- Additional person data, e.g. hobbies, interests, desires
- Intent
- Sentiment
(Source: Crunchbase)
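To show what "deriving structure from unstructured content" can look like in its simplest form, here is a minimal lexicon-based sentiment scorer in Python. It is a sketch only: the word lists are tiny and hypothetical, and the slides do not say which scoring method is used in practice. Production sentiment analytics would typically rely on trained models or much richer lexicons.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
POSITIVE = {"love", "great", "excellent", "happy", "recommend"}
NEGATIVE = {"hate", "terrible", "poor", "unhappy", "refund"}


def sentiment_score(text):
    """Return a score in [-1.0, 1.0]; 0.0 means neutral or no opinion words found."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    total = positives + negatives
    if total == 0:
        return 0.0
    return (positives - negatives) / total


for comment in (
    "I love this phone, great battery",
    "Terrible service, I want a refund",
    "The parcel arrived on Tuesday",
):
    print(f"{sentiment_score(comment):+.1f}  {comment}")
```

The output is a structured attribute (a numeric score per comment) that could be stored alongside the customer record, which is the point the slide makes.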

Using Text Analytics To Extract Additional Data From Unstructured Content
- The requirement is automatic recognition of people, organisations, addresses
- Text analytics can derive information from unstructured content, e.g. emails
- This can be a computationally intensive process involving complex character-level operations such as pattern matching
- On large volumes, scalability matters
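The pattern-matching style of entity extraction mentioned here can be sketched with regular expressions. The two patterns below are simplified assumptions for illustration (they are not from the slides, and they will miss many real-world cases); the sample text and names are invented.

```python
import re

# Simplified illustrative patterns; real entity extraction needs far more robust rules or models.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# One or more capitalised words followed by a company suffix.
ORG_RE = re.compile(r"\b(?:[A-Z][A-Za-z&]*\s)+(?:Ltd|Limited|Inc|Corp|plc)\b")


def extract_entities(text):
    """Return email addresses and organisation names found in free text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "organisations": ORG_RE.findall(text),
    }


text = (
    "Please contact jane.doe@example.com about the Acme Widgets Ltd renewal. "
    "Please copy Example Energy Inc as well."
)
print(extract_entities(text))
```

Each pattern scans the text character by character, which is why this kind of processing becomes expensive on millions of documents and is usually parallelised.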

The Sentiment Analytics Process - Data and Insights Can Be Matched To Master Data Using Fuzzy Matching
[Diagram: the process flow:
1. Data arrives from social data platforms and social media aggregators (e.g. Twitter Firehose, MySpace, Klout, Amazon, Facebook, reddit, Flickr, Youtube, bit.ly) and is stored as HDFS files
2. A MapReduce or Spark sentiment scoring application performs text analysis
3. Scored sentiment and social profile data are stored in Hive tables (analyse / index / deliver)
4. Critical fields are matched against customer records in the MDM system using probabilistic ("fuzzy") matching
5. The matched insights enrich the customer record in the MDM system (CRUD), which serves CRM applications and customer engagement management]
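The fuzzy matching step can be sketched with Python's standard library. This example uses `difflib.SequenceMatcher` string similarity as a stand-in; it is not true probabilistic record linkage, and the field weights, the 0.8 threshold and the sample records are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalise(value):
    """Lower-case and strip punctuation and extra whitespace before comparing."""
    return " ".join(value.lower().replace(".", " ").replace(",", " ").split())


def similarity(a, b):
    """String similarity in [0, 1] using difflib's ratio."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(profile, master_records, threshold=0.8):
    """Return (customer_id, score) for the closest master record, or (None, score) if below threshold."""
    best_id, best_score = None, 0.0
    for record in master_records:
        # Assumed weights: name matters more than city.
        score = (0.7 * similarity(profile["name"], record["name"])
                 + 0.3 * similarity(profile["city"], record["city"]))
        if score > best_score:
            best_id, best_score = record["id"], score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical master data and social profile.
MASTER = [
    {"id": "C001", "name": "Jonathan Smith", "city": "Manchester"},
    {"id": "C002", "name": "Joan Smythe", "city": "Leeds"},
]
social_profile = {"name": "Jon Smith", "city": "Manchester"}

print(best_match(social_profile, MASTER))
```

Choosing which fields are "critical" and how to weight them is a business decision; a threshold that is too low risks attaching one person's sentiment to another customer's record.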

Key Use Case - Enriching Customer Master Data With New Relationships Using Graph Analysis
[Diagram: four kinds of data enrich the customer record held in an MDM system with master data services (CRUD):
- Customer interaction data: email, chat / transcripts, call centre notes, clickstream, person-to-person dialogue
- Customer attitude data: opinions, preferences, needs and desires
- Customer behaviour data: orders, payments, transaction history, usage history, click stream navigation
- Customer descriptive data: attributes, characteristics, relationships, demographics]
The objective is to create the best Customer dimension possible using additional internal and external data sources.
Source: MDM

Example - Identifying New Relationships Using Information Extracted From SEC Filings
[Diagram (source: IBM): entities that can be extracted from SEC filings (Company, Person, Security, Loan, Event) and the relationships between them:
- Subsidiaries: list of subsidiaries of a company, banking subsidiaries
- Current events: merger and acquisition, bankruptcy, change of officers and directors, material definitive agreements
- Loan agreements: loan summary details, counterparties (borrower, lender, other agents), commitments
- Insider filings: transactions, holdings, insider relationship (employment, director, officer, insider, 5% owner, 10% owner)
- Shareholders: related institutional managers, holdings in different securities
- 5% beneficial ownership: owner, issuer, % owned, date
- Officers & directors: mention, bio (range, age, current position, past position), signed by, committee membership
Filings referenced: Forms 8-K, 10-K, 10-Q, DEF 14A, 3/4/5, 13F, SC 13D, SC 13G, and the FDIC Call Report.]
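Once relationships like these have been extracted, graph analysis can reveal indirect links between people and companies. The Python sketch below builds a small undirected graph from (subject, relation, object) triples and uses breadth-first search to find the shortest chain connecting two parties. The names, relation labels and triples are invented for illustration; a real deployment would more likely use a graph DBMS than in-memory dictionaries.

```python
from collections import defaultdict, deque

# Hypothetical triples of the kind text analytics might extract from filings.
TRIPLES = [
    ("Alice Example", "officer_of", "Acme Holdings"),
    ("Acme Holdings", "parent_of", "Acme Retail"),
    ("Bob Sample", "director_of", "Acme Retail"),
    ("Carol Demo", "owner_5pct_of", "Other Corp"),
]


def build_graph(triples):
    """Build an undirected adjacency map: node -> set of (neighbour, relation)."""
    graph = defaultdict(set)
    for subject, relation, obj in triples:
        graph[subject].add((obj, relation))
        graph[obj].add((subject, relation))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest list of nodes from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour, _relation in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(TRIPLES)
print(find_path(graph, "Alice Example", "Bob Sample"))
print(find_path(graph, "Alice Example", "Carol Demo"))
```

Here two people with no direct link turn out to be connected through a parent company and its subsidiary, which is the kind of previously unknown relationship that could be written back to customer master data.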

Analysing Enriched Customer Master Data Can Improve Accuracy of Next Best Action To Be Taken
[Diagram: the following insights enrich the customer record in the MDM system (CRUD); the enriched MDM data is then analysed to determine the next best action:
- Life events
- Additional organisation data
- Additional person data, e.g. hobbies, interests, desires
- Professional data, e.g. employers
- Behaviour
- Product ownership data
- Intent score
- Sentiment score
- Unknown relationships]

New Insights Can Be Added Into A Data Warehouse To Enrich What You Already Know
[Diagram: operational systems and MDM (CRUD) feed the DW through data integration (DI). Data scientists work in a sandbox on web logs, social, web and cloud data, and the new insights they produce are loaded into the DW, e.g. deriving insight from social web sites for sentiment analytics.]

Alternatively, New Insights In Hadoop Can Be Integrated With A DW Using Data Virtualisation To Provide Enriched Information
[Diagram: OLTP systems feed the DW through data integration (DI). Data scientists work in a sandbox on web logs, social, web and cloud data, producing new insights in Hadoop, e.g. deriving insight from social web sites for sentiment analytics. A data virtualisation layer combines the DW with the new insights via SQL on Hadoop.]

Conclusions - Where Are You On The Maturity Model? Taking Photos Or Making Movies?
[Figure: big data maturity model. Source: IBM]

Thank You!
Big Data and Analytics, Stockholm, November 26-27, 2015
www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax (+44)1625 520700