Getting Started
Practical Input For Your Roadmap

Mike Ferguson
Managing Director, Intelligent Business Strategies
BA4ALL Big Data & Analytics Insight Conference
Stockholm, May 2015

About Mike Ferguson

Mike Ferguson is Managing Director of Intelligent Business Strategies Limited. As an independent analyst and consultant, he specializes in business intelligence, analytics, data management and big data. With over 33 years of IT experience, Mike has consulted for dozens of companies, spoken at events all over the world and written numerous articles. Formerly he was a principal and co-founder of Codd and Date Europe Limited (the inventors of the Relational Model), a Chief Architect at Teradata on the Teradata DBMS, and European Managing Director of DataBase Associates.

www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax (+44)1625 520700
Key Questions When Building A Big Data Roadmap
- Why do you need Big Data? What is the business purpose?
- What kinds of data do you need to analyse to achieve your business goals, and where is that data?
- What kinds of characteristics does that data have?
- What kinds of analytical workload do you need to support to derive insight from that data?
- What skills do you need to do this?
- What technology choices best fit your needs?
- How will you deploy big data technologies?
- How will you organise and manage implementation?
- How will you integrate with your existing analytical investment?

Audience Question - What Are The Biggest Barriers To Adoption?
1. Can't see how to get business value out of Hadoop
2. Lack of skilled people
3. Unclear what should be in your Big Data strategy
4. Unclear how it integrates with your existing analytical environment
5. Risk / Governance e.g. security, data privacy, auditability
6. Vendor lock-in
10/06/2015

Top Big Data Challenges

ETL Offload to Hadoop Has Emerged As A Popular IT Use Case To Avoid Expensive DW MPP Upgrades
Diagram labels: OLTP systems; DW; staging; ELT Offload; DI / DQ; Archived data
- If you choose to do this, you need to be careful of lookups to DW data during ELT processing on Hadoop.
- This problem can be helped by loading master data into Hadoop.

Copyrights Intelligent Business Strategies 1992-2015 All Rights Reserved
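The lookup point on the ETL offload slide can be illustrated with a minimal sketch. It assumes the master data has already been copied next to the staged data, so the offloaded job resolves keys locally instead of calling back to the DW for each row. The names `CUSTOMER_MASTER` and `enrich` and all sample values are hypothetical and are not taken from the presentation.

```python
# Hypothetical master data, loaded once alongside the staged data instead of
# being looked up row by row in the data warehouse.
CUSTOMER_MASTER = {
    "C001": {"name": "Acme AB", "segment": "Enterprise"},
    "C002": {"name": "Nordic Retail", "segment": "SMB"},
}


def enrich(transactions, master):
    """Attach master data attributes to each staged transaction.

    Rows whose key is missing from the master data are returned separately
    so that a data quality step can deal with them.
    """
    enriched, rejected = [], []
    for row in transactions:
        reference = master.get(row["customer_id"])
        if reference is None:
            rejected.append(row)
        else:
            enriched.append({**row, **reference})
    return enriched, rejected


# Illustrative staged rows (made-up values).
staged = [
    {"txn_id": 1, "customer_id": "C001", "amount": 120.0},
    {"txn_id": 2, "customer_id": "C999", "amount": 35.5},
]
good, bad = enrich(staged, CUSTOMER_MASTER)
```

In a real Hadoop job the same idea would typically be expressed as a join against a master data table held in the cluster; the dictionary here only stands in for that table.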
ETL Offload Often Caused By Major Growth In Staging Area Sizes on Traditional DW MPP RDBMSs
Diagram labels: Source 1 and Source 2 (EMP, DEPT; DB transactions); change data capture; Data Integration; MPP RDBMS; Staging Area Schema (EMP, DEPT), marked GROWTH; DW Schema (FACT, DIM); Data Warehouse Platform
Source: Adapted from a slide originally developed by Oracle

New Option - Hadoop As A Data Reservoir and Data Refinery
ELT workflow:
- Load data into Hadoop
- Discover data in Hadoop
- Parse & Prepare Data in Hadoop
- Transform & Cleanse Data in Hadoop
Diagram labels: Data Reservoir (raw data); Data Refinery; sandboxes; other data; contains clean, high value data; New high value Insights (pub/sub); Graph DBMS; EDW; DW appliance
Popular Big Data Analytic Applications - Web Data
- Clickstream analytics
- Site navigation behaviour (session) analysis
- Paths to buy, paths to abandonment, what else they looked at
- Improve customer experience and conversion
- Associate clicks with customers & prospects
- How many concurrent users?

Businesses Where Clickstream Is Now Critical
- On-line media and media
- On-line gaming
- Telco
- Retail
- Retail banking
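The session analysis listed under Web Data can be sketched as follows. This is a minimal illustration, assuming clicks arrive as (user, timestamp, page) tuples and that a new session starts after 30 minutes of inactivity. The timeout, the `sessionize` function and the sample clicks are all assumptions made for this example, not part of the presentation.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def sessionize(clicks, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, page) clicks into per-user sessions.

    Returns a list of (user_id, [pages in visit order]) tuples, one per session.
    """
    sessions = []
    last_seen = {}  # user_id -> (time of last click, page list of open session)
    for user_id, ts, page in sorted(clicks, key=lambda c: (c[0], c[1])):
        previous = last_seen.get(user_id)
        if previous is None or ts - previous[0] > timeout:
            pages = [page]  # first click, or gap too long: start a new session
            sessions.append((user_id, pages))
        else:
            pages = previous[1]  # continue the session that is still open
            pages.append(page)
        last_seen[user_id] = (ts, pages)
    return sessions


# Illustrative clicks (made-up values).
clicks = [
    ("u1", datetime(2015, 5, 1, 9, 0), "/home"),
    ("u1", datetime(2015, 5, 1, 9, 5), "/product"),
    ("u1", datetime(2015, 5, 1, 11, 0), "/home"),
    ("u2", datetime(2015, 5, 1, 9, 1), "/home"),
    ("u2", datetime(2015, 5, 1, 9, 3), "/basket"),
]
paths = sessionize(clicks)
```

Each resulting page list is one navigation path, which is the raw material for questions such as paths to buy and paths to abandonment.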
Why Big Data? Influencer Analysis

Why Big Data? Graph Analysis
- Dominant Edges e.g. Anti-Money Laundering
- Does this indicate AML?
Popular Big Data Analytic Applications - Unstructured Data
- Case management
- Fault management and field service optimisation
- Voice of the customer
- Sentiment analytics
- Competitor analysis
- Media coverage analysis
- Improve pharma drug trials
How much is TEXT worth to your business?
Ø Unstructured content is hard to analyse

Open Data - The Great Government Give Away
What value does this data have to your business?
Even The CFO Is Interested
Source: Oracle Enterprise Performance Management Top Trends 2015

Popular Big Data Analytic Applications - Sensor Data For Improving Process Efficiency and Optimisation
- Sustainability analytics e.g. energy optimisation
- Supply/distribution chain optimisation e.g. RFID tag
- Asset management and field service optimisation
- Manufacturing production line optimisation
- Location based advertising (mobile phones)
- Grid health monitoring: electricity, water, mobile phone cell network
- Smart metering (collect data every 15 minutes)
- Fraud
- Healthcare: ITC vital signs, Fitbits
- Traffic optimisation
Ø WHAT ARE YOU PREPARED TO INSTRUMENT?
Why Big Data? - Anyone Know What This Is?
Stealth

Why Big Data? Data Driven Stealth Competitors - New Business Models
- Apple iwatch, Fitbits
- Potential new business? Life Insurance
Why Big Data? Data Driven Stealth Competitors - New Business Models
- Car manufacturers: sensor data
- Potential new business? Car Insurance

Why Big Data? Data Driven Stealth Competitors - New Business Models
- Telco: GPS + Clickstream
- Potential new business? Flash advertising services
Audience Question - Who Is Initiating Big Data In Your Enterprise?
- CIO
- CTO
- COO
- CMO
- CFO
- Chief Data Officer
- Enterprise Architect

Business Case - Which Big Data Projects Will Help You Achieve High Priority Objectives in Your Business Strategy?
- Areas: Customer, Operations, Risk, Finance, Sustainability
- Align: Business Strategy, Objectives, KPIs, KPI targets, Priorities, Initiatives, Budgets
Diagram labels: Candidate Big Data Projects; enrich; asset, prod, cust (each marked C, R, U, D); DW & marts; EDW; mart
Customer Example - Structured And Unstructured Data
Diagram labels: Hadoop; RDBMS; On-line behaviour (clickstream); Transactional activity; Opinion (inbound email, social media); Customer Record; Campaigns, risk scores; New Relationships; Personal Details (master data)
Source: IBM

Data Sources and Analytical Characteristics
- What data sources do you need to help achieve your business goals?
  - Internal? External? Both?
- Data characteristics
  - What is the variety of the data? Structured, semi-structured (JSON, XML) or unstructured
  - What data volumes?
  - What is the data arrival rate (velocity)?
- What kind of analysis?
  - Real-time streaming analytics?
  - Exploratory analysis?
  - Text analytics?
  - Graph analysis?
- This highlights
  - Skills required
  - Technologies needed, e.g. Hadoop, NoSQL, analytical RDBMS
The Skills Gap
- What if you can't find people?
- Create a team with each member having some skills
- Raise the level of abstraction e.g. don't write code - generate it

Ensure People In Different Roles In The Analytical Landscape Work Together To Deliver Business Value
- Business Strategy: strategic objectives and targets, including sustainability targets
- Table columns: Strategic Business Objective (rows 1 to 4); Priority; KPI; Current KPI Value; What is +1% worth? ($$$); KPI Target; Executive Accountable; Business Initiatives (projects); Budget Allocation (x Million); Action Plan
- Data Scientist: exploratory analysis; predictive / statistical model producer
- Business Analyst: model consumer; data visualisation; information producer; build reports; build and publish dashboards
- Business Manager / Operations worker / Customer: information consumer; decision maker; action taker
Try It Out On The Cloud Before Bringing In-House
Diagram labels: HDFS / HBase / Hive

Integration - Data In A Hadoop System Should Produce New High Value Insights To Add Into A DW
- Hundreds of terabytes up to petabytes
- e.g. deriving insight from huge volumes of social web content on sites like Twitter, Facebook, Digg, Myspace, TripAdvisor and LinkedIn for sentiment analytics
Diagram labels: Cloud Data; Operational systems; Extract; HDFS; Transform; MapReduce & Spark data transformation and analytics applications; DI; new insights; DW
Big Data - New Insights In Hadoop Can Be Integrated With A DW Using Data Virtualisation To Provide Enriched Information
- e.g. deriving insight from social web sites for sentiment analytics
Diagram labels: OLTP systems; DI; DW; Web logs; social web; cloud; Data Scientists; sandbox; new insights; SQL on Hadoop; Data Virtualisation

Using Hadoop As A Data Archive Means Data Can Be Kept On-line, Analysed And Still Integrated With Data In The DW
Diagram labels: OLTP systems; DI; DW; Archive unused data or data > n years; Archived data; SQL on Hadoop; Data Virtualisation
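The "archive data > n years" rule on the archive slide amounts to a simple date cutoff. The sketch below is only an illustration under stated assumptions: rows carry an `order_date` field, n is 3 years, and the run date is 10 June 2015. The `split_for_archive` function, the field names and the sample rows are hypothetical.

```python
from datetime import date


def split_for_archive(rows, cutoff):
    """Split rows into (keep_in_dw, move_to_archive) by comparing to a cutoff date.

    A row is archived when its order_date is strictly before the cutoff.
    """
    keep, archive = [], []
    for row in rows:
        if row["order_date"] < cutoff:
            archive.append(row)
        else:
            keep.append(row)
    return keep, archive


# Illustrative rows (made-up values).
orders = [
    {"order_id": 1, "order_date": date(2009, 3, 14)},
    {"order_id": 2, "order_date": date(2014, 11, 2)},
]

# Assumed policy: n = 3 years before an assumed run date of 10 June 2015.
keep, archive = split_for_archive(orders, cutoff=date(2012, 6, 10))
```

Rows in `archive` would move to Hadoop and stay queryable there, while rows in `keep` remain in the DW.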
Conclusions
- Look for business cases first
- Align and prioritise use cases with business strategy
- Understand data sources required
- Understand data and analytical characteristics required
- Then determine skills and technologies needed
- Look for technologies that automate tasks if skills are unavailable
- Integrate with existing environment

Thank You!
Big Data and Analytics
Stockholm, November 26-27, 2015

www.intelligentbusiness.biz
mferguson@intelligentbusiness.biz
Twitter: @mikeferguson1
Tel/Fax (+44)1625 520700