Architected Blended Big Data with Pentaho



Similar documents
Eliminating Complexity to Ensure Fastest Time to Big Data Value

Eliminating Complexity to Ensure Fastest Time to Big Data Value

The Power of Pentaho and Hadoop in Action. Demonstrating MapReduce Performance at Scale

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand

The Ultimate Guide to Buying Business Analytics

Big Data at Cloud Scale

Buying vs. Building Business Analytics. A decision resource for technology and product teams

The Ultimate Guide to Buying Business Analytics

Embedded Analytics Vendor Selection Guide. A holistic evaluation criteria for your OEM analytics project

The SMB s Blueprint for Taking an Agile Approach to BI

Blueprints for Big Data Success

Performance and Scalability Overview

Performance and Scalability Overview

Product Innovation with Big Data

Your Path to. Big Data A Visual Guide

Big Data Integration: A Buyer's Guide

Real-Time Data Access Using Restful Framework for Multi-Platform Data Warehouse Environment

9 Reasons Your Product Needs. Better Analytics. A Visual Guide

5 Big Data Use Cases to Understand Your Customer Journey CUSTOMER ANALYTICS EBOOK

INTELLIGENT BUSINESS STRATEGIES WHITE PAPER

Informatica PowerCenter The Foundation of Enterprise Data Integration

Automated Business Intelligence

Callidus for Insurance

Big Data Use Cases. To Start Today. Paul Scholey Sales Director, EMEA. 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866)

E-Guide THE CHALLENGES BEHIND DATA INTEGRATION IN A BIG DATA WORLD

Ten Mistakes to Avoid

Cloudera Enterprise Data Hub in Telecom:

Turn your information into a competitive advantage

Big Data Comes of Age: Shifting to a Real-time Data Platform

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop

Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer

Enterprise Data Integration

IST722 Data Warehousing

PUSH INTELLIGENCE. Bridging the Last Mile to Business Intelligence & Big Data Copyright Metric Insights, Inc.

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

Revenue Enhancement and Churn Prevention

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Data Virtualization: Achieve Better Business Outcomes, Faster

Data Virtualization and ETL. Denodo Technologies Architecture Brief

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

Agile Business Intelligence Data Lake Architecture

Converging Technologies: Real-Time Business Intelligence and Big Data

Cloud Integration and the Big Data Journey - Common Use-Case Patterns

Making Sense of Big Data in Insurance

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Three Open Blueprints For Big Data Success

UNIFY YOUR (BIG) DATA

Data Virtualization for Agile Business Intelligence Systems and Virtual MDM. To View This Presentation as a Video Click Here

The Principles of the Business Data Lake

Pentaho & MongoDB Partner to Solve Government Big Data Challenges

IP Expo 2014 Pentaho Big Data Analytics Accelerating the time to big data value London, UK

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

A SAS White Paper: Implementing a CRM-based Campaign Management Strategy

Master Your Data. Master Your Business. Empower your business with access to consolidated and reliable business-critical data

TopBraid Insight for Life Sciences

Evolution to Revolution: Big Data 2.0

Oracle Data Integrator 12c (ODI12c) - Powering Big Data and Real-Time Business Analytics. An Oracle White Paper October 2013

Deploying an Operational Data Store Designed for Big Data

Big Analytics: A Next Generation Roadmap

White. Paper. EMC Isilon: A Scalable Storage Platform for Big Data. April 2014

Healthcare Data Management

Data Warehousing in the Age of Big Data

Maximize strategic flexibility by building an open hybrid cloud Gordon Haff

The Top Challenges in Big Data and Analytics

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Business Voice BT s Managed Global Voice Solution

How To Make Data Streaming A Real Time Intelligence

FACTS ABOUT BIG DATA ANALYTICS PLATFORA. BIG DATA ANALYTICS Series

The Definitive Guide to Data Blending. White Paper

Why Most Big Data Projects Fail

Traditional BI vs. Business Data Lake A comparison

Data Virtualization. Paul Moxon Denodo Technologies. Alberta Data Architecture Community January 22 nd, Denodo Technologies

The Business Analyst s Guide to Hadoop

3 Top Big Data Use Cases in Financial Services

Cisco, Big Data and the Internet of Everything. Paul Davies, Big Data Sales Solution Leader, EMEAR Data Center

SQL Server 2012 Business Intelligence Boot Camp

Data virtualization: Delivering on-demand access to information throughout the enterprise

Integrating SAP and non-sap data for comprehensive Business Intelligence

Business Intelligence: Build it or Buy it?

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

> Cognizant Analytics for Banking & Financial Services Firms

Turnkey Hardware, Software and Cash Flow / Operational Analytics Framework

10 Steps to a Multichannel Strategy and an Exceptional Customer Experience

End Small Thinking about Big Data

Accelerate BI Initiatives With Self-Service Data Discovery And Integration

Unlocking the True Potential of Usage Data. Amdocs White Paper November 2014

TE's Analytics on Hadoop and SAP HANA Using SAP Vora

Create and Drive Big Data Success Don t Get Left Behind

Lambda Architecture. Near Real-Time Big Data Analytics Using Hadoop. January Website:

Databricks. A Primer

Providing real-time, built-in analytics with S/4HANA. Jürgen Thielemans, SAP Enterprise Architect SAP Belgium&Luxembourg

Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth

Building and Measuring Business Value:

Data as a Service Virtualization with Enzo Unified

WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics

Analance Data Integration Technical Whitepaper

See the Big Picture. Make Better Decisions. The Armanta Technology Advantage. Technology Whitepaper

Course Outline: Course: Implementing a Data Warehouse with Microsoft SQL Server 2012 Learning Method: Instructor-led Classroom Learning

Transcription:

Architected Blended Big Data with Pentaho A Solution Brief Copyright 2013 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at pentaho.com.

Contents Contents... 2 Introduction... 3 Customer Value from Big Data... 3 Evolving Big Data Architectures... 4 Why Blending at the Source Matters... 5 Accurate, Blended Big Data Analytics... 6 Pentaho 5.0 Architected for the Future... 7 PENTAHO 2

Introduction The value of Big Data is well recognized today, with implementations across every size and type of business today. What has become apparent is that the real value of big data is not the data in and of itself, but in the combination of data with other relevant data from existing internal and external systems and sources. This need to blend data to derive maximum value will only escalate as new types and sources of data and information continue to emerge. The previous approach to blending data for analytics focused on transforming and moving data into centralized data marts and warehouses, with all access and analytics targeting those aggregated stores. This approach was consistently time consuming, but made sense when most data was in structured, and largely relational, formats, at reasonable volumes. Customer Value from Big Data Monetizing big data-driven use cases driving need to blend data This approach no longer makes sense in the current world of big and specialized data. We are rapidly moving into an era of distributed analytic data architectures, where data remains housed in the type of store most optimal for its volume and variety. This new ecosystem is called different things by different research organizations the Hybrid Data Ecosystem by EMA, the Logical Data Warehouse by Gartner, the multi-platform Data Warehouse Environment by TDWI but all recognize this distributed approach for the future. Why? Because it simply doesn t make sense to move big data into an EDW as we did with relational data. The structural variety and volume make it extremely difficult and time consuming, delaying any valuable use of the data, plus the economics are not feasible due to the volumes involved. Think for example of the typical volumes major telcos must manage today with their call data records one major provider reports they are dealing with 140 billion records in 650,000 files per day! DRIVE INCREMENTAL REVENUE Predict customer behavior across all channels Understand and monetize customer behavior Begin to monetize data as a service IMPROVE OPERATIONAL EFFECTIVENESS Machines/sensors: predict failures, network attacks Financial risk management: reduce fraud, increase security REDUCE DATA WAREHOUSE COST Integrate new data sources without increased database cost Provide online access to dark data PENTAHO 3

Evolving Big Data Architectures NoSQL Network Location Process or Hadoop Cluster mongodb Analytic DB Web Social Media Just-in-time Integration & Blending Analytics Customer Provisioning Billing ETL Tool or EDW Data Marts BI Tools ETL Tool or The pace of business and competition has also changed, exacerbating the issues. Competitiveness and success require the ability to respond rapidly to changing conditions in real time. Businesses cannot afford to wait for data to be extracted, merged, cleansed, transformed, and stored before it can be analyzed. Agile analytics require the data to be accessible where it resides at the source to ensure it is based on the most up to date view possible. Businesses must utilize the most efficient point of processing to deal with the potential volumes being searched and returned. All of these factors signal a new age of information architecture that must manage the flow of data through analytics differently than before. Data must be blended in place, across disparate stores and structures, while ensuring performance and ease of use across the breadth of analytics from historical to operational to predictive. Let s look at an example of blending at the source to better understand these points. In the diagram below we have an example of Telco customer experience analytics. Customer Experience Analytics have the same goal in every industry preventing customer churn and creating better loyalty in order to protect and grow revenue. After all, in this age of commoditization, service and fast response to product requests become the new differentiators driving loyalty in most industries. Telco customer allegiance comes mostly from satisfaction with calling plans and the quality and availability of service. Call detail records have long been created and derived from the operational systems for access to BI and reporting systems via warehousing, but they only make up part of the picture. PENTAHO 4

Why Blending at the Source Matters Customer Experience Analytics for loyalty and revenue PREVENT CUSTOMER CHURN, CREATE BETTER LOYALTY AND PROTECT REVENUE Call Detail Records from: Customer Provisioning ETL Tool or EDW > Billing > Payment > Usage Analytics Analyze quality of service: > Network outages > Dropped calls Billing Blend revenue-related > Poor quality > Calls to support center and quality-of-service data together to find For profiles of customers: customers at risk > Up for renewal > Profitable > Multiple agreements/ Network Location NoSQL mongodb Call Detail Records from Network: > Outages > Drops services > In competitive area Determine best action to take: > Billing credit > Customer coupon > No action Quality of service changes in real time dependent on the network: was the customer able to connect, to hear, to remain connected without being dropped, etc.? This network-based data is usually captured in a big data source that is capable of handling the volume and unstructured nature of the data AND can be blended with the Call Detail Record information to give the complete picture of a customer s experience. With Pentaho, you can easily create architected, blended views across both the traditional Call Detail Records in the warehouse, and the network data streaming into Big Data/NoSQLstore (MongoDB in this example) without sacrificing the governance or performance you expect. These blended views allow analysts and customer call centers to get accurate, of-the-minute information in near real time to determine the best action to take for each customer to maximize satisfaction and retain them as loyal customers even when outages or other service quality issues occur. PENTAHO 5

Just in time, architected blending delivers accurate big data analytics based on blended data. You can connect to, combine, and even transform data from any of the multiple data stores in your hybrid data ecosystem into blended views, then query the data directly via that view using the full spectrum of analytics in the Pentaho Analytics platform, including predictive analytics. Most importantly, since these blends are architected on the source data, you maintain all the rules of governance and security over the data while providing the ease of use and real time access needed for today s agile analytics requirements. > Sensitive data is kept from those who are not allowed to use or view it. > Full lifecycle maintenance, change management and control ensuresthe blends being used meet changing requirements. > Auditability is preserved. > Blends are designed with full knowledge of the underlying data volumes and source system capabilities and constraints, preserving throughput and performance during analytic access and preventing the query from hell / runaway query problems prevalent in many federation tools. Accurate, Blended Big Data Analytics Optimally stored data, blended when needed Just-in-time blending of data from multiple sources for a complete picture > Connect, combine and transform data from multiple sources > Query data directly from any transformation > Access architected blends with the full spectrum of Pentaho Analytics > Manage governance and security of data for on-going accuracy Customer Provisioning ETL Tool or EDW Billing Just-in-time Integration & Blending Analytics Network NoSQL Location mongodb PENTAHO 6

Combining the power of design via drag-anddrop across all data sources, including schemas generated on read from big data sources, with knowledge of the full data semantics the real meaning, cardinality, and match of fields and values in the data means your business gets accurate results. Decisions become optimized and actions can be taken to significantly impact business positively and improve results. Other solutions in the market talk about blending but it s not apples to apples. Blending at the glass, i.e. blending done by end users or analysts away from the source with no knowledge of the underlying semantics, often delivers inaccurate or even completely incorrect results. There is not way to ensure that the chosen fields being matched truly do match. For instance, think what happens when someone matches two fields both named revenue in records that match on customer, but one is a monthly sum total and the other is a daily total. This won t be apparent to an analyst since the blending is done based on similar names. The analyst then runs a summation that adds the two together as the day s total revenue from that customer. Unwittingly the monthly figure is added into each day s total, distorting the actual revenue generated from that customer Pentaho 5.0 Architected for the Future Simplified analytics experience for all users Simplified Analytics Experience Blended Big Data Enterprise Big Data Integration Visual Analytics Data Integration Predictive Analytics 100% Java Open Web-Based API s Pluggable Architecture Operational Data Big Data Public/Private Clouds Data Stream PENTAHO 7

dramatically. The business then targets that customer as highly profitable and offers significant discounts to maintain their interest. Not only have you targeted the wrong customer and potentially ignored the real profitable customers you ve also now given undeserved discounts. The net result lowers your revenue from this customer, and potentially causes a loss of other profitable customers who were more deserving but left in favor of competitors offering them discounts. You ve made the wrong decision because the analytics themselves were inaccurate and incorrect. The only choice to avoid this without tools that blend at the source is to train every user and analyst on the semantics of the data to ensure reliable results. The solution is of course largely infeasible for most organizations as it would take far too much time and expense while impacting productivity. Even if you can take on this level of investment in training, you still face issues with the timeliness of the data, since other tools do not pull from the source systems. It is impossible to know if the data pulled is indeed the latest and therefore the most accurate on that level. Appearance isn t everything, and your decisions should not be based on interpretation. You need to be sure the analytics lead to accurate information so you can make the right decision. You need architected data blending at the source you need Pentaho! PENTAHO 8

Global Headquarters Citadel International - Suite 340 5950 Hazeltine National Dr. Orlando, FL 32822, USA tel +1 407 812 6736 fax +1 407 517 4575 US & Worldwide Sales Office 353 Sacramento Street, Suite 1500 San Francisco, CA 94111, USA tel +1 415 525 5540 toll free +1 866 660 7555 Learn more about Pentaho Business Analytics pentaho.com/contact +1 (866) 660-7555 United Kingdom, Rest of Europe, Middle East, Africa London, United Kingdom tel +44 7711 104854 toll free (UK) 0 800 680 0693 FRANCE Offices - Paris, France tel +33 97 51 82 296 toll free (France) 0800 915343 GERMANY, AUSTRIA, SWITZERLAND Offices - Frankfurt, Germany tel +49(0)89 / 37 41 40 81 toll free (Germany) 0800 186 0332 BELGIUM, NETHERLANDS, LUXEMBOURG Offices - Antwerp, Belgium tel +31 6 52 69 88 01 toll free (Belgium) 0800 773 83 Copyright 2013 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at pentaho.com. 13-140