Product Innovation with Big Data


A resource for software product organizations and enterprise IT groups

Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at pentaho.com.

Introduction

Effective product managers focus on the day-to-day fulfillment of user requirements and project deliverables while still keeping their eyes on the horizon, looking toward the market and technology trends that will help them create sustainable competitive advantages. The goal of this brief is to explain why Big Data represents one of those key trends and how it can facilitate better outcomes for end users, the business, and other stakeholders. Technology organizations often lead the way in early adoption of disruptive systems, and we have already seen this happen with Big Data. Read on to understand the implications for application problem-solving potential, scalability, and intelligence.

Big Data Background

Historically, data that was high in volume, diverse in structure, and rapidly changing posed difficult challenges for organizations that were used to working with traditional relational database technology. However, new technical paradigms such as schema-on-read, massively parallel processing, and MapReduce have reduced the overhead required to get raw data into a data store and drastically increased the speed and efficiency of processing large amounts of data. They have also made unstructured and semi-structured data much more accessible for businesses. These innovations have begun to unleash actionable analysis on a variety of previously challenging data sources, including web logs, documents and text, social media, mobile devices, and industrial sensors. Even dark data (data locked in corporate warehouses with little analytic access) has been given new life through these technologies.
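To make schema-on-read and the MapReduce paradigm more concrete, here is a minimal single-process sketch in Python. It is an illustration only, not how Hadoop itself is implemented: the sample log lines, the field names, and the helper functions (read_with_schema, map_phase, reduce_phase) are all hypothetical. Raw lines are stored untouched, a schema is applied only when they are read, and page views are counted with a map step followed by a reduce step.

```python
import json
from collections import Counter
from functools import reduce

# Hypothetical raw events, stored exactly as they arrived (no upfront schema).
RAW_LINES = [
    '{"user": "a1", "page": "/home", "ms": 120}',
    '{"user": "b2", "page": "/cart"}',
    'not-json garbage line',
    '{"user": "a1", "page": "/home", "referrer": "ad"}',
]


def read_with_schema(line, fields=("user", "page")):
    """Apply a schema at read time: keep only the requested fields.

    Returns None for lines that cannot be parsed or that lack a field,
    so malformed raw data never blocks ingestion.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict) or any(f not in record for f in fields):
        return None
    return {f: record[f] for f in fields}


def map_phase(lines):
    """Map step: emit a partial count (page -> 1) per readable record."""
    for line in lines:
        record = read_with_schema(line)
        if record is not None:
            yield Counter({record["page"]: 1})


def reduce_phase(partials):
    """Reduce step: merge the partial counts into totals."""
    return reduce(lambda left, right: left + right, partials, Counter())


page_views = reduce_phase(map_phase(RAW_LINES))
```

In a real cluster the map and reduce steps would run in parallel across many machines; the point of the sketch is only that extra or missing fields in the raw data do not require a schema change before loading.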
As open source Big Data technologies have matured into commercially supported products, we have seen several Big Data platform categories start to gain rapid adoption:

Hadoop distributions: Frameworks for large-scale data storage and high-performance processing across a distributed file system with the MapReduce paradigm, as well as the more recently released YARN (introduced alongside MapReduce 2) for cluster resource management; ideal for high-volume unstructured data. Hadoop vendors include Cloudera, MapR, and Hortonworks, among others.

NoSQL stores: Non-traditional databases with a flexible structure; often ideal for extremely rapid data ingestion and large numbers of reads based on key values. Rather than storing data in a relational or tabular structure, these stores may leverage structures such as documents, graphs, key-value pairs, or columns. Sample NoSQL store providers include MongoDB and DataStax (Cassandra database).

Analytic databases: Databases designed for high-performance analytics, leveraging techniques like compression, column-based storage, and high-speed bulk inserts of structured data; ideal for complex queries and OLAP analysis. Examples include HP Vertica and Amazon Redshift.

Taken together, these systems have enabled organizations to start harnessing data that is massive, fast moving, and diverse in structure, with powerful implications for both application and analytic capabilities.
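The column-based storage and compression mentioned for analytic databases can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the storage format of Vertica, Redshift, or any other product: the sample rows and the helper names (to_columns, run_length_encode) are hypothetical.

```python
from itertools import groupby

# Hypothetical row-oriented records, as an application database might hold them.
ROWS = [
    {"region": "east", "units": 3},
    {"region": "east", "units": 5},
    {"region": "west", "units": 2},
    {"region": "west", "units": 4},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(ROWS)
encoded_region = run_length_encode(columns["region"])  # repeated values collapse
total_units = sum(columns["units"])                    # aggregate touches one column only
```

Storing each column contiguously means an aggregate reads only the columns it needs, and repeated values within a column compress well, which is one reason this layout suits analytic queries.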

Changing the Game for Software Applications and Products

While the technology landscape is still evolving, teams in the software, web, and hardware areas have led the way in delivering real value from Hadoop, NoSQL, analytic databases, and other emerging technologies. They have shown that integrating these Big Data systems into existing, primarily relational architectures can create highly competitive product capabilities that deliver big benefits to software end users.

A good way to understand this is to contrast innovative Big Data applications with more traditional applications supported largely by relational databases. Aaron Kimball, a committer on the Apache Hadoop project since 2007, indicates that the rigid structural requirements for storage and retrieval of data on relational platforms can limit traditional applications to solving narrowly defined problems. He suggests that Big Data applications can complement these traditional architectures, introducing a wider array of problem-solving possibilities thanks to flexible accommodation of different data structures and latency requirements. [1]

An example of this idea is Paytronix, a customer loyalty technology provider to restaurant chains. Initially, Paytronix customers had access to basic survey data on demographic characteristics of their patrons, such as age and whether or not they had children. However, this data was self-reported and not always accurate. Leveraging a Big Data architecture including Hadoop and Pentaho for multi-format data processing, Paytronix was able to correct the missing and inaccurate demographic information by modeling and blending customer social media profile data and on-site entree ordering trends. Innovation with this data has allowed Paytronix to go beyond established loyalty program services and provide intelligent marketing and segmentation recommendations to customers that can boost the value generated by those programs.
From a strategic point of view, the ability to support novel sources of rich information in near real-time opens up possibilities for new products targeted at new use cases and, potentially, new markets. At Pentaho, we have seen that ingesting and blending a wider variety of data sources into Big Data systems can provide a more complete picture of customers across different industries, which ultimately can lead to better business decisions. These analytics can also be automated behind the scenes with data pre-processing and predictive algorithms in order to deliver improved application experiences and operationalize insights as part of product workflow. In other words, Big Data architectures can fuel comprehensive data-driven applications, and not just analytics on large amounts of data.

According to recent research, a new design approach is leading to apps that leverage big data predictive analytics to anticipate and provide the right functionality and content on the right device at the right time for the right person by continuously learning about them. [2] Machine learning and predictive analytics, when applied to multi-source blended data at scale, allow applications to become more intelligent and responsive to end user needs in a timely fashion. The same type of Big Data architectures can also bring to life smarter devices and equipment in B2B sectors, like heavy industry and networking. The end result is automated and intelligent products driven by prescriptive analytics.

Big Data can also help technology teams deliver a greater degree of scalability, in terms of data volumes processed, user loads, and responsiveness of applications to real-time or near real-time requests. Hadoop, for instance, can accelerate data processing and reduce storage costs by an order of magnitude relative to traditional approaches.
Meanwhile, NoSQL frameworks are often able to fuel faster, more efficient application performance on hot or more urgently required data sets that are closer to the presentation layer for end users. In general, many of these technologies have evolved from projects that originally started inside the walls of some of the largest and most successful consumer technology firms, which needed to support user bases that were growing into the hundreds of millions and beyond. This type of scalability is just now becoming accessible to technology firms of all sizes and sub-sectors.

[1] Aaron Kimball, The secrets of designing and building big data apps, venturebeat.com, 12/24/2013.

Blueprints for Next-Generation Applications

While there is not one right way to leverage big data to create a new end user application or enhance an existing one, Pentaho has observed a few patterns based on customer and market experience. This diagram is not meant to illustrate a complete solution architecture, but rather a blueprint for different data components seen in emerging applications.

[Figure: Big Data Application Patterns. Data sources: weblog and social media data; machine, sensor, and device data; structured and relational data (customer profile data, existing application data). Hadoop cluster: fast processing on many data formats, affordable historical storage, training machine learning algorithms. NoSQL store: flexible and fast read/write access, near-line speed layer for performance, operational store. Relational databases: existing application database that may integrate with other enterprise systems and customer profile data; fast analytic queries for end users, often used in a data refinery pattern with Hadoop for Big Data analytics. Client-side user interface: web-based experience including embedded visual analytics.]

Often we see a two-tiered architecture, where Hadoop serves as a massively scalable back-end archive and training ground for machine learning and predictive analytics algorithms. It also ingests the previously challenging semi-structured and unstructured data that were not a fit for traditional relational database technology. Closer to the user, a NoSQL database often serves as an operational store, holding less data than Hadoop but designed to facilitate accelerated application performance and address near real-time data needs.

These components together support core application functionality, while an analytic database meets needs for high-performance, low-latency ad hoc analysis, visualization, and reporting by end users. The visual analytics are often embedded in the user interface as a seamless part of the end user experience with that application.
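The two-tiered pattern described above can be sketched in a few lines of Python. This is a toy, in-memory model under loud assumptions: the class name TwoTierStore, its methods, and the eviction rule are hypothetical, and plain Python structures stand in for the Hadoop archive and the NoSQL operational store. It is not a Pentaho API or a reference design.

```python
from collections import OrderedDict


class TwoTierStore:
    """Toy model of the archive-plus-operational-store pattern.

    `archive` stands in for the Hadoop tier (append-only, keeps everything);
    `hot` stands in for the NoSQL tier (small, keyed, latest value per key).
    Both are plain in-memory structures, purely for illustration.
    """

    def __init__(self, hot_capacity=2):
        self.archive = []          # full history, cheap to append
        self.hot = OrderedDict()   # recent keys only, fast lookups
        self.hot_capacity = hot_capacity

    def ingest(self, key, value):
        """Record an event in the archive and refresh the hot tier."""
        self.archive.append((key, value))
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            self.hot.popitem(last=False)  # evict the least recently written key

    def lookup(self, key):
        """Serve an application read from the hot tier only."""
        return self.hot.get(key)

    def history(self, key):
        """Batch-style scan of the archive, e.g. to build training data."""
        return [value for k, value in self.archive if k == key]


store = TwoTierStore(hot_capacity=2)
for user, page in [("u1", "/home"), ("u2", "/cart"), ("u1", "/deals"), ("u3", "/home")]:
    store.ingest(user, page)
```

After these four events the hot tier holds only the two most recently written users, while the archive still has every event available for batch analysis or model training.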
Overall, the different data stores and frameworks are linked via a data integration and orchestration layer, which may include Pentaho. This both streamlines the delivery of data in the application architecture and facilitates the use of predictive algorithms in an automated process.

[2] Mike Gualtieri, Forrester Research, Predictive Apps Are the Next Big Thing in Customer Engagement, 6/25/2013.

Real Life Examples

As indicated above, the design patterns discussed are based on real-life examples from Pentaho's customer base. Interestingly, many of these examples fall into one of two categories: 1) intelligent CRM, marketing, and e-commerce products, and 2) Internet of Things (IoT) products that leverage sensor, equipment, or device data. We'll discuss an example of each below.

RICH RELEVANCE
Next Generation Data Platform for Retail Personalization

RichRelevance provides a platform that delivers personalization services for Fortune 500 retailers, allowing them to deliver the most relevant content to their customers across online and in-store engagement channels. The platform delivers over 50 million personalized shopping sessions a day with sub-second response times. This performance is only possible thanks to the company's early investment in an intelligent Big Data application architecture.

At its core, the RichRelevance platform leverages Hadoop, HBase (a NoSQL database), and Hive (a relational layer on top of Hadoop), as well as Pentaho, though the company is always incorporating new frameworks. These systems enable the rapid ingestion and processing of massive amounts of web session information, like pageviews and purchases, as well as rapidly changing product catalog information. On this architecture, RichRelevance runs a variety of regularly updated predictive algorithms based on web visitor behavior, product information, and merchandising objectives in order to determine the best content to serve. These recommendations can be optimized to maximize margin and revenue against such constraints as inventory stocks and legal restrictions. RichRelevance has not only streamlined 1.6 petabytes of diverse data, but has also embedded Pentaho analytics into its customer-facing application to provide insight into the performance of these omni-channel personalization services.
Overall, Big Data has enabled RichRelevance to create a unique offering to retailers that serves highly personalized content to each individual shopper in order to boost conversion at scale.

[Figure: RichRelevance Big Data Architecture. Data sources: server log data, customer demographics, website tracking, customer ERP and supply data, online transactions. Components: PDI, data marts, Business Analytics Server. Users: data scientist (data mining and machine learning refinement), business user such as a CFO (real-time reporting), end users (agile BI capabilities and self-service).]

RUCKUS WIRELESS
Delivering Differentiated Networking Products with Big Data

Ruckus Wireless is a high-performance wireless infrastructure provider, catering both to telecommunications carriers and enterprises. Recently, the company sought to launch a flexible analytics product to provide its clients with detailed visibility into network traffic, capacity, and performance. In order to provide the best possible product, it adopted a Big Data architecture that could make the solution scalable to decade-long analysis on millions of user sessions and hundreds of thousands of wireless access points per carrier.

To meet these needs, Ruckus leverages Pentaho to ingest complex JSON and XML files from the Wi-Fi equipment into a Hadoop cluster, later pulling data into HP Vertica (an analytic database) for high-performance Wi-Fi network analytics. Further, Ruckus chose to partner with Pentaho in order to OEM an analytics solution for drag-and-drop reporting, ad hoc analysis, and visualization. The new Ruckus analytics offering enables customers to quickly uncover trends in the health and performance of their networks, at a scale of data only possible with a Big Data back-end. Importantly, Ruckus has been able to launch the application as a new revenue-generating product, which complements its hardware-focused core business.

[Figure: Ruckus Wireless Big Data Architecture. Data sources: unstructured Wi-Fi data, account and ERP data, machine and network data. Components: PDI, Business Analytics Server. Users: data scientist (data mining and machine learning refinement), business user such as a CFO (real-time reporting), end users (agile BI capabilities and self-service).]
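The kind of ingestion step described for Ruckus, where nested JSON and XML are turned into uniform rows before loading into an analytic store, might look roughly like the following Python sketch. It is not Ruckus's or Pentaho's actual pipeline: the payloads, the field and tag names, and the helper functions (flatten, row_from_json, row_from_xml) are all invented for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads; the field and tag names are invented for illustration.
JSON_PAYLOAD = '{"ap": {"id": "ap-17", "stats": {"clients": 42, "rx_mb": 10.5}}}'
XML_PAYLOAD = '<ap id="ap-18"><stats><clients>7</clients><rx_mb>3.25</rx_mb></stats></ap>'


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def row_from_json(text):
    """Parse a JSON document and map it onto the common row layout."""
    record = flatten(json.loads(text))
    return {
        "ap_id": record["ap.id"],
        "clients": int(record["ap.stats.clients"]),
        "rx_mb": float(record["ap.stats.rx_mb"]),
    }


def row_from_xml(text):
    """Parse an XML document and map it onto the same row layout."""
    root = ET.fromstring(text)
    return {
        "ap_id": root.get("id"),
        "clients": int(root.findtext("stats/clients")),
        "rx_mb": float(root.findtext("stats/rx_mb")),
    }


rows = [row_from_json(JSON_PAYLOAD), row_from_xml(XML_PAYLOAD)]
```

Both formats end up as rows with the same three columns, which is the shape a column-oriented analytic database expects for bulk loading.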

Conclusion

The Big Data market is still in its early innings, but we are already seeing pioneering tech teams and product companies leverage Hadoop, NoSQL, and other emerging systems to deliver intelligent, data-driven applications that delight users in novel and valuable ways. Recent changes in the technology landscape have made it possible to build capabilities into applications that were once only dreamed of: think intelligent recommendations to millions of users on demand, and automated granular analytics on sensors across networking equipment, jet engines, and maritime vessels. These use cases are not restricted to firms like Facebook, Netflix, or General Electric; they are now much more broadly accessible.

Learn more about Pentaho Business Analytics
pentaho.com/contact
+1 (866) 660-7555

Global Headquarters
Citadel International - Suite 340
5950 Hazeltine National Drive
Orlando, FL 32822, USA
tel +1 407 812 6736
fax +1 407 517 4575

US & Worldwide Sales Office
353 Sacramento Street, Suite 1500
San Francisco, CA 94111, USA
tel +1 415 525 5540
toll free +1 866 660 7555

United Kingdom, Rest of Europe, Middle East, Africa
London, United Kingdom
tel +44 (0) 20 3574 4790
toll free (UK) 0 800 680 0693

FRANCE
Offices - Paris, France
tel +33 97 51 82 296
toll free (France) 0800 915343

GERMANY, AUSTRIA, SWITZERLAND
Offices - Munich, Germany
tel +49 (0) 322 2109 4279
toll free (Germany) 0800 186 0332

BELGIUM, NETHERLANDS, LUXEMBOURG
Offices - Antwerp, Belgium
tel (Netherlands) +31 8 58 880 585
toll free (Belgium) 0800 773 83

ITALY, SPAIN, PORTUGAL
Offices - Valencia, Spain
toll free (Italy) 800 798 217
toll free (Portugal) 800 180 060