Navigating Big Data business analytics




MWD Advisors

Helena Schwenk
A special report prepared for Actuate, May 2013

This report is the third in a series and focuses principally on explaining what's needed in the business analytics layer of a Big Data platform. For more information about how this layer relates to others in a Big Data platform, please refer to the corresponding papers in this series: Navigating the Big Data infrastructure layer and Turning Big Data into Big Insights. Finally, for more information about the opportunities and challenges posed by Big Data for organisations today, please refer to the first paper in the series, Unlocking the potential of Big Data.

This is a special report prepared independently for Actuate. For further information about MWD Advisors research and advisory services please visit www.mwdadvisors.com.

MWD Advisors is a specialist advisory firm which provides practical, independent industry insights to business analytics, process improvement and digital collaboration professionals working to drive change with the help of technology. Our approach combines flexible, pragmatic mentoring and advisory services, built on a deep industry best practice and technology research foundation.

Summary

Understand the business need of Big Data

To really get to value from your Big Data you first need to understand how this new world of varied and voluminous data sources can potentially solve problems or create opportunities within your business. It requires you not only to make sense of your data by analysing it and deriving meaningful insights from it, but also to apply those insights in a business context in a timely and impactful way.

The promise of a Big Data platform is that it takes data in its rawest form and converts it into consumable, actionable information

The concept of a Big Data platform provides a technology framework for taking data in its rawest form, transforming it and putting it in a format where it can be consumed and acted upon by decision makers. Three core layers are required to support these capabilities: the lowest layer is responsible for the storage and organisation of data; the middle layer is where the analysis of that data occurs; and the upper layer is where insights are discovered and consumed. This report focuses on the second: the business analytics layer.

Business analytics brings potential to Big Data

Business analytic tools help bring understanding and meaning to your Big Data. Technologies such as predictive analytics, for instance, can analyse and model Big Data to help make predictions about future events, whereas visual analytics tools can help identify trends or patterns in large volumes of data more easily, and text mining and natural language processing can be used to understand sentiment and extract meaning from textual data.

Tool choices are dependent on a range of factors

Which tool you choose will ultimately depend on the problem your business is trying to solve. Equally, the choice needs to take into account technical factors such as the type of data being analysed (whether it's structured or multi-structured, for example) and the scope of the analysis being performed (such as whether it involves real-time, exploratory or advanced analytic techniques). Being able to understand and map both business and IT requirements to your business analytic tool choices remains an important part of any Big Data initiative.

Technology cost and sophistication driving the Big Data train

As outlined in the first report in this series, Unlocking the potential of Big Data, in spite of all the headlines and vendor rhetoric, the ability to manage growing volumes of data is not a new phenomenon for organisations today. In fact, many early adopters of Business Intelligence (BI) and data warehousing technology (especially in the retail, telecoms and financial services industries) have long been accustomed to capturing and managing large volumes of data. Yet we still see the rise and rise of Big Data as a seemingly new concept, so what has changed?

Through their own technology innovations, web and social data-driven businesses such as Google and LinkedIn have shown us how to process Big Data sets (in their case web searches) on massively scalable storage and computing platforms using commodity hardware. Their technology expertise and success is the inspiration behind open source Big Data technologies such as Apache Hadoop and its ecosystem of tools (which we introduce in more detail in the second report of this series, Navigating the Big Data infrastructure layer).

The challenge of processing certain kinds of Big Data has also driven other technology innovations related to massively parallel processing architectures, in-memory analytics, columnar databases and complex event processing platforms. All of these pieces bring more choices to organisations that want to advance their use and management of Big Data. Similarly, enhancements in predictive analytics, text mining and advanced visualisation tools make the exploitation of Big Data more straightforward by making it easier to discover hidden or interesting patterns and insights that, in turn, can be used to enhance productivity, drive efficiencies and growth, and create a sustainable competitive advantage.

Figure 1: Drivers of broader Big Data adoption (Source: MWD Advisors)

But it's not only technology developments spurring the advancement of Big Data; as figure 1 shows, the deployment economics of these technologies are equally important. In particular, the decreasing cost of storage and memory, the scalability of cloud computing platforms and appliances, and the growing influence of open source tools bring the promise of lower cost and more affordable Big Data platforms. The opportunities of Big Data are opening up to a wider audience as it becomes more economically feasible to exploit, manage and leverage Big Data, especially for those organisations that may have been priced out of this activity previously.

A Big Data platform has three layers

Most of the commentary around Big Data has focused on the type of data under management: whether structured or multi-structured (defined as data stored and organised in a multitude of formats, including text, video, documents, web pages, email messages, audio or social media posts, and so on), or real-time or in-motion data. However, before any decision can be made about what kind of information and technology capabilities are required to support this, there needs to be agreement and buy-in about what you want to achieve from your Big Data initiative. At the very least it needs to be framed by a clear strategy that helps outline how data and analytics can be tied to a particular business challenge or opportunity that needs addressing.

This in turn provides the starting point from which organisations can assess the technical implications of their Big Data effort, for example by examining how data can be transformed from its raw state to a point where it can be consumed and acted upon. To support this, a Big Data platform needs to provide capabilities for: capturing, processing and storing data; exploring data and applying advanced analytics techniques; and discovering and consuming insights.

Today these activities are supported by a multitude of technology components. Some of them are relatively new, while others are based on existing technologies and architectures. In figure 2 we bring these concepts together as part of an overall Big Data platform with three layers. The lowest layer is concerned with organising and storing data; the middle layer is where the analysis of that data occurs; and the upper layer is where insights are discovered and consumed.

Figure 2: Capabilities of the Big Data platform layers (Source: MWD Advisors)

Although these capabilities aren't necessarily new to BI and data warehousing practitioners, it's become apparent that the old models for storing and analysing data don't necessarily apply to all Big Data assets. Not only is the amount of data vast and potentially more time-sensitive in nature, but the variety of data to be managed can be far greater, and this is markedly changing the requirements of the technology needed.

This report focuses principally on explaining what's needed in the analytics layer of a Big Data platform. Please refer to the other papers in this series for an explanation of the other two layers.

Getting to grips with Big Data business analytics

Within the Big Data analytics layer, technologies extract value from data by exploring, modeling and analysing it. Assuming that your company has been successful in organising and storing its Big Data assets, it's at this point that the data comes to life and organisations have the potential to unlock valuable insights within it.

However, before any decision is made about what technology to use, any organisation embarking on a Big Data initiative needs to be clear about the business challenge or opportunity it is trying to address, whether that's devising a more profitable pricing strategy, offering more sophisticated product recommendations, improving fraud detection or applying more granular customer segmentation to your data. Once this has been established you can then look at how business analytic technology can help support these aims and objectives.

What technology you use to support the analysis of Big Data, however, depends on two key factors: the type of data that is required for analysis (such as whether it's structured or multi-structured data), and the use cases driving the need. To help build a picture of what technology fits where in a Big Data analytic environment, it's worth classifying and grouping the different types of analysis that can be performed with these technologies. Our research suggests that three broad categories are prevalent:

The first category is a practice focused on applying sophisticated algorithms such as machine learning, predictive modeling or natural language processing algorithms to Big Data (either structured or multi-structured) to solve a particular business problem or maximise an opportunity. It can be performed by line-of-business and/or IT users and is focused on a specific goal, such as predicting churn, identifying a customer's propensity to respond or understanding consumer sentiment, that is identified before the analytics process can begin.

Real-time analysis is focused on using technology enablers such as in-memory or event stream processing engines to facilitate the rapid ingestion and/or analysis of data. The results are either served up in real time to a user (as an online product recommendation, for example) or presented to business users in dashboards where the information is used to drive decision-making.

Exploratory analysis differs from traditional BI query and reporting as it centres on exploring a complete set of less well understood data (rather than a sample) to determine what data has value, and where the hidden patterns and trends lie within it, without any constraints as to what those patterns or trends may imply. Exploratory analysis may be performed in an academic or research setting and hence requires a different mindset, one where an analyst or data scientist can be more creative in their analysis and one where they don't always have a clear understanding of the questions they want to ask of the data.

Table 1 below provides an overview of the key technologies you should consider as part of your Big Data analytics layer. As you can see from the table, Big Data encompasses a whole range of technologies and tools. Some, such as predictive analytics or SQL tools, are well established, whereas others, especially where the analysis of multi-structured data is required, shine the spotlight on a newer breed of Big Data technologies such as Hadoop Hive or text analytics.

Table 1: Big Data analytics options
Columns: Big Data analysis technology; key facts.

Predictive and advanced analytics
The main goal of predictive analytics is to develop a model using a combination of sophisticated analytic algorithms, statistical models and mathematical calculations that analyse current and historical facts to make predictions about future events. Some database vendors support the execution of advanced analytics within the database (typically within SQL-based MPP databases) to take advantage of the parallel processing capabilities of the source database and speed up query processing times. Today an increasing number of analytic applications are also being built on Hadoop HDFS using the MapReduce paradigm in languages such as R, or by utilising Apache Mahout, an open source project providing a library of scalable machine learning and data mining algorithms.

In-memory visual analytic tools
Underpinned by an in-memory database, these tools support advanced users in the interactive, on-the-fly exploration and analysis of large, complex structured data sets to help pinpoint trends, segment the data set, and identify outliers and hidden patterns far more easily and often in real time.

Text analytics
Text analytics applies linguistic rules and statistical methods to automatically assess, analyse and find patterns within large quantities of electronic text such as social media posts, emails and call centre notes. The process of analysing text usually involves parsing and filtering the text, then understanding and extracting its meaning in a structured form for use and analysis in a data store such as a data warehouse. Sentiment analysis that utilises Natural Language Processing (NLP) techniques is a growing branch of text analytics used to extract subjective information about opinions, attitudes, emotions and perspectives from text.

SQL
SQL is the primary query language used by most BI and analytics tools as well as a lot of business analysts. While it is primarily used to query structured data, today many vendors are increasing support for querying Hadoop directly using SQL, for example by supporting a Hive interface which allows SQL to be converted to a MapReduce program and processed within Hadoop.

Event stream processing
This technology detects events or patterns of events as data streams through transactional systems, networks or communications buses, before correlating and analysing the data so an appropriate action can be taken to minimise risk or maximise an opportunity, for example. Analysis of data occurs when the data is in motion, i.e. usually before the data is stored in a database or file system, and is often used in conjunction with other technologies such as business rules, predictive analytics and optimisation techniques to help organisations automate and guide decision-making processes, for instance around detecting fraud, managing risk, optimising pricing and strategic process improvements.

Mapping Big Data technologies to analytic use cases

To help explain how these analytic use cases map to your Big Data analytic technology choices, the following table takes a look at some sample Big Data applications and details what makes each technology option particularly suitable for this form of analysis. As always, this should only be used as a guide, as it does not take into account other factors such as interoperability with existing tools and infrastructure, budget, and skill levels that will also naturally dictate technology choices. For a more detailed explanation of each storage component mentioned please refer to the other paper in this series, Navigating the Big Data infrastructure layer.

Table 2: Big Data applications and supporting technologies
Columns: example application area; usage scenario; example data type; example technology option.

Customer churn analysis
Data type: structured
Technology option: Predictive data mining models that analyse transactional, behaviour, demographic and social interaction data can take advantage of the in-database analytics and parallel processing capabilities of the SQL MPP database to run and score customers to identify those that are at risk of churning.

Marketing campaign analysis
Data type: structured
Technology option: In-memory visual analytic tools can be used to analyse revenue by market, campaign or other attributes to help improve campaigns and market segmentation, as well as identifying segments in the customer base that can be used to tailor marketing messages to particular groups or markets.

Click stream analytics
Data type: multi-structured and structured
Technology option: Hadoop MapReduce programs written in R can support the parallel processing of large amounts of web log files, where insights into navigation behaviour are extracted and combined with existing customer data from the data warehouse to support activities such as website optimisation and conversion rate analysis.

Product affinity analysis
Data type: multi-structured and structured
Technology option: Statistical methods are used to determine the relationship between different products and/or product features based on customer purchasing patterns, interaction and transaction data. This can then be analysed using visualisation tools to identify opportunities for cross-selling and up-selling, for example.

Real-time sentiment analysis
Usage scenario: real-time
Data type: structured and multi-structured
Technology option: Event stream processing technology that combines sophisticated analytics and natural language processing technologies can be utilised to enable real-time opinion mining on millions of public tweets to gain a view into brand performance that in turn can help organisations understand target audiences and shape decision-making.

Real-time offer management
Usage scenario: real-time
Data type: structured and multi-structured
Technology option: In-memory technology and advanced analytics tools can be used to calculate loyalty card points in real time so that when a customer enters the store, they are provided with real-time offers based on loyalty status and specific store inventory.

On-line recommendation engine
Data type: multi-structured
Technology option: HDFS can be used to store and process huge volumes of online behaviour data and used in conjunction with Mahout's library of machine learning algorithms (which operates on top of Hadoop) and the Pig language to recommend complementary products based on predictive analysis for cross-selling.

Customer segmentation
Usage scenario: real-time
Data type: structured
Technology option: In-memory visual analytic tools can query and analyse large amounts of structured data, providing a fast and interactive way to segment customers based on behaviour or attributes of customer data to help quickly identify potential growth or profitable customer segments.

Drug research
Usage scenario: exploratory
Data type: multi-structured
Technology option: Hadoop MapReduce can support the processing and interpretation of large amounts of research data. The ability to easily and economically store data in its rawest form without the need for rigid formatting means analysts can focus their efforts on building hypotheses and exploring what questions could be asked of that data.
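As an illustration of the MapReduce paradigm referred to in the click stream entry above, the following is a minimal, single-process Python sketch rather than a Hadoop job. The log format (a customer identifier followed by a page path), the sample lines and the function names are all assumptions made for the example; they do not come from the report or from any particular product.

```python
from collections import defaultdict

# Hypothetical web log: "<customer_id> <page_path>" per line (assumed format).
SAMPLE_LOG = [
    "c1 /home",
    "c2 /home",
    "c1 /products",
    "c3 /checkout",
    "c2 /products",
    "c1 /checkout",
]


def map_phase(lines):
    """Map step: emit a (page, 1) pair for every well-formed log line."""
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip malformed lines instead of failing the whole run
        _customer, page = parts
        yield page, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


page_views = reduce_phase(shuffle(map_phase(SAMPLE_LOG)))
print(page_views)
```

On a real cluster the map and reduce steps would run in parallel across many nodes and the framework would perform the shuffle; the sketch only shows how the work divides into those three stages.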

In many ways the problems a business is trying to solve will dictate the kind of architectures and business analytic technologies employed. As the table above demonstrates, it's possible to use a range of technologies and tools to satisfy your needs. Some needs can be met with traditional analytic tools, whereas others will require the introduction of new analytic practices and tools, especially where the scalability, performance and capabilities of existing analytic tools have run out of steam.

Tapping into the potential of Big Data business analytics

Although the breadth and variety of Big Data analytics options available to organisations is not in question, technology choices should only form part of the equation when it comes to assessing how you move forward with a Big Data project. To really get to grips with Big Data you first need to understand exactly how you can get value from the large volumes of data, very complicated data, or very fast-moving data (or a combination of any of these) prevalent across the organisation. It's an effort that requires organisations to improve their data literacy by finding ways of understanding how this new world of Big Data can potentially solve problems or create opportunities in their business.

What it boils down to is the need not only to make sense of data and derive meaningful insights from it, but also to apply those insights in a business context. As we will see in the next report, Turning Big Data into Big Insights, this is an evolving area and one in which we expect both enterprises and vendor support to develop over time.

Key considerations when planning your Big Data business analytics investment

Big Data encompasses a whole range of technologies and tools. Some, such as predictive analytics and visual analytics, are well established, whereas others, especially where the analysis of multi-structured data is required, shine a spotlight on a newer breed of emerging Big Data technologies such as Hadoop MapReduce, R or Mahout. Today no single technology platform can support the entire range of Big Data use cases, so expect to extend your existing BI and data warehousing environment to incorporate these newer analytic components, an effort that will increase demands on data and application integration capabilities across a more diverse analytic environment.

The options available for applying sophisticated advanced and specialised analytics to Big Data are growing as support for running predictive analytics and machine learning algorithms either in-database or in-Hadoop (for example by using Mahout, Knime or R) increases. Be aware, however, that this will require you to step up your analytical practices and the type of skills employed within your analytics team.

Processing and analysing text, such as conducting sentiment analysis on social media data, promises to open up new sources of intelligence for many organisations. It uses techniques such as natural language processing (NLP) to understand the opinions, attitudes and intent within text and is often used to understand the voice of the customer. However, no tool can fully automate this type of analysis; it still needs a human touch, one that blends the power of machines with human intelligence and looks to build, train and evolve the tools' language and linguistic capabilities over time.

The unconstrained nature and scalability of the Hadoop environment and its associated technologies provide an ideal platform for iterative and exploratory analysis. For example, it can be used to support analysts and data scientists in their quest to uncover non-obvious relationships in the data, detect hidden patterns and generate new theories, hypotheses and experiments based on a full set of data rather than just a selected sample.

Event stream processing software is a valuable technology for continuously analysing data as it is received and hence is often used for mission-critical and decision management applications such as real-time fraud detection, sentiment analysis and risk management. However, while this technology supports streaming and analysing data in motion, consideration also needs to be given to the speed of the feedback loop, that is, the ability of a user or organisation to act on the information within an appropriate timescale; otherwise its value could be lost.

Above all, before you embark on your Big Data analytic journey, consideration also needs to be given to the readiness of your organisation to deal with the data deluge. This, amongst other things, will involve developing the necessary skills or 'data literacy' across your organisation to be able to understand how to value data, its quality or validity, and how it can be utilised to make more effective, accurate and informed business decisions.
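To make the event stream processing consideration above more concrete, here is a minimal Python sketch of analysing data in motion with a sliding time window. It is an illustration only: the event shape (a timestamp in seconds plus a card identifier), the 60-second window and the threshold of three events are hypothetical values chosen for the example, and a production event stream processing engine would offer far richer pattern matching.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # hypothetical window length
MAX_EVENTS = 3       # hypothetical threshold before an alert is raised


def detect_bursts(events, window=WINDOW_SECONDS, max_events=MAX_EVENTS):
    """Yield (timestamp, card_id, count) whenever a card exceeds
    max_events within the trailing window.

    events must be an iterable of (timestamp_seconds, card_id) tuples
    arriving in time order; each event is processed once, as it arrives.
    """
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    for timestamp, card_id in events:
        timestamps = recent[card_id]
        timestamps.append(timestamp)
        # Evict timestamps that have fallen out of the trailing window.
        while timestamps and timestamp - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) > max_events:
            yield timestamp, card_id, len(timestamps)


stream = [(0, "A"), (10, "A"), (15, "B"), (20, "A"), (30, "A"), (200, "A")]
alerts = list(detect_bursts(stream))
print(alerts)
```

Because the generator emits an alert at the moment the threshold is crossed, the remaining delay sits in whatever consumes the alert, which is the feedback loop the text warns about.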