COULD VS. SHOULD: BALANCING BIG DATA AND ANALYTICS TECHNOLOGY
The business world is abuzz with the potential of data. In fact, most businesses have so much data that it is difficult for them to process and analyze it using traditional applications. There's a term often used in today's business lexicon to describe this issue: "big data." When it comes to extracting value from the vast array of available data, there is almost no limit to the technology and real-time analytics capabilities that you could develop. The key is finding the solution you should use: one that provides the right balance between sophisticated technology and cost-effective business results.
WEALTH OF DATA OR WEALTH OF INSIGHT?

As your business becomes more and more digital, everything you do leaves a trail of data: high-volume, highly variable data about customers, transactions, operations, finances, competitors, markets, and more. Even social media such as web blogs and Twitter generate valuable data. There's no question that data can hold the key to better performance and competitive advantage. The real question is how best to gain insight by examining every available data source. In today's environment of complex operations and abundant data, that type of insight is possible using advanced analytics.

Just about every company recognizes the value of analyzing its big data, but many still struggle to develop the right analytics capabilities to support their business. In fact, it is very easy to over-engineer or, conversely, under-develop an analytics solution.

JUST WHAT DO WE MEAN BY "BIG DATA"?

In its simplest form, the term "big data" generally describes both the data itself (its volume, velocity, and variety) and the process of data-driven decision making through analytic insights, predictive modeling, and optimization. Big data solutions require the ability to integrate and manage data, as well as to analyze it in a timely and effective way. Commonly, these solutions combine low-cost storage and open-source tools for refining and integrating the data with high-performance analytics, modeling, and visualization platforms.

BIG DATA TOOLS AND PLATFORMS

Big data solutions generally fall into two major categories: tools and platforms. The tools provide the method of analysis, while the platforms execute the analysis. Both are generally divided between open-source and proprietary implementations.

Open-Source Platforms: Open-source high-performance data management and parallel processing tools have shown the greatest growth recently. Apache Hadoop, a distributed computing platform, has popular releases made by Hortonworks and Cloudera. Other platforms and tools such as MongoDB, HBase, and R are rapidly gaining prominence. Hadoop and R in particular are experiencing dramatic uptake, with R finding its way into statistical and actuarial circles.

Proprietary Platforms: Proprietary platforms include Teradata and SAP HANA, as well as other players. Microsoft's HDInsight takes a slightly different route by being a 100-percent Apache Hadoop-compatible platform that runs on the Windows operating system. Microsoft's decision to embrace the Hadoop ecosystem and its support of open-source development is a continuation of its involvement and active development in Linux.
Tools: Some of the greatest advances have been in tools and methodologies including genetic algorithms, natural language processing, and predictive modeling. Applying these machine learning approaches to big data enhances their efficacy and opens new possibilities. Hive is an example of a tool that brings big data analysis within easy reach of most organizations.

START BY ASKING THE RIGHT QUESTIONS

In general, we have found that asking and answering a few questions in four key areas can help you design the best, most cost-effective big data analytics solution for your specific business and operations:

1. Context: what business problem(s) are we trying to solve?
2. Action: which specific actions or decisions do we need an analytics solution to support?
3. Use: how, when, and where will we use analytics?
4. Data: is our data accurate and sufficient to support decision making? Is it available continuously, or do we have to acquire it on a one-off basis (often, with great effort)?

By focusing on these four areas, you will be able to design a big data ecosystem that creates timely insights and drives better decision making.

CONTEXT

First, it is important to view a potential analytics solution in terms of the business value it produces, not the technology it utilizes. In other words, your focus should be on big answers rather than big data. According to a December 2012 survey of business executives from fifty Fortune 1000 firms by NewVantage Partners, nearly a quarter of respondents desired a big data solution to improve customer experience. By integrating and analyzing a wider variety of customer data, these organizations want to better understand customers' desires and intentions and be better equipped to serve or target them for additional products and services.

While better customer experience should drive return on investment (ROI) through improved customer retention and growth, leading companies are looking beyond customer and market-facing opportunities to also improve their operational and business processes. Manufacturers and retailers are using big data analytics to improve their supply chains and time to market, while insurance companies are pushing the limits of data to do everything from managing claims more efficiently to identifying fraud among their customers. Understanding the potential value of your solution will help guide how aggressively you pursue data that may be costly to gather and difficult to integrate with other sources, and yet provide only incremental insights.
ACTION

Analytics drive more effective decision making. If your goal is to improve retention of your most valuable customers, what decisions will you need to make and what actions will you need to take to do so? An analytics solution could help you score, rank, or segment customers more precisely; develop differentiated customer service and support strategies for specific customer segments; or design timely customer intervention and retention strategies. Analytics could also help you analyze and improve operational processes that impact customer experience, such as redesigning call or order routing, creating more efficient inventory and fulfillment processes, or making other process changes that create financial advantages.

Your desired action or decision, in turn, will affect the timeliness and accuracy required. For example, if your focus is on improving customer experience through better marketing and service, you will need to integrate more data on a timelier basis to ensure an accurate view of your customer relationship. This may involve integrating customer transactional history, likely found in current systems and databases, with other behavioral data such as web click-streams, customer service calls and outcomes, and social activity such as likes and tweets/retweets. You may also want to combine this with other, more static data such as demographic and financial data to gain an even clearer understanding of each customer.

You may be able to tolerate less accuracy in the resulting analytics and insights, however, when the incremental cost of being wrong is low. For example, the incremental cost or risk of an incorrect product recommendation prior to online retail checkout may be relatively low. Conversely, if you're a bank and your big data solution is focused on making better, more timely risk decisions, or a life sciences company focused on improving patient outcomes, there are likely compliance and regulatory requirements that demand greater accuracy. Determining what's good enough to make a decision or take action will save your big data team from an endless quest for the perfect answer.
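One way to make "good enough" concrete is to tie the accuracy target to the cost of a wrong decision. The sketch below is illustrative only: the function name, the simple expected-loss model, and all dollar figures are assumptions introduced here, not figures from the survey or case studies. It assumes each wrong decision costs a fixed amount and asks what accuracy keeps the average loss per decision within a tolerance you choose.

```python
def min_required_accuracy(cost_of_error: float, tolerable_loss: float) -> float:
    """Return the lowest accuracy that keeps expected loss within tolerance.

    Assumed model (hypothetical): expected loss per decision equals
    (1 - accuracy) * cost_of_error, so the requirement
    (1 - accuracy) * cost_of_error <= tolerable_loss
    rearranges to accuracy >= 1 - tolerable_loss / cost_of_error.
    """
    if cost_of_error <= 0:
        # A wrong decision costs nothing, so any accuracy is acceptable.
        return 0.0
    # Clamp at zero: if tolerance exceeds the error cost, no accuracy floor applies.
    return max(0.0, 1.0 - tolerable_loss / cost_of_error)


# Hypothetical figures: a poor product recommendation costs about $0.50,
# while a wrong credit-risk decision costs about $5,000.
recommendation = min_required_accuracy(cost_of_error=0.50, tolerable_loss=0.10)
risk_decision = min_required_accuracy(cost_of_error=5000.0, tolerable_loss=50.0)

print(f"Recommendation accuracy floor: {recommendation:.2f}")
print(f"Risk decision accuracy floor:  {risk_decision:.2f}")
```

Under these assumed numbers the low-stakes recommendation tolerates a much lower accuracy floor than the risk decision, which is the asymmetry described above. A real target would also need to reflect regulatory requirements, which this simple model ignores.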
Case study: A retail giant commits to big data

With multiple brands and more than 32 different business units, Sears Holdings, parent company of Sears, Kmart, Lands' End, Craftsman Tools, and others, had a big data problem. Over the years, the company had built up a patchwork of databases, hardware, software, and analytic and reporting solutions that served thousands of different users across the enterprise. This created a number of challenges in running a modern, multi-channel retail business:

- No single version of the truth: the same question produced different answers for different users
- Dozens of proprietary databases and appliances across business units
- Inflated IT capital expenses and turn-around times
- Inflexible data models
- Most importantly, lack of timely customer and business insights for decision making

Under the leadership of new Chief Technology Officer Dr. Philip Shelley, the company embarked on a three-year journey to improve its data and analytics infrastructure to overcome the challenges above. By embracing big data technology, the firm was able to replace multiple databases and appliances with a large Hadoop cluster at a fraction of the cost it had been spending on IT infrastructure and support. As Dr. Shelley stated in a recent article for Retail Information Systems News, "It's involved a huge mindset change relevant to keeping data. We used to only keep aggregates of data and throw away the detail, because it had been too big. Now we don't throw anything away, theoretically, ever."

While there was a learning curve initially, this new mindset has enabled Sears to shave weeks off of customer campaigns, which leads to more timely and relevant offers. Likewise, the company now has new, data-driven insights for planning and investment decisions. For example, because it now has access to granular-level data across its customers, stores, and supply chain, Sears can analyze the seasonality of individual items or SKUs using eight years of detailed historical data, an analysis that was impossible in the past because of limited data storage and availability. Similarly, the company's new big data installation has reduced typical IT timelines to gather and integrate new data sources by up to 70 percent and has nearly eliminated the need for capital investments as a result of replacing proprietary storage with commodity servers.
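To illustrate the kind of item-level seasonality analysis that becomes possible once detailed history is retained, here is a minimal sketch of a monthly seasonal index per SKU. It is not Sears' method, which the case study does not describe; the function name, the record layout, and the sample figures are all hypothetical. The index is each month's average sales divided by the average across all observed months, so values above 1.0 mark stronger-than-typical months.

```python
from collections import defaultdict


def monthly_seasonal_index(sales):
    """Compute a simple seasonal index per SKU and calendar month.

    `sales` is an iterable of (sku, year, month, units) records; this
    layout is a hypothetical one chosen for the example.
    Returns {sku: {month: index}}.
    """
    totals = defaultdict(lambda: defaultdict(float))  # sku -> month -> summed units
    years_seen = defaultdict(lambda: defaultdict(set))  # sku -> month -> years observed

    for sku, year, month, units in sales:
        totals[sku][month] += units
        years_seen[sku][month].add(year)

    result = {}
    for sku, by_month in totals.items():
        # Average units for each calendar month across the years observed.
        month_avg = {m: by_month[m] / len(years_seen[sku][m]) for m in by_month}
        # Average across months is the baseline a "typical" month would hit.
        baseline = sum(month_avg.values()) / len(month_avg)
        result[sku] = {
            m: (avg / baseline if baseline else 0.0) for m, avg in month_avg.items()
        }
    return result


# Hypothetical two-year history for one item that sells best in December.
history = [
    ("SKU-1", 2011, 1, 100), ("SKU-1", 2012, 1, 100),
    ("SKU-1", 2011, 12, 300), ("SKU-1", 2012, 12, 300),
]
index = monthly_seasonal_index(history)
print(index["SKU-1"])
```

A calculation like this depends on keeping the detailed records rather than pre-computed aggregates, which is the mindset change the case study describes.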
USE

Another key to realizing value from analytics tools and processes is carefully defining their use. Consider where you will use analytics to support decision making: in your call center, for online transactions and interactions, or at the point of sale. Frequency is also important in this equation. For example, if you are using analytics to improve forecasting, batched results may suffice. But if you are using analytics to intervene in situations where a customer relationship is at risk, the sooner you can see it, the sooner you can act on it.

Although most executives generally think of big data as being real time, the action required should dictate whether you need real-time, near-real-time, or less-frequent batch updates and integration to drive decision making. Timeliness is critical, as the cost of instrumenting big data increases greatly as you move from batch to real-time insights. While the goal should always be to shorten the time from insight to action, make sure your intended use of the insights, not the timeliness or velocity of the underlying data, drives your big data solution.

DATA

Finally, understanding the nature of your data will help define the most appropriate data management and analytics techniques for your needs. One useful way to look at your data is in terms of the three Vs: volume, variety, and velocity. Popular business media tend to focus on the volume of data being collected. However, the velocity and variety of the data being collected are much more important. Simply put, the cost to store larger and larger amounts of data is incremental compared to the cost of cleansing, integrating, and analyzing that data. In fact, in the same NewVantage Partners survey referenced above, only 10 percent of respondents cited data volume in their definition of big data, while 40 percent said data variety is the critical factor that defines big data.

Likewise, big data solutions often focus on unstructured data, such as social data, video, and data stored in electronic documents and messages. While this type of unstructured data can be helpful, the promise of big data lies in the ability to mash up data from many different domains and perspectives (temporal, transactional, operational, spatial) to create unique and action-ready business insights. These types of data are often already structured or semi-structured, so the real challenge lies in the ability to cleanse and integrate these sources with each other.

Another key consideration is viewing raw data as temporary data that can be discarded once the key insights are derived. Analytics software giant SAS calls this a "stream it, score it, store it" approach to big data management. In other words, if you are applying an algorithm to a customer's online shopping activity to make product recommendations, once you derive the best product to recommend, the underlying browsing activity can be discarded.

Finally, keep in mind that while big data solutions utilize new tools and techniques for collecting, integrating, and analyzing data, a successful solution requires an understanding of, and commitment to, many traditional data management disciplines:

- Data quality and metadata management
- Data governance
- Data visualization and business intelligence

Case study: Big data isn't all open source, but it's not proprietary either

Klout is a service that measures influence across social networks. It is a dynamic and rapidly expanding organization that continues to improve its product offerings and grow its user base. Klout processes feeds from social networks to measure users' influence levels across social media. This scoring involves not only complex calculations but also very large amounts of raw data. Like most startups, Klout went to market with an open-source software stack that is the core of its operating environment. Apache Hadoop is the center of this open-source ecosystem.

As Klout grew, it needed to be able to provide ad-hoc query capabilities and advanced analytics for business-level reporting. To bridge the gap between its big data platform and the business-level visibility desired by its end users, Klout turned to SQL Server Analysis Services to provide familiar tools and experiences such as Excel. This allowed Klout to preserve investments and experience in both existing IT technologies and business skill sets. Analysis Services acts as a conduit from user-facing tools and queries into Hadoop using ODBC, Hive, and linked servers. This provides an integrated analysis capability reaching from the user to the big data platform. In addition to reporting and analytics, Analysis Services also provides alerts and QoS capabilities.

WHAT COULD YOU DO, OR WHAT SHOULD YOU DO?
The possibilities that come from using big data are virtually endless. So is the possibility that you can invest too much time and expense in creating analytics capabilities and analyses that are nice but not essential to running your business. The key is finding the right balance between the technology-driven possibilities and practical solutions that can help your business meet its goals.

Answering a few key questions up front can help you sort through the myriad possibilities and refocus your efforts from what you could do to what you should do and, in the process, increase the potential for return on your analytics investment. Addressing these questions will also ensure that all of your stakeholders support the solution and its potential benefits. Big data forces us to think differently about data-driven decision making, and alignment is crucial to the success of your project. If IT, marketing, finance, operations, and others are all on the same page with respect to the context, action, use, and data requirements of your big data project, you'll be well on your way to a successful outcome.