BIM | the way we see it

Mastering Big Data
Why taking control of the little things matters when looking at the big picture


Big Data represents a big opportunity and a big reality

Many industry analysts and advisors see Big Data as the next frontier for competition and innovation. The first question, of course, is: what is Big Data? The definition varies hugely, but it tends to refer to the massive explosion of information now available to organizations: the shift, for instance, of a retailer from tracking transactions to tracking the journey a customer takes around a store, and when they choose to buy. The breadth and depth of this data means that it can't simply be stored and queried within a traditional database solution.

The amount of data being created by organizations is increasing exponentially every year, and it is not something that companies can opt out of. The challenge is therefore to quickly identify what should be retained, avoid duplication where possible, and make use of the information that is being generated. Big Data is not simply about acquiring data from outside the organization; it is about combining the Big Data being created internally with external Big Data sets. Big Data is, therefore, about being able to access and leverage much larger information sets than ever before in order to gain greater insight into markets and opportunities. McKinsey talks about opportunities worth trillions for the overall exploitation of Big Data.¹ At the heart of Big Data is the ability to ask questions about what will happen and to receive more accurate answers than has been possible until now.

¹ McKinsey Global Institute, "Big Data: The next frontier for innovation, competition and productivity", May 2011

The Bigger the Data, the harder they fall

Let's imagine the information requirements of the future. We want to find clear trends in how people buy products, and we also want to add new information every day to keep our trends up to date. We will have a few large data sets that gather information from the US over the last ten years, including:

1. Every retail transaction, obtained from Point of Sale (PoS) information
2. Weather history, by hour and city
3. Local sports team results, by city
4. TV and other media advertising, by region
5. Every online advert view, by individual and location

Clearly these data sets are bigger than those being considered today, but it's this scale of challenge that companies will be looking to address in the coming years. Before we leap into the world of distributed storage and cloud computing, there are some basics that need to be realized.

Garbage In, Garbage Out rules

The first fact to realize is that any analytical model is only as good as the quality of the information you put into it. This means that you need sources of information that are trusted and known to be reliable. If you simply want a big data set and aren't worried about accuracy, then a random number generator will give you more than enough information.

Build the Islands of Certainty

The second key fact is that these differing data sets need common information to link them together so that, for example, a retail receipt for Lincoln, Nebraska can be linked to the weather report for that town. Within the data sets we also need accuracy. When someone is buying beer in New Jersey, are they buying the same brand of beer as someone in Connecticut or Maryland? Which products can be defined as "beer", and how well are the terms defined to ensure consistent results across locations?

These are the Islands of Certainty, a concept that Capgemini introduced in the "Mastering the Information Ocean" paper.² Islands of Certainty are provided by understanding the points of reference before you create a large data set.

² Available from http://www.capgemini.com/insights-and-resources/by-publication/
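To make the linking idea concrete, here is a minimal, hypothetical Python sketch of joining two of the data sets above through a shared location reference. The field names (city, state, sold_at, temp_f) and the normalization rule are our own illustration, not something defined in this paper; the point is simply that both sets must agree on the same reference key before they can be combined.

```python
# Minimal sketch: linking two data sets through a shared location reference.
# Field names (city, state, sold_at, temp_f) are illustrative, not from the paper.

def location_key(city: str, state: str) -> tuple:
    """Normalize the reference point both data sets must agree on."""
    return (city.strip().lower(), state.strip().upper())

transactions = [
    {"sku": "beer-6pk", "city": "Lincoln", "state": "ne", "sold_at": "2011-05-14T18:00"},
]
weather = [
    {"city": "lincoln", "state": "NE", "hour": "2011-05-14T18:00", "temp_f": 82},
]

# Index the weather set by its reference points...
weather_by_key = {
    (location_key(w["city"], w["state"]), w["hour"]): w for w in weather
}

# ...then enrich each transaction via the same key. Without an agreed
# definition of "location", this join silently drops or mismatches rows.
for t in transactions:
    key = (location_key(t["city"], t["state"]), t["sold_at"])
    t["weather"] = weather_by_key.get(key)

print(transactions[0]["weather"]["temp_f"])  # 82
```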

Hanging Big Data from the POLE

The key to mastering Big Data is understanding where these data reference points are located. In our previous example, we have several reference points:

1. Customers: the individuals in the various interactions
2. Locations: where the customers are based and where they are buying
3. Products: what is being advertised and what is being sold
4. Teams: the sports teams
5. Channels: the advertising media
6. Adverts: a linking of a product to a channel
7. Times: the calendar items when events happened

In addition to these key elements, there are the specific events that create the mass of data. We can structure this information as, firstly, a core of Parties, Objects and Locations and, secondly, a volume of transactions or Events. We call this information model the POLE; it is shown in figure 1. It's by focusing on the P, O and L of the POLE that we can ensure Big Data actually delivers value. The POLE therefore acts as the structure from which Big Data is hung, in the manner of a skyscraper hanging from its internal infrastructure. Capgemini's blog post "MDM: Making Information Dance around the POLE" outlines this approach.

This means that there is a clear structure, based on the POLE, for our core entities, and from here it becomes possible to add the event information. Before we consider how to handle event information, we will take a look at the question of governance.

Figure 1: The POLE. A core of Parties, Objects and Locations, linked by Relationships and surrounded by Events.
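As a rough illustration of the POLE as a data structure, the sketch below models Parties, Objects, Locations and Events as simple Python records. The class and field names are our own shorthand for the purposes of this example, not a defined Capgemini schema.

```python
# Illustrative sketch of the POLE model as simple record types.
# Class and field names are our own shorthand, not a Capgemini schema.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Party:          # customers, sports teams, ...
    party_id: str
    segment: str = ""

@dataclass
class Object:         # products, adverts, channels, ...
    object_id: str
    category: str = ""

@dataclass
class Location:       # stores, cities, IP/online locations, ...
    location_id: str
    region: str = ""

@dataclass
class Event:
    """A transaction or observation, hung from mastered P/O/L references."""
    event_type: str                    # e.g. "purchase", "advert_view"
    occurred_at: str                   # the time dimension
    party_id: Optional[str] = None
    object_id: Optional[str] = None
    location_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)

# An event carries only keys into the mastered core, never free-text copies
# of names or addresses, so every channel resolves to the same entities.
sale = Event("purchase", "2011-05-14T18:00",
             party_id="CUST-123", object_id="SKU-BEER-6PK",
             location_id="US-NE-LINCOLN", attributes={"qty": 2})
print(sale)
```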

Governing the POLE

When looking at the task of mastering the POL for Big Data (figure 2), the challenges are exactly the same as those for mastering data generally within the business and between business partners. The number of entities is similar to those that an organization should already be mastering, and the practices are exactly the same. However, the impact of not mastering information becomes exponentially greater in the case of Big Data.

In our example, if we are unaware that cornflakes are sold in different size packets, for instance, then it is impossible to compare volumes sold between regions, or to compare prices accurately. If we don't even know that two different products are both cornflakes, then there is no ability to understand substitutions; and if we don't know that cornflakes are breakfast cereals, then we can't understand the broader category buying trends.

Figure 2: The POL of the POLE

With customer information it is important to be able to marry customers' online presence with their physical location and experiences, so as to be aware of the adverts they have seen, both traditional and digital. This means understanding the individuals across all channels, the products being advertised or placed on shelves, and the locations where all of these interactions happen. To achieve this you need organizational governance, which comes in two distinct types:

1. Standards, which define the structural format and definitions of information:
   a. When are two objects, locations or parties considered equivalent?
   b. What core information is required to identify these elements?
   c. What rules govern the relationships between them?
2. Policies, which define how the standards will be enforced:
   a. Is there a central cleansing team?
   b. At which point is a relationship considered valid?
   c. What will be done with possible matches?
   d. Who gets to decide when a standard needs to change?
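To show what such a standard can look like once written down, here is a hypothetical matching rule for deciding when two customer records represent the same party. The fields, the normalization and the rule itself are illustrative assumptions rather than rules taken from this paper; the accompanying policy (who reviews possible matches, who may change the rule) remains with the governance group.

```python
# Hypothetical example of a "standard" expressed as a matching rule:
# two customer records are treated as the same party when their
# normalized e-mail addresses match, or when name and postcode match.
# The fields and the rule are illustrative only.

def normalize(value: str) -> str:
    return " ".join(value.lower().split())

def same_party(a: dict, b: dict) -> bool:
    if a.get("email") and normalize(a["email"]) == normalize(b.get("email", "")):
        return True
    return (normalize(a.get("name", "")) == normalize(b.get("name", ""))
            and normalize(a.get("postcode", "")) == normalize(b.get("postcode", "")))

# The policy side (who reviews "possible" matches, who may change the rule)
# lives with the governance group, not in the code.
print(same_party({"name": "Jane  Smith", "postcode": "68501"},
                 {"name": "jane smith", "postcode": " 68501 "}))  # True
```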

Governance is about taking control of the POLE in order to drive consistency across Big Data so it can deliver real insights (figure 3). Setting up this governance not only drives value within the analytical world of Big Data, but also enables the insights of Big Data to be tied directly to the operational aspects of the business. Governance is not simply something for Big Data: it is something for the whole of the business that enables operations and analytics to work in sync. Governance applies not to a single information area but to the consistent elements that occur across the whole enterprise, including its external interactions.

Figure 3: Governance is at the center of the MDM landscape. MDM governance links master data (organizations, customers, individuals, partners, suppliers, research organizations, contract manufacturers, industry & regulation) with the enterprise functions (finance, sales & marketing, manufacturing, supply chain, logistics, social) and with Big Data and analytics; transactional information and events flow between them, translated via master data.

Setting up governance for the POLE is standard practice in any sophisticated MDM organization. In our example there would probably be two groups of standards and policies:

1. Customer-centric
   a. Customer: handling the definition and standards around customers across all channels
   b. Product: handling the definition and standards around products across all channels
2. Enterprise-centric
   a. Locations: handling the definition and standards for locations, including how sports teams are assigned to regions and how advertising channels are defined within or across regions
   b. Weather: how weather information will be managed and tied to regions

Broad engagement across the business is needed in order to create standards that will work effectively and policies that can be implemented operationally on a broad basis.

Using the structure to load Big Data

Once we have defined the P, O and L of the POLE, and set up a clear governance structure that enables the business to match across the different information channels, we can load our Big Data set-up based around this framework. Events, the individual transactional elements, are added to the model at this stage, and their links to the core POL are used to associate information in a consistent way across channels (figure 4). This is a reasonably standard approach within well-managed data warehouses, but with Big Data this level of rigor is essential.

The challenge with Big Data is that many sources might be external to the organization, so during the transformation the matching processes might require significant processing power owing to the size of the data. Big Data scaling is often overlooked: it is necessary to scale the actual load as well as the analytics. Too often the focus is simply on the data-shifting challenge, rather than on ensuring that what is contained within the Big Data environment is of sufficient quality to enable any questions to be effectively answered.

The POL structure has now been augmented with events, giving us the full map of what we wish to query within Big Data, all hung from a single consistent structure. This framework-driven approach to Big Data management ensures that new dimensions and information sources can be rapidly adopted without requiring massive re-engineering of the solution.

Figure 4: Mastering Big Data. Each information source is extracted and then transformed and standardized; parties, objects and locations are matched, cleansed and merged against the mastered core, while events find and use the mastered reference; a Big Data load record is then created and loaded into the Big Data environment.
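The sketch below walks through the load flow of figure 4 in simplified Python, assuming a small in-memory master-data index. The function names, the structure of mdm_index and the handling of rejected records are illustrative assumptions rather than a prescribed implementation.

```python
# Sketch of the load flow in Figure 4, assuming a simple in-memory MDM lookup.
# Function names and the mdm_index structure are illustrative assumptions.
from typing import Optional

def standardize(record: dict) -> dict:
    """Transform & standardize: trim and normalize text values."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}

def find_mastered_reference(record: dict, mdm_index: dict) -> Optional[str]:
    """Match the incoming record to an existing mastered P/O/L entity."""
    return mdm_index.get((record.get("product_name"), record.get("city")))

def to_load_record(record: dict, master_key: str) -> dict:
    """Create the Big Data load record: event data keyed by master reference."""
    return {"master_key": master_key,
            "event_type": "sale",
            "occurred_at": record["sold_at"],
            "quantity": record["qty"]}

def load(source: list, mdm_index: dict) -> list:
    loaded, rejected = [], []
    for raw in source:
        rec = standardize(raw)
        key = find_mastered_reference(rec, mdm_index)
        if key is None:
            rejected.append(rec)   # route to data stewardship; don't load garbage
            continue
        loaded.append(to_load_record(rec, key))
    return loaded

mdm_index = {("cornflakes 500g", "lincoln"): "PROD-0017|LOC-NE-LINCOLN"}
print(load([{"product_name": " Cornflakes 500g ", "city": "Lincoln",
             "sold_at": "2011-05-14", "qty": 2}], mdm_index))
```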

Figure 5: Structure for Big Data. Parties (customers, teams), Objects (products, adverts, channels) and Locations (physical, IP, electronic and social locations), their relationships (party/party, party/object, party/location, object/object, object/location, location/location), and the events that link them (purchase, advert view, match), together with time and weather.

Securing Big Data: anonymizing information

One of the risks with Big Data, particularly when looking at customer information, is that any information leak or breach would be massive in scale. Using master data during loading gives a simple way of anonymizing the information. Rather than loading the full customer profile with information such as names and ages, the information can be anonymized to include the demographic segmentation of the individual but omit anything that could be deemed sensitive. In order to map this information back to the individual later, the Master Data Management (MDM) key could also be supplied for that record.

By employing these strategies it becomes possible for Big Data to answer challenging queries without loading sensitive information into the data set. Using MDM as a filter helps to deliver the value, control and security required to leverage Big Data effectively.
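A minimal sketch of this anonymization step follows, assuming a flat customer record that carries an MDM key and a demographic segment. The field names and the list of sensitive attributes are illustrative only.

```python
# Minimal sketch of anonymizing a customer record at load time:
# sensitive attributes are dropped; only the demographic segment and the
# MDM key (for later re-identification by authorized systems) are kept.
# Field names and the SENSITIVE list are illustrative assumptions.

SENSITIVE = {"name", "email", "date_of_birth", "address"}

def anonymize(customer: dict) -> dict:
    safe = {k: v for k, v in customer.items() if k not in SENSITIVE}
    safe["mdm_key"] = customer["mdm_key"]        # stable master-data key
    safe["segment"] = customer.get("segment")    # demographic segmentation
    return safe

record = {"mdm_key": "CUST-123", "name": "Jane Smith",
          "email": "jane@example.com", "segment": "urban-25-34",
          "city": "Lincoln"}
print(anonymize(record))   # no name or e-mail reaches the Big Data store
```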

A simple plan to turn Big Data into Big Information

To get value out of Big Data, you need to turn the mass of disparate data into easily accessible, measurable information. This transformation can be achieved through a simple three-step approach, shown in figure 6. The first two of these steps are critical: the first ensures that the correct entities have the authority to define the mappings, and the second defines those mappings. The third step realizes them within both Big Data and operations; here technology is at the heart of automating the solution, ensuring a consistent level of quality by providing a hub that automates the distribution of information between the sources and Big Data.

Figure 6: Three steps to turn Big Data into Big Information. 1. Who can decide? (What groups are authorized to make decisions? How will groups be co-ordinated?) 2. Understand your POLE (What is your POL? What is the business information model?) 3. Automation (ETL for Big Data, push back into operations, data stewardship).

By starting with governance and business information, it becomes possible to construct a Big Data environment that is able to answer specific questions on historic data, as well as questions based on projected data by analyzing future trends. In practical terms, this means that we would first set up a governance team that could authorize any data standards and policies. This group works across the multiple organizations that are providing information, to ensure there is consistency of definition and agreement about where any decisions about new entities, attributes or policies may be taken. Having an active group that is constantly reviewing and updating policies and standards ensures that, as Big Data sets expand and change, the organization is ready to adapt.

Plan for Big Data by taking control

A simple philosophy lies at the heart of this paper: the complexities of Big Data can be easily managed and exploited by dividing the project into a series of disciplines and recompiling it around a clearly defined structure. Information becomes easier to access and manage, and far more secure. By setting up the governance of master data, you provide Big Data with the framework required to drive its long-term success, and do so in a way that ensures both the accuracy of results and their direct correlation to operations. There is a clear case for effective MDM as the first step towards leveraging Big Data, especially given its added benefits for information security.

Central to an organization's approach to mastering Big Data is deciding which parts of the POLE model it is most beneficial to master at a given point in time. For most organizations, mastering the POLE is an iterative process, often driven by a business function that understands the benefits that mastering the information will deliver. As a basic rule, it is only worth mastering information at the point at which you wish to have quality information for analytical purposes. If it is sufficient for information simply to provide trends rather than an accurate holistic view, then mastering should be delayed until it has a clear ROI. By taking an iterative approach to mastering the POLE, it becomes possible for organizations to continually realize more and more benefits from Big Data as their ability to leverage and understand the complex analytical models improves.

Big Data is a powerful tool, but power is nothing without control. MDM provides the business with the tools and the ability to realize the power of Big Data.

www.capgemini.com/bim

About Capgemini

With more than 115,000 people in 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2010 global revenues of EUR 8.7 billion. Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience, and draws on Rightshore, its worldwide delivery model.

More information is available at www.capgemini.com

Rightshore is a trademark belonging to Capgemini.

To find out more about Master Data Management and Big Data, visit www.capgemini.com/bim or email bim@capgemini.com

© 2011 Capgemini. All Rights Reserved. No part of this document may be modified, deleted or expanded by any process or means without prior written permission from Capgemini.