Can You Win the Big Data Arms Race?
About the Author

Following a career in academic research, Nick has worked in the commercial IT & Technology sector since His work has been wide ranging, including developing Tessella's award-winning Asset Management and Performance Optimisation software. Moving through technical roles in analysis and project management into a senior leadership position, Nick is now Head of Analytics. The years spent crunching numbers to model and visualise the subtleties of chemical bond formation complement those spent more recently solving practical problems in industry, to give Nick a great feel for what you can do with data. In this role, Nick combines his knowledge with the weight of experience of his Tessella colleagues to craft a comprehensive analytics resource, focussed on enabling radically better decision making under challenging conditions. Keep up to speed with the latest thinking on analytics; follow Nick on his blog and on

Nick Clarke, Tessella Head of Analytics
We are in the middle of an arms race [1], the big data arms race, and it's gathering pace. In some respects, it's starting to feel like a runaway train. The aim is deceptively simple: collect loads of data, analyse it, and generate insight that gains you a competitive edge. If future success is truly in the hands of the data haves, then no-one can afford to be out-competed by better placed rivals, and risk becoming a data have-not. This is the classic scenario for an escalating arms race, with participants locked in as much by fear as by opportunity.

With technology increasingly providing cost-effective data collection and storage, the explosion in the use of intelligent, connected devices means that data volumes are growing exponentially [2]. This trend can be seen in all industry sectors, with the escalation driven by the common promise of being able to understand key behaviours better than your competition. The data may relate to the shopping habits of retail customers, the propensity of a reservoir to relinquish its oil, or of a fleet of trains to stay healthy. The aim is the same: use big data to better understand your asset and the market.

However, although the theoretical potential is well documented [3,4], clear examples of concrete benefit remain thin on the ground. Worse, the cost of managing the data keeps rising unchecked. If this data is not consistently turned into actual business benefit, the arms race will produce casualties. Benefit only comes from better decisions and improved operations, driven by greater insight unlocked from the data. All too often, the mechanics of managing the data volumes overwhelm any efforts to extract insight and align it to a business need. This situation is not sustainable, and we can only move forward by becoming smarter, rather than just bigger.
The root cause of the failure is well understood: an inherent lack of analytics expertise, a shortage which is projected to last for many years to come. Looking through to 2018, the MGI report Big Data: The next frontier for innovation, competition, and productivity predicts a shortfall of up to 190,000 deep analytical and 1.5 million analytics delivery experts in the United States alone, with other economies facing a similar crisis [5].
Smarter Analytics, not Bigger Data

At Tessella, our experience has taught us that to get ahead in the analytics arms race, it's not enough to get bigger: you have to get smarter too. Yes, the growing choice of off-the-peg integration products makes data collection ever more accessible, but it's not much help if the data sits largely unused. In recent years Oracle, IBM, Microsoft and SAP between them have spent more than $15 billion on buying software firms specialising in data management [6]. To be a world-class intelligence-led organization, the data has to get from the database into your decision-making process [7].

Lack of access to guidance from analytics experts has meant that the majority of such projects tend to be run bottom-up, dominated by the concerns of data collection. They fall into the trap of storing the lot, and only address decision making when it's far too late. Analytics becomes reduced to a game of trawling the data for intriguing patterns or correlations. As a result, these projects fail to deliver sustainable value.

The arms race solution of seeking to buy competitive advantage through the best kit and collecting every scrap of data just makes this situation worse. Not only is it a classic zero-sum tactic, it doesn't address the core issue of finding the analytics talent to identify and release the insight locked inside the data. We believe the key to success is to focus resolutely on providing clearer insight to support the key decisions, achieving this through making the analytics smarter, rather than the data bigger. We summarise this as:

Decisions First, Data Last

Decisions, informed by Insight, delivered through Analytics, supported by Infrastructure, streamlined via Automation, to manage and process Data.

Successful analytics projects show that the only real differentiator comes through having both the right data and the right people to turn it into actionable insight.
With a combination of industry domain knowledge, analytics capability and IT skills, Tessella will ensure the data remains a useful servant to your business needs, and never becomes an overwhelming burden, the master dictating your efforts.

It's easy to say "make your analytics smarter", but what does that really mean? The way we look at it is to take a scientific approach to the data, and use it to build an analytics laboratory. This is an environment where analysts can actively run experiments with the data to test and improve their knowledge [8].
The Analytics Laboratory - The Scientific Approach to Commercial Data

The top experimental laboratories are clean, well-organised, well-equipped environments where experts can work flexibly and effectively to uncover new truths. Adopting a scientific approach ensures this new knowledge is backed by real-world evidence. Why not apply the same conditions to your business data? Without the kind of smart planning advocated by Tessella, a data warehouse becomes a cluttered, unmanageable storage cupboard or, worse still, a museum with data preserved as a curiosity.

To gain new business insights you need to use the right tools, at the right time, to perform tests on structured, relevant data. A data analytics laboratory lets you:

- explore and challenge existing prejudices about your business, your assets, your customers and their clients
- validate new ideas against real historical data
- monitor proactively the response to your current change programmes

This might all sound very science-based and clinical. But is it relevant to the highly pragmatic commercial sector? Well, few industries keep a closer eye on the bottom line than the retail and consumer goods sector, and they are fast learning the analytics lesson. Just take a look at Tesco and their game-changing Clubcard loyalty scheme [9]. With their specialist marketing analytics partners dunnhumby, they have shown what can be achieved by taking a rigorously scientific approach to consumer data analytics, using the insight gained into shopper behaviour to drive significant intelligence-led business improvements.

Rather than exposing huge volumes of detailed transactional records, the scope of the data made available to the Clubcard business analysts was restricted to a higher-level, derived and aggregated behavioural picture. The level of detail, at any point in time, was kept consistent with Tesco's ability to put into practice any smarter decisions resulting from the new insight, an essential cost control discipline. The brief for the analysts was to test for, measure and interpret changes in consumer behaviour in order to:

- quantify the impact of trials in specific test locations
- differentiate the relative response within specific shopper demographics
- understand price and product sensitivity
- collect solid evidence that business benefit would be realised nationally
- improve planning for new and existing stores

Only as the shopper behavioural models were validated and became more sophisticated was the scope of the data provided to the analysts allowed to grow. All the time, the discipline of restricting the database to the minimum data needed to address current business need remained. This scientific approach, focussed on collecting real-world evidence and insight to support key business decisions, is an excellent example of what we mean by an analytics laboratory.

This level of focus, and the integration of the analytics into the wider decision-making process, is a big step forward from the more typical situation that unfortunately still holds back many analytics programmes. It is common to find that data collection and storage is the responsibility of the corporate IT unit, whilst the analytics is siloed within specialist, highly mathematically trained, statistical analysis teams. Traditionally, it has often been difficult to align delivery from these functions with the operational needs of the business. This organisational flaw is a crucial factor behind the current lack of sustainability within the world of big data, and has to change. As a recent Winterberry report puts it, "it's a rare breed of person who can understand what's going on technology-wise and tie it to the marketing world." [10] Thankfully, the way forward is neither radical nor so very difficult to achieve.
Creating an Intelligence-led Organisation

Forget about investing in technology for a minute. The simple truth is that in order to be effective analytically, you have to be able to make the best use of your best people. All too often specialists are locked inside isolated teams, and their game-changing knowledge is hidden with them. To deliver the increasingly significant benefits from big data that are required to justify its price tag, it will be necessary to create multidisciplinary teams that span a complex organisation.

The key to creating an intelligence-led organization is finding the right way to put together all of the individual pieces that complete the analytics puzzle. Just getting more data won't help. Adding more people who only understand their little piece won't help either. You need to bring together people who understand:

- the changes needed by the business
- the key decisions which have to be improved
- where information resides currently in the business and how it needs to be cleaned
- what information is missing and what the associated uncertainty is costing you
- how the existing fragments of knowledge from different silos have to be blended together to generate far-reaching insight that can be put to common use across the business
- which analytics tools and data structures can deliver the insight at a sustainable cost
- how to get all of the disparate business units to work together to a common goal

Although organisations will often already have within them the talent and knowledge needed to generate the coherent vision needed by the high-level decision makers, most have no way of joining them all up to make it a reality. At the heart of the problem is a misalignment of communication and outlook. Until that alignment is achieved, it is impossible to build your data analytics laboratory.
[Figure: The analytics laboratory at the heart of the intelligence-led organization, connecting the Strategy Board, Business Area Owners, Business Area Operations, Corporate IT, Specialist Analysts and Planners.]
At the heart of the intelligence-led organisation is a flexible team that can talk the language of each specialist unit, ask the right questions, collect the right data, complete the joins, and pass the insight on. These are the rare people with the breadth of business, domain, analytics and people skills to pull all the multi-disciplinary pieces together [11]. This facilitating analytics layer will:

- translate the different technical jargons into a common language
- contribute proactively their own ideas, experience and analytics solutions
- not be distracted by the day-to-day crises associated with running each business unit
- remain free of the historical personal baggage that can hinder intergroup communication
- minimize the drain on each expert's busy schedule, leaving them free to focus on their day job

The effect is to forge a common understanding and a coherent, sustainable communication route that gives you sharper, clearer analysis on which to base your most important business decisions [12].

The Intelligence-led Organization in Action

Tessella has worked with a number of major clients, adopting the data analytics laboratory principle to build an intelligence-led structure:

The global pharmaceutical company that needs to reduce the manual effort required to find new drugs. Working on their global predictive science programme, Tessella is helping them establish a central intelligence resource through which new experimental data, and the expert knowledge derived from it, is channelled more effectively between specialist units. The extra insight is harnessed to enable more accurate predictions of the key properties of candidate compounds, such as efficacy and toxicity, to be made at an earlier stage. This reduces the time spent working manually on low-probability options, delivering significant cost savings and improving the quality of the candidate drug pipeline.
The oil & gas supermajor that needs to boost global access to expert technical knowledge, which resides in a limited pool of specialists that is being depleted further by the rapid retirement of an ageing workforce. The intelligence hub, in this case, supports the creation of an analytics laboratory built to house an ever-expanding expert software toolkit, which supports the next generation by preventing the loss of vital knowledge. This insulating layer within the client allows expert knowledge to be both captured and crystallised into a toolset, whilst minimising the disruption to existing workloads. The future can be safeguarded without sacrificing the needs of the present.

The major UK utility company that needs greater clarity of the intelligence that underpins its investment planning. With a regulatory requirement to commit to 5-year, multi-billion asset investment strategies, the board needs access to hard evidence which enables it to judge clearly how any potential investment programme will meet the key business objectives. The board needs to be able to:

- compare alternative commercial and technical goals
- visualize the predicted outcomes, and their sensitivity to changing circumstances
- understand the profile of risk across the asset and operating base, and the degree of risk being retained by the company
- understand the uncertainties inherent within the plans

This requires input from a huge diversity of specialist departments, from operators who record visual inspections through to planners who model anything from asset degradation to customer price sensitivity. Each part of the total intelligence picture, contributed by the many expert areas, was translated into a common risk-based language for the first time. The consistent view across the organisation gives the investment board and the regulator confidence in the robustness of the approved strategy.
By bringing people and their ideas together with a new way to communicate, additional levels of insight have not been hard to find. Clients are often surprised at the amount of low-hanging fruit available once the structure is in place.
The Analytics Laboratory - Gaining Behavioural Insight

More often than not, the aim of the analytics laboratory is to understand better how something affecting your business behaves, so you can not only react appropriately to current events but also predict the impact of future change. This change may be driven proactively by you, or it may be external. In the retail sector example we cited, the key is to understand what drives shopping behaviour and its variation across different population segments. Although in other, less clear-cut cases the underlying motivation can seem quite different, behavioural profiling often lies at the heart of the additional insight being sought by the business.

Although technology-driven improvements to a transport network may appear, at first glance, to occupy a completely different world, the similarities are actually very strong. At a time when the industry standard was to dump every last scrap of data generated by on-train management software into a central warehouse, Tessella instead built a streamlined behavioural analytics solution covering both the vehicles and their drivers. Knowing from experience that mining vast amounts of static, historical data for behavioural anomalies was not the way forward, we adopted an intelligence-led, Decisions First, Data Last approach with the train operator. This started with the clarification of the two critical insights they needed to gain:

- what missing knowledge was stopping improvements to the performance and reliability of the vehicles and drivers, and reductions in maintenance costs
- how this new insight needed to be fed into existing business processes in order to make a difference

With this cleared up, we were able to design the most cost-effective way to feed relevant data into an analytics laboratory, for use by their engineers and driving technique analysts.
Manageable, living data sets of derived behavioural measurements are generated in near real-time from the massive raw data feeds. The specialists have a dedicated environment in which to turn, via analytics, their expert knowledge into insight. The behavioural analysis, of both man and machine, is revealing those factors that impact driver performance and vehicle reliability.

The big data challenge was met by partitioning and transforming vast raw data volumes according to the operational and planning needs of the business. Smart design, and the strict discipline to justify any stored data, resulted in a lean analytics solution that exploited smart personalization of the derived knowledge to generate more insight into the behaviour of driver and vehicle populations. Accurate personalisation is the key to delivering behavioural insight, just as dunnhumby found when profiling Tesco's shopper population.
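As a rough illustration of what deriving compact behavioural measurements from a raw feed can mean in practice, the following Python sketch folds a stream of raw readings into one small running profile per driver and then discards the raw values. It is a minimal sketch under stated assumptions, not a description of the solution built for the train operator: the event fields (driver id, speed, brake demand), the `DriverProfile` class, the `update_profiles` function and the harsh-braking threshold are all hypothetical names and values chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, Iterable, Tuple

# Assumed threshold on a normalised brake demand (0..1); purely illustrative.
HARSH_BRAKE_THRESHOLD = 0.8

# Hypothetical raw event: (driver_id, speed_kmh, brake_demand)
RawEvent = Tuple[str, float, float]


@dataclass
class DriverProfile:
    """Derived behavioural measurements for one driver (hypothetical fields)."""
    samples: int = 0
    speed_sum: float = 0.0
    harsh_brake_events: int = 0

    @property
    def mean_speed(self) -> float:
        return self.speed_sum / self.samples if self.samples else 0.0

    @property
    def harsh_brake_rate(self) -> float:
        return self.harsh_brake_events / self.samples if self.samples else 0.0


def update_profiles(
    profiles: DefaultDict[str, DriverProfile],
    raw_events: Iterable[RawEvent],
) -> DefaultDict[str, DriverProfile]:
    """Fold raw events into per-driver running summaries.

    Only the small derived profile is retained; each raw reading is
    dropped once it has been folded in.
    """
    for driver_id, speed_kmh, brake_demand in raw_events:
        profile = profiles[driver_id]
        profile.samples += 1
        profile.speed_sum += speed_kmh
        if brake_demand >= HARSH_BRAKE_THRESHOLD:
            profile.harsh_brake_events += 1
    return profiles


# Example usage with a tiny, made-up feed.
profiles: DefaultDict[str, DriverProfile] = defaultdict(DriverProfile)
feed = [("d1", 80.0, 0.1), ("d1", 60.0, 0.9), ("d2", 100.0, 0.2)]
update_profiles(profiles, feed)
print(profiles["d1"].mean_speed, profiles["d1"].harsh_brake_rate)
```

The design point is that the stored data set grows with the number of drivers and the measures the business has asked for, not with the volume of the raw feed.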
The Analytics Laboratory - Maturity Model

Although its sheer scale and diversity undoubtedly present new challenges, there's a lot more behind an effective analytics solution than just dealing with big data. Despite industry analysts, and the popular and technical press, focussing predominantly upon the issues arising from the explosion of data volumes, there are plenty of other aspects to consider. The development of analytical sophistication follows a clear maturity model, progressing in steps from standard reporting using fixed data mining, through to advanced insight used to optimize business processes. It is important that an organisation understands how far its programme to become more intelligence-led has progressed.

[Figure: Analytics Laboratory Maturity Model. Bands: Reporting, Analytics, Analytical Optimization. Levels, in order of increasing competitive advantage:]

- Scheduled and ad-hoc reporting: What has happened, when, where and how many
- Statistical analysis: Why this is happening
- Forecasting: What the consequences will be
- Predictive modelling: The probability and impact of future events
- Optimization: Achieving the best available

Wherever you are on the journey to becoming an intelligence-led organisation, at the outer edges or deep into the bulls-eye, it is very important to be clear about the level of data needed to meet your current needs. As we saw with the Tesco retail behavioural models, this should be driven by your ability to react to the insight derived from the data. Little business value is gained from investing in an extensive analytics laboratory if other organisational factors limit your ability to implement change. The relentless rise in the cost of big data driven analytics programmes is exacerbated by failing to appreciate this simple fact, and being sucked headlong into the arms race.
You might naturally expect that delivering more sophisticated analytical insight means processing ever more data, with the move to the centre triggering an inexorable slide into big data. This is by no means necessarily the case. One size truly does not fit all when it comes to data analytics. Valuable insight is locked inside datasets of all shapes and sizes. When dealing with big data, smart partitioning and structuring of that data is essential, but working with sparse data often requires you to be even more creative. A mature intelligence-led organisation will have developed a flexible analytics laboratory, able to extract insight from a wide variety of data sources.
It Does not Have to be Big to be Better

In the pharmaceutical industry, new drug discovery is very much a big data problem, dominated by automated activity and toxicity screening of vast numbers of potential compounds. In contrast, final stage clinical trials, which prove whether a drug is safe and effective, are the exact opposite: a painstaking process of recruiting, treating and applying complex statistical analysis to clinical results from a small sample of carefully controlled human subjects. The challenge is to infer the maximum drug response information from the smallest possible number of subjects. The inherent difficulties of running clinical trials raise difficult scientific, cost and ethical questions that have been wrestled with for decades.

Adaptive Bayesian statistical techniques are an excellent example of how much more you can achieve if you look to be smarter with your analytics, as well as bigger in your data. If you have limited data sets, these techniques are able to deliver deeper inference and greater insight, compared to conventional statistics. If you are fortunate enough to have comprehensive data sets, the additional statistical efficiency of the Bayesian techniques allows you to establish clear patterns and robust behavioural inference much earlier in the data processing. Given the scaling implications of managing data access within a big data installation, anything that delivers the required insight faster and more efficiently is very welcome.

Tessella has pioneered advances in the commercial application of adaptive Bayesian statistical techniques to clinical trials, which allow in-progress optimization of the trial and deliver early insight from every new data point. The improved analytics deliver more efficient mapping of dose levels to patient type, within strict trial randomization, allowing ineffective treatments to be stopped earlier and highly effective ones brought forward.
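To make the idea of learning from every new data point more concrete, here is a deliberately simplified Python sketch of one Bayesian interim analysis for a two-arm trial with a yes/no response. It is not Tessella's method and is far simpler than a real adaptive design: it assumes uniform Beta(1, 1) priors, a Beta-Binomial model for each arm, and illustrative stopping thresholds of 0.05 and 0.99. The function names `prob_treatment_better` and `interim_decision` are hypothetical.

```python
import random
from typing import Tuple


def prob_treatment_better(
    succ_t: int, n_t: int, succ_c: int, n_c: int,
    draws: int = 20000, seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(treatment response rate > control rate).

    Each arm's response rate has a Beta(1 + successes, 1 + failures)
    posterior, which follows from a uniform Beta(1, 1) prior (an assumption).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_t = rng.betavariate(1 + succ_t, 1 + n_t - succ_t)
        p_c = rng.betavariate(1 + succ_c, 1 + n_c - succ_c)
        if p_t > p_c:
            wins += 1
    return wins / draws


def interim_decision(
    succ_t: int, n_t: int, succ_c: int, n_c: int,
    futility: float = 0.05, efficacy: float = 0.99,
) -> Tuple[str, float]:
    """Apply illustrative stopping rules at an interim look."""
    p = prob_treatment_better(succ_t, n_t, succ_c, n_c)
    if p < futility:
        return "stop for futility", p
    if p > efficacy:
        return "stop for efficacy", p
    return "continue", p


# Example interim looks on made-up counts (responders, subjects) per arm.
weak_arm = interim_decision(2, 20, 10, 20)     # treatment clearly worse
strong_arm = interim_decision(18, 20, 5, 20)   # treatment clearly better
unclear_arm = interim_decision(6, 20, 5, 20)   # too early to tell
print(weak_arm[0], strong_arm[0], unclear_arm[0])
```

The posterior is updated after every batch of results, so a trial arm that is clearly failing can be identified and stopped without waiting for the full planned sample.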
Early trial termination has saved Wyeth more than $15m [13] and saved countless subjects from enduring the damaging side-effects of trial treatment programmes that will not help prove a drug's efficacy. Variants of the adaptive Bayesian analytics are used in analytics solutions ranging from state-of-the-art radar tracking through to pain management, via characterisation of the behaviour of nerve fibres in response to electrical stimulus. The aim of smart analytics should be to give you more, for less, more quickly.

Conclusion: The analytics skills you need are out there

The world of big data is a high-stakes game to play. The amount of investment being sunk into technology, and the levels of expectation generated by the media as to what this data will deliver, are such that failure to deliver has serious consequences. Most of the evidence to date appears to say that, more often than not, big data is failing to meet those expectations. As the Economist says, "Big data has the same problems as small data, but bigger. Data-heads frequently allow the beauty of their mathematical models to obscure the unreliability of the numbers they feed into them." [14] In the middle of an arms race, this is not a comfortable position to be in.

A fight to secure access to scarce resources is also to be expected within an arms race. As we have seen, the scarce resource in this case is the analytics talent that can turn the essentially dumb data into the insight needed to support better business decisions. It is a mistake, however, to think that the solution is simply to hire or train a greater number of statisticians and number crunchers. That is only part of the picture. Success will not be achieved by laying hands on enough people who understand just their bit of the analytics puzzle.
All the pieces have to be joined together to create a truly intelligence-led organisation, one that works using a common language to pursue common goals that are not continually hijacked by day-to-day operational crises.

The full range of skills, proven in delivering such challenging results, is out there. Companies like Tessella, with a powerful combination of industry domain knowledge, scientific know-how, analytics capability and commercial IT skills, are ideally placed to bridge those gaps. The breadth of experience, gained working for over 30 years across many industry sectors, creates a knowledge base of what works best in each situation. It also means that each new challenge rarely needs to be tackled from scratch. Above all, however, is the need to always target business need over technology, decisions over data, and interaction between people over machines. Only then can you transform a big data organisation into one that is intelligence-led.

If you are struggling to find the analytics expertise that you need, and prefer the idea of analysts putting your data to work in a laboratory to having it sitting in a warehouse, then contact Tessella. You can find more content and examples at consumer-industries/big-data-analytics/
References

1. see T. H. Davenport and J. G. Harris, Competing On Analytics. The data analytics arms race is the key concept behind this illuminating book from the Harvard Business School. Published in 2007, the ideas are still current and relevant
2. see MGI Report Big Data: The next frontier for innovation, competition, and productivity, June 2011, page 3
3. Ibid, Section 4, page
4. Cebr report Data Equity: Unlocking the value of big data, April
5. Ibid, page 10, page
6. see The Economist, Feb 2010 special report: Managing information, Data data everywhere.
7. see McKinsey Quarterly Oct 2011 Competing Through Data for three different senior exec views on addressing the threats and opportunities presented by big data
8. see McKinsey Quarterly Oct 2011 Are you ready for the era of big data? Section 2 deals with the value gained by taking decisions based on controlled experiments made on your data
9. see C. Humby, T. Hunt and T. Phillips, Scoring Points, for a detailed account of the dunnhumby approach
10. Winterberry Group, From Information to Audiences: The emerging marketing data use cases, January
11. Ibid 1, page 144. Davenport and Harris evocatively refer to these rare people as PhDs with personality
12. see Quarterly Nov 2011 Inside P&G's digital revolution for a fascinating insight into how P&G are addressing the challenge of becoming the world's most digital data-led, technologically enabled company.
13. see and follow Savings at Wyeth tab
14. The Economist: Schumpeter: Building with big data, May
Tessella - Copyright August 2013 Tessella, all rights reserved. Tessella, the trefoil knot device, and Preservica are trademarks of Tessella.