Operational Analytics
Version: 101
Table of Contents
- Operational Analytics
- From the Enterprise Data Hub to the Enterprise Application Hub
- Operational Intelligence in Action: Some Examples
- Requirements for Operational Intelligence
- Conclusion
- About the Author
Operational Analytics

The technology industry has been working for decades to improve decision-making in organizations by finding ever-better ways to inform decisions and decision-makers. While progress has been steady, it has been hampered by the limitations of technology, economics, and the extent of feasible methodologies. Until recently, the complexity and immediacy of operational decisions outstripped the capacity of information systems. Gathering the data needed to drive decisions, and having the physical resources to process and store it, has always been a struggle, but the economics of computing today have eliminated those constraints.

Because computing resources have always been expensive, the old methodologies of building applications in the most parsimonious way, or managing from scarcity, still have a hold on IT departments, but that hold is fading quickly. The ability to understand operations, opportunity, and risk is now a reality and can deliver bona fide results by automating many operational decisions and hastening those that still require human input, all without endless design and programming, and without losing the ability to adapt quickly to changing conditions.

Tools for aiding decision-making, such as Business Intelligence, Data Discovery, and even Decision Management and Complex Event Processing, largely deal with data internal to an organization. That data is typically well-structured, though rarely clean, either within a source system or, especially, when integrating it with other systems. Organizing it for analysis is a time-consuming process, slowed by the need to move data to low-powered servers for cleansing and integration. These approaches were, and still are, useful for understanding strategic and tactical aspects of the organization, where analysis and discussion can take place at a more relaxed pace.
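To make the cleansing and integration burden concrete, here is a minimal sketch, using entirely hypothetical record layouts and field names, of the kind of normalization and joining work that traditional BI pipelines must perform before any analysis can begin:

```python
# A minimal sketch (hypothetical records and field names) of cleansing two
# source-system extracts and integrating them for analysis.

def clean_crm(records):
    """Normalize a CRM extract: trim whitespace, lowercase emails."""
    return {
        r["cust_id"]: {"name": r["name"].strip(), "email": r["email"].strip().lower()}
        for r in records
    }

def clean_orders(records):
    """Normalize an order-system extract: cast string amounts to floats."""
    cleaned = {}
    for r in records:
        cleaned.setdefault(r["customer"], []).append(float(r["amount"]))
    return cleaned

def integrate(crm, orders):
    """Join the two cleaned sources on customer id."""
    return {
        cid: {**info, "total_spend": sum(orders.get(cid, []))}
        for cid, info in crm.items()
    }

crm_raw = [{"cust_id": "C1", "name": " Ada Lovelace ", "email": "ADA@EXAMPLE.COM "}]
orders_raw = [{"customer": "C1", "amount": "19.99"}, {"customer": "C1", "amount": "5.01"}]

combined = integrate(clean_crm(crm_raw), clean_orders(orders_raw))
print(combined["C1"]["total_spend"])  # 25.0
```

Even in this toy form, every field needs attention before the join is trustworthy; multiplied across dozens of source systems, this is where the time goes.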
Operational Analytics, on the other hand, offers the promise of automating analytics in order to reach end users (or systems) during the decision-making process itself, leading to Operational Intelligence. The economics of Operational Intelligence are also very different from those of existing forms of data-driven decision-making and decision automation. Rather than requiring proprietary software licenses for relational databases, data transformation, modeling tools, business rules management systems, statistical tools and applications, and high-end server platforms, Operational Intelligence can be deployed on mostly open-source software and relatively inexpensive clusters of servers. In addition, the big data approach aims to gather data in one place for many uses, rather than making copies and subsets for silos of users, which increases the cost and complexity of the environment. The components of the big data approach were almost entirely born in the cloud, so implementing them in various cloud configurations is much simpler than integrating the legacy tools described above.

From the Enterprise Data Hub to the Enterprise Application Hub

Hadoop was initially designed to ingest extremely large amounts of data in all sorts of formats for the purpose of indexing search engines, doing web analytics, and performing other data-intensive operations. The early characterization of Hadoop was as the platform for big data, loosely defined by the three Vs: Volume, Velocity, and Variety. How that data was useful, and to what extent, was not clear. Applications consisted of code developed in the MapReduce framework. Preparing these transformations and using them was done by professionals with a high degree of skill in data management, programming, advanced quantitative methods, and even presentation: the so-called data scientists. That has all changed. It became clear that Hadoop could be pressed into service for far more uses than a single data scientist could handle.
Plus, data scientists were hard to come by. But before CIOs would consider using Hadoop as an enterprise platform, they needed assurance that five critical areas were addressed: scalability (including concurrent users and workload management), flexibility, fault tolerance, resource management, and security.
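To illustrate the kind of code-level work the early MapReduce era demanded, here is a minimal local simulation, not a real Hadoop job, of the classic word-count pattern those applications were built around:

```python
# A minimal local simulation of the MapReduce word-count pattern.
# A real Hadoop job distributes these phases across a cluster.
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line.
    for word in line.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Shuffle phase: sort and group pairs by key, as the framework would.
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

def reducer(word, pairs):
    # Reduce phase: sum the counts emitted for each word.
    return (word, sum(count for _, count in pairs))

lines = ["big data big ideas", "big clusters"]
pairs = [pair for line in lines for pair in mapper(line)]
counts = dict(reducer(word, group) for word, group in shuffle(pairs))
print(counts["big"])  # 3
```

Expressing even simple transformations this way, then tuning and operating them on a cluster, is why early Hadoop work required such specialized skill.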
The expansive and energized Hadoop community went to work building an ecosystem and expanded it at a pace almost unheard of in technology. The entire nature of Hadoop has transformed from a utility for individual investigators into a true enterprise platform. The term Enterprise Data Hub (EDH) rose from obscurity a few years ago and is now part of the common lexicon of data management. There is some confusion about the role of the EDH versus the data warehouse, but that is working itself out. Like the data warehouse, whose name alludes to just a collection of data, the EDH is somewhat limited by the word "data" in its name, because the EDH is not just a collection of data: it is rapidly emerging as a powerful application hub.

Operational Intelligence in Action: Some Examples

Streaming data applications require a platform engineered for extreme performance, but that is only the first step. For example, a commercial aircraft has engine sensors that monitor operation in flight in real-time, displaying a steady visual stream for eyeball monitoring, with warnings when a value goes out of range. This is useful to a point, but when the system can anticipate a problem by monitoring activity patterns and constructing complex events from the data, it crosses the threshold from streaming analytics to an intelligent system. It gets even more interesting when the monitoring application pulls data from multiple streams, such as the fuel system, temperature, and airspeed. This is where Operational Analytics becomes Operational Intelligence.

Not all operational intelligence applications are as critical as monitoring an aircraft in flight, but they can have a real impact. For example, a clothing manufacturer and retailer can monitor point-of-sale data in real-time, weight sales of certain items by stores that are seen as trend leaders, and then generate orders to boost manufacturing 5,000 miles away for suddenly hot sellers.
Conversely, they can cut orders for items that predictive models show are slowing down. In other situations, businesses often see their sales suddenly slump and have only rule-of-thumb hypotheses about the cause and remedy. Streaming operational data, combined with the application of Social Physics (the ability to capture and use data from social media and a host of other non-traditional data sources), can provide immediate guidance. A giant leap in analytics is possible with the end of sampling and aggregation and the realization that tiny details matter.

Requirements for Operational Intelligence

Be clear that analytics only leads to good decision-making with a proper adjustment to the business process. Like the old saying "all dressed up with nowhere to go," the best analytics in the world can't help you if you don't have a conduit for action. This means informing or alerting people who are in a position to take action, or a direct connection to the operational systems that put a decision into effect. A change in direction in a business process usually involves more than one operational system, so organizations with a functioning Business Process Automation system will find the implementation of operational intelligence more direct.

A willingness to adopt completely new (and even bewildering) measures is also required. If advanced quantitative systems (predictive/prescriptive analytics) only highlighted what is already known, they would be disappointing. As new, sometimes counter-intuitive measurements emerge, there will naturally be a reluctance on the part of incumbents to adopt them readily. This kind of change management takes time.

Data scientists are useful to a point, but you should employ a distributed network of analysts and decision-makers. No matter how skilled data scientists are, their scarcity cannot be allowed to create a bottleneck.
A great deal of the work they do can be done by others with less training, freeing data scientists for the higher-value work they are uniquely capable of.
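The "conduit for action" point above can be made concrete with a small sketch. Returning to the retail example, point-of-sale events stream in, sales at trend-leading stores are weighted more heavily, and crossing a velocity threshold triggers an automated reorder decision. The store weights, threshold, and event layout here are all hypothetical illustrations, not a reference design:

```python
# A minimal sketch of an automated "conduit for action": streaming
# point-of-sale events drive a reorder decision without human review.
# Weights, threshold, and event format are hypothetical.
from collections import defaultdict

LEADER_WEIGHT = {"store_7": 2.0}   # hypothetical trend-leading store
REORDER_THRESHOLD = 10.0           # hypothetical weighted-sales trigger

def score_stream(events):
    """Accumulate weighted sales per item and emit reorder actions."""
    weighted = defaultdict(float)
    actions = []
    for store, item, qty in events:
        weighted[item] += qty * LEADER_WEIGHT.get(store, 1.0)
        if weighted[item] >= REORDER_THRESHOLD:
            actions.append(("reorder", item))
            weighted[item] = 0.0   # reset the accumulator after acting
    return actions

events = [
    ("store_7", "jacket", 3),   # leader store: counts double (6.0)
    ("store_2", "jacket", 2),   # ordinary store (8.0)
    ("store_7", "jacket", 1),   # weighted total reaches 10.0 -> act
]
print(score_stream(events))  # [('reorder', 'jacket')]
```

The decision logic is trivial; the point is the wiring. The output of the analytic feeds an operational system directly, which is exactly what the business-process adjustment described above has to provide.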
Getting analytics into the user workflow matters. Packaged analytics for those with only modest training in statistics are available now. Some even provide thorough guidance and error-resolution suggestions. This is becoming a very competitive field with many compelling solutions.

Create a framework for operational intelligence applications to integrate back into the operational workflow quickly as analytical and predictive models change. One criticism of data science practice is that, from the time data scientists develop, test, and vet a model, it takes far too long for IT to recode it, test it, and put it into production. Predictive models can, and often do, become stale before that happens.

Conclusion

Operational Analytics at the scale, speed, and complexity of actual operational events is now possible because of the economics of big data and Hadoop. The Hadoop ecosystem quickly adapted from one meant to serve a narrow range of applications to one that can serve the broadest range of enterprise applications and needs. Operational Intelligence is poised to change the way organizations do business by informing, and even enacting, decision-making in real-time.

About the Author

Neil Raden, based in Santa Fe, NM, is an industry analyst, active consultant, widely published author and speaker, and the founder of Hired Brains Research LLC, http://www.hiredbrains.com. Hired Brains provides research, advisory, and consulting services in Analytics, Big Data, and Decision Management for clients worldwide. Neil is also the co-author of the Dresner Advisory Services "Wisdom of BI" series on Advanced and Predictive Analytics. He was a contributing author to one of the first (1995) books on designing data warehouses, and he is more recently the co-author of Smart (Enough) Systems: How to Deliver Competitive Advantage by Automating Hidden Decisions (Prentice-Hall). He is a contributor to publications such as Wall Street Week, Forbes, InformationWeek, and Computerworld.
He welcomes your comments at nraden@hiredbrains.com or on his blog at http://hiredbrains.wordpress.com.
About Cloudera

Cloudera is revolutionizing enterprise data management by offering the first unified platform for big data: an enterprise data hub built on Apache Hadoop. Cloudera offers enterprises one place to store, access, process, secure, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamental new ways to derive value from their data. Cloudera's open-source big data platform is the most widely adopted in the world, and Cloudera is the most prolific contributor to the open-source Hadoop ecosystem. As the leading educator of Hadoop professionals, Cloudera has trained over 22,000 individuals worldwide. Over 1,400 partners and a seasoned professional services team help deliver greater time to value. Finally, only Cloudera provides proactive and predictive support to run an enterprise data hub with confidence. Leading organizations in every industry, plus top public sector organizations globally, run Cloudera in production.

For additional information, please visit us at www.cloudera.com.

cloudera.com | 1-888-789-1488 or 1-650-362-0488
Cloudera, Inc., 1001 Page Mill Road, Palo Alto, CA 94304, USA

© 2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.