ROCANA WHITEPAPER Control Your Modern Infrastructure
ABSTRACT
Managing IT operations is getting harder at the same time that business leaders are expecting more efficiency and flexibility due to the direct impact of IT on customers. IT managers are facing both increasing infrastructure complexity (due to technologies like containerization, IaaS, PaaS, Software Defined Networking, and Software Defined Storage) and increasing application complexity (SOA, microservices, and cloud integration). Ironically, the very advances that promise efficiency are creating new levels of complexity for IT managers. Siloed monitoring systems, limited cross-domain visibility, and massive growth in machine data volumes all contribute to a loss of control and decline in service level assurance. Rocana is creating software for Augmented IT Operations that provides powerful new ways of analyzing and interacting with data to improve operations.

"Failure is not fatal, but failure to change might be."
JOHN WOODEN, LEGENDARY COACH, UCLA BASKETBALL

2015 Rocana, Inc.
INTRODUCTION
There may be no stronger proof that the only constant is change than the pace of technological innovation. For example, consider the rise of Docker over the last two years and the threat containerization poses to virtualization, a technology that itself went from practically unknown to ubiquitous within a decade.

This rapid pace of change in technology brings with it both opportunity and peril. CIOs are facing a sea change in infrastructure technology and operations that exemplifies the double-edged sword of progress:
- Software-Defined Networks, Storage and Data Centers
- Cloud-based and hybrid cloud applications
- Mobile device proliferation
- Containerization and Data Center OS (Docker, Mesosphere, Kubernetes, etc.)
- Microservice architectures

Each of these new approaches was designed to make IT more efficient, but at the same time each creates new challenges. For example, microservices should shorten development cycles and enable faster time to innovation. Yet once deployed, their loosely coupled, scale-up/scale-down nature adds significant and novel complexity to monitoring and managing the hosting and delivery infrastructure.

While dealing with all the underlying technology issues, CIOs also face greater pressure from business leaders to be a better partner. Business leaders know that IT can be a key component of creating competitive advantage, now more than ever, owing to the increasingly direct role IT plays in customer experience. A recent report by Watermark Consulting showed that leaders in Customer Experience outperformed the market by 35% while laggards trailed the market by 45% [1]. With such high stakes, CIOs are asked not only to run infrastructure but also to be experts in how technology can be used to advance business objectives, all with the expectation that they'll deliver those results faster and more efficiently than ever before!
A NEXT-GENERATION SOLUTION FOR NEXT-GENERATION INFRASTRUCTURE
At Rocana, we have a vision for the next generation of IT operations software: a new paradigm of Augmented IT Operations, where Big Data, advanced analytics, and visualizations are applied to DevOps, guiding users to the root causes of any operational or security issues.

TOTAL OPERATIONAL AWARENESS
Modern, complex IT environments are plagued by limited visibility and incomplete information. Large organizations typically have 10 or more legacy monitoring and management solutions, each addressing a limited part of the stack and often restricting the amount of data collected because of technical or licensing limitations. This is where Rocana's Big Data approach gives IT significantly greater operational awareness. Rocana Ops is designed to collect and retain operational
data across the full infrastructure and application stack, from hardware devices to software systems and applications. With complete fidelity of information and a single source of truth, Rocana fundamentally changes how IT triages application, infrastructure and security issues.

DO I REALLY NEED TO COLLECT THAT DATA?
A Fortune 500 retailer had decided not to collect and monitor syslog data from switches and routers because it was too expensive with their existing solution. During a proof-of-concept with Rocana, the customer started analyzing this data and quickly identified the early stages of a denial-of-service attack in part of their infrastructure that they didn't know was susceptible.

Data collection and retention is the first part of the puzzle, and no easy feat. Many enterprises are generating more than 1TB of log data per day, and that number is doubling every year. With such large data volumes, it is no longer possible to rely on humans to interpret it all, which is why Rocana pioneered Augmented IT Operations. At the core of the solution are advanced analytics that find anomalies in the data. As data streams in from all the different systems in your enterprise, Rocana Ops is constantly developing models and running purpose-built algorithms to spot differences in patterns of behavior.

The key point: whether because they cannot cope with scale or because they were designed for only one segment of the stack, legacy tools limit visibility and impair effective IT operations. Rocana Ops was designed to address these limitations. Rocana Ops:
- collects operational data from every piece of hardware and software
- maintains a history of activity for search and visualization
- analyzes streaming data in real time

The resulting visibility into all aspects of your infrastructure, coupled with assurance that systems are being actively monitored with advanced analytics, is what we call total operational awareness.

HOW DOES ROCANA HELP ME?
Total Operational Awareness is a strong starting point for regaining control of your complex infrastructure. As a CIO you need more than that: you need to know what problems Rocana is actively addressing. Those problem areas are:
- IT Operations
- Security
- Business Analytics

IT OPERATIONS
IT monitoring and management tools haven't changed much in the past decade. IT has dashboards that show the status of servers and other components based on historically sensitive metrics. They have alerting tools that notify the team when some rule-based condition is met. These systems worked reasonably well when the operating systems, applications, and processes on machines ran in isolation and were stable for months at a time. With modern infrastructure such as containerization, microservices, and converged infrastructure, where the failure of individual components is not only probable but expected, this model is no longer effective. Most of the time, infrastructure issues are found by customers, not by IT staff. At best this is frustrating to users; at worst it causes customer churn and lost revenue.
Tools like ELK and Splunk are available, but they offer little more than glorified grok, grep, and charting. The search-first, brute-force model has not fundamentally changed. ELK and Splunk users simply have a larger set of data to pore over and better indexing into the data set. The onus is still on the user to understand the infrastructure components, the machine-generated data, and, more importantly, the relationships between them. When you have 1TB or more of data per day, this is no longer a scalable approach to IT operations. Consider what happens when you reach 10TB, 100TB, or a petabyte per day.

In contrast, Rocana gives IT teams Augmented IT Operations: advanced analytics, out-of-the-box visualizations, and machine learning algorithms that aid IT teams in performing their daily tasks, especially root cause analysis. For example, Rocana lets IT staff flag conditions with annotations that overlay notifications triggered by other systems. When problem solving, annotations allow IT admins to record their current interpretation alongside the data to enhance collaboration. These annotations are critical for IT admins working in elastic environments who need to go back in time and review how previous, similar problems were debugged and solved, or what other team members suspected under similar conditions.

SECURITY
The threat landscape has turned inside out over the past few years, with for-profit hacking networks and state-sponsored attacks significantly increasing the sophistication and ramifications of breaches. The common wisdom today is that nearly every organization has already been breached, with attackers quietly navigating between systems looking for high-value data. The security industry has responded by providing a plethora of solutions to address threat vectors, with SIEMs leading the charge, augmented by point products for Privileged User Monitoring, Identity Access Management, Advanced Persistent Threats, security analytics, and more.
The same Big Data approach that serves the IT operations users so well is an equally effective tactic in security analytics. In fact, a new security architecture is emerging that mandates a Big Data solution as the common warehouse for security events.

[Figure: Rocana as the Hub of a Modern Security Architecture]
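One simple way to picture analytics over a shared event store is to compare each interval's log volume against a recent baseline and flag large deviations in either direction. A sudden drop can mean logging was switched off, and a sudden surge can mean repeated exploit attempts. The Python sketch below is illustrative only and is not Rocana's algorithm: the function name, the trailing-window z-score technique, the parameters `window`, `threshold`, and `min_sigma`, and the sample counts are all assumptions made for this example.

```python
from statistics import mean, stdev


def flag_volume_anomalies(counts, window=12, threshold=3.0, min_sigma=1.0):
    """Flag intervals whose event count deviates sharply from a trailing baseline.

    counts    -- log events per interval (e.g. per 5 minutes) for one host
    window    -- number of preceding intervals used as the baseline (hypothetical default)
    threshold -- how many standard deviations count as anomalous (hypothetical default)
    min_sigma -- floor on the standard deviation so a perfectly flat
                 baseline does not cause a division by zero
    Returns a list of (index, count, direction) tuples.
    """
    anomalies = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu = mean(baseline)
        sigma = max(stdev(baseline), min_sigma)
        z = (counts[i] - mu) / sigma
        if abs(z) > threshold:
            # A drop may mean logging was disabled; a spike may mean
            # rapid-fire attempts against the host.
            direction = "spike" if z > 0 else "drop"
            anomalies.append((i, counts[i], direction))
    return anomalies


# Hypothetical per-interval counts: steady traffic, then silence, then a surge.
sample_counts = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 0, 100, 900]
for index, count, direction in flag_volume_anomalies(sample_counts):
    print(f"interval {index}: {count} events ({direction})")
```

With this sample data, the silent interval is reported as a drop and the 900-event interval as a spike. A production system would need per-host and seasonal baselines, and would keep flagged intervals out of later baselines. This sketch does neither.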
The advanced analytics implemented in Rocana Ops are particularly germane to security. For example, Rocana Ops automatically detects anomalous over-utilization and under-utilization. When hackers attack, they often try to turn off logging in order to mask their presence, something Rocana will detect and flag as a change in system behavior. On the other side of the equation, many attacks involve rapid-fire, repeated attempts at exploiting a weakness, which in turn results in significant increases in log volume. Because attacks can be as distributed as the applications and services they target, a Big Data approach is necessary to develop the baseline models and event volumes required to detect such changes across a vast IT environment. Rocana Ops can identify such scenarios and flag them as anomalies. Our analytics component can even be trained to monitor particularly sensitive locations, hosts, and services with customized attention and adjusted alert thresholds.

BUSINESS ANALYTICS
Businesses are clamoring for more rapid feedback on key performance indicators. The day- or week-long delays associated with batch-oriented processing of warehoused data are no longer aligned with business needs. In many cases, the real-time data collection and analysis features of Rocana Ops are perfectly suited to the task. Using Rocana Ops, a retailer can analyze clickstream data to determine how long it is taking to process online checkout or the latency of product catalog search. By enriching customer clickstream data with geolocation data for IP addresses, the retailer can identify which geographies are performing poorly and spin up more elastic capacity to restore service levels. In a more complex example, consider an airline that has a highly customized web and mobile application for ticketing integrated with their yield management system.
By similarly analyzing clickstream data in real time, the airline can determine the performance of different yield management algorithms, leading to pricing adjustments that improve booking rates. While both of these examples are related to clickstream analysis, the utility of Rocana Ops for business analytics is much broader. For example, IoT applications often require large-scale monitoring and real-time analytics. Other good use cases include financial fraud detection, VOIP and mobile call-data record (CDR) quality monitoring, and traffic congestion detection.

CONCLUSION
Modern IT gives businesses the opportunity to gain competitive advantage by embracing technologies such as software-defined networking, storage and data centers, microservices, containers and other elastic infrastructure. These technologies come with significant operational differences that legacy tools were not designed to handle. By pioneering Augmented IT Operations (the application of Big Data, advanced analytics and visualization to DevOps), Rocana has created the next generation of IT
operations software. Designed to handle the complexity of next-generation IT infrastructure, Rocana Ops gives IT operators total awareness of all infrastructure elements in a single location, speeds time to resolution, and increases control.

ABOUT ROCANA
Rocana (formerly known as ScalingData) provides enterprises with the ability to maintain control of their modern, global-scale infrastructure. By using Big Data and advanced analytics, Rocana augments staff skills to increase efficiency and awareness, thereby improving service assurance. Unlike brute-force, legacy log management tools that lack scalability, are slow, and have poor cost-to-value ratios, Rocana Ops is optimized to manage huge amounts of data and encourage analysis to show a complete picture of IT operations.

[1] http://www.watermarkconsult.net/docs/watermark-customer-experience-roi-study.pdf

Rocana, Inc.
548 Market St #22538, San Francisco, CA 94104
+1 (877) ROCANA1
info@rocana.com
www.rocana.com

2015 Rocana, Inc. All rights reserved. Rocana and the Rocana logo are trademarks or registered trademarks of Rocana, Inc. in the United States and/or other countries.
WP-CMI-0715