IT Workload Automation: Control Big Data Management Costs with Cisco Tidal Enterprise Scheduler




What You Will Learn

Big data environments are pushing the performance limits of business processing solutions. Platforms such as Hadoop can help analyze large amounts of complex, unstructured data, but they require time-consuming management and can create service-level agreement (SLA) performance bottlenecks. Cisco Tidal Enterprise Scheduler provides workload automation that facilitates the flow of data between a wide variety of applications such as Hadoop, cutting costs and increasing the value realized from big data. Cisco Tidal Enterprise Scheduler provides:

- End-to-end business process management and visibility
- Capacity awareness of the entire business processing environment
- Real-time, latency-free job stream automation
- Reduction of errors in business analytics and reporting
- SLA awareness for historical and predictive analysis

Big Data Is Big Business

From data collection to storage, retrieval, analysis, and reporting, business processing depends on the ability to move data between files, applications, databases, and business intelligence solutions. For many years, advances in networking, storage, and data processing applications kept pace with the growing volume of enterprise data and the growing needs of business processing. But with the accelerated growth of unstructured data such as voice, text, and video streams, the traditional data processing approach breaks down. Analyzing the torrent of chat and text streams to measure customer sentiment is a different exercise than running a sales report against a Structured Query Language (SQL) database. Today's challenge is capturing unstructured data and quickly manipulating it to meet a variety of business goals and deliver valuable information to the end business user. This is the world of big data.

How Big Data Impacts Business Processing

Big data is the point at which standard SQL data storage and data warehouse processing break down. High data volumes and the speed of voice and video delivery are obvious causes, but there are other concerns as well.

- Variability: The type of data consumed may vary. Can it be stored in a standard SQL structure, or does it need a special application or dataset?
- Variety: Disparities may be due to the sources. Is the data coming in from different telemetry sources, and therefore formatted in different ways? Is it perhaps in Microsoft Excel format from one source and a data log or video stream from another?
- Scalability: The ability to scale is a critical issue, because the growing volume and diversity of business data is the primary driver of big data. Should the business processing system support linear scalability or a different scaling model? Can the infrastructure handle the quantity of incoming data? Are the CPUs sized correctly? Does the storage environment have the proper capacity, and is it readily scalable?
- Reliability: Big data solutions must support 24-hour operation. Is the data processing software designed to withstand multiple failure points to support business continuity? Can the workloads be automated to such a degree that they do not create processing backlogs?

Gaming and continuously available entertainment data centers are good examples of big data environments that can put tremendous stress on data processing systems. Data flows into a casino data center from its website reservation system, security videos, and guest check-in, with multiple data elements such as information about credit cards and unique customer needs. Some casinos also provide guests with a swipe card that records every activity, adding to the deluge of incoming data. The data center must absorb this information 24 hours a day, 7 days a week, 365 days a year. There is no downtime to run batch windows; data must be processed and analyzed as it is collected. For example, marketing professionals might want to analyze the profitability of existing offers so they can provide guests with updated deals communicated through their in-room video. Achieving this level of data management requires rapid analysis of huge streams of data gathered from a wide variety of sources. Without a smooth transition between the applications that manage the various types of input, bottlenecks and performance issues can reduce data center efficiency and prevent businesses from taking advantage of opportunities to increase revenue.

While gaming and always-on entertainment environments are examples of big data overload, big data is actually widespread.

- Web and e-commerce: Collecting, analyzing, and quickly using data is the primary focus for the Internet market. Web and e-commerce businesses need fast turnaround of collected data to understand customer behavior and turn it into delivered, targeted advertising.
- Retail: Like web and e-commerce businesses, retail operations depend on fast analysis of large amounts of structured and unstructured data to quickly turn it into actionable information. But point-of-sale transactions add the complexity of multiple touchpoints and multiple telemetry feeds, which complicates data storage, retrieval, and analysis.
- Insurance and finance: Protecting profitability while meeting customer needs is vital to these industries, and balance is critical. These businesses must make sure that they are creating the best possible offerings while limiting risk. Incoming data feeds must be collected and analyzed in near real time to stay ahead of the market.

- Biotech and pharmaceutical: Regulatory compliance has a significant impact on business processing systems in these industries. Whether in manufacturing or in product trials, data streams from different sources must be recorded, processed, and analyzed accurately and on time. Every telemetry feed must be auditable. And as these companies take on multiple points of regulatory compliance, the tracking burden and the need for efficient big data management increase.

Regardless of industry, the goal of managing a big data environment remains the same: capture vast amounts of diverse data, manipulate it, and quickly use it in a manner that is aligned with business requirements.

A New Era of Analytics

While structured data can be readily analyzed using traditional data mining techniques, unstructured data cannot. Correlating massive amounts of structured and unstructured data to develop useful information requires a big data analytics engine. A variety of platforms are available to meet the specialized needs of analyzing big data. Big data solutions range from enterprise-level commercial offerings to free, open-source solutions. Some handle only structured data, such as SQL databases and data warehouses, while others are compatible with unstructured data as well.

As an open-source tool with the ability to manage unstructured data, Hadoop is a popular platform for managing big data. Hadoop has several building blocks. These include top-level abstractions such as loading data into or extracting it out of the Hadoop environment, processing data into different datasets, and a file structure that allows data to be quickly retrieved and analyzed in real time.

While solutions such as Hadoop provide many advantages, a common pitfall is the lack of integration with existing data warehouse technology. Hadoop, for example, is a standalone environment in which data is manually entered and reports are manually extracted, as the sketch below illustrates. It was not designed to be an automated environment for the enterprise, and it is not integrated with broader business processing environments. As a result, there is no mechanism for monitoring and controlling the flow of data between enterprise resource planning (ERP) applications and Hadoop environments. Without proper workload management, any part of the infrastructure can easily be overloaded, reducing performance and reliability.
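To make that integration gap concrete, here is a minimal sketch of the hand-driven load, process, and extract cycle a standalone Hadoop environment requires. The `hdfs dfs` and `hadoop jar` commands are standard Hadoop CLI calls, but the file paths, the streaming-jar location, and the mapper and reducer scripts are illustrative assumptions, not details taken from this paper.

```python
import subprocess

def run(cmd):
    # Echo and execute one shell command; abort the sequence on failure.
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Manually load raw data into HDFS; nothing upstream triggers this step.
run(["hdfs", "dfs", "-put", "-f", "/staging/clickstream.log", "/data/incoming/"])

# 2. Manually launch the processing job (streaming-jar path is illustrative).
run(["hadoop", "jar", "/opt/hadoop/share/hadoop/tools/lib/hadoop-streaming.jar",
     "-input", "/data/incoming/clickstream.log",
     "-output", "/data/processed/sentiment",
     "-mapper", "mapper.py",
     "-reducer", "reducer.py",
     "-file", "mapper.py",
     "-file", "reducer.py"])

# 3. Manually extract results for downstream reporting tools.
run(["hdfs", "dfs", "-get", "/data/processed/sentiment/part-00000", "/reports/"])
```

Each step runs in isolation: nothing tells this sequence when an ERP extract has landed in the staging area, and nothing tells a reporting job that the final step has finished. That monitoring and control gap is what a workload automation layer is meant to close.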
One Tool for Many Business Processing Touchpoints

With so much data coming in from so many sources and in such a diversity of formats, any business processing weakness can become magnified in a big data environment. As a result, a successful big data solution requires end-to-end business process visibility and a smooth transition between applications, databases, and big data processing environments. Big data workload solutions must therefore manage the input and output of data at every point through which data is acquired. Whether the data comes from telemetry feeds, reservation agents, retail consignments, or other sources, every piece of incoming data must be handled by one solution. And that toolset must have the following attributes.

- End-to-end visibility: A true enterprise workload automation solution that can scale to meet the demands of a big data environment must be able to see the entire business processing environment. Data sources from business partners through FTP and data exchange mechanisms, to ERP and databases, to data warehouse solutions, to data integration, to business information solutions must be accessible through a single management console.

- Deep API integrations: Not only does the solution need a high degree of visibility, but it must be able to interact with each touchpoint at a comprehensive and easy-to-administer level. Each process step in the entire workload has to be smoothly integrated with the previous one, allowing administrators to quickly define, test, run, and customize the entire job stream. The tool must be able to interact with each application at a fundamental level, without unnecessary code management or language barriers.
- Capacity awareness: The data center administrator needs to understand what percentage of data center resources each business process requires, and which tools the data passes through as it enters and exits each phase of business processing. Understanding the volume of data that will emerge from every input source is critical, because there is less and less opportunity to adjust to capacity fluctuations as the amount of data grows. A lack of capacity awareness can increase operating costs and processing errors while decreasing resource utilization.
- Real-time, latency-free delivery: The ability to run business processes in real time is an important complement to capacity awareness, because latency is critical to meeting big data SLAs. As each process completes, another must pick up the data and continue the business process flow. This is traditionally a labor-intensive, manual process that must be automated in a big data environment.
- Real-time, error-free analytics: Analytics are the logical delivery endpoints of big data business processing environments. Lack of integration with business intelligence and analytics applications can result in a high error rate due to disconnected processes and missed or degraded service-level agreements.

Controlling Big Data Workflows

Cisco Tidal Enterprise Scheduler acts as an OS-like management layer that automates and tracks the flow of data between the infrastructure layer of compute, network, and storage devices and the data movement layer of Hadoop, ERP, web services, and other data feeds. Controlling data flow between the infrastructure and applications simplifies business processing management and integrates the Hadoop environment into the broad range of business processes in the enterprise. The functionality is similar to controlling water movement through dams, reservoirs, pumping stations, treatment plants, and pressure regulation points: water is permitted to flow and is processed as needed without flooding any part of the system. When the tap is turned on, you get the right amount, with the right quality, at the right time.

Cisco Tidal Enterprise Scheduler provides enterprise-class workload automation by handling the complete, end-to-end business process environment. Integration is managed through APIs with a number of different application sets and databases, including Oracle, Informatica, IBM, Microsoft, and SAP. Each interface has been carefully developed and tested to work transparently with each application. Combining these third-party data management partnerships with Cisco's workload automation solution has the added benefit of providing a low-risk path for transitioning a Hadoop environment from a test operation to an enterprise-ready solution.
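This paper does not document the Scheduler's API, so the following sketch is a generic, hypothetical illustration, in plain Python with invented class and job names, of the event-driven behavior described above: each job in the stream starts the moment its predecessor completes, rather than waiting for a fixed batch window.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Job:
    name: str
    action: Callable[[], None]
    downstream: List["Job"] = field(default_factory=list)

    def then(self, job: "Job") -> "Job":
        # Declare a dependency: `job` is released as soon as this job completes.
        self.downstream.append(job)
        return job

def run_stream(root: Job) -> None:
    # Event-driven execution: completing one job immediately releases its
    # dependents, eliminating the idle time of fixed batch windows.
    queue = [root]
    while queue:
        job = queue.pop(0)
        print(f"running {job.name}")
        job.action()
        queue.extend(job.downstream)

# Hypothetical job stream: partner FTP intake -> Hadoop analysis -> BI refresh.
ftp_pull = Job("ftp_pull", lambda: print("  pulled partner feed"))
hadoop_job = Job("hadoop_job", lambda: print("  ran Hadoop analysis"))
bi_refresh = Job("bi_refresh", lambda: print("  refreshed BI dashboards"))

ftp_pull.then(hadoop_job).then(bi_refresh)
run_stream(ftp_pull)
```

A production scheduler adds what this toy omits: capacity checks before releasing a job, SLA timers, retries on failure, and a console that shows the whole stream end to end.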

Conclusion

As big data proliferates, solutions such as Hadoop must be integrated with existing data management environments. Cisco Tidal Enterprise Scheduler is a dynamic, intelligent, efficient solution for integrating big data processing and analytics into an existing data center infrastructure. It offers compatibility with leading applications, capacity awareness to prevent bottlenecks, automated process orchestration, and complete visibility across the business processing environment.

For More Information

To learn more about Cisco products and services, please visit http://www.cisco.com/go/workloadautomation.

Printed in USA C11-713162-00 08/12

© 2012 Cisco and/or its affiliates. All rights reserved. This document is Cisco Public Information.