Cloudera Enterprise Data Hub
GCloud Service Definition
Lot 3: Software as a Service
December 2014
CONTENTS

1 SERVICE OVERVIEW & SOLUTION
  1.1 Service Overview
  1.2 Introduction to Cloudera
  1.3 Cloudera Enterprise Overview
  1.4 Cloudera Enterprise Data Hub Components
    1.4.1 CDH
    1.4.2 Cloudera Manager
    1.4.3 Cloudera Navigator
    1.4.4 Cloudera Support
2 INFORMATION ASSURANCE
3 BACKUP/RESTORE AND DISASTER RECOVERY PROVISION
4 ON-BOARDING AND OFF-BOARDING PROCESSES
  4.1 On-Boarding
  4.2 Off-Boarding
5 SECURITY
6 SERVICE MANAGEMENT DETAILS
  6.1 Technical Boundary
  6.2 Support Boundary
  6.3 User Authorization and Roles
  6.4 General Support details
7 SERVICE CONSTRAINTS
  7.1 Planned Maintenance
  7.2 Emergency Maintenance
8 SERVICE LEVELS
  8.1 Case Priority Definitions
  8.2 Support SLAs
  8.3 Escalation Timelines
  8.4 Award of Service Credits
  8.5 Payment of Service Credits
9 Financial recompense
10 TRAINING
11 INVOICING PROCESS
12 TERMINATION TERMS
13 DATA EXTRACTION /REMOVAL CRITERIA
  13.1 Data standards in use
  13.2 Consumer generated data
  13.3 Data extraction
  13.4 Price of extraction
  13.5 Purge & destroy
14 DATA PROCESSING AND STORAGE LOCATION(S)
15 DATA RESTORATION / SERVICE MIGRATION
16 CUSTOMER RESPONSIBILITIES
17 TECHNICAL REQUIREMENTS
18 BROWSERS
19 DETAILS OF ANY TRIAL SERVICE AVAILABLE
20 ICT Greening Policy Compliance
21 ICT Strategy Policy Compliance
22 W3C Compliance
1 SERVICE OVERVIEW & SOLUTION

1.1 Service Overview

Cloudera Enterprise is a revolutionary data management platform that is designed specifically to address the opportunities and challenges of Big Data. Cloudera Enterprise combines Apache Hadoop with a number of other open source projects to create a single, massively scalable system where you can unite storage with an array of powerful processing and analytic frameworks - a vision we call the Enterprise Data Hub. By uniting flexible storage and processing under a single management framework and set of system resources, Cloudera delivers the versatility and agility required for modern data management - where you can ingest, store, process, explore and analyse data of any type or quantity without migrating it between multiple specialised systems.

The Cloudera Enterprise Data Hub includes core Apache Hadoop functionality for flexible, scalable storage and data processing as well as several added value projects including Cloudera Impala for interactive SQL, Cloudera Search for unstructured search, Apache HBase for real-time NoSQL and Cloudera Navigator for data management (data discovery, metadata management, lineage and auditing). The platform also adds security features to Hadoop, enabling strong authentication, fine-grained authorisation and encryption.

This Service Definition covers the Software Subscription offerings that can be provided by Cloudera as part of GCloud Lot 2. For information about the Cloudera Training and Services offerings, please refer to the Cloudera Service Definition for Lot 4.

1.2 Introduction to Cloudera

Cloudera is the first and leading commercial provider of Apache Hadoop and the top contributor to the Hadoop open source community. It was founded in June 2008 by leading experts on big data from Facebook, Yahoo!, Google, and Oracle. Cloudera's Chief Architect, Doug Cutting, is the original creator of Hadoop, and is on the board of the Apache Software Foundation.

Figure: Cloudera company timeline, from the founding by Mike Olson and Amr Awadallah through the first CDH release, Doug Cutting joining Cloudera, Cloudera Manager and Cloudera Enterprise 4.

Cloudera pioneered the business case for Hadoop with CDH, the world's most comprehensive, thoroughly tested and widely deployed 100% open source distribution of Apache Hadoop in both commercial and non-commercial environments. Now, the company is redefining data management with its Platform for Big Data, Cloudera Enterprise, empowering enterprises to Ask Bigger Questions and gain rich, actionable insights from all their data, to quickly and easily derive real business value that translates into competitive advantage.

As the top contributor to the Apache open source community and leading educator of data professionals with the broadest array of Hadoop training and certification programs, Cloudera also offers comprehensive consulting services. Over 700 partners across a broad eco-system of hardware, software and services have teamed with Cloudera to help meet organizations' big data goals. With tens of thousands of nodes under management and hundreds of customers across diverse markets, Cloudera is the category leader that has set the standard for Hadoop in the enterprise.

Cloudera's goal is for CDH to serve as the industry standard for big data management in the enterprise. To realize this goal we must:

- Continue to develop a platform that is open. CDH is 100% open source.
- Create a platform that has enterprise functionality and properties. CDH leads the industry in such functionality, including the most extensive set of functionality for security, availability, recoverability and integration/extensibility.
- Create a platform that supports a diverse ecosystem. CDH has a supporting ecosystem that is ~10X larger than the next closest distribution.
- Create a platform that supports an ever-broadening set of workloads. CDH has facilities for batch MapReduce workloads as well as interactive SQL and Search.

Maintaining a rich commercial ecosystem of hardware, software and services providers is central to Cloudera's strategy. Today there are more than 700 companies that are Cloudera partners, a commercial ecosystem that is nearly 10X the size of the next closest competitor. Cloudera has maintained a certification program for the past 2 years in which partners test that their solutions interoperate with Cloudera's platform. These certifications are valid for the life of a major CDH release and we assure compatibility from update to update. Cloudera also often develops joint roadmaps with key gold and platinum partners like SAS, Informatica, Teradata, MicroStrategy and Oracle.

Figure: a list of many of Cloudera's partners.

Cloudera has the most effective, experienced, and talented engineering team of any Big Data company. Cloudera additionally has the most committers and contributors to the open source Hadoop ecosystem of any Big Data company. Cloudera is ahead of all competition and intends to remain so by continuously innovating and providing value to its customers.

1.3 Cloudera Enterprise Overview

Cloudera Enterprise helps you become information-driven by leveraging the best of the open source community with the enterprise capabilities you need to succeed with Apache Hadoop in your organization. Designed specifically for mission-critical environments, Cloudera Enterprise includes CDH, the world's most popular open source Hadoop-based platform, as well as advanced system management and data management tools plus dedicated support and community advocacy from our world-class team of Hadoop developers and experts. Cloudera is your partner on the path to big data.

Cloudera Enterprise, with Apache Hadoop at the core, is:

- Unified: one integrated system, bringing diverse users and application workloads to one pool of data on common infrastructure; no data movement required
- Secure: perimeter security, authentication, granular authorization, and data protection
- Governed: enterprise-grade data auditing, data lineage, and data discovery
- Managed: native high-availability, fault-tolerance and self-healing storage, automated backup and disaster recovery, and advanced system and data management
- Open: Apache-licensed open source to ensure your data and applications remain yours, and an open platform to connect with all of your existing investments in technology and skills

The Cloudera Enterprise Data Hub provides:

- One massively scalable platform to store any amount or type of data, in its original form, for as long as desired or required
- Integrated with your existing infrastructure and tools
- Flexible to run a variety of enterprise workloads, including batch processing, interactive SQL, enterprise search and advanced analytics
- Robust security, governance, data protection, and management that enterprises require

With Cloudera Enterprise, today's leading organizations put their data at the center of their operations, to increase business visibility and reduce costs, while successfully managing risk and compliance requirements.

Cloudera Enterprise includes the following components:

- CDH: At the core of Cloudera Enterprise is CDH, which combines Apache Hadoop with a number of other open source projects to create a single, massively scalable system where you can unite storage with an array of powerful processing and analytic frameworks.
- Cloudera Manager: Cloudera Enterprise includes Cloudera Manager to help you easily deploy, manage, monitor, and diagnose issues with your cluster. Cloudera Manager is critical for operating clusters at scale.
- Cloudera Support: Get the industry's best technical support for Hadoop. With Cloudera Support, you'll experience more uptime, faster issue resolution, better performance to support your mission critical applications, and faster delivery of the platform features you care about.

Cloudera Enterprise also offers support for several advanced components that extend and complement the value of Apache Hadoop:

- Online NoSQL: HBase is a distributed key-value store that helps you build real-time applications on massive tables (billions of rows, millions of columns) with fast, random access.
- Analytic SQL: Impala is the industry's leading massively-parallel (MPP) SQL engine built for Hadoop.
- Search: Cloudera Search lets your users query and browse data in Hadoop just as they would search Google or your favorite e-commerce site.
- In-Memory Machine Learning and Stream Processing: Apache Spark delivers fast, in-memory analytics and real-time stream processing for Hadoop.
- Data Management: Cloudera Navigator provides the critical data audit, lineage, and data discovery capabilities that enterprises require.

Cloudera Enterprise is available on a subscription basis in three editions, each designed for your specific needs.

- Basic Edition: Rely on superior support and advanced management for core Hadoop to run storage and batch processing in production environments.
- Flex Edition: Run dedicated applications built on your choice of advanced component.
- Data Hub Edition: Get everything you need to become information-driven, including unlimited use of every advanced component.

Each edition is available with your choice of 8x5 or 24x7 support from the industry's leading team of Hadoop experts, licensed either per server or per terabyte stored. Flex and Data Hub Editions also include open source indemnification, and an optional premium support extension for mission-critical environments.

1.4 Cloudera Enterprise Data Hub Components

1.4.1 CDH

CDH delivers everything you need for enterprise use right out of the box. By integrating Apache Hadoop with more than a dozen other critical open source projects, Cloudera has created a functionally advanced system that helps you perform end-to-end Big Data workflows.

- The only solution with real time query & search
- Introduced high availability for HDFS in 2012
- The most widely deployed & proven
- The broadest ecosystem of certified partners
- 100% open source & built for the enterprise

1.4.2 Cloudera Manager

As the industry's first and most sophisticated management application for Apache Hadoop, Cloudera Manager sets the standard for enterprise deployment by delivering granular visibility into and control over every part of CDH, empowering operators to improve cluster performance, enhance quality of service, increase compliance and reduce administrative costs.

As with any distributed computing or storage platform, deployment and ongoing administration of a Hadoop cluster can be difficult and time consuming.
Deciding which components and versions to deploy based on use cases; assigning roles for nodes; effectively configuring, starting and managing services across the cluster; and performing diagnostics to optimize cluster performance require significant expertise and constant attention. Cloudera Manager is designed to make administration of CDH simple and straightforward, at any scale.

With Cloudera Manager, you can easily deploy and centrally operate the complete Hadoop stack. The application automates the installation process, reducing deployment time from weeks to minutes; gives you a cluster-wide, real-time view of nodes and services running; provides a single, central console to enact configuration changes across your cluster; and incorporates a full range of reporting and diagnostic tools to help you optimize performance and utilization.

- Manage: Easily deploy, configure and operate clusters with centralized, intuitive administration for all services, hosts and workflows
- Monitor: Maintain a central view of all activity in the cluster through heat-maps, proactive health checks and alerts
- Diagnose: Easily diagnose and resolve issues with operational reports and dashboards, events, intuitive log viewing and search, audit trails and integration with Cloudera Support
- Integrate: Integrate Cloudera Manager with existing enterprise monitoring tools through SNMP, SMTP and a comprehensive API

1.4.3 Cloudera Navigator

Cloudera Navigator is the only native end-to-end governance solution for Apache Hadoop-based systems. Through a single user interface, it provides visibility for administrators, data managers, data scientists, and analysts to secure, govern, and explore the large amounts of diverse data that land in Hadoop. Cloudera Navigator is part of Cloudera Enterprise's comprehensive data security and governance offering and is key to meeting compliance and regulatory requirements.

Cloudera Navigator includes:

- Comprehensive, Unified Auditing Across Hadoop
  o Maintain a full audit history and track access for HDFS, Impala, Hive, HBase, and Sentry
  o Easily report on data access to meet regulatory requirements
  o Export audit information to global Security Information and Event Management (SIEM) systems to incorporate into infrastructure-wide reporting
- Unified, Searchable Technical and Business Metadata
  o Consolidate technical metadata for Hadoop files and tables
  o Easily track, classify, and locate data to comply with business governance and compliance rules
- Collect, View, and Share Lineage
  o Automatically collect and view upstream and downstream column-level lineage in an easy-to-follow graph
  o Quickly identify the origin of a data set and its impact on downstream analysis
  o Export lineage to enterprise-wide lineage management systems
- Lifecycle Management
  o Define and automate complex data lifecycle activities, such as classification, retention, and encryption policies - all built on Navigator's rich business metadata foundation
- Comprehensive encryption and key management
  o Navigator Encrypt provides transparent encryption for Hadoop data that is scalable and highly performant. Navigator Key Trustee provides a virtual safe-deposit box for managing encryption keys and other Hadoop security assets

1.4.4 Cloudera Support

Cloudera offers the industry's highest quality technical support for Hadoop.
We have a dedicated team of support engineers comprised of contributors and committers for every component of CDH, our market-leading open source Apache Hadoop distribution. No one knows the Hadoop stack better or has more experience supporting large-scale clusters in production. With Cloudera Support behind you, you'll experience more uptime, faster issue resolution, better performance to support your mission critical applications, and faster delivery of the platform features you care about.

- Dedicated team of experts with a global presence
- End-to-end coverage for the complete Cloudera platform - contributors and committers for every part of CDH
- Tens of thousands of nodes under management across industry
- 8x5 or 24x7 service levels
- Proactive cluster optimization
- Regular releases
- Thorough documentation
- Rich knowledgebase
- Influence over open source roadmap

2 INFORMATION ASSURANCE

Cloudera Enterprise includes components implementing perimeter security, authentication, granular authorization, and data protection as well as enterprise-grade data auditing, data lineage, and data discovery.

3 BACKUP/RESTORE AND DISASTER RECOVERY PROVISION

By default, all data is replicated onto 3 servers for resilience. If backup and disaster recovery are required, this is usually provided by implementing two clusters in two separate locations. Data can be replicated between the clusters using the Cloudera Backup and Disaster Recovery (BDR) facility.

4 ON-BOARDING AND OFF-BOARDING PROCESSES

4.1 On-Boarding

On procurement of the service, a Cloudera Account Executive will contact the customer to arrange onboarding. This will include delivery of a license key to enable the enterprise features of the Cloudera software and onboarding of Primary Support Contact(s) from the customer organization.

4.2 Off-Boarding

When the subscription ends (if it is not renewed), the license key will expire and the customer will no longer have access to the enterprise features in the Cloudera software or Cloudera Support.

5 SECURITY

Cloudera Enterprise supports the following security features:

- Authentication via Kerberos, LDAP or Active Directory
- Authorisation can be controlled via Apache Sentry
- Auditing of services via Cloudera Navigator
- Encryption of data at rest via Navigator Encrypt and data in flight via SSL

6 SERVICE MANAGEMENT DETAILS

6.1 Technical Boundary

Cloudera Support includes remote predictive, proactive and reactive support for Cloudera software as described in the Support Agreement.

6.2 Support Boundary

Cloudera Support includes remote predictive, proactive and reactive support for Cloudera software as described in the Support Agreement.

6.3 User Authorization and Roles

Authorisation of access to data can be controlled via Apache Sentry.

6.4 General Support details

Cloudera offers the industry's highest quality technical support for Hadoop. We have a dedicated team of support engineers comprised of contributors and committers for every component of CDH, our market-leading open source Apache Hadoop distribution. No one knows the Hadoop stack better or has more experience supporting large-scale clusters in production. With Cloudera Support behind you, you'll experience more uptime, faster issue resolution, better performance to support your mission critical applications, and faster delivery of the platform features you care about.

- Dedicated team of experts with a global presence
- End-to-end coverage for the complete Cloudera platform - contributors and committers for every part of CDH
- Tens of thousands of nodes under management across industry
- 8x5 or 24x7 service levels
- Proactive cluster optimization
- Regular releases
- Thorough documentation
- Rich knowledgebase
- Influence over open source roadmap

7 SERVICE CONSTRAINTS

7.1 Planned Maintenance

N/A. Cloudera provides software and support for that software, but does not host the system or data.

7.2 Emergency Maintenance

N/A. Cloudera provides software and support for that software, but does not host the system or data.

8 SERVICE LEVELS

The SLAs for Cloudera Support cases depend on the priority of the case. Priority levels and SLAs are described in the tables below.

8.1 Case Priority Definitions

CASE PRIORITY: P1
  CLOUDERA RESPONSIBILITIES:
    For 8x5 subscription: Resources dedicated Monday through Friday during customer's local business hours until a resolution or workaround is in place.
    For 24x7 subscription: Resources dedicated 24x7 until a resolution or workaround is in place.
  CUSTOMER RESPONSIBILITIES:
    For 8x5 subscription: Designated resources that are available Monday through Friday during customer's local business hours. Ability to provide necessary diagnostic information.
    For 24x7 subscription: Designated resources available 24x7 until a resolution or workaround is in place. Ability to provide necessary diagnostic information.
  DEFINITION: Total loss or continuous instability of functionality or inability to use a feature on a production system. Development systems do not apply here. Inability to use a feature or functionality that is currently relied upon for production functionality.

CASE PRIORITY: P2
  CLOUDERA RESPONSIBILITIES:
    For 8x5 subscription: Resources available Monday through Friday during local business hours until a resolution or workaround is in place.
    For 24x7 subscription: Resources dedicated 24x7 until a resolution or workaround is in place.
  CUSTOMER RESPONSIBILITIES:
    For 8x5 subscription: Resources available Monday through Friday during local business hours until a resolution or workaround is in place. Ability to provide necessary diagnostic information.
    For 24x7 subscription: Designated resources available 24x7 until a resolution or workaround is in place. Ability to provide necessary diagnostic information.
  DEFINITION: Performance degraded or severely limited but not causing a total loss of functionality. Inability to deploy a feature that is not currently relied upon in a production environment.

CASE PRIORITY: P3
  CLOUDERA RESPONSIBILITIES: Resources available Monday through Friday during local business hours until a resolution or workaround is in place.
  CUSTOMER RESPONSIBILITIES: Resources available Monday through Friday during local business hours until a resolution or workaround is in place. Ability to provide necessary diagnostic information.
  DEFINITION: General questions. Workaround in place for Priority 1 and Priority 2 issues.

CASE PRIORITY: P4
  CLOUDERA RESPONSIBILITIES: Solid understanding of the customer request, documented in our systems for review by Product Marketing.
  CUSTOMER RESPONSIBILITIES: Use cases for the feature request and specifics on requested functionality.
  DEFINITION: Feature requests.

8.2 Support SLAs

CASE PRIORITY   INITIAL RESPONSE TARGET     UPDATE FREQUENCY TARGET
                24x7 SUBSCRIPTION           24x7 SUBSCRIPTION
P1              Within 1 hour               Updated every 4 hours
P2              Within 2 hours              Updated every business day
P3              Within 8 hours              Updated every 3 business days
P4              Within 24 hours             N/A, feature request

CASE PRIORITY   INITIAL RESPONSE TARGET     UPDATE FREQUENCY TARGET
                8x5 SUBSCRIPTION            8x5 SUBSCRIPTION
P1              Within 1 business hour      Updated every 4 business hours
P2              Within 2 business hours     Updated every business day
P3              Within 8 business hours     Updated every 3 business days
P4              Within 2 business days      N/A, feature request

8.3 Escalation Timelines

CASE PRIORITY   ESCALATION TIMELINE         ESCALATION TIMELINE
                24x7 SUBSCRIPTION           8x5 SUBSCRIPTION
P1              Within 2 hours              Within 2 business hours
P2              Within 12 hours             Within 12 business hours
P3              Within 3 days               Within 5 days
P4              N/A                         N/A

Business Days are defined as Monday-Friday, excluding holidays observed by Cloudera. 24x7 applies for Status Update Frequency only for P1s. For the rest of the priorities, the same service is provided irrespective of contract type.

8.4 Award of Service Credits

N/A

8.5 Payment of Service Credits

N/A
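
As an illustration only, the response and update targets in section 8.2 can be held in a small lookup table keyed by subscription type and case priority. The sketch below is in Python; the names SlaTarget, SLA_TARGETS and sla_target are hypothetical and are not part of any Cloudera tooling, and the values are copied from the Support SLA tables in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    """One row of a Support SLA table (hypothetical helper type)."""
    initial_response: str
    update_frequency: str


# Values transcribed from section 8.2; keys are (subscription, priority).
SLA_TARGETS = {
    ("24x7", "P1"): SlaTarget("Within 1 hour", "Updated every 4 hours"),
    ("24x7", "P2"): SlaTarget("Within 2 hours", "Updated every business day"),
    ("24x7", "P3"): SlaTarget("Within 8 hours", "Updated every 3 business days"),
    ("24x7", "P4"): SlaTarget("Within 24 hours", "N/A, feature request"),
    ("8x5", "P1"): SlaTarget("Within 1 business hour", "Updated every 4 business hours"),
    ("8x5", "P2"): SlaTarget("Within 2 business hours", "Updated every business day"),
    ("8x5", "P3"): SlaTarget("Within 8 business hours", "Updated every 3 business days"),
    ("8x5", "P4"): SlaTarget("Within 2 business days", "N/A, feature request"),
}


def sla_target(subscription: str, priority: str) -> SlaTarget:
    """Return the SLA row for a subscription ("24x7" or "8x5") and priority ("P1".."P4")."""
    try:
        return SLA_TARGETS[(subscription, priority)]
    except KeyError:
        raise ValueError(f"Unknown subscription/priority: {subscription}/{priority}") from None


if __name__ == "__main__":
    target = sla_target("8x5", "P2")
    print(target.initial_response, "/", target.update_frequency)
```

A customer help desk could use a table like this to label each open case with its expected first-response window; the authoritative values remain those in the Support Agreement.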

9 Financial recompense

N/A

10 TRAINING

Training is available as part of the Cloudera Professional Services and Training offering under GCloud Lot 4 (SCS).

11 INVOICING PROCESS

See terms and conditions.

12 TERMINATION TERMS

See terms and conditions.

13 DATA EXTRACTION /REMOVAL CRITERIA

13.1 Data standards in use

Cloudera Enterprise is based on the HDFS filesystem, which is capable of storing any data type or file format, including both structured and unstructured data.

13.2 Consumer generated data

N/A. Cloudera provides software and support for that software, but does not host the system or data.

13.3 Data extraction

There are many ways of extracting data from Cloudera Enterprise, e.g. HDFS APIs, HDFS shell commands, Apache Hue, JDBC/ODBC, and Thrift/REST APIs for the various services.

13.4 Price of extraction

N/A. Cloudera provides software and support for that software, but does not host the system or data.

13.5 Purge & destroy

N/A. Cloudera provides software and support for that software, but does not host the system or data.

14 DATA PROCESSING AND STORAGE LOCATION(S)

N/A. Cloudera provides software and support for that software, but does not host the system or data.
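
As a hedged illustration of the HDFS shell route listed under Data extraction, the sketch below builds an "hdfs dfs -get" command line in Python; that command copies a path from HDFS to the local filesystem. The function name and the example paths are hypothetical. The sketch only constructs the command; running it requires a host with a configured Hadoop client.

```python
from typing import List


def build_hdfs_get_command(hdfs_path: str, local_path: str) -> List[str]:
    """Build the argument list for `hdfs dfs -get <hdfs_path> <local_path>`.

    Hypothetical helper: it performs no I/O and does not contact a cluster.
    """
    if not hdfs_path or not local_path:
        raise ValueError("Both an HDFS source path and a local destination are required")
    return ["hdfs", "dfs", "-get", hdfs_path, local_path]


if __name__ == "__main__":
    # Example paths are placeholders, not paths from the service definition.
    command = build_hdfs_get_command("/data/example/export.csv", "/tmp/export.csv")
    print(" ".join(command))
    # To execute on a machine with the Hadoop client installed:
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.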

15 DATA RESTORATION / SERVICE MIGRATION

N/A. Cloudera provides software and support for that software, but does not host the system or data.

16 CUSTOMER RESPONSIBILITIES

See terms and conditions.

17 TECHNICAL REQUIREMENTS

All requirements and supported versions for Cloudera Enterprise are listed in the online documentation.
For Cloudera Manager this is here: pic_4_2_unique_1
And for CDH it is here:

18 BROWSERS

The Cloudera Manager Admin Console, which you use to install, configure, manage, and monitor services, supports the following browsers:

- Mozilla Firefox 11 and higher
- Google Chrome
- Internet Explorer 9 and higher
- Safari 5 and higher

19 DETAILS OF ANY TRIAL SERVICE AVAILABLE

A 60 day trial version of Cloudera Enterprise can be downloaded from the Cloudera website.

20 ICT Greening Policy Compliance

Cloudera completely endorses the UK Government's policy to provide a cost effective and energy efficient ICT estate, which is fully exploited, with reduced environmental impacts to enable new and sustainable ways of working with our customers. We seek relationships in the delivery of services with those entities that have ethical commitments to schemes endorsing carbon reduction policies.

21 ICT Strategy Policy Compliance

Cloudera is committed to supporting the UK Government's aspirations and objectives to improve both the image and performance of services provisioned through ICT resources. We anticipate that the types of services we offer will particularly support the development and achievement of the Intelligent Customer Function through the provision of services that support informed decision making to maximise outcomes in the provision of services that support the Digital by Default strategy and associated policies.

22 W3C Compliance

Cloudera is committed to continue to develop its services to support social inclusion and commits to continue to develop the provision of services that allow access to them. We have made significant investment to align our digital service offerings to best practice in delivering W3C based services. We are cognizant of the requirements for inclusiveness and our services consider these requirements as part of their delivery and where it is applicable.