Accelerating the adoption of Cloud Computing Planning an OpenStack PoC Webinar April 10, 2014
Speakers Today
Brad Vaughan, Seth Fox
- 20 years as architect of infrastructure solutions for the enterprise
- Experience designing and deploying across US, APAC and Emerging Markets
- Specializes in infrastructure adoption in the world's largest enterprises across people, process and technology
- Managed and delivered some of the largest cloud deployments, both public and private, worldwide
- Business and technical leadership to service providers and enterprises around the world
- Prior to Solinea, Seth was a Director in the Product Management Group at Cloudscaling

Solinea Overview
Accelerating Open Infrastructure Adoption
- Purpose-built for cloud: Cloud is the only domain we focus on, with vertical industry and horizontal solutions specialization
- OpenStack Experience: Built the first OpenStack production clouds and have contributed to the platform since its inception
- Proven Delivery Success: Track record of success architecting, building and operating production clouds, private and public, worldwide
- Unique Approach: Integrated capabilities lifecycle covering cloud strategy, architecture, implementation and adoption services
- Enterprise IT Experience: We understand the cloud adoption challenges of global companies

Webinar Agenda
- Why a Proof of Concept (PoC)?
- Select PoC Candidate Workloads
- Creating a Test Plan
- PoC Architecture
- Deployment Planning
- Solinea Jumpstart Methodology

Technology Evaluation Continuum
Sandbox
- Informal exploration of technology
- Small scale installation to allow for experimentation
- Single user/operator testing
Proof of Concept
- Quantifiable proof of business value to multiple business stakeholders
- Scoped and budgeted project with assigned staffing
- Proving technical viability for specific use case and solution
- May also evaluate competing solutions
- Fully understand the impact/value across multiple business units/workloads
Pilot
- Initial build-out of tested solution
- Limited user community and SLAs
- Operated with production tooling and support

Setting Goals & Criteria
Sandbox
- No predefined goals or criteria
- Reduced HW footprint
- Functional understanding of technology
Proof of Concept
- Prove a hypothesis
- Goals must be directly linked to the business requirements for approving next steps
- Generate convincing data for comparison with the current-state solution
- Prove ROI and Investment
- Gain practical skills and understanding, to properly design the end state
- Understand impact on IT lifecycle service development and delivery process
Pilot
- Production quality/performance goals
- Successful completion of Preproduction QA testing
- Completion of user testing

Candidate Workloads
- Selection Criteria
  - Solve an existing problem
  - Workload/application profile
  - Representative architecture pattern
  - Complexity and dependency
  - Supportability, Customization
- Stakeholder Involvement
  - Resource commitment
  - Is the pain point real?
- Measurability
  - Existing quantifiable testing
  - Historical data

Selecting Tests
- Defining the scope (breadth and depth of PoC)
- Defines timeline, cost and complexity
- Application level testing
  - Primary issue is finding an existing test with actual data
  - Needs to be self-contained, with limited dependency on other production or test/dev systems
  - Many applications require refactoring to take advantage of cloud architecture
- Largest number of tests are generally functional testing
  - Auto-scaling
  - High Availability
  - Operational
- Non-functional tests can be challenging
  - PoC is usually only a functional simulation of production
  - Performance and capacity testing is limited unless you have comparable benchmarks

Creating a Test Plan
- The candidate selection process should have identified a workload with an existing test harness
- Developing, architecting and implementing testing tools is time consuming and complicated
- Formal definition of use cases is required to ensure a valid scope
Use case template fields: Use Case ID, Purpose, Pre-requisites, Required Data, Steps, Expected Results, Actual Results

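The use case template on this slide could be captured as a small record type so that every test is written down the same way. This is a minimal sketch: the field names mirror the slide's columns, while the class name, the pass rule (exact string match) and the example entry "UC-001" are illustrative assumptions, not part of the webinar material.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UseCase:
    """One test plan entry; fields mirror the template columns on the slide."""
    use_case_id: str
    purpose: str
    prerequisites: List[str]
    required_data: List[str]
    steps: List[str]
    expected_results: str
    actual_results: Optional[str] = None  # filled in after the test runs

    def is_executed(self) -> bool:
        return self.actual_results is not None

    def passed(self) -> bool:
        # Simplifying assumption: a pass is an exact match of the recorded text.
        return self.is_executed() and self.actual_results == self.expected_results


# Hypothetical entry; the ID and wording are for illustration only.
uc = UseCase(
    use_case_id="UC-001",
    purpose="Create and destroy a tenant through the API",
    prerequisites=["Admin credentials"],
    required_data=["Tenant name"],
    steps=["Create tenant", "List tenants", "Delete tenant"],
    expected_results="Tenant appears in the list, then is removed",
)
```

Keeping Actual Results empty until the test has run makes it easy to report which use cases are still outstanding.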
OpenStack Operational Use Cases
- Exercise the APIs
  - Create and destroy Objects (e.g. users, tenants, flavors, images)
  - Start/Stop, Enable/Disable
- Non-functional features
  - Upgrading the environment
  - High availability / Failover
- Backup and recover

OpenStack Testing Tools
- Several tools available
  - Tempest: automated CI/CD test suite for OpenStack
  - Rally: benchmark OpenStack at scale
- Valuable to validate PoC platform install prior to running other tests
- Can be very complicated to configure
- Types of Tests
  - API: RESTful calls
  - CLI: read-only actions of the client
  - Scenario: often operational actions
  - Stress: used primarily to identify race condition bugs
  - 3rd Party: test non-native APIs like EC2 compatibility

Test Results: Auto Scaling
Results
Once the stress testing load was initiated, there were about 60K to 80K requests per second. During this initial phase the single caching server generated a sustained CPU load over 75% (red bars). This triggers a Heat alarm, which launches and configures a new caching server. The new caching server is joined to the cluster and receives an equal share of the requests. This causes the overall cluster CPU load average to decrease by roughly half, which should allow the cluster to handle significantly more requests per second.
Benefits
This test showcases the ability of the cluster to grow and shrink as needed to handle expected and unexpected high load, scaling according to the level of load pushed against it.

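The arithmetic behind "decrease by roughly half" can be sketched as follows. This is a simplified model, not the Heat template used in the test: it assumes requests are spread evenly across servers, and the 80% starting load is an illustrative value chosen only because the slide reports a sustained load "over 75%".

```python
CPU_ALARM_THRESHOLD = 75.0  # percent, the alarm level described on the slide


def cluster_cpu_average(total_cpu_demand: float, servers: int) -> float:
    """Average CPU load per server when demand is shared evenly."""
    return total_cpu_demand / servers


def servers_after_alarm(total_cpu_demand: float, servers: int) -> int:
    """Add one server if the cluster average is above the alarm threshold."""
    if cluster_cpu_average(total_cpu_demand, servers) > CPU_ALARM_THRESHOLD:
        return servers + 1
    return servers


demand = 80.0  # assumed single-server load in percent (hypothetical value)
before = cluster_cpu_average(demand, 1)
scaled = servers_after_alarm(demand, 1)
after = cluster_cpu_average(demand, scaled)
print(before, scaled, after)  # 80.0 2 40.0
```

Going from one server to two halves the average; later scale-out steps give smaller relative reductions (two to three servers cuts the average by a third).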
Example PoC Plan

Identifying the Prerequisites
- Equipment
  - Rack: RUs, Power, A/C
  - Servers: Controller, Storage, Compute
  - Storage: Storage software, drives, backup space
  - Networking: Networks, IPs, SSL certs
- Software & Data
  - OpenStack Code
  - Application Software
  - Licenses
  - Who will install
  - Who will customize
  - Testing Tools: Install and configure
  - Sample Datasets: Which datasets (live, test)?
- Privacy and Security

Example Skills Matrix
Skill areas: Networking, Compute, Storage, Other
OpenStack Generalist
- Good Linux networking experience
- Excellent hypervisor skills
- Excellent Linux administration skills
- Config management with Puppet, Chef, etc.
- Experience administrating iSCSI or NFS servers
- General Python scripting
- Experience using OpenStack clouds
Network Specialist
- Strong general L2/L3 skills with chosen ToR switches
- Excellent virtualized networking skills (OVS, Linux bridging, etc.)
- Experience with chosen hypervisor(s)
- Experience with NICs and IPMI/iLO on chosen hardware
- Understanding of network tuning for iSCSI / NFS traffic
Storage Specialist
- Familiarity with iSCSI / NFS tuning
- Excellent tuning/troubleshooting with chosen storage

OpenStack Distributions
- Sandbox: DevStack, RDO, Fuel
- Proof of Concept: RDO/RHEL OSP, Fuel, Piston, Cloudscaling, Stackops, many others
- Pilot: RHEL OSP, Fuel, Piston, Cloudscaling, Stackops, many others

Distribution Selection Criteria
- Price
- Adoption
- Support Offerings
- Installation Simplicity
- Maintainability and Management
- OpenStack release
- Value Added Tools
- Specialized Features
  - Storage
  - VMware integration
  - Quota
  - SDN
- Familiarity

Logical Architecture
[Diagram: nodes and the networks connecting them]
Nodes
- Jump Box: Foreman, Repository
- Heat VM, Horizon, SSH
- Controller(s): All APIs except Swift, Neutron gateway, Qpid, MySQL
- Object Store: Swift Proxy, Container, Object, Account
- Compute: Nova compute, Neutron agent
- Block: iSCSI, Cinder
Networks
- Public Network
- Private Network: 192.168.1.0/24
- Storage Network: 192.168.103.0/24
- Mgmt Network: 192.168.102.0/24
- IPMI Network: 192.168.101.0/24
- Floating IPs
- 10.10.1.0/24

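Before cabling a design like this, it can help to confirm that the planned subnets do not overlap. The sketch below uses Python's standard `ipaddress` module with the CIDR ranges that appear on the slide. The dictionary keys are shorthand labels, and the label given to 10.10.1.0/24 is an assumption, since the slide text does not make clear which network that range belongs to.

```python
import ipaddress
from itertools import combinations

# CIDR ranges as they appear on the slide; key names are shorthand labels.
NETWORKS = {
    "storage": "192.168.103.0/24",
    "private": "192.168.1.0/24",
    "mgmt": "192.168.102.0/24",
    "ipmi": "192.168.101.0/24",
    "floating_or_compute": "10.10.1.0/24",  # label is an assumption
}


def overlapping_pairs(networks):
    """Return sorted name pairs whose address ranges overlap."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in networks.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(parsed), 2)
        if parsed[a].overlaps(parsed[b])
    ]


print(overlapping_pairs(NETWORKS))  # [] means no two subnets collide
```

The same check can be rerun whenever a network is added or resized during the PoC.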
Example Hardware Design

| Unit | Segment | Role | Hardware |
| 42 | Network | Switch (IPMI) | Cisco 2xxx |
| 41 | Network | Switch (Service) | Arista 7150 |
| 40 | Network | Switch (Management) | Cisco 3xxx |
| 39 | Management | cntr-01 | Quanta X12RS |
| 38 | Management | cntr-02 | Quanta X12RS |
| 37 | Management | cntr-03 | Quanta X12RS |
| 36 | Management | cntr-04 | Quanta X12RS |
| 35 | Management | cntr-05 | Quanta X12RS |
| 34 | Management | cntr-06 | Quanta X12RS |
| 33 | Compute | comp-01 | Quanta X12RS |
| 32 | Compute | comp-02 | Quanta X12RS |
| 31 | Compute | comp-03 | Quanta X12RS |
| 30 | Compute | comp-04 | Quanta X12RS |
| 29 | Compute | comp-05 | Quanta X12RS |
| 28 | Compute | comp-06 | Quanta X12RS |
| 27 | Compute | comp-07 | Quanta X12RS |
| 26 | Compute | comp-08 | Quanta X12RS |
| 25 | Compute | comp-09 | Quanta X12RS |
| 24 | Compute | comp-10 | Quanta X12RS |
| 23 | | | |
| 22 | KVM | Monitor + KVM | Dell KVM |
| 21 | Admin | jump-01 | Quanta X12RS |
| 20 | | | |
| 19-18 | Block | iscsi-01 | Quanta X22RQ |
| 17-16 | Block | iscsi-02 | Quanta X22RQ |
| 15-14 | Block | iscsi-03 | Quanta X22RQ |
| 13-12 | Object | obj-01 | Quanta X22RQ |
| 11-10 | Object | obj-02 | Quanta X22RQ |
| 9-8 | Object | obj-03 | Quanta X22RQ |
| 7-6 | Object | obj-04 | Quanta X22RQ |
| 5-4 | Object | obj-05 | Quanta X22RQ |
| 3-1 | | | |

- Servers
  - Minimal server hardware configuration diversity
  - One model for compute, one for storage
  - Most people segregate compute, object and block storage from controller nodes
- ToR Switches
  - 10Gb networking for public, management and data networks
  - 1Gb for IPMI
- Storage will be determined by workload needs
  - NFS, iSCSI, Swift and Ceph dominate storage configs

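A quick tally of rack units helps confirm that a design like this fits in a single 42 RU rack. The sketch below counts the equipment named in the example. It assumes the storage nodes (Quanta X22RQ) occupy 2 RU each and everything else 1 RU; that sizing is inferred from the rack layout and is not stated on the slide.

```python
RACK_UNITS = 42

# (count, rack units each, description)
# Assumption: storage nodes take 2 RU each, all other equipment 1 RU.
EQUIPMENT = [
    (3, 1, "switches (IPMI, service, management)"),
    (6, 1, "controller nodes cntr-01..06"),
    (10, 1, "compute nodes comp-01..10"),
    (1, 1, "monitor + KVM"),
    (1, 1, "jump box jump-01"),
    (3, 2, "block storage nodes iscsi-01..03"),
    (5, 2, "object storage nodes obj-01..05"),
]


def units_used(equipment):
    """Total rack units consumed by the listed equipment."""
    return sum(count * size for count, size, _ in equipment)


used = units_used(EQUIPMENT)
free = RACK_UNITS - used
print(used, free)  # 37 5
```

Under these assumptions the design leaves five units free for spacing or growth.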
Evaluation Example
OpenStack PoC Evaluation
Weighting: 0 to 5, 5 = most important

1. Compute Resources
This category defines the attributes of the compute resource that are under control of the end user. The end user should be able to configure the capacity and attributes of a compute unit with minimal friction and deploy the appropriate level of resources without the need to "over provision". The ideal situation is to have granular control over both the workload capacity of the compute unit and the service level. The compute unit should be able to easily scale to meet a variety of workloads, i.e. once the initial compute unit is provisioned you should be able to easily add incremental and storage resources.

| Criteria | Weighting | RHEL OSP Rank | RHEL OSP Weighted Score | SUSE Rank | SUSE Weighted Score |
| B. Ability to configure private flavors | 4 | 5 | 20 | 3 | 12 |
| C. Ability to configure memory in GB increments from 0.5 to 128 | 4 | 5 | 20 | 4 | 0 |
| D. Ability to configure attached storage in GB increments to 1TB | 4 | 5 | 20 | 3 | 12 |
| F. Ability to meter usage in 1 hour increments | 1 | 5 | 5 | 2 | 2 |
| G. Compute resource configuration changes can be made via the portal or via an API call | 5 | 5 | 25 | 1 | 5 |
| H. Ability to upload images into service catalog | 5 | 5 | 25 | 2 | 10 |
| I. | 3 | 5 | 15 | 2 | 6 |
| Compute Score | | 5.0 | 18.6 | 2.4 | 6.7 |
| Allocation of Compute Score (15%) | | 0.8 | 2.79 | 0.4 | 1.01 |

2. Storage Resources
This category defines the attributes of the storage services that are under control of the end user. Two categories of storage services are listed: object based storage and block based storage. Object based storage is appropriate for storing backups, images, archives, etc. It is used when latency and performance are not top criteria and low cost/high volume requirements take precedence. Object based storage is not part of the local attached file system. Amazon Web Services S3 and OpenStack Swift are examples of object based storage.
Block based storage refers to the typical file system storage that is directly accessible by the OS and conforms to the file system structure in use by the guest OS. Block based storage can be delivered using a variety of service levels and is often classified using IOPS, latency or QoS levels.

| Criteria | Weighting | RHEL OSP Rank | RHEL OSP Weighted Score | SUSE Rank | SUSE Weighted Score |
| Object based storage | | | | | |
| A. Ability to read, write, delete and secure objects ranging in size from 1 byte to 5 terabytes | 2 | 3 | 6 | 1 | 2 |
| B. Objects can be stored over geographically tiered locations | 1 | 4 | 4 | 2 | 2 |
| E. Accessible via APIs | 1 | 5 | 5 | 3 | 3 |
| E. Objects are taggable and versioned | 1 | 3 | 3 | 4 | 4 |
| F. Objects are replicated to multiple locations | 1 | 2 | 2 | 6 | 6 |
| Block based storage | | | | | |
| A. Integrate with compute (attach/detach) | 3 | 2 | 6 | 3 | 9 |
| B. Multiple SLA-based tiers of block storage service | 3 | 5 | 15 | 1 | 3 |
| C. Ability to provide point-in-time snapshot backups | 2 | 5 | 10 | 5 | 10 |
| D. Ability to resize volumes | 1 | 5 | 5 | 7 | 7 |
| E. Available across geographically dispersed locations | 1 | 5 | 5 | 3 | 3 |
| F. Storage has configurable IOPS | 1 | 5 | 5 | 1 | 1 |
| G. Metering is produced on volume/GB hours | 1 | 5 | 5 | 2 | 2 |
| Storage Score | | 4.1 | 5.9 | 3.2 | 4.3 |
| Allocation of Storage Score (15%) | | 0.6 | 0.89 | 0.5 | 0.65 |

- Weighted ranking approach to evaluation
  - Simple pass/fail testing doesn't capture flexibility and non-functional capabilities
  - Scoring metrics should be detailed to reduce their subjective nature
- Each use case and test should have several rating criteria
- Should always be accompanied by testing output and narrative for executive audiences
- Very useful in vendor/technology comparisons

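The weighted ranking in the evaluation sheet can be reproduced with a few lines of code. The sketch below uses the RHEL OSP compute rows from the sheet. The formulas are inferred from the published numbers rather than stated on the slide: each weighted score is taken as weighting times rank, the category score as the plain average across criteria, and the allocation as 15% of that average. The function and variable names are illustrative.

```python
def weighted_scores(rows):
    """rows: list of (weighting, rank) pairs; returns weighting * rank per row."""
    return [weight * rank for weight, rank in rows]


def category_summary(rows, allocation):
    """Return (average rank, average weighted score, allocated score)."""
    scores = weighted_scores(rows)
    avg_rank = sum(rank for _, rank in rows) / len(rows)
    avg_weighted = sum(scores) / len(scores)
    return avg_rank, avg_weighted, avg_weighted * allocation


# (weighting, rank) for the RHEL OSP compute criteria B, C, D, F, G, H, I
rhel_compute = [(4, 5), (4, 5), (4, 5), (1, 5), (5, 5), (5, 5), (3, 5)]

avg_rank, avg_weighted, allocated = category_summary(rhel_compute, 0.15)
print(round(avg_rank, 1), round(avg_weighted, 1), round(allocated, 2))
# 5.0 18.6 2.79
```

Keeping the calculation in code rather than in hand-edited cells makes it easier to rerun the comparison when weightings change or another distribution is added.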
Example: Cloud vs. Appliance Evaluation
Use Case Tested for Comparative Purposes:
- A predefined and parsed data set is preloaded on Hadoop
- Map/Reduce transforms the data to a number of key and value pairs
- The Map/Reduce job is submitted
- Job is monitored for completion
PoC
- Cost: $125K + Services
- Timeframe: 3 weeks
- Performance: 40 minutes
Legacy Appliance
- Cost: $1.2MM
- Timeframe: 2 weeks
- Performance: Did not compute

Solinea Services
We can make your PoC a success! A repeatable methodology. Proven with our customers.
Conceive, Architect, Integrate, Adopt
Workshop
- Workload Analysis and Categorization
- IaaS architecture confirmation
- Bill of Materials (BoM)
- Implementation Plan to immediately go into PoC
Proof of Concept
- Logical & Physical Architecture
- OpenStack Build Specification
- OpenStack Cloud (single rack)
- Training & Mentoring Program

Resources
Available on solinea.com
- Slides / Project Plans for this webinar
  - Replay and Materials available in 24-48 Hours
  - Emails will be sent with link
- Upcoming Webinars
  - OpenStack Icehouse Preview, April 22nd
- Replays / Downloads Available Now
  - Building OpenStack Block Storage into your Cloud
  - Making the case for OpenStack in the Enterprise
  - OpenStack Breaking into the Enterprise

Accelerating the adoption of Cloud Computing Thank You Solinea, Inc. 404 Bryant Street San Francisco, CA 94110 www.solinea.com Proprietary and Confidential - Not to be distributed without prior written permission from Solinea, Inc.