10th International Conference on Software Testing
June 18-21, 2013 at Bangalore, INDIA
by Sowmya Krishnan, Senior Software QA Engineer, Citrix
Copyright: STeP-IN Forum and Quality Solutions for Information Technology Pvt. Ltd. Published with permission for restricted use in STeP-IN SUMMIT 2013 in agreement with full copyrights from owner(s) / author(s) of material. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise without the prior consent of the owner(s) / author(s). This edition is manufactured in India and is authorized for distribution only during STeP-IN SUMMIT 2013 as per the applicable conditions.
Practices Experience Knowledge Automation
Produced By Hosted By www.stepinforum.org www.qsitglobal.com
PERFORMANCE TESTING OF AN IAAS CLOUD SOFTWARE
Sowmya Krishnan
Citrix R&D India
ABSTRACT
AUDIENCE PREREQUISITES (Preferable)
- Basic knowledge of IaaS (Infrastructure as a Service) cloud, such as what a public/private cloud is
- Familiarity with common IaaS providers like CloudStack, OpenStack or Eucalyptus
KEY TAKEAWAYS
- Recent technology landscape of IaaS cloud providers
- Performance data one should look for when deploying a new IaaS cloud or fine-tuning an existing one
- Innovative tools to achieve production-scale test beds in a cloud
- Where to start and stop looking for performance bottlenecks in an IaaS cloud product
- Quality baselining and reporting
CASE STUDY ABSTRACT
We'll take the example of performance testing of Apache CloudStack:
- Interesting use cases
- Using a simulator to generate production-quality workloads
- Performance bottlenecks
- Sample reports
Introduction
- An IaaS solution is used to deploy and manage public, private or hybrid clouds. It offers compute orchestration and manages network and storage nodes
- Manage public, private, hybrid multi-tenant clouds
- Most providers offer user and account management and native API support, along with resource and usage accounting
- End users can buy self-service virtual machines, storage volumes and network configurations on demand
Topic of discussion
- The management software that manages the cloud, including serving end-user requests
- Management includes:
  - Orchestration of infrastructure resources
  - Monitoring of infrastructure
- Performance testing of the software which manages the cloud
Our Focus
Performance Testing of the Management Software
Typical activities of an IaaS management software:
- Serve end-user requests (examples):
  - Start, stop, migrate virtual machines
  - Create an account, network
  - Assign IP addresses, storage
  - Take snapshot of volume
- Orchestration of the cloud (examples):
  - Monitor all virtual machines
  - Monitor all hosts
  - Monitor network
  - Track usage and activities of all users
Performance Testing of an IaaS Management Software
Challenge of creating a production-quality setup in a lab
- Ideal requirements for performance testing of an IaaS cloud? A typical data center: 10,000 servers, hundreds of storage nodes, high network bandwidth, high-end switches
- Generate workloads of production scale (high traffic of instance deployments, snapshots)
- Arrive at baselines and recommendations for production-scale deployments
Modeling failures in a lab setup
- Network failure
- Host outages
- Storage failures
- Outage of infrastructure software
What matters to a performance test engineer?
- How long did the software take to identify the failure?
- How long did the software take to respond to the failure?
- How do I model these failures in my lab to test my software?
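The two timing questions above can be recorded per injected failure. The following is a minimal sketch, assuming the test harness can log three timestamps for each failure; the class, field and function names are hypothetical, and the sample values are illustrative placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class FailureTimeline:
    """Timestamps in seconds for one injected failure (hypothetical record)."""
    injected_at: float   # when the harness caused the failure
    detected_at: float   # when the management software first reported it
    recovered_at: float  # when the management software finished responding

    @property
    def time_to_detect(self) -> float:
        return self.detected_at - self.injected_at

    @property
    def time_to_respond(self) -> float:
        return self.recovered_at - self.detected_at


def summarize(timelines):
    """Return worst-case detection and response times across all runs."""
    return {
        "max_time_to_detect": max(t.time_to_detect for t in timelines),
        "max_time_to_respond": max(t.time_to_respond for t in timelines),
    }


# Illustrative placeholder values only.
runs = [
    FailureTimeline(injected_at=0.0, detected_at=12.0, recovered_at=40.0),
    FailureTimeline(injected_at=100.0, detected_at=130.0, recovered_at=145.0),
]
summary = summarize(runs)
```

Worst-case values are reported here because an outage that goes unnoticed for a long time matters more than the average; a percentile could be used instead.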
Given these challenges of scale and massive workload generation, the CloudStack team built a simulation tool that mimics hundreds or thousands of hypervisor hosts to test scale.
SOLUTION? Simulating Scale
- Software which mimics the presence of hundreds or thousands of hypervisor hosts
- Implements commonly used hypervisor host commands: Start/Stop, List VMs in host
- Implements commonly used virtual machine commands: Start/Stop, Migrate VMs, take snapshot of the volume
- Uses a Simulator Template
- Mimics storage pools
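To illustrate the idea of mimicking hosts in software, here is a minimal sketch of an in-memory stand-in for a hypervisor host. It is not the CloudStack simulator's code; the class and method names are hypothetical and only cover the commands listed above.

```python
class SimulatedHost:
    """In-memory stand-in for a hypervisor host (hypothetical, illustrative)."""

    def __init__(self, name):
        self.name = name
        self.vms = {}  # VM name -> "Running" or "Stopped"

    def start_vm(self, vm):
        self.vms[vm] = "Running"

    def stop_vm(self, vm):
        if vm not in self.vms:
            raise KeyError(f"{vm} is not on {self.name}")
        self.vms[vm] = "Stopped"

    def list_vms(self):
        return sorted(self.vms)

    def migrate_vm(self, vm, target):
        # Move the VM record to the target host, keeping its state.
        state = self.vms.pop(vm)  # raises KeyError if the VM is absent
        target.vms[vm] = state


# A thousand "hosts" cost only a few kilobytes of memory each.
hosts = [SimulatedHost(f"sim-host-{i}") for i in range(1000)]
hosts[0].start_vm("vm-1")
hosts[0].migrate_vm("vm-1", hosts[1])
```

Because no real hypervisor is involved, every command returns almost immediately, so any time measured against such hosts is spent in the management software itself.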
Simulator Case Study: CloudStack
[Diagram: User and Admin APIs, Load Balancer, MySQL DB, Raw Storage, Network]
[Diagram: User and Admin APIs, Load Balancer, MySQL DB, Zone Simulator, Storage Simulator]
Is it just a problem of Scale?
- Simulator enables massive scalability
- Baselines
  - Baseline per stack layer
  - Need holistic view, but how?
- Metrics
  - Traditional metrics
  - Cloud-specific, like elasticity
- Compute, Storage, Network
Deploying a Guest Instance
Use Case: Deploying a guest instance in a cloud with 1000 hosts, 100 storage pools.
Guest instance specification: 2G CPU, 20G Disk, Windows 8 OS
Performance Test: How long does it take for the guest VM to be placed on a hypervisor and handed over to the user, ready to be used?
What do we need to measure?
- Process account information
- Apply resource limitation checks
- Select a suitable hypervisor
- Select a suitable storage pool
- Download Windows template
- Apply security rules
- Deploy the virtual machine
- All associated database updates
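One way to get a per-step breakdown of the deployment is to time each step separately. The sketch below assumes the harness can wrap each step in a timer; `StageTimer` and the stage names are hypothetical, and the empty loop body marks where the call under test would go.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.durations = {}

    @contextmanager
    def stage(self, name):
        start = self.clock()
        try:
            yield
        finally:
            elapsed = self.clock() - start
            self.durations[name] = self.durations.get(name, 0.0) + elapsed

    def total(self):
        return sum(self.durations.values())


# Stage names mirror the measurement list above.
STAGES = [
    "process_account", "apply_resource_limits", "select_hypervisor",
    "select_storage_pool", "download_template", "apply_security_rules",
    "deploy_vm", "update_database",
]

timer = StageTimer()
for stage_name in STAGES:
    with timer.stage(stage_name):
        pass  # the management-software step under test would run here
```

After a run, `timer.durations` holds the seconds spent in each stage and `timer.total()` the end-to-end time, which makes it easier to see which layer of the stack dominates.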
Deploying a Guest Instance
What does NOT fall under the purview of our test?
- How long it took to download the template
- Time taken to create a VLAN on the hypervisor
- How long the network element took to respond
- Time taken to provision the disk in storage
- How long the Windows VM took to boot up
Simulator Served Our Needs
- Orchestrating resources at scale
- Monitoring infrastructure at scale
- Baselining at different layers of the stack
- Establishing new metrics
- Fault injection
Simulator vs. Production Data
- Numbers from the simulator are 95% close to production numbers (as tested with List APIs)
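The slides do not define how closeness between simulator and production numbers is computed. One plausible reading, shown here purely as an assumption, is 100% minus the relative difference; the function name is hypothetical.

```python
def closeness_percent(simulated, production):
    """Closeness of a simulated measurement to the production one, in percent.

    Assumed definition: 100 * (1 - |simulated - production| / production).
    """
    if production <= 0:
        raise ValueError("production measurement must be positive")
    return 100.0 * (1.0 - abs(simulated - production) / production)
```

Under this definition, a simulated response time that differs from the production value by 5% in either direction would count as 95% close.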
Sample Reports, Case Study: CloudStack (Deploy Virtual Machine)
[Chart: time taken in sec vs. number of virtual machines deployed]
Sample Reports, Case Study: CloudStack (CPU Utilization during Deploy VM)
[Chart: CPU utilization for one Management (cumulative results across 4 cores); utilization across cores vs. time in seconds]
Takeaways
- Performance metrics and benchmarking are a long-term process for an evolving domain like IaaS
  - Should be refined release over release, as more features are added
- Identify and develop holistic simulation tools to help derive meaningful metrics and fine-tuning parameters
- Identify possible performance bottlenecks while designing the feature
- Identify use cases based on real customer deployments and community feedback (in the case of open source projects)
Thank You!