What is Virtualization and How Do I Audit It? Rick Schnierer and Chris Tennant Nationwide Insurance
Learning Objectives
- Understand the fundamentals of virtualization and its supporting architecture
- Develop and execute a risk-based audit for VMware ESX servers
- Identify best practices for securing VMware ESX servers, access to the management tools, and other key configurations related to virtual servers
- Leverage the lessons learned from our review and apply them to your environment
Before We Begin
Agenda
- What is it?
- Why are companies using it?
- How do you audit VMware?
Virtualization Defined
Virtualization, in computing, is the creation of a virtual (rather than actual) version of something, such as a hardware platform, an operating system, a storage device, or network resources.
How about an Analogy?
Virtualization Architecture
(Diagram; source: http://zone.ni.com/devzone/cda/tut/p/id/8709)
Why Virtualize?
- Increase agility
- Cost savings
- Enables standardization
Source: Virtualization: Benefits and Challenges, ISACA Emerging Technology White Paper
Impacts to Governance
- Improve cost control
- Improve delivery efficiency
- Strengthen business continuity
- SME availability
- Roles and responsibilities
- ITGC impacts
Source: Virtualization: Benefits and Challenges, ISACA Emerging Technology White Paper
How to Audit VMware?
- Setting the stage
- Risks to consider
- Focus areas
  - Compatibility
  - Change management
  - Patch management
  - Physical access
  - Logical access
  - Segmentation
  - Backup/storage
  - Monitoring/logging
Setting the Stage
Focus
Setting the Stage: At Nationwide
- z/VM vs. VMware
- "Virtualization first" direction
- More than 2,000 VMware server instances
- Hosting Windows and Linux
- Leveraged by all core businesses
Setting the Stage: Common Terms
- Hosts
- Guests (VMs)
- Hypervisor (VMware ESX)
- Service Console
- Management solutions (vCenter, Update Manager, vMotion)
Setting the Stage: What Does COBIT Say?
- PO4: Define the IT processes, organization and relationships
- PO9: Assess and manage IT risks
- AI3: Acquire and maintain technology infrastructure
- AI6: Manage changes
- DS5: Ensure systems security
- DS9: Manage the configuration
- ME3: Ensure compliance with external requirements
- ME4: Provide IT governance
Risks to Consider (Virtualization)
- Information security
- System availability
- Strategic
- Regulatory
Where do we start?
- Version of VMware (ESX, ESXi)
- Inventory of host servers
- Inventory of guest servers
- Current architecture diagrams
- Security templates
Strategic/Regulatory Risk
- Business/IT strategy
- Information risk management
- Regulatory requirements
Hardware Compatibility
- Hardware compatibility list
- Risks:
  - Unexpected server behavior
  - Lack of support
  - Limited functionality (vMotion)
Reference: www.vmware.com/resources/compatibility/pdf/vi_systems_guide.pdf
Deployment: Change Management (Hosts/Guests)
- Standard/certified build
- Approval/exception process for customization
- Limit ability to deploy
- Standard change management controls
- Routine scans for compliance with enterprise standards
- Virtual machine sprawl
Patch Management (Hosts/Guests)
- OS, VMware, anti-virus, etc.
- Updates/patches
- Currency
- Testing
- Automation
Reference: http://support.vmware.com/selfsupport/download
Physical Access (Hosts)
- Hosts = physical servers
- Same rules apply:
  - Physical
  - Environmental
  - Disaster recovery
Logical Access
(Diagram; source: http://zone.ni.com/devzone/cda/tut/p/id/8709)
Logical Access: Service Console
- Dedicated/secure channel
- Limited access
- No SSH
- Use sudo
- Enable logging
Logical Access (Management): Virtual Infrastructure Client (VIC)
- Access to guest or vCenter
- Secure tunnel
- Limit installations
- Limit access
Logical Access (Management): vCenter
- Limit access to administrator roles
- Parent/child relationship
  - Disable "Propagate"
- Segregation of duties
- Default passwords
Logical Access (Management): vCenter
Security consists of three parts:
- The object
- The user or group
- A security role
Security is assigned at the object level by combining a user/group with a role and assigning the pair to an object.
Logical Access (Management): vCenter
- List of users/groups
- List of roles assigned to users/groups
- Detail of privileges assigned to roles
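The three lists above (users/groups, roles, privileges) can be cross-checked once they are exported. The following is a minimal sketch, assuming the export has already been flattened into object/principal/role records; the record layout, the sample names, and the "approved administrators" list are all hypothetical and are not taken from any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    """One permission record: a principal holds a role on an object."""
    obj: str         # inventory object, e.g. a datacenter, folder, or guest
    principal: str   # user or group name
    role: str        # role name as exported
    propagate: bool  # True if child objects inherit the assignment


# Hypothetical export used only for illustration.
PERMISSIONS = [
    Permission("Datacenter-East", "vm-admins", "Administrator", True),
    Permission("Folder-Finance", "finance-ops", "Virtual Machine User", False),
    Permission("Datacenter-East", "helpdesk", "Administrator", True),
]


def review_findings(permissions, approved_admins):
    """Flag unapproved Administrator assignments and propagating admin roles."""
    findings = []
    for p in permissions:
        if p.role != "Administrator":
            continue
        if p.principal not in approved_admins:
            findings.append(f"{p.principal} holds Administrator on {p.obj} without approval")
        if p.propagate:
            findings.append(f"Administrator for {p.principal} propagates below {p.obj}")
    return findings


FINDINGS = review_findings(PERMISSIONS, approved_admins={"vm-admins"})
for finding in FINDINGS:
    print(finding)
```

The propagation check is deliberately noisy: it lists every inherited administrator assignment so the auditor can decide which ones are justified.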
Logical Access (Guests): Operating System
- Same access concerns found in OS on host
- Default installed user and group accounts
- Default vendor IDs
- Administrator-level accounts
Quick Tip: Non-persistent Disks
- Disabled by default
- When enabled, all changes are deleted when the guest is turned off
- Could allow a hacker to cover their tracks
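One way to test this across a sample of guests is to scan each guest's configuration file for disks set to a non-persistent mode. This is a rough sketch, assuming the disk mode appears in the .vmx file as a `<device>.mode` entry with a value containing "nonpersistent"; confirm the key and values against your ESX version before relying on it. The sample text and parser are illustrative only.

```python
def parse_vmx(text):
    """Parse simple key = "value" lines into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def nonpersistent_disks(settings):
    """Return device names whose mode value contains 'nonpersistent'."""
    return sorted(
        key[: -len(".mode")]
        for key, value in settings.items()
        if key.endswith(".mode") and "nonpersistent" in value.lower()
    )


# Hypothetical configuration excerpt for illustration.
SAMPLE_VMX = '''
scsi0:0.present = "TRUE"
scsi0:0.mode = "persistent"
scsi0:1.present = "TRUE"
scsi0:1.mode = "independent-nonpersistent"
'''

FLAGGED = nonpersistent_disks(parse_vmx(SAMPLE_VMX))
print(FLAGGED)
```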
Segmentation
- Guest segmentation
  - Legal/regulatory
  - Criticality
  - Sensitivity of data
- Separate zone for management
- Firewalls, switches, VLANs
Segmentation
(Diagram; source: http://www.vmware.com/technical resources/virtual networking/networking basics.html)
Quick Tip: Virtual Switches
- Potential for MAC spoofing
- Promiscuous mode on a vSwitch is disabled by default; verify it has not been enabled
- Disable MAC Address Changes and Forged Transmissions
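If the virtual switch security settings can be exported, the three checks reduce to a simple comparison. The sketch below assumes a hypothetical dictionary layout with "Accept"/"Reject" values; the key names are invented for illustration and do not correspond to a specific VMware export format. A missing setting is treated as "Accept" so that it is flagged rather than silently passed.

```python
# Settings that should all be rejected on every virtual switch.
REQUIRED = ("promiscuous_mode", "mac_address_changes", "forged_transmits")


def vswitch_exceptions(policies):
    """Return {switch name: [settings not set to Reject]} for failing switches."""
    exceptions = {}
    for name, policy in policies.items():
        bad = [s for s in REQUIRED if policy.get(s, "Accept").lower() != "reject"]
        if bad:
            exceptions[name] = bad
    return exceptions


# Hypothetical export for two switches.
POLICIES = {
    "vSwitch0": {
        "promiscuous_mode": "Reject",
        "mac_address_changes": "Reject",
        "forged_transmits": "Reject",
    },
    "vSwitch1": {
        "promiscuous_mode": "Reject",
        "mac_address_changes": "Accept",
        "forged_transmits": "Accept",
    },
}

EXCEPTIONS = vswitch_exceptions(POLICIES)
print(EXCEPTIONS)
```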
Backup/Storage
- VM repositories and datastores
- Hosts
- VMware Consolidated Backup (VCB)
- Storage array
- Secure data transfer
- Backup testing/restores
Backup/Storage: Virtual Server Snapshots
Best practices:
- Adequate free disk space
- Only one active snapshot
- Remove inactive snapshots
- Leaving snapshots active risks running out of disk space
- Multiple snapshots lead to version control issues
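A datastore file listing can give a quick indication of which guests have active snapshots. The following is a minimal sketch, assuming snapshot disks follow the common naming pattern of the base disk name followed by a hyphen and a six-digit sequence number (for example `web01-000001.vmdk`); treat that pattern as an assumption and confirm it in your environment. The file names are made up.

```python
import re

# Assumed naming pattern for snapshot disk descriptors: <disk>-NNNNNN.vmdk
SNAPSHOT_DISK = re.compile(r"^(?P<disk>.+)-\d{6}\.vmdk$")


def snapshot_counts(filenames):
    """Count snapshot disk descriptors per base disk name."""
    counts = {}
    for name in filenames:
        match = SNAPSHOT_DISK.match(name)
        if match:
            disk = match.group("disk")
            counts[disk] = counts.get(disk, 0) + 1
    return counts


# Hypothetical datastore listing.
FILES = [
    "web01.vmdk", "web01-flat.vmdk",
    "web01-000001.vmdk", "web01-000001-delta.vmdk",
    "web01-000002.vmdk", "web01-000002-delta.vmdk",
    "db01.vmdk", "db01-flat.vmdk",
]

COUNTS = snapshot_counts(FILES)
MULTIPLE = sorted(disk for disk, n in COUNTS.items() if n > 1)
print(COUNTS, MULTIPLE)
```

Any disk in `MULTIPLE` has more than one snapshot in its chain and is a candidate for follow-up.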
Quick Tip: Virtual Disk Shrinking
- Can cause availability issues
- Ensure the following are configured for each guest:
  isolation.tools.diskWiper.disable=true
  isolation.tools.diskShrink.disable=true
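Checking these two parameters on a sample of guests can be scripted against the guest configuration text. This is a sketch under the assumption that the parameters appear as `key = "value"` lines; keys are compared case-insensitively because capitalization varies between sources. The sample configuration is hypothetical.

```python
# Parameters that should be set to true on every guest (lowercased for comparison).
REQUIRED_TRUE = (
    "isolation.tools.diskwiper.disable",
    "isolation.tools.diskshrink.disable",
)


def parse_vmx(text):
    """Parse simple key = "value" lines into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def missing_hardening(settings):
    """Return required parameters that are absent or not set to true."""
    return [k for k in REQUIRED_TRUE if settings.get(k, "").lower() != "true"]


# Hypothetical guest configuration with only one of the two parameters set.
SAMPLE_VMX = '''
displayName = "web01"
isolation.tools.diskWiper.disable = "TRUE"
'''

MISSING = missing_hardening(parse_vmx(SAMPLE_VMX))
print(MISSING)
```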
Monitoring and Logging (Hosts/Guests)
- Monitor performance and capacity
  - CPU cycles
  - Number of servers
  - Disk storage
- Security: failed logins, lockouts
- Alerts when thresholds are approached
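The idea of alerting when thresholds are approached, not only when they are exceeded, can be illustrated with a small classification function. The threshold values, the 10-point warning margin, and the metric names below are hypothetical; real values would come from the monitoring tool's configuration.

```python
def classify(usage_pct, threshold_pct, warn_margin_pct=10.0):
    """Classify utilization as ok, approaching, or exceeded."""
    if usage_pct >= threshold_pct:
        return "exceeded"
    if usage_pct >= threshold_pct - warn_margin_pct:
        return "approaching"
    return "ok"


# Hypothetical alert thresholds, in percent of capacity.
THRESHOLDS = {"cpu": 85.0, "memory": 90.0, "disk": 80.0}


def evaluate(metrics):
    """Classify each reported metric against its threshold."""
    return {name: classify(value, THRESHOLDS[name]) for name, value in metrics.items()}


STATUS = evaluate({"cpu": 78.0, "memory": 95.0, "disk": 40.0})
print(STATUS)
```

When reviewing a real tool, the audit question is whether an "approaching" state exists at all and who receives the alert.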
Our Lessons Learned
- Do your research!
- Utilize subject matter experts
- Don't bite off more than you can chew
- Leverage this presentation and existing audit programs
- Partner with your IT department
Contact Information
- Rick Schnierer, CISA, CRISC, Associate Vice President, Systems Audit, Nationwide Insurance, schnier@nationwide.com
- Chris Tennant, CISA, CRISC, Audit Director, Internal Audit, Nationwide Insurance, tennanc@nationwide.com
Appendix: Audit Programs
- Nationwide Insurance VMware ESX 4 Server Virtualization Audit Program (Nationwide VMware Audit Program)
- ISACA VMware Server Virtualization Audit Program, www.isaca.org/knowledge Center/ITAF IT Assurance Audit /Audit Programs/Pages/ICQs and Audit Programs.aspx
Appendix: Additional Resources
- Virtualization: Benefits and Challenges, by ISACA, http://www.isaca.org/knowledge Center/Research/ResearchDeliverables/Pages/Virtualization Benefits and Challenges.aspx
- Security Hardening, by VMware, www.vmware.com/files/pdf/vi35_security_hardening_ security wp.pdf
- ESX Server Security Technical Implementation Guide, Version 1, Release 1, by the Defense Information Systems Agency (DISA), iase.disa.mil/stigs/stig/esx_server_stig_v1r1_final.pdf
Audit Program: Risk, Control, Test Procedures

Host Servers

Risk 1: Host hardware is not compatible with virtualization software, increasing the risk of outages, increased maintenance costs, or the inability to fully utilize software functionality.
Controls:
- Architecture decisions related to host servers are reviewed, approved, and documented.
Test procedures:
- Inquire with management to determine how host hardware selections are made for virtual environments.
  - Ensure decisions are made using the existing architecture policies and procedures.
- Select a sample of host servers and obtain hardware configuration information.
  - Review hardware configuration to reasonably assert that compatible hardware is being used.
  - Also obtain and review applicable documentation to ensure hardware planning is aligned with the ESX host hardware compatibility guide.
- Compare hardware information to the ESX host hardware compatibility guide to ensure VMware is only installed on compatible hardware: http://www.vmware.com/resources/compatability/search.php

Risk 2: Host servers are unable to provide adequate memory and processing resources as a result of poor monitoring and resource planning/management.
Controls:
- A monitoring solution is in place to monitor various resources and settings on host servers, including CPU, memory, disks, power supply, etc. Alerts are configured to notify the appropriate support group should any thresholds be exceeded.
- A periodic assessment of hardware resources and needs is performed to ensure memory, CPUs, etc. are updated proactively.
- The CIM is restricted to only authorized users through a separate administrative account.
Test procedures:
- Verify effective monitoring of key resource elements such as memory (minimum requirements, currently used, total used), CPU utilization (% of available), and used/free hard disk space.
  - Verify the monitoring application does not use the root account or another account with administrative rights.
- Analyze the number of incidents (tickets) related to capacity issues for host servers to ensure monitoring and overall management of capacity is effective.
- Determine if alerts are enabled to trigger when predetermined thresholds are approached and/or met. Ensure appropriate individuals are designated to receive alerts.
- Interview management to determine what type of scalability and resource forecasting are performed with regard to host servers.
  - Ensure frequency of analysis and representatives are appropriate.
  - Ensure host servers have sufficient free memory slots to meet memory expansion needs.
- Inquire with management to determine if CIM is being used. If not, it should be disabled. If CIM is used, ensure a unique administrative account exists and is used when accessing the CIM.
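The compatibility comparison in the Risk 1 test procedures amounts to matching each sampled host's hardware model against the compatibility guide. A minimal sketch, assuming both lists are available as plain text; the host names and model strings are placeholders, and the normalization is deliberately simple.

```python
def normalize(model):
    """Lowercase and collapse whitespace so minor formatting differences match."""
    return " ".join(model.lower().split())


def unsupported_hosts(inventory, compatibility_list):
    """Return host names whose model does not appear on the compatibility list."""
    supported = {normalize(m) for m in compatibility_list}
    return sorted(
        host for host, model in inventory.items()
        if normalize(model) not in supported
    )


# Placeholder data: host name -> hardware model, and an extract of the guide.
INVENTORY = {
    "esx01": "Vendor A  Model 100",
    "esx02": "vendor a model 200",
    "esx03": "Vendor B Model 9",
}
COMPATIBILITY_LIST = ["Vendor A Model 100", "Vendor A Model 200"]

UNSUPPORTED = unsupported_hosts(INVENTORY, COMPATIBILITY_LIST)
print(UNSUPPORTED)
```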
Risk 3: Host servers are not appropriately configured, increasing the risk of compromise, outages, or scalability issues.
Controls:
- An approved security template is in place for use on all ESX host server builds.
- Host server builds are reviewed to ensure compliance with existing security template guidelines.
- Exceptions or deviations from the standard build must be requested and approved prior to implementation.
- All changes to host server configurations follow standard change management processes, including approvals, testing, communication, and roll-back requirements for each request/change.
- Changes made to host servers are also made to base images and test environments to ensure currency.
- Host servers are designed and implemented with sufficient physical network cards to ensure adequate separation of management, vMotion, heartbeat, and virtual server networks.
- Anti-virus software is installed as part of the build process for host servers.
- Patches and upgrades to host servers are performed in accordance with enterprise software currency policies.
- Remote management cards (e.g. Integrated Lights Out (ILO)) are placed on protected networks.
Test procedures:
- Obtain and review the current ESX security template. Inquire with virtualization team management to ensure the security template is used when configuring new ESX host server builds.
- Inquire with management to ensure host server configurations are reviewed against the existing security template for compliance.
- Standard change management governance controls should be tested.
- Review hardware configuration to determine if sufficient NICs are available. A minimum of five should be present (for ESX v4); however, additional NICs should be present for any required virtual server segmentation. (1 dedicated to the management network, 1 for cluster heartbeat, 1 for vMotion traffic, 2 for virtual server redundancy.)
- Standard anti-virus and change management testing should be performed.
- Standard patch management control testing.
- Review hardware configuration to ensure remote management cards are placed on protected networks.
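The NIC test above (a minimum of five cards split across management, heartbeat, vMotion, and virtual server traffic) can be expressed as a simple count per role. This is an illustrative sketch; the role labels and the per-host lists are hypothetical and would be derived from the reviewed hardware configuration.

```python
from collections import Counter

# Minimum NICs per role, following the 1/1/1/2 split described above.
REQUIRED_NICS = {"management": 1, "heartbeat": 1, "vmotion": 1, "vm_traffic": 2}


def nic_shortfalls(nic_roles):
    """Return {role: number of NICs missing} for roles below the minimum."""
    have = Counter(nic_roles)
    return {
        role: need - have[role]
        for role, need in REQUIRED_NICS.items()
        if have[role] < need
    }


# Hypothetical hosts: one compliant, one missing a heartbeat NIC.
HOST_A = ["management", "heartbeat", "vmotion", "vm_traffic", "vm_traffic"]
HOST_B = ["management", "vmotion", "vm_traffic", "vm_traffic"]

print(nic_shortfalls(HOST_A), nic_shortfalls(HOST_B))
```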
Risk 4: Host hardware is not supported by a redundant architecture, increasing the risk of lost data and outages.
Controls:
- The host hardware includes a RAID sufficient to meet data retention and processing needs.
Test procedures:
- Determine what RAID level is utilized. Ensure it is appropriate (i.e. RAID level 5 or 10).

Host Management

Risk 5: Access to the console is not adequately restricted to authorized users, increasing the risk of unauthorized attempts to access ESX administrative resources.
Controls:
- The console is not directly accessed by shared accounts except during emergencies.
- Administrator roles have been appropriately established (i.e. power on/off virtual machine, connect to a remove device, create new machines).
Test procedures:
- Inquire with management to determine console access practices. Ensure the console is accessed using sudo (not su) and logging for sudo is enabled. Review logs stored in /var/log/secure for esxcfg commands.
- Review access privileges to ensure super-user type privileges are only assigned to the administrator role.
- Obtain and review access assigned to ensure only authorized users have access to the management console.

Risk 6: Remote access to the console is not through a secure channel to limit the risk of compromise.
Controls:
- A secure channel (i.e. isolated physical network, SSH tunnel, etc.) is used to access the console remotely.
Test procedures:
- Obtain and review network architecture diagrams to determine if access to the management console is obtained through an isolated physical network (not a VLAN).
  - This will include obtaining IPs for remote access and comparing them to public network IPs.

Risk 7: The management console is not implemented according to existing enterprise security templates, increasing the risk of compromise, outages, or scalability issues.
Controls:
- An approved security template is in place, including guidelines for appropriate management console setup.
- Configuration of the management console is reviewed against the security template and approved by the IT operations team.
Test procedures:
- Obtain and review the current ESX security template. Inquire with virtualization team management to ensure the security template is used when configuring new management consoles.
- Inquire with management to ensure configurations are reviewed against the existing security template for compliance.
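The log review in Risk 5 (looking in /var/log/secure for esxcfg commands run through sudo) can be supported with a small filter. The sketch below assumes the usual sudo syslog layout of `sudo: <user> : ... COMMAND=<command>`; the sample lines are illustrative, not taken from a real host, and the exact format should be confirmed on the system under review.

```python
import re

# Assumed layout of a sudo entry in the secure log.
SUDO_LINE = re.compile(r"sudo:\s+(?P<user>\S+)\s+:.*?COMMAND=(?P<command>.+)$")


def esxcfg_commands(log_lines):
    """Return (user, command) pairs for sudo entries that ran an esxcfg-* command."""
    hits = []
    for line in log_lines:
        match = SUDO_LINE.search(line.rstrip("\n"))
        if match and "esxcfg-" in match.group("command"):
            hits.append((match.group("user"), match.group("command")))
    return hits


# Illustrative log lines only.
SAMPLE_LOG = [
    "Mar  3 10:15:02 esx01 sudo:    alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/usr/sbin/esxcfg-vswitch -l",
    "Mar  3 10:16:40 esx01 sshd[2211]: Accepted password for bob from 10.0.0.5 port 50122 ssh2",
    "Mar  3 10:20:11 esx01 sudo:      bob : TTY=pts/1 ; PWD=/home/bob ; USER=root ; COMMAND=/bin/ls /root",
]

HITS = esxcfg_commands(SAMPLE_LOG)
print(HITS)
```

An empty result on a busy host is itself worth questioning, since it may mean sudo logging is not enabled or administrators are using su instead.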
Virtual Guests and vCenter

Risk 8: Virtual server memory is overcommitted, leading to continuous memory swapping, performance erosion, or possible outages.
Controls:
- Virtual server memory is monitored using CIM.
- Policies and procedures are in place to ensure snapshots are deleted (after compilation with the base file) in a timely manner.
Test procedures:
- Determine if a monitoring tool is in place. Obtain and review a sample of monitoring results to ensure thresholds are reasonable, critical components are monitored, and results are routinely reviewed.
  - Determine if virtual servers are "thin" provisioned or "thick" provisioned and determine if this is reasonable.
- From the sample of servers selected above, determine if any active snapshots are on related virtual machines.
  - Right click on the virtual server folder.
  - Select the Snapshot option.
  - If "Revert to..." is available, an active snapshot is on the virtual server and should be removed unless an acceptable reason is presented.
  - Alternatively, in the Datastore Browser for the virtual server, search for file names including "*-*0001" or similar.

Risk 9: Guest servers are not appropriately configured, increasing the risk of compromise, outages, or scalability issues.
Controls:
- Guest servers are prioritized based on criticality to ensure memory swapping between guests is adequately controlled.
- An approved security template is in place for use on all ESX guest server builds.
- Virtual guest server builds are reviewed to ensure compliance with existing security template guidelines.
- Exceptions or deviations from the standard build must be requested and approved prior to implementation.
Test procedures:
- Inquire with management to determine if procedures are in place to prioritize guests based on business criticality. Review configuration of guests to ensure memory swapping is adequately prioritized based on criticality.
- Obtain and review the current ESX security template. Inquire with virtualization team management to ensure the security template is used when configuring new ESX guest server builds.
- Inquire with management to ensure guest server configurations are reviewed against the existing security template for compliance.
Risk 9 (continued)
Controls:
- All changes to guest server configurations follow standard change management processes, including approvals, testing, communication, and roll-back requirements for each request/change.
- Changes made to guest servers are also made to base images and test environments to ensure currency.
- A standard naming convention is used to identify all virtual servers, including location, type, identification number, and description.
- Non-persistent disks are not enabled on guest servers.
- Patches and upgrades to guest servers are performed in accordance with enterprise software currency policies.
- Anti-virus software is installed as part of the build process for host servers.
- Disk shrinking functionality is disabled.
Test procedures:
- Standard change management governance controls should be tested.
- Inquire with management to ensure a standard naming convention is employed when creating new guest servers. Each server name should be uniquely identifiable.
- Review server configuration to ensure non-persistent disks are not permitted.
  - Right click on the guest server.
  - Select the hardware tab.
  - Observe if "Non-persistent Disks" is enabled.
- Standard patch management testing should be performed.
- Standard anti-virus and change management testing should be performed.
- Review parameters on a sample of guest servers to ensure the disk shrinking functionality is disabled.
  - Edit > Settings > Options > Advanced/General > Configuration Parameters

Risk 10: Remote access to the Virtual Infrastructure Client (VIC) is not through a secure channel (VPN).
Controls:
- Remote access is only accessible through a VPN (secure) tunnel.
- Remote console connections are restricted to one user.
- External access is not permitted through port 3389 (remote desktop).
Test procedures:
- Inquire with management to ensure remote access (RDP sessions) requires use of a VPN tunnel and two-factor authentication.
- Review parameters on a sample of guest servers to ensure the Remote Console Connection parameter is adequately restricted.
  - Edit > Settings > Options > Advanced/General > Configuration Parameters
- Inquire with management to determine if port 3389 is disabled. For a sample of servers, review configurations. (Leaving this port open allows a hacker to obtain a username and password through key logging, brute force, etc.)
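The naming-convention test (each server name conforms to the standard and is uniquely identifiable) lends itself to a quick scripted check over the guest inventory. The pattern below, location code, then type, then a four-digit identifier, is purely hypothetical; substitute the organization's documented convention.

```python
import re
from collections import Counter

# Hypothetical convention: LOC-TYPE-NNNN, e.g. COL-WIN-0001.
NAME_PATTERN = re.compile(r"^[A-Z]{3}-(WIN|LNX)-\d{4}$")


def naming_exceptions(names):
    """Return (names that break the convention, names that appear more than once)."""
    counts = Counter(names)
    nonconforming = sorted(n for n in counts if not NAME_PATTERN.match(n))
    duplicates = sorted(n for n, c in counts.items() if c > 1)
    return nonconforming, duplicates


# Made-up guest inventory.
GUESTS = ["COL-WIN-0001", "COL-LNX-0002", "testbox", "COL-WIN-0001"]

NONCONFORMING, DUPLICATES = naming_exceptions(GUESTS)
print(NONCONFORMING, DUPLICATES)
```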
Risk 11: Virtual switches are configured to permit promiscuous mode, MAC address changes, or forged transmissions, allowing the MAC address to be spoofed.
Controls:
- Promiscuous mode on virtual switches is disabled.
Test procedures:
- Review virtual switch configuration to ensure promiscuous mode is disabled.
  - In vSwitch properties, open the Security tab.
  - Promiscuous Mode, MAC Address Changes, and Forged Transmissions should all be set to "Rejected".

Risk 12: Guest servers/clusters are not adequately separated through network segmentation, leading to potential breach of confidential information or legal/regulatory noncompliance.
Controls:
- Legal and regulatory requirements regarding the implementation of virtual environments are communicated to the engineering team.
- Appropriate segmentation or security zones are maintained to ensure compliance with legal and regulatory requirements.
- Virtual environments with like security postures are placed together in separate network segments and isolated from those with significantly different security postures.
Test procedures:
- Ensure legal and regulatory concerns affecting the implementation of virtual servers are communicated to and acted on by the IT engineering team.
- Determine if legal or regulatory constraints exist requiring segmentation of guest servers (i.e. Nationwide Bank). If so, ensure appropriate action has been taken.
- Review network diagrams with management to determine reasonableness of guest server placement in relation to servers with similar security postures.

Risk 13: Periodic backups of *.vmdk files are not performed.
Controls:
- Disk-based backups are performed for *.vmdk files on a periodic basis.
Test procedures:
- Inquire with management to ensure backups of *.vmdk files are performed on a routine basis.
Additional Areas

Risk 14: Virtual environments are implemented without adequate planning and approval from appropriate management, leading to potential scalability concerns, inability to meet SLAs, or undesirable ROI.
Controls:
- Management performs ongoing analysis of the virtual deployment strategy to ensure it aligns with the overall enterprise business and IT strategy.
- Critical applications are reviewed for compatibility with virtualized environments prior to implementation.
- Virtual implementations are reviewed pre- and post-deployment to ensure all SLAs and intended benefits (e.g. ROI) are achieved.
Test procedures:
- Inquire with management and review analysis and decision-making evidence to ensure adequate evaluation and approvals/disapprovals were obtained.
- Inquire with management and review analysis of application and tool deployment to virtualized environments to ensure adequate compatibility and performance levels to meet or exceed SLAs.
- Determine if each virtual server implementation is analyzed to ensure planned builds will meet SLAs.

Risk 15: Products utilized in virtual environments are not supported by vendor contracts, increasing the risk of insufficient knowledge sources, prolonged outages and recovery times, and increased maintenance costs.
Controls:
- Vendor support contracts are in place for all VMware installations.
Test procedures:
- Inspect product support contract(s) to ensure all critical components are supported. (Initial support includes one year; ensure it is renewed as appropriate.)