Best Practices for Deploying Citrix XenDesktop on NexentaStor Open Storage

White Paper
July 2011
Table of Contents

The Challenges of VDI Storage Workloads
Best Practices for Citrix XenDesktop on NexentaStor
    Use Citrix Provisioning Services (PVS)
    Set I/O Size to Match the Workload
    Use Mirroring for Performance
    Enable Compression, Disable Deduplication
    Increase Memory for Power Users
    Place Paging Files on Solid State Devices (SSDs)
Performance Results
Conclusion
The Challenges of VDI Storage Workloads

Desktop virtualization is increasingly popular, allowing organizations to rapidly provision one or more virtual desktop sessions for individual users. As a leading desktop virtualization solution, Citrix XenDesktop transforms Windows desktops into an on-demand service available to any user, on any device, anywhere. XenDesktop lets organizations accommodate a wide range of needs without deploying custom desktop hardware to each user.

At the same time, configuring cost-effective, high-performance storage for a growing virtual desktop infrastructure (VDI) has remained challenging with traditional and legacy storage. VDI-based desktop sessions are known to stress traditional storage solutions, often resulting in costly storage deployments and unsatisfactory performance for end users. As has been widely demonstrated, VDI workloads can perform particularly poorly when served by RAID-5 storage arrays.

These issues are driven by a fundamental mismatch between traditional storage technology and the nature of VDI storage workloads. VDI workloads tend to arrive as bursts of I/O activity as virtual desktops page to storage, and most I/O operations from paging activity are small and random. When Citrix Provisioning Services (PVS) is used, up to 99% of storage activity consists of writes, with only a small amount of read activity. I/O alignment issues abound between VDI workloads and traditional storage, especially if disk drives with 4 KB sectors are used. The read-modify-write operations implied by RAID-5 arrays can cause unacceptable performance under write-intensive VDI workloads, and write caches help only until they fill, resulting in non-deterministic behavior.
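The read-modify-write cost mentioned above can be made concrete with a small sketch. It assumes the textbook single-parity update rule (new parity = old parity XOR old data XOR new data), which this paper does not spell out; the function names and sample blocks are hypothetical and do not model any particular array.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_small_write(old_data: bytes, old_parity: bytes, new_data: bytes):
    """Update one data block in a single-parity stripe (textbook model).

    Returns the new parity block and the number of back-end disk I/Os.
    """
    # Two reads are needed first: the old data block and the old parity block.
    reads = 2
    new_parity = xor_blocks(old_parity, old_data, new_data)
    # Two writes follow: the new data block and the new parity block.
    writes = 2
    return new_parity, reads + writes


def mirror_small_write(copies: int = 2) -> int:
    """A mirror writes the block to every copy; no prior reads are required."""
    return copies


# Hypothetical three-disk data stripe plus parity.
d0, d1, d2 = b"\x11" * 4, b"\x22" * 4, b"\x33" * 4
parity = xor_blocks(d0, d1, d2)
new_d1 = b"\x7f" * 4
new_parity, raid5_ios = raid5_small_write(d1, parity, new_d1)
print("RAID-5 back-end I/Os:", raid5_ios)
print("Mirror back-end I/Os:", mirror_small_write())
```

In this simplified model a single small write costs four back-end operations on RAID-5, two of which are reads that must complete before the write can be issued, versus two independent writes on a two-way mirror.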
In contrast to standard RAID environments, the NexentaStor appliance provides a flexible environment that is an ideal technology match for VDI workloads such as the Hosted VDI FlexCast technology delivered as part of Citrix XenDesktop. In addition to allowing a wide range of parameters to be configured, NexentaStor exploits powerful features in the underlying ZFS file system implementation. Together, NexentaStor and ZFS capabilities overcome the shortcomings of legacy file systems. With NexentaStor, I/O size can be easily tuned to match the workload, and ZFS block alignment automatically aligns all write operations. The ZFS block allocation algorithms condense an I/O stream of many small random writes into a single, more efficient sequential write operation, dramatically improving performance for VDI workloads.
The ZFS copy-on-write algorithm eliminates expensive read-modify-write operations. NexentaStor provides a flexible and tunable environment with options that include compression, which can both improve performance and reduce the storage required. Based on joint testing conducted by Citrix and Nexenta, this document describes recommended best practices for configuring both NexentaStor and XenDesktop.

Best Practices for Citrix XenDesktop on NexentaStor

With VDI-based desktop sessions such as those provided by XenDesktop, the user's virtual session experience is highly dependent on write performance, and in particular on how quickly write I/O operations return. Based on testing with XenDesktop, Nexenta and Citrix engineers concluded that the storage workload is first and foremost write-intensive and, second, highly dependent on write latency. In other words, minimizing the latency of write operations is critical to overall virtual desktop performance.

The NexentaStor appliance functions as both a storage array and a virtualization appliance that abstracts legacy and commodity storage devices, pooling resources and presenting them at the file or block level to VDI servers such as XenDesktop. In its default configuration, NexentaStor provides strong performance for a variety of general-purpose workloads. Beyond the inherent advantages of its ZFS implementation, NexentaStor can be easily configured to deliver superior I/O performance for write-intensive VDI workloads, at substantial savings over traditional storage solutions.

The NexentaStor User Guide (http://www.nexenta.com/corp/static/docs-stable/NexentaStor-UserGuide.pdf) provides information on setting the range of tunable parameters on the NexentaStor appliance. Figure 1 illustrates some of the settings used in Citrix/Nexenta proof of concept testing, including:

- Block size was set to 4 KB to match the most common XenDesktop I/O operation.
- Log bias was set to latency, which hints to ZFS that it should use the volume's log devices to optimize for latency (as opposed to throughput).
- Compression was enabled (on) to both improve performance and save space.
- Deduplication was disabled (off), since this capability was provided by Citrix Provisioning Services (PVS) in this testing.
- Sync mode was set to standard (the default), so that synchronous file system transactions were written out to the ZFS Intent Log (ZIL), with all devices then flushed to ensure data stability.

Background on these settings, along with recommended best practices for configuring NexentaStor for XenDesktop, is provided in the sections that follow.
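As a rough illustration, the settings listed above can be written down as ZFS properties. This is only a sketch: it assumes the standard ZFS property names (volblocksize, logbias, compression, dedup, sync) and the stock zfs command line, whereas the testing described here used the NexentaStor GUI. The dataset name and volume size are hypothetical, and the code only builds command strings; it does not execute them.

```python
# Hypothetical dataset name and size; substitute the values actually in use.
DATASET = "vdipool/xendesktop-cache"
VOLUME_SIZE = "100G"

# Proof-of-concept settings expressed with ZFS property names (an assumption
# about naming; the NexentaStor GUI may label these fields differently).
POC_SETTINGS = {
    "volblocksize": "4K",   # block size of the virtual block device
    "logbias": "latency",   # favor log devices and low latency
    "compression": "on",
    "dedup": "off",
    "sync": "standard",
}

# volblocksize can only be chosen when a block volume is created.
CREATE_ONLY = {"volblocksize"}


def zfs_commands(dataset: str, settings: dict, size: str) -> list:
    """Return the zfs command lines that would apply the given settings."""
    create_opts = " ".join(
        f"-o {name}={value}"
        for name, value in settings.items()
        if name in CREATE_ONLY
    )
    commands = [f"zfs create -V {size} {create_opts} {dataset}"]
    commands += [
        f"zfs set {name}={value} {dataset}"
        for name, value in settings.items()
        if name not in CREATE_ONLY
    ]
    return commands


for line in zfs_commands(DATASET, POC_SETTINGS, VOLUME_SIZE):
    print(line)
```

The block size is passed at creation time because it is fixed for the life of a block volume, which is why the recommendation is to choose it when the virtual block device is first created.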
Figure 1. The block size on the virtual block device and other parameters are easily set through the NexentaStor GUI.

Use Citrix Provisioning Services (PVS)

Citrix Provisioning Services (PVS) is a unique technology that was used as part of the testing conducted by Citrix and Nexenta, and its use is considered a best practice. Configuring PVS to use Write Cache on a Server Disk effectively separates read I/O from write I/O. In this testing, the NexentaStor appliance was used only for writes, while reads were cached on the XenDesktop server and provided directly to clients. Without PVS, the master image would not be separated from the write cache in the same way, greatly changing the I/O profile.

Set I/O Size to Match the Workload

In testing performed by Citrix and Nexenta with Citrix XenDesktop workloads, the majority of I/O operations to storage (more than 80%) were multiples of 4 KB. Some storage arrays with fixed block sizes, particularly RAID-5 configurations, pay a significant performance penalty when writes are misaligned. This issue is particularly noticeable in virtualization environments when a guest OS partition is not properly aligned with the back-end array block size. In the case of misalignment, some implementations require the array to read or write more than a single block for each request. For instance, small-block I/Os from mail or database applications can double the I/O workload when misaligned. If misalignment is not detected and corrected, overly large and costly configurations are often deployed to achieve reasonable write performance.
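The doubling effect of misalignment can be sketched with simple arithmetic. The example below is an illustration rather than a model of any particular array: it counts how many fixed-size back-end blocks a single request touches. The function name is hypothetical, and the 32,256-byte offset (63 sectors of 512 bytes, a partition start commonly seen on older guest operating systems) is an assumed example, not a value from the testing.

```python
def backend_blocks_touched(offset: int, length: int, block_size: int = 4096) -> int:
    """Count the back-end blocks that a request of `length` bytes at `offset` spans."""
    if length <= 0 or block_size <= 0 or offset < 0:
        raise ValueError("offset must be >= 0; length and block_size must be > 0")
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return last_block - first_block + 1


# A 4 KB write that starts on a 4 KB boundary stays within one block.
aligned = backend_blocks_touched(offset=0, length=4096)

# The same write shifted by an assumed 63-sector partition offset straddles two.
misaligned = backend_blocks_touched(offset=63 * 512, length=4096)

print("aligned:", aligned, "misaligned:", misaligned)
```

When every guest write straddles a block boundary in this way, the array handles two partial-block updates instead of one full-block write, which is the source of the doubled workload described above.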
With NexentaStor, the alignment problem can be easily avoided by specifying a suggested block size for the LUN when the virtual block device is first created. In Citrix and Nexenta testing, block size was set to 4 KB, consistent with most of the I/O operations generated by Citrix XenDesktop.

Use Mirroring for Performance

Mirroring is not typically thought of as a performance technology, but it proves vastly superior to RAID-5 in a VDI setting. Both technologies provide data redundancy, and RAID-5 configurations are typically considered cost-effective because they provide redundancy with fewer devices. However, RAID-5 incurs an unacceptable performance cost in a VDI environment due to the implicit read-modify-write penalty. Simple mirroring can deliver a significant performance benefit over RAID-5 when main pool devices are configured as a striped mirror with set sizes of six disks or fewer. Mirroring eliminates the read-modify-write penalty while still providing a degree of redundancy.

Enable Compression, Disable Deduplication

The NexentaStor appliance includes compression technology that can drastically improve performance for VDI workloads. Compression can also save storage space. For many VDI client operating systems, the default NexentaStor compression algorithm can save approximately 30% of the raw disk space. In some cases compression improves performance because fewer physical I/O operations are required. If the CPU resources to compress or decompress data are readily available, a few hundred microseconds of CPU time spent on compression can save milliseconds of disk I/O time.

It is natural to assume that deduplication would provide advantages in a VDI setting, and in many cases it would. In the testing performed by Citrix and Nexenta, however, deduplication did not offer additional value because the data to be deduplicated was the PVS write cache, which is already highly space-efficient.
In cases where PVS or other tools are not used, deduplication may offer more significant value.

Increase Memory for Power Users

The performance and scalability of the entire solution can be affected by the balance of virtual host and storage resources. VDI-based desktop sessions represent shared multi-user systems, and they must be managed equitably so that users don't impact each other's activities or the health of the overall system. In this context, it may be tempting to reduce memory for certain power users in an effort to limit the impact of their activities. Unfortunately, lowering the memory allocation of a resource-intensive virtual machine has exactly the opposite effect in a VDI setting, since less memory increases the paging load between the VDI server and the storage system. Instead, memory allocation should be increased for power users and decreased for less demanding users.
To explore the impact of virtual machine memory on performance, virtual machines were tested with 4 GB, 2 GB, 1 GB, and 512 MB of local memory. With the workloads tested, the change in I/O load was minimal between 4 GB and 1 GB. However, I/O load increased dramatically (by 20%) when memory was decreased from 1 GB to 512 MB. With less memory, the virtual operating system (Windows XP in this case) and its applications needed to page more frequently to temporary storage. Precise memory sizing recommendations are beyond the scope of this document and depend heavily on the choice of operating system and on user behavior. Please contact Citrix for the latest virtual machine sizing recommendations for various operating systems.

Place Paging Files on Solid State Devices (SSDs)

When XenDesktop is configured to use PVS, each virtual machine (VM) requires two different types of storage: persistent storage for OS images, and temporary storage for caching running desktop sessions. Of the two, temporary storage requires the fastest write access and the lowest latency in order to ensure good virtual desktop performance. To this end, the NexentaStor configuration can be optimized by employing solid state devices (SSDs), either in combination with or in place of hard disk drives. When used for VDI temporary storage, SSDs can provide extremely fast storage performance in virtualization environments. SSD space requirements can also be dramatically reduced by using compression. In Citrix and Nexenta proof of concept testing, two 32 GB SSDs were configured in NexentaStor as a striped data volume with ZFS compression to host temporary storage for 100 users, and that storage consumed only 4.7 GB.

Performance Results

Even in its default configuration, the NexentaStor appliance provides VDI storage performance advantages over traditional storage solutions such as RAID-5.
These advantages stem from the inherent synergy between ZFS technology and the unique requirements of VDI workloads. Applying the best practices described in this document yields additional performance gains. To evaluate the performance differences resulting from these configuration best practices, Citrix and Nexenta conducted before/after testing. For the testing, Windows Server 2008 R2 was run under Citrix XenServer, which mounted a NexentaStor appliance via NFS. Random 4 KB I/O operations were tested on both a tuned and an untuned NexentaStor appliance. As shown in Figure 2, the tuned configuration consistently achieved higher I/O operations per second across a range of file sizes.
[Figure 2: chart of write IOPS (vertical axis, 0 to 25,000) against file size (horizontal axis: 2 GB, 4 GB, 8 GB, 16 GB) for the untuned and tuned configurations.]

Figure 2. A tuned NexentaStor appliance provides significantly better I/O performance across a range of file sizes (bigger is better).

Conclusion

In contrast to many legacy storage solutions, particularly traditional RAID-5 arrays, NexentaStor paired with Citrix XenDesktop represents an ideal technology combination. Beyond the inherent strengths of the NexentaStor appliance for serving general-purpose workloads, innovations available through an effective ZFS implementation address the unique demands of VDI storage workloads. Flexible configuration capabilities in both Citrix XenDesktop and the NexentaStor appliance allow performance to be optimized further, yielding the write performance that the most demanding VDI storage needs require.

Nexenta Systems, Inc.
444 Castro, Suite 320, Mountain View, CA, 94041
Phone 1-877-862-7770
Fax 1-650-965-4482
Web nexenta.com

© 2011 Nexenta Systems, Inc. All rights reserved. Nexenta, NexentaStor, and NexentaOS are trademarks or registered trademarks of Nexenta Systems, Inc. GNU is a registered trademark of the Free Software Foundation. Debian is a registered trademark of Software In The Public Interest, Inc. Citrix is a registered trademark of Citrix, Inc. Other products and company names are the trademarks of their respective owners in the United States and other countries.

Citrix Systems is a leading provider of virtual computing solutions that help companies deliver IT as an on-demand service. Founded in 1989, Citrix combines virtualization, networking, and cloud computing technologies into a full portfolio of products that enable virtual workstyles for users and virtual datacenters for IT. More than 230,000 organizations worldwide rely on Citrix to help them build simpler and more cost-effective IT environments. Citrix partners with more than 10,000 companies in more than 100 countries. For more information on Citrix, visit www.citrix.com.

07/11