Nutanix NOS 4.0 vs. Scale Computing HC3

HC3:
- Integrated / included hypervisor software
- Shared storage benefits without SAN/NAS protocols and provisioning
- VM live migration / failover between servers
- Hypervisor-to-storage path: direct / integrated, single OS kernel
- One-step rolling updates of the full hypervisor / storage stack
- Single vendor for support / maintenance

Nutanix:
- Requires separate hypervisor licensing, install, configuration, support and updates
- VM live migration / failover between servers with 3rd-party hypervisor and VM management only
- Hypervisor-to-storage path: storage Controller VM to virtual network to hypervisor, using network storage protocols
- Requires the use of a 3rd-party hypervisor

What is Nutanix?

Nutanix sells hardware and software that consolidates the storage functionality otherwise provided by a SAN or NAS shared storage solution onto the same hardware used to run virtualization and compute. By running its storage functionality inside a virtual machine that has access to the physical disks on each machine, Nutanix creates a software SAN that can be distributed across multiple physical nodes for data redundancy. This approach also allows Nutanix to work with multiple 3rd-party hypervisors (VMware, Hyper-V and open source KVM) and provision storage to them using network storage protocols such as iSCSI, NFS or SMB.

What are the Benefits?

Virtualizing the storage within the same chassis as the compute allows users to get rid of the external storage array and some of the cabling that setup requires, which is a good first step in simplifying the server room. Nutanix started with support for only VMware, but its virtualized storage approach has allowed it to quickly pivot to support multiple 3rd-party hypervisors (hypervisor agnostic), which was
key given the pressure on their relationship with VMware. As was widely publicized, VMware disallowed Nutanix from participating in its Partner Exchange in 2013, with the announcement of VMware's VSAN on the horizon. Because Nutanix distributes the storage across multiple physical systems, it also has an advantage over monolithic SANs, which can suffer from a single point of failure without redundant storage arrays. Assuming the chosen hypervisor in the deployment supports, and is properly licensed and configured for, host failover, virtual machines created on a redundant Nutanix system (except the Controller VMs) are made highly available, meaning they will fail over to other nodes in the cluster in the event of a failure.

What are the Downsides?

Using a virtual storage approach where each node hosts a Controller VM is a design decision that requires the separation of hypervisor and storage administration. Users must decide whether to use iSCSI, NFS or SMB to communicate with the storage based on their selection of a 3rd-party hypervisor, and must administer that storage much the same way they would a physical SAN or NAS. A virtual SAN is not fundamentally any different from, or easier to manage than, a physical SAN. Storage becomes a new, complex VM workload to manage and size alongside the critical VMs running on each host, which means they effectively compete for the same resources, with a minimum of 12GB of RAM required per Controller VM (or 24GB if using deduplication). There are also strict rules for "Nonconfigurable Components" mapped out in the Nutanix user guide that, if modified, could render the cluster inoperable. Even something like accidentally taking a snapshot of this special Controller VM from within the vSphere console can lead to system-wide problems.
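To put those memory figures in context, here is a quick back-of-the-envelope calculation. The 12GB and 24GB per-Controller-VM figures come from the text above; the node count and per-node RAM are illustrative assumptions, not Nutanix specifications:

```python
# Rough Controller VM (CVM) memory overhead estimate.
# 12GB / 24GB per-CVM minimums are from the discussion above;
# node count and per-node RAM are hypothetical example values.
NODES = 3
RAM_PER_NODE_GB = 128

def cvm_overhead(cvm_ram_gb, nodes=NODES, node_ram_gb=RAM_PER_NODE_GB):
    """Return (total CVM RAM in GB, fraction of cluster RAM consumed)."""
    total_cvm = cvm_ram_gb * nodes
    total_cluster = node_ram_gb * nodes
    return total_cvm, total_cvm / total_cluster

base_total, base_frac = cvm_overhead(12)    # baseline CVM
dedup_total, dedup_frac = cvm_overhead(24)  # with deduplication enabled

print(f"Baseline:   {base_total}GB of CVM RAM ({base_frac:.1%} of cluster RAM)")
print(f"With dedup: {dedup_total}GB of CVM RAM ({dedup_frac:.1%} of cluster RAM)")
```

On this hypothetical 3-node, 128GB-per-node cluster, roughly 9% of all memory (or close to 19% with deduplication) is consumed by the Controller VMs before a single guest workload is placed.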
By not removing the requirement for storage protocols, Nutanix users must also deal with patching and other protocol issues outside of the updates to the hypervisor and other components in the stack. As seen in a knowledge base article from VMware (http://kb.vmware.com/selfservice/microsites/search.do?language=en_us&cmd=displaykc&externalid=2077360), storage protocols can and do cause unforeseen issues: "host intermittently loses connectivity to NFS storage and an All Paths Down (APD) condition to NFS volumes is observed." This particular issue affected traditional NAS vendors and Nutanix alike and was ultimately fixed, after a lengthy period, with a VMware patch (note that the hypervisor patch is managed separately from a storage/Nutanix patch), but it could have been entirely sidestepped with a system tested as a single stack, with storage protocols eliminated. It also highlights the issues that arise when multiple vendors (in this case VMware and Nutanix) are required to work together to support a single product. Instead of one throat to choke, users have to determine where in the stack an issue occurred to avoid the inevitable finger pointing that often affects users of multi-vendor solutions like Nutanix. Walking through the I/O path of a 4K write on Nutanix, you will notice that each I/O must pass through the hypervisor at least twice. Not only does a VM have the "normal" path for remotely accessing network SAN or NAS storage; that virtual SAN or NAS is now ALSO a workload running as a VM on each system, doing all of its processing and then accessing physical disk, flash and SSD storage distributed across multiple nodes.
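As a rough way to visualize this amplification, the following minimal Python sketch tallies how many times a single 4K write crosses the hypervisor in a Controller VM architecture. The hop names are illustrative simplifications of the step-by-step path described below, not Nutanix's actual internals:

```python
# Toy model of the write path in a Controller-VM (VSA) architecture.
# Each tuple is (hop the I/O takes, whether that hop traverses the
# hypervisor). Hop names are illustrative only.
WRITE_PATH = [
    ("guest VM -> virtual disk device", True),     # enters the hypervisor
    ("VMFS/VMDK -> storage driver", False),
    ("iSCSI/NFS request -> Controller VM", True),  # out through the hypervisor
    ("Controller VM -> virtual switch", True),     # back through the hypervisor
    ("hypervisor -> physical disks", False),
    ("disk ack -> Controller VM", True),           # up through the hypervisor again
    ("Controller VM ack -> initiator", True),      # protocol ack via the hypervisor
    ("hypervisor metadata update -> guest VM", False),
]

def hypervisor_crossings(path):
    """Count how many hops in the modeled path traverse the hypervisor."""
    return sum(1 for _hop, crosses in path if crosses)

print(f"One 4K write crosses the hypervisor {hypervisor_crossings(WRITE_PATH)} times")
```

Even in this simplified model the write crosses the hypervisor well more than twice, and that is before counting the additional crossings incurred when the Controller VM replicates the write to redundancy sets on other nodes.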
1. The guest VM does a 4K block I/O to an emulated or paravirtualized disk device.
2. That device is backed by a VMDK file that represents a virtual disk.
3. The VMFS file system translates the VMDK block I/O to file system semantics, storing data in 1MB blocks.
4. VMFS uses VMware storage drivers to execute the storage commands on a logical disk (LUN) sitting out on the Controller VM (also known as a Virtual Storage Appliance, or VSA). To do this it has to use SAN or NAS storage protocols such as iSCSI or NFS.
5. Each I/O results in network storage protocol requests and responses being sent to the Controller VM, just as they would be sent to an external SAN or NAS device.
6. The Controller VM generates I/O requests to multiple redundancy sets, possibly again grouping multiple I/Os into various-size stripes across the array, going back through a virtual switch on the hypervisor.
7. For the local I/O to be written to disk on each recipient server, it is then sent back through the hypervisor, where the blocks are physically written to the disk spindles.
8. The disks then have to acknowledge the writes back to the Controller VM, back up through the hypervisor again.
9. The Controller VM can finally acknowledge the write through the iSCSI target or NFS server and back across the network to the initiator.
Finally, the iSCSI initiator tells the hypervisor, which updates the VMFS and VMDK metadata and ultimately acknowledges the write back to the originating VM. In addition to the extra steps the I/O must traverse, there is none of the shared-memory benefit seen in products that integrate storage within the hypervisor. Every one of these steps involves copying data in memory from one location to another, passing around the payload along with metadata and control bits that are continuously transformed and modified as the request moves from one layer to the next. This makes the use of SSDs a requirement to maintain the performance necessary for most end-user environments.

How is HC3 Different?

Under the hood, HC3 provides a pre-installed, fully integrated virtualization stack with storage functionality built directly into the HyperCore OS. HC3 makes use of KVM (Kernel-based Virtual Machine), an open source hypervisor that was accepted into the upstream Linux kernel in early 2007. By integrating the storage in the hypervisor, HC3 eliminates the need for traditional storage protocols like iSCSI, NFS or SMB. This means the complexity of administering this layer of the stack vanishes, while retaining the high availability normally associated with a shared storage deployment using VMware. There is also a side performance benefit: because a KVM virtual machine is just another Linux process, its memory is managed like that of any other Linux process. While KVM is a supported hypervisor on Nutanix, Nutanix is at a distinct disadvantage there when compared to HC3. Both products can take advantage of the lack of licensing fees associated with the open source hypervisor, but that is where the similarities end. HC3 is built for small to mid-size IT organizations that do not have the in-house expertise to administer KVM in its traditional form (mainly command line).
Because of this, Scale has abstracted away the complexity of KVM by wrapping the underlying stack with an easy-to-use UI, alongside the self-healing intelligence built into the HyperCore OS, which manages the cluster as a whole (both hypervisor and storage). Initializing the cluster initializes both the storage and the hypervisor, so end users are able to unbox and immediately begin creating VMs in less than half a day, with no prior training in either virtualization or storage system management.
Nutanix actually requires the administrator to download the Nutanix-specific KVM RPM and install it as root on every Controller VM in the cluster. The specific process can be seen in the Nutanix documentation (http://download.nutanix.com/guides/c_3_5/xhtml/oxy_ex-1/topics/kvm/vm_mgmt_commands_r.html), but it would likely require a level of comfort with general Linux administration to perform. Note that this process also poses a potential security risk, as SSH has to be open on the Controller VM. Once the KVM hypervisor is set up and the management tools are installed (command line scripts and wrappers used to create VMs under KVM), each new virtual disk created in NOS creates a separate iSCSI LUN, again highlighting the added complexity that the virtual storage approach requires. In addition, the only supported KVM distribution listed by Nutanix is CentOS (Community Enterprise OS), for which there is no paid vendor support available.

Nutanix is in Gartner's Magic Quadrant, Why not Scale?

While Nutanix appears in Gartner's Magic Quadrant report, they are not actually in THE magic quadrant: the upper-right "Leaders" quadrant where companies that focus on large enterprises aspire to be. Nutanix was actually placed in the quadrant labeled "Visionaries" due to concerns over their ability to execute, given their relatively short presence in the market. What often happens to visionaries without the ability to execute is that they get bought by someone who thinks they CAN execute, which then changes everything, to the dismay of existing customers who purchased along the way. Why was Scale not included? Gartner focuses their research on and for the largest of the large organizations: only 9,000 enterprises worldwide. Note enterprises, not the small and midmarket companies that Scale is laser focused on serving. As a result, Scale Computing is NOT a paid Gartner client and did not pay to be included in large-enterprise-focused research like this.
We focus our efforts on product development, on simplification, and on listening to the needs of the hundreds of thousands of target companies that range from mid-market to the very small. We also work with technology and market analysts who focus on mid-market IT needs, and we are happy to provide reports and quotes that highlight their positive view of Scale Computing and HC3.
The Taneja Group - http://www.scalecomputing.com/files/documentation/whitepaper-taneja-group-tech-validation.pdf

Nutanix Raised a Lot of Money, Doesn't that Make Them a Safe Choice?

There is a trend of enterprise IT vendors taking $100+ million in funding at billion-dollar valuations to extend the runway of the life of their company. While it seems impressive in press releases, the reality is that before purchasing from one of these companies, end users must ask themselves why the money was needed, and what the investors who now control the company want to have happen. Typically, companies cite a need for this influx of cash to grow sales and marketing ahead of what their actual sales revenues can support. As noted by the announcement of their 500th employee (http://www.nutanix.com/blog/2014/05/13/nutanixcelebrates-500th-employee/), "Between April 2013 and April 2014 we've hired, on average, one
person a day." At the end of May 2014, Howard Ting, Nutanix Senior VP of Marketing and Product Management, reported 600 customers and 550 employees on a Stifel Capital Markets update call. That is almost an employee for every customer, clearly a level not sustainable for a long-term company. Scale, on the other hand, has shipped just under 900 clusters to date and has profitability in sight, which is the true mark of longevity for any company (not just a high-growth, venture-backed one).

Conclusions

- Nutanix converges storage and compute by virtualizing the storage in the form of a virtual machine running on the cluster (the Controller VM)
  - Controller VMs communicate with the storage through traditional storage protocols (iSCSI, NFS or SMB)
  - Drawbacks of utilizing a software SAN:
    - Requires the use of traditional storage protocols, with separate administration between the hypervisor and storage
      - Separate patches and updates
      - Multiple vendors required for support can lead to finger pointing
    - Requires that I/O go through the hypervisor twice, mandating the use of SSDs for performance
    - The Controller VMs compete with actual workloads running on the system (12GB minimum RAM, or 24GB minimum when using deduplication)
    - Minor changes to the default settings, such as an accidental snapshot of the Controller VM within vSphere, can cause system-wide outages
- Nutanix on support for KVM:
  - Requires command line knowledge and general competence in Linux administration to implement KVM
  - The only supported distribution is CentOS, which offers no paid support
- Nutanix is not listed as a Leader (upper right) in the Gartner Magic Quadrant
  - Gartner defines a Visionary as lacking the ability to execute needed to become a Leader
    - Companies in this quadrant are targets for acquisition by those who CAN execute, to the dismay of existing customers