
System & Network Engineering

Large Installation Administration

Comparing open source deduplication performance for virtual machines

Authors:
Supervisor: Jaap van Ginkel

University of Amsterdam


Abstract

The research presented in this paper shows in which specific situations the most efficient use of storage resources can be achieved using the open source deduplication solutions LessFS and ZFS, while maintaining an acceptable level of disk performance. This study specifically examines the disk performance of virtual guests whose disks are stored on the deduplicated file systems. The results show that optimal performance is achieved by not using any data deduplication solution, especially when virtual guests need to perform heavy write actions. However, if large amounts of storage need to be saved, only a small performance penalty is paid when ZFS is used as the deduplication solution. Furthermore, the research shows that ZFS is the preferred open source deduplication solution, as it performs best in most of the test cases while still maintaining a high level of storage consolidation. Finally, the research shows the effectiveness of both deduplication solutions in storage consolidation; LessFS proves the most efficient at this task.

Contents

1 Introduction
  1.1 Data deduplication
2 Research
  2.1 Research questions
  2.2 Literature review
  2.3 Open source solutions
  2.4 Performance measurement tools
  2.5 Disk performance determining factors
3 Experiments
  3.1 Lab setup
  3.2 Storage measurements
4 Results
  4.1 Sequential write
  4.2 Sequential read
  4.3 Random rewrite
  4.4 Random seeks
  4.5 Storage consolidation
5 Conclusion
6 Future work
A Server hardware
  A.1 Storage server
  A.2 Host server
  A.3 Virtual guest server

1 Introduction

In an ever growing and competitive hosting market, keeping costs down is one of the more important factors. It is therefore important to make the most efficient use of resources while keeping the same level of performance. With increasing bandwidth speeds, larger data files and growing storage requirements, a solution for efficient and fast data storage is needed. To this end, data deduplication [1] has gained enormous popularity as an efficient way of storing data.

1.1 Data deduplication

Storage systems are often dedicated to storing large amounts of data, which they store on their file systems in the form of blocks of a certain size. For instance, take a system administrator who has configured nightly backup jobs for all the desktop machines in his organization. These backups are stored on a central backup storage system. Most likely, each backup will contain many similar data files, each of which requires the same amount of storage space every time it is stored on the storage system containing all the backups of the organization.

An example of efficient resource utilization is virtualization [2]. Many bare-metal systems that provide only a single service, e.g. a DNS or mail server, often do not fully utilize the available resources. By using virtualization, a single bare-metal server can run multiple virtual operating systems. From a user's point of view, each OS appears isolated and standalone. Effectively, one system can be used to create multiple virtual systems, increasing the utilization of the bare-metal server's resources. However, each virtualized system still requires storage for its operating system files, which is often placed on a central storage repository accessible over the network. In scenarios where the same OS is virtualized more than once, the same files are likely to be stored more than once as well. According to Keren Jin and Ethan L. Miller [3], using data deduplication for storing virtual disks helps considerably in consolidating storage. However, practical data deduplication implementations and virtual guest performance have not been researched.

Data deduplication is a method to eliminate duplicate copies of the same data blocks, which can be used to reduce the required amount of space in storage systems such as those described in the examples above. There are different types of data deduplication, which we discuss further in this paper. When data deduplication techniques are used on a storage system, the data stored on it is checked for duplicate blocks. When identical data blocks are found, the system creates a pointer to the data block that was already present on the system. This approach only requires the storage space for one copy of the data, together with some overhead to store the pointers.
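The pointer-based approach described above can be sketched as a minimal in-memory model. This is an illustration only, not how LessFS or ZFS are implemented: each incoming block is hashed, and if the hash has been seen before, only a pointer to the existing block is stored.

```python
import hashlib

class DedupStore:
    """Toy block-level deduplication store (illustrative sketch only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}    # digest -> block data, stored exactly once
        self.pointers = []  # logical layout: a digest per logical block

    def write(self, data):
        # Split the incoming byte stream into fixed-size blocks.
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha1(block).hexdigest()
            if digest not in self.blocks:   # a new, unique block
                self.blocks[digest] = block
            self.pointers.append(digest)    # a pointer is always stored

    def physical_size(self):
        return sum(len(b) for b in self.blocks.values())

    def logical_size(self):
        return sum(len(self.blocks[d]) for d in self.pointers)

store = DedupStore()
store.write(b"A" * 8192)   # two identical 4 KiB blocks
store.write(b"A" * 4096)   # a third duplicate of the same block
print(store.logical_size(), store.physical_size())  # 12288 4096
```

Three identical logical blocks occupy the physical space of one, plus the pointer overhead mentioned above.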

Types of deduplication

Data deduplication differs from regular compression algorithms in the sense that it works on the block level of a storage device instead of on the files themselves. As there are many more data blocks than there are data files, it is much more likely that many blocks are identical than that entire files are duplicated. This potentially allows larger amounts of duplicated data to be saved. A downside is that the process itself is CPU intensive, as new data needs to be matched against previously stored data. There are different forms of deduplication that can be used in various scenarios. The usage of each type depends on the storage and network setup and requirements. We can distinguish between the location, the time and the method of deduplication.

Location
The location where deduplication is done is an important factor to take into account. For instance, when there is a limited amount of bandwidth available, the maximum amount of time a backup can take will increase. The choice of where deduplication takes place therefore depends on the amount of bandwidth available. Depending on how the storage infrastructure is designed, choices have to be made on how to utilize it. Deduplication can be applied at the source where the data originates, or at the target where it is stored. In the backup example above, applying source deduplication can shorten the backup time, as there is less data to transfer to the backup system. With target deduplication, more bandwidth is required, and the target storage system initially requires more disk space than with source deduplication.

Time
A high system load is undesirable at times when a storage system is actively used. The time at which deduplication is performed can influence the performance of a storage system and can be noticed by its direct or indirect users. Depending on the function of the storage system, it may be wise to let the deduplication algorithms run when the system is not actively used, for instance, at night.

Method
The method of deduplication defines what type of deduplication technique is performed on the data to reduce the amount of space it consumes.

Deduplication methods
There are different deduplication methods that determine how the actual data is stored and how it is reduced in size. The most common methods are described below.

In-line deduplication is a form of deduplication where the deduplication mechanism examines the data as it is received (e.g. via the network).

If a duplicate block of data is detected, a new pointer to the existing block is created. Besides the pointer, no data is generally written to the storage system. A downside is that this slows down the system when a large amount of data is being received [4].

Out-of-line deduplication is, as the name suggests, the opposite of in-line deduplication: deduplication is performed at the target data storage after the data has been written. This method is also known as post-process deduplication. The data is first stored on disk and deduplicated at a later time, for example to avoid a loss in performance during business hours. With this form of deduplication, more storage space is required to store the initial data before it is deduplicated [4].

File-based deduplication, commonly known as Content Addressable Storage (CAS) [5], is a deduplication method that operates on the file level instead of the block level.

The methods described above each have their own pros and cons. Which one to implement depends on the requirements and utilization of a storage system. Deduplication impacts the write performance of a system, so this is an important aspect to take into account when choosing the most suitable method.
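File-based deduplication can be illustrated with a short sketch. This is a toy model, not an actual CAS product: files are addressed by the hash of their entire contents, so two identical files occupy the space of one.

```python
import hashlib

class FileStore:
    """Toy content-addressable (file-level) store (illustrative sketch)."""

    def __init__(self):
        self.objects = {}   # content digest -> file bytes, stored once
        self.names = {}     # file name -> content digest

    def put(self, name, data):
        digest = hashlib.sha1(data).hexdigest()
        self.objects.setdefault(digest, data)  # dedupe on identical content
        self.names[name] = digest
        return digest

    def get(self, name):
        return self.objects[self.names[name]]

store = FileStore()
store.put("vm1/etc/hosts", b"127.0.0.1 localhost\n")
store.put("vm2/etc/hosts", b"127.0.0.1 localhost\n")  # identical file
print(len(store.objects))  # 1: one stored object backs both names
```

A single changed byte in one of the files would produce a different digest and thus a second stored object, which is exactly why block-level deduplication finds more duplicates than file-level deduplication.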

2 Research

Our research focuses on the use of different open source data deduplication solutions as storage platforms for virtual disks and the amount of storage that can be consolidated. Furthermore, we want to gain insight into the impact on the performance of virtual guests whose virtual disks are stored on a deduplicated storage platform.

2.1 Research questions
Based on the description above, we defined the following research question.

What is the impact on virtual guest disk performance when using open source data deduplication solutions?

The following subquestions will help to answer the main research question.

What is the amount of storage saved when a deduplication mechanism is applied?
How do increasing amounts of storage consolidation influence the virtual disk performance?

2.2 Literature review
Prior to the start of this study, we looked into existing related research on the subject. Keren Jin and Ethan L. Miller have studied the effectiveness of data deduplication of virtual disks in their paper The Effectiveness of Deduplication on Virtual Machine Disk Images [3]. They found that using data deduplication on virtual disks can save about 80% of storage. However, the methods used are purely conceptual and proof of concept: they treat each virtual disk as a byte stream and separate the stream into chunks of a specific size. For each chunk they calculate a SHA1 hash, and if multiple chunks have the same SHA1 hash, they are considered duplicates of each other. Whether the same percentage of storage saving can be achieved in practice is not clear.

A more practical approach to the effectiveness of data deduplication is taken by Dutch T. Meyer and William J. Bolosky in A Study of Practical Deduplication [6]. Their results show that data deduplication is very effective within a large commercial company. However, they do not look at the use of data deduplication in virtualized environments and only consider Windows-based desktop machines. Their research shows the effectiveness of data deduplication on a large set of different data across different systems, which is relevant for our research.
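The chunk-hashing method that Jin and Miller describe can be sketched in a few lines. This is a simplified illustration of their approach, with an arbitrary chunk size: the byte stream is cut into fixed-size chunks, each chunk is hashed with SHA1, and the fraction of duplicate hashes estimates the achievable saving.

```python
import hashlib

def dedup_ratio(stream: bytes, chunk_size: int = 4096) -> float:
    """Fraction of chunks that are duplicates under fixed-size chunking."""
    hashes = [hashlib.sha1(stream[i:i + chunk_size]).hexdigest()
              for i in range(0, len(stream), chunk_size)]
    return 1 - len(set(hashes)) / len(hashes)

# Four 4 KiB chunks, of which three are identical:
stream = b"A" * 4096 * 3 + b"B" * 4096
print(dedup_ratio(stream))  # 0.5: half the chunks need no new storage
```

Applied to real virtual disk images sharing an operating system, this is the analysis that yielded the roughly 80% saving reported in [3].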

2.3 Open source solutions

Companies such as NetApp [7] and EMC [8] provide storage solutions which are often very costly to implement in a business. Such solutions are not affordable for every company, as they are proprietary and also require trained engineers or a support contract. Although the latter two can also be desirable when using an open source solution, an open source solution can be a more viable option. In this study, we specifically look into open source implementations of deduplication mechanisms. A few of the available open source deduplication solutions are discussed below.

SDFS is an open source deduplication solution by OpenDedup. It is capable of performing in-line deduplication [9] on the files stored in its file system. It is a cross-platform solution with support for multiple, distributed storage nodes. OpenDedup provides support for in-line and out-of-line (batch) deduplication. They claim to be an open source enterprise deduplication platform that is able to perform deduplication at line speeds of 1 Gigabyte per second or faster. The official homepage contains well documented guides on how to set up an OpenDedup file system on a Linux system. OpenDedup could be an interesting choice for this research, as it provides an easy setup and promises great performance.

LessFS provides in-line data deduplication using FUSE [10]. LessFS is a userspace deduplication solution that supports in-line deduplication, compression and encryption. The official LessFS website focuses on usability by providing easy to follow tutorials on how to set up a simple LessFS-based file system on a Linux system using common tools. The suggested usability of LessFS makes this solution a very viable candidate for our research.

s3ql is a source-based deduplication solution that deduplicates data before sending it to Amazon S3 storage buckets in the cloud [11]. A similar approach is taken by s3fs [12], but it does not provide local deduplication. Both solutions sound very interesting, especially from a cost-saving point of view. However, cost saving is not the prime focus of this research, so we do not investigate them further.

ZFS is a well-known file system with built-in data deduplication. ZFS is a robust and reliable file system originally introduced with the OpenSolaris operating system. It supports large storage volumes of up to 16 Exabytes, file and folder snapshots, and continuous file integrity checking [13] to prevent data corruption. More recently, native support for ZFS file systems was included in the FreeBSD operating system [14]. We think ZFS is also interesting to include in our research, as it is a popular file system with built-in support for deduplication.

2.4 Performance measurement tools

To perform the actual measurements, we researched existing tools that can aid us in the measurements.

iozone is a file system benchmark tool [15]. It can simulate various workloads (or IO operations) on file systems to measure their real-life performance [16].

iostat is a tool similar to iozone. It is mostly used on local storage or NFS shares to monitor and measure IO performance [17].

hdparm is a utility to view and change hard drive parameters to gain optimal drive performance. It can also be used to perform disk throughput tests to measure disk performance [18].

bonnie++ is essentially a rewrite, in the C++ programming language, of a similar tool called bonnie. It has many features to perform the disk performance measurements we require.

fio is a popular tool to measure storage performance [19]. It features a comprehensive feature set, such as 13 different IO engines, multiple IO priorities (for newer Linux kernels) and multi-threading.

iperf can measure the maximum performance and bandwidth of TCP and UDP based connections [20]. A client-server model is used to perform measurements on the network performance. For a specified amount of time, a traffic stream is generated between an iperf client and server, and the network throughput is reported upon completion. Although this tool cannot be used to measure IOPS, it is still useful to measure the raw network throughput. A similar tool is Netperf [21].

2.5 Disk performance determining factors

How a disk performs in a storage system depends on several factors. The hardware used in the system is obviously the most important factor. While hardware has its limitations, software can improve the performance of the system. The most important factors that can influence disk performance in a system are the following.

CPU (clock speed, L3 cache size)
Storage device (SSD, HDD, PCI)
Storage configuration (RAID level)
Caching mechanisms (RAM)
File system (block size, journaling)
IO scheduler (type of algorithm and configuration)
Kernel (version, optimizations)

3 Experiments

We have created a test plan to be able to obtain consistent test results. The defined test procedures were applied to the different lab setups using different configuration parameters. This method of consistent testing results in comparable measurements.

3.1 Lab setup

To perform measurements, we set up a lab environment. This test environment consists of a total of six Dell servers which will be used during the experiments. The specific models and hardware specifications are described in Appendix A. Out of the six available machines, five will be used as hosts running multiple virtual guest machines. These virtual guests are installed with a default x64 installation of Debian Squeeze. On top of this, the virtualization software Xen will be used [22], which enables us to run multiple operating systems simultaneously. We chose to use Xen because it is easy to set up, widely used, has a large community and is an enterprise-ready virtualization solution. Xen is installed from the Ubuntu repository using the xen-hypervisor-4.1-amd64 package. The sixth machine will be used as a shared storage server on which we will install the deduplication software. This server will share its storage over the network with the five host machines.

3.1.1 Initial network test

First, we want to test the performance of the network to confirm its reliability, stability and speed. This is important so we can be sure that the network is not a limiting factor in achieving our results. Accessing a shared storage resource over a network results in a slight performance overhead. Because of this, our initial network test is done using the benchmark tool iperf, which we described in Section 2.4. The iperf test was performed without any tweaks to the TCP/IP stack. The test performed is shown below.

iperf command.
hosta:~# iperf -s
hostb:~# iperf -c hosta

An average network speed of 939 Mbits/sec was measured using the shown method to set up a server and client connection. The test generates traffic from memory,

so disk performance is not measured or taken into account during this test. The results show a near line-speed network bandwidth, which demonstrates a stable network.

3.2 Storage measurements

3.2.1 Storage setup

As we want to be able to perform measurements using a shared storage system, we have to set up a server that provides these services. We chose the iSCSI protocol to share the file system on the shared storage server with the different virtual host machines. For every virtual guest in the lab setup, we create a separate sparse file containing the operating system. Normally, when storing files on disk, the size of a file can be requested by issuing a specific system call to the operating system. A special way of storing files is to do so in a sparse way. Sparse files differ from normal files in the sense that empty parts of the file are not actually stored on disk, but are represented using metadata, thus saving valuable storage space on the file system [23]. Each sparse file belonging to a separate virtual guest is represented as an iSCSI volume (LUN), which is mounted on the host where the virtual guest runs. Each of the five hosts mounts the shared iSCSI LUNs to provide the hosted virtual guests with disk space. Each virtual guest has a separate physical hard disk of 5 Gigabytes configured in its Xen configuration file, which represents the actual iSCSI LUN mounted on the host. This setup enables the virtual guest to boot directly from the iSCSI LUN located on the shared storage server. The setup is visualized in Figure 1.

The first step is to perform measurements on the lab setup without any deduplication solution running on the shared storage server. The storage server is installed using the EXT4 file system. We chose EXT4 as the base file system for the storage server because it is a modern file system with great performance, as benchmarked in [24]. By performing measurements on the EXT4 file system, we are able to gain insight into the basic performance of the lab setup. A baseline of the performance is created, which in turn enables us to compare it against the deduplication solutions. To compare the baseline performance of the lab setup with the deduplicated setup, we have to install multiple open source deduplication solutions. We chose two of the solutions discussed in Section 2.3 to perform measurements on.
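The sparse files used for the guest images in the storage setup above can be demonstrated with a short sketch. This is illustrative only (the actual guest images were created with standard tools), and it assumes a file system with sparse-file support, as most modern Linux file systems have: seeking past the end of a file and writing creates a hole whose empty range consumes no disk blocks.

```python
import os
import tempfile

# Create a 100 MiB sparse file: seek far ahead, then write a single byte.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.seek(100 * 1024 * 1024 - 1)  # skip over the (empty) hole
    f.write(b"\0")                 # only this byte really hits the disk

st = os.stat(path)
apparent = st.st_size              # logical size reported to applications
physical = st.st_blocks * 512      # blocks actually allocated on disk
print(apparent, physical)          # 104857600 vs. only a few KiB
os.remove(path)
```

The gap between the apparent and physical size is exactly the metadata-only hole described above, and it is also why a file system that ignores sparseness reports far higher usage, as seen later in the storage consolidation results.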

Figure 1: Lab setup (DomU on Dom0 hosts, connected via iSCSI to the file system and disks of the storage backend).

Initially we selected OpenDedup's SDFS and the ZFS file system. However, we were unable to maintain stability when performing measurements on the SDFS solution under high loads. As time is a limiting factor for this project, we chose ZFS, in a separate installation of FreeBSD 9, and LessFS instead of SDFS. LessFS and ZFS appear to be two of the most popular open source deduplication solutions that perform in-line deduplication. LessFS runs in user space, while ZFS runs in kernel space (a native implementation) in the FreeBSD 9 installation. Currently, ZFS on FreeBSD is the only native file system with deduplication capabilities. While both solutions fundamentally differ in the way they are run, we believe a comparison is still very interesting. The drawback of running an application in user space is that more context switching is required to move between user and kernel space. Some might argue that comparing a user space solution to a kernel space solution is unfair. While this may be true in essence, we believe the comparison is still valid, as real-world practice shows that, mostly due to management constraints, using FreeBSD in a strict Linux environment is often not a possibility. This results in the need for a different solution, which currently means a user space implementation is the only option.

3.2.2 Lab measurements

The measurements will be conducted on the lab setup running 10, 20 and 30 virtual guests simultaneously. By using increasing numbers of virtual guests, we are able to compare the differences in load across measurements. In turn, the results can be used to draw conclusions on which solution performs best and what to take into account when deploying deduplication in a similar way. To be able to calculate average values, each measurement is performed a total of three times using the bonnie++ tool, which we briefly discussed in Section 2.4. The reason for using bonnie++ for benchmarking is that it is a widely used tool for measuring disk performance that is able to run all of the benchmarks we require for this research. We used version 1.96 of bonnie++.

To efficiently perform measurements on the lab setup, we had to automate the deployment and bootstrapping of the virtual guests on the host machines over iSCSI. Scripts were created to automate this. The automation consists of two parts. The first script acts as a central control script on the storage server, which in turn contacts a second script on the hosts using SSH public key authentication for password-less login. This second script dynamically mounts the iSCSI LUNs required for the given number of virtual guests that we want to boot on the lab setup. To start or stop a virtual environment consisting of 30 virtual guests on our lab setup, we issue the following commands on the control server:

Control commands to boot and halt the virtual lab environment.
control_server:~# /lab-control.sh start 30
control_server:~# /lab-control.sh stop 30

The 30 virtual guests are equally distributed over the five host servers. The control script contacts all five hosts to dynamically mount the iSCSI LUNs located on the storage server and boots the Xen virtual guests on each respective host.
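The distribution logic of the control script can be sketched as follows. This is a hypothetical reconstruction: the host names, guest naming scheme and round-robin policy are assumptions, as the actual scripts are only available on request.

```python
def distribute_guests(num_guests: int, hosts: list) -> dict:
    """Round-robin assignment of virtual guests to host servers."""
    plan = {h: [] for h in hosts}
    for i in range(num_guests):
        host = hosts[i % len(hosts)]        # spread guests evenly
        plan[host].append("vm%02d" % (i + 1))
    return plan

hosts = ["host%d" % i for i in range(1, 6)]  # five Xen hosts (assumed names)
plan = distribute_guests(30, hosts)
print({h: len(v) for h, v in plan.items()})  # 6 guests per host
```

With 30 guests over five hosts, each host boots six guests, matching the equal distribution described above.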
The scripts are available on request.

3.2.3 Measuring the performance

There are multiple tools available for Linux distributions that are suitable for performance measurements [25]. In Section 2.4, we looked into several of the available tools that are able to test hard drive and system performance [26].

Out of these tools, we found bonnie++ to be the most suitable for our experiments. Bonnie++ performs four types of tests, which can be divided into three categories: sequential output, sequential input and random seeks. Some of these tests require data to be present on the disk before they can be performed. For each of these tests, bonnie++ first writes a file with random data to disk.

Sequential output
The sequential output category consists of two tests. The first test writes blocks of data sequentially to disk. The second test in this category is the rewrite test. This test seeks, 8000 times in parallel, to a part of the generated file and reads it into memory. In addition, in 10% of the cases, the read data is changed and written back to disk. Both of these tests are aimed at measuring how fast files can be written to disk.

Sequential input
In the sequential input category, the test performed is the sequential reading of data from disk into memory. Effectively, this test measures how fast files can be read from disk into memory so they can be used by the user or arbitrary applications.

Random seeks
Finally, the random seeks category contains a test for randomly reading data from the disk in parallel. This test focuses on measuring how fast data can be read from different parts of the disk simultaneously [26].
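The rewrite test's behaviour can be mimicked with a small sketch. This is a simplification of what bonnie++ does internally; the file size, chunk size and seed here are arbitrary choices for illustration: seek to random offsets, read a chunk, and write back a modified copy in roughly 10% of the cases.

```python
import os
import random
import tempfile

def rewrite_pass(path: str, seeks: int = 8000, chunk: int = 512) -> int:
    """Seek to random offsets, read a chunk, rewrite ~10% of the chunks read."""
    random.seed(42)                        # deterministic for this example
    size = os.path.getsize(path)
    rewrites = 0
    with open(path, "r+b") as f:
        for _ in range(seeks):
            f.seek(random.randrange(size - chunk))
            data = f.read(chunk)
            if random.random() < 0.10:     # change and write back 10% of reads
                f.seek(-chunk, os.SEEK_CUR)
                f.write(bytes(b ^ 0xFF for b in data))
                rewrites += 1
    return rewrites

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(os.urandom(1 << 20))           # 1 MiB file of random data
n = rewrite_pass(path)
print(n)                                   # roughly 800 of 8000 seeks rewrite
os.remove(path)
```

On a deduplicating, compressing file system, each of those reads triggers decompression and each write-back triggers re-hashing, which is what makes this workload so revealing in the results below.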

We used the following bonnie++ command in combination with standard GNU tools to get the desired output.

Bonnie++ command.
$ /usr/sbin/bonnie++ -n 0 -u 0 \
    -r $(free -m | grep Mem: | awk '{print $2}') \
    -s $(echo "scale=0; $(free -m | grep Mem: | awk '{print $2}')*2" | bc -l) \
    -f -b -d /tmp/ > /tmp/bonnie.output

The command does the following:

Command / Argument       Description
/usr/sbin/bonnie++       Run the bonnie++ tool.
-n 0                     Disable the file creation tests.
-u 0                     Set the UID to 0 (root).
-r $(free -m ...)        Specify the amount of memory in Megabytes.
-s $(echo ...)           File size for the IO performance tests: the amount of memory in MB times 2.
-f                       Skip the per-character IO tests.
-b                       Don't use write buffering; fsync() after every write.
-d /tmp/                 The directory to use for the tests.

Table 1: Description of the bonnie++ arguments used.

When bonnie++ runs in parallel with the parameters listed in Table 1, it writes small files, similar to a real-world scenario. We skip per-character IO writes because we want to perform sequential writes to the storage system. By using files twice as large as the maximum amount of memory, in combination with disabled write buffering, we completely bypass the operating system's caching mechanisms.
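The size calculation that the command performs with free and bc can be expressed in a few lines. This is a sketch of the same arithmetic, parsing MemTotal from /proc/meminfo-style input (the sample string below is a made-up example, not a measurement from the lab servers):

```python
def bonnie_size_mb(meminfo: str) -> int:
    """Test-file size (MB) = 2 x physical RAM, so the page cache can't hold it."""
    for line in meminfo.splitlines():
        if line.startswith("MemTotal:"):
            kb = int(line.split()[1])   # /proc/meminfo reports the value in kB
            return 2 * (kb // 1024)     # convert to MB, then double it
    raise ValueError("MemTotal not found")

sample = "MemTotal:       16384000 kB\nMemFree:        123456 kB\n"
print(bonnie_size_mb(sample))  # 32000
```

Doubling the RAM size guarantees that the working set cannot fit in memory, which, together with the -b flag, is what forces every measurement down to the actual disks.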

4 Results

4.1 Sequential write

Figure 2 shows the average performance of a virtual guest when 10, 20 or 30 virtual guests are writing to disk simultaneously. These results are as expected, since writing to an EXT4 file system requires fewer actions before the data can be stored on disk. When a deduplication solution is used, overhead in the form of deduplication and compression is introduced, which makes writing slower compared to EXT4.

Figure 2: Average write performance (KByte/s vs. number of VMs, for EXT4, ZFS and LessFS).

4.2 Sequential read

For the results of the read tests, we can take a look at Figure 3. The results gathered are not as expected and show inconsistent behavior when these tests are performed with different numbers of virtual guests. Interestingly, however, both deduplication solutions gradually perform worse in contrast to EXT4 as more virtual guests are used. The data that needs to be read is stored in a compressed state on disk and is therefore read into memory faster, since less data needs to be read. The decompression that follows is needed to reconstruct the complete set of data; this process is solely a CPU intensive task. When 20 or 30 virtual guests are used, however, we can see a rapid decline in the performance of the deduplication solutions. This is because the CPU of the storage backend now becomes the bottleneck for these same decompression actions. We therefore believe that the reason for the better performance of the deduplication

solutions when tests are run for 10 virtual guests is the compression employed by both deduplication solutions.

Figure 3: Average read performance (KByte/s vs. number of VMs, for EXT4, ZFS and LessFS).

4.3 Random rewrite

The results for the rewrite tests are shown in Figure 4. In this graph, you can immediately see that LessFS is outperformed by ZFS and EXT4. Furthermore, ZFS performs almost as well as EXT4, or even better. The reason ZFS actually performs better with 20 or 30 virtual guests can be attributed to its use of dynamic block sizes of up to 128 Kilobytes [13]. When this test is conducted with 10 virtual guests, the size of the blocks used in EXT4 and ZFS has no real impact, as enough disk resources are available to accommodate all the disk access requests from the virtual guests. For 20 and 30 virtual guests, however, this is no longer the case. Because ZFS uses a dynamic block size, smaller blocks can be read, changed and written back faster compared to the fixed-size blocks of EXT4 and the even larger blocks of LessFS.

4.4 Random seeks

As mentioned in Section 3.2.3, the seek test seeks to parts of the test file in parallel, 8000 times in total, and in 10% of the cases the read data is changed and written back. The results for this test can be seen in Figure 5. The results are as expected, because reading random data from disk, and sometimes writing data back to disk, is directly affected by the number of actions needed to complete these tasks. We believe that mainly the

decompression action is the bottleneck, as trying to decompress something 8000 times in parallel is a very CPU intensive task.

Figure 4: Average rewrite performance (KByte/s vs. number of VMs, for EXT4, ZFS and LessFS).

It is also interesting to see that, for the first time, LessFS structurally outperforms ZFS, albeit not by much. This is most likely because the compression algorithm used by LessFS is a little faster at decompressing data than the one used by ZFS. LessFS uses the QuickLZ algorithm to apply additional compression on top of the deduplication. The QuickLZ developers claim it is the world's fastest compression library and that it can reach over 358 Megabytes per second in decompression throughput [27]. The ZFS pool we configured uses the LZJB compression algorithm, which we believe is slower at decompression than QuickLZ. We base this belief on [28], in which LZJB is compared against LZO, the latter being significantly slower than QuickLZ. It is shown there that the decompression rates of LZJB are relatively poor, especially for smaller blocks of data.

4.5 Storage consolidation

Figures 6, 7 and 8 show the reported usage and the actual file system usage. As discussed in Section 3.2.1, sparse files were used, and they are interpreted correctly by the operating system in the case of the EXT4 file system. Therefore, we see the same size for both the reported and the actual usage in the case of EXT4. The most interesting aspect of these results is the usage reported by LessFS. LessFS does not take the sparse properties of the files into account and therefore reports the full size for each file. ZFS

on the other hand does take the sparse properties into account and, in addition, accounts for the compression used. This results in the reported size being just below the size reported by the EXT4 file system.

Figure 5: Average seek performance (seeks/s vs. number of VMs, for EXT4, ZFS and LessFS).

Figure 6: Used space for 10 virtual guests (reported vs. real, in MBytes, per deduplication method: EXT4, LessFS, ZFS).

For a better look at the amounts of space reported, used and saved, we can take a look at Table 2. This table shows that LessFS actually consolidates more storage than ZFS and EXT4.

Figure 7: Used space (MBytes, reported vs. real) for 20 virtual guests.

Figure 8: Used space (MBytes, reported vs. real) for 30 virtual guests.

          10 VM               20 VM               30 VM
          Real     Reported   Real     Reported   Real     Reported
EXT4      4594,…   …          …        …          …        …,68
LessFS    251,…    …          …        …          …        …,377
ZFS       …        …          …        …          …        …,535

Table 2: Detailed storage consumption (in MB)
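The consolidation figures in Table 2 boil down to a simple ratio of reported to real usage. The sketch below shows that arithmetic with illustrative placeholder numbers, not the measured values from Table 2:

```python
# Placeholder (reported MB, real MB) pairs -- illustrative only,
# not the measurements from Table 2.
usage = {
    "EXT4":   {"reported": 13800.0, "real": 13800.0},  # no dedup: reported == real
    "LessFS": {"reported": 46000.0, "real": 700.0},    # reports full, non-sparse sizes
    "ZFS":    {"reported": 12500.0, "real": 900.0},    # sparse-aware, plus compression
}

for fs, u in usage.items():
    ratio = u["reported"] / u["real"]                  # consolidation ratio
    saved = 100.0 * (1.0 - u["real"] / u["reported"])  # space saved in percent
    print(f"{fs:6s}  ratio {ratio:6.1f}:1  saved {saved:5.1f}%")
```

With the real allocated sizes from Table 2 plugged in, the same two lines reproduce the consolidation comparison between the three file systems.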

5 Conclusion

In general, you get better performance by not using any data deduplication solution, especially when virtual guests need to perform heavy write actions. However, in some specific cases it might actually prove more fruitful to use the deduplication features of ZFS. In cases where large numbers of virtual guests need to read data, change it and write it back, ZFS can give you better performance than, for instance, EXT4.

In most cases, ZFS outperforms LessFS, as can be seen in our results. We believe this is mostly because ZFS is a native file system implemented in kernel space, whereas LessFS is implemented in user space and requires the FUSE libraries. This extra layer adds overhead, which in turn results in slower performance.

The results clearly show that using a deduplication solution is very effective in saving storage: whether you store the same data 10 times or 30 times hardly affects the actual amount of storage used. If you correlate the test results with the amount of storage space saved, the performance of all the file systems drops considerably when more virtual guests run concurrently, which is expected. However, our results show no tangible evidence that the amount of storage consolidation actually influences the rate of decrease in performance for deduplication solutions.

We can argue that in an environment in which you have virtual guests with mixed functions and you want to save as much space as possible without losing too much performance, the best way to go is ZFS. ZFS saves more space than EXT4 and performs better than LessFS. Finally, in situations where disk performance is not as important as storage consolidation, LessFS is the better choice, as it is the most efficient deduplication solution for saving space.

6 Future work

Due to time limitations, not every aspect could be researched extensively. This section discusses some ideas for further and future research in this area.

Different hypervisor. In this study, we chose Xen as the main hypervisor to perform testing with. An interesting extension to this research would be a comparison with different hypervisors, such as VMware or KVM.

Replicated performance. In cases like failover clusters, where multiple virtual machines need to stay in sync and thus write the same data to disk, different results could be achieved. It might be interesting to take measurements while identical blocks of data are written to and read from the disks of the different virtual machines.

Different deduplication methods. It might be interesting to see the performance of other deduplication solutions, such as Nexenta [29]. We were not able to continue the measurements using OpenDedup because of recurring crashes. Perhaps in the future this particular solution will be more stable.

Multiple datasets. Using several different operating systems for the virtual machines will most likely result in different space usage across the file systems. Disk performance might also be affected by the different datasets that need to be kept in the deduplication databases.

High-end hardware. The hardware used in our setup is far from modern. It might be interesting to pursue the usage of modern, high-end hardware and the impact it has on the results.

ZFS user mode. A comparison between LessFS, which runs in user mode, and ZFS running in user mode. Although impractical, this would make for a fairer comparison.

A Server hardware

In the lab setups described in Section 3.1, five machines were used. The hardware specifications of the machines in the setup are listed below.

A.1 Storage server

One server acting as a storage backend using data deduplication on the shared storage volume.

Brand: Dell
Model: PowerEdge R210
CPU: Intel(R) Xeon(R) CPU 1.87GHz
Memory: 8GB
Hard disk: Western Digital WD5002ABYS-18B1B0 500GB (x2)
NIC: Embedded Broadcom NetXtreme II BCM5716 Gigabit Ethernet (x2)
Operating system: Ubuntu x64 & FreeBSD 9.0-RELEASE
Linux kernel: generic x86_64 & 9.0-RELEASE #0

A.2 Host server

A total of five servers acting as Xen hosts for the DomU virtual machines.

Brand: Dell
Model: PowerEdge 850
CPU: Intel(R) Pentium(R) D 3.00GHz
Memory: 2GB
Hard disk: Seagate ST AS 80GB (x2)
NIC: Embedded Broadcom NetXtreme BCM5721 Gigabit Ethernet (x2)
Operating system: Debian 6 (Squeeze) x64
Linux kernel: xen-amd64 x86_64

A.3 Virtual guest server

A maximum of 30 virtual guests were booted to perform measurements on.

Brand: Xen
Model: Version 4.1
vCPU: Intel(R) Pentium(R) D 3.00GHz
Memory: 128MB
Hard disk: 5GB iSCSI mount
NIC: Xen routed from Dom0
Operating system: Debian 6 (Squeeze) x64
Linux kernel: xen-amd64 x86_64

References

[1] Nagapramod Mandagere, Pin Zhou, Mark A. Smith, and Sandeep Uttamchandani. Demystifying data deduplication. In Proceedings of the ACM/IFIP/USENIX Middleware '08 Conference Companion, Companion '08, pages 12-17, New York, NY, USA. ACM.

[2] G. Goth. Virtualization: Old technology offers huge new potential. IEEE Distributed Systems Online, 8(2):3, February.

[3] Keren Jin and Ethan L. Miller. The effectiveness of deduplication on virtual machine disk images. In Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, SYSTOR '09, pages 7:1-7:12, New York, NY, USA. ACM.

[4] Data Domain - in-line deduplication. Website, com/pdf/datadomain-techbrief-inline-deduplication.pdf. [Online; consulted March 2012].

[5] Searchtarget - how data deduplication works. Website, How-data-deduplication-works. [Online; consulted March 2012].

[6] Dutch T. Meyer and William J. Bolosky. A study of practical deduplication. USENIX.

[7] NetApp - official homepage. Website. [Online; consulted March 2012].

[8] EMC - leading cloud computing, big data, and trusted IT solutions. Website. [Online; consulted March 2012].

[9] opendedup.org. OpenDedup official website. February 2012. [Online; consulted February 21, 2012].

[10] Mark Ruijter. lessfs - open source data de-duplication. lessfs.com, February 2012. [Online; consulted February 21, 2012].

[11] S3QL - a full-featured file system for online data storage. Website, http://code.google.com/p/s3ql/. [Online; consulted March 6, 2012].

[12] s3fs - FUSE-based file system backed by Amazon S3. google.com/p/s3fs/, February 2012. [Online; consulted February 21, 2012].

[13] Wikipedia.org - ZFS. Website. [Online; consulted March 2012].

[14] FreeBSD wiki - ZFS. Website. [Online; consulted March 2012].

[15] IOzone.org - IOzone PDF documentation. docs/iozone_msword_98.pdf, March 2012. [Online; consulted March 6, 2012].

[16] nixCraft - how to measure Linux filesystem I/O performance with IOzone. Website, linux-filesystem-benchmarking-with-iozone.html. [Online; consulted March 6, 2012].

[17] iostat Linux man page. Website. [Online; consulted March 6, 2012].

[18] hdparm - get/set ATA/SATA drive parameters under Linux. Website, http://sourceforge.net/projects/hdparm/. [Online; consulted March 6, 2012].

[19] fio website. Website. [Online; consulted March 6, 2012].

[20] Iperf website. Website. [Online; consulted March 6, 2012].

[21] Netperf official homepage. Website. [Online; consulted March 2012].

[22] Xen.org - Xen official homepage. Website. [Online; consulted March 2012].

[23] Wikipedia.org - sparse files. Website, Sparse_files. [Online; consulted March 2012].

[24] Kernel.org - EXT4 benchmark. Website, /ols2007v2-pages pdf. [Online; consulted March 2012].

[25] Linux benchmark suite homepage. Website, net/. [Online; consulted March 2012].

[26] Bonnie++ - official homepage. Website, bonnie++/. [Online; consulted March 2012].

[27] QuickLZ - official website. Website. [Online; consulted March 2012].

[28] LZO vs. LZJB in ZFS. Website. [Online; consulted March 2012].

[29] Nexenta - enterprise class storage for everyone. Website, nexenta.com/corp. [Online; consulted March 2012].


More information

Evaluation of Enterprise Data Protection using SEP Software

Evaluation of Enterprise Data Protection using SEP Software Test Validation Test Validation - SEP sesam Enterprise Backup Software Evaluation of Enterprise Data Protection using SEP Software Author:... Enabling you to make the best technology decisions Backup &

More information

Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team

Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture. Dell Compellent Product Specialist Team Dell Compellent Storage Center SAN & VMware View 1,000 Desktop Reference Architecture Dell Compellent Product Specialist Team THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL

More information

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build 164009

Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build 164009 Performance Study Performance Evaluation of VMXNET3 Virtual Network Device VMware vsphere 4 build 164009 Introduction With more and more mission critical networking intensive workloads being virtualized

More information

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays

Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Best Practices for Deploying SSDs in a Microsoft SQL Server 2008 OLTP Environment with Dell EqualLogic PS-Series Arrays Database Solutions Engineering By Murali Krishnan.K Dell Product Group October 2009

More information

Week Overview. Installing Linux Linux on your Desktop Virtualization Basic Linux system administration

Week Overview. Installing Linux Linux on your Desktop Virtualization Basic Linux system administration ULI101 Week 06b Week Overview Installing Linux Linux on your Desktop Virtualization Basic Linux system administration Installing Linux Standalone installation Linux is the only OS on the computer Any existing

More information

UBUNTU DISK IO BENCHMARK TEST RESULTS

UBUNTU DISK IO BENCHMARK TEST RESULTS UBUNTU DISK IO BENCHMARK TEST RESULTS FOR JOYENT Revision 2 January 5 th, 2010 The IMS Company Scope: This report summarizes the Disk Input Output (IO) benchmark testing performed in December of 2010 for

More information

my forecasted needs. The constraint of asymmetrical processing was offset two ways. The first was by configuring the SAN and all hosts to utilize

my forecasted needs. The constraint of asymmetrical processing was offset two ways. The first was by configuring the SAN and all hosts to utilize 1) Disk performance When factoring in disk performance, one of the larger impacts on a VM is determined by the type of disk you opt to use for your VMs in Hyper-v manager/scvmm such as fixed vs dynamic.

More information

Step by Step Guide To vstorage Backup Server (Proxy) Sizing

Step by Step Guide To vstorage Backup Server (Proxy) Sizing Tivoli Storage Manager for Virtual Environments V6.3 Step by Step Guide To vstorage Backup Server (Proxy) Sizing 12 September 2012 1.1 Author: Dan Wolfe, Tivoli Software Advanced Technology Page 1 of 18

More information

Performance Testing of a Cloud Service

Performance Testing of a Cloud Service Performance Testing of a Cloud Service Trilesh Bhurtun, Junior Consultant, Capacitas Ltd Capacitas 2012 1 Introduction Objectives Environment Tests and Results Issues Summary Agenda Capacitas 2012 2 1

More information

NEXENTA S VDI SOLUTIONS BRAD STONE GENERAL MANAGER NEXENTA GREATERCHINA

NEXENTA S VDI SOLUTIONS BRAD STONE GENERAL MANAGER NEXENTA GREATERCHINA NEXENTA S VDI SOLUTIONS BRAD STONE GENERAL MANAGER NEXENTA GREATERCHINA VDI Storage Challenge 95% of I/O is small, random writes Very challenging for a storage system End users demand low latency NexentaStor

More information

Online Remote Data Backup for iscsi-based Storage Systems

Online Remote Data Backup for iscsi-based Storage Systems Online Remote Data Backup for iscsi-based Storage Systems Dan Zhou, Li Ou, Xubin (Ben) He Department of Electrical and Computer Engineering Tennessee Technological University Cookeville, TN 38505, USA

More information

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering

DELL. Virtual Desktop Infrastructure Study END-TO-END COMPUTING. Dell Enterprise Solutions Engineering DELL Virtual Desktop Infrastructure Study END-TO-END COMPUTING Dell Enterprise Solutions Engineering 1 THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL

More information

Peter Senna Tschudin. Performance Overhead and Comparative Performance of 4 Virtualization Solutions. Version 1.29

Peter Senna Tschudin. Performance Overhead and Comparative Performance of 4 Virtualization Solutions. Version 1.29 Peter Senna Tschudin Performance Overhead and Comparative Performance of 4 Virtualization Solutions Version 1.29 Table of Contents Project Description...4 Virtualization Concepts...4 Virtualization...4

More information

Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments

Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments Leveraging EMC Fully Automated Storage Tiering (FAST) and FAST Cache for SQL Server Enterprise Deployments Applied Technology Abstract This white paper introduces EMC s latest groundbreaking technologies,

More information

IOmark-VM. DotHill AssuredSAN Pro 5000. Test Report: VM- 130816-a Test Report Date: 16, August 2013. www.iomark.org

IOmark-VM. DotHill AssuredSAN Pro 5000. Test Report: VM- 130816-a Test Report Date: 16, August 2013. www.iomark.org IOmark-VM DotHill AssuredSAN Pro 5000 Test Report: VM- 130816-a Test Report Date: 16, August 2013 Copyright 2010-2013 Evaluator Group, Inc. All rights reserved. IOmark-VM, IOmark-VDI, VDI-IOmark, and IOmark

More information

IMPLEMENTING GREEN IT

IMPLEMENTING GREEN IT Saint Petersburg State University of Information Technologies, Mechanics and Optics Department of Telecommunication Systems IMPLEMENTING GREEN IT APPROACH FOR TRANSFERRING BIG DATA OVER PARALLEL DATA LINK

More information

Cloud Simulator for Scalability Testing

Cloud Simulator for Scalability Testing Cloud Simulator for Scalability Testing Nitin Singhvi (nitin.singhvi@calsoftinc.com) 1 Introduction Nitin Singhvi 11+ Years of experience in technology, especially in Networking QA. Currently playing roles

More information

POSIX and Object Distributed Storage Systems

POSIX and Object Distributed Storage Systems 1 POSIX and Object Distributed Storage Systems Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph by Michael Poat, Dr. Jerome

More information

Windows Server 2008 R2 Hyper-V Live Migration

Windows Server 2008 R2 Hyper-V Live Migration Windows Server 2008 R2 Hyper-V Live Migration Table of Contents Overview of Windows Server 2008 R2 Hyper-V Features... 3 Dynamic VM storage... 3 Enhanced Processor Support... 3 Enhanced Networking Support...

More information

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop

VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent

More information

Leveraging NIC Technology to Improve Network Performance in VMware vsphere

Leveraging NIC Technology to Improve Network Performance in VMware vsphere Leveraging NIC Technology to Improve Network Performance in VMware vsphere Performance Study TECHNICAL WHITE PAPER Table of Contents Introduction... 3 Hardware Description... 3 List of Features... 4 NetQueue...

More information

Benchmarking Cassandra on Violin

Benchmarking Cassandra on Violin Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract

More information

Reference Architecture for a Virtualized SharePoint 2010 Document Management Solution A Dell Technical White Paper

Reference Architecture for a Virtualized SharePoint 2010 Document Management Solution A Dell Technical White Paper Dell EqualLogic Best Practices Series Reference Architecture for a Virtualized SharePoint 2010 Document Management Solution A Dell Technical White Paper Storage Infrastructure and Solutions Engineering

More information

VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS

VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS Successfully configure all solution components Use VMS at the required bandwidth for NAS storage Meet the bandwidth demands of a 2,200

More information

SAN Acceleration Using Nexenta VSA for VMware Horizon View with Third-Party SAN Storage NEXENTA OFFICE OF CTO ILYA GRAFUTKO

SAN Acceleration Using Nexenta VSA for VMware Horizon View with Third-Party SAN Storage NEXENTA OFFICE OF CTO ILYA GRAFUTKO SAN Acceleration Using Nexenta VSA for VMware Horizon View with Third-Party SAN Storage NEXENTA OFFICE OF CTO ILYA GRAFUTKO Table of Contents VDI Performance 3 NV4V and Storage Attached Network 3 Getting

More information

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Jonathan Halstuch, COO, RackTop Systems JHalstuch@racktopsystems.com Big Data Invasion We hear so much on Big Data and

More information