Large Installation Administration: Comparing open source deduplication performance for virtual machines
- Kenneth Parrish
System & Network Engineering, Large Installation Administration
Comparing open source deduplication performance for virtual machines
Authors:
Supervisor: Jaap van Ginkel
University of Amsterdam
Abstract

The research presented in this paper shows in which specific situations the most efficient use of storage resources can be achieved using the open source deduplication solutions LessFS and ZFS, while maintaining an acceptable level of disk performance. This study specifically targets the disk performance of virtual guests whose disks are stored on the deduplicated file systems. The results show that optimal performance is achieved by not using any data deduplication solution, especially when virtual guests need to perform heavy write actions. However, if large amounts of storage are to be saved, only a small performance penalty is paid when using ZFS as the deduplication solution. Furthermore, the research shows that ZFS is the best choice of open source deduplication solution, as it performs best in most of the test cases while still maintaining a high level of storage consolidation. Finally, the research shows the effectiveness of both deduplication solutions in storage consolidation; LessFS proves the most efficient at this task.
Contents

1 Introduction
  1.1 Data deduplication
2 Research
  2.1 Research questions
  2.2 Literature review
  2.3 Open source solutions
  2.4 Performance measurement tools
  2.5 Disk performance determining factors
3 Experiments
  3.1 Lab setup
  3.2 Storage measurements
4 Results
  4.1 Sequential write
  4.2 Sequential read
  4.3 Random rewrite
  4.4 Random seeks
  4.5 Storage consolidation
5 Conclusion
6 Future work
A Server hardware
  A.1 Storage server
  A.2 Host server
  A.3 Virtual guest server
1 Introduction

In an ever growing and competitive hosting market, keeping costs down is one of the more important factors. It is therefore important to make the most efficient use of resources while keeping the same level of performance. With the increase in bandwidth speeds, larger data files and growing storage requirements, a solution for efficient and fast data storage is needed. To this end, data deduplication [1] has gained enormous popularity as an efficient way of storing data.

1.1 Data deduplication

Storage systems are often dedicated to storing large amounts of data, which they keep on their file systems in the form of blocks of a certain size. For instance, take a system administrator who has configured nightly backup jobs for all the desktop machines in his organization, to be stored on a central backup storage system. Most likely, each backup will contain many similar data files, each of which requires the same amount of storage space every time it is stored on the storage system holding all the backups of the organization.

An example of efficient resource utilization is virtualization [2]. Many bare-metal systems that provide only a single service, e.g. a DNS or mail server, often do not fully utilize the available resources. By using virtualization, a bare-metal server can run multiple virtual operating systems simultaneously. From a user point of view, each OS appears isolated and standalone. Effectively, one system can be used to create multiple virtual systems, increasing the utilization of the bare-metal server's resources. However, each virtualized system still requires storage for its operating system files, which are often kept on a central storage repository accessible over the network. In scenarios where the same OS is virtualized more than once, the same files are likely to be stored more than once as well. According to Keren Jin and Ethan L. Miller [3], using data deduplication for storing virtual disks helps considerably in consolidating storage. However, practical data deduplication implementations and virtual guest performance have not been researched.

Data deduplication is a method to eliminate duplicate copies of the same data blocks, which can be used to reduce the required amount of space in storage systems like the ones described in the examples above. There are different types of data deduplication, which we will discuss further in this paper. By using data deduplication techniques on a storage system, data that is stored on it is checked for duplicate blocks. When identical data blocks are found, the system creates a pointer to the data block already present on the system. This approach only requires the storage space for one copy of the data, together with some overhead to store the pointers.
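The chunk-and-hash principle behind block-level deduplication can be illustrated with standard GNU tools: split a file into fixed-size chunks, hash each chunk, and count how many distinct chunks remain. This is a sketch of ours, not a tool used in the research; the file paths and the 4 KB chunk size are arbitrary.

```shell
#!/bin/sh
# Illustrative sketch (not from the paper): fixed-size chunk deduplication.
# Build a file containing the same 4 KB block 100 times, then count how
# many of its 4 KB chunks are actually distinct.
head -c 4096 /dev/urandom > /tmp/block.bin
for i in $(seq 100); do cat /tmp/block.bin; done > /tmp/dedup-demo.bin

total=$(( $(stat -c %s /tmp/dedup-demo.bin) / 4096 ))
# split feeds each 4 KB chunk to sha1sum; identical chunks hash identically
unique=$(split -b 4096 --filter='sha1sum' /tmp/dedup-demo.bin | sort -u | wc -l)
echo "chunks: $total, unique: $unique"
```

A deduplicating storage system would only need to store the one unique chunk, plus a pointer for each of the other 99 occurrences.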
Types of deduplication

Data deduplication differs from regular compression algorithms in the sense that it works on the block level of a storage device, instead of on the files themselves. As there are many more data blocks than there are data files, it is far more likely that blocks are duplicated than that whole files are. This potentially allows more duplicated data to be eliminated. A downside is that the process itself is CPU intensive, as new data needs to be matched against previously stored data.

There are different forms of deduplication that can be used in various scenarios. The usage of each type depends on the storage and network setup and requirements. We can distinguish between the location, the time and the method of deduplication.

Location

The location where the deduplication is done is an important factor to take into account. For instance, when a limited amount of bandwidth is available, the maximum amount of time a backup can take increases. The choice of where deduplication takes place therefore depends on the amount of bandwidth available. Depending on how the storage infrastructure is designed, choices have to be made on how to utilize it. Deduplication can be applied at the source where the data originates, or at the target where it is stored. In the example above, applying source deduplication can shorten the backup time, as there is less data to transfer to the backup system. With target deduplication, more bandwidth is required, and the target storage system initially requires more disk space than with source deduplication.

Time

A high system load is undesirable at times when a storage system is actively used. The time at which deduplication is performed can influence the performance of a storage system and can be noticed by its direct or indirect users. Depending on the function of the storage system, it may be wise to let the deduplication algorithms run when the system is not actively used, for instance at night.

Method

The method of deduplication defines what type of deduplication technique is performed on the data to reduce the amount of space it consumes.

Deduplication methods

There are different deduplication methods for how the actual data is stored and how it is reduced in size. The most common methods are described below.

In-line deduplication is a form of source deduplication where the deduplication mechanism examines the data as it is received (e.g. via the network).
If a duplicate block of data is detected, a new pointer to the existing block is created. Besides the pointer, generally no data is written to the storage system. A downside is that this slows down the system when a large amount of data is being received [4].

Out-of-line deduplication is, as the name suggests, the opposite of in-line deduplication: deduplication is performed at the target data storage. This method is also known as post-process deduplication. The data is first written to disk and deduplicated at a later time, for example to avoid a loss in performance during business hours. With this form of deduplication, more storage space is required to store the initial data before it is deduplicated [4].

File-based deduplication, commonly known as Content Addressable Storage (CAS) [5], is a deduplication method at the file level instead of the block level.

The methods described above each have their own pros and cons. Which one to implement depends on the requirements and utilization of a storage system. Deduplication impacts the write performance of a system, so this is an important aspect to take into account when choosing the most suitable method.
2 Research

Our research focuses on the use of different open source data deduplication solutions as storage platforms for virtual disks, and on the amount of storage that can be consolidated. Furthermore, we want to gain insight into the impact on the performance of virtual guests whose virtual disks are stored on a deduplicated storage platform.

2.1 Research questions

Based on the description above, we defined the following research question:

What is the impact on virtual guest disk performance when using open source data deduplication solutions?

The following subquestions will help to answer the main research question:

What is the amount of storage saved when a deduplication mechanism is applied?
How do increasing amounts of storage consolidation influence virtual disk performance?

2.2 Literature review

Prior to the start of this study, we looked into existing research related to the subject. Keren Jin and Ethan L. Miller researched the effectiveness of data deduplication of virtual disks in their paper The Effectiveness of Deduplication on Virtual Machine Disk Images [3]. They found that using data deduplication on virtual disks can save about 80% of storage. However, their methods are purely conceptual and proof of concept: they treat each virtual disk as a byte stream and separate the stream into chunks of a specific size. For each chunk they calculate a SHA-1 hash, and chunks with the same SHA-1 hash are considered duplicates of each other. Whether the same percentage of storage saving can be achieved in practice is not clear.

A more practical approach to the effectiveness of data deduplication is taken by Dutch T. Meyer and William J. Bolosky in A Study of Practical Deduplication [6]. Their results show that data deduplication is very effective within a large commercial company.
However, they do not look at the use of data deduplication in virtualized environments and only consider Windows-based desktop machines. Their research shows the effectiveness of data deduplication on a large set of data spread over different systems, which is relevant for our research.
2.3 Open source solutions

Companies such as NetApp [7] and EMC [8] provide storage solutions that are often very costly to implement in a business. Such solutions are not affordable for every company, as they are proprietary and also require trained engineers or a support contract. Although the latter two can also be desirable when using an open source solution, an open source solution can be a more viable option. In this study, we specifically look into open source implementations of deduplication mechanisms. A few of the available open source deduplication solutions are discussed below.

SDFS is an open source deduplication solution by OpenDedup. It is capable of performing in-line deduplication [9] on the files stored in its file system. It is a cross-platform solution with support for multiple, distributed storage nodes. OpenDedup provides support for in-line and out-of-line (batch) deduplication. They claim to be an open source enterprise deduplication platform able to perform deduplication at line speeds of 1 Gigabyte per second or faster. The official homepage contains well documented guides on how to set up an OpenDedup file system on a Linux system. OpenDedup could be an interesting choice for this research, as it provides an easy setup and promises great performance.

LessFS provides in-line data deduplication using FUSE [10]. LessFS is a userspace deduplication solution with support for in-line deduplication, compression and encryption. The official LessFS website focuses on usability by providing easy to follow tutorials on how to set up a simple LessFS-based file system on a Linux system using common tools. The suggested usability of LessFS makes this solution a very viable candidate for our research.

s3ql is a source-based deduplication solution that deduplicates data before sending it to Amazon S3 storage buckets in the cloud [11].
A similar approach is taken by s3fs [12], but it does not provide local deduplication. Both solutions sound very interesting, especially from a cost-saving point of view. However, cost saving is not the prime focus of this research, so we do not investigate them further.

ZFS is a well-known file system with built-in data deduplication. ZFS is a robust and reliable file system originally introduced with the OpenSolaris operating system. It has support for large storage volumes of up to 16 Exabytes, support for file and folder snapshots, and is able to continuously perform file integrity checking [13] to prevent data corruption. More recently, native support for ZFS file systems was included in the FreeBSD operating system [14]. We think ZFS is also interesting to include in our research, as it is a popular file system and has built-in support for deduplication.
2.4 Performance measurement tools

To perform the actual measurements, we researched existing tools able to aid us in the measurements.

iozone is a file system benchmark tool [15]. It can simulate various workloads (or IO operations) on file systems to measure their real-life performance [16].

iostat is a tool similar to iozone. It is mostly used on local storage or NFS shares to monitor and measure IO performance [17].

hdparm is a utility to view and change hard drive parameters to obtain optimal drive performance. It can also be used to perform disk throughput tests to measure disk performance [18].

bonnie++ is essentially a rewrite, in the C++ programming language, of a similar tool called bonnie. It has many features to perform the disk performance measurements we require.

fio is a popular tool to measure storage performance [19]. It offers a comprehensive feature set, such as 13 different IO engines, multiple IO priorities (on newer Linux kernels) and multi-threading.

iperf can measure the maximum performance and bandwidth of TCP and UDP based connections [20]. A client-server model is used to perform measurements on the network performance. For a specified amount of time, a traffic stream is generated between an iperf client and server, and the network throughput is reported upon completion. Although this tool cannot be used to measure IOPS, it is still useful for measuring raw network throughput. A similar tool is Netperf [21].
2.5 Disk performance determining factors

How a disk performs in a storage system depends on certain factors. The hardware used in the system is obviously the most important factor. While hardware has its limitations, software can improve the performance of the system. The most important factors that can influence disk performance in a storage system are the following:

CPU (clock speed, L3 cache size)
Storage device (SSD, HDD, PCI)
Storage configuration (RAID level)
Caching mechanisms (RAM)
File system (block size, journaling)
IO scheduler (type of algorithm and configuration)
Kernel (version, optimizations)
3 Experiments

We created a test plan to be able to obtain consistent test results. The defined test procedures were applied to the different lab setups using different configuration parameters. This method of consistent testing yields comparable measurements.

3.1 Lab setup

To perform measurements, we set up a lab environment. This test environment consists of a total of six Dell servers which are used during the experiments. The specific models and hardware specifications are described in Appendix A. Of the six available machines, five are used as hosts running multiple virtual guest machines. These virtual guests are installed with a default x64 installation of Debian Squeeze. On top of this, the virtualization software Xen is used [22], which enables us to run multiple operating systems simultaneously. We chose Xen because it is easy to set up, widely used, has a large community and is an enterprise-ready virtualization solution. Xen is installed from the Ubuntu repository using the xen-hypervisor-4.1-amd64 package. The sixth machine is used as a shared storage server on which we install the deduplication software. This server shares its storage over the network with the five host machines.

Initial network test

First, we want to test the performance of the network to confirm its reliability, stability and speed. This is important so we can be sure that the network is not a limiting factor in achieving our results. Accessing a shared storage resource over a network results in a slight performance overhead. Our initial network test is done using the benchmark tool iperf, which we described in Section 2.4. The iperf test was performed without any tweaks to the TCP/IP stack. The test performed is shown below.

iperf command.

hosta:~# iperf -s
hostb:~# iperf -c hosta

An average network speed of 939 Mbits/sec is measured using the shown method to set up a server and client connection. The test generates traffic from memory,
so disk performance is not measured or taken into account during this test. The results show a near line-speed network bandwidth, which demonstrates a stable network.

3.2 Storage measurements

Storage setup

As we want to perform measurements using a shared storage system, we have to set up a server that provides these services. We chose the iSCSI protocol to share the file system on the shared storage server with the different virtual host machines. For every virtual guest in the lab setup, we create a separate sparse file containing the operating system. Normally, when storing files on disk, the size of a file can be requested by issuing a specific system call to the operating system. A special way of storing files is to do so sparsely: sparse files differ from normal files in that empty parts of the file are not actually stored on disk, but are represented using metadata, thus saving valuable storage space on the file system [23].

Each sparse file belonging to a separate virtual guest is represented as an iSCSI volume (LUN), which is mounted on the host where the virtual guest runs. Each of the five hosts mounts the shared iSCSI LUNs to provide the hosted virtual guests with disk space. Each virtual guest has a separate physical hard disk of 5 Gigabytes configured in its Xen configuration file, representing the actual iSCSI LUN mounted on the host. This setup enables the virtual guest to boot directly from the iSCSI LUN located on the shared storage server. The setup is visualized in Figure 1.

The first step is to perform measurements on the lab setup without any deduplication solution running on the shared storage server. The storage server is installed with the EXT4 file system. We chose EXT4 as the base file system for the storage server because it is a modern file system with great performance, as benchmarked in [24].
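The gap between a sparse file's apparent size and its allocated size can be demonstrated with standard GNU tools. This is a sketch of ours; the path and the 5 GB size are arbitrary.

```shell
#!/bin/sh
# Create a 5 GB sparse file: the apparent size is 5 GB, but since the file
# is one large hole, almost no blocks are actually allocated on disk.
truncate -s 5G /tmp/sparse-demo.img

apparent=$(stat -c %s /tmp/sparse-demo.img)                # apparent size in bytes
allocated=$(( $(stat -c %b /tmp/sparse-demo.img) * 512 ))  # allocated bytes (512-byte blocks)
echo "apparent: $apparent bytes, allocated: $allocated bytes"
rm -f /tmp/sparse-demo.img
</antml```

Tools that read the apparent size (`stat -c %s`, plain `ls -l`) report 5 GB, while block-aware tools (`du`, `stat -c %b`) report close to zero; this distinction is exactly what separates the "reported" and "real" usage figures later in this paper.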
By performing measurements on the EXT4 file system, we gain insight into the baseline performance of the lab setup. This baseline in turn enables us to compare the setup against the deduplication solutions. To compare the baseline performance of the lab setup with the deduplicated setup, we have to install multiple open source deduplication solutions. We chose two of the solutions discussed in Section 2.3 to perform measurements on.
[Figure 1: Lab setup. DomU guests on Dom0 hosts, connected over an iSCSI connection to the file system and disks of the storage backend.]

Initially we selected OpenDedup's SDFS and the ZFS file system. However, we were unable to maintain stability when performing measurements on the SDFS solution under high load. As time is a limiting factor for this project, we chose to use ZFS, in a separate installation of FreeBSD 9, and LessFS instead of SDFS. LessFS and ZFS seem to be two of the most popular open source deduplication solutions that perform in-line deduplication. LessFS runs in user space, while ZFS runs in kernel space (a native implementation) in the FreeBSD 9 installation. Currently, ZFS on FreeBSD is the only native file system with deduplication capabilities. While both solutions fundamentally differ in the way they are run, we believe it is still very interesting to compare them. The drawback of running an application in user space is that more context switching is required to switch between user and kernel space. Some might argue that comparing a user space solution to a kernel space solution is unfair. While this may be true in essence, we believe the comparison is still valid, as real world practice shows that, mostly due to management constraints, using FreeBSD in a strict Linux environment is often not a possibility. This results in the need for a different solution, which currently means a user space implementation is the only option.
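For reference, deduplication in ZFS is enabled as a per-dataset property. The following is a sketch of how a deduplicating, LZJB-compressed dataset could be set up on a FreeBSD system; the pool name, dataset name and disk device are hypothetical and not taken from our lab configuration.

```shell
# Hypothetical pool and device names; run as root on a ZFS-capable system.
zpool create tank /dev/da1               # create a pool on a single disk
zfs create tank/vmstore                  # dataset that will hold the VM images
zfs set dedup=on tank/vmstore            # enable in-line block-level deduplication
zfs set compression=lzjb tank/vmstore    # LZJB compression, as used in our pool
zfs get dedup,compression,compressratio tank/vmstore   # verify the settings
```

Because ZFS deduplication is in-line, every block written to tank/vmstore is checksummed and matched against the deduplication table at write time; no post-process step is needed.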
Lab measurements

The measurements are conducted on the lab setup running 10, 20 and 30 virtual guests simultaneously. By using increasing numbers of virtual guests, we are able to compare the differences in load across every measurement. In turn, the results can be used to draw conclusions on which solution performs best and on what to take into account when deploying deduplication in a similar way. To be able to calculate average values, each measurement is performed a total of three times using the bonnie++ tool, which we briefly discussed in Section 2.4. The reason for using bonnie++ for benchmarking is that it is a widely used tool for measuring disk performance, able to run all of the benchmarks we require for this research. We used version 1.96 of bonnie++.

To efficiently perform measurements on the lab setup, we had to automate the deployment and bootstrapping of the virtual guests on the host machines over iSCSI. Scripts were created to automate this. The automation consists of two parts. The first script acts as a central control script on the storage server, which in turn contacts a second script on the hosts using SSH public keys for password-less login. This second script dynamically mounts the iSCSI LUNs required for the number of virtual guests we want to boot on the lab setup. To start or stop a virtual environment of 30 virtual guests running on our lab setup, we issue the following commands on the control server:

Control commands to boot and stop the virtual lab environment.

control_server:~# ./lab-control.sh start 30
control_server:~# ./lab-control.sh stop 30

The 30 virtual guests are equally distributed over the five host servers. The control script contacts all five hosts to dynamically mount the iSCSI LUNs located on the storage server and boots the Xen virtual guests on each respective host.
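The round-robin distribution performed by the control script can be sketched as follows. The host names and the per-guest boot command are assumptions for illustration, not the actual scripts.

```shell
#!/bin/sh
# Sketch of the guest-distribution step of a script like lab-control.sh:
# spread N virtual guests round-robin over the five host servers.
# Host names and the ssh command are hypothetical.
distribute() {
    n=$1
    vm=1
    while [ "$vm" -le "$n" ]; do
        for h in host1 host2 host3 host4 host5; do
            [ "$vm" -gt "$n" ] && break
            echo "guest$vm $h"
            # the real script would run something like, per guest:
            #   ssh "$h" "/guest-control.sh start guest$vm"
            vm=$((vm + 1))
        done
    done
}

distribute 30
```

With 30 guests and five hosts, each host ends up booting exactly six guests, keeping the load on the hosts even so that differences in the measurements can be attributed to the storage backend.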
The scripts are available on request.

Measuring the performance

There are multiple tools available for Linux distributions that are suitable for performance measurements [25]. In Section 2.4, we looked into several of the available tools able to test hard drive and system performance [26].
Out of these tools, we found bonnie++ the most suitable for our experiments. Bonnie++ performs four types of tests, which can be divided into three categories: sequential output, sequential input and random seeks. Some of these tests require data to be present on the disk before they can be performed. For each of these tests, bonnie++ first writes a file of random data to disk.

Sequential output

The sequential output category consists of two tests. The first test writes blocks of data sequentially to disk. The second test in this category is the rewrite test. This test seeks, 8000 times in parallel, to a part of the generated file and reads it into memory. In addition, in 10% of the cases, the read data is changed and written back to disk. Both of these tests are aimed at measuring how fast files can be written to disk.

Sequential input

In the sequential input category, the test performed is the sequential reading of data from disk into memory. Effectively, this test measures how fast files can be read from disk into memory so they can be used by the user or arbitrary applications.

Random seeks

Finally, the random seeks category contains a test for randomly reading data from the disk in parallel. This test focuses on measuring how fast data can be read from different parts of the disk simultaneously [26].
We used the following bonnie++ command, in combination with standard GNU tools, to get the desired output.

Bonnie++ command.

$ /usr/sbin/bonnie++ -n 0 -u 0 -r $(free -m | grep Mem: | awk '{print $2}') \
    -s $(echo "scale=0; $(free -m | grep Mem: | awk '{print $2}') * 2" | bc -l) \
    -f -b -d /tmp/ > /tmp/bonnie.output

The command does the following:

Command / Argument      Description
/usr/sbin/bonnie++      Run the bonnie++ tool.
-n 0                    Disable the file creation tests.
-u 0                    Set the UID to 0 (root).
-r $(free -m ...)       Specify the amount of memory in Megabytes.
-s $(echo ...)          File size for the IO performance tests: the memory size in MB times 2.
-f                      Skip per-character IO tests.
-b                      Don't use write buffering; fsync() after every write.
-d                      The directory to use for the tests.

Table 1: Description of the bonnie++ arguments used.

When running bonnie++ in parallel with the parameters mentioned in Table 1, it writes small files, similar to a real-world scenario. We skip per-character IO writes, as we want to perform sequential writes to the storage system. By using files twice as large as the maximum amount of memory, in combination with disabled write buffering, we completely bypass the operating system's caching mechanisms.
4 Results

4.1 Sequential write

Figure 2 shows the average performance of a virtual guest when 10, 20 or 30 virtual guests are writing to disk simultaneously. These results are as expected, since writing to an EXT4 file system requires fewer actions before the data can be stored on disk. When a deduplication solution is used, overhead in the form of deduplication and compression is introduced, which makes writing slower compared to EXT4.

[Figure 2: Average write performance (KByte/s) for EXT4, ZFS and LessFS at 10, 20 and 30 VMs.]

4.2 Sequential read

For the results of the read tests, we can take a look at Figure 3. The results gathered are not as expected and show inconsistent behavior when these tests are performed with different numbers of virtual guests. Interestingly, however, both deduplication solutions gradually perform worse in contrast to EXT4 when more virtual guests are used. The data that needs to be read is stored in a compressed state on disk and is therefore read into memory faster, since less data needs to be read. The decompression that follows is needed to obtain the complete set of data; this process is solely a CPU intensive task. However, when 20 or 30 virtual guests are used, we see a rapid decline in the performance of the deduplication solutions. This is because the CPU of the storage backend now becomes the bottleneck for these same decompression actions. We therefore believe that the reason for the better performance of the deduplication solutions when tests are run for 10 virtual guests is the compression employed by both deduplication solutions.

[Figure 3: Average read performance (KByte/s) for EXT4, ZFS and LessFS at 10, 20 and 30 VMs.]

4.3 Random rewrite

The results of the rewrite tests are shown in Figure 4. In this graph, you can immediately see that LessFS is outperformed by ZFS and EXT4. Furthermore, ZFS performs almost as well as EXT4, or even better. The reason ZFS actually performs better with 20 or 30 virtual guests can be attributed to its use of dynamic block sizes of up to 128 Kilobytes [13]. When this test is conducted with 10 virtual guests, the size of the blocks used in EXT4 and ZFS has no real impact, as enough disk resources are available to accommodate all the disk access requests of the virtual guests. For 20 and 30 virtual guests, however, this is no longer the case. Because ZFS uses a dynamic block size, smaller blocks can be read, changed and written faster compared to the fixed-size blocks of EXT4 and the even larger blocks of LessFS.

[Figure 4: Average rewrite performance (KByte/s) for EXT4, ZFS and LessFS at 10, 20 and 30 VMs.]

4.4 Random seeks

As mentioned in Section 3.2.3, the seek test tries to seek to a part of the test file in parallel. It does this 8000 times, and in 10% of the cases the read data is changed and written back. The results of this test can be seen in Figure 5. The results are as expected, because reading random data from disk, and sometimes writing data back to disk, is directly affected by the number of actions needed to complete these tasks. We believe that mainly the decompression action is the bottleneck, as trying to decompress something 8000 times in parallel is a very CPU intensive task. Also interesting to see is that LessFS structurally outperforms ZFS for the first time, albeit not by much. This is most likely because the compression algorithm used by LessFS is a little faster at decompressing data than the algorithm used by ZFS. LessFS uses the QuickLZ algorithm to apply additional compression on top of the deduplication. The QuickLZ developers claim it is the world's fastest compression library, reaching over 358 Megabytes per second in decompression throughput [27]. The ZFS pool we configured uses the LZJB compression algorithm, which we believe decompresses more slowly than QuickLZ. We base this belief on [28], in which LZJB is compared against LZO, the latter being significantly slower than QuickLZ. It is shown that the decompression rates of LZJB are relatively poor, especially for smaller blocks of data.

[Figure 5: Average seek performance (seeks/s) for EXT4, ZFS and LessFS at 10, 20 and 30 VMs.]

4.5 Storage consolidation

Figures 6, 7 and 8 show the reported usage and the actual file system usage. As discussed in Section 3.2.1, sparse files have been used, and in the case of the EXT4 file system they are interpreted correctly by the operating system. Therefore, we see the same value for both the reported and the actual size for EXT4. The most interesting aspect of these results is the usage reported by LessFS. LessFS does not take the sparse properties of the files into account and therefore reports the full size of each file. ZFS, on the other hand, does take the sparse properties into account and additionally accounts for the compression used. This results in a reported size just below that reported by the EXT4 file system.

[Figure 6: Used space (MBytes), reported vs. real, for 10 virtual guests.]

For a better look at the amounts of space reported, used and saved, we can take a look at Table 2. This table shows that LessFS actually consolidates more storage than ZFS and EXT4.
Figure 7: Used space for 20 virtual guests (MBytes, reported vs. real, per deduplication method).

Figure 8: Used space for 30 virtual guests (MBytes, reported vs. real, per deduplication method).

           10 VM                20 VM                30 VM
           Real      Reported   Real      Reported   Real      Reported
EXT4       4594,…    …          …         …          …         …,68
LessFS     251,…     …          …         …          …         …,377
ZFS        …         …          …         …          …         …,535

Table 2: Detailed storage consumption (in MB)
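The real-usage column of Table 2 can be reduced to a single space-savings percentage per solution. A small sketch, taking the surviving 10-guest figures from Table 2 (roughly 251 MB real usage for LessFS against roughly 4594 MB for the EXT4 baseline):

```python
def space_savings(real_mb, baseline_mb):
    """Storage saved, as a percentage of the non-deduplicated baseline."""
    return (1.0 - real_mb / baseline_mb) * 100.0

# Real usage for 10 guests from Table 2: LessFS stores the ten guest
# images in roughly 251 MB against an EXT4 baseline of roughly 4594 MB.
print("LessFS savings: %.1f%%" % space_savings(251.0, 4594.0))
# -> LessFS savings: 94.5%
```

The same function applied to the ZFS row yields the slightly smaller savings that the table hints at, consistent with LessFS being the most space-efficient of the three.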
5 Conclusion

In general, you get better performance when not using any data deduplication solution, especially when virtual guests need to perform heavy write actions. However, in some specific cases it may actually prove more fruitful to use the deduplication features of ZFS. In cases where large numbers of virtual guests need to read data, change it and write it back, ZFS can deliver better performance than, for instance, EXT4. In most cases, ZFS outperforms LessFS, as can be seen in our results. We believe this is mostly because ZFS is a native file system implemented in kernel space, whereas LessFS is implemented in user space and requires the FUSE libraries. This extra layer adds overhead, which in turn results in slower performance. The results clearly show that using a deduplication solution is very effective in saving storage: it hardly matters for the actual amount of storage used whether the same data is stored 10 times or 30 times. Furthermore, if we correlate the test results with the amount of storage space saved, the performance of all the file systems drops considerably when more virtual guests run concurrently, which is expected. However, our results show no tangible evidence that the amount of storage consolidation actually influences the rate of decrease in performance for deduplication solutions. We argue that in an environment with virtual guests serving mixed functions, where you want to save as much space as possible without losing too much performance, the best way to go is ZFS: it saves more space than EXT4 and performs better than LessFS. Finally, in situations where disk performance is less important than storage consolidation, LessFS is the better choice, as it is the most efficient deduplication solution for saving space.
6 Future work

Due to time limitations, not every aspect could be researched extensively. This section discusses some ideas for further research in this area.

Different hypervisor. In this study, we chose Xen as the main hypervisor to perform testing with. An interesting extension to this research would be a comparison with different hypervisors, such as VMware or KVM.

Replicated performance. In cases like failover clusters, where multiple virtual machines need to stay in sync and thus write the same data to disk, different results could be achieved. It might be interesting to see the measurements when identical blocks of data are written to and read from the disks of the different virtual machines.

Different deduplication methods. It might be interesting to see the performance of other deduplication methods, such as Nexenta [29]. We were not able to continue the measurements using OpenDedup because of recurring crashes. Perhaps this particular solution will be more stable in the future.

Multiple datasets. Using a variety of operating systems for the virtual machines will most likely result in different space usage across the file systems. Disk performance might also be affected by the different datasets that need to be kept in the deduplication databases.

High-end hardware. The hardware used in our setup is far from modern. It might be interesting to repeat the measurements on modern, high-end hardware and study the impact on the results.

ZFS user mode. A comparison between LessFS, which runs in user mode, and ZFS also running in user mode. Although impractical, it would make for a fairer comparison.
A Server hardware

In the lab setups described in Section 3.1, five machines were used. The hardware specifications of the machines in the setup are listed below.

A.1 Storage server

One server acting as a storage backend using data deduplication on the shared storage volume.

Brand: Dell
Model: PowerEdge R210
CPU: Intel(R) Xeon(R) CPU 1.87GHz
Memory: 8GB
Hard disk: Western Digital WD5002ABYS-18B1B0 500GB (x2)
NIC: Embedded Broadcom NetXtreme II BCM5716 Gigabit Ethernet (x2)
Operating system: Ubuntu x64 & FreeBSD 9.0-RELEASE #0
Linux kernel: generic x86_64 & 9.0-RELEASE #0

A.2 Host server

A total of five servers acting as Xen hosts for the DomU virtual machines.

Brand: Dell
Model: PowerEdge 850
CPU: Intel(R) Pentium(R) D 3.00GHz
Memory: 2GB
Hard disk: Seagate ST AS 80GB (x2)
NIC: Embedded Broadcom NetXtreme BCM5721 Gigabit Ethernet (x2)
Operating system: Debian 6 (Squeeze) x64
Linux kernel: xen-amd64 x86_64
A.3 Virtual guest server

A maximum of 30 virtual guests were booted to perform measurements on.

Brand: Xen
Model: Version 4.1
vCPU: Intel(R) Pentium(R) D 3.00GHz
Memory: 128MB
Hard disk: 5GB iSCSI mount
NIC: Xen routed from Dom0
Operating system: Debian 6 (Squeeze) x64
Linux kernel: xen-amd64 x86_64
References

[1] Nagapramod Mandagere, Pin Zhou, Mark A. Smith, and Sandeep Uttamchandani. Demystifying data deduplication. In Proceedings of the ACM/IFIP/USENIX Middleware '08 Conference Companion, Companion '08, pages 12-17, New York, NY, USA. ACM.

[2] G. Goth. Virtualization: old technology offers huge new potential. IEEE Distributed Systems Online, 8(2):3, February.

[3] Keren Jin and Ethan L. Miller. The effectiveness of deduplication on virtual machine disk images. In Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, SYSTOR '09, pages 7:1-7:12, New York, NY, USA. ACM.

[4] Data Domain - in-line deduplication. Website, com/pdf/datadomain-techbrief-inline-deduplication.pdf. [Online; consulted March 2012].

[5] Searchtarget - how data deduplication works. Website, How-data-deduplication-works. [Online; consulted March 2012].

[6] Dutch T. Meyer and William J. Bolosky. A study of practical deduplication. USENIX.

[7] NetApp - official homepage. Website. [Online; consulted March 2012].

[8] EMC - leading cloud computing, big data, and trusted IT solutions. Website. [Online; consulted March 2012].

[9] opendedup.org. OpenDedup official website. February. [Online; consulted February 21, 2012].

[10] Mark Ruijter. lessfs - open source data de-duplication. lessfs.com, February. [Online; consulted February 21, 2012].

[11] s3ql - a full-featured file system for online data storage. Website, http://code.google.com/p/s3ql/. [Online; consulted March 6, 2012].

[12] s3fs - FUSE-based file system backed by Amazon S3. Website, google.com/p/s3fs/, February. [Online; consulted February 21, 2012].

[13] Wikipedia.org - ZFS. Website. [Online; consulted March 2012].

[14] FreeBSD wiki - ZFS. Website. [Online; consulted March 2012].

[15] IOzone.org - IOzone PDF documentation. docs/iozone_msword_98.pdf, March. [Online; consulted March 6, 2012].

[16] nixCraft - how to measure Linux filesystem I/O performance with IOzone. Website, linux-filesystem-benchmarking-with-iozone.html. [Online; consulted March 6, 2012].

[17] iostat Linux man page. Website. [Online; consulted March 6, 2012].

[18] hdparm - get/set ATA/SATA drive parameters under Linux. Website, http://sourceforge.net/projects/hdparm/. [Online; consulted March 6, 2012].

[19] fio website. Website. [Online; consulted March 6, 2012].

[20] Iperf website. Website. [Online; consulted March 6, 2012].

[21] Netperf official homepage. Website. [Online; consulted March 2012].

[22] Xen.org - Xen official homepage. Website. [Online; consulted March 2012].

[23] Wikipedia.org - sparse files. Website, Sparse_files. [Online; consulted March 2012].

[24] Kernel.org - EXT4 benchmark. Website, /ols2007v2-pages pdf. [Online; consulted March 2012].

[25] Linux benchmark suite homepage. Website, net/. [Online; consulted March 2012].

[26] Bonnie++ - official homepage. Website, bonnie++/. [Online; consulted March 2012].

[27] QuickLZ - official website. Website. [Online; consulted March 2012].

[28] LZO vs. LZJB in ZFS. Website. [Online; consulted March 2012].

[29] Nexenta - enterprise class storage for everyone. Website, nexenta.com/corp. [Online; consulted March 2012].
More informationCloud Simulator for Scalability Testing
Cloud Simulator for Scalability Testing Nitin Singhvi (nitin.singhvi@calsoftinc.com) 1 Introduction Nitin Singhvi 11+ Years of experience in technology, especially in Networking QA. Currently playing roles
More informationPOSIX and Object Distributed Storage Systems
1 POSIX and Object Distributed Storage Systems Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph by Michael Poat, Dr. Jerome
More informationWindows Server 2008 R2 Hyper-V Live Migration
Windows Server 2008 R2 Hyper-V Live Migration Table of Contents Overview of Windows Server 2008 R2 Hyper-V Features... 3 Dynamic VM storage... 3 Enhanced Processor Support... 3 Enhanced Networking Support...
More informationVDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop
VDI Without Compromise with SimpliVity OmniStack and Citrix XenDesktop Page 1 of 11 Introduction Virtual Desktop Infrastructure (VDI) provides customers with a more consistent end-user experience and excellent
More informationLeveraging NIC Technology to Improve Network Performance in VMware vsphere
Leveraging NIC Technology to Improve Network Performance in VMware vsphere Performance Study TECHNICAL WHITE PAPER Table of Contents Introduction... 3 Hardware Description... 3 List of Features... 4 NetQueue...
More informationBenchmarking Cassandra on Violin
Technical White Paper Report Technical Report Benchmarking Cassandra on Violin Accelerating Cassandra Performance and Reducing Read Latency With Violin Memory Flash-based Storage Arrays Version 1.0 Abstract
More informationReference Architecture for a Virtualized SharePoint 2010 Document Management Solution A Dell Technical White Paper
Dell EqualLogic Best Practices Series Reference Architecture for a Virtualized SharePoint 2010 Document Management Solution A Dell Technical White Paper Storage Infrastructure and Solutions Engineering
More informationVIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS
VIDEO SURVEILLANCE WITH SURVEILLUS VMS AND EMC ISILON STORAGE ARRAYS Successfully configure all solution components Use VMS at the required bandwidth for NAS storage Meet the bandwidth demands of a 2,200
More informationSAN Acceleration Using Nexenta VSA for VMware Horizon View with Third-Party SAN Storage NEXENTA OFFICE OF CTO ILYA GRAFUTKO
SAN Acceleration Using Nexenta VSA for VMware Horizon View with Third-Party SAN Storage NEXENTA OFFICE OF CTO ILYA GRAFUTKO Table of Contents VDI Performance 3 NV4V and Storage Attached Network 3 Getting
More informationAnalyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution
Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Jonathan Halstuch, COO, RackTop Systems JHalstuch@racktopsystems.com Big Data Invasion We hear so much on Big Data and
More information