
HPC-Aware VM Placement in Infrastructure Clouds

Abhishek Gupta and Laxmikant V. Kalé (University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA), Dejan Milojicic and Paolo Faraboschi (HP Labs, Palo Alto, CA, USA), and Susanne M. Balle (HP Cloud Services, Hudson, NH, USA)

Abstract: Cloud offerings are increasingly serving workloads with a large variability in terms of compute, storage, and networking resources. Computing requirements (all the way to High Performance Computing, or HPC), criticality, communication intensity, memory requirements, and scale can vary widely. Virtual Machine (VM) placement and consolidation for effective utilization of a common pool of resources for efficient execution of such a diverse class of applications in the cloud is challenging, resulting in higher cost and missed Service Level Agreements (SLAs). For HPC, current cloud providers either offer a dedicated cloud with dedicated nodes, losing out on the consolidation benefits of virtualization, or use HPC-agnostic cloud scheduling, resulting in poor HPC performance. In this work, we address application-aware allocation of n VM instances (comprising a single job request) to physical hosts from a single pool. We design and implement an HPC-aware scheduler on top of OpenStack Compute (Nova) and also incorporate it in a simulator (CloudSim). Through various optimizations, specifically topology- and hardware-awareness, cross-VM interference accounting, and application-aware consolidation, we demonstrate enhanced VM placements which achieve up to 45% improvement in HPC performance and/or a 32% increase in job throughput, while limiting the effect of jitter (or noise) to 8%.

Keywords: Cloud; High Performance Computing; Scheduling; Placement

I. INTRODUCTION

Cloud computing is increasingly being explored as a cost-effective alternative (and addition) to supercomputers for some HPC applications [1]-[4]. Cloud provides the benefits of economy of scale, elasticity, flexibility, and customization (through virtualization) to the HPC community. It is attracting users who cannot afford to deploy their own dedicated HPC infrastructure due to up-front investment or sporadic demands. Despite these benefits, today's HPC is not cloud-aware, and today's clouds are not HPC-aware. As a consequence, only embarrassingly parallel or small-scale HPC applications are good candidates today to run in the cloud. The commodity interconnects of clouds (or rather, the absence of low-latency interconnects), the performance overhead introduced by virtualization, and application-agnostic cloud schedulers are the biggest obstacles to efficient execution of HPC applications in the cloud [2], [3]. Past research [1]-[4] on HPC in the cloud has primarily focused on the evaluation of scientific parallel applications (such as those using MPI [5]) and has reached pessimistic conclusions. HPC applications are usually composed of tightly coupled processes performing frequent inter-process communication and synchronization, and they pose significant challenges to cloud schedulers. There have been a few efforts on researching VM scheduling algorithms which take into account the nature of HPC applications, and these have shown promising results [6]-[9].

In this paper, we postulate that the placement of VMs onto physical machines can have a significant impact on performance. With this as motivation, the primary questions that we address are the following: Can we improve HPC application performance in the cloud through VM placement strategies tailored to application characteristics?
Is there a cost-saving potential through increased resource utilization achieved by application-aware consolidation? What are the performance-cost tradeoffs in using VM consolidation for HPC?

We address the problem of how to effectively utilize a common pool of resources for efficient execution of very diverse classes of applications in the cloud. For that purpose, we solve the problem of simultaneously allocating multiple VM instances comprising a single job request to physical hosts taken from a pool. We do this while meeting Service Level Agreement (SLA) requirements (expressed in terms of compute, memory, homogeneity, and topology), and while attempting to improve the utilization of hardware resources. Existing VM scheduling mechanisms rely on user inputs and static partitioning of clusters into availability zones for different application types (such as HPC and non-HPC). The problem is particularly challenging because, in general, a large-scale HPC application would ideally require a dedicated allocation of cloud resources (compute and network), since its performance is quite sensitive to variability (caused by noise, or jitter). The per-hour charge that a cloud provider would have to establish for dedicating the resources would quickly make the proposition uneconomical for customers. To overcome this problem, our technique identifies suitable application combinations whose execution profiles complement each other well and that can be consolidated on the same hardware resources without compromising overall HPC performance. This enables us to better utilize the hardware and lower the cost for HPC applications while maintaining performance and profitability, hence greatly enhancing the business value of the solution.

The methodology used in this paper consists of a two-step process: 1) characterizing applications based on their use of shared resources in a multi-core node (with a focus on the shared cache) and their degree of tight coupling, and 2) using an application-aware scheduler to identify groups of applications that have complementary profiles.

The key contributions of this paper are:

We identify the opportunities and challenges of VM consolidation for HPC in the cloud. In addition, we develop scheduling algorithms which optimize resource allocation while being HPC-aware. We achieve this by applying Multi-dimensional Online Bin Packing (MDOBP) heuristics while ensuring that cross-application interference is kept within bounds. (Sections II, III)

We optimize the performance of HPC in the cloud through intelligent HPC-aware VM placement, specifically topology awareness and homogeneity, showing performance gains of up to 25% compared to HPC-agnostic scheduling. (Sections III, VI)

We implement the proposed algorithm in the OpenStack Nova scheduler to enable intelligent application-aware VM scheduling. Through experimental measurements, we show that, compared to dedicated execution, our techniques can result in up to 45% better performance while limiting jitter to 8%. (Sections IV, VI)

We modify CloudSim [10] to make it suitable for simulation of HPC in the cloud. To our knowledge, our work is the first effort towards simulation of HPC job scheduling algorithms in the cloud. Simulation results show that our techniques can result in up to 32% increased throughput compared to default scheduling algorithms. (Section VII)

II. VM CONSOLIDATION FOR HPC IN CLOUD: SCOPE AND CHALLENGES

There are two advantages associated with the ability to mix HPC and other applications on a common platform. First, better system utilization, since the machines can be used for running non-HPC applications when there is a low incoming flux of HPC applications. Second, placing different types of VM instances on the same physical node can result in advantages arising from resource packing.

To quantify the potential cost savings that can be achieved through consolidation, we performed an approximate calculation using the pricing of Amazon EC2 instances [11]. Amazon EC2 offers a dedicated pool of resources for HPC applications known as Cluster Compute. We consider the two instance types shown in Table I and, as a concrete example, Table II shows the distribution of actually executed jobs calculated from the METACENTRUM-02.swf logs obtained from the Parallel Workloads Archive [12]. It is clear that there is a wide distribution: some HPC applications have a small memory footprint while some need large memory. Also, according to the US DoE, there is a technology trend towards decreasing memory per core for exascale supercomputers, indicating that memory will be an even more crucial resource in the future [13].

TABLE I: Amazon EC2 instance types and pricing
  Instance type        High-Memory Double Extra Large   Cluster Compute Eight Extra Large
  API name             m2.2xlarge                       cc2.8xlarge
  EC2 Compute Units    13                               88
  Memory               34.2 GB                          60.5 GB
  Storage              850 GB                           3370 GB
  I/O Performance      High                             Very High
  Price ($/hour)       0.90                             2.40

TABLE II: Distribution of jobs' memory requirement
  Memory per core    Number of jobs
  < 512 MB           (84.00%)
  512 MB - 1 GB      (9.71%)
  1 GB - 2 GB        5946 (5.74%)
  2 GB - 4 GB        379 (0.37%)
  4 GB - 8 GB        161 (0.16%)
  > 8 GB             33 (0.03%)
  Total              (100%)

If the memory left unused by some applications in the Cluster Compute instance can be used by placing a High-Memory instance on the same node, trading 13 EC2 Compute Units and 34.2 GB of memory (still leaving 60.5 - 34.2 = 26.3 GB), then, from the Table I pricing, for every $2.40 earned one can earn an additional $0.90. However, the price of the Cluster Compute instance needs to be reduced by 2.4/(88/13), since that instance will have 13 fewer EC2 Compute Units. Hence, through better resource packing, we can get a benefit of ([2.4 - 2.4/(88/13) + 0.9] / 2.4) - 1 = 23%.
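To make this estimate explicit, the following is a small arithmetic sketch using only the figures quoted above (13 and 88 EC2 Compute Units, $0.90 and $2.40 per hour); it illustrates the back-of-the-envelope calculation, not current EC2 pricing.

```python
# Extra revenue from co-locating a High-Memory instance on a Cluster Compute node.
cc_price, cc_ecu = 2.40, 88   # cc2.8xlarge: $/hour, EC2 Compute Units (from Table I)
hm_price, hm_ecu = 0.90, 13   # m2.2xlarge:  $/hour, EC2 Compute Units (from Table I)

# The Cluster Compute instance gives up hm_ecu of its cc_ecu compute units,
# so its price is discounted proportionally.
discounted_cc = cc_price - cc_price * hm_ecu / cc_ecu
benefit = (discounted_cc + hm_price) / cc_price - 1
print(f"extra revenue per node-hour: {benefit:.1%}")   # ~22.7%, i.e. about 23%
```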
Fig. 1: Tradeoff: resource packing vs. HPC-awareness. (Left: one request remains unplaced even though 3 cores are free; right: all VMs are placed, but the HPC VMs may suffer interference and more inter-node communication.)

However, traditionally, HPC applications are executed on dedicated nodes to prevent any interference arising from co-located applications. This is because the performance of many HPC applications strongly depends on the slowest node (for example, when they synchronize through MPI barriers). Figure 1 illustrates this tradeoff between resource packing and optimized HPC performance with an example. Here we have two incoming VM provisioning requests: the first for 4 instances, each with 1 core and 512 MB of memory, and the second for 2 instances, each with 1 core and 3 GB of memory. There are two available physical servers, each with 4 cores and 4 GB of memory. The figure shows two ways of placing these VMs on the physical servers. The boxes represent the two dimensions, the x dimension being cores and the y dimension being memory. Both requests are satisfied in the right placement but not in the left one, since there is not enough memory on an individual server to meet the 3 GB requirement, although there is enough memory in the system as a whole. Hence, the right placement is the better strategy, since it performs 2-dimensional bin packing.

Now consider that the 1-core, 512 MB VMs are meant for HPC. In that case, the left placement can result in better HPC performance than the right one for two reasons: a) there is no interference from applications of other users running on the same server, and b) all inter-process communication stays within a node. This tradeoff between better HPC performance and better resource utilization makes VM scheduling for HPC a challenging problem.

A. Cross-Application Interference

Even though there are potential benefits to consolidating VMs for HPC, it is still unclear whether (and to what extent) we can achieve increased resource utilization at an acceptable performance penalty. For HPC applications, the degradation due to interference accumulates because of the synchronous and tightly coupled nature of many HPC applications. We compared the performance of a set of HPC applications in co-located vs. dedicated execution. Figure 2 demonstrates the effect of running two different applications while sharing a multi-core node (a 4-core, 8 GB, 3 GHz Open Cirrus [14] node). Each VM needs 1 vCPU and 2 GB of memory, and uses the KVM hypervisor with a CPU-pinned configuration. The applications used here are the NPB [15] benchmarks (EP = Embarrassingly Parallel, LU = LU factorization, IS = Integer Sort) with problem size classes B and C, and ChaNGa [16], a cosmology application. More details of the testbed and applications are discussed in Section V.

In this experiment, we first ran each application using all 4 cores of a node. We then ran VMs from 2 different applications on each node (2 VMs of each application per node). Next, we normalized the performance of both applications in the second case (shared node) with respect to the first case (dedicated node), and plotted them as shown in Figures 2a (4 VMs per application) and 2b (16 VMs per application) for different application combinations.

Fig. 2: Application performance in shared-node execution (2 cores for each application on a node) normalized with respect to dedicated execution (using all 4 cores of a node for the same application), for (a) 4 and (b) 16 total cores (and VMs) per application; physical cores per node = 4. Application combinations include LU.B-IS.B, EP.B-ChaNGa, EP.B-IS.B, EP.B-LU.B, LU.C-ChaNGa, EP.C-ChaNGa, EP.C-LU.C, IS.B-ChaNGa, and LU.B-ChaNGa.

In the figures, the x-label shows the application combination; the first bar shows the normalized performance of the first application in the x-label, and the second bar shows that of the second application. We can observe that some application combinations have normalized performance close to one for both applications, e.g., EP.B-ChaNGa. For some combinations, co-location has a significant detrimental impact on the performance of at least one application, e.g., combinations involving IS.B. The other facet of the interference problem is positive interference. Through experimental data, we notice that we can achieve significant performance improvement for some application combinations; e.g., LU.C-ChaNGa in the 4-core case shows almost 120% better performance for LU.C, with ChaNGa's normalized performance close to 1. The positive impact on performance when co-locating different HPC applications presents another opportunity for optimizing VM placement. What needs to be explored is why some co-locations perform well while others do not (Section III).
B. Topology Awareness

The second challenge to VM consolidation for HPC in the cloud is the applications' sensitivity to network topology. Since the many parallel processes constituting an HPC application communicate frequently, time spent in communication forms a significant fraction of the total execution time. The impact of cluster topology has been widely researched by the HPC community, but in the context of the cloud, it is up to the cloud provider to use VM placement algorithms which map the multiple VMs of an HPC application in a topology-aware manner to minimize inter-VM communication overhead. The importance of topology awareness can be understood through a practical example: the Open Cirrus HP Labs cluster has 32 nodes (4 cores each) in a rack, and all nodes in a rack are connected by a 1 Gbps link to a switch. The racks are connected by a 10 Gbps link to a top-level switch. Hence, the 10 Gbps link is shared by 32 nodes, giving an effective bandwidth of 10 Gbps / 32 = 0.3125 Gbps between two nodes in different racks for all-to-all communication. However, the point-to-point bandwidth between two nodes in the same rack is 1 Gbps. Thus, packing VMs onto nodes in the same rack is beneficial compared to a random placement policy, which can potentially distribute them all over the cluster. However, topology-aware placement can conflict with the goal of achieving better resource utilization, as demonstrated by Figure 1.
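A back-of-the-envelope sketch of this effective-bandwidth argument; the rack size and link speeds are those of the cluster described above, and the all-to-all sharing model is a simplification.

```python
def effective_inter_rack_bw(uplink_gbps: float = 10.0, nodes_per_rack: int = 32) -> float:
    """Per-node share of the rack uplink under all-to-all communication."""
    return uplink_gbps / nodes_per_rack

intra_rack_gbps = 1.0
inter_rack_gbps = effective_inter_rack_bw()
print(inter_rack_gbps)                     # 0.3125 Gbps between racks
print(intra_rack_gbps / inter_rack_gbps)   # same-rack placement offers ~3.2x the bandwidth
```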

C. Hardware Awareness

Another characteristic of HPC applications is that they are generally iterative and bulk synchronous, with a computation phase followed by a barrier synchronization phase. Since all processes must finish the previous iteration before the next iteration can start, a single slow process can slow down the entire application. Since clouds evolve over time and with demand, they consist of heterogeneous servers. Furthermore, the underlying hardware is not visible to the user, who expects all VMs to achieve identical performance. The commonly used approach to addressing heterogeneity in the cloud is to create a new compute unit (e.g., the Amazon EC2 Compute Unit) and allocate hardware based on this unit. This allows allocation of a CPU core to multiple VMs using shares (e.g., CPU shares). However, this is impractical for HPC applications, since the VMs comprising a single application will quickly get out of sync when sharing CPUs with other VMs, resulting in much worse performance. To overcome these problems, Amazon EC2 uses a dedicated cluster for HPC. However, the disadvantage is lower utilization, which results in a higher price. Hence, the third challenge for VM placement for HPC is to ensure homogeneity: VM placement needs to be hardware-aware to ensure that all k VMs of a user request are allocated the same type of processors.

III. METHODOLOGY

Having identified the opportunities for HPC-aware VM consolidation in the cloud, we discuss our methodology for addressing the challenges presented in Section II. We formulate the problem as an initial VM placement problem: map k VMs (v_1, v_2, ..., v_k), each with the same, fixed resource requirements (CPU, memory, disk, etc.), to n physical servers P_1, P_2, ..., P_n, which are unoccupied or partially occupied, while meeting resource demands. Moreover, we focus on providing the user an HPC-optimized VM placement. Our solution consists of a) one-time application characterization, and b) application-aware scheduling. Next, we discuss these two components.

A. Application Characterization

Our goal is to identify which characteristics of applications affect their performance when they are co-located with other applications on a node. To get more insight into the performance observed in Figure 2, we plot the performance of each application obtained when running alone (but using 2 VMs on a 4-core node, leaving 2 cores idle) normalized with respect to the performance obtained when using all 4 cores for the same application (see Figures 3a and 3b, first bar for each application). We can see that LU benefits most in the 2-cores-per-node case, EP and ChaNGa achieve almost the same performance, and IS suffers. This indicates that contention for shared resources in multi-core nodes is a critical factor for these applications. To confirm this hypothesis, we measured the number of last-level cache (LLC) misses per second for each application using hardware performance counters and the Linux tool oprofile. Figure 3c shows LLC misses/sec for our application set, and demonstrates that LU suffers a huge number of misses, indicative of a larger working set size (or cache-intensiveness). In our terminology, cache-intensive refers to a larger working set.

Fig. 3: (a, b) Application performance using 2 cores per node, normalized with respect to dedicated execution using all 4 cores of a 4-core node for the same application; the first bar for each application shows the case of leaving 2 cores idle (no co-located application), and the remaining bars show the cases of co-location with other applications (the combinations of Figure 2). (c) Average per-core last-level cache misses when using only 2 cores vs. all 4 cores of a node.
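The cache score introduced below is derived from exactly this LLC-misses-per-second measurement. The following is a minimal sketch of how such a score could be collected; it uses the Linux perf tool rather than oprofile (a substitution on our part), and the event name LLC-load-misses and the CSV parsing are assumptions that vary across CPUs and perf versions.

```python
# Hedged sketch: derive a cache score (units of 100K LLC misses/sec) for a running process.
import subprocess

def measure_cache_score(pid: int, seconds: int = 10) -> float:
    """Sample LLC misses of process `pid` for `seconds` and return a cache score."""
    # -x , selects CSV output on stderr: value,unit,event,...
    cmd = ["perf", "stat", "-x", ",", "-e", "LLC-load-misses",
           "-p", str(pid), "--", "sleep", str(seconds)]
    result = subprocess.run(cmd, capture_output=True, text=True)
    for line in result.stderr.splitlines():
        fields = line.split(",")
        if len(fields) > 2 and "LLC-load-misses" in fields[2]:
            misses = float(fields[0])
            return misses / seconds / 1e5   # scale to the paper's 100K-misses/sec unit
    raise RuntimeError("LLC-load-misses not reported; the event name may differ on this CPU")
```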
Correlating Figures 3a and 3c, we see that the applications which are more cache-intensive (that is, which suffer more LLC misses per second) are the ones that benefit most in the 2-cores-per-node case, whereas applications which are low to moderately cache-intensive (e.g., EP and ChaNGa) are mostly unaffected by the use of 2 or 4 cores per node. One exception is IS.B.4: this application is highly communication-intensive and hence suffers from the inter-node communication that occurs in the 2-cores-per-node case. Barring this exception, one fairly intuitive conclusion that can be drawn from this experiment is that it is indeed beneficial to co-locate a cache-intensive application (such as LU) and an application with less cache usage (such as EP) on the same node. This is confirmed by more closely examining Figure 2.

HPC applications introduce another dimension to the problem of accounting for cross-application interference. In general, the effect of noise or interference gets amplified in applications which are bulk synchronous. For synchronous HPC applications, even if only one VM suffers a performance penalty, all the remaining VMs have to wait for it to reach the synchronization point. Even though the interference suffered by an individual process may be small over a period of time, the overall effect on application performance can be significant due to the accumulation of noise over all processes. Hence, we characterize applications along two dimensions:

1) Cache-intensiveness: We assign each application a cache score (in units of 100K LLC misses/sec), representative of the pressure it puts on the shared cache and memory controller subsystem. We acknowledge that one could use working set size as a metric, but we chose LLC misses/sec since it can be measured experimentally using hardware performance counters.

2) Parallel Synchronization and Network Sensitivity: We map applications to four different application classes, which can be specified by a user when requesting VMs:

ExtremeHPC: extremely tightly coupled or topology-sensitive applications, for which the best option is to provide dedicated nodes; example: IS.

SyncHPC: sensitive to interference, but less so than ExtremeHPC, and able to sustain a small degree of interference to obtain consolidation benefits; examples: LU, ChaNGa.

AsyncHPC: asynchronous (and less communication-sensitive) applications which can sustain more interference than SyncHPC; examples: EP, MapReduce applications.

NonHPC: applications which do not perform any communication, can sustain more interference, and can be placed on heterogeneous hardware; example: web applications.

B. Application-aware Scheduling

With this characterization, we devise an application-characteristics-aware VM placement algorithm which combines HPC-awareness (topology and homogeneity awareness), multi-dimensional online bin packing, and interference minimization through cache-sensitivity awareness. We discuss the details of this scheduler in the next section.

IV. AN HPC-AWARE SCHEDULER

Next, we discuss the design and implementation of the proposed techniques on top of the OpenStack Nova scheduler [17].

A. Background: OpenStack Nova Scheduler

OpenStack [17] is open-source software, developed through the collaboration of multiple inter-related projects, for large-scale deployment and management of private and public clouds from large pools of infrastructure resources (compute, storage, and networking). In this work, we focus on the compute component of OpenStack, known as Nova. The Nova scheduler performs the task of selecting the physical nodes where a VM will be provisioned. Since OpenStack is a popular cloud management system, we implemented our scheduling techniques on top of the existing Nova scheduler (Diablo release). The default scheduler makes VM placements based on the VM provisioning request (request_spec) and the existing state and occupancy of the physical hosts (capability data). request_spec specifies the number and type of requested instances (VMs); the instance type maps to resource requirements such as the number of virtual cores and the amount of memory and disk space. Host capability data contains the current capabilities (such as free CPUs and free memory) of the physical servers (hosts) in the cloud. Using request_spec and the capability data, the scheduler performs a 2-step algorithm: 1) Filtering excludes hosts incapable of fulfilling the request (e.g., free cores < requested virtual cores). 2) Weighing computes the relative fitness of the filtered list of hosts to fulfill the request using cost functions (such as least free host); multiple cost functions can be used. Next, the list of hosts is sorted by the weighted score, and VMs are provisioned on hosts using this sorted list. However, the Nova scheduler is HPC-agnostic, since:

Existing filtering and weighing strategies do not consider the nature of the application (e.g., HPC vs. non-HPC).

The scheduler ignores processor heterogeneity and network topology.

The scheduler considers the k VMs requested by an HPC user as k separate placement problems; there is no correlation between the placements of the VMs of a single request.

There has been recent and ongoing work on adapting the Nova scheduler to make it architecture- and HPC-aware [8], [9].

B. Design and Implementation

Algorithm 1 describes our scheduling algorithm using OpenStack terminology (a Python sketch of the host-ordering part follows the listing). The VM provisioning request (request_spec) now contains the application class and name in addition to the existing parameters.

Algorithm 1: Pseudocode for the scheduler algorithm
 1: capabilities = list of capabilities of unique hosts
 2: request_spec = request specification
 3: numHosts = capabilities.length()
 4: filteredHostList = new vector<int>
 5: rackList = new set<int>
 6: hostCapacity, rackCapacity, filteredHostList <- CalculateHostAndRackCapacity(request_spec, capabilities)
 7: if (request_spec.class == ExtremeHPC) or (request_spec.class == SyncHPC) then
 8:     sortedHostList <- sort filteredHostList by decreasing order of hostCapacity[j], where j is in filteredHostList
 9:     PreBuildPlan <- stable-sort sortedHostList by decreasing order of rackCapacity[capabilities[j].rackId], where j is in filteredHostList
10: else
11:     PreBuildPlan = filteredHostList
12: end if
13: if request_spec.class == ExtremeHPC then
14:     buildPlan = new vector<int>
15:     for i = 1 to numFilteredHosts do
16:         for j = 1 to hostCapacity[PreBuildPlan[i]] do
17:             buildPlan.push(PreBuildPlan[i])
18:         end for
19:     end for
20: else
21:     buildPlan <- MDOBP(request_spec, PreBuildPlan, capabilities)
22: end if
23: return buildPlan
24: procedure MDOBP(request_spec, PreBuildPlan, capabilities)
25:     buildPlan = new vector<int>
26:     for i = 1 to request_spec.numInstances do
27:         repeat
28:             node <- chooseBestFitHost(request_spec, PreBuildPlan, capabilities)  // use a multi-dimensional heuristic
29:         until meetsInterferenceCriteria(node, request_spec, capabilities)
30:         buildPlan.insert(node)
31:         update temporary capability database
32:     end for
33:     return buildPlan
34: procedure meetsInterferenceCriteria(node, request_spec, capabilities)
35:     alpha = total cache threshold for any application
36:     beta = total cache threshold when SyncHPC is involved; in general alpha >> beta
37:     totalCacheScore = sum of i.cacheScore over all instances i currently running on node
38:     if (totalCacheScore + request_spec.cacheScore) > alpha then
39:         return false
40:     end if
41:     if i.class == SyncHPC for any instance i currently running on node then
42:         if (totalCacheScore + request_spec.cacheScore) > beta then
43:             return false
44:         end if
45:     end if
46:     return true
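For concreteness, the following is a minimal Python rendering of the host-ordering part of Algorithm 1 (lines 6-19). It is illustrative only: the Host record and the request dictionary are hypothetical stand-ins, not the actual OpenStack Nova data structures, and the MDOBP procedure (line 21) is sketched later in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Host:
    name: str
    rack_id: int
    free_cores: int
    free_mem_mb: int
    running_vms: List[dict] = field(default_factory=list)

def host_capacity(host: Host, spec: dict) -> int:
    """Number of additional VMs of the requested spec that still fit on this host."""
    if spec["app_class"] == "ExtremeHPC" and host.running_vms:
        return 0  # ExtremeHPC gets dedicated nodes only
    return min(host.free_cores // spec["vcpus"], host.free_mem_mb // spec["mem_mb"])

def prelim_build_plan(spec: dict, hosts: List[Host]) -> List[Host]:
    """Filter feasible hosts and, for (Extreme|Sync)HPC, order them so that
    same-host and then same-rack placements come first (lines 7-12)."""
    filtered = [h for h in hosts if host_capacity(h, spec) > 0]
    rack_cap: Dict[int, int] = {}
    for h in filtered:
        rack_cap[h.rack_id] = rack_cap.get(h.rack_id, 0) + host_capacity(h, spec)
    if spec["app_class"] in ("ExtremeHPC", "SyncHPC"):
        filtered.sort(key=lambda h: host_capacity(h, spec), reverse=True)
        # Python's sort is stable, so host order is preserved within each rack.
        filtered.sort(key=lambda h: rack_cap[h.rack_id], reverse=True)
    return filtered

def extreme_hpc_plan(spec: dict, ordered: List[Host]) -> List[str]:
    """ExtremeHPC build plan: fill whole dedicated nodes in order (lines 13-19)."""
    plan: List[str] = []
    for h in ordered:
        plan.extend([h.name] * host_capacity(h, spec))
        if len(plan) >= spec["num_instances"]:
            break
    return plan[: spec["num_instances"]]
```

For the other application classes, the ordered list would be handed to the MDOBP loop, whose host-selection and interference helpers are sketched below.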

The algorithm proceeds by calculating the current host and rack free capacity, that is, the number of additional VMs of the requested specification that can be placed at a particular host and rack (line 6). While doing so, it sets the capacity of every host which has a running VM to zero if the requested VM type is ExtremeHPC, to ensure that only dedicated nodes are used for ExtremeHPC. Next, if the class of the requested VM is ExtremeHPC or SyncHPC, the scheduler creates a preliminary build plan, which is a list of hosts ordered by the rackcapacity of the rack to which a host belongs and by hostcapacity among hosts of the same rack. The goal is to allocate VMs to the same host and the same rack to the extent possible, to minimize inter-VM communication overhead for these application classes. For ExtremeHPC, this PreBuildPlan is used directly for provisioning VMs, whereas for the remaining classes the algorithm performs multi-dimensional online bin packing to fit VMs with different characteristics together on the same host (line 21).

Procedure MDOBP uses a bin-packing heuristic for selecting a host from the available choices (line 28). We use a dimension-aware heuristic: select the host for which the vector of requested resources aligns most closely with the vector of remaining capacities. The key intuition can be understood by revisiting the example of 2-dimensional bin packing in Figure 1. For best utilization of the capacity in both dimensions, it is desirable that the final sum of all the VM vectors on a host be close to the top-right corner of the host rectangle. Hence, we select the host such that placing the requested VM on it moves the vector representing its occupied resources towards the top-right corner. Our heuristic is similar to those studied by Lee et al. [18]. Formally, consider the remaining or residual capacities (CPURes, MemRes) of a host, i.e., subtract from the capacity (total CPU, total memory) the total demand of all the items (VM cores, VM memory) currently assigned to it. Also consider the requested VM (CPUReq (= 1), MemReq). The heuristic selects the host with the minimum θ, where cos(θ) is the normalized dot product of the two vectors:

cos(θ) = (CPUReq × CPURes + MemReq × MemRes) / (sqrt(CPURes² + MemRes²) × sqrt(CPUReq² + MemReq²)),

subject to CPURes >= CPUReq and MemRes >= MemReq.

Next, the selected host is checked to ensure that placing the requested VM on it does not violate the interference criteria (line 29). We use the following criterion: the sum of the cache scores of the requested VM and of all the VMs running on a host should not exceed a threshold, which needs to be determined through experimental analysis. This threshold is lower if the requested VM, or one or more VMs running on that host, is of class SyncHPC, since applications of this class can tolerate less interference (line 44). In addition, we maintain a database of interference indices which records the interference between applications that suffer a large performance penalty when sharing hosts. This information is used to avoid co-locations which are definitely not beneficial. A sketch of these two helpers appears below.
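The two MDOBP helpers just described can be sketched as follows, under the same assumptions as the earlier sketch (hypothetical host records; the alpha and beta values are placeholders to be calibrated experimentally). This is illustrative code, not the actual HPCinCloud_scheduler.py implementation.

```python
import math
from typing import List, Optional

def best_fit_angle(free_cores: int, free_mem_mb: int,
                   req_cores: int, req_mem_mb: int) -> float:
    """Angle between the residual-capacity vector and the request vector (smaller is better)."""
    dot = req_cores * free_cores + req_mem_mb * free_mem_mb
    norm = math.hypot(free_cores, free_mem_mb) * math.hypot(req_cores, req_mem_mb)
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def meets_interference_criteria(running_vms: List[dict], req: dict,
                                alpha: float = 100.0, beta: float = 60.0) -> bool:
    """Cache-score based interference check, mirroring Algorithm 1, lines 34-46."""
    total = sum(vm["cache_score"] for vm in running_vms)
    if total + req["cache_score"] > alpha:
        return False
    # Apply the stricter threshold when SyncHPC is involved on either side.
    sync_involved = req["app_class"] == "SyncHPC" or any(
        vm["app_class"] == "SyncHPC" for vm in running_vms)
    if sync_involved and total + req["cache_score"] > beta:
        return False
    return True

def choose_best_fit_host(hosts: List["Host"], req: dict) -> Optional["Host"]:
    """Pick the feasible host whose residual vector best aligns with the request vector."""
    candidates = [h for h in hosts
                  if h.free_cores >= req["vcpus"] and h.free_mem_mb >= req["mem_mb"]
                  and meets_interference_criteria(h.running_vms, req)]
    if not candidates:
        return None
    return min(candidates, key=lambda h: best_fit_angle(
        h.free_cores, h.free_mem_mb, req["vcpus"], req["mem_mb"]))
```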
The output of Algorithm 1 is buildPlan, the list of hosts on which the VMs should be provisioned. To ensure homogeneity, hosts are grouped into different lists based on their processor type, and the algorithm operates on these groups. Currently, we use CPU frequency as the distinction criterion between processor types; for a more accurate distinction, additional factors such as MIPS can be considered.

Fig. 4: Implementation details and control flow of a provisioning request, from euca-tools and the OpenStack EC2 API (which parse the additional parameters, such as instance class and application name, and add them to request_spec), through the modified scheduler (hardware-, topology-, resource-packing-, and interference-aware, backed by the interference tables in the Nova DB), to Nova Compute, CPU pinning and prioritization, and the KVM hypervisor.

Figure 4 shows the overall control flow for a VM provisioning request, highlighting the additional features and changes that we introduced while implementing the HPC-aware scheduling algorithm in OpenStack Nova. We modified both euca-tools and the OpenStack EC2 API to allow additional parameters to be passed along with a VM provisioning request. Further, we store these additional properties (such as application class and cache score) of running VMs in the Nova database. We also create and maintain additional information, the interference indices, which record the interference suffered by each application with other applications during a characterization run. New tables, app_running (with columns that store the host name and the applications running on it) and app_interferences (which stores the interference between any of them), were added to the Nova DB, and the Nova DB API was modified to include a function to read this database. We extended the existing abstract_scheduler.py in Nova to create HPCinCloud_scheduler.py, which contains the additional functions _schedulemdobp, choose_best_fit_host, and meet_interference_criteria.
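The homogeneity grouping mentioned above can also be sketched in the same illustrative style; the cpu_mhz attribute and the pool-selection policy below are assumptions for illustration, not the Nova implementation.

```python
from collections import defaultdict
from typing import Callable, Dict, List

def group_by_processor(hosts: List["Host"]) -> Dict[float, List["Host"]]:
    """Group hosts into homogeneous pools keyed by CPU frequency (the paper's
    current distinction criterion; MIPS or CPU model could be added)."""
    pools: Dict[float, List["Host"]] = defaultdict(list)
    for h in hosts:
        pools[getattr(h, "cpu_mhz", 0.0)].append(h)
    return pools

def schedule_homogeneous(spec: dict, hosts: List["Host"],
                         place: Callable[[dict, List["Host"]], List[str]]) -> List[str]:
    """Try each homogeneous pool in turn; return the first complete build plan."""
    for _, pool in sorted(group_by_processor(hosts).items(), reverse=True):
        plan = place(spec, pool)          # e.g., the MDOBP flow sketched above
        if len(plan) == spec["num_instances"]:
            return plan
    return []  # the request cannot be satisfied within a single homogeneous pool
```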

V. EVALUATION METHODOLOGY

In this section, we describe our cloud setup and the applications we used.

A. Experimental Testbed

We evaluated our techniques on a cloud set up using OpenStack on the Open Cirrus testbed at the HP Labs site [14]. This cloud has 3 types of servers:

Intel Xeon E5450 (12M cache, 3.00 GHz)
Intel Xeon X3370 (12M cache, 3.00 GHz)
Intel Xeon X3210 (8M cache, 2.13 GHz)

The cluster topology is as described in Section II-B. For virtualization, we chose KVM [19], since prior research has indicated that KVM is a good choice for HPC clouds [20]. For network virtualization, we experimented with different network drivers such as rtl8139, e1000, and virtio-net, and settled on virtio-net because of its better network performance (also shown in [6]). We used VMs of type m1.small (1 core, 2 GB memory, 20 GB disk). However, these choices do not influence the generality of our conclusions.

B. Benchmarks and Applications

We used the NAS Parallel Benchmarks (NPB) [15] with problem size classes B and C (the MPI version, NPB3.3-MPI), which are widely used by the HPC community for performance benchmarking and provide good coverage of computation, communication, and memory characteristics. We also used three larger HPC applications:

NAMD [21] - A highly scalable molecular dynamics application used ubiquitously on supercomputers. We used the ApoA1 input (92k atoms) for our experiments.

ChaNGa [16] - A cosmology application which performs collisionless N-body simulation using a Barnes-Hut tree for force calculation. We used a 300,000-particle system.

Jacobi2D - A 5-point stencil computation kernel which averages values in a 2-D grid, and is used in scientific simulations, numerical algebra, and image processing.

These applications are written in Charm++ [22], an object-oriented parallel programming language. We used the net-linux-x86-64 machine layer of Charm++ with the O3 optimization level.

VI. EXPERIMENTAL RESULTS

Next, we evaluate the benefits of HPC-aware VM placement and the effect of jitter arising from VM consolidation.

A. HPC-Aware Placement

Fig. 5: % improvement in execution time achieved using HPC awareness (homogeneity) compared to the case where 2 VMs were placed on slower processors and the rest on faster processors, for ChaNGa, NAMD, Jacobi2D.1K, and Jacobi2D.4K at 4, 8, and 16 VMs.

Fig. 6: CPU timelines of 8 VMs running Jacobi2D: (a) heterogeneous, with the first two VMs on slower processors, and (b) homogeneous, with all 8 VMs on the same type of processors (white = idle time).

To demonstrate the impact of topology awareness and homogeneity, we compared the performance obtained by the HPC-aware scheduler with random VM placement. In these experiments, we did not perform VM consolidation. Figure 5 shows the performance obtained by our VM placement (Homo) compared to the case where two VMs are mapped to slower processors and the rest to faster processors (Hetero). We calculated % improvement = (T_Hetero - T_Homo) / T_Hetero. We can see that the improvement achieved depends on the nature of the application and the scale at which it is run. Also, the improvement is not equal to the ratio of sequential execution time on the slower processor to that on the faster processor. This can be attributed to the communication time and parallel overhead, which are not necessarily dependent on the processor speeds. For these applications, we achieved up to 20% improvement in parallel execution time, which means a saving of 20% of the execution time multiplied by N CPU-hours, where N is the number of processors used.
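Read concretely, the improvement metric and the CPU-hours saving can be computed as below; the timings in the example are hypothetical, only the formulas come from the text.

```python
def improvement(t_hetero: float, t_homo: float) -> float:
    """Fractional improvement in execution time from homogeneous placement."""
    return (t_hetero - t_homo) / t_hetero

t_het, t_hom, n_cores = 100.0, 80.0, 16            # hypothetical run: seconds, cores
print(f"{improvement(t_het, t_hom):.0%} faster")    # 20%
print(f"CPU-hours saved: {(t_het - t_hom) * n_cores / 3600:.3f}")
```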
We analyzed the performance bottleneck using the Projections [23] tool. Figure 6 shows the CPU (VM) timelines for an 8-core Jacobi2D experiment: the x-axis is time, the y-axis is the (virtual) core number, the white portions show idle time, and the colored portions represent application functions. In Figure 6a, there is much more idle time on VMs 3-7 compared to the first 2 VMs (which run on slower processors), since VMs 3-7 have to wait for VMs 0-1 to reach the synchronization point. The small idle time in Figure 6b is due to the communication time.

Next, we compared the performance obtained when using the VM placement provided by the HPC-optimized algorithm vs. the default VM placement vs. execution without virtualization on the same testbed (see Figure 7). The default placement selects the host with the least free CPU cores (or PEs), agnostic of its topology and hardware. In this experiment, the first host in the cloud had the slower processor type. Figure 7 shows that even communication-intensive applications such as NAMD and ChaNGa scale well in the Cloud-opt case, and achieve performance close to that obtained on the physical platform. Benefits of up to 25% are achieved compared to the default scheduler.

Fig. 7: Runtime results: execution time per step vs. number of cores (VMs) for (a) NAMD (molecular dynamics), (b) ChaNGa (cosmology), and (c) Jacobi2D (4K by 4K matrix), comparing Cloud Default, Cloud Opt, Physical, and Ideal.

However, performance achieved on the physical platform itself is up to 4X worse than ideal scaling at 64 cores, likely due to the absence of an HPC-optimized network. A detailed analysis of the communication performance of this cloud (with different virtualization drivers) was done in [6].

B. Case Study of Application-Aware Scheduling

Here, we consider 8 nodes (32 cores) of our experimental testbed and perform VM placement for the application stream shown in Figure 8 using the application-aware scheduler. The application suffix is the number of requested VMs. Figure 8 shows the characteristics of these applications and the output of the scheduler for three different cache thresholds (β). The output (Placement) is presented in the form of the nodes (and cores per node) to which the scheduler mapped each application. The figure also shows the achieved performance for these cases compared to dedicated execution using all 4 cores per node.

Fig. 8: Table of applications (IS.B.4: ExtremeHPC; LU.C.16, LU.B.4, ChaNGa.4: SyncHPC; EP.B.4: AsyncHPC), their cache scores, and the placements produced for cache thresholds β = 100, 40, and 60, together with the normalized application performance achieved using application-aware scheduling compared to running the applications in a dedicated manner.

When the cache threshold is too large, there is less performance improvement, due to aggressive packing of cache-intensive applications on the same node. On the contrary, a very small threshold results in unnecessary wastage of some CPU cores if there are few applications with very small cache scores. This is illustrated by the placement shown in Figure 8, where the execution of some applications was deferred because the interference criteria were not satisfied under a small cache threshold. Moreover, there is an additional penalty (communication overhead) when not using all cores of a node for running an HPC application. Hence, the cache threshold needs to be chosen carefully through extensive experimentation. In this case, we see that a threshold of 60 works best. For this threshold and our application set, we achieve performance gains of up to 45% for a single application while limiting the negative impact of interference to 8%. We also measured the overhead of our scheduling algorithm by measuring its execution time: the average time to handle a request for 1 and 16 instances was 1.58 s and 1.80 s, respectively, with our scheduler, compared to 1.54 s and 1.67 s with the default scheduler.

VII. SIMULATION

CloudSim is a simulation tool modeling a cloud computing environment in a datacenter, and is widely used for the evaluation of resource provisioning algorithms [10]. In this work, we modified it to enable the simulation of High Performance Computing jobs in the cloud. HPC machines have a massive number of processors, whereas CloudSim is designed and implemented for a cloud computing environment and works mainly with jobs which need a single processor.
Hence, for simulating HPC in the cloud, the primary modification we performed was to improve the handling of multi-core jobs. We extended the existing VmAllocationPolicySimple class to create a VmAllocationPolicyHPC, which can handle a user request comprising multiple VM instances and performs application-aware scheduling (as discussed in Algorithm 1). At the start of the simulation, a fixed number of VMs (of different specified types) are created, and jobs (cloudlets) are submitted to the datacenter broker, which maps each job to a VM. When there are no pending jobs, all the VMs are terminated and the simulation completes. Since our focus was on the mapping of VMs to physical hosts, we created a one-to-one mapping between cloudlets and VMs. Moreover, we implemented VM termination during the simulation to ensure that the simulation runs to completion: without dynamic VM creation and termination, the initial set of VMs runs until the end of the simulation, leading to indefinitely blocked jobs, since the VMs on which they could run never get scheduled because of the limited datacenter capacity.

We simulated the execution of jobs from the logs obtained from the Parallel Workloads Archive [12]. We used the METACENTRUM-02.swf log since it contains information about each job's memory consumption. For each job record (n cores, m MB of memory, execution time) in the log file, we create n VMs, each with one core and m/n MB of memory, as sketched below.
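A minimal sketch of this job-to-VM conversion follows; the flat (cores, memory, runtime) record used here is a simplification of the actual SWF field layout, which has many more columns.

```python
from typing import List, NamedTuple

class VMRequest(NamedTuple):
    job_id: int
    vcpus: int
    mem_mb: float
    runtime_s: float

def job_to_vm_requests(job_id: int, cores: int, mem_mb: float,
                       runtime_s: float) -> List[VMRequest]:
    """Split one parallel job record into `cores` single-core VM requests,
    each carrying an equal share (m/n) of the job's memory."""
    return [VMRequest(job_id, 1, mem_mb / cores, runtime_s) for _ in range(cores)]

# Example: a 64-core job using 128 GB of memory in total and running one hour.
vms = job_to_vm_requests(job_id=42, cores=64, mem_mb=128 * 1024, runtime_s=3600)
assert len(vms) == 64 and abs(vms[0].mem_mb - 2048) < 1e-9
```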

Fig. 9: Simulation results: number of completed jobs vs. time for different scheduling techniques on 1024 cores. (a) MDOBP vs. the default Least Free PE heuristic for different amounts of RAM per node (ppn = processors (cores) per node; RAM = 16 GB or 2 GB). (b) Interference-aware MDOBP (IFMDOBP) for different cache thresholds (β = 40, 60) and execution times adjusted by 0%, -10%, and -20% to account for the better performance obtained with cache awareness.

We simulated the execution of the first 1500 jobs from the log file on 1024 cores, and measured the number of completed jobs after 100 seconds. Figure 9a shows that the number of completed jobs after 100 seconds increased by around 109/801 = 13.6% when using MDOBP instead of the default heuristic (selecting the node with the least free PEs) for the constrained-memory case (2 GB per node), whereas there was no improvement for this job set when the nodes had large memory per core. This is attributed to the fact that this job set has very few applications with large memory requirements. However, with the trend towards big-memory applications, which also holds for next-generation exascale applications, we expect to see significant gains for architectures with large memory per node as well.

We also simulated our HPC-aware scheduler (including cache awareness) by assigning each job a cache score in the range 0-30 using a uniformly distributed random number generator. We used two different values of the cache threshold (see Figure 9b, IFMDOBP). We simulated the jobs with the execution times of all jobs modified by -10% and -20% to account for the improvement in performance resulting from cache awareness, as seen in the results of Section VI. The number of completed jobs after 100 seconds further increased to 1060 for a cache threshold of 60 and an adjustment of -10%, which is a reasonable choice based on the results obtained in Section VI-B. Hence, overall we get an improvement in throughput of 259/801 = 32.3% compared to the default scheduler. Also, we can see that a small cache threshold (β = 40) can actually degrade overall throughput, because some cores will be left unused to ensure that the interference requirements are obeyed.

VIII. RELATED WORK

Previous studies of HPC applications in the cloud have concluded that the cloud cannot compete with supercomputers on the metric of $/GFLOPS for large-scale HPC applications, because of bottlenecks such as interconnect and I/O performance [1]-[4]. However, clouds can be cost-effective for some applications, specifically those with less communication and at small scale [3], [4]. In this work, we explore VM placement techniques to make HPC in the cloud more economical through improved performance and resource utilization.

Work on scheduling in the cloud can be classified into three areas:
1) Initial VM Placement, where the problem is to map a VM (or set of VMs) of a single user request to an available pool of resources. 2) Offline VM Consolidation, where the problem is to map VMs from different user requests, hence with different resource requirements, to physical resources so as to minimize the number of active servers and save energy. 3) Live Migration, where remapping decisions are made for live VMs. Our focus is on the first problem, since our research targets infrastructure clouds (IaaS) such as Amazon EC2, where VM allocation and mapping happen as and when VM requests arrive. Offline VM consolidation has been extensively researched [24], [25], but it is not applicable to IaaS. Also, live migration has associated costs and introduces further noise.

For initial VM placement, existing cloud management systems such as OpenStack [17], Eucalyptus [26], and OpenNebula [27] use Round Robin (next available server), First Fit (first available server), or greedy ranking-based (best fit according to certain criteria, e.g., least free RAM) strategies, which operate in one dimension (CPU or memory). Other researchers have proposed genetic algorithms [28]. A detailed description and validation of VM consolidation heuristics is provided in [18]. However, these techniques ignore the intrinsic nature of HPC VMs: their tight coupling. Fan et al. discuss topology-aware deployment of scientific applications in the cloud, mapping the communication topology of a parallel application to the VM physical topology [7]. Recently, the OpenStack community has been working on making the scheduler architecture-aware and suitable for HPC [8], [9]. Amazon EC2 has a Cluster Compute instance type which allows placement groups, such that all instances within a placement group are expected to get low latency and full-bisection 10 Gbps bandwidth [29]. It is not known how strictly those guarantees are met or what techniques are used to meet them.

In this work, we extend our previous work on an HPC-aware scheduler [6] in multiple ways.

First, we use multi-dimensional online bin packing (MDOBP) to consider resources along all dimensions (such as CPU and memory); MDOBP algorithms have been explored in offline VM consolidation research, but we apply them to the initial VM placement problem and in the context of HPC in the cloud. Second, we leverage additional knowledge about application characteristics, such as HPC vs. non-HPC, synchronization, communication, and cache behavior, to limit cross-application interference. We drew insights from studies which have explored the effects of a shared multi-core node on cross-VM interference, in both the HPC and non-HPC domains [24], [30], [31].

There are many tools for scheduling HPC jobs on clusters, such as Oracle Grid Engine, ALPS, OpenPBS, SLURM, TORQUE, and Condor. They are all job schedulers or resource management systems for cluster or grid environments, and aim to utilize system resources in an efficient manner. They differ from scheduling in the cloud since they work with physical rather than virtual machines, and hence cannot benefit from the traits of virtualization such as consolidation; nodes are typically allotted to a single user and not shared with other users.

IX. LESSONS, CONCLUSIONS AND FUTURE WORK

We summarize the lessons learned through this research:

Although it may be counterintuitive, HPC can benefit greatly from consolidating VMs using smart co-locations.

A cloud management system such as OpenStack would greatly benefit from a scheduler which is aware of application characteristics such as cache, synchronization, and communication behavior, and HPC vs. non-HPC.

Careful VM placement and execution of HPC and other workloads can result in better resource utilization and cost reduction, and hence broader acceptance of HPC clouds.

Through experimental research, we explored the opportunities and challenges of VM consolidation for HPC in the cloud. We designed and implemented an HPC-aware scheduling algorithm for VM placement which achieves better resource utilization and limits cross-application interference through careful co-location. Through experimental and simulation results, we demonstrated benefits of up to a 32% increase in job throughput and performance improvements of up to 45%, while limiting the effect of jitter to 8%. In the future, we plan to consider other factors which can affect the performance of a VM in a shared multi-core node, such as I/O (network and disk). Another direction of research is to address other challenges for the adoption of the cloud by the HPC community.

ACKNOWLEDGMENTS

We thank Arpita Kundu, Ivan Nithin, Chaitra Padmanabhan, Jiban J. Sarma, and R Suryaprakash for setting up the cloud environment and helping with the scheduler implementation. We thank Alex Zhang for the discussions on server consolidation. The first author was supported by an HP Labs 2012 IRP award.

REFERENCES

[1] E. Walker, "Benchmarking Amazon EC2 for High-Performance Scientific Computing," ;login:.
[2] "Magellan Final Report," U.S. Department of Energy (DOE), Tech. Rep., 2011, /media/ascr/pdf/program-documents/docs/magellan Final Report.pdf.
[3] A. Gupta and D. Milojicic, "Evaluation of HPC Applications on Cloud," in Open Cirrus Summit (Best Student Paper), Atlanta, GA, Oct. 2011.
[4] A. Gupta et al., "Exploring the Performance and Mapping of HPC Applications to Platforms in the Cloud," in HPDC '12. New York, NY, USA: ACM, 2012.
[5] "MPI: A Message Passing Interface Standard," Message Passing Interface Forum.
[6] A. Gupta, D. Milojicic, and L. Kale, "Optimizing VM Placement for HPC in Cloud," in Workshop on Cloud Services, Federation and the 8th Open Cirrus Summit, San Jose, CA.
[7] P. Fan, Z. Chen, J. Wang, Z. Zheng, and M. R. Lyu, "Topology-Aware Deployment of Scientific Applications in Cloud Computing," in IEEE International Conference on Cloud Computing.
[8] "Nova Scheduling Adaptations," Download/Presentations/Workshop Scheduler.pdf.
[9] "HeterogeneousArchitectureScheduler," HeterogeneousArchitectureScheduler.
[10] R. N. Calheiros, R. Ranjan, A. Beloglazov, C. A. F. De Rose, and R. Buyya, "CloudSim: A Toolkit for Modeling and Simulation of Cloud Computing Environments and Evaluation of Resource Provisioning Algorithms," Softw. Pract. Exper., vol. 41, no. 1, Jan.
[11] Amazon Elastic Compute Cloud.
[12] Parallel Workloads Archive. [Online]. Available: ac.il/labs/parallel/workload/
[13] "Exascale Challenges," exascale-challenges.
[14] A. Avetisyan et al., "Open Cirrus: A Global Cloud Computing Testbed," IEEE Computer, vol. 43, April.
[15] NAS Parallel Benchmarks, Software/npb.html.
[16] P. Jetley, F. Gioachin, C. Mendes, L. V. Kale, and T. R. Quinn, "Massively Parallel Cosmological Simulations with ChaNGa," in IPDPS, 2008.
[17] OpenStack Cloud Computing Software.
[18] S. Lee, R. Panigrahy, V. Prabhakaran, V. Ramasubramanian, K. Talwar, L. Uyeda, and U. Wieder, "Validating Heuristics for Virtual Machines Consolidation," Microsoft Research, Tech. Rep.
[19] "KVM: Kernel-based Virtual Machine," Red Hat, Inc., Tech. Rep.
[20] A. J. Younge, R. Henschel, J. T. Brown, G. von Laszewski, J. Qiu, and G. C. Fox, "Analysis of Virtualization Technologies for High Performance Computing Environments," in IEEE International Conference on Cloud Computing, pp. 9-16.
[21] A. Bhatele, S. Kumar, C. Mei, J. C. Phillips, G. Zheng, and L. V. Kale, "Overcoming Scaling Challenges in Biomolecular Simulations across Multiple Platforms," in IPDPS 2008, April 2008.
[22] L. Kalé, "The Chare Kernel Parallel Programming Language and System," in Proceedings of the International Conference on Parallel Processing, vol. II, Aug. 1990.
[23] L. Kalé and A. Sinha, "Projections: A Scalable Performance Tool," in Parallel Systems Fair, International Parallel Processing Symposium, Apr.
[24] A. Verma, P. Ahuja, and A. Neogi, "Power-aware Dynamic Placement of HPC Applications," in ICS '08. New York, NY, USA: ACM, 2008.
[25] S. K. Garg, C. S. Yeo, A. Anandasivam, and R. Buyya, "Energy-Efficient Scheduling of HPC Applications in Cloud Computing Environments," CoRR.
[26] D. Nurmi et al., "The Eucalyptus Open-source Cloud-computing System," in Proceedings of Cloud Computing and Its Applications.
[27] OpenNebula: The Cloud Data Center Management Solution.
[28] J. Xu and J. A. B. Fortes, "Multi-Objective Virtual Machine Placement in Virtualized Data Center Environments," in GREENCOM-CPSCOM '10. Washington, DC, USA: IEEE Computer Society.
[29] High Performance Computing (HPC) on AWS, com/hpc-applications.
[30] J. Mars et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," in MICRO. New York, NY, USA: ACM, 2011.
[31] J. Han, J. Ahn, C. Kim, Y. Kwon, Y.-R. Choi, and J. Huh, "The Effect of Multi-core on HPC Applications in Virtualized Systems," in Euro-Par. Berlin, Heidelberg: Springer-Verlag, 2011.


International Journal of Advance Research in Computer Science and Management Studies Volume 3, Issue 6, June 2015 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Eucalyptus: An Open-source Infrastructure for Cloud Computing. Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com

Eucalyptus: An Open-source Infrastructure for Cloud Computing. Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com Eucalyptus: An Open-source Infrastructure for Cloud Computing Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com Exciting Weather Forecasts Commercial Cloud Formation Eucalyptus - Confidential What

More information

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems

The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems 202 IEEE 202 26th IEEE International 26th International Parallel Parallel and Distributed and Distributed Processing Processing Symposium Symposium Workshops Workshops & PhD Forum The Green Index: A Metric

More information

Workshop on Parallel and Distributed Scientific and Engineering Computing, Shanghai, 25 May 2012

Workshop on Parallel and Distributed Scientific and Engineering Computing, Shanghai, 25 May 2012 Scientific Application Performance on HPC, Private and Public Cloud Resources: A Case Study Using Climate, Cardiac Model Codes and the NPB Benchmark Suite Peter Strazdins (Research School of Computer Science),

More information

ENERGY EFFICIENT VIRTUAL MACHINE ASSIGNMENT BASED ON ENERGY CONSUMPTION AND RESOURCE UTILIZATION IN CLOUD NETWORK

ENERGY EFFICIENT VIRTUAL MACHINE ASSIGNMENT BASED ON ENERGY CONSUMPTION AND RESOURCE UTILIZATION IN CLOUD NETWORK International Journal of Computer Engineering & Technology (IJCET) Volume 7, Issue 1, Jan-Feb 2016, pp. 45-53, Article ID: IJCET_07_01_006 Available online at http://www.iaeme.com/ijcet/issues.asp?jtype=ijcet&vtype=7&itype=1

More information

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html

Datacenters and Cloud Computing. Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html Datacenters and Cloud Computing Jia Rao Assistant Professor in CS http://cs.uccs.edu/~jrao/cs5540/spring2014/index.html What is Cloud Computing? A model for enabling ubiquitous, convenient, ondemand network

More information

International Journal of Computer & Organization Trends Volume21 Number1 June 2015 A Study on Load Balancing in Cloud Computing

International Journal of Computer & Organization Trends Volume21 Number1 June 2015 A Study on Load Balancing in Cloud Computing A Study on Load Balancing in Cloud Computing * Parveen Kumar * Er.Mandeep Kaur Guru kashi University,Talwandi Sabo Guru kashi University,Talwandi Sabo Abstract: Load Balancing is a computer networking

More information

The Who, What, Why and How of High Performance Computing Applications in the Cloud

The Who, What, Why and How of High Performance Computing Applications in the Cloud The Who, What, Why and How of High Performance Computing Applications in the Cloud Abhishek Gupta, Laxmikant V. Kale, Filippo Gioachin, Verdi March, Chun Hui Suen, Bu-Sung Lee, Paolo Faraboschi, Richard

More information

Efficient and Enhanced Load Balancing Algorithms in Cloud Computing

Efficient and Enhanced Load Balancing Algorithms in Cloud Computing , pp.9-14 http://dx.doi.org/10.14257/ijgdc.2015.8.2.02 Efficient and Enhanced Load Balancing Algorithms in Cloud Computing Prabhjot Kaur and Dr. Pankaj Deep Kaur M. Tech, CSE P.H.D prabhjotbhullar22@gmail.com,

More information

Optimizing Shared Resource Contention in HPC Clusters

Optimizing Shared Resource Contention in HPC Clusters Optimizing Shared Resource Contention in HPC Clusters Sergey Blagodurov Simon Fraser University Alexandra Fedorova Simon Fraser University Abstract Contention for shared resources in HPC clusters occurs

More information

Task Scheduling for Efficient Resource Utilization in Cloud

Task Scheduling for Efficient Resource Utilization in Cloud Summer 2014 Task Scheduling for Efficient Resource Utilization in Cloud A Project Report for course COEN 241 Under the guidance of, Dr.Ming Hwa Wang Submitted by : Najuka Sankhe Nikitha Karkala Nimisha

More information

System Models for Distributed and Cloud Computing

System Models for Distributed and Cloud Computing System Models for Distributed and Cloud Computing Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Classification of Distributed Computing Systems

More information

Distributed communication-aware load balancing with TreeMatch in Charm++

Distributed communication-aware load balancing with TreeMatch in Charm++ Distributed communication-aware load balancing with TreeMatch in Charm++ The 9th Scheduling for Large Scale Systems Workshop, Lyon, France Emmanuel Jeannot Guillaume Mercier Francois Tessier In collaboration

More information

Performance Analysis of VM Scheduling Algorithm of CloudSim in Cloud Computing

Performance Analysis of VM Scheduling Algorithm of CloudSim in Cloud Computing IJECT Vo l. 6, Is s u e 1, Sp l-1 Ja n - Ma r c h 2015 ISSN : 2230-7109 (Online) ISSN : 2230-9543 (Print) Performance Analysis Scheduling Algorithm CloudSim in Cloud Computing 1 Md. Ashifuddin Mondal,

More information

Survey on Models to Investigate Data Center Performance and QoS in Cloud Computing Infrastructure

Survey on Models to Investigate Data Center Performance and QoS in Cloud Computing Infrastructure Survey on Models to Investigate Data Center Performance and QoS in Cloud Computing Infrastructure Chandrakala Department of Computer Science and Engineering Srinivas School of Engineering, Mukka Mangalore,

More information

Energy Constrained Resource Scheduling for Cloud Environment

Energy Constrained Resource Scheduling for Cloud Environment Energy Constrained Resource Scheduling for Cloud Environment 1 R.Selvi, 2 S.Russia, 3 V.K.Anitha 1 2 nd Year M.E.(Software Engineering), 2 Assistant Professor Department of IT KSR Institute for Engineering

More information

Figure 1. The cloud scales: Amazon EC2 growth [2].

Figure 1. The cloud scales: Amazon EC2 growth [2]. - Chung-Cheng Li and Kuochen Wang Department of Computer Science National Chiao Tung University Hsinchu, Taiwan 300 shinji10343@hotmail.com, kwang@cs.nctu.edu.tw Abstract One of the most important issues

More information

2) Xen Hypervisor 3) UEC

2) Xen Hypervisor 3) UEC 5. Implementation Implementation of the trust model requires first preparing a test bed. It is a cloud computing environment that is required as the first step towards the implementation. Various tools

More information

Profit-driven Cloud Service Request Scheduling Under SLA Constraints

Profit-driven Cloud Service Request Scheduling Under SLA Constraints Journal of Information & Computational Science 9: 14 (2012) 4065 4073 Available at http://www.joics.com Profit-driven Cloud Service Request Scheduling Under SLA Constraints Zhipiao Liu, Qibo Sun, Shangguang

More information

Affinity Aware VM Colocation Mechanism for Cloud

Affinity Aware VM Colocation Mechanism for Cloud Affinity Aware VM Colocation Mechanism for Cloud Nilesh Pachorkar 1* and Rajesh Ingle 2 Received: 24-December-2014; Revised: 12-January-2015; Accepted: 12-January-2015 2014 ACCENTS Abstract The most of

More information

Methodology for predicting the energy consumption of SPMD application on virtualized environments *

Methodology for predicting the energy consumption of SPMD application on virtualized environments * Methodology for predicting the energy consumption of SPMD application on virtualized environments * Javier Balladini, Ronal Muresano +, Remo Suppi +, Dolores Rexachs + and Emilio Luque + * Computer Engineering

More information

PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM

PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM PERFORMANCE ANALYSIS OF PaaS CLOUD COMPUTING SYSTEM Akmal Basha 1 Krishna Sagar 2 1 PG Student,Department of Computer Science and Engineering, Madanapalle Institute of Technology & Science, India. 2 Associate

More information

Reallocation and Allocation of Virtual Machines in Cloud Computing Manan D. Shah a, *, Harshad B. Prajapati b

Reallocation and Allocation of Virtual Machines in Cloud Computing Manan D. Shah a, *, Harshad B. Prajapati b Proceedings of International Conference on Emerging Research in Computing, Information, Communication and Applications (ERCICA-14) Reallocation and Allocation of Virtual Machines in Cloud Computing Manan

More information

Multilevel Communication Aware Approach for Load Balancing

Multilevel Communication Aware Approach for Load Balancing Multilevel Communication Aware Approach for Load Balancing 1 Dipti Patel, 2 Ashil Patel Department of Information Technology, L.D. College of Engineering, Gujarat Technological University, Ahmedabad 1

More information

Performance Gathering and Implementing Portability on Cloud Storage Data

Performance Gathering and Implementing Portability on Cloud Storage Data International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 17 (2014), pp. 1815-1823 International Research Publications House http://www. irphouse.com Performance Gathering

More information

Amazon EC2 Product Details Page 1 of 5

Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of

More information

Permanent Link: http://espace.library.curtin.edu.au/r?func=dbin-jump-full&local_base=gen01-era02&object_id=154091

Permanent Link: http://espace.library.curtin.edu.au/r?func=dbin-jump-full&local_base=gen01-era02&object_id=154091 Citation: Alhamad, Mohammed and Dillon, Tharam S. and Wu, Chen and Chang, Elizabeth. 2010. Response time for cloud computing providers, in Kotsis, G. and Taniar, D. and Pardede, E. and Saleh, I. and Khalil,

More information

C-Meter: A Framework for Performance Analysis of Computing Clouds

C-Meter: A Framework for Performance Analysis of Computing Clouds 9th IEEE/ACM International Symposium on Cluster Computing and the Grid C-Meter: A Framework for Performance Analysis of Computing Clouds Nezih Yigitbasi, Alexandru Iosup, and Dick Epema Delft University

More information

9/26/2011. What is Virtualization? What are the different types of virtualization.

9/26/2011. What is Virtualization? What are the different types of virtualization. CSE 501 Monday, September 26, 2011 Kevin Cleary kpcleary@buffalo.edu What is Virtualization? What are the different types of virtualization. Practical Uses Popular virtualization products Demo Question,

More information

White Paper. Recording Server Virtualization

White Paper. Recording Server Virtualization White Paper Recording Server Virtualization Prepared by: Mike Sherwood, Senior Solutions Engineer Milestone Systems 23 March 2011 Table of Contents Introduction... 3 Target audience and white paper purpose...

More information

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1 Performance Study Performance Characteristics of and RDM VMware ESX Server 3.0.1 VMware ESX Server offers three choices for managing disk access in a virtual machine VMware Virtual Machine File System

More information

Load Balancing to Save Energy in Cloud Computing

Load Balancing to Save Energy in Cloud Computing presented at the Energy Efficient Systems Workshop at ICT4S, Stockholm, Aug. 2014 Load Balancing to Save Energy in Cloud Computing Theodore Pertsas University of Manchester United Kingdom tpertsas@gmail.com

More information

Oracle Database Scalability in VMware ESX VMware ESX 3.5

Oracle Database Scalability in VMware ESX VMware ESX 3.5 Performance Study Oracle Database Scalability in VMware ESX VMware ESX 3.5 Database applications running on individual physical servers represent a large consolidation opportunity. However enterprises

More information

Study of Various Load Balancing Techniques in Cloud Environment- A Review

Study of Various Load Balancing Techniques in Cloud Environment- A Review International Journal of Computer Sciences and Engineering Open Access Review Paper Volume-4, Issue-04 E-ISSN: 2347-2693 Study of Various Load Balancing Techniques in Cloud Environment- A Review Rajdeep

More information

Multifaceted Resource Management for Dealing with Heterogeneous Workloads in Virtualized Data Centers

Multifaceted Resource Management for Dealing with Heterogeneous Workloads in Virtualized Data Centers Multifaceted Resource Management for Dealing with Heterogeneous Workloads in Virtualized Data Centers Íñigo Goiri, J. Oriol Fitó, Ferran Julià, Ramón Nou, Josep Ll. Berral, Jordi Guitart and Jordi Torres

More information

Group Based Load Balancing Algorithm in Cloud Computing Virtualization

Group Based Load Balancing Algorithm in Cloud Computing Virtualization Group Based Load Balancing Algorithm in Cloud Computing Virtualization Rishi Bhardwaj, 2 Sangeeta Mittal, Student, 2 Assistant Professor, Department of Computer Science, Jaypee Institute of Information

More information

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR

GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR GUEST OPERATING SYSTEM BASED PERFORMANCE COMPARISON OF VMWARE AND XEN HYPERVISOR ANKIT KUMAR, SAVITA SHIWANI 1 M. Tech Scholar, Software Engineering, Suresh Gyan Vihar University, Rajasthan, India, Email:

More information

Eucalyptus: An Open-source Infrastructure for Cloud Computing. Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com

Eucalyptus: An Open-source Infrastructure for Cloud Computing. Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com Eucalyptus: An Open-source Infrastructure for Cloud Computing Rich Wolski Eucalyptus Systems Inc. www.eucalyptus.com Exciting Weather Forecasts Commercial Cloud Formation Eucalyptus - Confidential What

More information

How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda

How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda How swift is your Swift? Ning Zhang, OpenStack Engineer at Zmanda Chander Kant, CEO at Zmanda 1 Outline Build a cost-efficient Swift cluster with expected performance Background & Problem Solution Experiments

More information

IMPROVEMENT OF RESPONSE TIME OF LOAD BALANCING ALGORITHM IN CLOUD ENVIROMENT

IMPROVEMENT OF RESPONSE TIME OF LOAD BALANCING ALGORITHM IN CLOUD ENVIROMENT IMPROVEMENT OF RESPONSE TIME OF LOAD BALANCING ALGORITHM IN CLOUD ENVIROMENT Muhammad Muhammad Bala 1, Miss Preety Kaushik 2, Mr Vivec Demri 3 1, 2, 3 Department of Engineering and Computer Science, Sharda

More information

Towards an understanding of oversubscription in cloud

Towards an understanding of oversubscription in cloud IBM Research Towards an understanding of oversubscription in cloud Salman A. Baset, Long Wang, Chunqiang Tang sabaset@us.ibm.com IBM T. J. Watson Research Center Hawthorne, NY Outline Oversubscription

More information

A Flexible Cluster Infrastructure for Systems Research and Software Development

A Flexible Cluster Infrastructure for Systems Research and Software Development Award Number: CNS-551555 Title: CRI: Acquisition of an InfiniBand Cluster with SMP Nodes Institution: Florida State University PIs: Xin Yuan, Robert van Engelen, Kartik Gopalan A Flexible Cluster Infrastructure

More information

A Middleware Strategy to Survive Compute Peak Loads in Cloud

A Middleware Strategy to Survive Compute Peak Loads in Cloud A Middleware Strategy to Survive Compute Peak Loads in Cloud Sasko Ristov Ss. Cyril and Methodius University Faculty of Information Sciences and Computer Engineering Skopje, Macedonia Email: sashko.ristov@finki.ukim.mk

More information

Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load

Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load Payment minimization and Error-tolerant Resource Allocation for Cloud System Using equally spread current execution load Pooja.B. Jewargi Prof. Jyoti.Patil Department of computer science and engineering,

More information

Amazon EC2 XenApp Scalability Analysis

Amazon EC2 XenApp Scalability Analysis WHITE PAPER Citrix XenApp Amazon EC2 XenApp Scalability Analysis www.citrix.com Table of Contents Introduction...3 Results Summary...3 Detailed Results...4 Methods of Determining Results...4 Amazon EC2

More information

Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures

Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures Managed Virtualized Platforms: From Multicore Nodes to Distributed Cloud Infrastructures Ada Gavrilovska Karsten Schwan, Mukil Kesavan Sanjay Kumar, Ripal Nathuji, Adit Ranadive Center for Experimental

More information

Energy Efficient Resource Management in Virtualized Cloud Data Centers

Energy Efficient Resource Management in Virtualized Cloud Data Centers 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing Energy Efficient Resource Management in Virtualized Cloud Data Centers Anton Beloglazov* and Rajkumar Buyya Cloud Computing

More information

Last time. Data Center as a Computer. Today. Data Center Construction (and management)

Last time. Data Center as a Computer. Today. Data Center Construction (and management) Last time Data Center Construction (and management) Johan Tordsson Department of Computing Science 1. Common (Web) application architectures N-tier applications Load Balancers Application Servers Databases

More information

C-Meter: A Framework for Performance Analysis of Computing Clouds

C-Meter: A Framework for Performance Analysis of Computing Clouds C-Meter: A Framework for Performance Analysis of Computing Clouds Nezih Yigitbasi, Alexandru Iosup, and Dick Epema {M.N.Yigitbasi, D.H.J.Epema, A.Iosup}@tudelft.nl Delft University of Technology Simon

More information

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study DISTRIBUTED SYSTEMS AND CLOUD COMPUTING A Comparative Study Geographically distributed resources, such as storage devices, data sources, and computing power, are interconnected as a single, unified resource

More information

Building a Private Cloud with Eucalyptus

Building a Private Cloud with Eucalyptus Building a Private Cloud with Eucalyptus 5th IEEE International Conference on e-science Oxford December 9th 2009 Christian Baun, Marcel Kunze KIT The cooperation of Forschungszentrum Karlsruhe GmbH und

More information

PARALLELS CLOUD SERVER

PARALLELS CLOUD SERVER PARALLELS CLOUD SERVER Performance and Scalability 1 Table of Contents Executive Summary... Error! Bookmark not defined. LAMP Stack Performance Evaluation... Error! Bookmark not defined. Background...

More information

Cloud Computing Capacity Planning. Maximizing Cloud Value. Authors: Jose Vargas, Clint Sherwood. Organization: IBM Cloud Labs

Cloud Computing Capacity Planning. Maximizing Cloud Value. Authors: Jose Vargas, Clint Sherwood. Organization: IBM Cloud Labs Cloud Computing Capacity Planning Authors: Jose Vargas, Clint Sherwood Organization: IBM Cloud Labs Web address: ibm.com/websphere/developer/zones/hipods Date: 3 November 2010 Status: Version 1.0 Abstract:

More information

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems

A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems A Taxonomy and Survey of Energy-Efficient Data Centers and Cloud Computing Systems Anton Beloglazov, Rajkumar Buyya, Young Choon Lee, and Albert Zomaya Present by Leping Wang 1/25/2012 Outline Background

More information

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency WHITE PAPER Solving I/O Bottlenecks to Enable Superior Cloud Efficiency Overview...1 Mellanox I/O Virtualization Features and Benefits...2 Summary...6 Overview We already have 8 or even 16 cores on one

More information

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems

A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems A Hybrid Scheduling Approach for Scalable Heterogeneous Hadoop Systems Aysan Rasooli Department of Computing and Software McMaster University Hamilton, Canada Email: rasooa@mcmaster.ca Douglas G. Down

More information

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays Red Hat Performance Engineering Version 1.0 August 2013 1801 Varsity Drive Raleigh NC

More information

Energy Conscious Virtual Machine Migration by Job Shop Scheduling Algorithm

Energy Conscious Virtual Machine Migration by Job Shop Scheduling Algorithm Energy Conscious Virtual Machine Migration by Job Shop Scheduling Algorithm Shanthipriya.M 1, S.T.Munusamy 2 ProfSrinivasan. R 3 M.Tech (IT) Student, Department of IT, PSV College of Engg & Tech, Krishnagiri,

More information

VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing

VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing Journal of Information & Computational Science 9: 5 (2012) 1273 1280 Available at http://www.joics.com VON/K: A Fast Virtual Overlay Network Embedded in KVM Hypervisor for High Performance Computing Yuan

More information

Virtual InfiniBand Clusters for HPC Clouds

Virtual InfiniBand Clusters for HPC Clouds Virtual InfiniBand Clusters for HPC Clouds April 10, 2012 Marius Hillenbrand, Viktor Mauch, Jan Stoess, Konrad Miller, Frank Bellosa SYSTEM ARCHITECTURE GROUP, 1 10.04.2012 Marius Hillenbrand - Virtual

More information

International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 6, Issue 4, April-2015 36 An Efficient Approach for Load Balancing in Cloud Environment Balasundaram Ananthakrishnan Abstract Cloud computing

More information

Cost-effective Resource Provisioning for MapReduce in a Cloud

Cost-effective Resource Provisioning for MapReduce in a Cloud 1 -effective Resource Provisioning for MapReduce in a Cloud Balaji Palanisamy, Member, IEEE, Aameek Singh, Member, IEEE Ling Liu, Senior Member, IEEE Abstract This paper presents a new MapReduce cloud

More information

Cloud Computing through Virtualization and HPC technologies

Cloud Computing through Virtualization and HPC technologies Cloud Computing through Virtualization and HPC technologies William Lu, Ph.D. 1 Agenda Cloud Computing & HPC A Case of HPC Implementation Application Performance in VM Summary 2 Cloud Computing & HPC HPC

More information

A Framework for Performance Analysis and Tuning in Hadoop Based Clusters

A Framework for Performance Analysis and Tuning in Hadoop Based Clusters A Framework for Performance Analysis and Tuning in Hadoop Based Clusters Garvit Bansal Anshul Gupta Utkarsh Pyne LNMIIT, Jaipur, India Email: [garvit.bansal anshul.gupta utkarsh.pyne] @lnmiit.ac.in Manish

More information

Resource usage monitoring for KVM based virtual machines

Resource usage monitoring for KVM based virtual machines 2012 18th International Conference on Adavanced Computing and Communications (ADCOM) Resource usage monitoring for KVM based virtual machines Ankit Anand, Mohit Dhingra, J. Lakshmi, S. K. Nandy CAD Lab,

More information

International Journal of Computer & Organization Trends Volume20 Number1 May 2015

International Journal of Computer & Organization Trends Volume20 Number1 May 2015 Performance Analysis of Various Guest Operating Systems on Ubuntu 14.04 Prof. (Dr.) Viabhakar Pathak 1, Pramod Kumar Ram 2 1 Computer Science and Engineering, Arya College of Engineering, Jaipur, India.

More information

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction

Keywords: Cloudsim, MIPS, Gridlet, Virtual machine, Data center, Simulation, SaaS, PaaS, IaaS, VM. Introduction Vol. 3 Issue 1, January-2014, pp: (1-5), Impact Factor: 1.252, Available online at: www.erpublications.com Performance evaluation of cloud application with constant data center configuration and variable

More information

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud

StACC: St Andrews Cloud Computing Co laboratory. A Performance Comparison of Clouds. Amazon EC2 and Ubuntu Enterprise Cloud StACC: St Andrews Cloud Computing Co laboratory A Performance Comparison of Clouds Amazon EC2 and Ubuntu Enterprise Cloud Jonathan S Ward StACC (pronounced like 'stack') is a research collaboration launched

More information

Cost Effective Selection of Data Center in Cloud Environment

Cost Effective Selection of Data Center in Cloud Environment Cost Effective Selection of Data Center in Cloud Environment Manoranjan Dash 1, Amitav Mahapatra 2 & Narayan Ranjan Chakraborty 3 1 Institute of Business & Computer Studies, Siksha O Anusandhan University,

More information

Virtual Machine Instance Scheduling in IaaS Clouds

Virtual Machine Instance Scheduling in IaaS Clouds Virtual Machine Instance Scheduling in IaaS Clouds Naylor G. Bachiega, Henrique P. Martins, Roberta Spolon, Marcos A. Cavenaghi Departamento de Ciência da Computação UNESP - Univ Estadual Paulista Bauru,

More information

Network Infrastructure Services CS848 Project

Network Infrastructure Services CS848 Project Quality of Service Guarantees for Cloud Services CS848 Project presentation by Alexey Karyakin David R. Cheriton School of Computer Science University of Waterloo March 2010 Outline 1. Performance of cloud

More information

Scheduling using Optimization Decomposition in Wireless Network with Time Performance Analysis

Scheduling using Optimization Decomposition in Wireless Network with Time Performance Analysis Scheduling using Optimization Decomposition in Wireless Network with Time Performance Analysis Aparna.C 1, Kavitha.V.kakade 2 M.E Student, Department of Computer Science and Engineering, Sri Shakthi Institute

More information

Big Data Management in the Clouds and HPC Systems

Big Data Management in the Clouds and HPC Systems Big Data Management in the Clouds and HPC Systems Hemera Final Evaluation Paris 17 th December 2014 Shadi Ibrahim Shadi.ibrahim@inria.fr Era of Big Data! Source: CNRS Magazine 2013 2 Era of Big Data! Source:

More information

Sla Aware Load Balancing Algorithm Using Join-Idle Queue for Virtual Machines in Cloud Computing

Sla Aware Load Balancing Algorithm Using Join-Idle Queue for Virtual Machines in Cloud Computing Sla Aware Load Balancing Using Join-Idle Queue for Virtual Machines in Cloud Computing Mehak Choudhary M.Tech Student [CSE], Dept. of CSE, SKIET, Kurukshetra University, Haryana, India ABSTRACT: Cloud

More information

INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD

INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD INCREASING SERVER UTILIZATION AND ACHIEVING GREEN COMPUTING IN CLOUD M.Rajeswari 1, M.Savuri Raja 2, M.Suganthy 3 1 Master of Technology, Department of Computer Science & Engineering, Dr. S.J.S Paul Memorial

More information

Simulation-based Evaluation of an Intercloud Service Broker

Simulation-based Evaluation of an Intercloud Service Broker Simulation-based Evaluation of an Intercloud Service Broker Foued Jrad, Jie Tao and Achim Streit Steinbuch Centre for Computing, SCC Karlsruhe Institute of Technology, KIT Karlsruhe, Germany {foued.jrad,

More information

Enhancing the Scalability of Virtual Machines in Cloud

Enhancing the Scalability of Virtual Machines in Cloud Enhancing the Scalability of Virtual Machines in Cloud Chippy.A #1, Ashok Kumar.P #2, Deepak.S #3, Ananthi.S #4 # Department of Computer Science and Engineering, SNS College of Technology Coimbatore, Tamil

More information

Infrastructure as a Service (IaaS)

Infrastructure as a Service (IaaS) Infrastructure as a Service (IaaS) (ENCS 691K Chapter 4) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ References 1. R. Moreno et al.,

More information

EFFICIENT VM LOAD BALANCING ALGORITHM FOR A CLOUD COMPUTING ENVIRONMENT

EFFICIENT VM LOAD BALANCING ALGORITHM FOR A CLOUD COMPUTING ENVIRONMENT EFFICIENT VM LOAD BALANCING ALGORITHM FOR A CLOUD COMPUTING ENVIRONMENT Jasmin James, 38 Sector-A, Ambedkar Colony, Govindpura, Bhopal M.P Email:james.jasmin18@gmail.com Dr. Bhupendra Verma, Professor

More information

Energy Optimized Virtual Machine Scheduling Schemes in Cloud Environment

Energy Optimized Virtual Machine Scheduling Schemes in Cloud Environment Abstract Energy Optimized Virtual Machine Scheduling Schemes in Cloud Environment (14-18) Energy Optimized Virtual Machine Scheduling Schemes in Cloud Environment Ghanshyam Parmar a, Dr. Vimal Pandya b

More information

Design of Simulator for Cloud Computing Infrastructure and Service

Design of Simulator for Cloud Computing Infrastructure and Service , pp. 27-36 http://dx.doi.org/10.14257/ijsh.2014.8.6.03 Design of Simulator for Cloud Computing Infrastructure and Service Changhyeon Kim, Junsang Kim and Won Joo Lee * Dept. of Computer Science and Engineering,

More information

CiteSeer x in the Cloud

CiteSeer x in the Cloud Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar

More information

Rackspace Cloud Databases and Container-based Virtualization

Rackspace Cloud Databases and Container-based Virtualization Rackspace Cloud Databases and Container-based Virtualization August 2012 J.R. Arredondo @jrarredondo Page 1 of 6 INTRODUCTION When Rackspace set out to build the Cloud Databases product, we asked many

More information