Analysis of server utilization and resource consumption of the Google data centre trace log Robert Mugonza MSc Advanced Computer Science


Analysis of server utilization and resource consumption of the Google data centre trace log

Robert Mugonza
MSc Advanced Computer Science
2012/2013

The candidate confirms that the work submitted is their own and that appropriate credit has been given where reference has been made to the work of others. I understand that failure to attribute material which is obtained from another source may be considered as plagiarism.

(Signature of student)

Acknowledgements

This dissertation is the product of the active and passive contributions of a number of individuals and institutions. But above all, to God, the source of knowledge, wisdom and life: I will always owe my success to you.

I extend my sincere thanks to my supervisor, Professor Jie Xu. You have been very instrumental in all my work. I recall all the encouraging words and the numerous meetings to improve my research writing and critical analysis of results; I have carried that knowledge forward for future studies. To Peter Garraghan, a research student in the School of Computing, and Dr. Paul, for your timely and unconditional support towards the fulfilment of this degree requirement, I am grateful. I would also like to thank my project assessor, Karim Djemame, for very useful and directional feedback at the progress meeting and on the interim report.

To my friends, especially Chawanagwa, Faustina, David and Cathy, I thank you for the courage and motivation you have imparted to me. To my personal tutor, Dr. Hamish Carr, and the director of the School of Computing, Dr. Andy Bulpit: thank you for the guidance and encouragement you accorded me. To the Commonwealth Scholarship Commission (CSC), I am grateful for the financial support, without which funding for studies in the UK would still appear a nightmare.

Special thanks go to my family members; I am indebted. Especially my sweet-hearted Mum, Mrs. Janefrances Mulwana Nakku, and my Dad, Mr. Mulwana Deus, who have endlessly reminded me of their prayers and courage. Finally, to my wife and best friend, Birungi Immaculate, and my two daughters, Divine Mary and Roberta Desire: you are the reason for this sweat and these sacrifices. Thank you for offering your unconditional long-distance love, care and support. I love you.

List of Acronyms

APIs : Application Programming Interfaces
CC : Cloud cluster
CCIF : Cloud Computing Interoperability Forum
CPU : Central Processing Unit
CSC : Commonwealth Scholarship Commission
DCC : Data Centre Cloud
DHTs : Distributed Hash Tables
IaaS : Infrastructure as a Service
IDEs : Integrated Development Environments
OCC : Open Cloud Consortium
OVF : Open Virtualization Format
PaaS : Platform as a Service
QoS : Quality of Service
RAM : Random Access Memory
REST : Representational State Transfer
SaaS : Software as a Service
SAN : Storage Area Network
SLAs : Service Level Agreements
UCI : Unified Cloud Interface
VM : Virtual Machine
WBS : Work Breakdown Structure

Table of Contents

Acknowledgements ... i
List of Acronyms ... ii

Chapter One: Introduction & Motivation
    1.1 Overview
    1.2 Background
    1.3 Problem
    1.4 Aim of project
    1.5 Objectives
    1.6 Minimum requirements
    1.7 Proposed Contributions
    1.8 Initial Project Plan
    1.9 Research Questions

Chapter Two: Literature Review
    2.1 Cloud computing
        2.1.1 Definition of Cloud computing
        2.1.2 Cloud Computing Characteristics
        2.1.3 Cloud Computing service deployment Models
            a) Software as a Service (SaaS)
            b) Infrastructure as a Service (IaaS)
            c) Platform as a Service (PaaS)
        2.1.4 Cloud Types
            a) Private Clouds
            b) Public Clouds
            c) Hybrid Clouds or federated clouds
            d) Community Clouds
        2.1.5 Data transfer within and between clouds
        2.1.6 Why clouds?
        2.1.8 Enabling Technologies in the Cloud
    Data centres
        Why Data centres
        Data centre Virtualisation
        Cloud OS to enable Virtualised Data centres
        Types of Data centre Virtualisation
            a) Storage virtualization
            b) Network virtualization
            c) Server virtualization
        Data centre failure
        Power Consumption Management in Data Centres
    User behaviour on the cloud
    Workload characterisation and task Classifications
    VM Allocation
    Energy efficiency
    Google and its Data sets (Traces)

Chapter Three: Methodology & Design
    Project Management
    Revised Project plan
    Key activities
    Progress Meetings
    Project Methodology
    Evaluation of Project

Chapter Four: Implementation
    4.1 Data set Description
        Data set overview
        Data set structural Model
        Data set assumptions
    4.2 Task Analysis
        Determining task structure and Composition
        Coarse grain Analysis
        Distributions for workload resource consumption & Server Utilisation
        Cluster Distributions for combined resource consumption (Task Memory vs Task CPU-Core Utilisation)
        Clustering
        Google Cluster Aggregation
        Investigative clustering
        Identification of Work Load Dimensions
        Application of Simple K-means clustering
        Results of the Simple K-means clustering
        Work load - Energy consumption
        Energy Waste Quantification and Failure Analysis
        Energy usage quantification
        Discussion of the energy models

Chapter Five: Evaluation
    Evaluating the Methodology
    Evaluating the Results
    Limitations
    Future Work
    Conclusion

Appendices

List of Figures

Figure 1 : Task and Job Life Cycle [Di, 2012]
Figure 2 : A four-layered cloud architecture [Foster et al., 2008]
Figure 3 : A generated Cloud Architecture design [Foster et al., 2008]
Figure 4 : Enabling Technologies in the cloud [Hwang, 2010]
Figure 5 : Data centre network structure [Gill et al., 2011]
Figure 6 : Data centre virtualisation and automation trend [Grieser, 2008]
Figure 7 : The Cloud software for virtualised operating systems [Hang, 2010]
Figure 8 : Visualisation of failure panorama: Sep 09 to Sep 10 [Gill et al., 2011]
Figure 9 : Diagram adopted from ADS lecture notes [Jie Xu, 2013]
Figure 10 : Project methodology
Figure 11 : Google data centre trace - Component model
Figure 12 : Server utilisation
Figure 13 : Task submissions
Figure 14 : Resource utilisation per jobtype
Figure 15 : Energy waste models for architectures 0 & 1 (energy waste models for tasks of other jobtypes are provided in the appendix)

List of Tables

Table 1 : Summary of statistics contrasted with the 29-day trace
Table 2 : Some of the trace statistics
Table 3 : Cluster statistics
Table 4 : Summary of the cluster properties of the trace (a & b)
Table 5 : Simple statistical composition of the Cloud cluster
Table 6 : Results from the clustering performed
Table 7 : Failure analysis of the 7-hour trace
Table 8 : A section of the Dell PowerEdge R520 Model server specifications

Abstract

Understanding of the behavioural aspects of workloads has existed since the birth of grid computing in the early 1980s [15]. It is now widely important as studies supporting efficient energy computing solutions are underway. Cloud computing, a business concept for convenient, on-demand, scalable internet services as in [31], has quickly grown into a high-profile computing paradigm that overlaps with ubiquitous, grid, mobile and utility computing. In this project, we study the first ever Google data centre cloud traces with the aim of furthering our understanding of server utilisation and resource consumption within data centre clouds. To achieve this, we explore the trace logs via a combination of coarse-grain and fine-grain analysis: discovering interesting statistical properties of the trace log, characterising the trace, and studying the impact of unrealistic task behaviour on data centre energy efficiency. Specifically, we analyse a 7-hour trace of Google data centre cloud server utilisation and resource (CPU and memory) consumption in order to discover the deployed machine architectures, the virtual machine population and configurations, how resource usage varies per architecture, and workload resource allocation and its impact on resource usage. This is achieved initially through workload classification into distinct groups, with a view to identifying whether there is correlation in resource consumption among job tasks and whether particular job categories contribute more to energy consumption; either answer is a vital input for future users of the data centre trace. We also quantify the corresponding resource usage of tasks, separating the percentage of resources that is wasted from that which is productively utilised during task execution.
This approach enables calculation of the proportion of energy that was wasted versus energy that was productively utilised. We finish by evaluating whether the percentage resource waste, 39% from the 7-hour trace, is a deciding factor in the total data centre energy waste, using statistics obtained from this analysis and from the 29-day Google trace analyses. From the results of a two-sample test, based on resource means, standard deviations and sample sizes, we deduce that the 39% is representative of the 52% resource waste from [44].

Introduction & Motivation

1.1 Overview

Cloud computing, a business concept for convenient, on-demand provisioning of scalable internet services, has quickly grown into a high-profile computing paradigm that overlaps with numerous computing fields, ranging from ubiquitous, pervasive, grid and mobile to utility computing. Scalability of such services is currently achieved through the wide-scale implementation of Cloud data centres [52], the main operational environments of service providers such as Google, Amazon and Microsoft. The services provided range from simple to complex: storage, text processors, IDEs (Google code labs), and so on. Various technologies exist to manage user-level requests (workload) for cloud services in the data centres. These include virtualisation with automation, consolidation, VMs, efficient schedulers and scheduling policies, migration, and so forth. Managing such workload requires service providers to implement process flows and mechanisms that allow efficient scaling of computing resources up and down on demand. At the same time, providers need to maximise the utilisation of resources and minimise running costs while maintaining Service Level Agreements (SLAs).

Moreover, users need a good understanding of their applications to take advantage of the elastic properties of the service. Because of this demand for application knowledge, over-estimation and over-allocation of computing resources (memory, bandwidth and CPU) occur. The limited computing resources may be held for long periods of time, a condition that can deprive other user application requests (tasks) of execution, and hence a root cause of many failures in the data centre. This behaviour of tasks in the data centre cloud is unrealistic owing to the high amount of resource wasted, and it is an interest for exploration by this research project and by most cloud computing studies, including [31], [7], [32], [1], [27], [44] and so forth. In this project, we study the first ever Google data centre cloud traces with the aim of furthering our understanding of server utilisation and resource consumption within data centre clouds. To achieve this, we explore the trace logs via a combination of coarse-grain and fine-grain analysis: discovering interesting statistical properties of the trace log, characterising the trace, and studying the impact of unrealistic task behaviour on data centre energy efficiency. Specifically, we analyse a 7-hour trace of Google data centre cloud server utilisation and resource (CPU and memory) consumption in order to discover the deployed machine architectures, the virtual machine population and configurations, how resource usage varies per architecture, and workload resource allocation and its impact on resource usage. This is achieved initially through workload classification into distinct groups, with a view to identifying whether there is correlation in resource consumption among job tasks and whether particular job categories contribute more to energy consumption.
Either answer is a vital input for future users of the data centre trace. We quantify the corresponding resource usage of tasks, separating the percentage of resources that is wasted from that which is productively utilised during task execution. This approach enables calculation of the proportion of energy wasted versus energy productively utilised. We finish by evaluating whether the percentage resource waste from the 7-hour trace is a deciding factor in the total data centre energy waste, using statistics obtained from this analysis and from the 29-day Google trace analyses.

1.2 Background

Cloud computing, a business concept for convenient, on-demand, scalable internet services as in [31], has quickly grown into a high-profile computing paradigm that overlaps with ubiquitous, grid, mobile and utility computing. [45] identifies the existence of many documented definitions of cloud computing to date. However, the popular choice is the definition developed by the National Institute of Standards and Technology (NIST): "Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [27].

The advent of Cloud computing has greatly revolutionised the IT industry. Its wide spread has pushed big IT firms such as Google, iCloud, Dropbox, Facebook, IBM, Microsoft and many others to establish large-scale data centres around the world containing thousands or even millions of compute nodes: armies of computing servers. A cloud, for this paper, is identified as in [4] as data centre hardware and software. Within the last decade, large-scale clouds and cloud systems have been set up to handle the massively escalating demand for high-performance computing services such as storage and CPU speed, coupled with high reliability measures such as Quality of Service (QoS). Examples include the huge Google clouds (Omega, the next-generation cluster system, and Borg) and the Twitter clouds, referred to by Twitter as Mesos [43]. In practice, these systems provide automated resource allocation. Resource allocation in data centre clouds implies an efficient way of packing work (scheduling) and applications across the data centre computing nodes or servers.
These cloud systems provide the intelligence for controlling work across the companies' data centre resources. Example work includes Gmail, Skype, Dropbox, iCloud, Google Maps, etcetera. All work is categorised as tasks, and the resource scheduler or allocator assigns this workload wherever and whenever it can find free computing resources. Internet users such as academia, web-hosting firms and other cloud customers access basic applications like Google Search, webmail, Google Documents (Google Drive) and other cloud services on a daily basis. At the end-user level, customers require a stable internet connection, that is, a number of mega- or gigabytes of bandwidth, plus large amounts of disk storage and Random Access Memory (RAM). Clients such as aerospace, meteorology and hosting

will consistently require stable bandwidth at all times to run their day-to-day client-side services efficiently and effectively, using applications hosted by cloud providers like Google. Of course, the services come at a cost per access period, as may be arranged in the Service Level Agreements (SLAs) made.

1.3 Problem

Cloud services accessed by these customers, either as SaaS or as utility computing (pay-as-you-go) services on a timed basis, are monitored by cloud resource management frameworks such as Hadoop, CloudSim, Eucalyptus and MapReduce, in terms of access logs per application task and other task-related properties such as resource requirements and latency. Application tasks are organised in distinct groups called jobs. At the data cluster level, each job submitted to the server machines is constrained by a set of compute requirements used for scheduling its various tasks. On a periodic basis, data centre cell management systems record this job activity (workload) on the compute machines in a single file, called a trace log; examples include the Google traces [21], the Amazon traces [2], the Microsoft traces [21] and the Yahoo traces [31]. Based on insights from the Google data centre cloud traces [21], [31] and [22], Cloud computing presents a computing challenge characterised by unrealistic resource consumption due to failures that may occur during the task and job life cycle on the server machine. Figure 1 demonstrates this type of behaviour.

Figure 1 : Task and Job Life Cycle [16]

Task events may also reflect task state transitions such as task exit due to successful completion, or loss, eviction, or failure immediately after being scheduled for running. The general understanding of what happens in data centre schedulers is that tasks that have just been

evicted, failed, or killed remain runnable and may be resubmitted to the controllers for execution. Furthermore, the actual data centre scheduler policy on the frequency of task rescheduling in the event of abnormal termination varies between jobs. Some jobs will be rescheduled as many times as required for successful execution, while others may be descheduled indefinitely after a termination of any form, caused by the scheduler not allocating the required resources or by other factors. The resource or energy waste reflected by this unpredictable behaviour in data centre clouds calls for a system-wide solution that improves resource consumption and energy efficiency. And since it is not easy to determine precisely what quantity of energy is wasted due to poor resource deployment and inelastic provisioning in data centre clouds, research geared towards quantifying the resultant resource waste of a data centre experiencing crash-loops and other high-level failures is highly motivating. This project considers the analysis of server utilisation and resource consumption in a real-world Cloud data centre (from Google). This includes: calculating the utilisation of the entire data centre over the period of the month; analysing resource consumption by architecture type to explore the correlation of workload with server utilisation; and developing distributions for server utilisation per architecture type (jobtype). Considering the effect of crash-loops in the data centre traces, the server utilisation and resource consumption analysis will also help separate resource waste due to the operational environment from waste due to failures.
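The intended split between productive and wasted consumption can be sketched as follows. This is a minimal Python illustration only: the record fields and final-state names are simplified stand-ins, not the actual Google trace schema, and counting all consumption by non-finishing tasks as waste is one possible accounting rule.

```python
def resource_waste(task_records):
    """Split consumed CPU-seconds into productive use and waste.

    A task's consumption counts as waste when it did not terminate
    successfully (evicted, failed, or killed), mirroring the
    crash-loop behaviour described above: that work is redone on
    resubmission.
    """
    used = wasted = 0.0
    for rec in task_records:
        if rec["final_state"] == "FINISH":
            used += rec["cpu_seconds"]
        else:  # EVICT / FAIL / KILL
            wasted += rec["cpu_seconds"]
    total = used + wasted
    return used, wasted, (wasted / total if total else 0.0)

# Toy trace: one successful task, two abnormal terminations.
sample = [
    {"cpu_seconds": 120.0, "final_state": "FINISH"},
    {"cpu_seconds": 80.0, "final_state": "FAIL"},
    {"cpu_seconds": 50.0, "final_state": "EVICT"},
]
used, wasted, fraction = resource_waste(sample)  # 120.0, 130.0, 0.52
```

The same accumulation applied per architecture or per jobtype yields the per-group waste distributions discussed later.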
1.4 Aim of project

To analyse the server utilisation and resource consumption in a data centre, based on the Google data centre trace, in order to understand specific system behaviours, such as crash-loops and high-level failures, and their impact on the data centre's energy efficiency.

1.5 Objectives

To achieve this, I will specifically:

i. Calculate the utilisation of the entire data centre over a given period of time, inferring associations between the time of day and server utilisation, and between customer/client workload and server utilisation.

ii. Analyse server utilisation per architecture, CPU type, number of CPU cores and amount of memory. The distribution information gathered here can be used to infer local task configurations and resource requirement configurations.

iii. Analyse the high-level failure characteristics and resource utilisation of sampled time slices exhibiting crash-loops, stable system status, and other behaviours. A two-sample t-test is performed to determine the impact of percentage resource waste in a real-world cloud data centre.

iv. Develop statistical distributions for server utilisation per architecture type, and for other resource consumption such as memory (RAM).

v. Analyse the impact of system behaviours on energy efficiency.

1.6 Minimum requirements

i. Survey of related literature.

ii. Computerised analysis of server utilisation and CPU consumption based on a set of real data over a given period of time (e.g., 7 hours) from the Google trace log.

iii. As a result of the analysis, identification and development of statistical distributions of server utilisation and resource consumption per jobtype.

Further enhancements:

iv. Workload models quantifying workload resource waste and energy waste.

1.7 Proposed Contributions

i. Server utilisation models per architecture type for CPU and/or other contended resources such as RAM. With these models, temporal analysis of workload against server utilisation will be described.

ii. A combined approach to analysis of the Google cluster data traces that includes coarse-grain and fine-grain analysis plus investigative clustering.

iii. Quantification of the resultant resource waste of the data centre when it is experiencing crash-loops and other high-level failures.
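The two-sample comparison in objective iii can be computed from summary statistics alone. Below is a Python sketch of the Welch t-statistic with the Welch-Satterthwaite degrees of freedom; the means, standard deviations and sample sizes passed in are illustrative placeholders, not the dissertation's actual figures.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample (Welch) t-statistic and approximate degrees of
    freedom for comparing two means with unequal variances and
    sample sizes."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Placeholder statistics: mean waste fraction in the 7-hour sample
# versus a (hypothetical) 29-day sample.
t, df = welch_t(0.39, 0.10, 400, 0.52, 0.12, 10000)
```

A p-value would then be read from the t-distribution with `df` degrees of freedom, e.g. via `scipy.stats`.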

1.8 Initial Project Plan

In order to control and manage the project, a Work Breakdown Structure (WBS) and a Gantt chart were created during the planning stage, as shown in Appendix C. Some changes may take place during the implementation phase if challenges are encountered. This project plan helps to track the completion of each development stage within the time-scale, to ensure the delivery of a successfully completed project.

1.9 Research Questions

a) Does the aggregation observed in the data centre trace log particularly impact realistic energy consumption or not?

b) Could percentage composition be a determining factor in energy waste?

Chapter Two: Literature Review

Understanding the resource utilisation and server characterisation of large-scale data centre cloud management systems is vital if cloud service providers and cloud users are to optimise their operations while maintaining Quality of Service. At the cloud service provider level, achieving cloud user trust is the priority when providing cloud services; for the cloud user, timely, scalable and secure access to the provider's services, bounded by Service Level Agreements (SLAs), is mandatory. And since these services are handled in data centre environments, it is also important that the energy and resources utilised within data centres are evaluated and monitored in a timely fashion, and judged as efficient or inefficient. This section explores the existing literature on resource utilisation and server characterisation, taking into consideration the data centre cloud environment, data centre cloud resource allocation, server characterisation and the consumption of server resources such as memory, bandwidth and processor cores, as well as user behaviour in the cloud and its impact on resource usage and data centre energy efficiency.

2.1 Cloud computing

Cloud computing provides cloud customers with computing resources as a service [Tran et al., 2012]. With major cloud providers such as Amazon, Google, Verizon, IBM and Microsoft running and maintaining large-scale clouds, cloud users are able to utilise computational resources, storage, software and/or data access on a cloud with limited knowledge of the machine(s) providing the service, a concept shared with ubiquitous or pervasive computing. Ubiquitous or pervasive computing refers to a form of computing where resources are available and accessible whenever and wherever [30]. By using a cloud, a user easily scales their computational requirements to meet their current needs.
Herewith, the cloud services accessed by cloud customers encompass all services accessed via the internet: not only IT services such as software-as-a-service (SaaS) and storage or server capacity as a service, but also many non-IT business and consumer services [Gens, 2008]. These can exhibit various characteristics or attributes, which include:

Key Cloud Services Attributes (cloud offerings must meet all eight criteria):

1. Outsourced: off-site, third-party provider.
2. Online: accessed via the internet.
3. Ease of use: minimal or no IT skills required to implement.
4. Provisioning: self-service requesting, near real-time deployment, dynamic and fine-grained scaling.
5. Pricing model: fine-grained and usage-based (at least available as an option).
6. User interface: browser and successors.
7. System interface: web services APIs.
8. Customised: shared resources / common versions, with customisation around the shared services.

Figure 2 : Cloud computing attributes [Gens, 2008].

Cloud computing as a form of computing retains its confusing nature, or ambiguity, if not addressed in its simplest detail. It has been said to have the power to transform the IT industry [Armbrust et al., 2010], and indeed all plans are underway. Understanding the cloud computing notion requires a clear definition of cloud computing. Owing to the increasing interest of researchers in the cloud, many studies have been conducted aiming to standardise the definition of cloud computing, which otherwise confuses young and upcoming cloud researchers.

2.1.1 Definition of Cloud computing

According to [45], there exist many documented definitions of cloud computing to date. The National Institute of Standards and Technology (NIST) describes Cloud computing as "a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks,

servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction" [27]. However, a much simpler definition of cloud computing is the one in [35]: "a simple style of computing in which massively scalable IT-enabled capabilities are delivered as a service to multiple customers using Internet technologies". [David, 2010] describes the term as a style of computing where scalable and elastic IT-related capabilities are provided as a service to external customers using Internet technologies. It is important to note that since its inception, cloud computing has been envisaged as entrusting remote services with a user's data, software and computation.

2.1.2 Cloud Computing Characteristics

Various Cloud computing forms have been identified; however, [51] characterises computing clouds as:

(a) Convenient and on-demand self-service: computer services such as applications, network or server services can be provided without requiring human interaction with each service provider. Cloud service providers offering on-demand self-service include Amazon Web Services (AWS), Microsoft, Google and IBM.

(b) Ubiquitous network access: cloud capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms such as mobile phones, laptops and PDAs.

(c) Rapid scaling up (and down): cloud services can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time.

(d) Measurable: cloud computing resource usage can be measured, controlled and reported, providing transparency for both the provider and the user of the utilised service.
Cloud computing services use a metering capability which enables control and optimisation of resource use. This implies that, just like air time or electricity, services are charged per usage metrics: the more you utilise, the higher the bill [51].
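The metering model above can be illustrated with a toy usage-based bill. The resource names and per-unit rates below are entirely hypothetical; real providers publish their own tariffs.

```python
# Hypothetical per-unit rates (currency units per unit of usage).
RATES = {"cpu_hours": 0.05, "gb_ram_hours": 0.01, "gb_transferred": 0.09}

def metered_bill(usage):
    """Charge each resource by measured consumption, as with air
    time or electricity: the more you use, the higher the bill."""
    return sum(RATES[resource] * amount for resource, amount in usage.items())

bill = metered_bill({"cpu_hours": 100, "gb_ram_hours": 200, "gb_transferred": 10})
# 100*0.05 + 200*0.01 + 10*0.09 = 7.9
```

Fine-grained metering of exactly this kind is what makes the pay-as-you-go pricing attribute (no. 5 above) possible.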

2.1.3 Cloud Computing service deployment Models

The cloud can be divided into three basic models. Cloud computing services were well classified as computing layers by [Hang, 2010]. The main ones include:

a) Software as a Service (SaaS): the highest layer in the cloud stack. Classical online software is provided as a service with minimal installation requirements on the user's computer, and almost all data are stored on the service provider's side [32]. Examples of this layer include NetSuite, SalesForce and many more. Moreover, with this mode of service provision the consumer uses an application but does not control the operating system, hardware or network infrastructure on which it is running [3].

b) Infrastructure as a Service (IaaS): this cloud computing layer offers basically raw compute and storage services; the option provides only infrastructure without any software, so modelling the requirements is not particularly difficult. Open-source cloud infrastructure currently exists too; examples include Nimbus and Eucalyptus, both open-source cloud computing platforms. In addition, in this delivery model the consumer uses "fundamental computing resources" such as processing power, storage, networking components or middleware, and can control the operating system, storage, deployed applications and possibly networking [3].

c) Platform as a Service (PaaS): this class of cloud layer, described in [4] as higher-level development environments which abstract the underlying technology and provide for scalability and rapid application development (such as Google App Engine), is further sub-categorised in [50], the Gartner report. In this service provision model, the consumer uses a hosting environment for their applications.
The consumer controls the applications that run in the environment (and possibly has some control over the hosting environment), but does not control the operating system, hardware or network infrastructure on which they run. The platform is typically an application framework [3]. However, other models have been identified in a business model of cloud computing [16]. These may include: iv) Storage as a Service (STaaS), v) Security as a Service (SECaaS), vi) Data as a Service (DaaS), vii) Test Environment as a Service (TEaaS), viii) Desktop as a Service (DaaS), ix) API as a Service (APIaaS).

2.1.4 Cloud Types

Other forms of clouds also exist:

a) Private Clouds: a proprietary network or data centre that supplies hosted services to a limited number of people. When a service provider uses public cloud resources to create their private cloud, the result is called a virtual private cloud [29].

b) Public Clouds: a public cloud sells services to anyone on the Internet. (Currently, Amazon Web Services is the largest public cloud provider, with EC2.)

c) Hybrid Clouds or federated clouds: a hybrid cloud blends the effectiveness of both private and public clouds; it is a composition of at least one private cloud and at least one public cloud. In this context, both clouds remain separate entities, but workload is able to migrate between them. In the hybrid cloud, data can be securely retained in the private cloud while computation tasks are carried out on the public cloud.

d) Community Clouds: community clouds are cloud systems shared by several organisations, supporting a specific community with shared concerns (e.g., mission, security requirements, policy and compliance considerations) [23]. They may be managed by the organisations or by a third party, and may exist on premise or off premise.

2.1.5 Data transfer within and between clouds

Most cloud providers present novel Application Programming Interfaces (APIs) for data retrieval. A number of these APIs exist and are in heavy use, such as S3, SimpleDB and the App Engine datastore. [23] explains that it is usually cheaper (or free) to transfer data within a cloud. However, to guide global transfer, standards and organisations are emerging to govern this cloud activity, such as the Open Virtualization Format (OVF), the Open Cloud Consortium (OCC), the Cloud Computing Interoperability Forum (CCIF) and the Unified Cloud Interface (UCI).

2.1.6 Why clouds? Cloud computing has recently received considerable attention as a promising approach for delivering ICT services by improving the utilisation of data centre resources. The added value brought by cloud computing can be analysed as follows: a) Device and location independence: users can access systems using a web browser regardless of their location or device (e.g., PC, mobile phone). As infrastructure is off-site (typically provided by a third party) and accessed via the Internet, users can connect from anywhere. b) Virtualisation technology: allows servers and storage devices to be shared, increasing utilisation. Applications can be easily migrated from one physical server to another. c) Multitenancy: enables sharing of resources and costs across a large pool of users. d) Reliability: improved if multiple redundant sites are used, which makes well-designed cloud computing suitable for business continuity and disaster recovery. e) Scalability and elasticity: via dynamic ("on-demand") provisioning of resources on a fine-grained, self-service basis in near real time, without users having to engineer for peak loads. f) Performance: is monitored, and consistent, loosely coupled architectures are constructed using web services as the system interface. g) Security: may improve due to centralisation of data and increased security-focused resources, but concerns can persist about loss of control over certain sensitive data and the lack of security for stored kernels. h) Maintenance: easier for cloud computing applications, because they do not need to be installed on each user's computer and can be accessed from different places. i) Agility: improves with users' ability to re-provision technological infrastructure resources.
j) Application programming interface (API) accessibility to software that enables machines to interact with cloud software in the same way the user interface facilitates

interaction between humans and computers. Cloud computing systems typically use REST-based APIs. While pervasive or ubiquitous computing has greatly revolutionised computing and transformed the Internet into the Internet of Things, cloud computing, with its assistive technologies and approaches including virtualisation, consolidation, automation, host load prediction, and optimisation, has enabled automated scaling up and down of on-demand computing resources while maintaining the pervasive computing paradigm, a property we call service elasticity. Elasticity of resources in cloud computing is fully enabled by the availability of large, fast processing environments, called data centre clouds. Cloud providers like Google already operate multimillion-dollar cloud infrastructures, set up to efficiently provide computing as a service to a massively escalating user base and to run their own large-scale data-intensive tasks, such as indexing web pages or analysing large data sets, often using variations of the MapReduce model [24]. Architecture of Clouds Cloud computing architecture consists of multiple layers, as is the case with most distributed systems. Foster et al [18] categorise cloud computing architecture, based on a comparison with Grid architecture, into four layers: fabric, unified resource, platform and application, as shown in Figure 3 below: Figure 3 : A four-layered cloud architecture [18] Firstly, the fabric layer is the lowest layer of the architecture and includes the raw hardware resources, like storage resources. Secondly, the unified resource layer contains

abstract resources, exposed through virtualisation as integrated resources to the upper layer and end users. Thirdly, the platform layer depends on the unified resource layer and includes an additional set of dedicated tools, middleware, and services in order to provide an environment for application development and deployment. Fourthly, the applications that run in the clouds are contained in the application layer [18]. Moreover, another study by [11] shows that cloud architecture mainly consists of user-level middleware, core middleware, and system level, as shown in Figure 4 below: Figure 4 : A generated Cloud Architecture design [18]. Beginning from the top of the architecture, the cloud application layer contains applications which can be accessed by end users directly. Alternatively, users' own applications can be deployed in this layer [41]. In addition, the user-level middleware layer contains software frameworks that help developers create an environment for applications to be developed, deployed and executed in clouds. Moreover, the platform-level services that set the run-time environment are performed in the core middleware layer to host and control user-level application services. Finally, the system-level layer is where massive physical resources, like servers, exist; these resources are managed by the virtualisation services set above this layer [11]. Enabling Technologies in the Cloud The major propelling forces behind cloud computing are the ubiquitous reduction of storage costs, broadband and wireless networking, and gradual improvements in Internet

computing software. Currently, cloud users have the capacity to request more capacity at peak demand, reduce costs, experiment with new services, and eliminate unneeded capacity, whereas service providers can increase system utilisation via virtualisation, multiplexing, and dynamic resource provisioning. All these activities in the cloud are enabled by improvements in hardware, software and networking technologies. Figure 5 lists cloud enabling technologies, excerpted from [38]. Figure 5 : Enabling Technologies in the cloud [38]. 2.2 Data centres [8] identifies data centres (DCs) as large dedicated clusters of computers that are owned and operated by a single organisation or company. A further discussion of data centres will allow proper understanding of what types of data centres exist, what they are used for, resource allocation, and the various technologies that are employed for the automated provision of cloud services, along with their impact on the physical data centre. This will also help cloud users understand what kind of cloud data centres they can opt for. Indeed, in order to analyse resource utilisation in data centres, understanding why, by whom, and how data centre cloud environments are maintained is paramount. Various forms of data centres exist. Jonathan Koomey, a data centre energy expert [28], identifies four types of data centres, namely: scientific computing data centres (national laboratories), co-location data centres (private clouds where servers are housed together), in-house data centres (facilities owned and operated by the company using the servers), and public

cloud provider data centres (Amazon, Google). It is also clear that these data centres are employed for a diverse set of purposes. For example, while large academic and scientific institutions and private enterprises are increasingly consolidating a huge percentage of their Information Technology (IT) within on-site data centres comprising a few servers, other large enterprises like Microsoft, Twitter and Google are rapidly building geographically diverse data centre environments currently containing millions of server nodes to offer a variety of cloud services. Figure 6 : Data centre network structure [20]. In data centres, where we envisage and observe fleets of parallel and heterogeneous server computers receiving workload from incoming jobs and their corresponding tasks submitted by users, various technologies, policies and approaches exist to ensure efficient and effective computing resource management. These include capacity planning, virtual machine consolidation, load balancing and task scheduling, all built atop virtualisation. These work collectively, or are combined, to ensure effective data centre resource utilisation and business continuity: because provisioning is elastic, what is needed is exactly what is availed, saving the data centre budget on energy and human resources.
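Of the resource-management techniques listed above, task scheduling with load balancing is the simplest to sketch. The toy example below greedily places each incoming task on the currently least-loaded server; server count and task loads are hypothetical, not drawn from the trace.

```python
# Greedy least-loaded task scheduling: a minimal load-balancing sketch.
import heapq

def schedule(tasks, n_servers):
    """Assign each (task_id, load) to the currently least-loaded server."""
    heap = [(0.0, s) for s in range(n_servers)]  # (accumulated load, server id)
    heapq.heapify(heap)
    placement = {}
    for task_id, load in tasks:
        current, server = heapq.heappop(heap)   # least-loaded server
        placement[task_id] = server
        heapq.heappush(heap, (current + load, server))
    return placement

tasks = [("t1", 0.5), ("t2", 0.3), ("t3", 0.4), ("t4", 0.2)]
print(schedule(tasks, 2))  # {'t1': 0, 't2': 1, 't3': 1, 't4': 0}
```

Real cluster schedulers weigh many more dimensions (memory, constraints, priorities), but the same greedy principle underlies many load-balancing policies.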

2.2.1 Why Data centres The need to establish and maintain an effective data centre arises when a business entity has a real need for continuity, manageability and scalability, since the IT infrastructure depends on the stability of the business, be it an organisation or a large-scale firm like Google or Microsoft [Grieser, 2008]. A data centre's basic role encompasses much more than what is generally observed: that is, consolidation of data processing and central management of IT infrastructure and information systems. Moreover, business cloud data centre users will always perceive the data centre as a technology for multiplying information capacity, the number of business applications used, and the storage and processing of data across far-flung divisions [33]. Data centre Virtualisation Virtualisation provides IT organisations with clear prospects of improving management and automation across the data centre. As IT organisations face growing business demands and budget pressures, automating resource-intensive human tasks with virtualisation offers an ability to handle more load with limited resources and to increase the productivity of existing staff [36]. A critical contribution of virtualisation is that it partitions computational resources and allows the sharing of hardware [9]. The impact of virtualisation has grown since the early computer age, with high prospects for the future. Figure 7 represents this enormous trend. Figure 7 : Data centre virtualisation and automation trend [36].

The concept of data centre virtualisation encompasses a range of virtualisation activities aimed at creating a virtualised computing environment, such as for use in cloud computing, within a data centre. It is defined by [46] as the abstraction of IT resources that masks the physical nature and boundaries of those resources from resource users. An IT resource can be a server, a client, storage, networks, applications or operating systems. Various software applications, both proprietary and open source, exist to aid virtualisation, including VMware and Xen. Cloud OS to enable Virtualised Data centres For data centres to serve as cloud providers, virtualisation has to be applied to them. That is, only computing resources, like CPUs, that allow the creation of virtual instances are viable for virtualisation. Virtualisation of memory, CPU and network nodes creates this elastic nature. Figure 8 shows an excerpt of application and operating system software for virtualised data centres [47]. Figure 8 : Cloud software for virtualised data centres [47].

2.2.1 Types of Data centre Virtualisation Virtualisation offers IT organisations a new architecture for management and automation, inherently embedding intelligence into the virtual machine (VM) and making automation a focal value proposition for enhancing cost-saving opportunities [36]. Various forms of virtualisation exist in the data centre environment, namely storage, network, and server virtualisation. a) Storage virtualisation This type of virtualisation is based on the amalgamation of multiple network storage devices into what appears to be a single storage unit. Storage virtualisation is often used in a SAN (storage area network), a high-speed sub-network of shared storage devices, and makes tasks such as archiving, back-up, and recovery easier and faster [39]. Storage virtualisation is usually implemented via software applications. b) Network virtualisation Network virtualisation is based on using network resources through a logical segmentation of a single physical network. It is achieved by installing software and services to manage the sharing of storage, computing cycles and applications. Network virtualisation treats all servers and services in the network as a single pool of resources that can be accessed without regard for their physical components [39]. c) Server virtualisation This type of virtualisation is the partitioning of a physical server into smaller virtual servers to help maximise server resources. [39] identifies that the resources of the server itself are hidden, or masked, from users, and software is used to divide the physical server into multiple virtual environments, called virtual or private servers. This is in contrast to dedicating one server to a single application or task. It allows companies to make one server act as five, ten or even twenty virtual servers. The need to do so has become more apparent as processors have become far more powerful in recent years.
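The server-virtualisation idea above — one physical server acting as many — can be sketched as simple capacity accounting. All server and VM sizes below are hypothetical.

```python
# Illustrative capacity accounting for server partitioning: carving one
# physical server into several virtual servers.
class PhysicalServer:
    def __init__(self, cores: int, ram_gb: int):
        self.free_cores, self.free_ram = cores, ram_gb
        self.vms = []

    def carve(self, name: str, cores: int, ram_gb: int) -> bool:
        """Create a virtual server if capacity remains; True on success."""
        if cores > self.free_cores or ram_gb > self.free_ram:
            return False
        self.free_cores -= cores
        self.free_ram -= ram_gb
        self.vms.append(name)
        return True

host = PhysicalServer(cores=16, ram_gb=64)
for vm in ["web", "db", "batch", "cache"]:
    host.carve(vm, cores=4, ram_gb=16)
print(host.vms, host.free_cores)  # ['web', 'db', 'batch', 'cache'] 0
```

A hypervisor does far more (isolation, scheduling, live migration), but this bookkeeping view is what capacity planners reason about when sizing hosts.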
Data centre virtualisation typically focuses on server virtualisation to enable, for example, Software as a Service, Platform as a Service, or Infrastructure as a Service solutions. For instance, through server consolidation, multiple (virtual) servers can run simultaneously on a single physical server. Also, live migration of the

virtual machine to not fully utilised physical servers would allow more physical servers to be turned off, which would lead to better energy efficiency for data centres [12], [18]. Furthermore, virtualisation in cloud computing can offer dynamic configurations for different applications' resource requirements, and aggregate these resources for different needs. Additionally, virtualisation can improve responsiveness by monitoring, maintaining and provisioning resources automatically. Therefore, all these features offered by virtualisation are used in clouds in order to meet the business requirements of SLAs [18]. Data centre failure [13] suggests that a data centre often has a high failure rate as it possesses a large number of servers and many more failure-prone components. Figure 9 below indicates failures with a 28% impact on the data centre. Figure 9 : Visualisation of failure panorama: Sep 09 to Sep 10 [20]. Moreover, [20] indicate that understanding network failures of data centres is vital towards improved reliability. Referring to their snapshot of massive losses caused by data centre downtimes (outages), they indicated that high costs were associated with downtime at 41 data centres across varying industry segments with a minimum size of 2,500 square feet. This is a

snapshot of data collected in the United States. They indicated that data centre outages cost above five thousand dollars (about $5,600) per minute. The failures observed in data centre networks are often caused by failure in either connection devices (traffic-forwarding devices like routers and switches) or links (observed when the connection between two interfaces is down). Yet to achieve a healthy and energy-efficient data centre, a variety of cloud technologies, methodologies, tools, policies and a concrete understanding of the development life cycle of most services offered to cloud users is crucial. The remaining sections explore comprehensively the technologies that enable workload handling on arrival into the data centre, the process of workload scheduling or planning, the data centre schedulers including resource allocation or provisioning, and consequently the general management of data centre resources and user application requests. And because the user perspective is part of the data centre software management cycle and workload behaviour (server machine workload heavily depends on user application submissions), we also discuss user behaviour in the cloud and its implications for total data centre efficiency. Power Consumption Management in Data Centres The energy supplied to data centres is consumed by computational operation, cooling systems, networks, and other overheads. In terms of computational operations, there are some energy-saving techniques that can be deployed to monitor and control energy consumption. These eco-efficient energy techniques have become one of the hot topics in the IT business because of the benefits gained, not only from an economic perspective, but also from an environmental perspective. Cloud service providers can save huge costs by efficiently utilising their data centres to maximum capacity.
Also, governments pressure companies to conduct their businesses with less impact on the environment, in terms of CO2 emissions. Hence, cloud service providers can also improve their SLAs by being friendly to the environment to attract more customers. Furthermore, in terms of power efficiency, it can be said that data centre A is better than data centre B if A can consume less power and process the same workload as B, or A can consume the same power but with more workload compared with B [14]. Thus, much research has been conducted to suggest techniques for reducing power consumption without degrading performance in data centres, some of which will be discussed next.
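The comparison above implies a simple metric: workload processed per unit of power. The sketch below makes that explicit; all workload and power figures are illustrative, not taken from any trace.

```python
# Power-efficiency comparison of two data centres: same workload, less power
# means higher work-per-watt. Figures are hypothetical.
def efficiency(workload_units: float, power_watts: float) -> float:
    """Work processed per watt consumed."""
    return workload_units / power_watts

dc_a = efficiency(1000, 400)  # 2.5 units per watt
dc_b = efficiency(1000, 500)  # 2.0 units per watt
print(dc_a > dc_b)  # True: A processes the same workload with less power
```

The same metric ranks the second case in the text (equal power, more workload), since increasing the numerator at fixed denominator also raises work-per-watt.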

2.3 User behaviour on the cloud Understanding online users' system requirements has become crucial for online service providers. The existence of many users and services leads to different user needs. For instance, user behaviour will always differ according to the type of user, the time of day or month and, lastly, the purpose. An accountant may use the cloud for billing and accounting purposes during working hours, for services like infrastructure (online banking reconciliations) and partly databases (recovery of lost or forgotten bank customer card PINs). While [32] argues that the cloud service provider has to manage and monitor the service levels provided at any time for each particular service, dynamic reallocation and provisioning tools and services are highly required. The energy waste, covered in detail in section (2.5), gives insights into user behaviour on the cloud. Most users here are identified by the devices they connect to these clouds for specific services. Most of these cloud users overestimate requirements to hedge their bets and ensure acceptable service [31]. This arises from the necessity to compute successfully: reallocation of resources occurs as an extra cost to users, and as an energy waste on the service provider's side. Cloud users are a real-time computing generation. Applications that require this mode of implementation impose heavy time constraints on the general system, be it a data centre or cloud. At worst, such systems may crash (fail), or destabilise the optimal performance of the system in use; at the least, they will impact the returns of the provider and the client, in scenarios where the client is a business, such as train station departures and arrivals. A remedy, as proposed in [31] and [1], is a cloud computing model for flexibility, elasticity and scalability.
The cloud will always scale up or down, and will further re-provision to meet the requirements of a user application or task. However, a notable consequence is the energy wasted while maintaining Quality of Service (QoS). Furthermore, indiscriminate co-allocation and re-allocation of different types of workloads is estimated to cause performance interference. For most real-time cloud computing systems, we would expect dynamic and on-demand scaling of resource provisioning (CPU time, RAM, bandwidth and disk space). However, as introduced in [31], cloud service providers still face the challenge of dynamically estimating and provisioning the required resources so that the energy normally wasted is reduced.
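The overestimation described above can be quantified directly: the gap between requested and actually consumed resources is stranded capacity that still draws power. A toy calculation, with entirely hypothetical per-task figures:

```python
# Toy quantification of user overestimation: requested vs. consumed CPU.
tasks = [
    {"requested_cpu": 2.0, "used_cpu": 0.6},
    {"requested_cpu": 4.0, "used_cpu": 1.0},
    {"requested_cpu": 1.0, "used_cpu": 0.9},
]
requested = sum(t["requested_cpu"] for t in tasks)
used = sum(t["used_cpu"] for t in tasks)
waste_fraction = (requested - used) / requested
print(f"{waste_fraction:.0%} of reserved CPU is never used")  # 64% ...
```

In a real analysis the per-task figures would come from trace measurements rather than being assumed; the point is simply that overestimated reservations translate into a measurable waste fraction.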

2.4 Workload characterisation and task Classifications Expectations are high that the energy wasted through user overestimation and performance interference in the data centre and desktop environments may be reduced [31]. A variety of approaches, including [4], [22], [1], [38] and [17], exist that tackle this persistent and escalating cloud problem. Most of the approaches identified above focus on efficient task resource consumption in data centre environments. Achieving this state in data centres is a priority. [22], [1], [Wang et al., 2011] and [17] present simple and accurate approaches to understanding user behaviour through cloud back-end characterisation. Across these approaches, two elements are common: capacity planning, to determine which machine resources must grow and by how much, and task scheduling, to achieve high machine utilisation and to meet service-level objectives. In [1] and [17], capacity planning and task scheduling are achieved by efficient auto-scaling of cloud resources: workloads are analysed and assigned to the most suitable elasticity controllers based on their characteristics and a set of business-level objectives. [38] and [22] describe methodologies that can be applied to available cloud trace logs to extract key workload characteristics. This information provides intuition for constructing realistic cloud workloads (herein referred to as classification) and for understanding the different application configurations and settings when optimising data centres for compute density and power consumption. Approaches to workload classification need to maintain simplicity and accuracy [22]. A survey of these approaches in [22] indicates that one approach to workload classification is to treat each task as its own workload. However, this does not scale well, since thousands of tasks execute daily on Google compute clusters.
Another approach is to consolidate all tasks into a single workload, referred to as coarse-grain classification [38]. Applied to a diversity of tasks, this coarse-grain approach yields large variances in predicted resource consumption. The last approach, workload characterisation via centroids, a research contribution from [22], involves identifying the workload dimensions; constructing task classes using an off-the-shelf algorithm such as k-means; determining the break points for qualitative coordinates within the workload dimensions; and finally merging adjacent task classes to reduce the number of workloads.
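The centroid step above can be sketched with a minimal k-means over two task dimensions (CPU, memory). This is not the authors' implementation — the task values, k, and initialisation are illustrative only.

```python
# Minimal k-means sketch for centroid-based workload classification
# over (cpu, memory) task dimensions. Values and k are illustrative.
import math
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # pick k distinct starting points
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                       # assign each task to nearest centroid
            i = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[i].append(p)
        centroids = [                          # move each centroid to its cluster mean
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# (cpu, memory) per task: two small tasks, two large
tasks = [(0.1, 0.2), (0.15, 0.25), (0.8, 0.9), (0.85, 0.95)]
centroids, clusters = kmeans(tasks, k=2)
print(sorted(len(c) for c in clusters))  # [2, 2]
```

In [22] the resulting classes are then merged along break points in each dimension; that post-processing is omitted here.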

2.5 VM Allocation A study by [14] identifies that Virtual Machine (VM) consolidation can be used as a means of reducing the power consumption of cloud data centres. This technique tries to allocate as many VMs on as few physical machines as possible, to allow maximum utilisation of the running physical machines. For instance, when there are two VMs, instead of allocating each one to a physical server that has not been fully utilised, this technique tries to allocate both VMs on one physical server and switch the other server off to save energy. Therefore, using this technique in a data centre can reduce operational costs and increase the efficiency of energy usage. However, it is important to note that the number of VMs on one physical machine should not be so high that it degrades the performance of the VMs [14]. 2.6 Energy efficiency In principle, cloud computing can be an inherently energy-efficient technology for Information Communication Technology (ICT), provided that its potential for significant energy savings, which has so far focused on hardware aspects, can be fully explored with respect to system operation and networking aspects [10]. [47], in his book, identifies efficiency as a measure of the utilisation rate of resources in an execution model, exploiting massive parallelism in HPC. A simple understanding of energy efficiency comes from the Lawrence Berkeley National Laboratory definition: [48] describes energy efficiency as "using less energy to provide the same service". A related notion is energy conservation, which means reducing or going without a service in order to save energy. [49] argues that these two approaches, if considered, can reduce greenhouse carbon emissions. In the data centre cloud setup, energy efficiency is therefore best described through characterisation of the energy waste problem [31].
The energy waste problem, approached at a model level, identifies a couple of issues that lead to inefficiency in energy use or consumption in clouds, especially the data centre cloud. Understanding the energy waste problem is feasibly performed using a simplified model of the states through which a job or task progresses. Figure 1, in section (1.3) of this report, draws this understanding to a finer-grain level.

Figure 10 : Diagram adopted from ADS lecture notes [25]. Task and job events represent state transitions. Events in this illustration can be classified into two types: events that affect the scheduling state (e.g. a job is submitted, resource requests are updated, or the job gets scheduled and becomes running), and events that reflect state changes of a task (e.g. a task exits). From the analysis of the illustration above, it is visible that energy waste is exhibited mainly in two stages: resource provisioning and instance deployment of user tasks. Most waste in any data centre arises from inefficiencies in server computers, the cooling and power systems, diverse usage patterns of centre resources and, less noticeably, inefficiencies in general computing. According to [6] and [31], resource over-allocation is seen as a response to customer (cloud user) overestimation of resources as users try to achieve optimal task completion using the service. The latter proposes a model for resource over-allocation that aims to increase the provider's returns while impacting user satisfaction as little as possible. The impact at the data centre level is the ability to have applications fully utilising allocated resources, thus freeing a balance that can be allocated elsewhere. With this approach, energy waste is reduced and data centre capacity is further increased. Further observations are that energy waste has an impact not only at large scale but also at specific levels of cloud computing, like mobile computing, where its role is increasing tremendously. [29] provides an analysis of the critical factors affecting the energy consumption of mobile clients in cloud computing. These clients rely entirely on limited phone batteries, making the technology characteristically energy-constrained. At the data centre level (cloud

cluster), the use of such technology will impose a tremendous impact on the total energy consumed and on bandwidth. [29] claims that server and data centre under-utilisation is a primary source of waste and inefficiency in compute clusters, arguing that companies that manage their own servers and data centres often use only 10-20% of their available computing cycles. The balance, and the energy required to generate it, goes to waste. [29] identifies that even when servers move to idle mode they require power and cooling. Although computers consume far less power in idle mode than they did five years ago, the watts still add up. However, strategies for energy-efficiency improvement in data centres [29], [10] propose a variety of mechanisms that include: (a) architectural principles for energy-efficient management of clouds; (b) energy-efficient resource allocation policies and scheduling algorithms considering quality-of-service expectations and devices' power usage characteristics; and (c) novel software technology for energy-efficient management of clouds. As a remedy, research into optimising data centre energy consumption is under way. Examples of such energy-aware improvements in both software and hardware include: (a) FAWN, a Fast Array of Wimpy Nodes, a fast, scalable, and energy-efficient cluster architecture for data-intensive computing, recently developed at Carnegie Mellon University; (b) Distributed Hash Tables (DHTs) built at CMU with XScale chips and flash storage to reduce energy waste in task processing and heavy data storage; (c) Low-Power Amdahl Blades for data-intensive computing; (d) coupling low-power CPUs to flash-based solid-state drives (SSDs) for DISC workloads; and (e) Microsoft's Cloud Computing Futures (CCF) team exploring an "Atomic Cloud": clusters built from nodes using low-power Atom chips.

2.7 Google and its Data sets (Traces) [21] avails two sets of Google traces: TraceVersion1 (the first dataset provides traces over a 7-hour period) and TraceVersion2 (the second version contains data traces collected over a period of 29 days). The workload consists of a set of tasks, where each task runs on a single machine. Tasks consume memory and one or more cores (in fractional units). Each task belongs to a single job; a job may have multiple tasks (e.g., mappers and reducers). The trace data available in [21] has been anonymised in several ways: there are no task or job names, just numeric identifiers; timestamps are relative to the start of data collection; and the consumption of CPU and memory is obscured using a linear transformation. TraceVersion2 is a more complete trace that has attracted remarkable interest from research and academia due to its completeness; indeed, it is detailed enough to enable effective extraction of results. The Version 1 trace has been ignored by most researchers, though [38] and [22] apply coarse-grain and clustering methodologies to it in characterising workload. And since workload characterisation guides understanding of cluster behaviour and the allocation of resources to workload, more investigation is needed to learn about user behaviour in clouds and the allocation of resources to data centre cloud workload. An excerpt of the data set utilised in this project [21] is availed in appendix 5 of this report.
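The linear transformation used to obscure CPU and memory values is undisclosed, but its effect can be illustrated: absolute units are lost, while orderings and ratios of differences survive, which is why relative utilisation analyses on the trace remain valid. The coefficients below are made up for illustration.

```python
# Illustrating the effect of anonymising trace values with a linear
# transform y = a*x + b (coefficients here are invented, not Google's).
def anonymise(value, a=3.0, b=5.0):
    return a * value + b

raw = [1.0, 2.0, 4.0]                      # hypothetical raw CPU readings
obscured = [anonymise(v) for v in raw]     # [8.0, 11.0, 17.0]
# Ordering is preserved...
print(obscured == sorted(obscured))        # True
# ...and so are ratios of differences: (4-2)/(2-1) = 2 before and after.
print((obscured[2] - obscured[1]) / (obscured[1] - obscured[0]))  # 2.0
```

This is why comparisons such as "task A uses twice the increment of task B" can be drawn from the trace even though the true units are hidden.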

Chapter Three: Methodology & Design 3.1 Project Management This section encompasses all steps in the process of achieving complete and correct execution of the project. To achieve this, the project has set flow mechanisms, including control and enabling tools like project plans, anticipated project milestones, progress review meetings, and a clearly stated step-by-step approach to achieving the project goals and objectives (the methodology). Revised Project plan For the revised project plan illustration, refer to appendix 3 of this document. Key activities The analysis of server utilisation and resource consumption of the Google Cloud traces. Phase 1: Project Exploration. This stage covers the initial meeting with the supervisor to discuss the project idea and set the aim and minimum requirements for the project. A feasible project structure was planned and agreed upon with the supervisor. Other vital activities covered here include planning data sources and acquisition, time framing and scoping of the project, granulated observations of the data set, and planning for trace data handling (a Ruby gem platform for query manipulation and a simple MySQL Workbench infrastructure). Phase 2: Task Analysis & Execution. Execution encompasses the main activities of the project's development: the literature review and background research about cloud computing and eco-efficiency issues in data centres; analysing and gaining a deep understanding of the issues and how to provide an evaluated solution; and, based on the literature, the choice of methodology for exploratory analysis: coarse-grain analysis (including server utilisation, server resource consumption and cluster distribution), clustering, fine-grain analysis, failure analysis, server energy modelling and energy quantification.
Phase 3: Clustering. In this phase of the project, intra-cluster aggregations are performed, initially on the basis of workload type (JobType), since [21] confirms that the class of a job is the only way in which workload is categorised. Because limited knowledge of the traces is availed, we draw assumptions that

enable another form of clustering to be performed: investigative clustering. In this form of clustering, the initial clustering constraints are maintained (for example, the number of cluster centres or centroids, k = 4). The researcher then investigates the aggregation exhibited by the traces by applying a generalised k-means to these sliced traces along similar dimensions of interest. The dimensions employed here are the same as those in the traces.

Phase 4: Evaluation (Project checks). This stage covers the progression status of the project, including weekly progress reports and a work in progress presentation. Mid project reports are also utilised as progress presentations. This helps to check progress and acts as a check point on whether the project scheduling is realistic; otherwise, plans are reviewed and objectives and evaluation criteria adjusted to suit the available project time. Above all, this phase checks whether the goals have been achieved, through evaluation of the implementation results of project phases 1, 2 and 3. Results from the findings are compared with those from a 29 day trace.

Finally, an uncategorised phase is Project closure: this phase is made up of completing the write-up of the dissertation, finalising it, and submitting it.

Progress Meetings

In order to track the progression status of the project, weekly progress meetings with the supervisors were conducted almost from the beginning of the project until the end. During these meetings, it was very useful to discuss the progress of the project, reveal any issues encountered, and set plans based on the discussion output. After the submission of the interim report, weekly progress reports were documented and brought to the meetings for tracking and discussing the tasks accomplished for that week, and for setting the tasks to be completed the following week.
In addition, a work in progress presentation was conducted with both the supervisor and assessor to show and discuss the overall progress of the project and to obtain feedback to enhance the project accordingly (see the presentation slides in Appendix 7).

3.2 Project Methodology

The project methodology can be envisaged as the step by step approach to achieving the project goals. It constitutes the details of how the whole project is managed from initiation to end (closure due to failure or success). Hence, this section addresses the whole process and the control of the implementation behind the results used here.

First and foremost, reviewing related literature on the analysis of server utilisation and resource consumption in a data centre cloud environment is important. A concrete understanding of the cloud environment and its technologies, such as data centres, virtualisation and workload scheduling, is also worth describing as we tackle this endeavour. So we study the cloud and the cloud environment in detail, and describe data centres and their enabling technologies for efficient energy use. Because of energy conservation (including energy efficiency), we briefly study energy efficiency, purely to understand how resource waste in data centres contributes to the total energy waste from data centres, and what technologies and approaches are in place to reduce this escalating percentage, as in [29]. [32], [22], [1], [9] and [29] have so far given insights and paved the way to achieving the earlier stated objectives. A summary of what each objective achieves is given below.

i. Conduct background research on related literature and the current problem, with a view to identifying existing solutions to data centre cloud workload analysis and the methods and tools currently in use.

ii. Calculate the utilisation of the entire data centre over a long period of time, for example the 7 hours provided in [26], and infer associations between time of day and server utilisation, and between workload generated by customers or clients and server utilisation.

iii. Analyse server utilisation per workload type to explore the correlation of workload with server utilisation.
As a result, the distribution information gathered here can be used to infer local task configurations and resource requirement configurations.

iv. Analyse the high level failure characteristics and resource utilisation of sampled time slices exhibiting crash-loops, stable system status, and other behaviours.

v. Evaluate the impact of system behaviours on energy efficiency within the entire data centre cloud.

Based on recent approaches to cloud back end workload characterisation and classification methodologies, the project methodology proceeds as follows. A thorough coarse grain analysis is conducted, including resource consumption and server utilisation. We proceed with a fine grain analysis to try to quantify the resultant resource waste that may arise in large scale data centre traces due to user behaviour, which is characterised by early task termination. With a comprehensive coarse grain and fine grain analysis in place, an investigative clustering is performed to evaluate the stability, consistency and reliability of the scheduler's aggregation of data centre workload against simple k-means clustering. Evaluation of the analysis results is performed through a comparison of the percentage failure levels observed in the 7 hour data trace with the results from a 29 day trace, as detailed in the Evaluation of Project section below. The design approach follows the steps detailed in the figure below.

Figure 11: Project methodology

3.3 Evaluation of Project

We use the percentage of failures to evaluate energy waste. Using a simple assumption such as 39% for the three types of events (failure, eviction, kill) observed in the 7 hour Google traces, we compare it to the 52% observed in a larger scale trace, the 29 day Google trace. Using results as in [Garraghan et al, 2013], coarse grain statistics such as the average, standard deviation and median of resource consumption (CPU cores and memory) consumed by tasks, both complete and full tasks, we contrast these results with those from the 7 hour trace.

29 day trace statistics (n1 = 12532 machines): average (avg1), standard deviation (std1) and disparity, each for CPU, memory and others.
7 hour trace statistics (n2 = machines): average (avg2), standard deviation (std2) and disparity, each for CPU, memory and others.

Table 1: Summary of statistics that we contrast with the 29 day trace

We use these descriptions of the two populations for the t-statistic, on which we base the calculation of the p-value used to reject the hypotheses arising from the questions raised on the data set. Calculation of the two-sample t-test requires three components, the averages, the standard deviations and the sample populations, using the formula

t = (avg1 - avg2) / sqrt(std1^2/n1 + std2^2/n2)
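The two-sample test above can be sketched in code. The summary statistics passed in below are placeholder values, not the dissertation's measured figures, and the normal approximation to the t distribution is an added assumption that is only reasonable for the large machine counts involved here.

```python
import math
from statistics import NormalDist

def welch_t_test(avg1, std1, n1, avg2, std2, n2):
    """Two-sample t-statistic from summary statistics (Welch's form)."""
    # Standard error of the difference of the two sample means.
    se = math.sqrt(std1 ** 2 / n1 + std2 ** 2 / n2)
    t = (avg1 - avg2) / se
    # With thousands of machines per sample, the t distribution is close
    # to normal, so a normal approximation gives the two-sided p-value.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Placeholder averages/deviations; only n1 = 12532 comes from Table 1.
t, p = welch_t_test(avg1=0.52, std1=0.11, n1=12532,
                    avg2=0.39, std2=0.14, n2=9000)
```

A small p-value would justify rejecting the hypothesis that the 7 hour and 29 day populations share the same mean resource consumption.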

Chapter Four: Implementation

4.1.1 Data set Description

The data set used in this project is the first version of the Google data centre traces, covering a period of over 7 hours, which has been publicly availed for research [28]. Ethical issues regarding the outcomes of this project arising from this data availability have already been resolved by Google's department for research. The data have been anonymized in several ways: there are no task or job names, just numeric identifiers; timestamps are relative to the start of data collection; and the consumption of CPU and memory is obscured using a linear transformation.

The workload consists of a set of tasks, where each task runs on a single virtual machine or compute machine. Tasks consume both memory and one or more Central Processing Unit cores (CPU cores). Each task belongs to a job, and a single job may have one or more tasks (e.g. mappers and reducers), implying that a single job can utilise one or more compute machines. Mappers and reducers are functions of the MapReduce framework, a data centre task scheduling infrastructure used in the Google data centre to provide monitoring and management of the distributed server computers running in parallel, managing all communications and data transfers between the various parts of the system.

4.1.2 Data set overview

An overview of the statistics elicited from the coarse grain analysis of the trace log is presented in the table below. The trace is made up of 3,535,029 task records, of which only a portion of the records represent unique tasks. The trace details supplied in [21] indicate that each task runs on a singleton machine; this implies that the trace is composed of a corresponding number of machines. These various tasks are grouped into jobs. Each task implements a task profile, and since tasks are categorised into specific jobtypes or classes, homogeneous tasks are expected to have relatively homogeneous resource usage.

Trace statistics:
1. Total trace length: 7 hours
2. Disparity within memory & CPU respectively:
3. St. deviation (memory & CPU respectively): 1.258,
4. Total trace records: 3,535,029

5. Average normalised task memory usage:
6. Average task CPU usage:
7. Total number of virtual machines:

Table 2: Some of the trace statistics

4.1.3 Data set structural Model

A data model for the job-task relationship exhibited in the data centre may appear like the one below.

Figure 12: Google data centre trace - Component model

Since users submit requests for applications to run, the MapReduce system organises all requests as jobs and, at the lowest level of detail, tasks. Each trace row represents a single task execution during a five minute period, recorded in seconds as timestamps, with realisable overlap at the beginning or end of the trace. The data are structured with the following columns:

i. Time (int) - time in seconds since the start of data collection.
ii. JobID (int) - unique identifier of the job to which this task belongs.
iii. TaskID (int) - unique identifier of the executing task.
iv. JobType (0, 1, 2, 3) - class of job (a categorisation of work).
v. Normalized Task Cores (float) - normalised value of the average number of cores used by the task.
vi. Normalized Task Memory (float) - normalised value of the average memory consumed by the task.
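A row with these columns can be read into a small record type as follows. The whitespace-separated layout and the sample values are illustrative assumptions, since the exact on-disk format is not reproduced in this report.

```python
from collections import namedtuple

# Field names follow the column list above.
TraceRow = namedtuple(
    "TraceRow",
    ["time", "job_id", "task_id", "job_type", "norm_cores", "norm_memory"],
)

def parse_row(line):
    """Parse one whitespace-separated trace row (layout assumed)."""
    time, job_id, task_id, job_type, cores, mem = line.split()
    return TraceRow(int(time), int(job_id), int(task_id), int(job_type),
                    float(cores), float(mem))

# The values here are invented for illustration only.
row = parse_row("300 6218406404 0 2 0.021612 0.026119")
```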

Figure 13: Server utilisation

4.1.4 Data set assumptions

A realistic analysis of this trace involves: (1) a coarse grain analysis, in which resource usage is characterised where a single workload is assigned to each task, using the already known distribution measures such as the utilisation mean, maximum, minimum and standard deviation for each kind of task. Since tasks are described by an indicative column giving the type of job each belongs to, utilisation comparisons have been performed on the trace to understand similarities and differences in resource consumption for the various categories of tasks. (2) An investigative clustering, using an already known algorithm as an alternative way to aggregate the same workload, to discover extra characteristics that can be used to decide whether a particular category in the workload has a greater effect on energy consumption.

Assumptions

a. In the trace log it is visible that various jobs are identified by a latency-sensitivity class (scheduling class) label, or scheduler aggregate class, that is based on some factor such as a consumption estimate: 0 represents the least latency-sensitive tasks and 3 a highly latency-sensitive task. Using this class dependency, certain jobs are scheduled for execution before others. Furthermore, if consideration is also given to the aggregate class scheduling factor, it is possible that, because of similarity in resource estimation, the scheduler allocates a jobtype value of 0, 1, 2 or 3 as an identification of these job classes.

b. Resource usage is one detail that can be used to imply a job's state. Jobs whose tasks are yet to leave the pending state of the cluster scheduler, or are awaiting an update with more resources, are represented with 0 values for the resource consumption of either or both of compute machine memory and CPU cores; while a task that is not pending should be represented with a resource usage that is not zero.

Our assumption, drawn from the behaviour of the tasks exhibited in the trace, is that as a task progresses from its start state to its end state within the trace log, it will execute for some time with memory and CPU usage that is either greater than zero or zero. For many tasks the resource usage may change during execution to almost 0, while some may progress with the same resource usage until their end state. The tasks whose resource usage changes are then observed for some time in the trace to

wait for some time, or simply to disappear in the next trace window, usually 5 minutes later. If in the next trace monitoring these same tasks are observed with new resource usage values, we classify the past traces as wasted traces. This continues until the tasks with new resource usage values maintain them until the task's end state. The part of the task duration from when new values for the resource usage are observed to when the task is last monitored is referred to as productive time.

4.2 Task Analysis

Determining task structure and Composition

Insights from the Google data centre scheduling literature [37] identify two kinds of jobs that exist in the Google data centre: service jobs and batch (production) jobs. Production or batch jobs consume resources for far less time than service jobs because they have a lower priority; this implies that once service jobs have been allocated computing resources, the balance is channelled to batch jobs. These types of job represent a huge percentage of the jobs in the traces, and are characterised by shorter execution times and lower resource usage.

Jobs are perceived to contain single or multiple tasks, which are the smallest units executed on the Google server clusters. Tasks observed within the trace are monitored every 5 minutes and mapped onto a single machine per task. Each job belongs to one class of jobs: 0, 1, 2 or 3. We suspect and observe two things: (1) task resource consumption is consistent across submissions and consecutive monitoring windows; (2) because of such consistency in resource consumption, similar tasks within the same job class indicate similar resource usage and thus similar computing environments.

Per job type, the maximum and minimum CPU core usage and the maximum and minimum memory usage are summarised below.

Table 3: Cluster statistics

4.2.2 Coarse grain Analysis

Distributions for workload resource consumption & Server Utilisation

For each of the task classes, we develop distributions of resource consumption. As identified earlier in step one, the project focussed on understanding the consumption of the data centre cloud resources, including memory and CPU cores. The trace data gives little information on the actual memory and CPU cores used, though it indicates the normalised value of the average number of cores and memory consumed by each task. Server utilisation distributions are designed based on the fact that every task within the clusters runs on a singleton machine, a virtual machine. Therefore, from the task analysis, we develop distributions for normalised task memory and normalised CPU core consumption as a calculation of resource consumption for the entire workload cluster, including how memory usage varies with CPU core usage for the various task classes (0, 1, 2, 3). In addition to these distributions, we develop specific distributions for individual resource usage of the workload, including normalised task memory usage and normalised task CPU core usage.

Table 4: Summary of the cluster properties of the trace (a & b): for (a) CPU and (b) memory, the average, standard deviation and variance per architecture (ARCH0, ARCH1, ARCH2, ARCH3).
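The per-class distribution measures described above (mean, standard deviation, minimum, maximum) amount to a group-by over job type, which can be sketched as follows; the sample records are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def per_jobtype_stats(records):
    """Coarse grain summary per job type.

    `records` is an iterable of (job_type, norm_cores, norm_memory) tuples.
    """
    groups = defaultdict(list)
    for job_type, cores, memory in records:
        groups[job_type].append((cores, memory))
    stats = {}
    for job_type, usage in groups.items():
        cores = [c for c, _ in usage]
        mem = [m for _, m in usage]
        stats[job_type] = {
            "cpu": {"avg": mean(cores), "std": pstdev(cores),
                    "min": min(cores), "max": max(cores)},
            "memory": {"avg": mean(mem), "std": pstdev(mem),
                       "min": min(mem), "max": max(mem)},
        }
    return stats

# Invented sample rows: (job_type, norm_cores, norm_memory).
sample = [(0, 0.02, 0.10), (0, 0.04, 0.12), (1, 0.30, 0.05), (1, 0.28, 0.07)]
summary = per_jobtype_stats(sample)
```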

Figure 14: Task submission rates over time

Cluster Distributions for combined resource consumption (Task Memory vs Task CPU-Core Utilisation)

(a) Resource utilisation for jobs with jobtype value 2; (b) resource utilisation for jobs with jobtype value 3; (c) resource utilisation for jobs with jobtype value 0; (d) resource utilisation for jobs with jobtype value 1.

Figure 15: Resource utilisation per jobtype

Resource consumption across the various aggregates in the Google trace, shown in figure 15, is uniform and steady for the four workload aggregates. The aggregates with jobtypes 1, 2 and 3 tend to show homogeneous resource usage that stabilises within particular ranges of normalised task memory and normalised CPU cores. However, tasks of jobtype 0 required high memory across their range of CPU consumption. This is to be expected for resource allocation, since it would be feasible for any scheduler to allocate resources in that manner given that CPU is a scarcer resource than memory. Individual resource consumption models are attached in the appendix for further comprehension.

Still, to draw convincing insights into workload resource estimation and allocation, we observe resource usage discrepancies across all workload aggregates 0, 1, 2 and 3 for most of the tasks.

4.3 Clustering

Google Cluster Aggregation

There is limited information known about the 7 hour Google trace, and for that reason it is hard to tell how the MapReduce framework aggregates workload prior to resource allocation. The trace papers identify aggregation performed by the framework on the basis of a number of factors, not limited to latency-sensitivity, priority, mapping or monitoring of jobs, resource demand, or any other constraints that may arise for consideration. The resources demanded here may include bandwidth, hard disk, random access memory and the processor, herein referred to as CPU cores. On review of the trace, an aggregation based on jobtype is performed. Usage of this metric in clustering would imply a number of suggestions, including that a jobtype corresponds to a unique virtual machine specification, and thus a unique architecture type, for the tasks whose jobs are categorised under that jobtype. For example, jobtypes 0, 1, 2 and 3 would correspond to architecture types 0, 1, 2 and 3.

Per job type, the maximum and minimum CPU core usage and the maximum and minimum memory usage are summarised below.

Table 5: Simple statistic composition of the cloud cluster

Investigative clustering

Key questions answered here include: why aggregate the data set, and which aggregation techniques are preferred here?

Canopy clustering, hierarchical, agglomerative and partitioning clustering algorithms have long existed and been used in various data centre dynamic scheduling environments. For example, Google's MapReduce uses clustering algorithms to achieve workload classification and the allocation of workload to relatively equal amounts of user-estimated resource demand. We have recently learnt that a combination of algorithms is implemented in the MapReduce infrastructure.

Identification of Work Load Dimensions

Key questions of interest answered here include: why the need to specify workload dimensions, and why the preferred dimensions?

The initial activity of our investigative clustering approach is to identify trace dimensions. The trace includes details of various dimensions, not limited to timestamp, job ID (or parent ID), type of job, task ID, and resource usage details such as normalised CPU and memory usage. Analysis of these dimensions avails only three interesting details; however, because this is an investigative clustering, we wish to determine whether type of job is a criterion for the clustering performed by MapReduce, and whether this form of clustering impacts the current behaviour of workload observed in the traces and energy efficiency. Unique tasks are presented with their respective resource constraints, limited to memory and CPU usage. Since this is the only detail available, we construct task classes that include at least some extra detail, such as the type of job and the duration of the task. Accordingly, task duration, type of job, normalised CPU core usage and normalised memory usage are considered. While we consider task duration as a dimension, it is generated from the timestamp, a dimension availed in the trace log.

Application of Simple K-means clustering (How exactly is the identification of workload and the aggregation accomplished?)

A clustering technique is applied to all the tasks based on their resource usage to construct task classes that have fairly similar resource usage. The purpose of this step is to determine whether the MapReduce framework's cluster characteristics are consistent with ordinary k-means, and hence to obtain the intra-cluster CPU and memory consumption and server utilisation per jobtype / architecture. To achieve this, data pre-processing for distinct task records and

average consumption per task is conducted, followed by the running of an off-the-shelf clustering algorithm, k-means, with the consistent number of task classes as the k value. Data pre-processing includes recalculating the task duration, since it is one of the dimensions to be used. With the pre-processed data set, k-means clustering is performed to discover intra-cluster resource consumption similarities and differences. Clustering is done with k = 4, for consistency with the workload aggregation used by the MapReduce system, using fields such as jobtype, task duration, NormTaskCores and NormTaskMem, as computed from the trace.

Using the cluster compositions from the investigative clustering, we study and draw insights on the accuracy and effectiveness of the aggregation techniques implemented in the Google cloud centre, by comparison with our k-means result set. Maintaining the number of clusters at k = 4, simple k-means produces 4 clusters. We then investigate the cluster distributions closely.

Results of the Simple K-means clustering

For each cluster attribute (task duration, NormTaskCores, NormTaskMem, JobType), the full-data values and the number of instances per cluster are summarised below.

Table 6: Results from the clustering performed

Discussion: We observe that a large percentage of the tasks exhibit behavioural characteristics similar to those of tasks of jobtype 0. This is also well observed in the graphs in the appendix. A snapshot of the newly constructed clusters is available in the appendix.
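The clustering step above can be sketched with a minimal Lloyd's k-means as a stand-in for the off-the-shelf implementation the project used. The 2-D points below are hypothetical stand-ins for the real four-dimensional (task duration, jobtype, NormTaskCores, NormTaskMem) vectors.

```python
import random

def kmeans(points, k=4, iters=20, seed=1):
    """Minimal Lloyd's k-means over tuples of equal dimension."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centres drawn from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        # Assignment step: nearest centroid by squared Euclidean distance.
        for p in points:
            i = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        # Update step: move each centroid to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centroids[j] = tuple(
                    sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters

# Four synthetic blobs, mimicking four workload aggregates (k = 4).
base = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0), (10.0, 10.0)]
points = [(x + dx, y + dy) for x, y in base
          for dx in (0.0, 0.1, -0.1) for dy in (0.0, 0.1)]
centroids, clusters = kmeans(points, k=4)
```

The cluster memberships returned here play the role of Table 6's per-cluster instance counts.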

4.4 Work load - Energy consumption

Energy Waste Quantification and Failure Analysis

Version 1 of the Google data traces is a semi-abstract type of trace, in that limited information is availed to researchers for telling whether a termination event has occurred or not. An observation in the trace log is that some percentage of the tasks that experience a termination event are resubmitted in subsequent trace windows (a trace window being a 5 minute trace period specified for the execution of a task), while the rest are not. However, another percentage of the tasks observed in the trace are submitted and execute successfully within the trace window. This implies that a considerable amount of resource is wasted due to task terminations, and an understanding of resource waste should imply energy waste.

In this analysis, resource waste is defined to constitute the total time wasted due to task termination, also taking into consideration the average memory and CPU units consumed during this period. Time wasted, as introduced in the preceding sections, is the difference between the total task trace time and the productive time. Total task trace time is the sum of all task time from when the task is first monitored until it is last monitored, while productive time is the time between the latest monitoring of a task and the completion of monitoring without termination. We assume that the last monitoring constitutes a good probability of success, and that the presence of tasks after the termination behaviour implies a task restart by the scheduler, so that the time the task takes to complete successfully is regarded as productive time.

Summary of task execution per architecture type (jobtype): a failure analysis of tasks at the architectural level, giving the task count and percentage of total failures per architecture type.

Total failures: 9261

Table 7: Failure analysis of the 7 hour trace

Of the total unique tasks in the trace, this portion of tasks represents 18.75% ~ 19% of the total trace presented. However, we assume that 20% of the remaining trace log progresses further with such behaviour or state. Once considered and observed worthwhile, the total failures in the 7 hour Google trace impart a 39% failure rate of tasks.

For the Google data centre version 1 trace, it is interesting to study the way tasks behave during execution. First and foremost, various observations exist here that lead to a variety of suggestions imposed on the data set results.

a) First Observation. The data set contains task traces that indicate zero (0) resource usage for both CPU and memory, having previously run for some consecutive periods with realisable resource usage across resources. These represent 1% of the total tasks (33111). Our suggestion is that when a task is observed in the trace in this state, we identify such behaviour as termination behaviour that may indicate one of three events (an eviction, a kill or a fail event), or none of these, since it may also arise from a status condition in which the task is running and waiting for a resource update.

b) Second Observation. The same data set also contains records of task traces showing consistency across consecutive data centre monitorings of every 5 minutes, identified by consecutive resource usage even without a termination event indication. The suggestion due to such behaviour is that these tasks may constitute the tasks that executed completely or successfully.

c) Third Observation. Finally, with regard to task behaviour learnt from the trace log, various tasks exist that indicate zeros (0) for the resource usage of either memory or CPU monitored within the same 5 minute period.
50.8% ( out of ) of these tasks whose behaviour is identified herein ended in a termination status condition as described in the first observation. Our suggestion due to such task behaviour in the trace is that, since a large percentage of these tasks is assumed to end in failure, it is interesting to study whether this specific percentage is a deciding factor in energy waste.
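The three observations can be turned into a rough per-task classifier. The labels and the rule of inspecting only the final monitoring window are this sketch's own reading of the assumptions in section 4.1.4, not logic taken from any trace tooling.

```python
def classify_task(observations):
    """Label a task from its per-window (cpu, mem) usage readings.

    `observations` is the task's ordered list of 5-minute-window samples.
    Labels are this sketch's own; the trace records no explicit events.
    """
    if not observations:
        return "unknown"
    last_cpu, last_mem = observations[-1]
    if last_cpu == 0 and last_mem == 0:
        return "termination-like"  # first observation: both resources at zero
    if last_cpu == 0 or last_mem == 0:
        return "partial-zero"      # third observation: one resource at zero
    return "completed"             # second observation: consistent usage

# Three invented task histories, one per observation above.
labels = [classify_task(o) for o in (
    [(0.2, 0.1), (0.0, 0.0)],
    [(0.2, 0.1), (0.2, 0.1)],
    [(0.2, 0.1), (0.0, 0.1)],
)]
```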

A detailed failure analysis of the Google data centre traces across the different server architectures, herein used interchangeably with jobtype (0-3), is observed as in the table below.

Energy usage quantification

Recent publications based on large scale data sets, the 29 day Google data centre traces, have presented empirical evidence of 52% of task behaviour being identified as failure, due to the presence of the termination events introduced in the sections above. Since the 7 hour data centre trace captures a particular section of a day of the same Google cluster machines, it is important that clarity is demonstrated for the discrepancy in the percentages. Applying part of this detail, we study the energy waste that results from these failures. The time wasted can only draw realistic meaning if perceived in terms of energy, thus energy waste. With this understanding, the rate of task failure per cluster or workload type, and the quantification of energy waste per task and per workload type, should give insights into whether task classes impact the total energy waste in the cloud environment.

To achieve this, we have assumed a sample data centre server model, the Dell PowerEdge R520, whose energy performance is adopted from the SPECS2008 energy consumption benchmark results and power characteristics as described by [Schulz, 2010]. (2) This type of server, assumed to be one of the high performance servers of the x86 architecture type observed in use in all Google data centre traces (3), has the following specifications, over which the decision was made.

DELL PowerEdge R520 Server (model server):
CPU characteristics: 8 core, 2.30GHz, 20MB L3 Cache
Memory: 4GB 2Rx8 PC3L-10600E-9 ECC
Total power utilisation at 20% load: 80.4 W

Table 8: A section of the Dell PowerEdge R520 model server specifications

(2) SPECS2008 benchmark for a DELL PowerEdge R520 server.
(3) High performance server architectures (platforms) used by Google, sourced from Wikipedia:

The essence of the server selection is to serve as a basis for the energy consumption calculation for each task in the trace. The trace provides some detail that can be used to understand the energy consumption of the data centre for the 7 hour period: it consists of a set of workloads where each task runs on a single machine, and the tasks consume memory and one or more CPU cores, all measured in fractional units. Since, in the data centre, virtual machines are created on physical servers, the energy used by the single virtual machine onto which a task is mapped is calculated as follows.

Estimated mean energy consumed by a unit Dell PowerEdge server = 80.4 W.

The energy consumed by a task with task duration (z) and resource usage (memory (x) and CPU cores (y)) is the energy consumed by the single virtual machine on which it is running, stated as a Machine Energy Usage Profile (MEUP), given as a summation of watt-hours for both memory and CPU core usage:

MEUP = task resource usage hours x unit Dell PowerEdge server energy usage

which is in watt-hour units (Wh). This is fully supported by the section on determining your energy usage in the data centre from [42], where

task resource usage hours = task duration x resource usage, expressed in hours,

so that task resource usage hours are derived from Normalised Task Cores (y) over the task duration (z), and likewise task memory usage hours from Normalised Task Memory (x).

Based on the assumptions in section 4.1.4, we calculate the energy consumption, that is to say the total, wasted and utilised energy. Utilised energy is the energy due to a task executing successfully, while wasted energy is the resultant energy due to a task's failure to execute successfully. The total energy is the summation of both utilised energy and wasted energy. The energy calculations are made possible through initial calculations of the individual task durations: full task trace time, productive time and wasted time.

The time wasted is calculated as: total time minus productive time. For example, if the total time taken by a task in the trace is 900 seconds, and the productive time (the time when it was last monitored and

executed successfully) is 600 s, with average resource usage values of x = 0.25 and y = 0.8, then:

Full task trace time = 900 s, equivalent to 15 minutes;
Productive time = 600 s, equivalent to 10 minutes;
Wasted time = (900 - 600) = 300 s, ~ 5 minutes.

The task resource usage hours from Normalised Task Cores (y) for task duration (z) come to 1 core-hour, while the task memory usage from Normalised Task Memory comes to 0.21 memory-hours. The MEUP is then

MEUP = task resource usage hours x unit Dell PowerEdge server energy usage.

For memory energy usage: 0.21 * 80.4 = 16.884 watt-hours, while the CPU core energy usage is 1 * 80.4 = 80.4 watt-hours (Wh), such that the MEUP, the summation of the memory energy usage and the CPU core energy usage, is 16.884 + 80.4 = 97.284 watt-hours (Wh).

Discussion of the energy models

There is quite a large disparity between the completed task aggregations and the tasks of jobtypes 1 and 0. However, the trend was exponential for all three energy models, as observed in the figure below. It is thus worth concluding that all aggregations contributed quite large amounts of energy waste. For the energy waste models for tasks of particular job types (architectures), refer to the appendix and figures (a & b).
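The wasted-energy calculation can be sketched in code. The convention that resource-usage hours equal normalised usage multiplied by duration in hours is an assumption of this sketch, since the worked figures above do not fully pin the conversion down; only the 80.4 W figure comes from Table 8.

```python
SERVER_POWER_W = 80.4  # Dell PowerEdge R520 at 20% load (Table 8)

def meup_wh(norm_cores, norm_memory, duration_s):
    """Machine Energy Usage Profile in watt-hours for one task.

    Assumes resource-usage hours = normalised usage * duration in hours
    (this sketch's convention, not the dissertation's exact formula).
    """
    hours = duration_s / 3600.0
    cpu_hours = norm_cores * hours
    mem_hours = norm_memory * hours
    # MEUP = (CPU hours + memory hours) * unit server energy usage.
    return (cpu_hours + mem_hours) * SERVER_POWER_W

def wasted_energy_wh(norm_cores, norm_memory, total_s, productive_s):
    # Wasted time = total task trace time - productive time.
    return meup_wh(norm_cores, norm_memory, total_s - productive_s)

# The example task from the text: x = 0.25, y = 0.8, 900 s total, 600 s productive.
waste = wasted_energy_wh(norm_cores=0.8, norm_memory=0.25,
                         total_s=900, productive_s=600)
```

Under this convention the example task wastes (0.8 + 0.25) * (300/3600) * 80.4 ≈ 7.04 Wh, smaller than the worked figure above because of the differing hours conversion.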

[Figure: (a) Energy waste for VMs of tasks with JobType 0; (b) Energy waste for VMs of tasks with JobType 1; each plotted with its energy waste trend.]

Figure 16: Energy waste models for architectures 0 & 1. Energy waste models for tasks of other jobtypes are provided in the appendix.

Chapter Five: Evaluation
This section evaluates the project: how can we tell that the models and the project were successful?

5.1. Evaluating the Methodology
We have presented a standard methodology for workload analysis and classification, driven by a mixture of methodological approaches: coarse-grain analysis, clustering, and finally a fine-grain analysis to quantify energy waste among workload categories. This combined approach has been successful at distinguishing and characterising abnormal behaviour within the workload, such as failures or abnormal task terminations. We have further developed an automated Ruby environment for processing trace workloads of a similar nature.

5.2. Evaluating the Results
a) The research was interested in deciding whether the aggregation observed in the data centre trace log particularly impacts realistic energy consumption or not. Either answer is vital in supporting decision making by cloud or data centre users.
b) Results from the investigative clustering indicate that most VMs created for tasks in the trace were underutilised: we observed that most of the tasks consumed only 20% of the resources allocated to their VM. This is an indication of the higher energy waste observed in the traces for all architecture types or task categories (jobtypes 0, 1, 2 and 3).
c) Arguably, it is further observed that resource utilisation was steady and relatively stable for all task classes. Although strong fluctuations are observed at the start of trace monitoring, consumption is stable afterwards. It is, however, also observed that such behaviour is unrealistic usage: the VMs hosted on the servers are made active at almost 100% load and later suddenly drawn down to 18% or 20%,

which still results in a high energy waste, as represented in the figures below. Thus fluctuations like those depicted here, as observed in clouds, may still affect energy waste. Whereas Google describes VM provisioning such that each task runs in its own VM, the scheduler policy does not reflect full utilisation of resources.
d) Attempting to answer the question "could percentage composition be a determining factor in energy waste?", a fine-grain evaluation of the results was made: a two-sample t-test on the two populations, namely the 7-hour trace (from this project) and the 29-day trace, both from Google data centres. Section 4.4 calculates and presents an empirical assumption for the percentage failure; presented as a fraction of the total unique tasks in the trace, this portion of tasks represents 18.75% ≈ 19% of the total trace. However, based on the observations in section 4.4.1, we assume that 20% of the remaining trace log continues with such behaviour. Taken together, the total failure in the 7-hour Google trace may amount to a 39% failure of tasks. We use percentage failures to evaluate energy waste: taking the 39% for the three types of events (failure, eviction, kill) observed in the 7-hour Google trace, we compare it to the 52% observed in a larger-scale trace, the 29-day Google trace. To contrast these percentages we use the statistical properties of the two populations, as in section 3, to identify whether the observed percentage behaviour of the two traces implies almost equal resource waste, using two-sample t-test results for both memory and CPU core usage. From the calculated t-values and the standard t-table, we obtain t = 307 and t = 317 for the CPU core and memory populations respectively. This gives a t-table reading at the 80% confidence level.
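The comparison above can be sketched as follows. This is a minimal illustration of a two-sample t statistic (Welch's form, which does not assume equal variances) in Ruby; the helper names and the sample arrays are placeholders, not the project's actual populations or its exact test procedure.

```ruby
# Illustrative two-sample (Welch's) t statistic, of the kind used to
# contrast the resource-usage populations of the 7-hour and 29-day traces.
# The sample arrays below are placeholders, not the project's data.
def mean(xs)
  xs.sum(0.0) / xs.length
end

def variance(xs)
  m = mean(xs)
  xs.sum(0.0) { |x| (x - m)**2 } / (xs.length - 1)  # sample variance
end

def welch_t(a, b)
  (mean(a) - mean(b)) /
    Math.sqrt(variance(a) / a.length + variance(b) / b.length)
end

cpu_7h  = [0.20, 0.22, 0.19, 0.21, 0.20]  # placeholder CPU-core usage samples
cpu_29d = [0.30, 0.31, 0.29, 0.32, 0.30]
puts welch_t(cpu_7h, cpu_29d)
```

The resulting t value would then be read against a standard t-table at the chosen confidence level, as done above.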
Since our t-values are so high and fall within a good p-value range (p-value close to 0.001), we are able to say that the 39% observed in the 7-hour trace is representative of the 52% obtained from the 29-day trace; hence the observed, unacceptably high energy waste.

5.3. Limitations
A limitation to the effectiveness of the results, and to the efficiency of the infrastructure developed to process the traces, is the large number of assumptions made; many more would be needed for fully effective observations. The criteria for evaluating this project changed due to the

fact that statistics for the 29-day trace are available through the trace papers, and since that data set comes from the same clusters, as explained by [21], it was not possible to use CloudSim for evaluation of the energy models developed. However, using the statistics provided by the different trace papers and those from this project, we have been able to evaluate our results. Such shortcomings could be avoided in the future if a thorough literature search is made, including the most recent papers, say those published in the same year as the research in hand. In any case, learning to use an infrastructure like CloudSim in the remaining period was not feasible.

5.4. Future Work
The analysis of server utilisation and resource consumption of the Google data centre cloud traces has only attempted to fully characterise the version 1 trace, and has explored the concept of energy efficiency from the perspective of wasted energy and total energy utilised on a SPECPower2008 benchmark server at a 20% active state. The interest of the analysis work presented in this research was in understanding user behaviour in the cloud, observed because cloud users currently have the mandate to scale to whatever resource capacity they want under a utility-computing-based model. Further research studies using this same data set are vital to understanding how various issues are handled in the cloud data centre, such as scheduling, resource provisioning, virtual machine resource migration (hardware and software), et cetera. More studies on the same trace, investigating exactly which server models the VMs were made of, are also possible, since, given the jobtype and the corresponding resource consumption, VM instances created on the same machine possess approximately similar configurations to the host machine (cluster servers).
5.5. Conclusion
To provide a more precise workload description and improve the results of the energy consumption calculations, extra detail is needed; this is provided in the second version of the trace. And to obtain trustworthy results that relate to the machine configuration, assumptions need to be made with the support of industrial performance benchmarks such as SPECPower2008, among many others. So, in conclusion of the work done in this thesis and

project in general: a computerised analysis of the Google data centre trace has been achieved; the jobtype detail in the trace has been used for categorisation of the workload; analysis of server utilisation based on the VM architecture has been done; the basic trace properties have been calculated over the full trace; and the utilisation and resource consumption graphical models, together with the energy waste graphical models, have been generated, as required by the project requirements. Finally, the project has been evaluated in terms of both its methodology and its results. Based on the statistics provided by the trace papers, basic statistics have been used to contrast our results with those presented there. We have explored energy waste calculation and, hopefully, the approach is presented in a way that is easy to comprehend.

Bibliography
[1] Ali-Eldin, A., Tordsson, J., Elmroth, E., & Kihl, M. (2013). Workload Classification for Efficient Auto-Scaling of Cloud Resources.
[2] Amazon, Amazon web service data sets (2009).
[3] Amrhein, D. et al., Cloud computing use cases (2010).
[4] AnthonyS, Viewing Microsoft trace data (2009).
[5] Armbrust, M., Fox, A., Griffith, R., Joseph, A. D., Katz, R., Konwinski, A., ... & Zaharia, M. (2010). A View of Cloud Computing: Clearing the clouds away from the true potential and obstacles posed by this computing capability. Communications of the ACM, 53(4), 50.
[6] B. Newton and H. VanHook, "Cloud Cover Delivering on the Value of the Cloud in Public Sector IT Organizations," BMC Software, White Paper.
[7] Beloglazov, A., & Buyya, R. (2010, May). Energy efficient resource management in virtualized cloud data centers. In Proceedings of the IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society.
[8] Benson, T., Akella, A., & Maltz, D. A. (2010, November). Network traffic characteristics of data centers in the wild. In Proceedings of the 10th ACM SIGCOMM conference on Internet measurement. ACM.
[9] Berl, A., Gelenbe, E., Di Girolamo, M., Giuliani, G., De Meer, H., Dang, M. Q., & Pentikousis, K. (2010). Energy-efficient cloud computing. The Computer Journal, 53(7).

[10] Buyya, R., Beloglazov, A., & Abawajy, J. (2010). Energy-efficient management of data center resources for cloud computing: A vision, architectural elements, and open challenges. arXiv preprint.
[11] Buyya, R., Ranjan, R., and Calheiros, R. (2009). Modelling and Simulation of Scalable Cloud Computing Environment and the CloudSim Toolkit: Challenges and Opportunities. International Conference on High Performance Computing and Simulation, HPCS, June.
[12] Calheiros, R. et al. (2010). CloudSim: A Toolkit for Modelling and Simulation of Cloud Computing Environments and Evaluation of Resources Provisioning Algorithms. Software Practice and Experience 41, 24 August, John Wiley & Sons, Ltd.
[13] Chalermarrewong, T., Achalakul, T., & See, S. C. W. (2012, December). Failure Prediction of Data Centers Using Time Series and Fault Tree Analysis. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on. IEEE.
[14] Corradi, A., Fanelli, M., and Foschini, L. (2011). Increasing Cloud Power Efficiency through Consolidation Techniques. IEEE Symposium on Computers and Communications, 28 June - 1 July.
[15] Dave Durkee, The competition among cloud providers may drive prices downward, but at what cost? ACM Queue, p. 2.
[16] Di, S., Kondo, D., & Cirne, W. (2012, September). Characterization and Comparison of Cloud versus Grid Workloads. In Cluster Computing (CLUSTER), 2012 IEEE International Conference on. IEEE.
[17] Ferdman, M., Adileh, A., Kocberber, O., Volos, S., Alisafaee, M., Jevdjic, D., ... & Falsafi, B. (2011). Clearing the Clouds: A Study of Emerging Workloads on Modern Hardware. Tech. Rep.
[18] Foster, I. et al. (2008). Cloud Computing and Grid Computing 360-Degree Compared. Grid Computing Environment Workshop, GCE, November.

Gartner. (2013). Virtualization. Last accessed 31 Aug.
[19] Gen Frank, Defining Cloud Services and cloud computing (2008).
[20] Gill, P., Jain, N., & Nagappan, N. (2011, August). Understanding network failures in data centers: measurement, analysis, and implications. In ACM SIGCOMM Computer Communication Review (Vol. 41, No. 4). ACM.
[21] John Wilkes, googleclusterdata, https://code.google.com/p/googleclusterdata/wiki/
[22] Mishra, A. K., Hellerstein, J. L., Cirne, W., & Das, C. R. (2010). Towards characterizing cloud backend workloads: insights from google compute clusters. ACM SIGMETRICS Performance Evaluation Review, 37(4).
[23] Jeff Hammerbacher, Global Information Platforms: Evolving the Data Warehouse. A Cloudera product report, March 9.
[24] Jeffrey Dean, Sanjay Ghemawat, MapReduce: simplified data processing on large clusters, Communications of the ACM, v.51 n.1, January 2008.
[25] Jie Xu (2013). Advanced Distributed Systems Lecture notes. (Accessed: 2nd March 2013).
[26] Joseph L. Hellerstein (2010). Google cluster data. Available at: http://googleresearch.blogspot.com/2010/01/google-cluster-data.html (Accessed: 2nd May 2013).
[27] Mell, P., & Grance, T. (2011). The NIST definition of cloud computing (draft). NIST special publication, 800(145), 7.
[28] Lesser Adam. (2012). 4 types of data centers. Last accessed 20th Aug.

[29] Michael Kanellos. (6/06/2013). Google Says: Save Energy, Ditch Your Data Center. Green Tech. Retrieved June 11.
[30] Poslad, Stefan (2009). Ubiquitous Computing: Smart Devices, Smart Environments and Smart Interaction. Wiley.
[31] Moreno, I. S., Garraghan, P., Townend, P., & Xu, J. (2013). An Approach for Characterizing Workloads in Google Cloud to Derive Realistic Resource Utilization Models. In SOSE.
[32] Pesout, P., & Matustik, O. (2012). On a Modeling of Online User Behavior Using Function Representation. Mathematical Problems in Engineering.
[33] Pravin. (2011). About the Data Center. Last accessed 31 Aug 2013.
[34] Rai, A., Bhagwan, R., & Guha, S. (2012, October). Generalized resource allocation for the cloud. In Proceedings of the Third ACM Symposium on Cloud Computing (p. 15). ACM.
[35] Shivakumar, B. L., & Raju, T. (2010, August). Emerging Role of Cloud Computing in Redefining Business Operations. Global Management Review, IV(4).
[36] Tim Grieser (2008). Enabling Data Center Automation with Virtualised Infrastructure. White paper sponsored by VMware, pp. 1-6.
[37] Tran, T. T., & J. C. Beck. (2012). "Report: Google Data Center Scheduling", Technical Report, University of Toronto, Canada.
[38] Wang, G., Butt, A. R., Monti, H., & Gupta, K. (2011, July). Towards synthesizing realistic workload traces for studying the hadoop ecosystem. In Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2011 IEEE 19th International Symposium on. IEEE.

[39] Webopedia. (2013). Virtualization. Last accessed 31 Aug.
[40] Yahoo!, M45 supercomputing project, 2009.
[41] Ye, K. et al. (2010). Virtual Machine Based Energy-Efficient Data Center Architecture for Cloud Computing: A Performance Perspective. IEEE/ACM International Conference on Green Computing and Communications & IEEE/ACM International Conference on Cyber, Physical and Social Computing, December.
[42] Greg Schulz. (2009). Determining energy usage in the data center. Last accessed 1st Sept.
[43] Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A. D., Katz, R., ... & Stoica, I. (2011, March). Mesos: A platform for fine-grained resource sharing in the data center. In Proceedings of the 8th USENIX conference on Networked systems design and implementation. USENIX Association.
[44] Garraghan, P., Townend, P., & Xu, J. An Analysis of the Server Characteristics and Resource Utilization in Google Cloud.
[45] Gutierrez, C. F. (2013). Cloud Computing: An Introduction for the Layperson. International Journal, 2(1).
[46] Dawson, P., & Bittman, T. J. (2008). Virtualization Changes Virtually Everything. Gartner Special Report.
[47] Xiao-Hang, L. I. (2010). Construction and Practice of Embedded Software Development Environment Based on Virtual Technology [J]. Computer Knowledge and Technology, 32, 044.
[48] Greening, L. A., Greene, D. L., & Difiglio, C. (2000). Energy efficiency and consumption - the rebound effect - a survey. Energy Policy, 28(6).
[49] Kreith, F., & Goswami, D. Y. (Eds.). (2007). Energy Management and Conservation Handbook. CRC Press.

[50] Plummer, D. C., Bittman, T. J., Austin, T., Cearley, D. W., & Smith, D. M. (2008). Cloud computing: Defining and describing an emerging phenomenon. Gartner, June, 17.
[51] Ramanathan, Shalini, Savita Goel, and Subramanian Alagumalai. "Comparison of Cloud database: Amazon's SimpleDB and Google's Bigtable." Recent Trends in Information Systems (ReTIS), 2011 International Conference on. IEEE, 2011.
[52] Sam Johnston. (2009). Cloud computing. Last accessed 1st Sept.

Appendices
Appendix 1: Project Reflection
Overall, my project was successful, though there are things I would have done better given more time. Throughout this project I found a great opportunity to develop my programming and database management skills, because the project involved handling massive datasets from the Google clusters, which required a database management system and a scripting platform to perform automated sampling and processing of the data trace for various queries. It is also worth sharing that it was through this project that I got a chance to learn deeply and explore various issues in computational modelling, cloud computing, distributed systems, techniques for knowledge management, and the statistical analysis relevant to the project evaluation. While working on this project I registered with various websites, Usenets and mailing lists, like the Google Groups used for research on the Google traces. The project methodology design, the distribution modelling of the workload, energy efficiency and the project evaluation posed tough challenges, but feedback, weekly meetings and guidance from my supervisor, the assessor and the Distributed Systems research group simplified the delivery of solutions and the understanding of the whole project. Furthermore, I must acknowledge and emphasise the importance of regular meetings and feedback: these indeed provided proper research direction and meaning. The power of the meetings is immense because, even when something outside the project domain arose, like health issues or the need for moral support, a meeting would provide ground for a constructive talk, hence an attempt at a solution. During the progress meetings I got sincere and knowledgeable support from my assessor, which helped me a great deal in evaluating whether the project requirements had been attained.
I would also attribute this enormous interest in cloud computing to the Advanced Distributed Systems module leaders: the approach applied to the course delivery was a brilliant one. I had also weighed my options, and my interests were strongest in that research area. The module leaders frequently introduced research relevant to the modules that was currently pursued by the research group, and this gave me a chance to propose a project idea that was later improved into a viable MSc project.

During the project implementation I got a chance to attend the World Cloud forums in London, where I exchanged ideas with firms in the cloud business. The conversations with them provided knowledge of cloud techniques, technologies, service providers, and approaches and mechanisms for efficient cloud service delivery and energy consumption, including commonly identified eco-friendly (energy-efficient) approaches, methods and techniques. However, I realised that I knew little about the area of cloud computing and data centres; this further motivated me to fully harness the treasures that cloud computing, and distributed computing in general, are bound to deliver. It is due to all these academic favours that I have developed an interest in advancing my studies in this area. From a project management point of view, I regard time management and critical analysis as the most important parameters in completing the project. Critical evaluation of the various project phases enabled me to develop a thorough and well-balanced approach towards them, from different angles and dimensions. In order to complete my work within the required time frame, I planned my work in a proper and organised scheme, which worked well for me until the last week of my design phase; my assessor commented on it and I adjusted it. Indeed, some phases were not easy: I faced some challenges at the implementation and final evaluation phases due to difficulty keeping pace with UK time, though I managed. Moreover, the mid-project updates on the project evaluation criteria shifted the earlier suggestion of using CloudSim simulation data towards the use of statistical tests (two-sample t-test results). However, the shift enabled me to learn, practically, contingency project planning and more statistical analysis. This short-term planning (the creation of a contingency plan for the project evaluation methodology and the results evaluation) required four extra days of tutorials to learn this type of analysis.
Eventually, once again, I planned and managed my time correctly. I continued writing my project report while finalising the implementation of the various model developments over the project datasets. This allowed me to balance the time requirement and efficiency, and to avoid last-minute panic. Lastly, I must also share that my experience of presenting progress on the project to the research group helped to develop a sincere evaluation direction for the project. And since learning and skill advancement never end, I believe that my abilities and skills for distributed and cloud computing research shall be gradually enhanced.

Appendix 2: Interim Report (Assessor's comments)

Appendix 3: CPU and Memory consumption for each of the architectures (0, 1, 2, 3)


Appendix: More energy models for tasks of jobtype 2 & 3
[Figure: (c) Energy waste for VMs of tasks with JobType 2; (d) Energy waste for VMs of tasks with JobType 3; each plotted with its energy waste trend.]

Appendix 4: SPECPower2008 server model used

Appendix 5: A snapshot of the data utilised in the project.

Clustered Results

Appendix 6: A presentation to the Assessor




See Appendix A for the complete definition which includes the five essential characteristics, three service models, and four deployment models. Cloud Strategy Information Systems and Technology Bruce Campbell What is the Cloud? From http://csrc.nist.gov/publications/nistpubs/800-145/sp800-145.pdf Cloud computing is a model for enabling ubiquitous,

More information

Mobile Cloud Networking FP7 European Project: Radio Access Network as a Service

Mobile Cloud Networking FP7 European Project: Radio Access Network as a Service Optical switch WC-Pool (in a data centre) BBU-pool RAT 1 BBU-pool RAT 2 BBU-pool RAT N Mobile Cloud Networking FP7 European Project: Radio Access Network as a Service Dominique Pichon (Orange) 4th Workshop

More information

Sistemi Operativi e Reti. Cloud Computing

Sistemi Operativi e Reti. Cloud Computing 1 Sistemi Operativi e Reti Cloud Computing Facoltà di Scienze Matematiche Fisiche e Naturali Corso di Laurea Magistrale in Informatica Osvaldo Gervasi ogervasi@computer.org 2 Introduction Technologies

More information

9/26/2011. What is Virtualization? What are the different types of virtualization.

9/26/2011. What is Virtualization? What are the different types of virtualization. CSE 501 Monday, September 26, 2011 Kevin Cleary kpcleary@buffalo.edu What is Virtualization? What are the different types of virtualization. Practical Uses Popular virtualization products Demo Question,

More information

Cloud Computing Today. David Hirsch April 2013

Cloud Computing Today. David Hirsch April 2013 Cloud Computing Today David Hirsch April 2013 Outline What is the Cloud? Types of Cloud Computing Why the interest in Cloud computing today? Business Uses for the Cloud Consumer Uses for the Cloud PCs

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Rohit Thakral rohit@targetintegration.com +353 1 886 5684 About Rohit Expertise Sales/Business Management Helpdesk Management Open Source Software & Cloud Expertise Running

More information

Cloud Computing - Architecture, Applications and Advantages

Cloud Computing - Architecture, Applications and Advantages Cloud Computing - Architecture, Applications and Advantages 1 Arun Mani Tripathi 2 Rizwan Beg NIELIT Ministry of C&I.T., Govt. of India 2 Prof. and Head, Department 1 of Computer science and Engineering,Integral

More information

SCADA Cloud Computing

SCADA Cloud Computing SCADA Cloud Computing Information on Cloud Computing with SCADA systems Version: 1.0 Erik Daalder, Business Development Manager Yokogawa Electric Corporation Global SCADA Center T: +31 88 4641 360 E: erik.daalder@nl.yokogawa.com

More information

Customer Engagement & The Cloud

Customer Engagement & The Cloud Customer Engagement & The Cloud Silverbear Membership Customer Engagement & The Cloud There has been a lot of talk and hype recently surrounding this new phenomenon called the Cloud". A lot of senior business

More information

Lecture 02a Cloud Computing I

Lecture 02a Cloud Computing I Mobile Cloud Computing Lecture 02a Cloud Computing I 吳 秀 陽 Shiow-yang Wu What is Cloud Computing? Computing with cloud? Mobile Cloud Computing Cloud Computing I 2 Note 1 What is Cloud Computing? Walking

More information

Kent State University s Cloud Strategy

Kent State University s Cloud Strategy Kent State University s Cloud Strategy Table of Contents Item Page 1. From the CIO 3 2. Strategic Direction for Cloud Computing at Kent State 4 3. Cloud Computing at Kent State University 5 4. Methodology

More information

ITSM in the Cloud. An Overview of Why IT Service Management is Critical to The Cloud. Presented By: Rick Leopoldi RL Information Consulting LLC

ITSM in the Cloud. An Overview of Why IT Service Management is Critical to The Cloud. Presented By: Rick Leopoldi RL Information Consulting LLC ITSM in the Cloud An Overview of Why IT Service Management is Critical to The Cloud Presented By: Rick Leopoldi RL Information Consulting LLC What s Driving the Move to Cloud Computing Greater than 70%

More information

Energy Constrained Resource Scheduling for Cloud Environment

Energy Constrained Resource Scheduling for Cloud Environment Energy Constrained Resource Scheduling for Cloud Environment 1 R.Selvi, 2 S.Russia, 3 V.K.Anitha 1 2 nd Year M.E.(Software Engineering), 2 Assistant Professor Department of IT KSR Institute for Engineering

More information

Cloud Computing in the Enterprise An Overview. For INF 5890 IT & Management Ben Eaton 24/04/2013

Cloud Computing in the Enterprise An Overview. For INF 5890 IT & Management Ben Eaton 24/04/2013 Cloud Computing in the Enterprise An Overview For INF 5890 IT & Management Ben Eaton 24/04/2013 Cloud Computing in the Enterprise Background Defining the Cloud Issues of Cloud Governance Issue of Cloud

More information

journey to a hybrid cloud

journey to a hybrid cloud journey to a hybrid cloud Virtualization and Automation VI015SN journey to a hybrid cloud Jim Sweeney, CTO GTSI about the speaker Jim Sweeney GTSI, Chief Technology Officer 35 years of engineering experience

More information

WhitePaper. Private Cloud Computing Essentials

WhitePaper. Private Cloud Computing Essentials Private Cloud Computing Essentials The 2X Private Cloud Computing Essentials This white paper contains a brief guide to Private Cloud Computing. Contents Introduction.... 3 About Private Cloud Computing....

More information

20 th Year of Publication. A monthly publication from South Indian Bank. www.sib.co.in

20 th Year of Publication. A monthly publication from South Indian Bank. www.sib.co.in To kindle interest in economic affairs... To empower the student community... Open YAccess www.sib.co.in ho2099@sib.co.in A monthly publication from South Indian Bank 20 th Year of Publication Experience

More information

2) Xen Hypervisor 3) UEC

2) Xen Hypervisor 3) UEC 5. Implementation Implementation of the trust model requires first preparing a test bed. It is a cloud computing environment that is required as the first step towards the implementation. Various tools

More information

Cloud Computing: The Next Computing Paradigm

Cloud Computing: The Next Computing Paradigm Cloud Computing: The Next Computing Paradigm Ronnie D. Caytiles 1, Sunguk Lee and Byungjoo Park 1 * 1 Department of Multimedia Engineering, Hannam University 133 Ojeongdong, Daeduk-gu, Daejeon, Korea rdcaytiles@gmail.com,

More information

CLOUD COMPUTING. Keywords: Cloud Computing, Data Centers, Utility Computing, Virtualization, IAAS, PAAS, SAAS.

CLOUD COMPUTING. Keywords: Cloud Computing, Data Centers, Utility Computing, Virtualization, IAAS, PAAS, SAAS. CLOUD COMPUTING Mr. Dhananjay Kakade CSIT, CHINCHWAD, Mr Giridhar Gundre CSIT College Chinchwad Abstract: Cloud computing is a technology that uses the internet and central remote servers to maintain data

More information

Getting Familiar with Cloud Terminology. Cloud Dictionary

Getting Familiar with Cloud Terminology. Cloud Dictionary Getting Familiar with Cloud Terminology Cloud computing is a hot topic in today s IT industry. However, the technology brings with it new terminology that can be confusing. Although you don t have to know

More information

Architectural Implications of Cloud Computing

Architectural Implications of Cloud Computing Architectural Implications of Cloud Computing Grace Lewis Research, Technology and Systems Solutions (RTSS) Program Lewis is a senior member of the technical staff at the SEI in the Research, Technology,

More information

Overview. The Cloud. Characteristics and usage of the cloud Realities and risks of the cloud

Overview. The Cloud. Characteristics and usage of the cloud Realities and risks of the cloud Overview The purpose of this paper is to introduce the reader to the basics of cloud computing or the cloud with the aim of introducing the following aspects: Characteristics and usage of the cloud Realities

More information

Emergence of Cloud. Definition. Service Models. Deployment Models. Software as a Service (SaaS) Public Cloud. Platform as a Service (PaaS)

Emergence of Cloud. Definition. Service Models. Deployment Models. Software as a Service (SaaS) Public Cloud. Platform as a Service (PaaS) Forth House 28 Rutland Square Edinburgh, Scotland EH1 2BW 0131 202 6018 www.farrpoint.com The best of both worlds A path to business transformation through the use of Cloud technology The demand for cloud

More information

6 Cloud strategy formation. 6.1 Towards cloud solutions

6 Cloud strategy formation. 6.1 Towards cloud solutions 6 Cloud strategy formation 6.1 Towards cloud solutions Based on the comprehensive set of information, collected and analysed during the strategic analysis process, the next step in cloud strategy formation

More information

Simulation of Cloud Computing Eco-Efficient Data Centre

Simulation of Cloud Computing Eco-Efficient Data Centre Simulation of Cloud Computing Eco-Efficient Data Centre Ibrahim Alzamil MSc Computing and Management Session (2011/2012) The candidate confirms that the work submitted is their own and the appropriate

More information

Cloud Computing INTRODUCTION

Cloud Computing INTRODUCTION Cloud Computing INTRODUCTION Cloud computing is where software applications, processing power, data and potentially even artificial intelligence are accessed over the internet. or in simple words any situation

More information

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2 DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing Slide 1 Slide 3 A style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet.

More information

Assistance of Cloud Computing For Green Computing

Assistance of Cloud Computing For Green Computing Assistance of Cloud Computing For Green Computing Rashmi R. Rathi Department of Masters in Computer Application Prof.Ram Meghe Institute Of Technology And Research, Badnera,Amravati rrrathi777@gmail.com

More information

International Journal of Engineering Research & Management Technology

International Journal of Engineering Research & Management Technology International Journal of Engineering Research & Management Technology March- 2015 Volume 2, Issue-2 Survey paper on cloud computing with load balancing policy Anant Gaur, Kush Garg Department of CSE SRM

More information

Dr.K.C.DAS HEAD PG Dept. of Library & Inf. Science Utkal University, Vani Vihar,Bhubaneswar

Dr.K.C.DAS HEAD PG Dept. of Library & Inf. Science Utkal University, Vani Vihar,Bhubaneswar Dr.K.C.DAS HEAD PG Dept. of Library & Inf. Science Utkal University, Vani Vihar,Bhubaneswar There is potential for a lot of confusion surrounding the definition of cloud computing. In its basic conceptual

More information

Cloud Computing: Making the right choices

Cloud Computing: Making the right choices Cloud Computing: Making the right choices Kalpak Shah Clogeny Technologies Pvt Ltd 1 About Me Kalpak Shah Founder & CEO, Clogeny Technologies Passionate about economics and technology evolving through

More information

CLOUD COMPUTING. When It's smarter to rent than to buy

CLOUD COMPUTING. When It's smarter to rent than to buy CLOUD COMPUTING When It's smarter to rent than to buy Is it new concept? Nothing new In 1990 s, WWW itself Grid Technologies- Scientific applications Online banking websites More convenience Not to visit

More information

Introduction to Cloud Services

Introduction to Cloud Services Introduction to Cloud Services (brought to you by www.rmroberts.com) Cloud computing concept is not as new as you might think, and it has actually been around for many years, even before the term cloud

More information

Virtualization and Cloud Computing

Virtualization and Cloud Computing Written by Zakir Hossain, CS Graduate (OSU) CEO, Data Group Fed Certifications: PFA (Programming Foreign Assistance), COR (Contracting Officer), AOR (Assistance Officer) Oracle Certifications: OCP (Oracle

More information

Secure Cloud Computing through IT Auditing

Secure Cloud Computing through IT Auditing Secure Cloud Computing through IT Auditing 75 Navita Agarwal Department of CSIT Moradabad Institute of Technology, Moradabad, U.P., INDIA Email: nvgrwl06@gmail.com ABSTRACT In this paper we discuss the

More information

Managing Cloud Computing Risk

Managing Cloud Computing Risk Managing Cloud Computing Risk Presented By: Dan Desko; Manager, Internal IT Audit & Risk Advisory Services Schneider Downs & Co. Inc. ddesko@schneiderdowns.com Learning Objectives Understand how to identify

More information

Data Centers and Cloud Computing. Data Centers

Data Centers and Cloud Computing. Data Centers Data Centers and Cloud Computing Intro. to Data centers Virtualization Basics Intro. to Cloud Computing 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises

More information

Cloud Computing Backgrounder

Cloud Computing Backgrounder Cloud Computing Backgrounder No surprise: information technology (IT) is huge. Huge costs, huge number of buzz words, huge amount of jargon, and a huge competitive advantage for those who can effectively

More information

Introduction to Cloud Computing

Introduction to Cloud Computing Introduction to Cloud Computing Cloud Computing I (intro) 15 319, spring 2010 2 nd Lecture, Jan 14 th Majd F. Sakr Lecture Motivation General overview on cloud computing What is cloud computing Services

More information

The Key Components of a Cloud-Based Unified Communications Offering

The Key Components of a Cloud-Based Unified Communications Offering The Key Components of a Cloud-Based Unified Communications Offering Organizations must enhance their communications and collaboration capabilities to remain competitive. Get up to speed with this tech

More information

How cloud computing can transform your business landscape

How cloud computing can transform your business landscape How cloud computing can transform your business landscape Introduction It seems like everyone is talking about the cloud. Cloud computing and cloud services are the new buzz words for what s really a not

More information

A Study of Infrastructure Clouds

A Study of Infrastructure Clouds A Study of Infrastructure Clouds Pothamsetty Nagaraju 1, K.R.R.M.Rao 2 1 Pursuing M.Tech(CSE), Nalanda Institute of Engineering & Technology,Siddharth Nagar, Sattenapalli, Guntur., Affiliated to JNTUK,

More information

VMware for your hosting services

VMware for your hosting services VMware for your hosting services Anindya Kishore Das 2009 VMware Inc. All rights reserved Everybody talks Cloud! You will eat your cloud and you will like it! Everybody talks Cloud - But what is it? VMware

More information

Certified Cloud Computing Professional Sample Material

Certified Cloud Computing Professional Sample Material Certified Cloud Computing Professional Sample Material 1. INTRODUCTION Let us get flashback of few years back. Suppose you have some important files in a system at home but, you are away from your home.

More information

A white paper from Fordway on CLOUD COMPUTING. Why private cloud should be your first step on the cloud computing journey - and how to get there

A white paper from Fordway on CLOUD COMPUTING. Why private cloud should be your first step on the cloud computing journey - and how to get there A white paper from Fordway on CLOUD COMPUTING Why private cloud should be your first step on the cloud computing journey - and how to get there PRIVATE CLOUD WHITE PAPER January 2012 www.fordway.com Page

More information

Novel Network Computing Paradigms (I)

Novel Network Computing Paradigms (I) Lecture 4 Novel Network Computing Paradigms (I) Part B Cloud Computing Graduate Course, Hosei U., J. Ma 1 Computing Paradigm Evolution Personal PC Client Server Cloud Computing Hardware Centric Software

More information

The Key Components of a Cloud-Based UC Offering

The Key Components of a Cloud-Based UC Offering The Key Components of a Cloud-Based UC Offering Organizations must enhance their communications and collaboration capabilities to remain competitive. Get up to speed with this tech primer and find new

More information

Introduction to Cloud Computing

Introduction to Cloud Computing 1 Introduction to Cloud Computing CERTIFICATION OBJECTIVES 1.01 Cloud Computing: Common Terms and Definitions 1.02 Cloud Computing and Virtualization 1.03 Early Examples of Cloud Computing 1.04 Cloud Computing

More information

OVERVIEW Cloud Deployment Services

OVERVIEW Cloud Deployment Services OVERVIEW Cloud Deployment Services Audience This document is intended for those involved in planning, defining, designing, and providing cloud services to consumers. The intended audience includes the

More information

The NREN s core activities are in providing network and associated services to its user community that usually comprises:

The NREN s core activities are in providing network and associated services to its user community that usually comprises: 3 NREN and its Users The NREN s core activities are in providing network and associated services to its user community that usually comprises: Higher education institutions and possibly other levels of

More information

Mobile Cloud Computing T-110.5121 Open Source IaaS

Mobile Cloud Computing T-110.5121 Open Source IaaS Mobile Cloud Computing T-110.5121 Open Source IaaS Tommi Mäkelä, Otaniemi Evolution Mainframe Centralized computation and storage, thin clients Dedicated hardware, software, experienced staff High capital

More information

Cloud Computing For Distributed University Campus: A Prototype Suggestion

Cloud Computing For Distributed University Campus: A Prototype Suggestion Cloud Computing For Distributed University Campus: A Prototype Suggestion Mehmet Fatih Erkoç, Serhat Bahadir Kert mferkoc@yildiz.edu.tr, sbkert@yildiz.edu.tr Yildiz Technical University (Turkey) Abstract

More information

Technology Insight Series

Technology Insight Series Evaluating Storage Technologies for Virtual Server Environments Russ Fellows June, 2010 Technology Insight Series Evaluator Group Copyright 2010 Evaluator Group, Inc. All rights reserved Executive Summary

More information

Optimizing Service Levels in Public Cloud Deployments

Optimizing Service Levels in Public Cloud Deployments WHITE PAPER OCTOBER 2014 Optimizing Service Levels in Public Cloud Deployments Keys to Effective Service Management 2 WHITE PAPER: OPTIMIZING SERVICE LEVELS IN PUBLIC CLOUD DEPLOYMENTS ca.com Table of

More information

Overview of Cloud Computing (ENCS 691K Chapter 1)

Overview of Cloud Computing (ENCS 691K Chapter 1) Overview of Cloud Computing (ENCS 691K Chapter 1) Roch Glitho, PhD Associate Professor and Canada Research Chair My URL - http://users.encs.concordia.ca/~glitho/ Overview of Cloud Computing Towards a definition

More information

Prof. Luiz Fernando Bittencourt MO809L. Tópicos em Sistemas Distribuídos 1 semestre, 2015

Prof. Luiz Fernando Bittencourt MO809L. Tópicos em Sistemas Distribuídos 1 semestre, 2015 MO809L Tópicos em Sistemas Distribuídos 1 semestre, 2015 Introduction to Cloud Computing IT Challenges 70% of the budget to keep IT running, 30% available to create new value that needs to be inverted

More information

Cloud, Community and Collaboration Airline benefits of using the Amadeus community cloud

Cloud, Community and Collaboration Airline benefits of using the Amadeus community cloud Cloud, Community and Collaboration Airline benefits of using the Amadeus community cloud Index Index... 2 Overview... 3 What is cloud computing?... 3 The benefit to businesses... 4 The downsides of public

More information

Cloud Computing and Amazon Web Services

Cloud Computing and Amazon Web Services Cloud Computing and Amazon Web Services Gary A. McGilvary edinburgh data.intensive research 1 OUTLINE 1. An Overview of Cloud Computing 2. Amazon Web Services 3. Amazon EC2 Tutorial 4. Conclusions 2 CLOUD

More information

A Study on Analysis and Implementation of a Cloud Computing Framework for Multimedia Convergence Services

A Study on Analysis and Implementation of a Cloud Computing Framework for Multimedia Convergence Services A Study on Analysis and Implementation of a Cloud Computing Framework for Multimedia Convergence Services Ronnie D. Caytiles and Byungjoo Park * Department of Multimedia Engineering, Hannam University

More information

Managing the Real Cost of On-Demand Enterprise Cloud Services with Chargeback Models

Managing the Real Cost of On-Demand Enterprise Cloud Services with Chargeback Models Managing the Real Cost of On-Demand Enterprise Cloud Services with Chargeback Models A Guide to Cloud Computing Costs, Server Costs, Pricing Plans, and Chargeback Implementation and Systems Introduction

More information

Cloud Computing. Course: Designing and Implementing Service Oriented Business Processes

Cloud Computing. Course: Designing and Implementing Service Oriented Business Processes Cloud Computing Supplementary slides Course: Designing and Implementing Service Oriented Business Processes 1 Introduction Cloud computing represents a new way, in some cases a more cost effective way,

More information

Data Centers and Cloud Computing. Data Centers

Data Centers and Cloud Computing. Data Centers Data Centers and Cloud Computing Slides courtesy of Tim Wood 1 Data Centers Large server and storage farms 1000s of servers Many TBs or PBs of data Used by Enterprises for server applications Internet

More information

DNA IT - Business IT On Demand

DNA IT - Business IT On Demand DNA IT - Business IT On Demand September 1 2011 DNA IT White Paper: Introduction to Cloud Computing The boom in cloud computing over the past few years has led to a situation that is common to many innovations

More information

Cloud Optimize Your IT

Cloud Optimize Your IT Cloud Optimize Your IT Windows Server 2012 The information contained in this presentation relates to a pre-release product which may be substantially modified before it is commercially released. This pre-release

More information

APPLICABILITY OF CLOUD COMPUTING IN ACADEMIA

APPLICABILITY OF CLOUD COMPUTING IN ACADEMIA Abstract APPLICABILITY OF CLOUD COMPUTING IN ACADEMIA Prof. Atul B Naik naik_ab@yahoo.com Prof. Amarendra Kumar Ajay akajay2001@gmail.com Prof. Swapna S Kolhatkar swapna.kolhatkar@gmail.com The Indian

More information