CS 6343: CLOUD COMPUTING Term Project



Group A1 Project: IaaS cloud middleware
TA: Shuai Zhang

Create a cloud environment with a number of servers, allowing users to submit their jobs and scale them. Develop simple resource management solutions for determining where to place a VM and when to migrate it.

Basic cloud platform
- One cluster
- Reasonable interface for job submission (command line or GUI)
- Allow users to create and submit VMs
- Proper management of each user's VMs (treat them as files)
- One VM per job; start it on a host chosen by a basic VM placement algorithm (a minimal placement sketch follows this section)
- Migrate a VM if necessary, based on a simple load-balancing algorithm
- Scale a VM up or down if necessary, based on a simple prediction algorithm
- Management component
  - Basic monitoring of the VMs and the hosts: CPU, memory, disk I/O, network I/O, etc.
  - Simple interface for the midterm

Improve the midterm outcome
- Better interface, better algorithms, more robustness, etc.
- Cloud platform
  - Support multiple clusters (2 clusters; use the router on top of the switches)
  - Support the submission of a job with multiple VMs
    - Interface for specifying the VMs under one job
    - The VMs of one job are likely to communicate with each other; how should their IP address assignments be handled?
    - How should the VMs of other jobs be prevented from communicating with the VMs of this job?
  - Placement and load-balancing algorithms should consider placing a job's VMs close to each other
- Management component
  - Allow the admin to specify what is to be observed on one or more panels
  - Add and remove hosts and VMs
  - Migrate and scale VMs
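
A minimal sketch of the kind of basic placement and load-balancing policy described above. The CPU-only load metric, the host objects, and the 0.8/0.5 thresholds are illustrative assumptions, not part of the assignment.

```python
# Hypothetical sketch: greedy VM placement plus a threshold-based migration check.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    cpu_capacity: float                       # e.g., number of cores (assumed metric)
    vms: dict = field(default_factory=dict)   # vm_id -> CPU demand

    @property
    def load(self) -> float:
        return sum(self.vms.values()) / self.cpu_capacity

def place_vm(hosts, vm_id, cpu_demand):
    """Basic placement: pick the least-loaded host that can still fit the VM."""
    candidates = [h for h in hosts
                  if sum(h.vms.values()) + cpu_demand <= h.cpu_capacity]
    if not candidates:
        raise RuntimeError("no host has enough free capacity")
    target = min(candidates, key=lambda h: h.load)
    target.vms[vm_id] = cpu_demand
    return target.name

def migration_plan(hosts, high=0.8, low=0.5):
    """Simple load balancing: propose moving the smallest VM off each overloaded host."""
    moves = []
    for src in (h for h in hosts if h.load > high and h.vms):
        dst = min((h for h in hosts if h is not src and h.load < low),
                  key=lambda h: h.load, default=None)
        if dst is None:
            continue
        vm_id, _ = min(src.vms.items(), key=lambda kv: kv[1])
        moves.append((vm_id, src.name, dst.name))
    return moves

hosts = [Host("host1", 8), Host("host2", 8)]
place_vm(hosts, "vm-a", 4.0)
place_vm(hosts, "vm-b", 2.0)
print(migration_plan(hosts))
```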

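In the same spirit, a sketch of the "simple prediction algorithm" for scaling a VM up or down in the Group A1 list above, assuming a moving-average predictor over recent CPU samples; the window size and thresholds are made up for illustration.

```python
# Hypothetical scaling decision: predict the next CPU utilization as a moving
# average of recent samples and compare it against two assumed thresholds.
from collections import deque

class ScalePredictor:
    def __init__(self, window=5, up_at=0.85, down_at=0.30):
        self.samples = deque(maxlen=window)   # recent CPU utilization samples (0.0 - 1.0)
        self.up_at = up_at
        self.down_at = down_at

    def observe(self, cpu_util: float) -> None:
        self.samples.append(cpu_util)

    def decision(self) -> str:
        """Return 'scale-up', 'scale-down', or 'keep' based on the predicted load."""
        if not self.samples:
            return "keep"
        predicted = sum(self.samples) / len(self.samples)
        if predicted > self.up_at:
            return "scale-up"
        if predicted < self.down_at:
            return "scale-down"
        return "keep"

p = ScalePredictor()
for util in [0.9, 0.92, 0.88, 0.95, 0.91]:
    p.observe(util)
print(p.decision())   # scale-up
```
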
Group A2 Project: Cloud benchmark workload
TA: Shuai Zhang

Create and install a suite of cloud benchmark programs as input jobs to the cloud middleware.

Basic benchmark systems
- Benchmark systems properly installed and VMs created (not all CloudSuite components)
- For each benchmark system, parameter settings are reasonably controlled to produce the desired workload specifications
- Reasonable interface/language for workload specification (configuration file or GUI); one possible shape is sketched after this section
- Try to tune the benchmarks for different load-factor combinations: CPU, memory, disk I/O, network I/O, etc.
  - If it is not possible to tune the load factors independently, provide the correlation equation among the different attributes of the workload
- Be able to start the workload in A2's own environment and to submit the workload to A1's platform

System-wide benchmarking
- Allow the specification of a set of benchmark programs, and allow new ones to be added
- For each one, support the specification of the benchmark programs (workload and parameter-setting correlations)
- Report CloudSuite capabilities and decide what to do for CloudSuite

Refine the basic benchmark system
- Develop a better understanding of the relation between the workload and the parameter settings of the benchmark systems
- Even if a system does not expose certain workload controls externally, try to tune it internally to achieve workload variations

Refine the system-wide benchmarking
- The user can specify the desired workload and the desired benchmark programs, and the system mixes the benchmark programs to satisfy the workload
- Support continuous workload specification
- Add other benchmarks from CloudSuite to the system
- Submit the workload to A1 to explore their cloud management system
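
One possible shape for a workload specification, as a minimal sketch. The field names, load-factor keys, and benchmark names below are assumptions; the group is free to define its own format.

```python
# Hypothetical workload specification: target load factors plus the benchmark
# programs allowed to produce them. All field names are illustrative.
import json

workload_spec = {
    "duration_sec": 600,
    "load_factors": {            # desired utilization targets, each in [0.0, 1.0]
        "cpu": 0.7,
        "memory": 0.5,
        "disk_io": 0.3,
        "network_io": 0.2,
    },
    "benchmarks": ["data-serving", "web-search"],   # placeholder benchmark names
}

def validate(spec):
    """Reject specifications whose load factors fall outside [0, 1]."""
    for name, value in spec["load_factors"].items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"load factor {name}={value} is out of range")
    return spec

print(json.dumps(validate(workload_spec), indent=2))
```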

Group B Project: Cloud file systems
Install a few well-known cloud file systems, explore their features, and compare their performance.

Automated file system setup
- Fully install HDFS, Swift, and Ceph on multiple hosts and VMs
- Build an environment to support the startup of each file system
  - Create VMs for all components of the file system
  - Provide scripts to start the file systems by activating the VMs
- Provide an interface (configuration file and GUI) to support the specification of file system configurations
  - Need to define the set of configuration parameters

Identify a feature vector (the features a user should consider when selecting a file system)
- Lookup time
- Access latency, access throughput, directory service latency, etc.
- Load-balancing features and performance
- Directory access capabilities and performance (ls -l, cd, create, delete)
- Consistency model and solutions, availability solutions, etc.
- Other special features that are unique to a particular file system

- Add additional code that probes the systems to allow exploration of some attributes
- Create the file system contents and generate the access requests needed to explore file system performance and behavior
  - Use IOZone and create your own code for file system exploration
- Evaluate the file systems based on the feature vector

Final project additions
- Create a file system feature specification standard
  - Finalize the file system evaluation
  - Define a specification format for describing the features of each file system according to the evaluation results
- Create a simple federated file system service
  - Start up multiple file systems (HDFS, Swift, Ceph) in the cluster
  - Provide a user interface that allows a user to build a file system (FSC)
    - The interface supports selection of the desired features (based on the feature attributes selected earlier)
    - The service selects the proper file system (HDFS, Swift, or Ceph) for the user
    - Return a handle to the user to support further accesses to the correct file system
  - Provide a file system selection algorithm (a sketch of the matching step follows this section)
    - Match the user-selected features against the features of the file systems
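
A minimal sketch of the matching step of the federated service: score each file system's feature vector against the features the user asked for. The feature names, scores, and weights below are placeholders; real values would come from the evaluation phase.

```python
# Hypothetical feature-vector matching for the federated file system service.
FEATURES = ["lookup_time", "read_throughput", "directory_ops", "consistency", "availability"]

# Per-file-system scores on a 0-1 scale (higher is better); illustrative only.
FS_PROFILES = {
    "HDFS":  {"lookup_time": 0.6, "read_throughput": 0.9, "directory_ops": 0.5,
              "consistency": 0.8, "availability": 0.6},
    "Swift": {"lookup_time": 0.8, "read_throughput": 0.6, "directory_ops": 0.4,
              "consistency": 0.5, "availability": 0.9},
    "Ceph":  {"lookup_time": 0.7, "read_throughput": 0.8, "directory_ops": 0.8,
              "consistency": 0.7, "availability": 0.8},
}

def select_file_system(user_weights):
    """Return the file system whose weighted feature score is highest."""
    def score(profile):
        return sum(user_weights.get(f, 0.0) * profile[f] for f in FEATURES)
    return max(FS_PROFILES, key=lambda fs: score(FS_PROFILES[fs]))

# Example: a user who cares most about directory operations and consistency.
print(select_file_system({"directory_ops": 0.5, "consistency": 0.3, "lookup_time": 0.2}))
```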

Group C Project: Directory structure maintenance
Compare different methods of implementing directory files, covering three solutions:
- Solution 1: Use a centralized server to store the entire directory
- Solution 2: Treat directory files as regular files, but possibly merge a subtree of directories into one file with a fixed number of levels (the number of levels is configurable; a mapping sketch follows this section)
- Solution 3: The Ceph solution

Complete the basic directory maintenance systems
- Implement all three systems in memory, without replication, accepting a single request at a time
  - For Ceph, do not consider dynamic load partitioning, but develop the mechanism for deciding which partitioning is best for the system
  - For HDFS, the same as for Ceph, except that there is no partitioning
  - For Solution 2, Yongtao provides the file system that hosts the directory files
  - Support the create, delete, and ls commands
- Implement the basic client
  - Generate the basic directory system on the three maintenance systems
  - Generate a mix of client requests for accessing the directories
  - Submit the commands to the three directory management systems
- Support replication
  - Provide replication and master/slave updates for HDFS
  - Ceph is the same, except that there are multiple partitions
  - For Solution 2, the system already supports replication

Refine the basic directory maintenance system
- Handle multiple client requests at the same time, i.e., provide locking
- Provide additional commands if desirable
- Improve the client
  - Generate a mix of client requests with proper probabilities for selecting commands, for selecting the directory names used in creation and deletion, and for deciding whether a command should fail or succeed
- Obtain performance results
  - Consider special cases in client request generation to expose different performance characteristics in the different systems
  - In Ceph, still do not consider dynamic changes, but consider different configurations for performance testing
  - For each specific workload pattern and each specific directory structure, decide which partitioning is best, and test the performance of the system with that partitioning
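
A sketch of one way Solution 2 could map a directory path to the merged directory file that stores it, assuming subtrees of a fixed number of levels are merged. The naming scheme and the LEVELS value are assumptions used only to illustrate the idea.

```python
# Hypothetical mapping for Solution 2: every LEVELS directory levels are merged
# into a single directory file rooted at the top of that subtree.
LEVELS = 2   # configurable number of directory levels merged into one file

def directory_file_for(path: str) -> str:
    """Return the root of the merged directory file that stores `path`'s entry.

    With LEVELS = 2, the entries for /a and /a/b live in the root file '/',
    while the entries for /a/b/c and /a/b/c/d live in the file rooted at /a/b.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        return "/"                                   # the root directory itself
    root_depth = ((len(parts) - 1) // LEVELS) * LEVELS
    if root_depth == 0:
        return "/"
    return "/" + "/".join(parts[:root_depth])

for p in ["/a", "/a/b", "/a/b/c", "/a/b/c/d/e"]:
    print(p, "->", directory_file_for(p))
```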

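For the Group C client above, a sketch of how a mix of requests could be generated with configurable probabilities for the command, the target directory name, and whether the request is meant to succeed or fail. The probability mix, failure ratio, and name pool are illustrative assumptions.

```python
# Hypothetical request generator for the directory-maintenance client.
import random

COMMAND_MIX = {"create": 0.4, "delete": 0.2, "ls": 0.4}   # must sum to 1.0
FAIL_RATIO = 0.1                                          # fraction of requests meant to fail

def generate_requests(existing_dirs, count=100, seed=1):
    rng = random.Random(seed)
    requests = []
    for i in range(count):
        cmd = rng.choices(list(COMMAND_MIX), weights=list(COMMAND_MIX.values()))[0]
        should_fail = rng.random() < FAIL_RATIO
        if cmd == "create":
            # creating an existing directory fails; creating a new one succeeds
            target = rng.choice(existing_dirs) if should_fail else f"/dir{i}"
        else:
            # delete/ls on a missing directory fails; on an existing one succeeds
            target = f"/missing{i}" if should_fail else rng.choice(existing_dirs)
        requests.append((cmd, target, should_fail))
    return requests

print(generate_requests(["/a", "/a/b", "/c"], count=5))
```
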
Group D Project: Load balancing in DHT-based file systems
Develop a load balancing solution for Swift-like file systems.

Complete the DHT-based file system
- Implement the ring solution with successor-based and virtual-node-based data distribution (a minimal ring sketch follows this section)
- Implement distributed table maintenance and distributed table updates
  - Let the update frequency F be a configurable parameter: F = 0 means immediate update, F = x means an update every x milliseconds

Implement an encapsulated file transfer component for load balancing
- Implement an API that can be used by any load balancing scheme
  - Provide a standardized interface, to be agreed upon
  - Specify the set of files to be transferred and the source and destination of the transfer
- Copy the files from source to destination, one file at a time
  - Each file should be locked during copying to avoid changes while it is being copied; locking is done one file at a time (Yongtao's implementation?) (this does not have to be done before the midterm report)
  - When the copying of one file is done, subsequent updates should be applied to the copy, even though the copy does not yet appear in the directory (this step requires an API from the lookup component)
  - After all files are copied, update the lookup table(s) (this step also requires an API from the lookup component)

Implement a simple load balancing scheme
- Allow the admin to initiate load balancing
- Copy one file from source to destination
- Change the file content at the source to link to the new location
- Modify the client program to recognize that the file has been moved, determine where the file now is, and submit the new request

File system creation and access request generation
- Use IOZone or other tools to create a file system with a desired file system load
- Use IOZone or your own program to create file system accesses, including create file, delete file, read file, and write file
- Apply the requests to the file system you build

Complete the load balancing scheme
- Collect load information from the local node
- Design an algorithm to achieve distributed load balancing
- Move a set of files by calling the file transfer API
- Change the file contents to link to their new locations
- Modify the client program to recognize that a file has been moved, determine where the file now is, and submit the new request
- Implement a client cache of the routing table and of changes to it

Standardize the implementation to facilitate performance comparison
- Compare performance with Group E and with Yongtao's file system if available
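
A minimal sketch of a ring with virtual nodes and successor-based lookup, in the spirit of Swift's ring. The hash function, the number of virtual nodes, and the API are illustrative assumptions.

```python
# Hypothetical consistent-hash ring: each physical node owns several virtual
# nodes, and a file maps to the first virtual node clockwise of its hash.
import bisect
import hashlib

class Ring:
    def __init__(self, nodes, vnodes_per_node=64):
        self._ring = []                      # sorted list of (hash, node) pairs
        for node in nodes:
            for i in range(vnodes_per_node):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def lookup(self, filename: str) -> str:
        """Successor rule: return the node responsible for `filename`."""
        h = self._hash(filename)
        idx = bisect.bisect_right(self._keys, h) % len(self._ring)
        return self._ring[idx][1]

ring = Ring(["node1", "node2", "node3"])
print(ring.lookup("/data/report.txt"))
```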

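One possible shape for the standardized file transfer interface that Groups D and E are asked to agree on. The method names, argument types, and the lookup-side hooks below are assumptions, not the agreed interface.

```python
# Hypothetical declaration of the shared file-transfer API and the lookup-side
# hooks it depends on, per the project description.
from abc import ABC, abstractmethod
from typing import Iterable

class FileTransfer(ABC):
    @abstractmethod
    def transfer(self, files: Iterable[str], source: str, destination: str) -> None:
        """Copy `files` one at a time from `source` to `destination`.

        Expected behaviour: lock each file while it is being copied, redirect
        subsequent updates to the finished copy, and update the lookup table(s)
        only after all files have been copied.
        """

class LookupTable(ABC):
    @abstractmethod
    def redirect_updates(self, path: str, destination: str) -> None:
        """Apply future updates of `path` to the copy on `destination`."""

    @abstractmethod
    def commit(self, files: Iterable[str], destination: str) -> None:
        """Publish the new locations of `files` after the transfer completes."""
```
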
Group E Project: Ceph file system
Develop a simple version of the Ceph file system, focusing on its naming service solution.

Complete the single monitor for OSD cluster map maintenance
- Implement the solution in memory

Implement an encapsulated file transfer component for load balancing (Group D will be mainly responsible for this implementation)
- Implement an API that can be used by any load balancing scheme
  - Provide a standardized interface, to be agreed upon
  - Specify the set of files to be transferred and the source and destination of the transfer
- Copy the files from source to destination, one file at a time
  - Each file should be locked during copying to avoid changes while it is being copied; locking is done one file at a time (Yongtao's implementation?) (this does not have to be done before the midterm report)
  - When the copying of one file is done, subsequent updates should be applied to the copy, even though the copy does not yet appear in the directory (this step requires an API from the lookup component)
  - After all files are copied, update the lookup table(s) (this step also requires an API from the lookup component)

Implement a simple load balancing scheme
- Allow the admin to initiate load balancing
- Copy one file from source to destination
- Call the monitor to update the OSD cluster map

File system creation and access request generation (Group E will be mainly responsible for this implementation)
- Use IOZone or other tools to create a file system with a desired file system load
- Use IOZone or your own program to create file system accesses, including create file, delete file, read file, and write file
- Apply the requests to the file system you build

Replicate the monitors and use a primary/backup update scheme
- Still only need an in-memory map

Complete the load balancing scheme
- Collect load information from the local node
- Design an algorithm to achieve load balancing through the monitor
- The monitor calls the file transfer API to move a set of files
- Update the OSD cluster map using primary-backup updates

Implement a client cache of the OSD cluster map (a minimal cache sketch follows this section)
- Cache the entire map with a monotonically increasing version number
- Update the cached map when a version-number mismatch is discovered

Standardize the implementation to facilitate performance comparison
- Compare performance with Group D and with Yongtao's system if available
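
A minimal sketch of the client-side cache of the OSD cluster map: the whole map is cached with a monotonically increasing version number and refreshed when a reply advertises a newer version. The field names and the fetch callback are illustrative assumptions.

```python
# Hypothetical client cache of the OSD cluster map, keyed by a version number.
class ClusterMapCache:
    def __init__(self, fetch_map):
        self._fetch_map = fetch_map   # callable: () -> (version, mapping); placeholder RPC
        self.version = 0
        self.mapping = {}             # e.g., placement-group id -> list of OSDs

    def lookup(self, pg_id):
        return self.mapping.get(pg_id, [])

    def on_reply(self, reply_version: int) -> None:
        """Called for every reply; a higher version means the cached map is stale."""
        if reply_version > self.version:
            self.version, self.mapping = self._fetch_map()

# Usage: a monitor or OSD reply advertises its map version; the client refreshes
# only on a mismatch, then retries the request against the new map.
def fake_monitor_fetch():
    return 7, {"pg.1": ["osd2", "osd5"]}

cache = ClusterMapCache(fake_monitor_fetch)
cache.on_reply(7)
print(cache.version, cache.lookup("pg.1"))
```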