CS 6343: CLOUD COMPUTING Term Project

Group A1 Project: IaaS cloud middleware
TA: Shuai Zhang

Create a cloud environment with a number of servers, allowing users to submit their jobs and scale them. Make simple resource-management decisions about where to place a VM and when to migrate it.

Basic cloud platform
- One cluster
- Reasonable interface for job submission (command line or GUI)
- Allow users to create and submit VMs
- Proper management of each user's VMs (treat them as files)
- One VM per job; start it on a host chosen by a basic VM placement algorithm
- Migrate a VM if necessary, based on a simple load-balancing algorithm
- Scale a VM up/down if necessary, based on a simple prediction algorithm

Management component
- Basic monitoring of the VMs and the hosts: CPU, memory, disk I/O, network I/O, etc.
- Simple interface for the midterm

Improve the midterm outcome
- Better interface, better algorithms, more robustness, etc.

Cloud platform
- Support multiple clusters (2 clusters; use the router on top of the switches)
- Support the submission of a job with multiple VMs
  - Interface for specifying the VMs under one job
  - The VMs of one job are likely to communicate with each other; how should their IP address assignments be handled?
  - How to prevent the VMs of other jobs from communicating with the VMs of this job?
  - Placement and load-balancing algorithms should consider placing the VMs close to each other

Management component
- Allow the admin to specify what is to be observed on one or more panels
- Add and remove hosts and VMs
- Migrate and scale VMs

Group A2 Project: Cloud benchmark workload
TA: Shuai Zhang

Create and install a suite of cloud benchmark programs as input jobs to the cloud middleware.

Basic benchmark systems
- Benchmark systems properly installed and VMs created
  - Not all CloudSuite components are required
- For each benchmark system, parameter settings are reasonably controlled to meet the desired workload specifications
- Reasonable interface/language for workload specification (configuration file or GUI)
- Try to tune the benchmarks for different load-factor combinations: CPU, memory, disk I/O, network I/O, etc.
  - If it is not possible to tune the load factors independently, then provide the correlation equations among the different attributes of the workload
- Able to start the workload in A2's own environment and to submit the workload to A1's platform

System-wide benchmarking
- Allow the specification of a set of benchmark programs, and allow adding new ones
- For each one, support the specification of the benchmark programs (workload and parameter-setting correlations)
- Report CloudSuite's capabilities and decide what to do with CloudSuite

Refine the basic benchmark system
- Develop a better understanding of the relation between the workload and the parameter settings of the benchmark systems
- Even if a system does not expose external control over certain workload factors, try to tune it internally to achieve workload variations

Refine the system-wide benchmarking
- The user can specify the desired workload and the desired benchmark programs, and the system mixes the benchmark programs to satisfy the workload
- Support continuous workload specification
- Add other benchmarks from CloudSuite to the system
- Submit the workload to A1 to explore their cloud management system

Group B Project: Cloud file systems

Install a few well-known cloud file systems, explore their features, and compare their performance.

Automated file system setup
- Fully install HDFS, Swift, and Ceph on multiple hosts and VMs
- Build an environment to support the startup of each file system
  - Create VMs for all components of the file system
  - Provide scripts to start the file systems by activating the VMs
- Provide an interface (configuration file and GUI) to support the specification of file system configurations
  - Need to define the set of parameters for configuration

Identify a feature vector (what features should be considered if a user needs to select a file system to use?)
- Lookup time: access latency, access throughput, directory-service latency, etc.
- Load-balancing features and performance
- Directory access capabilities and performance (ls -l, cd, create, delete)
- Consistency model and solutions, availability solutions, etc.
- Other special features that are unique to a certain file system
- Add additional code to probe the systems to allow exploration of some attributes

Create the file system contents and generate the access requests to facilitate exploration of file system performance and behavior
- Use IOZone and create your own code for file system exploration
- Evaluate the file systems based on the feature vector

Final project additions

Create a file system feature specification standard
- Finalize the file system evaluation
- Define a specification format for describing the features of each file system according to the evaluation results

Create a simple federated file system service
- Start up multiple file systems (HDFS, Swift, Ceph) in the cluster
- Provide a user interface to allow a user to build a file system (FSC)
  - The interface supports the selection of the desired features (based on the feature attributes you selected earlier)
  - The service selects the proper file system (HDFS, Swift, or Ceph) for the user
  - Return a handle to the user to support further accesses to the correct file system
- Provide a file system selection algorithm
  - Match user-selected features with the features of the file systems

Group C Project: Directory structure maintenance

Compare different methods of implementing directory files, covering three solutions:
- Solution 1: Use a centralized server to store the entire directory
- Solution 2: Treat directory files as regular files, but possibly merge a subtree of directories into one file, with a fixed number of levels (the number of levels is configurable)
- Solution 3: The Ceph solution

Complete the basic directory maintenance systems
- Implement all three systems in memory, without replication, accepting a single request at a time
  - For Ceph, do not consider dynamic load partitioning, but develop the mechanism to decide which partitioning is best for the system
  - For HDFS, same as Ceph, except that there is no partitioning
  - For Solution 2, Yongtao provides the file system to host the directory files
- Support the create, delete, and ls commands

Implement the basic client
- Generate the basic directory system on the three maintenance systems
- Generate a mix of client requests for accessing the directories
- Submit the commands to the three directory management systems

Support replication
- Provide replication and master/slave updates for HDFS
- Ceph is the same, except that there are multiple partitions
- For Solution 2, the system already supports replication

Refine the basic directory maintenance system
- Handle multiple client requests at the same time, i.e., provide locking
- Provide additional commands if desirable
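Solution 2's subtree-merging rule might be sketched as follows. This is only an illustration, not part of the spec: the function name, the "subtree:" naming scheme, and the exact mapping rule are all assumptions. The idea is that with a configurable merge depth K, an entry at depth d is stored in the subtree file rooted at its ancestor at depth floor((d-1)/K)*K, so every K consecutive directory levels share one file.

```python
K = 2  # configurable number of directory levels merged into one subtree file

def subtree_file(path: str) -> str:
    """Return the merged subtree file that stores the entry at `path`.

    Hypothetical mapping: an entry at depth d belongs to the subtree rooted
    at its ancestor at depth ((d - 1) // K) * K.
    """
    parts = [p for p in path.split("/") if p]
    root_len = (max(len(parts) - 1, 0) // K) * K  # depth of the subtree's root
    return "subtree:/" + "/".join(parts[:root_len])

# With K=2: /a and /a/b live in the root subtree file; /a/b/c and /a/b/c/d
# live in the subtree file rooted at /a/b.
print(subtree_file("/a/b/c/d"))
```

One consequence of such a rule is that create/delete/ls on nearby directories touch the same file, which is exactly what makes the per-file locking in the refinement phase interesting to measure.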
Improve the client
- Generate a mix of client requests with proper probabilities for selecting commands, selecting the directory names for creation and deletion, and selecting whether a command should fail or succeed
- Obtain performance results
- Consider special cases in client request generation to expose the different performance characteristics of the different systems
- In Ceph, still do not consider dynamic changes, but consider different configurations for performance testing
  - For each specific workload pattern and each specific directory structure, decide which partitioning is best, and test the performance of the system under that best partitioning setting

Group D Project: Load balancing in DHT-based file systems

Develop a load-balancing solution for Swift-like file systems.

Complete the DHT-based file system
- Implement the ring solution with successor-based and virtual-node-based data distribution
- Implement distributed table maintenance and distributed table updates
  - Let the update frequency F be a configurable parameter: F=0 means immediate update; F=x means every x milliseconds

Implement an encapsulated file-transfer component for load balancing
- Implement an API to be used by any load-balancing scheme
  - Provide a standardized interface; it should be agreed upon
  - Specify the set of files to be transferred, plus the source and the destination of the transfer
- Copy the files from source to destination, one file at a time
  - A file should be locked during copying to avoid changes while it is copied; locking should be done one file at a time (Yongtao's implementation?) (This does not have to be done before the midterm report)
  - When the copying of one file is done, subsequent updates should be applied to the copy even though the copy does not appear in the directory yet (this step requires an API from the lookup component)
  - After all files are copied, update the lookup table(s) (this step requires an API from the lookup component)

Implement a simple load-balancing scheme
- Allow the admin to initiate load balancing
- Copy one file from source to destination
- Change the file content at the source to link to the new location
- Modify the client program to recognize that the file has been moved, determine where the file now resides, and submit the new request

File system creation and access request generation
- Use IOZone or other tools to create a file system with a desired file system load
- Use IOZone or your own program to create file system accesses, including create file, delete file, read file, and write file
- Apply the requests to the file system you build

Complete the load-balancing scheme
- Collect load information from the local node
- Design an algorithm to achieve distributed load balancing
- Move a set of files by calling the file-transfer API
- Change the file contents to link to their new locations
- Modify the client program to recognize that a file has been moved, determine where the file now resides, and submit the new request
- Implement a client cache of the routing table and its changes

Standardize the implementation to facilitate performance comparison
- Compare performance with group E, and with Yongtao's file system if available

Group E Project: Ceph file system

Develop a simple version of the Ceph file system, focusing on its naming service solution.

Complete the single monitor for OSD cluster map maintenance
- Implement the solution in memory

Implement an encapsulated file-transfer component for load balancing
- Implement an API to be used by any load-balancing scheme
  - Provide a standardized interface; it should be agreed upon
  - Specify the set of files to be transferred, plus the source and the destination of the transfer
- Copy the files from source to destination, one file at a time
  - A file should be locked during copying to avoid changes while it is copied; locking should be done one file at a time (Yongtao's implementation?) (This does not have to be done before the midterm report)
  - When the copying of one file is done, subsequent updates should be applied to the copy even though the copy does not appear in the directory yet (this step requires an API from the lookup component)
  - After all files are copied, update the lookup table(s) (this step requires an API from the lookup component)
- Group D will be mainly responsible for this implementation

Implement a simple load-balancing scheme
- Allow the admin to initiate load balancing
- Copy one file from source to destination
- Call the monitor to update the OSD cluster map

File system creation and access request generation
- Use IOZone or other tools to create a file system with a desired file system load
- Use IOZone or your own program to create file system accesses, including create file, delete file, read file, and write file
- Apply the requests to the file system you build
- Group E will be mainly responsible for this implementation

Replicate the monitors and use a primary/backup update scheme
- Still only need the in-memory map

Complete the load-balancing scheme
- Collect load information from the local node
- Design an algorithm to achieve load balancing by the monitor
  - The monitor calls the file-transfer API to move a set of files
  - Update the OSD cluster map using primary/backup updates

Implement a client cache of the OSD cluster map
- Cache the entire map with a monotonically increasing version number
- Refresh the cache when a version-number mismatch is discovered
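The client-side cache of the OSD cluster map might be sketched as below, assuming a toy in-memory monitor. The class names, fields, and the fetch/lookup calls are illustrative, not part of the spec; the point is only the versioning discipline: every map change bumps a monotonically increasing version, and the client refetches the whole map when it sees a version it does not hold.

```python
# Toy in-memory monitor and client cache of the OSD cluster map (all names
# are illustrative assumptions, not the project's actual API).

class Monitor:
    def __init__(self):
        self.version = 1                                    # monotonically increasing
        self.osd_map = {"osd0": "10.0.0.1", "osd1": "10.0.0.2"}

    def update_map(self, osd, addr):
        self.osd_map[osd] = addr
        self.version += 1                                   # every change bumps the version

    def fetch(self):
        return self.version, dict(self.osd_map)

class Client:
    def __init__(self, monitor):
        self.monitor = monitor
        self.version, self.osd_map = monitor.fetch()        # cache the entire map

    def lookup(self, osd, reported_version):
        # Replies carry the server-side map version; on a mismatch,
        # refresh the cached map before resolving the lookup.
        if reported_version != self.version:
            self.version, self.osd_map = self.monitor.fetch()
        return self.osd_map.get(osd)

mon = Monitor()
cli = Client(mon)
mon.update_map("osd1", "10.0.0.9")        # the map changes behind the client's back
addr = cli.lookup("osd1", mon.version)    # version mismatch -> cache refreshed
```

Caching the whole map keeps the mismatch check trivial (one integer compare per request), at the cost of refetching the entire map on any change.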
Standardize the implementation to facilitate performance comparison
- Compare performance with group D, and with Yongtao's system
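The standardized file-transfer API that groups D and E share might be sketched as follows. The class name, constructor arguments, and callback shape are assumptions to be agreed upon by the groups; the sketch only shows the behavior both specs require: one-file-at-a-time copying, a per-file lock held only during that file's copy, and a single lookup-component update after all files are copied.

```python
import os
import shutil
import tempfile
import threading

class FileTransfer:
    """Encapsulated file-transfer component (illustrative API, not a fixed spec)."""

    def __init__(self, on_all_copied):
        self.locks = {}                      # file name -> threading.Lock
        self.on_all_copied = on_all_copied   # lookup-component callback

    def transfer(self, files, src_dir, dst_dir):
        """Copy the given set of files from src_dir to dst_dir, one at a time."""
        for name in files:
            lock = self.locks.setdefault(name, threading.Lock())
            with lock:  # lock only this file, only while it is being copied
                shutil.copy(os.path.join(src_dir, name),
                            os.path.join(dst_dir, name))
        # After all files are copied, the lookup component updates its table(s).
        self.on_all_copied(files, src_dir, dst_dir)

# Minimal demonstration with a stub lookup-update callback.
src, dst = tempfile.mkdtemp(), tempfile.mkdtemp()
with open(os.path.join(src, "f1"), "w") as f:
    f.write("data")

updated = []
ft = FileTransfer(lambda files, s, d: updated.extend(files))
ft.transfer(["f1"], src, dst)
```

Note that the mid-copy redirection of updates to the not-yet-visible copy is deliberately omitted here, since the spec marks it as depending on an API from the lookup component.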