Agent-based Federated Hybrid Cloud
Prof. Yue-Shan Chang
Distributed & Mobile Computing Lab., Dept. of Computer Science & Information Engineering, National Taipei University
Material presented at:
"Agent-based Service Migration Framework in Hybrid Cloud," 2011 IEEE 13th International Conference on High Performance Computing and Communications (HPCC), 2-4 Sept. 2011, pp. 887-892.
"Execution Time Prediction Using Rough Set Theory in Hybrid Cloud," the 9th IEEE International Conference on Ubiquitous Intelligence and Computing (UIC 2012), September 4-7, 2012, Fukuoka, Japan.
"RST-based Dynamic Resource Allocation in Cloud Environment," the 5th IET International Conference on Ubi-Media Computing (U-Media 2012), Xining, China, August 16-18, 2012.
Introduction: Cloud computing is the evolution and convergence of computing trends. Types: the private cloud, in which each enterprise's IT platform has its own network, servers, and storage hardware (data centers); and the public cloud, in which users can obtain any service or resource from a service provider under a pay-per-use charging model.
Introduction: A hybrid cloud integrates private and public clouds: an organization provides and manages some resources in-house and has others provided externally. It can handle transient, high-volume bursts of requests efficiently and effectively.
Introduction: Why? Issues to consider: cost (construction, operation, maintenance, tax, etc.); security (data, network, etc.); flexibility and convenience (operation, maintenance, management, etc.); reliability and availability; performance.
Introduction: Main benefits of using a public cloud service: easy and inexpensive set-up, because hardware, application, and bandwidth costs are covered by the provider; scalability to meet needs; no wasted resources, because you pay for what you use.
Introduction: What kind of cloud do I need? Private? Public?
Introduction: A hybrid cloud is a cloud computing environment in which an organization provides and manages some resources in-house and has others provided externally. (Figure: a public cloud, with Amazon, HiCloud, and Google, and a private cloud.)
Introduction: Utilizing public cloud resources effectively is an important issue when adopting a hybrid cloud. What kinds of jobs need to be dispatched or migrated to the public cloud? When should a job be migrated? And how should it be migrated? Service migration is therefore becoming an increasingly important research topic.
Introduction: Hybrid cloud projects
ITRI Cloud Center
f5 Hybrid Cloud Architecture: http://www.f5.com/pdf/solution-center/vmware-vcloud-director.pdf
Fujitsu Hybrid Cloud: Mikio Funahashi and Shigeo Yoshikawa, "Fujitsu's Approach to Hybrid Cloud Systems," Fujitsu Sci. Tech. J., Vol. 47, No. 3, pp. 285-292, Jul. 2011.
IBM Hybrid Cloud: IBM Service Management Extensions for Hybrid Cloud, http://public.dhe.ibm.com/common/ssi/ecm/en/ibd03004usen/ibd03004usen.pdf
Introduction: ITRI Hybrid Cloud Architecture (figure showing the public cloud and the private cloud).
Introduction: f5 Hybrid Cloud Architecture. Source: http://www.f5.com/pdf/solution-center/vmware-vcloud-director.pdf
Introduction: Fujitsu's Approach to Hybrid Cloud Systems. Source: Mikio Funahashi and Shigeo Yoshikawa, "Fujitsu's Approach to Hybrid Cloud Systems," Fujitsu Sci. Tech. J., Vol. 47, No. 3, pp. 285-292, Jul. 2011.
Introduction: Agent and grid computing: Ian Foster argued that agent technology and grid computing need each other, because agent technology can enhance the grid's problem-solving ability. Agent and cloud computing: more and more research adopts agent technology to solve problems faced in the cloud.
Introduction: We propose an automatic, intelligent framework based on agent technology, with a federated layer that ties the private and public clouds together. Mobile agent techniques are exploited to manage all resources, monitor system behaviour, and negotiate all actions.
Introduction: Objective. For performance: support service migration (job migration) and load balancing. For cost: utilize the private cloud as much as possible; if the private cloud cannot complete a user's job before its deadline (a deadline-constrained job), dispatch the job to the public cloud and minimize the resources required by the VM.
Agent-based Federated Broker
Agent-based Federated Broker: five major components
System Monitoring Agent (SyMA): collects system information.
Reconfiguration Decision Agent (RDA): reconfigures and adjusts the cloud environment.
Service Migration Agent (SeMA): assigns each job a location in the cloud where it can be executed. If some clusters are overloaded, SeMA notifies some JAs to migrate to other clusters to balance the load.
Agent-based Federated Broker (components, continued)
Cluster Management Agent (CMA): schedules jobs locally in FCFS order, so that only one job executes on the cluster at a time; collects the cluster's status information and reports it to SeMA via heartbeats.
Job Agent (JA): encapsulates a job, so the job can be migrated along with the JA; executes and monitors the job on the cluster; reports the job status to the CMA periodically; brings the results back to the private cloud.
Agent-based Federated Broker: job migration scenario
Agent-based Federated Broker: job migration issue. (1) Pack a job into a Job Agent (JA). (2) Migrate the JA to the destination. (3) Unpack the JA.
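The pack, migrate, and unpack steps can be illustrated with a minimal Python sketch. This is not the prototype's mobile-agent mechanism: the `JobAgent` class, its fields, and the `pack`/`unpack` helpers are hypothetical names introduced here, and JSON stands in for whatever serialization the agent platform actually uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class JobAgent:
    """Hypothetical stand-in for a Job Agent that carries its job with it."""
    job_id: str
    command: str
    input_data: str
    status: str = "queued"


def pack(agent: JobAgent) -> bytes:
    """Step 1: serialize the agent and its job into a transferable payload."""
    return json.dumps(asdict(agent)).encode("utf-8")


def unpack(payload: bytes) -> JobAgent:
    """Step 3: rebuild the agent at the destination from the payload."""
    return JobAgent(**json.loads(payload.decode("utf-8")))


# Step 2 (the transfer itself) is omitted; the payload is simply passed along.
original = JobAgent(job_id="job-7", command="compute_pi", input_data="1000000")
restored = unpack(pack(original))
```

The restored object is equal to the original, which is the property a migration mechanism needs so that the job can resume at the destination.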
Policy of Job Migration: Job Count (JC)
Condition: $JC_{PR} - JC_{PU} > T_C$, where $JC_{PR}$ and $JC_{PU}$ are the job counts of the private and public clouds and $T_C$ is the job count threshold.
When the condition holds, the SeMA picks the $(JC_{PU} + T_C + 1)$-th job from the job queue of the private cloud and triggers its migration.
For example, if $JC_{PR} = 10$, $JC_{PU} = 4$, and $T_C = 2$, then $10 - 4 = 6 > 2$, so the 7th job is migrated to the public cloud.
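The job count policy can be sketched in a few lines of Python. The function name is hypothetical, and the strict comparison is an assumption, since the slide's inequality sign did not survive; the strict form is the one that guarantees the selected queue position exists.

```python
from typing import Optional


def pick_job_by_count(jc_pr: int, jc_pu: int, t_c: int) -> Optional[int]:
    """Return the 1-based queue position of the private-cloud job to migrate.

    jc_pr: job count in the private cloud
    jc_pu: job count in the public cloud
    t_c:   job count threshold
    Returns None when the migration condition does not hold.
    """
    # Assumed strict inequality: JC_PR - JC_PU > T_C
    if jc_pr - jc_pu > t_c:
        return jc_pu + t_c + 1
    return None


print(pick_job_by_count(10, 4, 2))  # 7, matching the slide's example
```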
Policy of Job Migration: total Size of Job (SJ)
Condition: $\sum_{i=1}^{n} SJ_i - \sum_{k=1}^{m} SJ_k > T_S$, where the first sum runs over the jobs in the private cloud, the second over the jobs in the public cloud, and $T_S$ is the threshold of SJ.
When the condition holds, the SeMA picks the $\left(\sum_{k=1}^{m} SJ_k + T_S + 1\right)$-th job from the job queue of the private cloud and triggers its migration.
For example, suppose the total size of jobs in the public cloud is 10 Mbytes, $T_S$ is 2 Mbytes, and the sizes of the jobs in the private cloud are 3, 4, 3, 3, 2, 3, and 4 Mbytes respectively. The first four jobs total $3 + 4 + 3 + 3 = 13$ Mbytes, which exceeds $10 + 2 = 12$ Mbytes, so the 5th job (2 Mbytes) is migrated to the public cloud.
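Policy of Service Migration: Estimated Finish Time (EFT)
Condition: $\sum_{i=1}^{n} T_i - \sum_{k=1}^{m} T_k > T_T$, where the first sum runs over the jobs in the private cloud, the second over the jobs in the public cloud, and $T_T$ is the finish-time threshold.
When the condition holds, the SeMA picks the $\left(\sum_{k=1}^{m} T_k + T_T + 1\right)$-th job in the queue of the private cloud and triggers its migration.
For example, suppose the total finish time of jobs in the public cloud is 100 s, $T_T$ is 20 s, and the finish times of the jobs in the private cloud are 33, 24, 45, 43, 22, 37, and 24 seconds respectively. The first four jobs total $33 + 24 + 45 + 43 = 145$ s, which exceeds $100 + 20 = 120$ s, so the 5th job (22 s of finish time) is migrated to the public cloud.
Finish times are estimated using Rough Set Theory.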
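The size-based and finish-time-based policies share one shape, which the following Python sketch tries to capture. It is one reading that reproduces both worked examples, not necessarily the authors' exact rule: it walks the private queue, accumulates the measure (size or finish time), and selects the job right after the point where the running total first exceeds the public total plus the threshold. The function name is hypothetical.

```python
from typing import Optional, Sequence


def pick_job_by_cumulative(private: Sequence[float],
                           public_total: float,
                           threshold: float) -> Optional[int]:
    """Return the 1-based position of the private-cloud job to migrate.

    private:      per-job sizes (or estimated finish times) in queue order
    public_total: total size (or finish time) of jobs in the public cloud
    threshold:    T_S or T_T
    Returns None when the limit is never exceeded or no later job exists.
    """
    limit = public_total + threshold
    running = 0.0
    for position, value in enumerate(private, start=1):
        running += value
        if running > limit:
            following = position + 1
            return following if following <= len(private) else None
    return None


# Size example from the slides: public total 10 MB, T_S = 2 MB
print(pick_job_by_cumulative([3, 4, 3, 3, 2, 3, 4], 10, 2))         # 5
# Finish-time example from the slides: public total 100 s, T_T = 20 s
print(pick_job_by_cumulative([33, 24, 45, 43, 22, 37, 24], 100, 20))  # 5
```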
Prototyping and evaluation: agent platform for the hybrid cloud
Prototyping and evaluation: job migrated
Evaluation: service migration time (symbols shown: $T_M$, $T_E$, $T_T$, $T_D$)
Evaluation: comparison between execution with migration and without migration
Evaluation: comparison between the job count policy and the total size of job policy
Summary: An agent-based, automatic, intelligent job migration framework for the hybrid cloud is proposed. We built a prototype that integrates our private cloud with a public cloud and demonstrated the job migration mechanism on the Hadoop platform. The results show that the framework can be applied to a hybrid cloud and works well.
Execution Time Prediction Using Rough Set Theory in Hybrid Cloud
Chih-Tien Fan, Yue-Shan Chang, Wei-Jen Wang, Shyan-Ming Yuan
Introduction: Resource utilization is an important issue in cloud computing. Can the remaining resources in the private cloud serve an incoming task and complete it before its deadline? If not, the incoming task needs to be dispatched to the public cloud. How many resources must be reserved in the public cloud to serve the deadline-constrained task? To judge what the remaining resources can serve, predicting a task's execution time becomes an important issue in the hybrid cloud.
Introduction: We exploit Rough Set Theory (RST) to predict a job's execution time in the hybrid cloud environment. RST is a well-known prediction technique that uses historical data to predict the attribute value of an object. We propose an RST-based execution time prediction algorithm for scheduling jobs. The evaluation shows that RST predicts the execution time accurately as historical data accumulates.
RST-based Prediction: Rough Set Theory (RST) has proven to be a useful prediction technique based on historical data in a variety of applications, such as quantitative structure-activity relationships in chemistry and data mining. It provides an appropriate theory for identifying good similarity templates, whose primary objective is to identify the characteristics of applications that define similarity. Prediction has two phases: the inference rule deducing phase and the estimation phase.
RST-based Prediction: inference rule deducing phase. Steps (see [2] for the detailed methodology of RST):
1. Define all attributes, including condition attributes (CA) and decision attributes (DA).
2. Discretize the properties of historical records for diversified attributes.
3. Calculate the D-Reducts: use the discernibility matrix to list all properties, apply the discernibility function to formulate the relation of the properties, and then simplify the formulation using Boolean algebra.
4. Derive the inference rules of the DA.
RST-based Prediction: define all attributes (conditional attributes and decision attribute)
RST-based Prediction: discretize the properties of historical records
RST-based Prediction: calculate D-Reducts and D-Core. Generate the discernibility matrix.
RST-based Prediction: calculate D-Reducts and D-Core. Formulate the discernibility function $f_A(D)$. Both $\{a_1, a_3\}$ and $\{a_2, a_3\}$ are D-Reducts; $\{a_3\}$ is the D-Core.
RST-based Prediction: calculate D-Reducts and D-Core. Formulate the relation of the properties and simplify the formulation: $f_2(D) = a_1$, $f_3(D) = a_1 + a_3$, $f_4(D) = a_1 + a_3$, ...
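The reduct calculation can be sketched in Python as follows. The decision table below is hypothetical (the slide's own table did not survive), chosen so that the result has the same shape as the slide's: two D-Reducts sharing one core attribute. Instead of simplifying the discernibility function symbolically, the sketch searches by brute force for the minimal attribute sets that intersect every matrix entry, which yields the same reducts for small tables. All function names are illustrative.

```python
from itertools import combinations


def discernibility_entries(rows, cond_attrs, decision):
    """For each pair of records with different decisions, collect the set
    of condition attributes on which the two records differ."""
    entries = []
    for x, y in combinations(rows, 2):
        if x[decision] != y[decision]:
            diff = frozenset(a for a in cond_attrs if x[a] != y[a])
            if diff:
                entries.append(diff)
    return entries


def d_reducts(entries, cond_attrs):
    """Minimal attribute subsets that intersect every matrix entry."""
    found = []
    for size in range(1, len(cond_attrs) + 1):
        for subset in combinations(cond_attrs, size):
            chosen = set(subset)
            hits_all = all(chosen & e for e in entries)
            # Smaller reducts were found first, so a superset is not minimal.
            is_minimal = not any(set(r) <= chosen for r in found)
            if hits_all and is_minimal:
                found.append(subset)
    return found


def d_core(entries):
    """Attributes that appear alone in an entry belong to every reduct."""
    return {next(iter(e)) for e in entries if len(e) == 1}


# Hypothetical discretized decision table (not the slide's data).
table = [
    {"a1": 1, "a2": 1, "a3": 1, "d": 1},
    {"a1": 1, "a2": 1, "a3": 2, "d": 2},
    {"a1": 2, "a2": 2, "a3": 1, "d": 3},
]
attrs = ["a1", "a2", "a3"]
entries = discernibility_entries(table, attrs, "d")
print(d_reducts(entries, attrs))  # [('a1', 'a3'), ('a2', 'a3')]
print(d_core(entries))            # {'a3'}
```

The brute-force search is exponential in the number of condition attributes, so it only suits small attribute sets like the ones in this example.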
RST-based Prediction: deduce inference rules ("-" means don't care)
$a_1 = 2 \rightarrow d = 2$
$a_1 = 3 \rightarrow d = 1$
$a_3 = 4 \rightarrow d = 4$
$a_1 = 1$ and $a_3 = 2 \rightarrow d = 2$
RST-based Prediction: estimation phase
Apply a simple mathematical operation, such as the arithmetic average of the DA values of the matching records, to obtain the final value of the DA.
Estimated time = (job3 + job5 + job6) / 3 = (480 + 500 + 505) / 3 = 495

Element   Processor speed   Input size   Execution time
3         5                 2            480
5         5                 2            500
6         5                 2            505
New job   5                 2            ?
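The estimation phase reduces to averaging the execution times of the historical records that match the new job's condition attributes. A minimal Python sketch using the three records from the slide follows; the dictionary keys and the function name are illustrative choices, not names from the original system.

```python
from typing import List, Optional

# Historical records from the slide (attribute values are already discretized).
history = [
    {"element": 3, "speed": 5, "size": 2, "time": 480},
    {"element": 5, "speed": 5, "size": 2, "time": 500},
    {"element": 6, "speed": 5, "size": 2, "time": 505},
]


def estimate_time(records: List[dict], speed: int, size: int) -> Optional[float]:
    """Average the execution time of records with the same condition attributes."""
    matches = [r["time"] for r in records
               if r["speed"] == speed and r["size"] == size]
    if not matches:
        return None  # no comparable history yet
    return sum(matches) / len(matches)


print(estimate_time(history, speed=5, size=2))  # 495.0
```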
Prototyping and Evaluation: the system is prototyped on the agent platform JADE v4.0.
Prototyping and Evaluation: two kinds of jobs are submitted to the system: Compute π and Area Approximation.
Prototyping and Evaluation: the error rate. A positive value means over-prediction; a negative value means under-prediction. The error rate fluctuates during the first 25 jobs because there is too little historical data to base predictions on. The more historical data are stored, the more accurate the prediction becomes. (Figure: error rate versus job number for Compute π and Area Approximation.)
Prototyping and Evaluation: the absolute error rate shows how much room for improvement the prediction has: the higher the absolute error, the more improvement is needed. The values for the two kinds of jobs over 200 submissions are 0.2008 and 0.0615. The accuracy is very good if the first 25 predictions are excluded. (Figure: absolute error rate versus job number for Compute π and Area Approximation.)
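The slides do not give the formula behind the error rate, so the Python sketch below assumes the common relative-error definition, (predicted - actual) / actual, which is at least consistent with the stated sign convention (positive for over-prediction, negative for under-prediction). The function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def signed_error_rate(predicted: float, actual: float) -> float:
    """Assumed definition: relative error, positive when over-predicting."""
    return (predicted - actual) / actual


def mean_absolute_error_rate(predicted: Sequence[float],
                             actual: Sequence[float]) -> float:
    """Average of the absolute error rates over a series of jobs."""
    rates = [abs(signed_error_rate(p, a)) for p, a in zip(predicted, actual)]
    return sum(rates) / len(rates)


# Hypothetical predictions and measurements, in seconds.
predicted = [550.0, 450.0]
actual = [500.0, 500.0]
print(signed_error_rate(predicted[0], actual[0]))   # positive: over-prediction
print(signed_error_rate(predicted[1], actual[1]))   # negative: under-prediction
print(mean_absolute_error_rate(predicted, actual))
```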
Prototyping and Evaluation: the largest prediction latency, 642.91 ms with 190 jobs in the history, is acceptable. When there is no new record to update, the prediction takes less than 1 ms. Generating the decision rules takes much more time than predicting a value, so updating the decision rules periodically can be considered to reduce prediction time. (Figure: estimation time in milliseconds and number of jobs in history versus estimation number.)
Summary: We utilized RST to predict execution time in the hybrid cloud. The results show that the RST-based predictor predicts a job's execution time with an error rate under 0.1 when the number of historical jobs exceeds 50; when more records are available, the error rate can drop under 0.03. Latency is reasonable: a full prediction takes less than 1 second with 190 historical records. The system can help users schedule their jobs faster and more accurately.
RST-based Dynamic Resource Allocation in Cloud Environment
Yue-Shan Chang, Chih-Tien Fan, Wei-Jen Wang
Dept. of Comp. Sci. and Inf. Eng., National Taipei University
Introduction: Allocate resources more effectively and efficiently through dynamic resource allocation: minimize the allocated resources and maximize the throughput of the platform.
Introduction: We propose a deadline-aware resource allocation approach based on Rough Set Theory (RST) that reserves appropriate resources for incoming requests. It aims to find resources that are sufficient, but not wasteful, for a VM instance to complete a submitted job before its pre-defined deadline. We propose a resource prediction algorithm for reserving appropriate resources when initiating a VM instance to serve incoming applications. The evaluation shows that RST can be applied to predict the required resources accurately.
Dynamic Resource Allocation: problem model. A job is described as Job_Name(Algorithms, Data, Deadline). The Expected Execution Time ($T_{ex}$) of the job is obtained as Deadline - Now(). To find an appropriate VM to serve the request, the platform needs to calculate the estimated execution time of the job on each available VM under different resource allocations.
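A minimal Python sketch of this problem model follows: it computes the time budget as Deadline - Now() and then picks the smallest resource allocation whose predicted execution time fits the budget. The function names and the prediction table (resource units mapped to predicted seconds) are hypothetical; in the proposed approach the predictions would come from the RST-based estimator.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def expected_execution_time(deadline: datetime, now: datetime) -> float:
    """T_ex = Deadline - Now(), in seconds."""
    return (deadline - now).total_seconds()


def smallest_sufficient_allocation(predicted_times: Dict[int, float],
                                   t_ex: float) -> Optional[int]:
    """Pick the fewest resource units whose predicted time fits within T_ex.

    predicted_times maps a resource amount (e.g. CPU cores) to the predicted
    execution time in seconds for the job under that allocation.
    """
    feasible = [units for units, seconds in predicted_times.items()
                if seconds < t_ex]
    return min(feasible) if feasible else None


# Hypothetical job with a 300-second budget and three candidate allocations.
now = datetime(2012, 8, 16, 12, 0, 0)
deadline = now + timedelta(seconds=300)
t_ex = expected_execution_time(deadline, now)
print(smallest_sufficient_allocation({1: 500.0, 2: 280.0, 4: 150.0}, t_ex))  # 2
```

Choosing the minimum feasible allocation reflects the stated goal of reserving enough, but not wasted, resources.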
Dynamic Resource Allocation: definitions. Remaining Execution Time on $VM_i$; Estimated Execution Time of the job on all N VMs; $T_{ex}$: Expected Execution Time of the job. Assume that N VMs (the VM list) have already been initiated to serve jobs.
Dynamic Resource Allocation
Dynamic Resource Allocation: For example, suppose 5 VMs are serving submitted requests in the cloud environment. When an incoming job is submitted, we can estimate its execution time on each VM.
Dynamic Resource Allocation: The expected completion time can then be obtained. If $T_{ex}^{i}$ < Expected Deadline, $VM_i$ can meet the deadline and serve the request. In the example, Candidate VM = {$VM_1$, $VM_2$, $VM_3$}.
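The candidate selection can be sketched in Python under one explicit assumption: the expected completion time on a VM is taken to be its remaining execution time plus the job's estimated execution time on that VM. The slide with the actual formula did not survive, so this sum is an assumption based on the two quantities named in the definitions. The numbers are hypothetical and were chosen so that three of five VMs qualify, as in the slide's example.

```python
from typing import List, Sequence


def candidate_vms(remaining: Sequence[float],
                  estimated: Sequence[float],
                  expected_deadline: float) -> List[str]:
    """Return the VMs whose expected completion time beats the deadline.

    remaining[i]: time VM i+1 still needs for its current work
    estimated[i]: estimated execution time of the new job on VM i+1
    Assumption: completion time = remaining + estimated.
    """
    candidates = []
    for i, (rem, est) in enumerate(zip(remaining, estimated), start=1):
        if rem + est < expected_deadline:
            candidates.append(f"VM{i}")
    return candidates


# Hypothetical values for 5 VMs, in seconds.
remaining = [10, 20, 5, 60, 80]
estimated = [30, 25, 50, 40, 30]
print(candidate_vms(remaining, estimated, expected_deadline=60))
# ['VM1', 'VM2', 'VM3']
```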
Dynamic Resource Allocation: finding the candidate VMs (CVM)
Dynamic Resource Allocation: RST-based resource estimation
Dynamic Resource Allocation: deadline-aware resource allocation algorithm
Prototyping and evaluation: the system is prototyped on the popular agent platform JADE v3.5.1, based on our previous work.
Prototyping and evaluation: error rate for execution time prediction. Initially, the cloud platform contains little historical data to assist the estimation; the estimation accuracy increases as the amount of historical data grows.
Prototyping and evaluation: error rate for resource prediction. The result is similar to that of the previous experiment: the estimation accuracy increases as the amount of historical data grows.
Prototyping and evaluation: RST-based estimation overhead. The overhead is less than 50 ms for record sizes up to 350, which is acceptable for histories of up to hundreds of records.
Summary: We propose a resource estimation algorithm for reserving appropriate resources when initiating a VM instance to serve incoming applications. We also conducted three experiments to evaluate its effectiveness and efficiency. The results show that RST can be used to estimate the required resources accurately.