Agent-based Monitoring Approach for Hybrid Cloud #

LIU Yunchang, LI Chunlin, LIU Yanpei *
(Department of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063)

Abstract: Hybrid clouds integrate different cloud solutions. Their inherent complexity and the lack of standards call for a careful analysis, systematization and understanding of monitoring. In this context, this paper provides a deep insight into hybrid cloud monitoring. It proposes a layered monitoring model for hybrid clouds that identifies the multiple layers of monitoring, focusing on the physical infrastructure, virtual infrastructure, network and application/service layers, while combining the perspectives of service providers and clients. This process involves the identification of relevant parameters and metrics for each layer. Because agent technology is flexible and intelligent, an agent-based monitoring architecture is presented; it helps to eliminate the complexity among different cloud platforms. This study contributes to a clearer and more efficient approach to hybrid cloud services monitoring.

Keywords: Cloud Computing; Hybrid Cloud; Monitoring; Agent

0 Introduction

Cloud computing represents a contemporary computing-as-a-service paradigm. It efficiently provisions pooled resources on demand. These resources are shared among multiple users through broadband network access [1]. According to [1], the deployment forms of cloud infrastructure are private, community, public and hybrid cloud. A private cloud is operated by a single organization, whereas a community cloud is shared and jointly operated by several organizations. In contrast, a public cloud is operated by an independent cloud service provider and made available to the general public. A hybrid cloud is a combination of one private cloud and one or more public clouds, and is aimed at addressing workload bursting for load balancing between clouds.
Optimal utilization, data center consolidation, risk transfer and high availability are the main factors that drive industries to use hybrid cloud services. Due to these benefits, the adoption of hybrid cloud infrastructure services has become widespread. All hybrid cloud services must be metered and monitored for cost control, chargebacks and provisioning.

A hybrid cloud is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology. In practice, different cloud systems do not work well together. From the hypervisor level up to the application programming interfaces, currently available clouds differ fundamentally. This makes hybrid cloud monitoring more complex. Although many public cloud providers offer monitoring facilities for tracking the availability of their services and alerting capabilities for identifying service outages in a timely manner, all of them are proprietary solutions and do not aim at defining a standard for monitoring interoperability [2]. These capabilities are not sufficient for the owner of a private cloud, who needs full control over the performance of the cloud services in use. More importantly, the owner of a private cloud cannot rely on the monitoring capabilities offered by public cloud service providers. To ensure that service levels are met, the owner of a private cloud needs independent monitoring tools that not only measure the actual levels of performance experienced by business users, but also support root cause analysis of problems as they occur. As a consequence, obtaining a comprehensive monitoring solution for hybrid cloud still represents a challenging task, and it has not yet been properly addressed in the literature.

In this context, articulating the various issues that hybrid cloud monitoring raises, this paper proposes a layered monitoring model for hybrid cloud. Based on this model and on agent technology, we propose an agent-based hybrid cloud monitoring architecture. Agents take measurements inside the cloud resources, which are completely under the customer's administration, collect performance information and compute metrics according to the user's requirements. They implement provider-independent monitoring of the cloud infrastructure during the execution of applications.

The rest of the paper is organized as follows. In Section 1, the related work is introduced. Section 2 analyzes the needs of hybrid cloud monitoring. In Section 3, an agent-based monitoring architecture is presented. Finally, the paper is concluded in Section 4.

Foundations: Specialized Research Fund for the Doctoral Program of Higher Education under Grant (No.20120143110014)
Brief author introduction: LIU Yunchang (1975-), Male, PhD, Lecturer, Main research: Cloud Computing. E-mail: yunchang75@aliyun.com

1 Related work

Monitoring is a fundamental part of a cloud computing management system. In this section, we discuss the related work in the area of cloud monitoring.

Many works focus on the monitoring of federated clouds such as hybrid clouds. Fu, Yongquan et al. [2] focus on collecting detailed network metrics between participating nodes of hybrid clouds in order to optimize the quality of service provision in hybrid clouds. Since nodes can be large-scale and dynamic, the network metrics may be diverse for different cloud services. Their paper proposes a novel distributed level monitoring method, HPM (Hierarchical Performance Measurement), that meets these requirements.
The authors in [3] have designed a trusted monitoring framework that provides a chain of trust excluding the untrusted privileged domain. It deploys an independent guest domain for monitoring and uses trusted computing technology to ensure the integrity of the monitoring environment. Fine-grained and general monitoring is also provided.

An approach based on software agents is a natural way to tackle monitoring tasks in distributed environments. Agents move and distribute themselves to perform their monitoring tasks [4]. Based on multi-agent and matrix grammar, SAaaS [5], built upon intelligent autonomous agents, is aware of the underlying business-driven intercommunication of cloud services; it is flexible and supports cross-customer event monitoring within a cloud infrastructure.

Due to the complexity of monitoring cloud environments and the lack of standards for the new service models, there is an urgent demand for a careful analysis, systematization and understanding of the key points involved in assessing the services provided. In this context, Palhares, Nuno et al. [6] propose a layered model for cloud services monitoring that identifies the multiple dimensions of monitoring while combining the perspectives of service providers and customers. This process involves the identification of relevant parameters and metrics for each monitoring dimension, focusing on the monitoring of resources, quality of service, security and service contracts. Taking a stratified view of the problem, that study contributes to a clearer and more efficient approach to cloud services monitoring.

Katsaros, Gregory et al. [7] present a monitoring system that facilitates on-the-fly self-configuration of both the monitoring time intervals and the monitoring parameters.
A multi-layered monitoring framework is proposed that measures QoS at both the application and infrastructure levels, targeting trigger events for runtime adaptability of resource provisioning estimation and decision making.

GMonE [8] is a general-purpose cloud monitoring tool that covers all aspects of cloud monitoring by specifically addressing the needs of modern cloud infrastructures. It proposes a unified cloud monitoring taxonomy, based on which it defines a layered cloud monitoring architecture. Furthermore, the performance, scalability and overhead of GMonE are evaluated with the Yahoo Cloud Serving Benchmark (YCSB), using the OpenNebula cloud middleware on the Grid 5000 experimental test bed.

According to [9], a cloud can be modeled in seven layers: facility, network, hardware, OS, middleware, application, and the user. These layers can be controlled by either a Cloud Service Provider or a Cloud Service Consumer. In the context of cloud monitoring, these layers can be seen as the places where the probes of the monitoring system are put. The layer at which the probes are located has direct consequences on the phenomena that can be monitored and observed.

Aceto, Giuseppe et al. [10] provide a comprehensive survey on cloud monitoring. The survey first analyzes the motivations for cloud monitoring, providing definitions and background. It then analyzes and discusses the properties of a monitoring system for the cloud, the issues arising from such properties and how such issues have been tackled in the literature. Finally, it identifies open issues, main challenges and future directions of cloud monitoring. It points out that cross-domain monitoring (federated clouds, hybrid clouds, multi-tenancy services) is an open issue.

2 Monitoring Requirements for Hybrid Cloud

2.1 Hybrid cloud monitoring

Cloud monitoring can provide information about aspects of system performance, behavior, evolution, etc. The way this information is understood, analyzed and used depends not only on which layer of the system is being monitored (infrastructure, network or application) but also on who is obtaining this information and for what purpose. A typical hybrid cloud combines the use of a private cloud (for example an OpenNebula-based cloud) with a public cloud (for example an Amazon EC2-based cloud).
As cloud administrators, we need to monitor the entire infrastructure. Because we have full control of the private cloud, it is monitored from the cloud service provider's perspective. However, it is not possible to monitor the public cloud in the same way. We can only monitor it as clients, since we are limited to the monitoring tools offered by the public clouds.

2.2 Hybrid Cloud Monitoring Layer

Due to the variety of monitoring data, the design of the monitoring solution should follow a hierarchical approach. Based on work carried out by the Cloud Security Alliance (CSA) [9], we propose a novel layered monitoring model for hybrid cloud. The proposed model is stratified into four main layers. As shown in Figure 1, the main layers are: Physical Infrastructure, Virtual Infrastructure, Network and Service/Application.

The Physical Infrastructure layer covers the monitoring of physical resources in the private cloud computing environment. The Virtual Infrastructure layer covers the monitoring of virtual resources in the private cloud and the public cloud. Aspects related to the IP service, throughput, performance and reliability are covered at the Network layer. The Service/Application layer is focused on assessing the availability, efficiency, reliability and safety of a service.

2.3 Combining perspective and hybrid cloud monitoring layer

As mentioned in the previous sections, there are many aspects to consider when monitoring a hybrid cloud. In this model we have identified two basic types of cloud monitoring, depending on the specific cloud layer and perspective being considered: provider-side monitoring and client-side monitoring.

Fig.1. Layered Monitoring Model for Hybrid Cloud

2.3.1 provider-side monitoring

A cloud provider needs to monitor its physical resources to ensure the health, availability and Quality of Service (QoS) of its facilities. In addition, because of the dynamic nature of its infrastructure, it needs monitoring that enables sophisticated optimization of resource utilization.

1) Physical infrastructure layer

This layer addresses the physical infrastructure of the private cloud. All physical components, from processing and storage devices to network equipment, should be monitored. The metrics used for assuring the health and the performance of the infrastructure include CPU, memory and disk usage, the number of VMs running on each physical machine, incoming and outgoing traffic, etc.

Apart from the distinct components that compose the infrastructure, two further aspects should be monitored at this level, namely energy and security. Energy consumption, as well as its impact on system performance, operating cost and the environment, has become a critical issue in cloud environments. In order to improve energy efficiency and reduce energy cost, energy use and the associated carbon emissions must be measured in association with tasks, resources, usage and other workload parameters in the IT infrastructure. When the capacity of the private cloud is insufficient, it must rent resources from public clouds and migrate data and/or applications to them, which raises security and legal issues. Guaranteeing security depends heavily on monitoring frameworks and requires specific attention to the corresponding parameters at the infrastructure level.
2) Virtual Infrastructure layer

Virtual resources play a crucial role, increasing transparency, dynamics and scalability. Common metrics at this level are mainly related to the percentage of CPU usage, RAM and storage usage of VMs. Statistics on the network interfaces of VMs are equally relevant. Operations related to the creation and migration of VMs and the number of active instances are also useful information. The security of virtual infrastructure resources can be associated with the OS and middleware.

3) Network layer

In this layer, the relevant metrics are mainly at the IP service level. These metric types are classified as Throughput, Availability and Reliability. Throughput is considered an essential parameter in cloud monitoring. Regarding Availability, a network can present downtime periods
caused by problems in network components and routing configurations, among other aspects. For Reliability assessment, the response time to a network configuration can be a relevant indicator. Upon the occurrence of a network failure, the mean time between failures and the average time to recover are common reliability indicators.

4) Application/Service layer

In the Service/Application layer, the nature of the monitored parameters and how they should be collected depend essentially on the software being monitored. Metrics should provide information about the state of the service, its performance and other service-specific information. Relevant metrics for this layer are classified into Availability, Reliability and Security. Availability includes recording the periods of time during which a service is running and when it is unavailable. The response time of a given service is a common indicator of efficiency. The number of security vulnerabilities is a relevant metric, since it is necessary to monitor behavior to detect possible violations.

Table 1 summarizes the metrics to consider at provider-side monitoring.

Tab 1. Metrics for Provider-side Monitoring

Layer | Type | Key Metrics
----- | ---- | -----------
Physical Infrastructure | Computing Resource | CPU, memory, disk usage, number of VMs
Physical Infrastructure | Network Equipment | network interface statistics, topology connectivity
Physical Infrastructure | Energy | energy consumption, temperature
Physical Infrastructure | Security | authentication systems, firewalls, access control, IDS and IPS monitoring
Virtual Infrastructure | Components | number of CPUs, memory size, hosting space, VM interfaces, VM migration, number of active instances
Virtual Infrastructure | Security | status of VM instances, system-level calls between VMs and hardware
Network | Throughput | traffic volume per time unit, used and available bandwidth
Network | Performance | packet duplication, packet loss
Network | Reliability | mean time to repair upon failure, mean time between failures
Application/Service | Server response time | DNS lookup time, connect time, server processing time, download time
Application/Service | Availability | uptime, service (un)availability
Application/Service | Reliability | mean time to repair upon failure, mean time between failures
Application/Service | Security | password management, backup policies, access patterns, login processes

2.3.2 client-side monitoring

An end user needs to monitor the resources or services rented from the public cloud for performance evaluation, QoS and Service Level Agreement (SLA) validation.

1) Virtual infrastructure layer

Clients obtain statistics about their VM instances, such as the number of CPUs, memory size, hosting space, consumed time and cost per instance.

2) Network layer

Clients obtain information about network performance, such as minimum, maximum and mean latency, packet loss and bandwidth.

3) Application/Service layer

Clients collect information about the status and usage of the cloud applications and associated resources, service costs, etc.

Table 2 summarizes the metrics to consider at client-side monitoring.
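The client-side network metrics listed in item 2) above can be derived from raw probe results. The following Python sketch is illustrative only: the function name `summarize_probes`, the use of `None` for a lost probe and the sample values are assumptions made for this example and do not come from the paper.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Summarize round-trip times in ms; None marks a probe with no reply."""
    replies = [r for r in rtts_ms if r is not None]
    lost = len(rtts_ms) - len(replies)
    # Packet loss is the fraction of probes that received no reply.
    summary = {"packet_loss": lost / len(rtts_ms) if rtts_ms else 0.0}
    if replies:
        summary.update(
            min_latency=min(replies),
            max_latency=max(replies),
            mean_latency=mean(replies),
        )
    return summary


# Hypothetical sample: four probes, one of them lost.
samples = [20.0, 30.0, None, 40.0]
result = summarize_probes(samples)
```

A client could run such a summary over each measurement window and report the resulting values for the Network layer.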
Tab 2. Metrics for Client-side Monitoring

Layer | Type | Key Metrics
----- | ---- | -----------
Virtual Infrastructure | Components | number of CPUs, memory size, hosting space, consumed time, cost per instance
Network | Throughput | bandwidth
Network | Availability | data loss rate, data error rate
Network | Efficiency | response time (average/maximum)
Application/Service | Service response time | transfer time; maximum, minimum and average service time
Application/Service | Service cost | consumed time, billing

3 Agent-based monitoring architecture for Hybrid Cloud

This section presents an agent-based monitoring architecture adapted to the needs of hybrid cloud monitoring.

Monitoring can be achieved by employing agents. To support our monitoring architecture, a set of specialized agents has been devised that can extract information about the system status and perform monitoring tasks. Agents are in charge of configuring the system, performing measurements, computing statistics, collecting and managing information, detecting critical situations and applying reactions according to the decisions notified by the system administrator. To achieve these goals, the architecture shown in Fig. 2 is proposed.

Fig.2. Agent-based Monitoring Architecture for Hybrid Cloud (panels: Private Cloud, Public Cloud 1, Public Cloud 2; components: seragent, appagent, netagent, vmagent, pragent, Aggregate Agent, Coordinator Agent, Report Agent)

The specialized agents distributed in the cloud are pragent, vmagent, netagent, appagent, seragent, AggregateAgent, CoordinatorAgent and ReportAgent. A pragent is responsible for collecting information about physical resources, such as servers, routers, switches and so on. A vmagent collects information about virtual machines across the hybrid cloud. A netagent obtains information about the network layer.
An appagent collects information about applications running on the hybrid cloud infrastructure. A seragent is responsible for collecting information about the services offered by public cloud providers. An AggregateAgent collects monitoring information from different sources, translates it into a unified message format and sends it to a CoordinatorAgent after performing classified aggregation. A CoordinatorAgent receives the various types of monitoring information and stores them for future use. A ReportAgent can be used to analyze monitoring data and present the results as graphs according to the administrator's requirements.
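The flow from the layer agents through an AggregateAgent to a CoordinatorAgent can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not specify a message format, an aggregation rule or a storage mechanism, so the `Message` fields, the use of a mean per (layer, metric) pair and the in-memory list are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Message:
    """Hypothetical unified message format (not defined in the paper)."""
    cloud: str    # e.g. "private" or "public-1"
    layer: str    # e.g. "physical", "virtual", "network"
    metric: str   # e.g. "cpu_usage"
    value: float


class CoordinatorAgent:
    """Receives aggregated messages and stores them for future use."""

    def __init__(self):
        self.store = []  # assumed in-memory storage

    def receive(self, messages):
        self.store.extend(messages)


class AggregateAgent:
    """Collects raw readings in one cloud, unifies and aggregates them."""

    def __init__(self, cloud, coordinator):
        self.cloud = cloud
        self.coordinator = coordinator
        self.buffer = defaultdict(list)  # (layer, metric) -> raw values

    def collect(self, layer, metric, value):
        # Called by a pragent, vmagent, netagent, appagent or seragent.
        self.buffer[(layer, metric)].append(float(value))

    def flush(self):
        # Assumed "classified aggregation": one mean per (layer, metric).
        messages = [
            Message(self.cloud, layer, metric, mean(values))
            for (layer, metric), values in sorted(self.buffer.items())
        ]
        self.buffer.clear()
        self.coordinator.receive(messages)
        return messages


coordinator = CoordinatorAgent()
private = AggregateAgent("private", coordinator)
private.collect("physical", "cpu_usage", 40.0)     # reading from a pragent
private.collect("physical", "cpu_usage", 60.0)
private.collect("virtual", "active_instances", 3)  # reading from a vmagent
private.flush()
```

In this sketch each cloud would have its own AggregateAgent, so the CoordinatorAgent only ever sees messages in the unified format, regardless of which platform produced the raw readings. A ReportAgent could then read `coordinator.store` to build its graphs.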
4 Conclusion and Future Research

Cloud monitoring is a recent and active research area where the lack of related standards is evident. This is particularly important and complex when monitoring cloud services across hybrid clouds, which involves security, quality and legal issues. Contributing to the efforts toward modeling, this paper has proposed a stratified model that identifies and suggests needs and metrics for efficient monitoring of hybrid cloud environments. By employing agent technology, we have proposed an agent-based monitoring architecture for hybrid cloud. Future work includes validating and tuning the proposed model in an experimental scenario and following forthcoming activities in the area.

Acknowledgements

This paper is partly supported by the Specialized Research Fund for the Doctoral Program of Higher Education under Grant No.20120143110014.

References

[1] Mell, P. and T. Grance. The NIST definition of cloud computing (draft)[OL]. [2013-3-23]. http://csrc.nist.gov/publications/nistpubs/800-145/sp800-145.pdf
[2] Fu, Y., Y. Wang and E. Biersack. A general scalable and accurate decentralized level monitoring method for large-scale dynamic service provision in hybrid clouds[J]. Future Generation Computer Systems, 2013, 29(5): 1235-1253.
[3] Zou, D., et al. Design and implementation of a trusted monitoring framework for cloud platforms[OL]. [2013-3-26]. http://doi:10.1016/j.future.2012.12.020.
[4] Ilarri, S., E. Mena and A. Illarramendi. Using cooperative mobile agents to monitor distributed and dynamic environments[J]. Information Sciences, 2008, 178(9): 2105-2127.
[5] Frank Doelitzscher et al. An agent based business aware incident detection system for cloud environments[OL]. [2013-4-21]. http://www.journalofcloudcomputing.com/content/1/1/9
[6] Palhares, N., S. Lima and P. Carvalho. A Multidimensional Model for Monitoring Cloud Services[J]. Advances in Information Systems and Technologies, 2013, 206: 931-938.
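[7] Katsaros, G., et al. A Self-adaptive hierarchical monitoring mechanism for Clouds[J]. Journal of Systems and Software, 2012, 85(5): 1029-1041.
[8] Montes, J., et al. GMonE: A complete approach to cloud monitoring[OL]. [2013-3-26]. http://dx.doi.org/10.1016/j.future.2013.02.011.
[9] Spring, J. Monitoring Cloud Computing by Layer, Part 1[J]. Security & Privacy, IEEE, 2011, 9(2): 66-68.
[10] Aceto, G., et al. Cloud monitoring: A survey[OL]. [2013-4-30]. http://dx.doi.org/10.1016/j.comnet.2013.04.001.

Agent-based Monitoring Method for Hybrid Cloud

LIU Yunchang, LI Chunlin, LIU Yanpei
(School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430063)

Abstract: Hybrid clouds integrate multiple cloud platform technologies. Their inherent complexity and the lack of monitoring standards urgently call for a careful analysis and classification of monitoring. Against this background, this paper analyzes hybrid cloud monitoring in depth. It first analyzes the monitoring requirements of hybrid clouds and proposes a layered monitoring model. Across four layers (physical infrastructure, virtual infrastructure, network, and application and service), it studies the aspects of hybrid cloud monitoring from the two perspectives of service providers and users, and identifies the relevant parameters and metrics of each layer. In addition, based on the flexibility and intelligence of agents, it proposes an agent-based hybrid cloud monitoring architecture that can eliminate the complexity among different cloud platforms. This research helps to achieve a clearer and more efficient monitoring scheme for hybrid clouds.

Keywords: cloud computing; hybrid cloud; monitoring; Agent

CLC number: TP393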