Study of Flexible Contents Delivery System With Dynamic Server Deployment

Yuko KAMIYA, Toshihiko SHIMOKAWA and Norihiko YOSHIDA

Graduate School of Information Science, Kyushu Sangyo University, JAPAN
Faculty of Information Science, Kyushu Sangyo University, JAPAN
Information Technology Center, Saitama University, JAPAN
{kamiya,toshi}@nw.is.kyusan-u.ac.jp, yoshida@mail.saitama-u.ac.jp

ABSTRACT

To deliver large contents to a large number of clients, two different technologies, Content Distribution Network (CDN) and Peer-to-Peer (P2P), have been proposed. However, both technologies have their own problems. CDN lacks Adaptability and Flexibility. Moreover, CDN is expensive to deploy and maintain. On the other hand, P2P lacks Flexibility and Reliability. In this paper, we propose a flexible contents delivery system which has Adaptability, Flexibility and Reliability. Moreover, its cost performance is high.

Keywords: CDN, P2P, Load Balancing, Server Proliferation

1. INTRODUCTION

With the rapid spread of the Internet, broadband networks are available even at home. Content holders provide several kinds of broadband contents all over the Internet. Some popular contents are accessed by a large number of clients. Therefore, a delivery system needs enough delivery capacity to provide them. Delivery capacity consists of the processing power and the network bandwidth of the delivery system. In this research, we focus on network bandwidth, because processing power can be increased easily by scaling out or scaling up.

Generally speaking, there are two methods to increase network bandwidth. One is expanding the bandwidth of the existing network. We can expand it by upgrading the external link to a faster link, e.g. 1G to 10G or 10G to 40G. We can also expand it by adding new servers in a local area, because the total output bandwidth of the servers is usually lower than the capacity of the external link of the server system.
However, this method cannot eliminate the network bottleneck. The other method is adding a new network to the existing network. We can add a new network to the existing system by deploying new servers in a wide area, because requests from clients and responses from servers are then distributed widely.

CDN [1][2] is widely used for large-scale contents delivery. Contents providers estimate the amount of access in advance, and use a CDN to provision enough processing power and network bandwidth. However, the existing CDN has some problems. First, a CDN needs new physical machines to construct a new content server. They are expensive to distribute and maintain. Second, a CDN cannot deliver contents during an overload situation. This means that the existing CDN does not have adaptability. Moreover, the existing CDN deploys its content servers in advance. Therefore, it is not easy to change the configuration of these content servers. For example, when a CDN needs a new transport layer protocol, it has to re-configure the OS kernels of its servers. This is not easy, because it requires verifying the interoperability of the existing systems and the new kernel. This means that the existing CDN does not have flexibility.

On the other hand, a P2P system is used for increasing network bandwidth. After clients receive contents, they will act
as supplying peers and deliver the contents to other requesting clients. A P2P system gains delivery bandwidth whenever a new peer joins the P2P network, because the system can use the network of the new peer. This means that P2P has adaptability. However, peers are not dedicated servers but autonomous volunteers. Therefore, they eventually leave the P2P system. This means that P2P does not have reliability.

In this research, we propose a new contents delivery system which is flexible, adaptable, reliable, and has high cost-performance.

This paper is organized as follows. Section 2 presents related works, including an overview and the problems of CDN and P2P. Section 3 proposes a flexible contents delivery system and Section 4 presents an implementation of it. Finally, we conclude this paper in Section 5.

2. RELATED WORKS

Two different technologies, Content Delivery Network (CDN) and Peer-to-Peer (P2P), have been proposed to add a new network to an existing network.

2.1 CDN (Contents Delivery Network)

CDN is widely used for large-scale contents delivery. A CDN consists of an origin server, content servers, and a request navigation system. The origin server stores the original contents. The CDN deploys many content servers all over the Internet. The content servers cache the contents from the origin server. Requests from clients are redirected to their closest content server. Examples of commercial CDNs are Akamai [3] and Limelight [4].

2.2 Problem about Existing CDN

The existing CDN has some problems. These are Flexibility, Adaptability, and Cost.

Flexibility is required when a CDN wants to start new services. For example, most existing CDNs do not support SCTP [5]. SCTP is a new transport layer protocol. Therefore, kernel re-configuration is required to use the protocol. However, kernel re-configuration is not easy for a large number of deployed servers. Not only a new transport layer protocol but also a new application layer protocol is not easy to add to an existing system, because verification is required whenever something is added to a running system. Therefore, we can say that existing CDNs lack Flexibility.

Adaptability is required for a CDN to handle a flash crowd. A flash crowd is a sudden surge in the volume of requests to a CDN that causes the CDN to be virtually unreachable. A CDN is required to add distribution capacity dynamically to handle a flash crowd. Some existing CDNs are able to add content servers dynamically, which adds processing power. However, the new servers are deployed near the existing content servers, so they add only limited network bandwidth. Therefore, we can say that existing CDNs lack Adaptability.

It costs too much for a CDN to increase processing power and/or network bandwidth. When a CDN wants to increase distribution capacity, it needs a new content server, and a new physical machine is required to construct a new content server.

2.3 P2P (Peer-to-Peer)

P2P is an autonomous decentralized system. In P2P systems, a peer not only receives contents but also serves them. Therefore, the more peers there are, the more resources the P2P system gains.

2.4 Problem about Existing P2P

Reliability is required whenever contents are served. In a CDN, a content server is always serving contents. However, in P2P, a peer may leave the P2P network. Therefore, a peer cannot retrieve a content after all peers which hold that content have left the P2P network.

3. Flexible Contents Delivery System with Dynamic Server Deployment

As mentioned in 2.2 and 2.4, there are problems with existing CDN and P2P. To solve these problems, we propose a new flexible contents delivery system that has the following characteristics.

Adaptability guarantees that a client can retrieve its desired content even if there are a large number of accesses to the server.
Cost-Performance means that the contents delivery system can be constructed at low cost.
Flexibility guarantees that any services and protocols can be adopted.
Reliability guarantees that a client can always find and access its desired content.

Our system has a capability to add new content servers
dynamically in a wide area. As mentioned in 2.2, a CDN can add processing power by adding new content servers dynamically. Our system can also add new content servers dynamically, and in addition it can deploy them in a wide area to avoid network congestion. To be able to deploy new content servers dynamically, we adopt virtual machines instead of physical machines to construct content servers. Therefore, our system can gain both processing power and network bandwidth dynamically.

Virtual machines also give us high cost-performance. Physical distribution is required to deploy a physical machine, which is expensive. In contrast, only data transfer is required to deploy a virtual machine, which is not expensive. Moreover, several virtual machines can share one physical machine. This also improves cost-performance.

It is easy to add a new feature to our system, because we can add a new virtual machine for each new feature. We do not have to verify interoperability between the existing system and the new feature, because the new feature is added on the new machine. We do not need to modify existing content servers. Therefore, our system has Flexibility.

All content servers in our system are under the control of our system. They are not autonomous volunteers. Therefore, our system has Reliability, and we can guarantee QoS easily.

Table 1 shows a comparison of existing content delivery systems and our system.

Table 1. Comparison of Content Delivery Systems

              Flexibility  Adaptability  Reliability  Cost
CDN           NG           NG            OK           High
P2P           NG           OK            NG           Low
Our System    OK           OK            OK           Low

4. Implementation of flexible contents delivery system

As mentioned in Section 3, we propose a new flexible contents delivery system. In this section, we describe the implementation of this system. We use Server Proliferation [6] for the implementation. Therefore, we first describe Server Proliferation. Next, we describe the implementation of our proposed system.

4.1 Server Proliferation

We proposed Server Proliferation for scalable server systems. The basic idea of Server Proliferation is to add and deploy virtual machines dynamically in a wide area. By adding virtual machines, it can add processing power to the server system dynamically. Moreover, by deploying virtual machines in a wide area, it can add network bandwidth dynamically.

We introduce two types of servers in Server Proliferation. One is the Deployment Server and the other is the Execution Server. In Server Proliferation, services are provided by virtual machines. When a new virtual machine is necessary, a virtual machine is distributed from a Deployment Server to one of the Execution Servers. The distributed virtual machine is executed on that Execution Server.

Figure 1. Architecture of Server Proliferation

The load of the server and the network bandwidth can become problems when we deploy new virtual machines. That is why we use these two kinds of servers. If the load of an Execution Server is high, it is difficult for it to start the new process of deploying a new virtual machine. Moreover, if its network bandwidth is not enough, it is difficult for it to distribute the virtual machine to another Execution Server. When the load of a virtual machine becomes high, the load of its Execution Server also becomes high. However, the load of the Deployment Server stays low. Therefore, we can deploy a new virtual machine from the Deployment Server to another Execution Server if necessary. In this architecture, the Deployment Server could become a bottleneck. However, we can deploy multiple Deployment Servers in a wide area, so it does not become a bottleneck.

4.2 Implementation of Flexible Contents Delivery System using Server Proliferation

In this research, we design a new flexible contents delivery system with dynamic server deployment using Server Proliferation. We use Server Proliferation to implement our idea
described in Section 3. To add new surrogate servers dynamically, we can use Server Proliferation, because it can add new virtual machines dynamically. When we construct a surrogate server on a virtual machine, we can therefore add new surrogate servers dynamically. Server Proliferation also enables us to deploy new surrogate servers in a wide area, because it can deploy new virtual machines in a wide area.

However, Server Proliferation does not define any rule for where to deploy new virtual machines. Therefore, we introduce a Monitor Server in addition to the Execution Server and the Deployment Server. Various server placement rules can be installed on the Monitor Server. We call such a rule a Server Deployment Policy. The Monitor Server monitors various statuses of virtual machines and Execution Servers, such as load average and network bandwidth. It monitors not only the status of a single machine but also the status between servers, such as the distance between Deployment Servers and Execution Servers.

As described above, we can use various Server Deployment Policies. We describe examples of the policy in Section 4.3.

4.3 Examples of Server Deployment Policy

In this paper, we explain five examples of Server Deployment Policy. However, our system can also use other policies and/or combinations of these five policies.

4.3.1 Distance between execution server and deployment server

This policy uses the number of hops between the Deployment Server and an Execution Server as a metric. It chooses the Execution Server nearest to the Deployment Server; therefore, it can deploy a virtual machine in a short time. As a result, it can respond to a sudden increase in accesses, and the downtime of the server can be shortened. However, it considers neither the processing power nor the network bandwidth of Execution Servers. Therefore, we might have to deploy another virtual machine because of high load and/or lack of network bandwidth.

4.3.2 Processing Power

This policy uses the processing capacity of Execution Servers as a metric.
It monitors the load averages of Execution Servers. It calculates the processing capacity of an Execution Server from its load average and its own processing power. Then it chooses the most powerful server. This policy can solve a lack of processing power. However, it does not consider the network distance between the Execution Server and the Deployment Server. Therefore, it may choose an Execution Server far from the Deployment Server.

4.3.3 Network Bandwidth

This policy uses the capacity of network bandwidth as a metric. It monitors the network bandwidth usage of Execution Servers. It chooses the Execution Server that has the most available network bandwidth. Therefore, it can solve a lack of network bandwidth.

4.3.4 Nearest AS

This policy uses the locations of clients as a metric. It monitors the AS of each client, where AS is an Autonomous System in BGP [7]. By deploying servers in a wide area, it is possible to increase the total network bandwidth. However, an Execution Server becomes a bottleneck when accesses concentrate on it. In this case, we have to deploy another surrogate server. We use the AS information of clients to distribute accesses among Execution Servers. The Execution Server near the AS from which many accesses are coming is chosen.

4.3.5 Contents deliverer's opinion

This policy uses the contents deliverer's opinion as a metric. For example, if an accounting system is introduced for Execution Servers, not only the performance but also the cost may become a metric. In that case, the contents deliverer wants to choose the Execution Server whose cost is low. Moreover, an Execution Server might be selected based on the network it is connected to. As a result, traffic engineering becomes possible.

5. CONCLUSIONS

To deliver large contents to a large number of clients, two different technologies, Content Distribution Network (CDN) and Peer-to-Peer (P2P), have been proposed. However, both technologies have their own problems. CDN lacks Adaptability and Flexibility.
Moreover, CDN is expensive to deploy and maintain. On the other hand, P2P lacks Flexibility and Reliability. In this paper, we proposed a flexible contents delivery system which has Adaptability, Flexibility and Reliability. Moreover, its cost performance is high. We use "Server Proliferation" to implement our system.

ACKNOWLEDGMENTS

This research was supported in part by MEXT in Japan under Grants-in-Aid for Scientific Research on Priority Area 21013008, and by JSPS in Japan under Grants-in-Aid for Scientific Research (B) 20300024.

REFERENCES

[1] Green, M., Cain, B., Tomlinson, G., Speer, M., Rzewski, P. and Thomas, S.: Content Internetworking Architectural Overview, Internet-Draft draft-ietf-cdi-architecture-01 (2002).
[2] Day, M., Cain, B., Tomlinson, G. and Rzewski, P.: A Model for Content Internetworking (CDI), RFC 3466 (2003).
[3] Akamai Technologies Inc.: Akamai: The Business Internet. http://www.akamai.com/
[4] Limelight: http://www.limelightnetworks.com/
[5] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and Paxson, V.: Stream Control Transmission Protocol, RFC 2960 (2000).
[6] Kamiya, Y., Shimokawa, T. and Yoshida, N.: Scalable Server System Based on Virtual Machine Duplication in Wide Area, Proceedings of The 3rd International Conference on Ubiquitous Information Management and Communication, pp. 432-436 (January 2009).
[7] Rekhter, Y., Li, T. and Hares, S.: A Border Gateway Protocol 4 (BGP-4), RFC 4271 (January 2006).
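
As an illustration of the five Server Deployment Policies in Section 4.3, the following Python fragment is a minimal sketch of how a Monitor Server might express each policy as a function that picks one Execution Server from its monitored candidates. It is not the implementation used in this work. The class ExecutionServer, all of its field names, the function names, the formula "cores minus load average" for processing capacity, and the rule that "near an AS" means "inside that AS" are assumptions made only for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ExecutionServer:
    """Monitored status of one Execution Server (all fields are hypothetical)."""
    name: str
    hops_from_deployment: int   # 4.3.1: hop count from the Deployment Server
    cores: int                  # 4.3.2: own processing power (assumed: CPU cores)
    load_average: float         # 4.3.2: current load average
    free_bandwidth_mbps: float  # 4.3.3: unused network bandwidth
    asn: int                    # 4.3.4: AS number the server is located in
    cost_per_hour: float        # 4.3.5: price charged for using the server


Policy = Callable[[List[ExecutionServer]], ExecutionServer]


def nearest_to_deployment(servers: List[ExecutionServer]) -> ExecutionServer:
    # 4.3.1: fewest hops from the Deployment Server, so deployment is fast.
    return min(servers, key=lambda s: s.hops_from_deployment)


def most_processing_power(servers: List[ExecutionServer]) -> ExecutionServer:
    # 4.3.2: spare capacity estimated as cores - load average (an assumption).
    return max(servers, key=lambda s: s.cores - s.load_average)


def most_bandwidth(servers: List[ExecutionServer]) -> ExecutionServer:
    # 4.3.3: the server with the most unused network bandwidth.
    return max(servers, key=lambda s: s.free_bandwidth_mbps)


def nearest_as(client_asns: List[int]) -> Policy:
    # 4.3.4: find the AS that sends the most accesses, then prefer a server
    # inside that AS. Falling back to hop distance is an assumption.
    busiest_asn = Counter(client_asns).most_common(1)[0][0]

    def policy(servers: List[ExecutionServer]) -> ExecutionServer:
        same_as = [s for s in servers if s.asn == busiest_asn]
        return nearest_to_deployment(same_as or servers)

    return policy


def cheapest(servers: List[ExecutionServer]) -> ExecutionServer:
    # 4.3.5: one possible contents deliverer's opinion: minimize cost.
    return min(servers, key=lambda s: s.cost_per_hour)


# Made-up sample data; the AS numbers are from the documentation range.
servers = [
    ExecutionServer("es-a", 3, 4, 3.5, 200.0, 64496, 0.10),
    ExecutionServer("es-b", 8, 16, 2.0, 400.0, 64497, 0.30),
    ExecutionServer("es-c", 12, 8, 1.0, 900.0, 64498, 0.05),
]

if __name__ == "__main__":
    policies = {
        "distance": nearest_to_deployment,
        "processing power": most_processing_power,
        "network bandwidth": most_bandwidth,
        "nearest AS": nearest_as([64497, 64497, 64496]),
        "deliverer's opinion": cheapest,
    }
    for label, policy in policies.items():
        print(f"{label}: {policy(servers).name}")
```

Because every policy has the same shape (a list of candidates in, one Execution Server out), a combined policy, as mentioned in Section 4.3, could be written as another function of the same type, for example by filtering the candidates with one policy's metric before applying a second.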