A Virtual Machine Searching Method in Networks using a Vector Space Model and Routing Table Tree Architecture

Hyeon seok O, Namgi Kim, Byoung-Dai Lee
Dept. of Computer Science, Kyonggi University, Suwon, Korea
{lims, ngkim, blee}@kyonggi.ac.kr

Yoon-Ho Choi
Dept. of Convergence Security, Kyonggi University, Suwon, Korea
ychoi@kyonggi.ac.kr

Abstract: As cloud services draw great attention from the market, the number of cloud service providers has increased. Accordingly, machines that offer a variety of services now exist in the network, and users can establish virtual service networks suited to their own demands using virtual machines that differ in performance and function. However, establishing such a virtual service network requires a method by which the user can search the network for virtual machines that fit the required services. This study therefore proposes a virtual machine search method that matches the user's demand by combining a Logical Routing Table Tree architecture with a Vector Space Model.

Keywords: Virtual Network; Logical Routing Table; Vector Space Model; Searching Virtual Machine

I. Introduction

As networks developed, the numbers of both users and services increased rapidly. This caused various problems in existing networks, such as a shortage of available resources and difficulty in establishing networks for various services. One approach that has emerged to resolve these problems is virtualization, which maximizes the utilization of available resources by logically segmenting physical resources or by pooling unused resources. Virtualization has therefore attracted great attention as a way to address these problems. It has led to server virtualization that uses the host's performance to the fullest, to storage virtualization, and even to network virtualization, which provides an independent network not only for specific performance requirements but also for each service.

While virtualization has resolved the earlier problems, logical segmentation has also produced numerous network devices within networks. Service providers that organize networks for their own services therefore face another problem: deciding which device within an existing network should be selected to provide the best performance and the most appropriate results. To this end, this study proposes a method that organizes virtual machines in a logical routing table tree architecture and finds virtual machines suited to the user's demands using a vector space model [5].
II. Related Work

A. Network Function Virtualization (NFV)

With the development of networks, the cycle in which services using specific technologies emerge has gradually shortened. As a result, existing network functions are frequently removed and new functions are frequently added. However, existing architectures mostly consist of dedicated hardware-based devices, so removing or adding functions is difficult, and maintenance and management are also challenging. NFV [1] was designed to resolve this issue. In this concept, the network architecture consists of a few simple, high-performance elements, such as high-capacity storage, high-performance servers, and high-performance switches. The functions necessary for services are then implemented as software and added to each element. This yields flexible network architectures.

B. Software Defined Network (SDN)

In the existing network, the switch that forwards packets combines in one device the data plane, which is in charge of data transmission, and the control plane, which is in charge of transmission directions. However, a switch built this way has no view of the overall network, which makes it difficult to create appropriate traffic paths for the numerous services running over the network. The Software Defined Network (SDN) was proposed to resolve this problem. SDN separates the two planes that had been combined into one. The switch remains in charge of the data plane, and the control planes of the switches are gathered in a centralized device called the controller, which creates optimal traffic flows by managing the overall network. During this process, the switch receives the direction of transmission for a given flow from the controller. OpenFlow has been suggested as the communication protocol for this purpose, and it has been studied by many researchers.

C. Virtual Network Embedding (VNE)

Network virtualization produces logical devices by virtualizing each network device used within the network and creates a virtualized network [4] from those logical devices. Here, the most basic task is to map the virtual nodes and links of the virtual network onto the devices of the actual network. This mapping must respect constraints such as the link bandwidth required by the virtual network and the host's throughput, and it should not generate overheads. (This problem is known to be NP-hard.) Studies on this mapping problem can largely be divided into heuristic approaches and integer programming approaches. Both aim to reduce the overheads of the mapped links and nodes in the actual network and to produce the best performance.

D. Selecting Best Web Service

In the early days of the Web, each Web service provided only its own single service. However, such a single service can no longer satisfy users' demands. Therefore, to provide new services quickly, studies have been carried out on Web service mash-up technology, which offers new services by linking and combining existing Web services. Specifically, mash-up technology should first analyze the characteristics of the services suited to a user's demand, find the corresponding Web services, and then mash up these services. To this end, the characteristics of services have sometimes been defined in terms of input, output, preconditions, and effects. They have also been defined in other ways, such as the Web Services Description Language (WSDL), which records detailed information about Web services in XML format [6]. Based on this information, Web services suitable for mash-up are searched for. To make this search effective, studies have classified Web services according to their information and characteristics and expressed them as structures such as trees [2] or graphs [3]. Target Web services are then found by applying a search algorithm suited to each data structure.

III. Proposed Method

The routing table tree and vector space model presented below suggest a virtual machine suited to the user's demand from among the various virtual machines that are present within the network and controlled by one controller.

A. Logical Routing Table Tree

The Logical Routing Table Tree architecture is intended for networks in a distributed environment in which each device becomes a node of the tree. It is designed to resolve the overhead that accumulates at the root node and at a caching node in existing tree architectures. Existing trees are either searched top-down or accessed directly through caching or tree replication. In contrast, this method enables left-right horizontal searching along with the existing top-down vertical searching, because each node's routing table holds the addresses of the nodes on its own level in addition to those of its parent and children.

Figure 1. An Example of Logical Routing Table Tree

As Figure 1 shows, each node in the routing table tree has its own level information and address.
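The node structure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `TreeNode`, the helpers `link_level` and `find_function`, the address strings, and the toy topology are all hypothetical and are not taken from the paper. The sketch assumes that a node's routing table lists its parent, its children, and every other node on its own level, and it uses a plain breadth-first walk as a stand-in for the vertical and horizontal searching.

```python
from collections import deque


class TreeNode:
    """Hypothetical node of a logical routing table tree (root, domain, or VM)."""

    def __init__(self, address, level, functions=()):
        self.address = address            # illustrative identifier, e.g. "node-3"
        self.level = level                # depth in the tree, root = 0
        self.functions = set(functions)   # functions offered; empty for root/domains
        self.parent = None
        self.children = []
        self.peers = []                   # other nodes on the same level

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def routing_table(self):
        """Nodes reachable in one step: parent, children, and same-level peers."""
        up = [self.parent] if self.parent is not None else []
        return up + self.children + self.peers


def link_level(nodes):
    """Give every node in `nodes` links to the others (horizontal entries)."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]


def find_function(start, function):
    """Breadth-first walk over routing tables, starting at `start`, not the root."""
    seen = {id(start)}
    queue = deque([start])
    found = []
    while queue:
        node = queue.popleft()
        if node is not start and function in node.functions:
            found.append(node.address)
        for neighbour in node.routing_table():
            if id(neighbour) not in seen:
                seen.add(id(neighbour))
                queue.append(neighbour)
    return sorted(found)


# Toy topology (an assumption, not the network of Figure 1).
root = TreeNode("root", 0)
domain1 = root.add_child(TreeNode("domain-1", 1))
domain2 = root.add_child(TreeNode("domain-2", 1))
node1 = domain1.add_child(TreeNode("node-1", 2, {"A"}))
node2 = domain1.add_child(TreeNode("node-2", 2, {"B"}))
node3 = domain2.add_child(TreeNode("node-3", 2, {"A"}))
node4 = domain2.add_child(TreeNode("node-4", 2, {"A"}))
link_level([domain1, domain2])
link_level([node1, node2, node3, node4])

print(find_function(node2, "A"))  # ['node-1', 'node-3', 'node-4']
```

In this toy tree, node-2 reaches the three candidates through its own horizontal entries, so the query does not have to pass through the root.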
Using this information, each node within the tree generates less overhead than conventional searching methods, in which nodes are searched from the root down the tree. In addition, although this design can be slower than direct access to nodes through caching or tree replication, which are other options for resolving a similar problem, it avoids the problems of those methods, such as memory consumption and tree updates.

B. Vector Space Model

The Vector Space Model is a model used for information retrieval. In this method, each document that is a target of the search is expressed as a vector and placed within the space. The query is then also expressed as a vector, called the query vector, in the same space. Information is retrieved by measuring similarity using the distance and angle between the query vector and each document vector. The query vector can be a multi-dimensional vector, depending on the user's query. Similarity within this vector space is generally calculated using the cosine similarity computation shown in Figure 2.

Figure 2. Cosine Similarity Computation Algorithm

The value calculated by this algorithm represents similarity and never exceeds 1. A larger value indicates a higher level of similarity.

C. Proposed Method

Figure 3. Sample Architecture
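Because the search that follows ranks candidates by the cosine similarity of Section III.B, a minimal Python sketch of that measure may be useful here. It uses the standard definition cos(q, d) = (q · d) / (|q| |d|). The function names `cosine_similarity` and `rank`, the names `vm-a` to `vm-c`, and the (throughput, distance) values are assumptions made for illustration; they are not the values of Table I or Table II.

```python
import math


def cosine_similarity(query, candidate):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(query) != len(candidate):
        raise ValueError("vectors must have the same dimension")
    dot = sum(q * c for q, c in zip(query, candidate))
    norm_q = math.sqrt(sum(q * q for q in query))
    norm_c = math.sqrt(sum(c * c for c in candidate))
    if norm_q == 0 or norm_c == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_q * norm_c)


def rank(query, candidates):
    """Return (name, similarity) pairs sorted by decreasing similarity."""
    scored = [(name, cosine_similarity(query, vec))
              for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical (throughput, distance) vectors; NOT the values of Table I.
query = (50, 4)
candidates = {"vm-a": (100, 8), "vm-b": (50, 40), "vm-c": (10, 10)}

for name, score in rank(query, candidates):
    print(f"{name}: {score:.3f}")
```

Note that cosine similarity compares direction rather than magnitude: in this made-up data, `vm-a` is an exact multiple of the query vector and therefore ranks first with a similarity of 1, up to floating-point rounding.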
This study proposes a method that searches for the optimal virtual machine by combining the above two methods. First, as shown in Figure 3, the network in the distributed environment is reorganized as a logical routing table tree. Each element of the tree is classified as the root, a domain, or a node. The root is the machine that controls the overall network tree, and it has the same function as the SDN controller or NFV mentioned above. A domain is a network card or switch that connects the network. Each node is a virtual machine that offers a service. In addition, each link represents a hop count. Each virtual machine in the tree has its own functions, which are expressed here as F_function.

As Figure 3 shows, the proposed architecture converts an actual network, which has scattered virtualization servers and a centralized controller, into a logical architecture. An appropriate virtual machine is then searched for using only the logical architecture. For example, if the user at Node 2 attempts to find a virtual machine that performs Function A, Node 2 first finds Nodes 1 and 2 via vertical and horizontal searching using the routing table, and then finds Nodes 3 and 4 by searching the descendants of Domains #3 and #4.

After the virtual machines that offer the service and function have been found using the routing tables, the factors of each virtual machine in the resulting set are compared with the query of Node 2, and the virtual machines that fit the query are found and prioritized. The factors that can enter each vector are shown in Table I. If the query of Node 2 has throughput 50 and distance 4, applying the cosine similarity computation of Figure 2 gives each virtual machine the level of similarity shown in Table II.

TABLE II. THE SIMILARITY AND PRIORITY OF EACH NODE FOR THE QUERY OF NODE 2
The above results show that the virtual machines that have Function A are Nodes #1, #3, and #4 in the tree, and that the virtual machine with the fastest throughput is Node #3. This process makes it possible to find the virtual machine with the best performance within the network according to the user's demand.

IV. Conclusion

This study proposed a virtual machine searching method for organizing a virtual network by finding a virtual machine suited to the user's required service among the numerous virtual machines within the network. To this end, the actual network, in which virtualization servers and a centralized controller are present, was converted into a logical routing table tree architecture. Searching this architecture enabled the grouping of virtual machines that provide the desired functions. Finally, a similarity measurement method based on a vector space model was suggested in order to find, within this group, the optimal virtual machine for the user's demand. In this study, simple evaluation factors that influence basic performance were used for brevity. Follow-up studies are planned on composing the best network from the searched virtual machines, using a searching method that accommodates more user demands by utilizing various evaluation factors.

ACKNOWLEDGEMENT

This work was supported by the Industrial Strategic Technology Development Program (10047541, Development of Self-Defending and Auto-Scaling SDN Smart Security Networking System) funded by the Ministry of Knowledge Economy (MKE, Korea).

REFERENCES

1. ETSI, "Network Function Virtualisation (NFV); Proof of Concepts, Framework," 2013.
2. J. Hu and J. Li, "Searching and Selecting the Best Web Service Chaining," CCCM '08, pp. 627-630, Aug. 2008.
3. S. Kona, A. Bansal, and G. Gupta, "Automatic Composition of Semantic Web Service," IEEE ICWS 2007, pp. 150-158, July 2007.
4. VMware, "VMware Virtual Networking Concepts," White paper, 2007.
5. C. D. Manning et al., "Introduction to Information Retrieval," Cambridge University Press, 1st ed., 2008.
6. S. Ran, "A Model for Web Services Discovery With QoS," ACM SIGecom Exchanges, Vol. 4, Is. 1, pp. 1-10, Spring 2003.