Enhance UDDI and Design Peer-to-Peer Network for UDDI to Realize Decentralized Web Service Discovery

De-Ke Guo 1, Hong-Hui Chen 1, Xian-Gang Luo 2, Xue-Shan Luo 1, Wei-Ming Zhang 1
1 School of Information System & Management, National University of Defense Technology, Changsha, China
2 School of Information Engineering, China University of Geosciences, Wuhan, China

Abstract

Web Services have emerged as a dominant paradigm for constructing and composing distributed business applications and enabling enterprise-wide interoperability. A critical factor in the overall utility of web services is a scalable, flexible and robust discovery mechanism. This paper enhances the UDDI specification so that a registry can guarantee the usability of its responses and append grid-like monitoring information about the web service hosting environment to them, while the modifications to the standard UDDI specification do not affect interoperability. We also present two distributed, scalable approaches that overcome the disadvantages of traditional registries: an unstructured peer-to-peer network for fully autonomous registries and a structured peer-to-peer network for cooperative registries. Both kinds of peer-to-peer network support complex queries without affecting interoperability, and they are suitable candidates for extending traditional web service registries.

Keywords: UDDI, decentralized web service discovery, peer-to-peer network

1.0 Introduction

Web Services are emerging as a dominant paradigm for distributed computing in industry as well as academia (e.g. the Open Grid Services Architecture standard [1] and the Web Services Resource Framework [2]). Web Services are enterprise applications that exchange data, share tasks, and automate processes over the Internet. They are designed to enable applications to communicate directly and exchange data, regardless of language, platform and location.
A typical Web Services architecture consists of three entities: service providers that create and publish Web Services, service brokers that maintain a registry of published services and support their discovery, and service requesters that search the service brokers' registries. Web Service registries are critical to the ultimate utility of Web Services and must support scalable, flexible and robust discovery mechanisms. A UDDI [3] registry has a centralized architecture consisting of multiple UDDI nodes that collectively manage a well-defined set of UDDI data. Typically, this is supported through synchronous replication between the nodes of the registry, which reside on different systems. However, this requires a replication contract between registry providers and service providers. In practice, therefore, replication between UDDI nodes does not occur because of security and privacy concerns. Moreover, as the number of web services grows and they become more dynamic, such a centralized approach quickly becomes impractical. As a result, a number of decentralized approaches have been proposed.

A peer-to-peer (P2P) network is a distributed system in which peers employ distributed resources to perform a critical function in a decentralized fashion. Nodes in a P2P network normally play equal roles, which is why they are also called peers. P2P networks can be classified by the degree of control over data location and network topology. There are three categories: unstructured, loosely structured, and highly structured. In an unstructured P2P network such as Gnutella [4], no rule exists that defines where data is stored, and the network topology is arbitrary. In a loosely structured network such as Freenet [5], neither the overlay structure nor the data location is precisely determined. In a highly structured P2P network such as CAN [6] and Chord [7], both the network architecture and the data placement are precisely specified.

Supported by the National High Technology Research and Development Program of China under Grant Nos. 2002AA104220, 2002AA131010, 2002AA134010.

Researchers have done much work on combining peer-to-peer technology with Web Services discovery. Farnoush Banaei Kashani et al. [8] adopted an unstructured peer-to-peer network to design a Web Services peer-to-peer discovery service. Min Cai et al. [9] improved Chord to design a grid information service that supports multiple-attribute queries and range queries, but it assumes that a grid resource can be described by a set of attributes, and it replicates data according to the value of each attribute. Cristina Schmidt et al. [10] improved CAN to design a web service discovery system, but it supports only single-keyword queries. Kunal Verma et al. [11] presented a scalable peer-to-peer infrastructure of registries that classifies registries by domain and routes query messages directly to the correct registries; it uses a centralized service to gossip [12] the registry domain ontology among all registries.

In this paper, we enhance the UDDI specification so that a registry can guarantee the usability of its responses and append grid-like monitoring information about the service hosting environment to them. To discover and select web services among fully autonomous registries, we present a network of registries built with unstructured peer-to-peer technology. Furthermore, we present a network of registries built with structured peer-to-peer technology in order to discover web services more efficiently among cooperative registries. Neither network affects interoperability between registries and requesters.

The rest of this paper is organized as follows. Section 2 enhances the traditional UDDI specification to involve metadata of the service hosting environment.
Section 3 describes the architecture and algorithms of the unstructured peer-to-peer network for registries. Section 4 describes the structured peer-to-peer network for registries. Section 5 presents a prototype of the unstructured peer-to-peer network for registries. Section 6 presents our conclusions and future work.

2.0 Enhance UDDI to Utilize Metadata of Web Service Host Environment

It is well known that the UDDI specification supports efficient queries based on keywords and tModels. But how can we know whether the services in a response are online and idle, and how can we obtain status information about every hosting environment? To obtain status information about hosting environments in a large-scale network, grid research organizations have paid considerable attention to grid monitoring architectures such as GMA [13]; on the other hand, many grid development toolkits have implemented dedicated grid monitoring services such as R-GMA [14]. We believe that a registry should be responsible for returning correct and usable services to the user. This is necessary if a user or application wants to invoke the target web service immediately after receiving a response from a given registry, and it is especially useful for mapping an abstract Web Services process to a robust physical process at invocation time. Furthermore, the registry should append the related monitoring information to the response messages of the standard inquiry interfaces, so that the requester can make further decisions locally before invoking a service.

2.1 Enhance UDDI to Guarantee Usability of Response

There are two schemes that enable a registry to guarantee the usability of its responses. In the first, we deploy an existing grid monitoring service for the registry to monitor the status of service providers, invoke that monitoring service while the standard inquiry API is executing, and finally eliminate and rank candidates according to usability metrics provided by the registry. In the second, we advise that the registry itself possess a grid-like resource monitoring capability, which can be implemented in the following steps.
1) Design a unified and acceptable schema X for metadata about the service hosting environment;
2) Implement and deploy a back-end agent service on each computer that agrees to provide web services and to be monitored. The agent service collects metadata and reports it to a given registry periodically according to some rules, and it provides standard inquiry interfaces through which users can extract real-time status information from the local storage system;
3) Design a new data structure named NodeEntity that conforms to schema X, and select a storage model;
4) Design and implement publication and inquiry interfaces for NodeEntity;
5) The registry invokes the NodeEntity inquiry interface to obtain status information while executing the inquiry interface for ServiceEntity. The registry can thus extract the candidates that satisfy the given usability metrics on the basis of monitoring information.

The new data structure and interfaces have an important influence on the other data structures and interfaces, but they have no fundamental effect on the web service architecture. The basic components of the architecture are still the service provider, the service broker and the service requester. The major processes of the architecture, however, are extended to support publication and inquiry of monitoring information, in addition to the service publication, service inquiry and service invocation processes.

A service provider deploys its services on web servers, and there may be more than one web server. Unfortunately, neither BusinessEntity nor ServiceEntity can reflect this characteristic. We believe NodeEntity solves the problem: a BusinessEntity can contain one or more NodeEntity structures, and a NodeEntity can in turn contain one or more ServiceEntity structures. It would be easy to adjust the schemas and corresponding interfaces of BusinessEntity and ServiceEntity to reflect these relationships explicitly, but we advise against doing so in order to preserve interoperability among registries and client development toolkits provided by different companies, organizations and individuals. It is practical for the registry to establish these relationships implicitly from the discoveryurls of BusinessEntity, the accesspoint of ServiceEntity and the IP of NodeEntity.

2.2 Append Monitoring Information to UDDI Response

As mentioned above, the registry, rather than the requester, is able to decide which candidates satisfy the usability metrics.
If the requester could append local metrics to the inquiry message, the registry could guarantee not only the usability of the response but also that the response satisfies those local metrics. However, doing so is impractical if interoperability between the registry and client development toolkits is to be preserved. Instead, the registry appends monitoring information to a usability-guaranteed response, and the service requester then makes further decisions locally; this seems to be the only practical way to achieve the same goal. We now explore the concrete implementation under the two schemes mentioned above.

Under the former scheme, we construct multiple IdentifierBag objects from the metadata of a given service hosting environment, obtained from the monitoring service, and insert them into the response. In fact, the responses of the findservice and getservicedetail interfaces do not support IdentifierBag in the current version of the UDDI specification. The response of getbusinessdetail, however, does support the use of IdentifierBag, so the requester must call getbusinessdetail with the BusinessKey as a parameter after invoking findservice; only in this way can it obtain monitoring information through the standard inquiry interfaces of the registry.

Under the latter scheme, the registry possesses a grid-like resource monitoring capability and stores monitoring information in its local storage system, so it is easy to construct IdentifierBag objects from local storage and append them to the response of getbusinessdetail in the same way. It is worth noting that an IdentifierBag consists of pairs of attribute name and attribute value according to a given tModel, so it is well suited to representing monitoring metadata entries, which have a similar structure. Moreover, a BusinessEntity can contain more than one IdentifierBag, so a BusinessEntity can carry the entire metadata schema by using multiple IdentifierBag objects.
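The relationships and the filtering described in Section 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names of NodeEntity (online, cpu_load, free_memory_mb), the 0.8 load threshold, the tModel key string and the helper names are all hypothetical and are not taken from schema X or from the UDDI specification. The sketch links a service to its host through the access point and the node IP, keeps only services whose host is online and lightly loaded, and flattens monitoring data into name/value pairs of the kind an IdentifierBag could carry.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class NodeEntity:
    """Hypothetical monitoring record for one host (fields are assumptions)."""
    ip: str
    online: bool
    cpu_load: float        # fraction of CPU in use, 0.0 to 1.0
    free_memory_mb: int


@dataclass
class ServiceEntity:
    name: str
    access_point: str      # URL at which the service is invoked


def host_of(access_point):
    """Extract the host part of an access point URL."""
    return urlparse(access_point).hostname or ""


def usable_services(services, nodes, max_load=0.8):
    """Keep services whose host is online and not overloaded, least loaded first."""
    by_ip = {node.ip: node for node in nodes}
    candidates = []
    for service in services:
        # Implicit relationship: access point host == NodeEntity IP.
        node = by_ip.get(host_of(service.access_point))
        if node is not None and node.online and node.cpu_load <= max_load:
            candidates.append((node.cpu_load, service, node))
    candidates.sort(key=lambda c: c[0])
    return [(service, node) for _, service, node in candidates]


def to_keyed_pairs(node, tmodel_key="uddi:example:monitoring"):
    """Flatten monitoring data into (tModel key, name, value) triples."""
    return [
        (tmodel_key, "ip", node.ip),
        (tmodel_key, "online", str(node.online).lower()),
        (tmodel_key, "cpuLoad", f"{node.cpu_load:.2f}"),
        (tmodel_key, "freeMemoryMB", str(node.free_memory_mb)),
    ]
```

Services whose host has no NodeEntity record are dropped here, which is one possible policy; a registry could equally return them unranked.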
3.0 Unstructured Peer-to-Peer Network for UDDI

The number of registries will become large as the number of web services grows. However, replicating data between registries is unacceptable because of security and privacy problems. Moreover, as the number of web services grows and they become more dynamic, the synchronous replication among registries described in the current version of the UDDI specification quickly becomes impractical. We consider the scenario in which registries are fully autonomous and a service provider may publish a web service to a random registry without any restraint. In this scenario it is impossible to partition the web service descriptions among the registries and route a query directly to the correct candidate registries. To discover and select web services in such a scenario, we construct a network for UDDI by connecting all registries with unstructured peer-to-peer technology. We call this system UP2P4UDDI; each registry in UP2P4UDDI normally plays the role of both service broker and service requester. In such an unstructured P2P system, no rule strictly defines where a web service should be published or which registries are neighbors of each other, no copies of objects exist, and no special network structure needs to be maintained.

3.1 Architecture

We assume that registries have adopted either of the approaches described above to guarantee the usability of responses and to insert monitoring information into them. Here we consider only how to realize a distributed inquiry mechanism on such an unstructured peer-to-peer network. To support local publication and a global distributed inquiry interface, each peer should include at least a local publish engine, a local query engine and a global query engine. When a registry receives a publication request, the local publish engine parses the XML document and stores it in a local file or database system. After receiving an inquiry request, the local query engine extracts suitable data entities from the local file or database system and organizes them into a response that follows the UDDI specification. If the inquiry termination metric is still not satisfied, the local query engine forwards the request to the global query engine. The global query engine then forwards the inquiry request to some or all neighbors according to a given forwarding algorithm. It also merges the received responses into the reply sent to the requester.
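As a rough illustration of how the three engines might cooperate, the sketch below models a peer whose local query engine answers first and whose global query engine is consulted only when the termination metric is not yet met. Everything here is an assumption made for illustration: services are reduced to plain strings, the termination metric is simply "at least `wanted` results", and the global engine forwards one hop to direct neighbors only.

```python
class RegistryPeer:
    """Toy registry peer with local publish, local query and global query."""

    def __init__(self, name, services=()):
        self.name = name
        self.services = list(services)   # locally published service names
        self.neighbors = []              # other RegistryPeer objects

    def publish(self, service_name):
        """Local publish engine: store the entry locally."""
        self.services.append(service_name)

    def local_query(self, keyword):
        """Local query engine: match against local storage only."""
        return [s for s in self.services if keyword in s]

    def query(self, keyword, wanted=3):
        """Answer locally; forward only if the termination metric is unmet."""
        results = self.local_query(keyword)
        if len(results) >= wanted:       # termination metric satisfied
            return results[:wanted]
        return self.global_query(keyword, wanted, results)

    def global_query(self, keyword, wanted, results):
        """Global query engine: ask neighbors and merge their responses."""
        merged = list(results)
        for peer in self.neighbors:
            for s in peer.local_query(keyword):
                if s not in merged:      # drop duplicates while merging
                    merged.append(s)
            if len(merged) >= wanted:
                break                    # stop early, avoid extra messages
        return merged[:wanted]
```

Stopping as soon as enough results are merged reflects the idea that a requester needs a given number of usable services rather than a global view.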
In summary, it is unwise to traverse all peers to obtain every service that satisfies a request, because doing so generates a large number of messages without bringing more benefit to the requester. We advise traversing only a subset of the peers to obtain a given number of services that satisfy the usability metrics. In fact, constructing a global overview of services for every requester is impractical because of its expensive overhead.

3.2 Forwarding-based Searching Algorithms

The desired features of searching algorithms in P2P systems include high-quality query results, minimal routing state maintained per node, high routing efficiency, load balance, resilience to node failures, and support for complex queries. The quality of query results is application dependent; generally, it is measured by the number of results and their relevance. The routing state refers to the number of neighbors each node maintains. The routing efficiency is generally measured by the number of overlay hops per query; in some systems, it is also evaluated by the number of messages per query. Different searching techniques make different trade-offs between these desired characteristics.

The original Gnutella used flooding, which is a Breadth First Search (BFS) of the overlay network graph with depth limit D, where D is the network-wide maximum TTL of a message in terms of overlay hops. The querying node sends the query request to all its neighbors without any ranking or selection. Each neighbor processes the query and returns the result if the data is found. The neighbor then forwards the query request to all its own neighbors except the node from which it received the query. This procedure continues until the depth limit D is reached. Flooding tries to find the maximum number of results within the region centered at the querying node with a radius of D overlay hops. However, it generates a large number of messages and does not scale well [15].
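A minimal sketch of depth-limited flooding may make the message cost concrete. The overlay is assumed to be an adjacency dictionary and each node's published data a list of strings; the function name and the way messages are counted (one per forwarded copy, including copies that reach an already visited node) are choices made for this example, not part of Gnutella.

```python
from collections import deque


def flood_search(graph, data, start, keyword, max_depth):
    """Breadth-first flooding with depth limit; returns (hits, messages)."""
    visited = {start}
    queue = deque([(start, None, 0)])    # (node, sender, depth)
    hits = []
    messages = 0
    while queue:
        node, sender, depth = queue.popleft()
        hits.extend(item for item in data.get(node, []) if keyword in item)
        if depth == max_depth:
            continue                     # TTL exhausted, do not forward
        for neighbor in graph[node]:
            if neighbor == sender:
                continue                 # never send back to the sender
            messages += 1                # duplicates still cost a message
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, node, depth + 1))
    return hits, messages


overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
published = {"B": ["weather-us"], "D": ["weather-eu"], "E": ["weather-jp"]}
```

On this small overlay, raising the depth limit from 2 to 3 finds one more result but costs six messages instead of four, which illustrates why flooding scales poorly.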
In summary, searching in the UP2P4UDDI network is based on flooding or one of its variations, because there is no control over data storage. The searching strategies in unstructured P2P systems are either blind search or informed search. In a blind search such as iterative deepening [16] and random walker [17], no node has information about the location of the desired data. In an informed search such as routing indices [18], each node keeps some metadata about the data location.
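For comparison with flooding, a blind random-walk search can be sketched as below. This is a simplified illustration, not the algorithm of [17] in full: walkers run one after another rather than in parallel, a walker avoids stepping straight back to the node it came from when it has another choice, and the parameter names (walkers, ttl, wanted, rng) are assumptions of this example.

```python
import random


def random_walk_search(graph, data, start, keyword,
                       walkers=2, ttl=4, wanted=1, rng=None):
    """Send `walkers` walkers of at most `ttl` hops; returns (hits, messages)."""
    rng = rng or random.Random()
    hits = []
    messages = 0
    for _ in range(walkers):
        current, previous = start, None
        for _ in range(ttl):
            # Prefer neighbors other than the one we just came from.
            options = [n for n in graph[current] if n != previous]
            options = options or list(graph[current])
            if not options:
                break                    # isolated node, walker stops
            previous, current = current, rng.choice(options)
            messages += 1                # one message per hop
            for item in data.get(current, []):
                if keyword in item and item not in hits:
                    hits.append(item)
            if len(hits) >= wanted:
                return hits, messages    # termination metric satisfied
    return hits, messages
```

The number of messages is bounded by walkers × ttl regardless of how many neighbors each registry has, at the price of possibly missing results that flooding would find.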

4.0 Structured Peer-to-Peer Network for UDDI

If the web service space is divided and the subspaces are assigned to registries, finding the right services becomes easier because queries can be routed directly to the relevant registries. This can be implemented in the following key steps.

1) Abstract the web service space into a uniform taxonomy of web services that can be accepted by all roles of the web service architecture;
2) The uniform taxonomy is by nature a logical tree whose nodes represent the taxonomy entries. It is not difficult to divide this tree into a set of sub-trees, each represented by its root node;
3) A registry must declare its range of responsibility by associating itself with one sub-tree of the taxonomy;
4) A web service publication message must contain category information that conforms to the taxonomy so that it can be stored at the right registry;
5) A web service inquiry message must likewise contain category information that conforms to the taxonomy so that it can be forwarded to the right registries.

The service discovery process thus involves first locating the correct registry and then locating the appropriate service within that registry. To locate the correct registry for every inquiry request, there must be a system that stores the mapping between registries and sub-trees of the taxonomy. This system should not be centralized, because a centralized system is a single point of failure and limits scalability, so a structured peer-to-peer network is a suitable candidate. In this paper, we do not study the query process inside each correct registry.

4.1 Using DHT Technology to Locate Correct Registry

In contrast to an unstructured peer-to-peer system, in a structured system the neighbors of a node are well defined and the data is stored at a well-defined location. For this reason, a structured system provides guarantees on finding existing data and bounds the data lookup cost in terms of the number of overlay hops.
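The idea of locating a registry through the root of its taxonomy sub-tree can be illustrated with a toy hash ring. The sketch rests on several assumptions that are not in the paper: categories are written as slash-separated paths, the responsible sub-tree is found by longest-prefix match against the declared roots, keys are SHA-1 values, and a key belongs to the node with the largest ID not exceeding it (wrapping around to the largest ID for keys below every node ID), whereas Chord itself assigns a key to its successor node. Routing between nodes, which is where Chord obtains its O(log N) hop bound, is omitted entirely; the ring is held in one process.

```python
import bisect
import hashlib


def hash_key(text):
    """Hash a string into the 160-bit SHA-1 key space."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def subtree_root(category, roots):
    """Longest declared root that is a prefix of a path like 'travel/hotel/x'."""
    parts = category.split("/")
    for end in range(len(parts), 0, -1):
        candidate = "/".join(parts[:end])
        if candidate in roots:
            return candidate
    return None


class Ring:
    """Toy DHT held in one process; no overlay routing is simulated."""

    def __init__(self, node_ids):
        self.ids = sorted(node_ids)
        self.store = {node_id: {} for node_id in self.ids}

    def responsible_node(self, key):
        # Largest node ID not exceeding key. For keys smaller than every
        # node ID the index is -1, which wraps to the largest node ID.
        return self.ids[bisect.bisect_right(self.ids, key) - 1]

    def put(self, root, registry_url):
        key = hash_key(root)
        node = self.responsible_node(key)
        self.store[node].setdefault(key, []).append(registry_url)

    def lookup(self, root):
        key = hash_key(root)
        return list(self.store[self.responsible_node(key)].get(key, []))


def locate_registries(ring, category, roots):
    """Map a category to its sub-tree root, then look the root up in the ring."""
    root = subtree_root(category, roots)
    return [] if root is None else ring.lookup(root)
```

Because publication and inquiry messages both reduce their category to the same root and the same hash function, they arrive at the same node, which is what makes direct routing to the responsible registries possible.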
Among the structured peer-to-peer protocols and systems, some, such as Chord and CAN, implement a distributed hash table (DHT) using different data structures. A DHT is a hash table whose entries are distributed among different peers located in arbitrary locations. Each data item is hashed to a unique numeric key. Each node is also hashed to a unique ID in the same key space. Each node is responsible for the keys that fall into a given numeric range. A key is mapped to a node whose ID is the largest number which does not exceed that key. Chord achieves O(log N) routing efficiency at the cost of O(log N) routing state per node, where N is the total number of nodes in the system.

Traditional structured peer-to-peer systems support keyword-based queries but do not support complex queries such as multiple-attribute queries and range queries. The UDDI programmer's specification, on the other hand, declares support for complex queries in its major inquiry interfaces, so a traditional structured peer-to-peer system cannot be used directly to construct a distributed network of registries. A single-keyword query mechanism is nevertheless enough to locate the correct registry, in the following steps.

First, we implement a structured peer-to-peer system using the Chord protocol. Every registry must declare its range of responsibility by associating itself with one sub-tree of the taxonomy. The value of the root node of that sub-tree is hashed to a unique numeric key. The registry must publish itself to the Chord system through a put(key, object) operation before it can be retrieved by service providers and service requesters; the object can be any data structure that includes the access point of the registry.

Second, if a service provider has no prior knowledge of the distribution and responsibility ranges of the registries, the service publication message it generates may be sent to a random registry.
After receiving a publication message, the registry obtains the category information, extracts the root node of the corresponding sub-tree of the taxonomy, and hashes the value of that root node to a unique numeric key with the same hash function. The registry can then locate the correct registries responsible for storing the publication data through a lookup(key) operation and forward the publication message to one of the candidates. We do not study in this paper how to select one of the correct registries, although this choice has an important influence on the distribution of data among registries.

Third, a service requester that has no prior knowledge of the distribution and responsibility ranges of the registries may send its service inquiry message to a random registry. After receiving the inquiry message, the registry likewise extracts the root node of the corresponding sub-tree of the taxonomy from the message and generates the hash key of the root node value with the same hash function. The correct registries that store the relevant data are obtained through a lookup(key) operation; the registry then forwards the inquiry message to all candidate registries and merges the responses from the different registries into the final response to the service requester.

5.0 Prototype of UP2P4UDDI

First, we combined the browser/server (B/S) and RPC models to implement a standard registry that conforms to the UDDI specification. Within the B/S framework, we employ JSP as the presentation technology, Java and Beans to implement the application logic, and JDBC with an RDBMS as the database technology. The Simple Object Access Protocol (SOAP) is used to realize the RPC model. Any kind of application can send a service inquiry or publication message, encapsulated as a SOAP message by a SOAP toolkit, to the access point of the registry, where a Servlet parses the SOAP message and activates the related Beans to execute the application logic. The application then obtains, parses and uses the SOAP response message produced by the Servlet. The framework of our registry is illustrated in Figure 1.

Second, we implemented a configurable unstructured peer-to-peer network for standard registries. We realized the random walker algorithm to support forwarding-based search, and we guarantee the usability of responses by using monitoring information from the grid monitoring service provided by the Spatial Information Grid [19]. We also append monitoring information to the response message of the getbusinessdetail interface.

[Figure 1. The framework of our standard registry. Diagram labels: HTML Browser, Application, HTTP(S), SOAP, Web Container, JSP, Servlet, Bean, JDBC, DBMS.]

6.0 Conclusion

In this paper, we enhanced the UDDI specification so that a registry can guarantee the usability of its responses and append grid-like monitoring information about the service hosting environment to them, which allows a requester to rank and select web services according to its own metrics after receiving a response.
The modifications to the standard UDDI specification do not affect interoperability. We also presented two distributed, scalable approaches that overcome the disadvantages of traditional registries: an unstructured peer-to-peer network for fully autonomous registries and a structured peer-to-peer network for cooperative registries. Both peer-to-peer networks support complex queries without affecting interoperability, and they are suitable candidates for extending traditional web service registries. As part of the Spatial Information Grid project, we implemented a prototype of the unstructured peer-to-peer network for registries and have deployed it on the China National Geology Grid [20].

In future work, we will realize the structured peer-to-peer network for our standard registry and experiment with other forwarding-based search algorithms on our prototype of the unstructured peer-to-peer network for registries. Realizing semantic matching in a future implementation of the structured peer-to-peer network for registries may be another interesting direction.

7.0 References

[1] Ian T. Foster, Carl Kesselman, Jeffrey M. Nick et al. Grid services for distributed system integration. IEEE Computer, 2002, 35(6).
[2] Karl Czajkowski, Donald F. Ferguson, Ian Foster et al. The WS-Resource Framework. Global Grid Forum. March.
[3] The Evolution of UDDI.
[4] Gnutella RFC.
[5] I. Clarke, O. Sandberg, B. Wiley, T. W. Hong. Freenet: A distributed anonymous information storage and retrieval system. In: Proc. of ICSI Workshop on Design Issues in Anonymity and Unobservability.
[6] S. Ratnasamy, P. Francis, M. Handley, R. M. Karp. A scalable content-addressable network. In: Proc. of ACM SIGCOMM.
[7] I. Stoica, R. Morris, D. Karger et al. Chord: A scalable peer-to-peer lookup service for Internet applications. In: Proc. of ACM SIGCOMM.
[8] Farnoush Banaei Kashani, Ching-Chien Chen et al. WSPDS: Web Services Peer-to-Peer Discovery Service. In: Proc. of International Conference on Internet Computing.
[9] Min Cai, Martin Frank, Jinbo Chen et al. MAAN: A Multi-Attribute Addressable Network for Grid Information Services. In: Proc. of GRID 2003.
[10] Cristina Schmidt, Manish Parashar. A Peer-to-Peer Approach to Web Service Discovery. World Wide Web, 2004, 7(2).
[11] Kunal Verma, Kaarthik Sivashanmugam, Amit Sheth et al. METEOR-S WSDI: A Scalable Infrastructure of Registries for Semantic Publication and Discovery of Web Services. Journal of Information Technology and Management, Special Issue on Universal Global Integration, 2005, 6(1).
[12] F. M. Cuenca-Acuna, C. Peery, R. P. Martin, T. D. Nguyen. PlanetP: using gossiping to build content addressable peer-to-peer information sharing communities. In: Proc. of the 12th IEEE International Symposium on High Performance Distributed Computing (HPDC 03).
[13] B. Tierney, R. Aydt, D. Gunter et al. Grid Monitoring Architecture. GGF Performance Working Group.
[14] Rob Byrom, Brian A. Coghlan, Andrew W. Cooke et al. Relational Grid Monitoring Architecture. CoRR cs.DC, 2003.
[15] X. Li, J. Wu. Searching Techniques in Peer-to-Peer Networks. Accepted to appear in Handbook of Theoretical and Algorithmic Aspects of Ad Hoc, Sensor, and Peer-to-Peer Networks, J. Wu (ed.), CRC Press.
[16] B. Yang, H. Garcia-Molina. Improving search in peer-to-peer networks. In: Proc. of the 22nd IEEE International Conference on Distributed Computing.
[17] Q. Lv, P. Cao, E. Cohen et al. Search and replication in unstructured peer-to-peer networks. In: Proc. of the 16th ACM International Conference on Supercomputing.
[18] A. Crespo, H. Garcia-Molina. Routing indices for peer-to-peer systems. In: Proc. of the 22nd International Conference on Distributed Computing.
[19] Deke Guo, Honghui Chen, Xueshan Luo. Resource information management of spatial information grid. In: LNCS 3032, Springer, 2003.
[20] Yu Tang, Kaitao He, Nong Xiao et al. Study on system framework and key issues of national geological application grid. Journal of Computer Research and Development, 2003, 40(12) (in Chinese).