AN APPROACH TO DEVELOPING BUSINESS PROCESSES WITH WEB SERVICES IN GRID

R. D. Goranova 1, V. T. Dimitrov 2

Faculty of Mathematics and Informatics, University of Sofia St. Kliment Ohridski, 1164, Sofia, Bulgaria

1 radoslava@fmi.uni-sofia.bg, 2 cht@fmi.uni-sofia.bg

In this approach, the site accounting functionality of the g-lite Grid middleware is exposed as Web services. At the core of the approach, the Web services are registered in IBM WebSphere Service Registry and Repository, which supports UDDI. Business processes are described and developed in WebSphere Business Modeler and WebSphere Integration Developer. The business process orchestrator, WebSphere Process Server, is outside the Grid environment, but can manage processes composed of Web services from the middleware.

Introduction

g-lite [1] is a Grid middleware designed and implemented for the EGEE [2] Grid infrastructure and tailored specifically to the needs of that project. The middleware provides Grid services for resource brokering, job computing and data storage. Fig. 1 shows the g-lite services, grouped logically into five groups.

Fig. 1: g-lite Services

Security services provide mechanisms for user or service identification, authorization, verification of user permissions and logging of information for auditing. The monitoring and information services provide mechanisms for task monitoring, resource discovery and retrieval of service information. The job management services include the computational resource for job submission and execution, the task scheduler, the job tracking system and accounting information. The data services include the storage element for access to storage resources, the file catalog and the file transfer services.

The middleware provides services for collecting accounting information. This information includes data on the number of submitted jobs, on the users who submit the jobs and on the virtual organizations (VOs) to which these users belong.

Most of the services that the g-lite middleware provides are not service-oriented. Service orientation requires the implementation of architectural principles [3] such as contract, abstraction, reusability, composability and discoverability. For clarity, we describe them below:

- The contract is a document describing how a service can be accessed programmatically;
- Abstraction means that services expose only the logic defined in the service contract and hide the implementation from the client;
- Reusability guarantees that a service can be used more than once and by multiple clients;
- Composability allows services to be grouped into composite services and executed as processes;
- Discoverability provides a standard mechanism for service discovery.

For example, the security services that g-lite provides have no standard description. The middleware does not provide a composition service. The discovery mechanisms that the middleware provides are not standard. There is an information system in which information about the available services can be found, but it holds no information about how these services can be invoked. Another disadvantage is the lack of a centralized registry in which all services can be published.
However, as defined in SOA [4], standard service description, service discovery, reusability and composition are the main features of a service-oriented environment.

As mentioned in the beginning, g-lite is a Grid middleware tailored specifically to the needs of the EGEE project. The aim of this project is to develop and deploy a new Grid infrastructure for scientific research. The project has two priority scientific directions: serving the needs of biomedical experiments and the needs of High Energy Physics (HEP). The latter is an area with complex business processes. By business process, we mean a set of services ordered in a common schema for execution. The processes of HEP can include not only services from the Grid infrastructure, but also components of specific software for data simulation and analysis.

The goal of our research is to outline an approach for developing business processes with Web services in a Grid environment. More precisely, we are interested in business processes in HEP and their execution in the g-lite middleware. Security issues are not a subject of our research.

IBM SOA Foundation

IBM SOA Foundation [5] encompasses tools, a programming model, methodologies and techniques for capturing and implementing business design, together with the middleware infrastructure for hosting that implementation. The SOA Foundation is a comprehensive set of technologies and practices that address all SOA features, such as flexibility, dynamicity, easier integration and reuse. Flexibility allows the business process to be changed without major effort after its deployment. Dynamicity at runtime allows a service in the process to be replaced with another service, implemented with a different technology or programming language, or in a different runtime environment. Service reuse means that services can be used in other applications.

The SOA approach breaks down the underlying software and information technology into reusable components (services).
These services can be combined and recombined into complex processes. SOA allows the services to talk to each other using open standards.

The SOA life cycle [6] includes four phases (Fig. 2). They can be summarized as follows:
- Model: During the modeling phase, requirements are gathered and processes are designed. An IBM SOA Foundation tool for business analysis, modeling and simulation of business processes is IBM WebSphere Business Modeler;
- Assemble: During the assemble phase, processes are developed, assembled and tested in an integration environment. This environment provides services for transport and mediation, and control capabilities for flow management and service interactions. An IBM SOA Foundation tool for modeling workflows, data and system interactions is IBM WebSphere Integration Developer;
- Deploy: During the deployment phase, people, processes and information are integrated;
- Manage: During the management phase, applications and services are monitored. An IBM SOA Foundation tool for business monitoring is IBM WebSphere Business Monitor.

Fig. 2: SOA lifecycle

Governance and best practices support the life cycle through information technology alignment and process control. The service registry that allows users to manage the SOA life cycle from development through deployment is IBM WebSphere Service Registry and Repository.

Approach Specificity and Implementation

The approach we propose is based on the programming model and methodologies defined in IBM SOA Foundation. It is service-oriented and consists of the following steps:

- Web service development;
- Web service registration;
- Process modeling;
- Process assembling, deployment and testing.

To realize the approach, we use WebSphere Business Modeler to define the process, WebSphere Integration Developer for application assembly, the WebSphere Process Server built into WebSphere Integration Developer for deployment and testing, and WebSphere Service Registry and Repository for governance, service metadata and reuse.

Web service development

Business process modeling is not possible without services, which are the main components of a process. As mentioned in the beginning, high energy physics is an area with complex business processes. They can include not only services from the Grid infrastructure, but also components of specific software for data simulation and analysis. To achieve the goal of our research, services for HEP software have to be developed. A problem is that g-lite is not service-oriented: the environment provides neither a standard for service description nor mechanisms for service discovery. To model more complex business processes in a Grid environment, we also have to develop services for job submission, based on the services provided by the g-lite middleware.

To demonstrate the approach, we developed a site statistic service, which provides the following functionality:

- usertaskcount returns the number of jobs for a given user;
- userfailedjobs returns the number of failed jobs for a given user;
- votaskcount returns the number of jobs for a given VO;
- vofailedjobscount returns the number of failed jobs for a given VO;
- usercputime returns the CPU time used by a given user;
- vocputime returns the CPU time used by a given VO;
- sitetaskcount returns the number of jobs submitted on a site;
- sitefailedjobs returns the number of failed jobs for a site;
- sitecputime returns the CPU time used on a site;
- drawpiechart returns the URL of a pie chart image for given data.

For the web service implementation, we used the Eclipse Platform, JDK 1.6 and Axis 2. The service was deployed on a Tomcat 6 application server (Fig. 3).

Fig. 3: Accounting web service development [7]
(Diagram labels: MON BOX, Statistic Service, Tomcat 6, Log Processor, GK Log Files, host running R-GMA server, R-GMA Server, data streamed to central account server, GK & CE, R-GMA API, Publisher, PBS Log Files.)

Web service registration

For service metadata and reuse, we use WebSphere Service Registry and Repository (WSRR) [8]. WSRR is a service registry that provides a suitable interface for service definition and registration. Another feature is the WSRR plug-in for Eclipse, which allows services to be registered in the repository from within the Eclipse environment. WSRR supports UDDI and provides a web browser interface for service registration. Registered services can be browsed as shown in Fig. 4.

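The operations of the site statistic service listed under Web service development can be illustrated with a minimal Python sketch. It is not the authors' Axis 2 implementation: the JobRecord layout, the sample records and the integer return values are assumptions made only for illustration, and only a subset of the operations is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One accounting record; this layout is an assumption, not the g-lite schema."""
    user: str
    vo: str
    site: str
    failed: bool
    cpu_seconds: int


# Made-up sample records, used only for illustration.
RECORDS = [
    JobRecord("alice", "vo_a", "site_1", False, 120),
    JobRecord("alice", "vo_a", "site_1", True, 15),
    JobRecord("bob", "vo_b", "site_1", False, 300),
]


def usertaskcount(records, user):
    """Number of jobs submitted by the given user."""
    return sum(1 for r in records if r.user == user)


def userfailedjobs(records, user):
    """Number of failed jobs for the given user."""
    return sum(1 for r in records if r.user == user and r.failed)


def votaskcount(records, vo):
    """Number of jobs submitted by members of the given VO."""
    return sum(1 for r in records if r.vo == vo)


def usercputime(records, user):
    """Total CPU time, in seconds, used by the given user."""
    return sum(r.cpu_seconds for r in records if r.user == user)


def sitetaskcount(records, site):
    """Number of jobs submitted on the given site."""
    return sum(1 for r in records if r.site == site)
```

Each operation takes simple values and returns a simple value, for example usertaskcount(RECORDS, "alice") or sitetaskcount(RECORDS, "site_1"). The remaining operations (vofailedjobscount, vocputime, sitefailedjobs, sitecputime) would follow the same filtering pattern.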
Fig. 4: Web service registry and service graph

Process modeling

We developed an example process to demonstrate the approach. It should be noted that the process is modeled, assembled and deployed without writing a single line of code. There are two reasons for this:

- All services that are part of the process are developed to expose their functionality using only simple types. For example, all operations of the statistic service take simple data types (string, integer, etc.) as input and return simple data types (string);
- IBM SOA Foundation provides a good framework for business process specification, generation and execution.

For process modeling, we use WebSphere Business Modeler 7.0. In the process below (Fig. 5), we use the developed statistic service and two of its operations, votaskcount and drawpiechart. The aim of the process is to show statistics on the number of jobs that different VOs submit to a site.

Fig. 5: Example process for site statistic

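The composition in the example process can be sketched in Python as follows. This is only an illustration of the control flow, not the modeled WebSphere process: votaskcount and drawpiechart are local stubs standing in for the Web service operations, and the job counts, the date format and the chart base URL are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical per-VO job counts for one site (made-up data).
JOBS_PER_VO = {"vo_a": 40, "vo_b": 25, "vo_c": 10}

# Placeholder address; the real service returns a URL on its Tomcat server.
CHART_BASE_URL = "http://example.org/statistic/charts"


def votaskcount(vo, start, end):
    """Stub for the service operation; this sketch ignores the date range."""
    return JOBS_PER_VO.get(vo, 0)


def drawpiechart(data):
    """Stub for the service operation; builds a URL that encodes the data."""
    query = urlencode(sorted(data.items()))
    return f"{CHART_BASE_URL}?{query}"


def site_statistic_process(start, end, vo=None):
    """Collect job counts per VO, then request a pie chart for them.

    If no VO name is given, statistics are gathered for all known VOs.
    """
    vos = [vo] if vo else sorted(JOBS_PER_VO)
    counts = {name: votaskcount(name, start, end) for name in vos}
    return drawpiechart(counts)
```

Calling site_statistic_process("2010-01-01", "2010-01-31") gathers the counts for every VO and returns a single chart URL, while passing vo="vo_a" restricts the chart to that VO. In the approach itself this sequencing is expressed graphically in the modeling tool rather than in code.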
WebSphere Business Modeler can also be integrated with the WSRR plug-in, which allows services to be included in the process directly from the registry.

As input, the process gets the name of a VO and the date range for the desired period of time. If the VO name is not specified, statistics are returned for all VOs. The result is the URL of an image that is located on the Tomcat server and is accessible from the Internet.

Fig. 6 shows the development of a more complex process.

Fig. 6: Process for job submission in g-lite environment

This process is based on services for job submission into the g-lite environment. More complex service composition is possible, but it requires more services to be developed. Currently, we are working on the implementation of the following services:

- ROOT web services, which will expose legacy ROOT functionality as services;
- Job manager services, which provide functionality for job submission into the g-lite environment by exposing existing g-lite functionality;
- Proxy services, which provide functionality for proxy certificate management by exposing existing g-lite functionality.

All developed services are designed according to the principles of service orientation; they are published in the repository and can participate in more complex sequences of tasks (processes).

Process assembly, deployment and testing

The example process modeled above was assembled and deployed into Process Server. For process assembly, deployment and testing, we used WebSphere Integration Developer 7.0. Fig. 7 shows the process as it appears in the integration development environment, together with an example test and its result.

Fig. 7: Process deployment, testing and result

Conclusion

The approach we present outlines a framework for business process specification, development and execution in the g-lite Grid middleware. The advantages of this approach are dynamic flexibility and loose coupling. However, business process specification for HEP is a subject of future investigation.

Acknowledgement

This work was supported by University of Sofia SRF under Contract N163/2010.

References

[1] Programming the Grid with g-lite, http://cdsweb.cern.ch/record/936685/files/egee-tr-2006-001.pdf
[2] EGEE, http://public.eu-egee.org/
[3] Service-Oriented Architecture: Principles of Service Design, T. Erl, Prentice Hall, 2007.
[4] Service-Oriented Architecture: Concepts, Technology, and Design, T. Erl, Prentice Hall, 2005.
[5] IBM SOA Foundation: Providing what you need to get started with SOA, ftp://ftp.software.ibm.com/software/solutions/pdfs/soa_g224-7540-00_wp_final.pdf
[6] IBM SOA Foundation: An Architectural Introduction and Overview, http://download.boulder.ibm.com/ibmdl/pub/software/dw/webservices/ws-soawhitepaper.pdf
[7] Apel User Guide, http://www.egee.cesga.es/egee-sa1-swe/accounting/guides/apel-user-guide-glite.pdf
[8] IBM WebSphere Service Registry and Repository Handbook, http://www.redbooks.ibm.com/redbooks/pdfs/sg247386.pdf