An IDL for Web Services: Interface definitions are needed to allow clients to communicate with web services. They need to be provided as part of a more general web service description.
Web Service Descriptions: A service description specifies (1) how the messages are to be communicated and (2) the URI of the service. Service descriptions are written in XML, so they are accessible from any programming technology. The description forms the basis of an agreement between a client and a server. The IDL is generally used to generate client stubs, which automatically implement the correct behaviour for the client.
WSDL: The Web Services Description Language (WSDL) is commonly used for service descriptions. WSDL 2.0, originally circulated as a W3C working draft, became a W3C Recommendation in 2007. It defines an XML schema for representing the components of a service description. Included are standards for element name definitions, types, messages, interfaces, bindings and services.
WSDL Requests and Replies:
<message name="ShapeList_newShape">
  <part name="graphicalobject_1" type="ns:graphicalobject"/>
</message>
<message name="ShapeList_newShapeResponse">
  <part name="result" type="xsd:int"/>
</message>
Key: tns = target namespace; xsd = XML schema definitions.
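As a rough illustration of how a tool might read the request and reply messages shown above, the following Python sketch extracts each message name and its parts. The enclosing `definitions` element, the `http://example.org/...` namespace URI and the helper name `list_messages` are assumptions made for this example; a real WSDL document also places its elements in the WSDL namespace, which is omitted here to keep the sketch short.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment wrapping the two messages from the slide.
# The namespace URI for "ns" is invented for illustration only.
WSDL_FRAGMENT = """
<definitions xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:ns="http://example.org/shapelist/types">
  <message name="ShapeList_newShape">
    <part name="graphicalobject_1" type="ns:graphicalobject"/>
  </message>
  <message name="ShapeList_newShapeResponse">
    <part name="result" type="xsd:int"/>
  </message>
</definitions>
"""


def list_messages(wsdl_text):
    """Map each message name to a list of (part name, part type) pairs."""
    root = ET.fromstring(wsdl_text)
    messages = {}
    for message in root.iter("message"):
        parts = [(p.get("name"), p.get("type")) for p in message.findall("part")]
        messages[message.get("name")] = parts
    return messages


messages = list_messages(WSDL_FRAGMENT)
for name, parts in messages.items():
    print(name, parts)
```

A stub generator would go one step further and turn each request/response pair into a callable method, but the lookup of names and types is the same.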
More on WSDL: The abstract part of the description includes a set of definitions of the types used by the service, in particular the types of the values exchanged in messages. The concrete part specifies how and where the service may be contacted. The inherent modularity of a WSDL definition allows its components to be combined in different ways: the same interface may be used with different bindings or locations.
Web Service Clients and Servers: All that the client and the server need is a common idea about the messages to be exchanged. When a client sends a message to a web service, the service decides what operation to perform and what type of message to send back to the client on the basis of the type of the message it received.
WSDL Interfaces: The collection of operations belonging to a web service is grouped together in an XML element named "interface". Each operation must specify the message exchange pattern between the client and server.
Message Exchange Patterns:
Name            | Client sends | Server sends | Delivery   | Fault message
In-Out          | Request      | Reply        |            | may replace Reply
In-Only         | Request      |              |            | no fault message
Robust In-Only  | Request      |              | guaranteed | may be sent
Out-In          | Reply        | Request      |            | may replace Reply
Out-Only        |              | Request      |            | no fault message
Robust Out-Only |              | Request      | guaranteed | may send fault
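The patterns listed above can be captured as plain data, which makes it easy to ask questions such as "which patterns are one-way?". This is only a sketch: the class name `ExchangePattern`, its field names and the helper `is_one_way` are assumptions chosen for the example and are not part of WSDL.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExchangePattern:
    name: str
    client_sends: Optional[str]   # None when the client sends nothing
    server_sends: Optional[str]   # None when the server sends nothing
    guaranteed_delivery: bool
    fault_rule: str


# One entry per row of the message exchange pattern table.
PATTERNS = {
    p.name: p
    for p in (
        ExchangePattern("In-Out", "Request", "Reply", False, "may replace Reply"),
        ExchangePattern("In-Only", "Request", None, False, "no fault message"),
        ExchangePattern("Robust In-Only", "Request", None, True, "may be sent"),
        ExchangePattern("Out-In", "Reply", "Request", False, "may replace Reply"),
        ExchangePattern("Out-Only", None, "Request", False, "no fault message"),
        ExchangePattern("Robust Out-Only", None, "Request", True, "may send fault"),
    )
}


def is_one_way(pattern):
    """A pattern is one-way when exactly one side sends a message."""
    return (pattern.client_sends is None) != (pattern.server_sends is None)


one_way_names = sorted(name for name, p in PATTERNS.items() if is_one_way(p))
print(one_way_names)
```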
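WSDL Inheritance: Any WSDL interface may extend one or more other WSDL interfaces. This is a simple form of inheritance. However, recursive definitions of interfaces are NOT allowed, so an interface may not extend itself, either directly or through a chain of other interfaces.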
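The ban on recursive interface definitions amounts to requiring that the "extends" relation contains no cycle. A validator could check this with a depth-first search, as in the sketch below. The function name `find_extension_cycle` and the dictionary representation (interface name mapped to the names it extends) are assumptions for illustration, not something WSDL tooling prescribes.

```python
def find_extension_cycle(extends):
    """Return a list of interface names forming a cycle, or None if there is none.

    `extends` maps each interface name to the names of the interfaces it extends.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    state = {name: WHITE for name in extends}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for parent in extends.get(name, ()):
            if state.get(parent, WHITE) == GREY:
                # parent is already on the current path: a recursive definition
                return path[path.index(parent):] + [parent]
            if state.get(parent, WHITE) == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        state[name] = BLACK
        return None

    for name in list(extends):
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


valid = {"ShapeList": ["Collection"], "Collection": []}
invalid = {"A": ["B"], "B": ["A"]}
print(find_extension_cycle(valid))
print(find_extension_cycle(invalid))
```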
WSDL Concrete Part: The concrete part of a WSDL document consists of the binding and the service, that is, the choice of protocols and the choice of endpoint or server address. When you see "binding", think protocol. When you see "service", think endpoint or address.
Binding and Services: The binding section in a WSDL document says which message formats and which form of external data representation are to be used. Web services frequently use SOAP, HTTP and MIME. Each service element in a WSDL document specifies the name of the service and one or more endpoints (or ports) where an instance of the service may be contacted.
WSDL Use: Complete WSDL documents can be accessed via their URLs by clients and servers, either directly or indirectly via a directory service. Tools are available for generating WSDL definitions from information provided via a graphical user interface, removing the need for users to be involved in the complex details and structure of WSDL. WSDL definitions can also be generated from interface definitions written in other languages.
Web Service Directory Services: The Universal Description, Discovery and Integration service (UDDI) provides both a name service and a directory service. WSDL service descriptions may be looked up by name (white pages) or by attribute (yellow pages). They may also be accessed directly via their URLs, which is convenient for developers who are designing client programs that use the service.
UDDI Lookup: UDDI provides an API for looking up services based on two sets of query operations. The get_xxx set of operations retrieves an entity based on a key value. The find_xxx set of operations retrieves a set of entities that match a set of search criteria. UDDI also provides a notify/subscribe interface by which clients register interest in a particular set of entities in a UDDI registry and have change notification messages sent to them, synchronously or asynchronously.
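The difference between key-based get operations, criteria-based find operations and notify/subscribe can be seen in a toy in-memory registry. This is not the UDDI API: `ToyRegistry`, `ServiceEntry`, `get_service`, `find_services` and `subscribe` are hypothetical names standing in for the get_xxx and find_xxx families, and the URLs are placeholders.

```python
from dataclasses import dataclass


@dataclass
class ServiceEntry:
    key: str
    name: str
    category: str
    wsdl_url: str


class ToyRegistry:
    """In-memory stand-in for a registry; all names here are hypothetical."""

    def __init__(self):
        self._entries = {}
        self._subscribers = []

    @staticmethod
    def _matches(entry, criteria):
        return all(getattr(entry, k) == v for k, v in criteria.items())

    def publish(self, entry):
        self._entries[entry.key] = entry
        # Notify every subscriber whose criteria match the changed entry.
        for criteria, callback in self._subscribers:
            if self._matches(entry, criteria):
                callback(entry)

    def get_service(self, key):
        """get_xxx style: retrieve exactly one entity by its key."""
        return self._entries[key]

    def find_services(self, **criteria):
        """find_xxx style: retrieve all entities matching the criteria."""
        return [e for e in self._entries.values() if self._matches(e, criteria)]

    def subscribe(self, callback, **criteria):
        self._subscribers.append((criteria, callback))


registry = ToyRegistry()
notifications = []
registry.subscribe(notifications.append, category="travel")
registry.publish(ServiceEntry("k1", "FlightBooking", "travel", "http://example.org/flights?wsdl"))
registry.publish(ServiceEntry("k2", "ShapeList", "graphics", "http://example.org/shapes?wsdl"))

print(registry.get_service("k1").wsdl_url)
print([e.name for e in registry.find_services(category="travel")])
```

Here notification is synchronous (the callback runs inside `publish`); an asynchronous variant would queue the change and deliver it later.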
Coordination of Web Services: Many useful web service applications involve several requests that need to be carried out in a particular order. There is also a need for client web services to be provided with a description of the particular protocol to follow when interacting with other web services.
[Figure: The Travel Agent Service. Elements shown: Client; Travel Agent Service; flight booking a, flight booking b; hire car booking a, hire car booking b; hotel booking a, hotel booking b.]
The Travel Agent Scenario:
1. The client asks the travel agent service for information about a set of services; for example, flights, car hire and hotel bookings.
2. The travel agent service collects prices and availability information and sends it to the client, which chooses one of the following on behalf of the user:
   (a) refine the query, possibly involving more providers to get more information, then repeat step 2;
   (b) make reservations;
   (c) quit.
3. The client requests a reservation and the travel agent service checks availability.
4. Either all services are available; or, for services that are not available, either alternatives are offered to the client, who goes back to step 3, or the client goes back to step 1.
5. The travel agent service takes a deposit.
6. The client is given a reservation number as confirmation.
7. During the period until the final payment, the client may modify or cancel reservations.
Achieving Coordination: Distributed transaction technologies play a big part here! The W3C (and others) are working towards the definition of higher-level services. Notable work in this area includes WS-Coordination.
Web Service Choreography: The W3C uses the term "choreography" to refer to a language based on WSDL for defining coordination. It is intended to support interactions between web services that are generally managed by different companies and organizations. A choreography is described in terms of the sets of observable interactions between pairs of web services, which form the basis of the contract between the participants. The use of a common choreography description by a set of collaborating web services should result in more robust services with better interoperability. The W3C has released early draft standards that describe CDL, the Choreography Definition Language.
Securing Web Services: As web services are based (almost exclusively) on XML, security comes down to protecting the text-based XML communications. Documents shared over the Internet also need to be authenticated. The W3C has developed XML security technologies that support signing, key management and encryption. Another approach is WS-Security, which concerns itself with applying message integrity, message confidentiality and single-message authentication to SOAP. XML security depends on new tags that indicate the beginning and end of sections of encrypted or signed data and of signatures.
XML Security Requirements:
To be able to encrypt either an entire document or just some selected parts of it.
To be able to sign either an entire document or just some selected parts of it.
To add to a document that is already signed and to sign the result.
To add to a document that already contains encrypted sections and to encrypt part of the new version, possibly including some of the already encrypted sections.
To authorize various users to view different parts of a document.
Requirements: Algorithms and Keys: The standard should specify a suite of algorithms to be provided in any implementation of XML security. The algorithms used for encryption and authentication of a particular document must be selected from that suite, and the names of the algorithms used must be referenced within the XML document itself. Appropriate keys must be chosen without any negotiation with the parties that may access the document in the future. The standard should also help the users of secure documents to find the necessary keys and make it possible for cooperating users to help one another with keys.
Canonical XML: Canonical XML is designed for use with digital signatures, which are used to guarantee that the information content of a document has not been changed. Two documents with the same information content can differ in serialization (attribute order, quoting, empty-element syntax), so XML elements are canonicalized before being signed, and the name of the canonicalization algorithm is stored with the signature. This enables the same algorithm to be used when the signature is validated upon receipt.
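The effect of canonicalization on a digest can be demonstrated with Python's standard library. This sketch uses `xml.etree.ElementTree.canonicalize` (available from Python 3.8), which implements the C14N 2.0 variant of Canonical XML; a given XML Signature profile may name a different canonicalization algorithm. The two sample documents are invented for the example.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # Python 3.8+, C14N 2.0

# Same information content, different serialization:
# attribute order, quote style, spacing and empty-element syntax all differ.
doc_a = '<order id="7" currency="EUR"><item/></order>'
doc_b = "<order currency='EUR'   id='7'><item></item></order>"

canonical_a = canonicalize(doc_a)
canonical_b = canonicalize(doc_b)


def sha1_hex(text):
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Digests of the raw text disagree; digests of the canonical forms agree.
print(sha1_hex(doc_a) == sha1_hex(doc_b))
print(sha1_hex(canonical_a) == sha1_hex(canonical_b))
```

Signing the canonical form rather than the raw text is what lets a receiver re-serialize or re-parse the document and still validate the signature.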
XML Signature Algorithms:
Type of algorithm         | Name of algorithm | Required    | Reference
Message digest            | SHA-1             | Required    | Section 7.4.3
Encoding                  | base64            | Required    | [Freed and Borenstein 1996]
Signature (asymmetric)    | DSA with SHA-1    | Required    | [NIST 1994]
                          | RSA with SHA-1    | Recommended | Section 7.3.2
MAC signature (symmetric) | HMAC-SHA-1        | Required    | Section 7.4.2 and Krawczyk et al. [1997]
Canonicalization          | Canonical XML     | Required    | Page 810
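To make the symmetric "MAC signature" row concrete, the sketch below computes an HMAC-SHA-1 value over some already canonicalized bytes and encodes it in base64, the two algorithms named for that purpose in the table. The function names `mac_sign` and `mac_verify`, the key and the sample content are assumptions for illustration; a real XML Signature wraps the value in additional XML structure that is not shown here.

```python
import base64
import hashlib
import hmac


def mac_sign(shared_key: bytes, canonical_bytes: bytes) -> str:
    """Return the base64-encoded HMAC-SHA-1 of the canonicalized content."""
    mac = hmac.new(shared_key, canonical_bytes, hashlib.sha1).digest()
    return base64.b64encode(mac).decode("ascii")


def mac_verify(shared_key: bytes, canonical_bytes: bytes, signature_b64: str) -> bool:
    """Recompute the MAC and compare it in constant time."""
    expected = mac_sign(shared_key, canonical_bytes)
    return hmac.compare_digest(expected, signature_b64)


key = b"hypothetical shared secret"
content = b'<order currency="EUR" id="7"><item></item></order>'

signature = mac_sign(key, content)
print(mac_verify(key, content, signature))                          # content unchanged
print(mac_verify(key, content.replace(b"7", b"8"), signature))      # content tampered with
```

Because the key is shared, this proves only that the message came from someone holding the key; the asymmetric rows of the table (DSA, RSA) are needed when the signer must be identified individually.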
XML Encryption Algorithms:
Type of algorithm                            | Name of algorithm                                   | Required | Reference
Block cipher                                 | TRIPLEDES, AES-128, AES-256                         | required | Section 7.3.1
                                             | AES-192                                             | optional |
Encoding                                     | base64                                              | required | [Freed and Borenstein 1996]
Key transport                                | RSA-v1.5, RSA-OAEP                                  | required | Section 7.3.2 [Kaliski and Staddon 1998]
Symmetric key wrap (signature by shared key) | TRIPLEDES KeyWrap, AES-128 KeyWrap, AES-256 KeyWrap | required | [Housley 2002]
                                             | AES-192 KeyWrap                                     | optional |
Key agreement                                | Diffie-Hellman                                      | optional | [Rescorla, 1999]
A Web Services Case Study: Using web services technology as the basis for building multi-user, geographically dispersed Grid services.
What is a Grid? The name "grid" is used to refer to middleware that is designed to enable the sharing of resources on a very large scale. Grids are used by groups of users in different organizations who collaborate on problems that require large numbers of computers, either by sharing data or by sharing computing power. Grids typically run on heterogeneous computer hardware (and often have no choice). Management is needed to coordinate the use of resources, to ensure that clients get what they need and that services can afford to supply it. Sophisticated security techniques are required to ensure that proper use is made of resources in this type of environment.
A Grid Example, WWT: The World Wide Telescope (WWT) deploys the resources shared by the astronomy community. Astronomy data consists of archives of observations, each of which covers a particular period of time, a part of the electromagnetic spectrum (optical, x-ray, radio) and a particular area of the sky. Astronomers freely share their results with one another, so security tends not to be an issue: everyone trusts everyone else. Immense archives of data are gathered by many teams and managed locally. The amount of data grows exponentially, and the data is used by astronomers from all over the world.
WWT Requirements and Aim: Any project that manages an astronomy data archive must make it accessible to other researchers. This task implies considerable overhead in addition to the original task of data acquisition, analysis and storage. Additionally, derived data needs to be accompanied by metadata describing the parameters of the pipelines (processes) through which it was sent; producing this can amount to a considerable expense for the project that owns the original data. Overall aim: to unify the world's astronomy archives into a giant database containing astronomy literature, images, raw data, derived datasets and simulation data.
Data-Intensive Scientific Applications: Data is collected by way of scientific instruments. The data is archived at different sites, which can be located anywhere in the world. Data is managed by teams of scientists from many different organizations. A large amount of raw data (terabytes and petabytes) is captured. Computer programs are used to analyze and summarize the raw data (classify, calibrate, catalog).
Science and the Internet: The nature of the Internet makes all "open" digital archives available to any scientist anywhere in the world. Typically, a scientist is only interested in a small subset of the entire archive. It is, therefore, often infeasible (or plain stupid) to transfer the entire archive to the remote user. Note: FTP and web access are not the way to go here! In-place processing of the archive needs to be provided to the remote user/scientist. The fact that data is processed at many different sites provides an inbuilt parallelism that effectively divides the immense task being undertaken (which has echoes of P2P technologies).
Application Requirements:
R1: remote access to local resources needs to be provided.
R2: processing of data needs to occur at the site that stores/manages the archive (to minimize data transfers).
R3: the local resource manager needs to be able to dynamically create service instances based on a subset of the data.
R4: metadata is required to describe (1) the characteristics of the data in the archive and (2) the characteristics of the service managing that data.
R5: a discovery service is needed to provide access to the above metadata.
R6: management software is required to manage queries, handle data transfers and perform advance reservation of resources.
Requirements and Services: Standard web services can provide for the first two requirements, R1 and R2. Grid middleware can provide for the four remaining requirements, R3 to R6. Grids are also used for computationally intensive applications such as image analysis. Grid resource management is concerned with allocating computing resources and balancing loads. Even when privacy is not an issue, the ability to establish the identity of the people who created the data may be important within a grid environment.
The Open Grid Services Architecture: OGSA is a standard framework for grid-based applications. OGSA is built on top of a set of web services. The Globus Toolkit implements the architecture.
The OGSA (layers, top to bottom):
Application-specific grid services, e.g. astronomy, biomedical informatics, high-energy physics (reached through application-specific interfaces).
OGSA services: directory, management, security.
OGSI services: naming, service data (metadata), service creation and deletion, fault model, service groups (reached through standard grid service interfaces, e.g. GridService, Factory).
Web services.
Application-Level Grid Services: These are web services that implement standard grid service interfaces. Each service offers an interface to a set of data (the "service data") that contains metadata about the service. The context in which a service runs must provide a factory with the ability to create new service instances and to stop them when they are done.
OGSA Services Layer (OGSA = Open Grid Services Architecture): A directory service. A management service. A security service.
OGSI Services Layer (OGSI = Open Grid Services Infrastructure):
The implementation of a scheme for the naming of service instances.
The definition of standard service data elements to be implemented, together with their set and get operations.
The interface to the factory for creating new service instances, and operations to end them.
A fault model for all grid services to use.
Notification services supporting publishers/subscribers.
Service groups: adding/removing members, etc.
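The naming, service data and factory items above fit together roughly as in the following sketch. It is a toy model only: `Factory`, `ServiceInstance`, their method names and the `http://example.org/...` handle format are assumptions made for illustration and do not reproduce the OGSI interface definitions.

```python
import itertools


class ServiceInstance:
    """A service instance with a unique handle and named service data elements."""

    def __init__(self, handle, service_data):
        self.handle = handle
        self._service_data = dict(service_data)
        self.active = True

    def get_service_data(self, name):
        return self._service_data[name]

    def set_service_data(self, name, value):
        self._service_data[name] = value

    def destroy(self):
        self.active = False


class Factory:
    """Creates uniquely named instances on demand and ends them when asked."""

    def __init__(self, base_uri):
        self._base_uri = base_uri.rstrip("/")
        self._counter = itertools.count(1)
        self._instances = {}

    def create_service(self, **service_data):
        handle = f"{self._base_uri}/instance-{next(self._counter)}"
        instance = ServiceInstance(handle, service_data)
        self._instances[handle] = instance
        return instance

    def destroy_service(self, handle):
        self._instances.pop(handle).destroy()

    def live_handles(self):
        return sorted(self._instances)


factory = Factory("http://example.org/archive/")
a = factory.create_service(region="M31", band="optical")
b = factory.create_service(region="M42", band="radio")
factory.destroy_service(a.handle)
print(factory.live_handles())
```

In the astronomy setting described earlier, each instance could correspond to a query over a subset of an archive, created when a remote scientist needs it and ended when the work is done.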
Example Grid Applications:
Description of the project | Reference
1. Aircraft engine maintenance using fault histories and sensors for predictive diagnostics | www.cs.york.ac.uk/dame
2. Telepresence for predicting the effects of earthquakes on buildings, using simulations and test sites | www.neesgrid.org
3. Biomedical informatics network providing researchers with access to experiments and visualizations of results | nbcr.sdsc.edu
4. Analysis of data from the CMS high-energy particle detector at CERN by physicists worldwide over 15 years | www.uscms.org
5. Testing candidate drug molecules for their effect on the activity of a protein, by performing parallel computations using idle desktop computers | [Taufer et al. 2003] [Chien 2004]
6. Use of the Sun Grid Engine to enhance aerial photographs by using spare capacity on a cluster of web servers | www.globexplorer.com
7. The Butterfly Grid supports multiplayer games for very large numbers of players on the Internet over the Globus Toolkit | www.butterfly.net
8. The Access Grid supports the needs of small-group collaboration, for example by providing shared workspaces | www.accessgrid.org
The Globus Toolkit: The toolkit provides software that integrates and standardizes the functions required by a family of scientific applications built on grids. The functions include directory services, security and resource management. Aside: the OGSA standards evolved from version 2 of Globus. Globus 3 was released in 2002 and is developed and maintained by the Globus Alliance (www.globus.org).
Globus in Action: Grid service instances and factories are deployed in a runtime environment called a grid service "container". Containers deal with the dynamic creation and management of service instances with global names; simple access to the state of service instances; and the security of the service, including looking after credentials, signing of messages, encryption and authorization.
Globus Security Services: Security is based on the protection of SOAP messages. WS-Security is used, as are XML Signature and XML Encryption. X.509 certificates are used to provide a credentials service.
Web Services Summary: Web services provide a programmer's API to Internet-based applications and data (this is NOT web browsing!). They are not necessarily part of the web browser/web server paradigm. SOAP and REST are two of the popular enabling technologies, with XML as a core technology. Security is an important issue. Grid services are an excellent example of using web services as an infrastructural component.