Cloud Platform for VPH Applications
Marian Bubak, Piotr Nowakowski, and Marek Kasztelnik
ACC Cyfronet AGH Krakow and WP2 Team of Project
dice.cyfronet.pl/projects/ www.vph-share.eu (No 269978)
Outline
- Motivation
- Architecture
- Overview of platform modules
- Technology
- Current functionality
- Introduction to demos
A Very Short Glossary
- Virtual Machine: A self-contained operating system image, registered in the Cloud framework and capable of being managed by its mechanisms.
- Atomic Service: An application (or a component thereof) installed on a Virtual Machine and registered with the cloud management tools for deployment.
- Atomic Service Instance: A running instance of an Atomic Service, hosted in the Cloud and capable of being directly interfaced, e.g. by the workflow management tools or VPH-Share GUIs.
[Figure: Virtual Machine = raw OS; Atomic Service = OS + app. (or component) + external APIs; Atomic Service Instance = the same stack running on a cloud host]
Platform for three user groups
The goal of the platform is to manage cloud/HPC resources in support of applications by:
- Providing a mechanism for application developers to install their applications/tools/services on the available resources
- Providing a mechanism for end users (domain scientists) to execute workflows and/or standalone applications on the available resources with minimum fuss
- Providing a mechanism for end users (domain scientists) to securely manage their binary data in a hybrid cloud environment
- Providing administrative tools facilitating configuration and monitoring of the platform
User groups:
- End user support: easy access to applications and binary data
- Developer support: tools for deploying applications and registering datasets
- Admin support: management of VPH-Share hardware resources
Cloud Platform Interface:
- Manage hardware resources
- Heuristically deploy services
- Ensure access to applications
- Keep track of binary data
- Enforce common security
[Figure: applications with their data, and a generic service, hosted in a hybrid cloud environment (public and private resources)]
Cloud Platform Architecture
[Figure: architecture diagram with a legend marking the modules available in the first prototype. Labelled elements:]
- Users: Admin, Developer, Scientist
- Master UI: AS mgmt. interface; Generic AS invoker; Workflow description and execution; Security mgmt. interface; Computation UI extensions; Data mgmt. interface; Data mgmt. UI extensions; Generic data retrieval; Remote access to Atomic Svc. UIs
- Data and Compute Cloud Platform: AM Service; DRI Service; Atmosphere persistence layer (internal registry); VM templates; Available cloud infrastructure; AS images; Managed datasets; LOB federated storage access; Security framework; Cloud stack clients; HPC resource client/backend; Physical resources
- Atomic Service Instances (deployed by AMS on available resources as required by WF mgmt or generic AS invoker): Raw OS (Linux variant); Tool / App.; LOB federated storage access; Web Service cmd. wrapper; Web Service security agent; Generic VNC server; Custom AS client
Atmosphere
Core component of the cloud platform, responsible for managing cloud resources and deploying Atomic Services accordingly.
The Atmosphere Management Service:
- receives requests from the Workflow Execution stating that a set of atomic services is required to process/produce certain data;
- queries the Component Registry to determine the relevant AS and data characteristics;
- collects infrastructure metrics, analyzes available data and prepares an optimal deployment plan.
Request flow:
1. An application, a workflow environment or an end user (or any other authorized entity) requests access to an Atomic Service.
2. Atmosphere polls AIR for data regarding this AS and the available computing resources.
3. Atmosphere heuristically determines whether to recycle an existing instance or spawn a new one. It also determines which computing resources to use when instantiating additional instances (based on cost information and performance metrics obtained from monitoring data).
4. Atmosphere calls cloud middleware services to enforce the deployment plan.
5. The cloud middleware deploys Atomic Service Instances as directed by Atmosphere.
Asynchronous process: collect monitoring data and analyze the health of the cloud infrastructure to ensure optimal deployment of application services.
AIR (Atmosphere Internal Registry): stores all data on cloud resources, Atomic Services and their instances.
Cloud middleware: a selection of low-level middleware libraries to manage specific types of cloud sites in the computing infrastructure (hybrid public/private cloud).
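The recycle-or-spawn decision in step 3 can be illustrated with a minimal sketch. The slide does not specify the actual heuristic, so everything here is an assumption: the `Instance` and `Site` record shapes, the `max_load` threshold and the "cheapest site with free capacity" rule are hypothetical stand-ins for the cost and performance data Atmosphere would read from AIR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Instance:
    """A running Atomic Service Instance (hypothetical record shape)."""
    service: str
    site: str
    load: float  # fraction of capacity in use, 0.0 to 1.0


@dataclass
class Site:
    """A cloud site as it might be described in the registry (hypothetical)."""
    name: str
    cost_per_hour: float
    free_slots: int


def plan_deployment(
    service: str,
    instances: List[Instance],
    sites: List[Site],
    max_load: float = 0.8,
) -> Tuple[str, Optional[str]]:
    """Return ("recycle", site), ("spawn", site) or ("reject", None)."""
    # Prefer recycling the least loaded running instance of the service.
    candidates = [
        i for i in instances if i.service == service and i.load < max_load
    ]
    if candidates:
        best = min(candidates, key=lambda i: i.load)
        return ("recycle", best.site)
    # Otherwise spawn on the cheapest site that still has capacity.
    available = [s for s in sites if s.free_slots > 0]
    if available:
        cheapest = min(available, key=lambda s: s.cost_per_hour)
        return ("spawn", cheapest.name)
    return ("reject", None)
```

A real planner would presumably also weigh data locality and monitoring history; this sketch only shows the shape of the decision.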
High Performance Execution Environment
- Provides virtualized access to high performance execution environments
- Seamlessly provides access to high performance computing to workflows that require more computational power than clouds can provide
- Deploys and extends the Application Hosting Environment (AHE), which provides a set of web services to start and control applications on HPC resources
Application Hosting Environment: auxiliary component of the cloud platform, responsible for managing access to traditional (grid-based) high performance computing environments. Provides a Web Service interface for clients.
Interaction:
- An application, a workflow environment or an end user invokes the Web Service API of AHE to delegate computation to the grid, presenting a security token (obtained from the authentication service).
- AHE delegates credentials, instantiates computing tasks, polls for execution status and retrieves results on behalf of the client.
- Grid resources run a Local Resource Manager (PBS, SGE, LoadLeveler etc.).
[Figure: layers of AHE. User access layer: AHE Web Services (WSRF::Lite), Tomcat container. Resource client layer: HARC, GridFTP, Job Submission Service (OGSA BES / Globus GRAM), WebDAV, RealityGrid SWS.]
Data Access for Large Binary Objects
- LOBCDER (the federated data storage component) enables data sharing in the context of VPH-Share applications.
- The system can interface with various types of storage resources and supports SWIFT cloud storage (support for Amazon S3 is under development).
- LOBCDER exposes a WebDAV interface and can be accessed by any DAV-compliant client. It can also be mounted as a component of the local client filesystem using any DAV-to-FS driver (such as davfs2).
[Figure: the LOBCDER host (149.156.10.143) runs the LOBCDER service backend (resource factory, WebDAV servlet, resource catalogue, storage drivers including SWIFT) in front of the SWIFT storage backend. Clients: GUI-based access through the Data Manager Portlet (Master Interface component) on the core component host (vph.cyfronet.pl); an Atomic Service Instance (10.100.x.x) whose service payload (application component) sees LOBCDER mounted on the local FS (e.g. via davfs2); a generic WebDAV client on an external host.]
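Since LOBCDER is reachable by any DAV-compliant client, a directory listing comes down to a standard WebDAV PROPFIND request. The sketch below uses only the Python standard library. The URL passed in the usage example is a placeholder rather than the real service address, and the function names are hypothetical. Authentication is left out because the slide does not describe it.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import List

DAV_NS = {"d": "DAV:"}

# Minimal PROPFIND body asking only for the size of each resource.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<d:propfind xmlns:d="DAV:"><d:prop><d:getcontentlength/></d:prop></d:propfind>'
)


def build_propfind(url: str, depth: str = "1") -> urllib.request.Request:
    """Build (but do not send) a PROPFIND request for a collection URL."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY.encode("utf-8"),
        headers={"Depth": depth, "Content-Type": "application/xml"},
        method="PROPFIND",
    )


def list_hrefs(multistatus_xml: str) -> List[str]:
    """Extract the resource paths from a 207 Multi-Status response body."""
    root = ET.fromstring(multistatus_xml)
    return [
        response.findtext("d:href", namespaces=DAV_NS)
        for response in root.findall("d:response", DAV_NS)
    ]
```

The request would be sent with `urllib.request.urlopen(build_propfind("http://lobcder.example/dav/"))` and the response body passed to `list_hrefs`. Mounting via davfs2 performs the same exchange behind a filesystem interface.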
Data Reliability and Integrity
- Provides a mechanism which keeps track of binary data stored in the Cloud infrastructure
- Monitors data availability
- Advises the cloud platform when instantiating atomic services
- Shifts/replicates data between cloud sites, as required
DRI Service: a standalone application service, capable of autonomous operation. It periodically verifies access to any datasets submitted for validation and is capable of issuing alerts to dataset owners and system administrators in case of irregularities.
Operations: register files, get metadata, migrate LOBs, get usage stats (etc.)
End-user features: browsing, querying, direct access to data
[Figure: the DRI Service comprises a configurable, registry-driven validation runtime (runtime layer) and an extensible resource client layer for Amazon S3, OpenStack Swift and Cumulus. It reads the binary data registry and validation policy from AIR, is managed through the VPH Master Int. data management portlet (with DRI management extensions), and stores and marshals data in distributed Cloud storage.]
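A single validation pass of the kind described above might look like the following sketch. The slide only states that access to datasets is verified periodically and that alerts are issued. Comparing SHA-256 checksums against registered values is an assumption made here for illustration, as are the registry layout and the `fetch` callback standing in for a storage client.

```python
import hashlib
from typing import Callable, Dict, List, Optional


def sha256_hex(data: bytes) -> str:
    """Checksum used for this illustration (an assumption, not the DRI spec)."""
    return hashlib.sha256(data).hexdigest()


def validate_datasets(
    registry: Dict[str, str],
    fetch: Callable[[str], Optional[bytes]],
) -> List[str]:
    """Check every registered file once and return a list of alert messages.

    registry maps a file name to its expected checksum (hypothetical layout).
    fetch returns the file contents, or None if the file cannot be reached.
    """
    alerts = []
    for name, expected in sorted(registry.items()):
        data = fetch(name)
        if data is None:
            alerts.append(f"{name}: unavailable")
        elif sha256_hex(data) != expected:
            alerts.append(f"{name}: checksum mismatch")
    return alerts
```

In a running service this pass would be scheduled at an interval taken from the validation policy, with the alerts routed to dataset owners and administrators.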
Security Framework
- Provides a policy-driven access system for the security framework
- Provides an open-source access control solution based on fine-grained authorization policies
- Implements Policy Enforcement, Policy Decision and Policy Management
- Ensures privacy and confidentiality of eHealthcare data
- Capable of expressing eHealth requirements and constraints in security policies (compliance)
- Tailored to the requirements of public clouds
[Figure: VPH clients (application, workflow management service, developer, end user, administrator, or any authorized user capable of presenting a valid security token) reach the VPH Atomic Service Instances over the public internet through the VPH Security Framework.]
Platform Modules and Technologies
WP2 Component/Module: Technologies applied
- Cloud Resource Allocation Management: Java application with Web Service (REST) interfaces, OSGi bundle hosted in a Karaf container, Camel integration framework
- Cloud Execution Environment: Java application with Web Service (REST) interfaces, OSGi bundle hosted in a Karaf container, Nagios monitoring framework, OpenStack and Amazon EC2 cloud platforms
- High Performance Execution Environment: Application Hosting Environment with Web Service (REST/SOAP) interfaces
- Data Access for Large Binary Objects: Standalone application preinstalled on Virtual Machines; connectors for OpenStack ObjectStore and Amazon S3; GridFTP for file transfer
- Data Reliability and Integrity: Standalone application wrapped as an Atomic Service, with Web Service (REST) interfaces; uses LOB tools for access to binary data
- Security Framework: Uniform security mechanism for SOAP/REST services; Master Interface SSO enabling shell access to virtual machines
Basic features of the cloud platform
- Developer: install any scientific application in the cloud
- End user: access available applications and data in a secure manner
- Administrator: manage cloud computing and storage resources
Cloud infrastructure for e-science:
- Install/configure each application service (which we call an Atomic Service) once, then use it multiple times in different workflows;
- Direct access to raw virtual machines is provided for developers, with a multitude of operating systems to choose from (IaaS solution);
- Install whatever you want (root access to Cloud Virtual Machines);
- The cloud platform takes over management and instantiation of Atomic Services;
- Many instances of Atomic Services can be spawned simultaneously;
- Large-scale computations can be delegated from the PC to the cloud/HPC via a dedicated interface;
- Smart deployment: computations can be executed close to data (or the other way round).
Accessing the Infrastructure
- The Master Interface is deployed at new.physiomespace.com
- Provides access to all cloud platform features
- Tailored for domain experts (no in-depth technical knowledge necessary)
- Uses OpenID authentication provided by BiomedTown
- Contact Piotr Nowakowski (CYF) for details regarding access and account provisioning
Further information at dice.cyfronet.pl/projects/ www.vph-share.eu
Demos of the Cloud Platform
End user's view of the cloud platform
- Developers, admins and scientists obtain access to the cloud platform via the Master Interface (MI) UI.
- The OpenID architecture enables the Master Interface to delegate authentication to any public identity provider (e.g. BiomedTown).
- Following authentication, the MI obtains a secure user token containing the current user's roles. This token is then used to authorize access to Atomic Service Instances, in accordance with their security policies.
Login and invocation flow:
1. The user selects "Log in with BiomedTown" in the Master Interface authentication widget.
2. The login feature opens a login window and delegates credentials to the BiomedTown Identity Provider (authentication service, which holds users and roles).
3. The credentials are validated and a session cookie containing the user token (created by the Master Interface) is spawned.
4. When invoking an AS, a portlet passes the user token in the request header.
5. The Security Proxy of the Atomic Service Instance parses the user token, retrieves the roles and allows or denies access to the ASI according to the security policy.
6. If authorized, the request is relayed to the service payload (application component); if not, an error (HTTP 401) is reported.
End user's view of the cloud platform (contd.)
Steps: log into Master Interface; select Atomic Service; instantiate Atomic Service; access and use application.
- Atomic Services can be instantiated on demand
- Once instantiated, the service can be accessed by the end user
- Unused instances can be
Handling security on the ASI level
- The application API is only exposed to localhost clients
- Calls to Atomic Services are intercepted by the Security Proxy
- Each call carries a user token (passed in the request header)
- The user token is digitally signed to prevent forgery. This signature is validated by the Security Proxy
- The Security Proxy decides whether to allow or disallow the request on the basis of its internal security policy
- Cleared requests are forwarded to the local service instance
User token fields: digital signature, timestamp, unique username, assigned role(s), additional info.
Example token: a6b72bfb5f2466512a b2700cd27ed5f84f99 1422rdiaz!developer! rdiaz,rodrigo Diaz,rodrigo.diaz@atosresearch.eu,,spain,08018
Request handling inside the Atomic Service Instance:
1. An incoming request arrives at the public AS API (SOAP/REST), which is exposed externally by the local web server (apache2/tomcat).
2. The Security Proxy intercepts the request.
3. The Security Proxy decrypts and validates the digital signature with the Master Interface's secret key.
4. If the digital signature checks out, the Security Proxy consults the security policy to determine whether the user should be granted access on the basis of his/her assigned roles.
   If the digital signature is invalid, or if the security policy prevents access given the user's existing roles, the Security Proxy returns an HTTP 401 (Unauthorized) error to the client.
5. Otherwise, the original request is relayed to the service payload (the actual application API, accessible from localhost only). The user token is included for potential use by the service itself.
6-7. The Security Proxy intercepts the service response and relays it to the original client.
This mechanism is entirely transparent from the point of view of the person/application invoking the Atomic Service.
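The signature and role checks performed by the Security Proxy can be sketched as follows. The slide does not give the signature algorithm or the exact token layout, so this example assumes an HMAC-SHA256 signature over a hypothetical `timestamp!username!roles` payload, computed with a secret shared with the Master Interface. The function names are illustrative and do not belong to the actual proxy.

```python
import hashlib
import hmac
from typing import List, Set


def sign_token(secret: bytes, timestamp: int, username: str, roles: List[str]) -> str:
    """Create a token in the assumed layout: signature!timestamp!username!roles."""
    payload = f"{timestamp}!{username}!{','.join(roles)}"
    signature = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{signature}!{payload}"


def authorize(token: str, secret: bytes, allowed_roles: Set[str]) -> int:
    """Return the HTTP status the proxy would answer with: 200 or 401."""
    try:
        signature, timestamp, username, roles = token.split("!", 3)
    except ValueError:
        return 401  # malformed token
    payload = f"{timestamp}!{username}!{roles}"
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison, on bytes so that odd input cannot raise.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return 401  # signature does not match: forged or altered token
    if allowed_roles.isdisjoint(roles.split(",")):
        return 401  # valid token, but the policy grants none of its roles
    return 200  # cleared: the request may be relayed to the service payload
```

The sketch ignores token expiry. A real proxy would presumably also compare the timestamp against a validity window before relaying the request.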
More information at dice.cyfronet.pl/projects/ www.vph-share.eu