Open Source Cloud Computing Research
Distributed Cloud Computing Platform as a Service (PaaS): Analysis and Recommendations
[Figure: two "Cloud Computing" stacks (IaaS, PaaS, SaaS; Web, Data) joined by "Inter-Cloud"]
Updated June 2012
Eugene Luster, Office of the CIO & Cloud Standards Lead, R2AD, LLC, and Michael Behrens, CTO, R2AD, LLC
http://www.r2ad.com
Cloud Computing Standards Research Focus
We (R2AD, LLC) are sponsored by DISA's Office of the CTO to pursue cloud computing research with emphasis on the following:
1. Open Source
2. High Security
3. Interoperability
4. Portability
5. High Performance
6. Transparency
7. Ease of Management
8. Standardized
We are currently directly involved with these standards groups:
- Open Grid Forum (OGF): Open Cloud Computing Interface (OCCI)
- Storage Network Industry Association (SNIA): Cloud Data Management Interface (CDMI)
- National Institute of Standards and Technology (NIST): document definition and roadmap of cloud standards
Collected materials/briefs: https://www.intelink.gov/sites/cloud
[Figure: R2AD Android Cloud Management Client]
PaaS Vision (work in progress)
- Replace stovepipe architectures and proprietary use of APIs with one or more standardized PaaS and cloud APIs.
  - Promote inter-cloud interoperability and migration of apps/data.
  - Common services (ReSTful APIs) for data storage and access, identity, logging/auditing, messaging, processing, monitoring, deployment, replication, SLA.
- Use pattern-based APIs/tools in order to remain focused on application logic instead.
  - Do not keep re-inventing the platform (identity management, logging, database, management, etc.).
  - J2EE helped us through the last decade. PaaS binds scalable components together, similar to what Sun did with J2EE (EJB, JMS, JDBC, JSP, etc.).
  - PaaS supports multiple languages and web engines (WebLogic, GlassFish, JBoss, Tomcat, Jetty, Nginx, Node.js, etc.), ideally all using the same REST APIs (e.g., CouchDB) to ensure portability in a heterogeneous cloud environment.
  - Engine and operating system should not matter: black-box cloud computing.
- Provide on-demand scalability.
  - Automatic load balancing for web and data.
  - Distributed data for speed and redundancy.
  - Built-in replication/synchronization/caching, based on an API which specifies data policy.
  - Provide data and also web transparency (do not hard-code where things are).
- Provide automated self-service for the full life cycle.
  - Use a cloud-based repository for development, test, and field. E.g., Forge.mil: Git, SVN, Maven, DISA CM.
  - On-demand install/configuration.
- Cloud oriented and on-premise installable.
  - Runs on top of IaaS in production. Agnostic as to which IaaS.
  - Use the cloud to test the cloud, monitor the cloud, etc.
- Supporting the mobile end user.
  - Data and apps accessible to the mobile user (location transparency).

Layer | Provides | Audience
SaaS | Applications, software | End users/customers
PaaS | Application infrastructure: web server + data storage | Developers
IaaS | Hardware infrastructure: servers, raw storage, network | Data center/integrators
Recommend Piloting Platform as a Service (PaaS)
- PaaS is the next big wave of cloud technology.
  - It's the next generation for enterprise workloads.
- Operating system really should not matter! Black-box cloud computing.
  - PaaS layer should be operating system agnostic.
  - Separating the layers allows a set of heterogeneous systems to handle the workflow.
  - Back-end decisions become performance/efficiency/cost based instead of technical.
- Major new choices available for open on-premise private clouds:
  - OpenShift: backed by Red Hat. Uses JBoss, Spring, etc.
  - Cloud Foundry: backed by VMware. Uses Tomcat, Spring, etc.
  - Jelastic: available for on-premise in the near future; not open software, however.
  - Other community PaaS, Cumulogic, others.

NIST Definition of PaaS: Platform as a Service (PaaS). The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment.
OSS Cloud Technologies Compared (work in progress)

Offering | Reference URL | Installable On-Premise | Open Source | Cloud Style
---------|---------------|------------------------|-------------|------------
OpenShift | https://openshift.redhat.com/app/ | Yes | Yes | PaaS (backed by Red Hat)
Cloud Foundry | http://www.cloudfoundry.com/ | Yes | Yes | PaaS (backed by VMware)
Stackato | http://www.activestate.com/cloud | Yes | Yes | PaaS (based on Cloud Foundry)
Cumulogic | http://www.cumulogic.com | Yes | No | PaaS
Jelastic | http://www.jelastic.com | Soon | No | PaaS
Apprenda | http://apprenda.com/ | Yes | No | PaaS (primarily .NET focused)
Appistry | http://www.appistry.com/ | Yes | No | PaaS Hadoop Analytics
RightScale | http://www.rightscale.com/ | No | No | Management
CompatibleOne | http://compatibleone.org/ | Yes | Yes | Cloud Broker
Google App Engine | http://code.google.com/appengine | No. | | IaaS and PaaS
SalesForce | http://www.salesforce.com/ | No | | SaaS
Force.com | http://www.force.com/ | | | PaaS
Microsoft Azure Services Platform | http://www.microsoft.com/windowsazure | No. | | PaaS + DaaS + IaaS
Amazon EC2 | http://aws.amazon.com/ec2/ | No. | | IaaS
Amazon S3 | http://aws.amazon.com/s3/ | | | DaaS
Amazon Beanstalk | http://aws.amazon.com/elasticbeanstalk/ | | | PaaS
Rackspace | http://www.rackspace.com/ | No | | IaaS
OpenStack | http://www.openstack.org/ | Yes | | IaaS and DaaS
CloudStack | http://www.cloud.com/ | Yes | | IaaS (backed by Citrix)
Eucalyptus | http://www.eucalyptus.com | Yes | Yes | IaaS + DaaS (mimics Amazon)
FYI: Generic Cloud Platform as a Service
[Figure: generic PaaS architecture. Labels: Management & Security; Big Data & Storage; Source/App/VM Repository; Auto Deploy/Configure; PaaS Nodes, each with Data, Memory Cache, Algorithms, REST, Web; Inter-Cloud Node; Distributed & Indexed Synchronization; Messages/Queue; Scalable Platform; Proxy; Load Balancer; Perimeter Security; https (RESTful); Mashups (Distributed Observed State)]
Consider: open PaaS offerings that include well known/used/documented/supported components: Hadoop, HAProxy, Restlet, ProxyCA, Node.js, web server, etc.
PaaS Offerings: Summary
More PaaS offerings to be added. Please send in suggestions!
Survey: OpenShift PaaS
OpenShift is a PaaS backed by Red Hat since 2011.
- Languages: Java, Perl, PHP, Python, and Ruby
- Application services: SQLite, Membase, MongoDB, Apache HTTP, JBoss, Spring, built-in management layer
- Open sourced: Yes (Apr/May 2012), https://openshift.redhat.com/community/open-source
- On-premise: Yes
- OpenShift Express
  - Free version, no auto-scaling. Good for developers.
  - Hosting: Amazon initially
  - IaaS interoperability through the DeltaCloud API
- OpenShift Flex
  - Auto-scaling, performance monitoring, app management
- Can autodeploy into OpenShift using tools like http://www.jboss.org/arquillian
- Web site(s): https://openshift.redhat.com
OpenShift Screenshots
- Cartridges describe the install and configuration of common modules, e.g., Drupal.
- Easily host web apps (e.g., widgets) and data that scales.
- They use Amazon for now; OpenStack support is coming.
Survey: Cloud Foundry PaaS
Cloud Foundry is a PaaS backed by VMware since 2011.
- Languages: Spring for Java, .NET, Ruby, Scala, Node.js, PHP, Python
- Application services: RabbitMQ, vFabric PostgreSQL, MySQL, MongoDB, Redis, Spring, Chef configuration management
- Open sourced: Yes
- On-premise: Yes
- Cloud Foundry Open Source
  - CloudFoundry.org is the open source site. Others, like VMware, Stackato, and AppFog, build on it.
- Micro Cloud Foundry
  - A VM, available from VMware, that runs the PaaS environment on your laptop.
  - Alternatively, VMware is hosting a free cloud for developers to experiment with.
- Web site(s): http://www.cloudfoundry.com/ http://www.cloudfoundry.org/ http://www.activestate.com/cloud
Cloud Foundry Views
More views of Cloud Foundry.
- Multiple developer choices in each layer
  - Allows one to solve a problem with the right tool
  - Need to be careful, though
- Integrated components (architecture and APIs form the glue)
  - Scalability, manageability, automation
- Still evolving, as are non-open offerings
Jelastic: another PaaS, focused on Java
Creating Instances is Easy: Nice UI
Cumulogic (investigating)
- Java PaaS
- Cloud and hypervisor agnostic: Amazon, Eucalyptus, OpenStack, CloudStack
- Nginx, Apache, Tomcat, JBoss, Jetty, GlassFish, MongoDB, and MySQL
Making Enterprise Data Available Everywhere
- Provide a Data Virtualization Layer (DVL) accessible via a ReST interface
  - Implement location transparency
  - Make the data access layer virtual in order to decouple data sources from data consumers. This turns data into an enterprise information service.
  - RESTful access.
- NoSQL is a common solution, implemented by:
  - Couchbase or MongoDB: synchronization of document stores
  - Hadoop/HBase: based on Google's distributed file system/database, used by Facebook. Suitable for real-time and data warehouse workloads.
  - Riak: based on Amazon's Dynamo; a scalable, fault-tolerant, open source key/value database server
  - Hypertable: high-performance data storage for applications requiring maximum performance, scalability, and reliability.
- Achieve data virtualization; don't focus on a specific tool.
  - Consider the Cloud Data Management Interface (CDMI), which provides an access API to data and metadata.
  - Look for simple RESTful APIs. Spring-REST?
  - Pick appropriate underlying storage products based on requirements
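The decoupling described on this slide can be illustrated with a minimal Python sketch. It is not taken from the briefing: the class names (`Backend`, `InMemoryBackend`, `DataVirtualizationLayer`), the `mount` method, and the `/collection/key` path layout are all hypothetical, and the in-memory backend only stands in for a real store. The point is that consumers address data by logical path, so the backing store can change without touching consumer code.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical storage backend hidden behind the virtual layer."""

    @abstractmethod
    def get(self, key):
        ...

    @abstractmethod
    def put(self, key, value):
        ...


class InMemoryBackend(Backend):
    """Stand-in for a real store (document DB, key/value store, ...)."""

    def __init__(self, name):
        self.name = name
        self._items = {}

    def get(self, key):
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value


class DataVirtualizationLayer:
    """Maps logical collection names to backends so callers never see locations."""

    def __init__(self):
        self._routes = {}

    def mount(self, collection, backend):
        # Re-mounting a collection swaps the backend without changing paths.
        self._routes[collection] = backend

    def _resolve(self, path):
        # A logical path looks like "/sensors/unit-7": collection, then key.
        collection, key = path.strip("/").split("/", 1)
        return self._routes[collection], key

    def put(self, path, value):
        backend, key = self._resolve(path)
        backend.put(key, value)

    def get(self, path):
        backend, key = self._resolve(path)
        return backend.get(key)


dvl = DataVirtualizationLayer()
dvl.mount("sensors", InMemoryBackend("site-a"))
dvl.put("/sensors/unit-7", {"temp_c": 21.5})
reading = dvl.get("/sensors/unit-7")
```

In a real deployment the two methods would sit behind RESTful GET and PUT handlers, and the backends would wrap whichever storage products the requirements call for.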
NoSQL Example: Apache CouchDB Features
- Apache CouchDB is a document-oriented database that can be queried and indexed using JavaScript in a MapReduce fashion.
- CouchDB also offers incremental replication with bi-directional conflict detection and resolution.
- CouchDB is installed on r2ad.net as part of the Python CDMI implementation.
  - Stores JSON objects based on a provided key.
  - Stored content is accessed by a URL.
  - Highly scalable, easy to use.
- Key feature!
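To make the "JSON by key, reachable by URL, queried MapReduce-style" idea concrete, here is a small in-memory Python sketch. It is an illustration only and does not call CouchDB: `TinyDocumentStore`, `view`, `readings_by_site`, the host name, and the sample documents are all hypothetical. The map step emits (key, value) pairs per document and the reduce step collapses the values gathered under each key, which is the general shape of a MapReduce view.

```python
import json
from collections import defaultdict


class TinyDocumentStore:
    """In-memory stand-in for a document database; not the CouchDB API."""

    def __init__(self, base_url, db_name):
        self.base_url = base_url.rstrip("/")
        self.db_name = db_name
        self._docs = {}

    def put(self, doc_id, doc):
        # Round-trip through JSON so only JSON-serializable documents are stored.
        self._docs[doc_id] = json.loads(json.dumps(doc))
        return self.url_for(doc_id)

    def get(self, doc_id):
        return self._docs[doc_id]

    def url_for(self, doc_id):
        # Each stored document is addressable as <base>/<database>/<key>.
        return f"{self.base_url}/{self.db_name}/{doc_id}"

    def view(self, map_fn, reduce_fn):
        # Map: each document yields zero or more (key, value) pairs.
        grouped = defaultdict(list)
        for doc in self._docs.values():
            for key, value in map_fn(doc):
                grouped[key].append(value)
        # Reduce: collapse the values collected under each key.
        return {key: reduce_fn(values) for key, values in grouped.items()}


def readings_by_site(doc):
    """Hypothetical map function: emit (site, value) for reading documents."""
    if doc.get("type") == "reading":
        yield doc["site"], doc["value"]


# Hypothetical host and database names, for illustration only.
store = TinyDocumentStore("http://db.example.invalid:5984", "sensors")
first_url = store.put("r1", {"type": "reading", "site": "north", "value": 3})
store.put("r2", {"type": "reading", "site": "north", "value": 5})
store.put("r3", {"type": "reading", "site": "south", "value": 4})
totals = store.view(readings_by_site, sum)
```

Here `totals` groups the readings per site and sums them. In CouchDB itself the map and reduce functions are written in JavaScript and stored in the database.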
Cloud Broker: CompatibleOne
http://compatibleone.org/bin/view/main/
A broker can talk to more than one cloud provider. Brokers can help prevent vendor lock-in as well.
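As a rough sketch of the brokering idea, the Python below puts one interface in front of several providers and picks an offer on the caller's behalf. It does not reflect CompatibleOne's actual design or API: `Offer`, `Provider`, `Broker`, the provider names, regions, and prices are all invented for illustration, and a real adapter would wrap each provider's own API.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    provider: str
    hourly_cost: float
    region: str


class Provider:
    """Hypothetical provider adapter; a real one would wrap a cloud API."""

    def __init__(self, name, prices):
        self.name = name
        self._prices = prices  # maps region name to hourly cost

    def quote(self, region):
        # Return None when this provider does not serve the region.
        if region not in self._prices:
            return None
        return Offer(self.name, self._prices[region], region)


class Broker:
    """Asks every registered provider for a quote and picks the cheapest."""

    def __init__(self, providers):
        self._providers = list(providers)

    def cheapest(self, region):
        offers = []
        for provider in self._providers:
            offer = provider.quote(region)
            if offer is not None:
                offers.append(offer)
        if not offers:
            raise LookupError(f"no provider serves region {region!r}")
        return min(offers, key=lambda offer: offer.hourly_cost)


# Made-up providers and prices, for illustration only.
broker = Broker([
    Provider("alpha", {"us-east": 0.12, "eu-west": 0.15}),
    Provider("beta", {"us-east": 0.10}),
])
best = broker.cheapest("us-east")
```

Because callers depend only on the broker, adding or dropping a provider is a change to the provider list and not to application code, which is how a broker helps with lock-in.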
CompatibleOne: open source cloud broker
PaaS Pilot Recommendations
- Evaluate new PaaS products/standards
  - Understand the impact on overall integration and interoperability.
  - Compare requirements/features with existing legacy or proprietary PaaS.
  - Apply a use case to each offering: RESTful data service with a web access layer on top of it for end users (e.g., Geo Sensor Use Case).
  - Understand the maintenance burden and stovepipe issues for non-industry offerings.
  - Consider OpenShift/Cloud Foundry components to migrate into the existing environment as part of a transformation strategy.
  - Avoid NIH syndrome! Do not re-invent the wheel.
  - Investigate use of cloud brokers, such as CompatibleOne.
- Open source and standards participation
  - Extend your presence in the Standards Development Organizations (SDOs).
  - Participate in standards groups to ensure your requirements are being addressed.
  - Bring the cloud architecture requirements and designs to OGF for standardization.
  - Participate in the open source code stacks of PaaS and IaaS, with a focus on PaaS.
  - This is aligned with the NIST recommendations. Review/follow the NIST Roadmap.
Guiding Principles
- Be OSSM (pronounced "awesome")
  - On-Demand, Self-Service, Scalable, and Measured
  - As advocated by Cloud Camp, similar to NIST guidelines (cloudcamp.org)
- Cloud computing is evolving rapidly: challenge everything
  - What might be true/best today may change, so re-think architecture and strategy often
  - Keep up with industry trends and new specifications
- There are many different solutions available
  - You do not have to like them all, but you need to understand them
- Remain engaged with standards groups/efforts (NIST recommendation)
  - OGF, SNIA, DMTF, NIST, IEEE, others
- Focus on interoperability and security
  - On-premise solutions are needed in support of our labs
  - Think outside the cloud, to inter-cloud secure exchanges
DISA Office of the CTO Intelink Cloud Site: https://www.intelink.gov/sites/cloud
R2AD's home page and cloud test page: http://www.r2ad.com, http://cloud.r2ad.net
8 Fallacies of Distributed Computing (Peter Deutsch, et al.)
Common false assumptions of distributed computing:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn't change
6. There is one administrator
7. Transport cost is zero
8. The network is homogeneous
www.rgoarchitects.com/files/fallacies.pdf
R2AD adds: HW/SW components never fail
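One practical consequence of the first fallacy is that remote calls should expect failure. The Python sketch below shows one common response, retrying with exponential backoff. It is not part of the original briefing: the function name `call_with_retries`, its parameters, and the `flaky_fetch` example are hypothetical. The `sleep` parameter is injectable so the waiting can be observed or skipped in tests.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as error:
            last_error = error
            # Wait longer after each failure, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Hypothetical flaky remote call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "payload"


delays = []
result = call_with_retries(flaky_fetch, attempts=3, sleep=delays.append)
```

Retries address only the reliability assumption. Timeouts, idempotent operations, and encryption in transit speak to other items on the list.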
Cloud Fringe: HTML5/CSS Important Features
- Canvas exposed
  - Offers fantastic power to create new User Experiences (UX)
  - Windows in browser, Unix in browser, graphical apps, etc.
  - Browser becoming a runner: game changer!
  - Source: http://caniuse.com/#feat=canvas
- TinySQL database part of browser
  - Store cached data, work off-line in support of disconnected operations, etc.
Cloud Brainstorm
Replication brainstorming session (white board). Standards instead of cloud silos!
[Figure: white board terms. Ambient, Broker, Monitor, Billing, Management, Protocol, Mediation, Transport, Logging, Repository, Distributed Computing, NoSQL, Web, Platform, Data Store, Security, Service, Synchronization, Provisioning, Configuration, Privacy, Streaming, Availability, Presentation, Load Balance, Encryption, High Availability, Migration]