Hacettepe University, Department of Computer Engineering
BBM467 Data Intensive Applications
Dr. Fuat Akal
akal@hacettepe.edu.tr

Overview
- What is Cloud Computing?
- Virtualization
- Service Oriented Computing

What is Cloud Computing?
- Use of computing resources as a service
  - resources = software, platform, infrastructure (e.g., word processor, database system, CPU, disk)
  - service: automate deployment of the resource (e.g., start and end time, availability, etc.)
- Resources can be remote or local
  - actually, you typically do not care
  - you care about the what and when: what kind of resource is used at what point in time
  - you do not care about the how and where, unless you have legal / compliance issues

Principles of Cloud Computing
- Automation
  - program mundane IT tasks (e.g., backup, ...)
  - provide a Web Service for these tasks
- Virtualization
  - decouple software from hardware
  - specify what to deploy, not how or where
- Pay-as-you-go
  - rent software and hardware: do not buy!

What is Promising?
- Reduce Cost
  - Utilization of hardware and software
  - Pay-as-you-go & efficient (no overheads)
  - No vendor lock-in
- Reduce Time to Market
  - Focus on the business problem (not IT)
  - No configuration, automatic security, etc.
  - Development framework (for enterprise Web apps)
- Operating & Support (-> cost + time-to-market)
  - SLAs: availability, guaranteed response times, ...
  - Security
  - Elasticity: scale out and down with workload
  - Multi-tenancy (support for SaaS, Software as a Service)

Pain: Cost
- Failures
  - Penalties for missed SLAs
- System Administration
  - configuration & patches
- Software
  - licenses + maintenance
  - too many software layers
- Hardware
  - 20% vs. 90% utilization
  - fault tolerance on cheap HW

Key Enabling Technologies
- Clusters
- Computer Networks
- Virtualization
- Service-oriented Computing

Scale Up vs. Scale Out
- What is better?
  - 1 machine with 1000 cores: bigger problem -> bigger machine
  - 1000 machines with 1 core: bigger problem -> more machines
- It depends on what you are looking for...
  - performance: network within/across machines makes a difference
  - scalability: can you get shared memory to work?
  - cost: it depends
  - flexibility: 1000 machines with 1 core win
- What matters today
  - scalability and flexibility
  - design for scale out!

The Data Center of the Past
- one machine, many tubes
- one storage system, many tapes
- one terminal, operated by humans
- one role for all

The Data Center Today
- One building
  - somewhere where energy is cheap
- Many clusters
  - each cluster with its own switch
  - each cluster with its own security
- Each cluster has many racks
  - each rack has a switch
  - each rack has many machines
- Each machine has many sockets, disks, and main memory (MM)
  - each socket with many cores
  - complex network within the machine
- Roles: Administrators, Developers, Users

Virtualization
- Principles
  - decouple resource from service
  - share resources, dynamic provisioning, migration
- Apply these principles at different levels
  - software service: map URL to virtual machine
  - machines: map virtual machine to physical machine
  - storage: map key to block on physical machine
- Advantages of Virtualization
  - increase utilization
  - improve fault tolerance
  - improve manageability

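The level-by-level mappings on this slide can be sketched as plain lookup tables. This is only an illustration under stated assumptions: the class `VirtualizationLayer`, the URL `/orders`, and the names `vm-1`, `host-a`, `host-b` are hypothetical and do not come from the course material.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualizationLayer:
    """Indirection tables that decouple a service from the resource behind it.

    All names here are hypothetical; a real platform keeps these mappings
    in its control plane rather than in in-memory dictionaries.
    """
    url_to_vm: dict = field(default_factory=dict)   # software service level
    vm_to_host: dict = field(default_factory=dict)  # machine level

    def resolve(self, url: str) -> str:
        """Follow both mappings: URL -> virtual machine -> physical machine."""
        return self.vm_to_host[self.url_to_vm[url]]

    def migrate(self, vm: str, new_host: str) -> None:
        """Move a VM to another host; the URL mapping is left untouched."""
        self.vm_to_host[vm] = new_host


layer = VirtualizationLayer(
    url_to_vm={"/orders": "vm-1"},
    vm_to_host={"vm-1": "host-a"},
)
before = layer.resolve("/orders")   # served from host-a
layer.migrate("vm-1", "host-b")     # e.g. host-a is taken down for maintenance
after = layer.resolve("/orders")    # same URL, now served from host-b
```

Because clients only ever see the URL, migration changes a single table entry, which is where the manageability and fault-tolerance advantages come from.
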
Virtualization Basics
- Virtualization can be defined as the abstraction of physical resources into logical units, such that a single physical resource can appear as many logical units and multiple physical resources can appear as a single logical unit.
- The concept of virtualization has been around since the 1960s, when IBM implemented it to logically partition mainframe computers into separate VMs.
- The primary motivation behind virtualization is to hide the physical characteristics and irrelevant details of these resources from their end users.
  - Each user gets the illusion of being the lone user of that physical resource (one-to-many virtualization).
  - Or multiple physical resources appear as a single virtual resource to the user (many-to-one virtualization).

One-to-Many Virtualization
- The virtualized server runs a virtual machine monitor (VMM), or hypervisor, which allows multiple virtual machines (VMs) to run on the same physical server.
- Each VM emulates a physical computer by creating a separate operating system environment.
- We can simultaneously host multiple operating systems on the same underlying physical machine.
- Each operating system gets the illusion that it is the only one running on that host server.
- One physical machine has effectively been divided into many logical ones.

Many-to-One Virtualization
- The classic example of many-to-one virtualization is a load balancer that front-ends a group of web servers.
- The load balancer hides the details of the multiple physical web servers and simply exposes a single virtual IP (VIP).
- The web clients that connect to the VIP to obtain the web service have the illusion that there is a single web server.
- Many physical web servers have been abstracted into one logical web server.

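A minimal sketch of the load-balancer example follows. The slide does not say how requests are distributed, so round-robin is an assumption here, and the `LoadBalancer` class, the VIP `10.0.0.100`, and the backend names are hypothetical.

```python
from itertools import cycle


class LoadBalancer:
    """Exposes one virtual IP (VIP) in front of several backend web servers.

    Round-robin dispatch is an assumption made for this sketch; real load
    balancers may also weigh server load, health checks, or session affinity.
    """

    def __init__(self, vip, backends):
        self.vip = vip                    # the only address clients ever see
        self._backends = cycle(backends)  # endless round-robin iterator

    def handle(self, request):
        backend = next(self._backends)    # pick the next physical server
        return f"{backend} served {request}"


lb = LoadBalancer("10.0.0.100", ["web-1", "web-2", "web-3"])
replies = [lb.handle(f"GET /page{i}") for i in range(4)]
# The fourth request wraps around to web-1 again.
```

Clients address only `lb.vip`; which physical server answers is hidden behind `handle`.
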
Server Virtualization
- Server or compute virtualization is the most popular and visible form of virtualization today (one-to-many virtualization).
- Approaches to server virtualization
  - Bare-metal virtualization: The hypervisor runs directly on the host's hardware. Multiple guest operating systems can then run on top of this hypervisor.
  - Hosted virtualization: The hypervisor runs as an application on the host operating system. Multiple guest operating systems can then run as VMs on top of this hypervisor.

Bare-metal Virtualization
- The hypervisor runs directly on the host's hardware.
- This type of hypervisor is also referred to as a Type 1 hypervisor.
- Popular Type 1 hypervisors include Citrix XenServer, VMware ESXi, Linux KVM, and Microsoft Hyper-V.
- The Linux KVM hypervisor can be classified either way:
  - as a Type 2 hypervisor, because KVM is essentially a Linux kernel module and is loaded by the Linux host operating system
  - as a Type 1 hypervisor, because Linux with the KVM module is itself the hypervisor running on bare metal

Hosted Virtualization
- The hypervisor runs as an application on the host operating system.
- Multiple guest operating systems can then run as VMs on top of this hypervisor.
- This type of hypervisor is also referred to as a Type 2 hypervisor.
- Microsoft Virtual Server and VMware Server, and numerous endpoint-based virtualization platforms such as VMware Workstation, Microsoft Virtual PC, and Parallels Workstation, are hosted hypervisors.

Type 1 vs. Type 2 Virtualization
- Type 1 hypervisors are typically more efficient: they have direct access to the underlying hardware and can deliver better performance than their Type 2 counterparts.
- Type 2 hypervisors support a wider range of platforms and I/O devices, because they run on top of a standard operating system.

Web Services
- Software application identified by a URI, whose interfaces and bindings are capable of being defined, described, and discovered as XML artifacts (W3C Web Services Architecture Requirements, Oct. 2002)
- Programmable application logic accessible using standard Internet protocols (Microsoft)
- An interface that describes a collection of operations that are network accessible through standardized XML messaging (IBM)
- Software components that can be spontaneously discovered, combined, and recombined to provide a solution to the user's problem/request (Sun)

Web Services Model
- Universal Description, Discovery and Integration (UDDI)
- Web Services Description Language (WSDL)
- Simple Object Access Protocol (SOAP)

Web Services Model: Roles in a Web Service Architecture
- Service provider
  - Owner of the service
  - Platform that hosts access to the service
- Service requestor
  - Business that requires certain functions to be satisfied
  - Application looking for and invoking an interaction with a service
- Service registry
  - Searchable registry of service descriptions, where service providers publish their service descriptions

Web Services Model: Operations in a Web Service Architecture
- Publish
  - Service descriptions need to be published in order for a service requestor to find them
- Find
  - The service requestor retrieves a service description directly or queries the service registry for the service required
- Bind
  - The service requestor invokes or initiates an interaction with the service at runtime

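The publish / find / bind cycle can be sketched with an in-memory registry. This is a toy model under stated assumptions, not UDDI itself: `ServiceRegistry`, `get_stock_price`, and the service name `StockQuote` are hypothetical, and the fixed price table stands in for a real remote service.

```python
class ServiceRegistry:
    """Toy service registry: providers publish, requestors find and bind."""

    def __init__(self):
        self._entries = {}

    def publish(self, name, description, endpoint):
        """Provider side: register a service description and its endpoint."""
        self._entries[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        """Requestor side: search descriptions, return matching service names."""
        return [
            name
            for name, entry in self._entries.items()
            if keyword.lower() in entry["description"].lower()
        ]

    def bind(self, name):
        """Requestor side: obtain the endpoint so it can be invoked at runtime."""
        return self._entries[name]["endpoint"]


def get_stock_price(stock_name):
    # Hypothetical provider implementation with a fixed, made-up price table.
    return {"IBM": 34.5}[stock_name]


registry = ServiceRegistry()
registry.publish("StockQuote", "Returns the current stock price", get_stock_price)

matches = registry.find("stock price")  # find
service = registry.bind(matches[0])     # bind
price = service("IBM")                  # invoke the bound service
```

In the real architecture the endpoint would be a network address described by WSDL and invoked over SOAP, rather than a local Python function.
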
SOAP: Simple Object Access Protocol
- SOAP is a communication protocol
- SOAP is for communication between applications
- SOAP is a format for sending messages
- SOAP is designed to communicate via the Internet
- SOAP is platform independent
- SOAP is language independent
- SOAP is based on XML
- SOAP is simple and extensible
- SOAP is a W3C standard (SOAP 1.2 became a W3C Recommendation in 2003)

SOAP Message
- Request and Response messages
  - The request invokes a method on a remote object
  - The response returns the result of running the method
- The SOAP specification defines an envelope
  - the envelope wraps the message itself
- The message uses a different vocabulary from the envelope
  - a namespace prefix is used to distinguish the two parts: the SOAP envelope vocabulary and the application-specific message vocabulary

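SOAP Request Message

    <?xml version="1.0"?>
    <soap:Envelope
        xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
        soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      <soap:Body xmlns:m="http://www.stock.org/stock">
        <m:GetStockPrice>
          <m:StockName>IBM</m:StockName>
        </m:GetStockPrice>
      </soap:Body>
    </soap:Envelope>

The soap:Envelope and soap:Body elements form the SOAP envelope; the m:GetStockPrice element inside the body is the message. XML names are case-sensitive, so Envelope, Body, and encodingStyle must be capitalized as shown.
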
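As a sketch, a request like the one on this slide could be generated with Python's standard `xml.etree.ElementTree` module. The namespace URIs are copied from the slide; the helper name `build_request` is hypothetical, and the element names use the capitalization SOAP expects (`Envelope`, `Body`). Nothing is sent over the network here.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as given on the slide.
SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
SOAP_ENC = "http://www.w3.org/2001/12/soap-encoding"
STOCK_NS = "http://www.stock.org/stock"


def build_request(stock_name: str) -> bytes:
    """Build a GetStockPrice request wrapped in a SOAP envelope."""
    # Ask the serializer to use the same prefixes as the slide.
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("m", STOCK_NS)

    # Envelope vocabulary (SOAP namespace).
    envelope = ET.Element(
        f"{{{SOAP_NS}}}Envelope",
        {f"{{{SOAP_NS}}}encodingStyle": SOAP_ENC},
    )
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")

    # Application-specific message vocabulary (stock namespace).
    call = ET.SubElement(body, f"{{{STOCK_NS}}}GetStockPrice")
    ET.SubElement(call, f"{{{STOCK_NS}}}StockName").text = stock_name

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


request_xml = build_request("IBM")
```

The serializer may place the `xmlns:m` declaration on the root element rather than on `soap:Body`; the two forms are equivalent for namespace-aware parsers.
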
SOAP Response Message

    <?xml version="1.0"?>
    <soap:Envelope
        xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
        soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
      <soap:Body xmlns:m="http://www.stock.org/stock">
        <m:GetStockPriceResponse>
          <m:Price>34.5</m:Price>
        </m:GetStockPriceResponse>
      </soap:Body>
    </soap:Envelope>

The soap:Envelope and soap:Body elements form the SOAP envelope; the m:GetStockPriceResponse element inside the body is the message.

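On the requestor side, the price can be pulled out of such a response with a namespace-aware lookup. This is a minimal sketch using the standard `xml.etree.ElementTree` module; `extract_price` is a hypothetical helper, and the embedded response repeats the slide's example with SOAP's capitalized element names.

```python
import xml.etree.ElementTree as ET

# Prefix-to-URI map used for lookups; the URIs are the ones on the slide.
NS = {
    "soap": "http://www.w3.org/2001/12/soap-envelope",
    "m": "http://www.stock.org/stock",
}

RESPONSE = """<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body xmlns:m="http://www.stock.org/stock">
    <m:GetStockPriceResponse>
      <m:Price>34.5</m:Price>
    </m:GetStockPriceResponse>
  </soap:Body>
</soap:Envelope>"""


def extract_price(response_xml: str) -> float:
    """Unwrap the SOAP envelope and return the price as a float."""
    root = ET.fromstring(response_xml)
    price = root.find("soap:Body/m:GetStockPriceResponse/m:Price", NS)
    if price is None or price.text is None:
        raise ValueError("no m:Price element found in the SOAP body")
    return float(price.text)


price = extract_price(RESPONSE)
```

The lookup matches on namespace URIs, not on the literal prefixes, so a response that used different prefixes for the same namespaces would still parse.
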
Web Services Description Language (WSDL)
- What is WSDL?
  - WSDL is written in XML
  - WSDL is an XML document
  - WSDL is used to describe Web services
  - WSDL is also used to locate Web services
  - WSDL 2.0 is a W3C Recommendation (since 2007); WSDL 1.1 was published only as a W3C Note
- Operational information about the service
  - Location of the service
  - Service interface
  - Implementation details for the service interface

Universal Description, Discovery and Integration (UDDI)
- What is UDDI?
  - Directory service where businesses can register and search for Web services
  - Directory for storing information about Web services
  - Directory of Web service interfaces described by WSDL
  - UDDI communicates via SOAP
- What is UDDI based on?
  - Uses W3C and IETF Internet standards such as XML, HTTP, and DNS
  - UDDI uses WSDL to describe interfaces to Web services

Acknowledgement
The course material used for this class is mostly taken and/or adapted* from the course materials of the Big Data class given by Nesime Tatbul and Donald Kossmann at ETH Zurich (http://www.systems.ethz.ch/).
(*) The original course material has been condensed to fit the needs of BBM467, so the original slides are not used as-is.