Implementing an Automated IT Service Catalog using Red Hat Technologies

Michael Solberg
msolberg@redhat.com
Copyright 2010

Table of Contents

Introduction
Implementing the Automated Service Catalog
Provisioning Virtual Machines
Managing Systems
Advanced Catalog Automation
Introduction

Over the last decade, many IT organizations have seen their Intel/Linux footprint increase dramatically in the number of managed systems. In response, support of these systems has stratified within these organizations, with teams dedicated to the running of systems, the provisioning of new systems, and the architecting of system patterns. Some larger organizations have sought to reduce operating costs by automating IT functions which were traditionally manual processes and minimizing the human footprint of the "build" and "run" teams. This allows technical resources to be dedicated to planning functions, which require tight integration with business units.

As these leaner, planning-focused organizations become more integrated with the business units that they support, they tend to adopt a service-oriented stance towards those units. Roles eventually shift from IT into the business units, and the responsibility for purchasing and owning IT assets leaves the technical realm and moves into the business realm. This migration of responsibility usually requires a codification of the services which IT provides to the business, and a contract between the two groups will be developed. To simplify the management of these services, IT organizations will create what is commonly referred to as a Service Catalog, which is a list of available hardware or software platforms that a business unit may choose to purchase. A Service Catalog typically contains a pre-defined list of infrastructure components (such as servers or containers) which can be provisioned and managed by the IT organization at a negotiated cost.

The increased automation of build and run functions and the codification of IT services are complementary in many ways and have led organizations to pursue the automation of their Service Catalog. In a sense, this is the real meat and potatoes of the new cloud architectures which are being developed - the ability for a business unit to control the provisioning and maintenance of IT assets without having to engage IT support directly. Instead of defining requirements, sponsoring discussions with a technology architecture committee, or chasing assets through procurement processes, a business unit simply requests a new widget from a set of available widgets and it is automatically available in a short time-frame.

This paper and the corresponding podbay codebase present an approach to automating a Service Catalog using the Application Programming Interfaces of the Red Hat technology stack. It is not a reference architecture per se; it is meant more as a proof of concept. Every IT organization that I have worked with as a Solution Architect at Red Hat has had a radically different set of technology components in its stack, and it's hard to envision a single tool that would fit any two of them, much less all of them. Think of this guide more as an exploration of technological ideas than a reference manual.

Implementing the Automated Service Catalog

There are many possible elements in a modern IT Service Catalog, from provisioning a new operating system instance to deploying a new revision of a given application. We will limit the scope of the catalog for this guide to the most basic set of items which make up the lifecycle of a virtual operating system instance. First, we will automate the creation of a virtual machine and the provisioning of an operating system onto that machine. Then we will automate applying patches to that virtual machine.
Lastly, we will deprovision the virtual machine. We will also provide a mechanism for starting and stopping these provisioned virtual machines. A robust (or even usable) automated catalog would also require authorization infrastructure as well as chargeback infrastructure. These may be implemented in future versions of the podbay code.

This catalog was implemented as a Django application with a traditional HTML interface as well as a RESTful web API interface for further integration. The application itself is stateless and uses RHN Satellite, Cobbler, and libvirt to modify and track the virtual machine instances. The API and nomenclature were patterned after the Deltacloud project.
RHN Satellite and libvirt are enterprise-class technologies supported by Red Hat; Cobbler and Deltacloud are both emerging technologies sponsored by Red Hat. The catalog implementation was developed on Red Hat Enterprise Linux 6 and is hosted on the Fedora Project infrastructure at http://fedorahosted.org/podbay. For instructions on using the example code, see http://fedorahosted.org/podbay/wiki/installation. While this implementation was done in Python, the three APIs covered in this guide all have bindings for other languages.

Provisioning Virtual Machines

The first task in providing Infrastructure as a Service is to automate the provisioning of virtual machines. This process, performed manually, looks something like the following.

1. Define the hardware characteristics of the virtual machine.
2. Locate a host with available resources.
3. Create storage for the virtual machine.
4. Create the virtual machine.
5. Install an operating system on the virtual machine.
6. Install software on top of the operating system of the virtual machine.
7. Configure the software on the virtual machine.
8. Register the virtual machine with the Configuration Management Database.
9. Monitor the configuration of the virtual machine and the availability of the service it provides.

The last five steps of this process can be automated by leveraging Kickstarts, Activation Keys, Stored Profiles, and System Groups in RHN Satellite. That capability is well-documented elsewhere. Our catalog implementation will focus on the process of creating a virtual machine and associating it with a given Kickstart Profile. The Cobbler API will provide the glue between the virtual infrastructure API and the Satellite API to get us from step one through to step five. The last step in the process will be accomplished by using all three APIs, with the Cobbler machine record's MAC address being used as a foreign key between the data sets.

Approaching the virtualization API

With the release of Red Hat Enterprise Linux 5.0, Red Hat began shipping libvirt with the Xen hypervisor. The libvirt API was designed to provide an open standard which could be used to manipulate virtual infrastructures regardless of the particular virtualization technology used. While some of the language remains Xen-specific (virtual machines as "domains", for example), the focus of development within Red Hat has moved towards the KVM hypervisor, which began shipping in Red Hat Enterprise Linux 5.4. The libvirt API provides a robust interface for managing virtual machines in Red Hat Enterprise Linux 6.0, the target hypervisor platform for our Service Catalog, but this API could also be used to manage VMware, Xen, or Linux Container infrastructure.

While managing a local hypervisor with libvirt is relatively simple, managing remote hosts can require quite a bit of system configuration, due to the library's reliance on TLS and SASL for encryption, authentication, and authorization. To mitigate some of these complexities, we've chosen to use the libvirt QMF interface. This has the added benefit of allowing us to view all participating hosts and virtual machines through a single connection.
This greatly simplifies the host infrastructure and should provide us with an easier way to scale horizontally to more hosts in the long run. The libvirt-qpid service is available in Fedora 10 and higher, as well as the current Red Hat Enterprise Linux 6 Beta.

In our example code, the libvirt QMF API connection is wrapped by the podbay.virt.Realm object, which provides access to all of the VirtualMachine, Host, StoragePool, and Volume objects available in the environment. The initializer demonstrates how to create a connection to the Qpid Broker.

    from qmf.console import Session

    class Realm:
        def __init__(self, uri='amqp://localhost:5672'):
            self.uri = uri
            self.session = Session()
            self.session.addBroker(uri)

The VirtualMachine, Host, StoragePool, and Volume objects in the virtual infrastructure are made available as properties of the Realm object by making the getObjects() call to the session.

    @property
    def vms(self):
        l = []
        for d in self.session.getObjects(_class='domain', _package='com.redhat.libvirt'):
            l.append(VirtualMachine(d))
        return l

Lastly, each of the managed objects made available by the QMF session is wrapped so that we can add methods and properties.

    class VirtualMachine:
        def __init__(self, mo):
            self.mo = mo

        @property
        def name(self):
            return self.mo.name

Using the wrapped objects is simple. The following code example will start a virtual machine with a given MAC address.

    from podbay.virt import Realm

    r = Realm(uri='amqp://localhost:5672')
    for v in r.vms:
        if v.mac == "52:54:00:03:ee:3b":
            v.start()

All libvirt objects are defined in XML format. To provision storage for a virtual machine and then provision the virtual machine, we need to define a volume in a given storage pool and then define the domain on the host. Defining the storage volume is handled by the createVolumeXML() method on the podbay.virt.StoragePool object, which wraps the same method on the QMF pool managed object.

    def createVolumeXML(self, xml):
        r = self.mo.createVolumeXML(etree.tostring(xml))
        if r.status == 0:
            for i in xml.iter('name'):
                name = i.text
            return Volume(self.mo.getBroker().session.getObjects(
                _class='volume', _package='com.redhat.libvirt', name=name)[-1])
        else:
            raise Exception("Unable to create Volume: %s" % r.text)
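For illustration, the volume description passed to createVolumeXML() can be built with lxml. The following is a minimal sketch rather than podbay code; the volume name and capacity are arbitrary values, and it assumes that the Realm object exposes storage pools through a pools property analogous to vms.

    from lxml import etree
    from podbay.virt import Realm

    # A minimal libvirt volume description: an 8 GB raw volume.
    # Capacity is expressed in bytes; the accepted element set depends on
    # the type of the target storage pool.
    vol_xml = etree.fromstring(
        "<volume>"
        "<name>guest01_root</name>"
        "<capacity>8589934592</capacity>"
        "<target><format type='raw'/></target>"
        "</volume>")

    r = Realm(uri='amqp://localhost:5672')
    pool = r.pools[0]                 # assumes Realm exposes pools alongside vms
    volume = pool.createVolumeXML(vol_xml)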
The podbay.virt.Host object has a convenience method, create_volume_from_xml(), which tries to create the described volume on each of a given host's storage pools and returns the first successfully created object.

Virtual machines are created via a similar process. The create_vm_from_xml() method on the podbay.virt.Host object wraps the domainDefineXML() method on the QMF node managed object. The domain, pool, and volume objects in the libvirt QMF interface all have getXMLDesc() methods which expose the XML description of the object. This description is available as an lxml.etree Element via the xml property on each of the corresponding podbay.virt objects.

Using the Cobbler API to Automate Network Management

Once a virtual machine has been created and started via the libvirt API, it will need to be given an IP address and a hostname before it can have the OS provisioned and be put into service. This process is traditionally performed manually, due to the real danger of IP address conflicts. For organizations which are uncomfortable with automated IP provisioning and DNS modification, this step in the process (or any step in this process) could be replaced by placing a ticket into an appropriate queue for an engineer to pick up. We're attempting to create a fully automated Service Catalog with podbay, so we'll use the Cobbler project's API to drive DHCP and DNS. Cobbler was integrated into the 5.3 version of Red Hat Network Satellite as a PXE provisioning engine, but the API that we'll be using is not a fully supported offering by Red Hat.

The Cobbler API contains distro, profile, system, image, and repo objects. We'll be managing the software installation process with the Satellite API, so we'll only need to use the profile and system objects to create systems and associate Satellite Kickstart profiles with them. When Cobbler is managed by RHN Satellite, Satellite will create a profile object for each Kickstart Profile in the system. We'll be creating system objects using these pre-created profile objects. The Cobbler API is available over XML-RPC. The following code in the initializer of the Cobbler object in podbay.cobbler creates a connection to the API:

    import xmlrpclib

    class Cobbler:
        def __init__(self, url, user, password):
            self.conn = xmlrpclib.Server(url)
            self.token = self.conn.login(user, password)

The profiles and systems properties on the Cobbler object return a list of dictionary objects describing the created profiles or defined systems by wrapping the get_profiles() and get_systems() calls.

    @property
    def profiles(self):
        return self.conn.get_profiles()
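As a quick illustration of the wrapper (this is a sketch, not part of the podbay views), the following connects to a Satellite-managed Cobbler instance and prints the available profile names; the URL and credentials are placeholders for a specific environment.

    from podbay.cobbler import Cobbler

    c = Cobbler('http://satellite.localdomain/cobbler_api',
                'administrator', 'password')
    for p in c.profiles:
        # get_profiles() returns a list of dictionaries; for a Satellite-managed
        # Cobbler instance, the 'name' key holds the profile name, including the
        # organization number and name (e.g. "VirtHost:1:RedHatInc").
        print(p['name'])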
Creating a system record in Cobbler is a bit more involved. First, we'll make the new_system() call with the token we obtained from the login() call. This returns a unique identifier for the new system object. Then we'll send a set of modify_system() calls to set the name, profile, and network information on the new system. Lastly, we call the save_system() and sync() functions to commit the new system record.

    def add_system(self, name=None, mac=None, profile=None, ip=None):
        system_id = self.conn.new_system(self.token)
        self.conn.modify_system(system_id, "name", name, self.token)
        self.conn.modify_system(system_id, "modify_interface",
                                {'macaddress-eth0': mac,
                                 'ip_address-eth0': ip,
                                 'dns_name-eth0': name + ".localdomain"},
                                self.token)
        self.conn.modify_system(system_id, "profile", profile, self.token)
        self.conn.save_system(system_id, self.token)
        self.conn.sync(self.token)
        return system_id

The sync() call will take some time, as it restarts the DHCP and DNS services on the Cobbler server. It may make sense to move that call out into a different thread as environments grow in size. This example also expects a common domain of localdomain, which should be tuned for specific test environments.

The Cobbler.add_system() call could also be used to provision the OS on a bare-metal system via PXE, as long as the network information is known ahead of time. The following code should create DHCP, DNS, and PXE boot records for a bare-metal system with a MAC address of "00:26:2D:F1:F3:CD" and associate it with a Kickstart profile named "VirtHost:1:RedHatInc". Note that the Cobbler names for Satellite profiles include the organization number and name.

    from podbay.cobbler import Cobbler

    c = Cobbler('http://satellite.localdomain/cobbler_api', 'administrator', 'password')
    c.add_system(name="virthost01",
                 mac="00:26:2D:F1:F3:CD",
                 profile="VirtHost:1:RedHatInc",
                 ip="192.168.122.93")

Assembling the Menu

Now that we can create virtual machines and associate them with network configurations and Satellite Kickstarts, we'll need to limit the virtual hardware configurations to a set of pre-defined systems and simplify the provisioning call. To accomplish this, we'll implement an interface patterned after the Deltacloud project, which provides a nomenclature and structure for dealing with IaaS clouds. The Deltacloud API uses Realm, Hardware Profile, Image, and Instance objects to describe potential and instantiated cloud resources. Realms are used to define a boundary in the cloud such as a physical data center or segregated network. In podbay, we'll only implement a single Realm named "default". Instances describe running virtual machines, which in podbay are the union of libvirt domains and Cobbler and Satellite system objects. In the case of bare-metal systems, an Instance is the combined Cobbler and Satellite system object. A virtual Instance is created by associating an Image object with a Hardware Profile object. In podbay, an Image is a Kickstart Profile. An Image has owner_id, name, description, and architecture properties exposed via the podbay REST API.
Hardware Profiles are the only stateful data in podbay, and they consist of libvirt XML templates which are stored in plain text on the local filesystem. podbay.virt has HardwareProfile and StorageProfile objects, which are used to describe each Hardware Profile and its associated storage. The list of available Hardware Profiles is assembled by reading the XML files from a pre-defined directory.

    def get_hardware_profiles():
        hprofiles = []
        for f in os.listdir(podbay.settings.LIBVIRT_HPROFILES_DIR):
            if f[-3:].lower() == 'xml':
                try:
                    p = HardwareProfile().from_XMLDesc(f)
                    hprofiles.append(p)
                except TypeError, te:
                    pass
        return hprofiles

TypeError exceptions are caught because the profiles directory also contains StorageProfile XML files, and trying to instantiate a HardwareProfile object from a StorageProfile XML file throws a TypeError.

The add_system() function in podbay.utils demonstrates how a Hardware Profile is associated with an Image to create an Instance. It takes image_id, realm_id, hardware_profile_id, name, and ip arguments and returns the name of the provisioned Instance. It starts by fetching the HardwareProfile object referenced by the given hardware_profile_id.

    for hp in get_hardware_profiles():
        if hp.id == hardware_profile_id:
            hardware_profile = hp

Then it rewrites the names in the XML template with the name provided for the Instance.

    for sp in hardware_profile.storageprofiles:
        for i in sp.xmldesc.iter('name'):
            i.text = name + "_" + i.text
    for i in hardware_profile.xmldesc.iter():
        if i.tag == 'name':
            i.text = name
        if i.tag == 'description':
            i.text = hardware_profile.id

Note that we use the libvirt domain XML description to contain the Hardware Profile used to instantiate the Image. This allows us to easily associate the two when reporting on existing Instances. Now that we have an XML description of the new virtual machine, we'll provision the storage, and if that succeeds, provision the virtual machine.

    # Provision the storage
    volumes = {}
    for sp in hardware_profile.storageprofiles:
        volumes[sp.id] = h.create_volume_from_xml(sp.xmldesc)
    # Add the storage to the XMLDesc
    for i in hardware_profile.xmldesc.iter('disk'):
        for j in i.iter('source'):
            if i.get('type', None) == 'file':
                path = volumes[j.get('file', None)]
                j.set('file', path)
            elif i.get('type', None) == 'block':
                path = volumes[j.get('dev', None)]
                j.set('dev', path)
    # Create the VM:
    vm = h.create_vm_from_xml(hardware_profile.xmldesc)

If we've successfully created the virtual machine and its storage, we'll add a system record to our Cobbler object via the add_system() call and start the virtual machine.

    if vm is not None:
        system_id = cblr.add_system(name=name, mac=vm.mac,
                                    profile=image_id, ip=ip)
        ...
        try:
            vm.start()

From there, Cobbler will do the heavy lifting, adding the DNS and DHCP records which will allow the virtual machine to PXE boot and install Red Hat Enterprise Linux according to the given Kickstart Profile. Once the system has finished Kickstarting, it will register itself with Satellite, which will create a system record for the Instance.
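To tie the pieces together, the following is a minimal sketch of invoking the high-level call described above. It is not taken from the podbay code; the Image name, Hardware Profile id, hostname, and IP address are illustrative values for a hypothetical environment.

    from podbay.utils import add_system

    # image_id is a Satellite Kickstart profile (as named by Cobbler),
    # hardware_profile_id refers to one of the XML templates on disk, and
    # the name/ip pair is handed to Cobbler for the DHCP and DNS records.
    instance_name = add_system(image_id="WebServer:1:RedHatInc",
                               realm_id="default",
                               hardware_profile_id="small",
                               name="webserver01",
                               ip="192.168.122.50")
    print(instance_name)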
Managing Systems

The second element of our Service Catalog is system management. We'll use the Satellite API to manage the operating system of a provisioned physical or virtual machine, and we'll use the libvirt QMF interface to manage the hardware of a provisioned virtual machine. Many interesting management functions are available via the Satellite API, including scheduling package installations and upgrades and running scripts across groups of machines. We'll look at implementing one of these calls, scheduleApplyErrata(), for our podbay Instances.

Putting the Satellite API to Work

In the last step of the Kickstart provisioning process, a system runs rhnreg_ks and joins itself to the Satellite. This creates a system object which can be manipulated via the Satellite XML-RPC interface. The initializer of the podbay.satellite.Satellite class demonstrates how to connect to the API.

    import xmlrpclib

    class Satellite:
        def __init__(self, url, user, password):
            self.conn = xmlrpclib.Server(url)
            self.key = self.conn.auth.login(user, password)
            self.userdetails = self.conn.user.getDetails(self.key, user)

A list of managed systems is exposed via the systems property, which wraps the system.listSystems() call.

    @property
    def systems(self):
        l = []
        for system in self.conn.system.listSystems(self.key):
            system['mac'] = self.get_mac(system['id'])
            l.append(system)
        return l

The MAC address of the system record is used to associate the Satellite system object with the Cobbler and libvirt objects. This is retrieved via the system.getNetworkDevices() call.

    def get_mac(self, system_id):
        for dev in self.conn.system.getNetworkDevices(self.key, system_id):
            if dev['interface'] == 'eth0':
                return dev['hardware_address'].lower()
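As a small illustration (again a sketch rather than podbay code), the wrapper can be used to list managed systems along with the MAC addresses that tie them to the Cobbler and libvirt records; the API URL and credentials are placeholders.

    from podbay.satellite import Satellite

    s = Satellite('https://satellite.localdomain/rpc/api',
                  'administrator', 'password')
    for system in s.systems:
        # Each entry is the dictionary returned by system.listSystems(),
        # with the 'mac' key added by the wrapper above.
        print("%s %s %s" % (system['id'], system['name'], system['mac']))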
Once we've identified the correct system record, we can query a list of unapplied patches with the system.getUnscheduledErrata() call and apply them with the system.scheduleApplyErrata() call.

    def update_system(self, system_id):
        updates = self.list_errata(system_id)
        d = self.conn.system.scheduleApplyErrata(
            self.key, system_id, [x['id'] for x in updates])
        return d

The update_system() call in podbay.views demonstrates the use of the Satellite class in conjunction with the Cobbler class.

    def update_system(request, system_name=None):
        cblr = Cobbler(podbay.settings.COBBLER_URL,
                       podbay.settings.SATELLITE_USER,
                       podbay.settings.SATELLITE_PASS)
        s = Satellite(podbay.settings.SATELLITE_URL,
                      podbay.settings.SATELLITE_USER,
                      podbay.settings.SATELLITE_PASS)
        for system in cblr.systems:
            if system['name'] == system_name:
                for rhn_system in s.systems:
                    if rhn_system['mac'] == system['interfaces']['eth0']['mac_address']:
                        s.update_system(rhn_system['id'])

We can use the same logic to identify virtual machines based on MAC address. An example of this is the start_system() call in podbay.views.

    def start_system(request, system_name=None):
        cblr = Cobbler(podbay.settings.COBBLER_URL,
                       podbay.settings.SATELLITE_USER,
                       podbay.settings.SATELLITE_PASS)
        for system in cblr.systems:
            if system['name'] == system_name:
                for vm in get_vms():
                    if vm.mac == system['interfaces']['eth0']['mac_address']:
                        vm.start()
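The catalog also needs to deprovision Instances, and the same MAC-address matching supports that. The following is a rough sketch of such a view, not taken from the podbay code: it assumes that the VirtualMachine wrapper exposes the underlying QMF domain's destroy() and undefine() methods, and it calls Cobbler's remove_system() XML-RPC function directly on the connection rather than through a wrapper method.

    import podbay.settings
    from podbay.cobbler import Cobbler
    from podbay.virt import Realm

    def destroy_system(request, system_name=None):
        cblr = Cobbler(podbay.settings.COBBLER_URL,
                       podbay.settings.SATELLITE_USER,
                       podbay.settings.SATELLITE_PASS)
        r = Realm(podbay.settings.QMF_BROKER)
        for system in cblr.systems:
            if system['name'] == system_name:
                for vm in r.vms:
                    if vm.mac == system['interfaces']['eth0']['mac_address']:
                        vm.destroy()     # assumed wrapper around the QMF call
                        vm.undefine()    # assumed wrapper around the QMF call
                # Remove the Cobbler record and its DHCP/DNS entries.
                cblr.conn.remove_system(system_name, cblr.token)
                cblr.conn.sync(cblr.token)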
Presenting a Unified Instance List

The get_system_list() call in podbay.utils demonstrates how to bring the three APIs that we've looked at in this guide together to present a single view of podbay's managed Instances. At the beginning of the call, we make a copy of each of the subsystems' system lists.

    def get_system_list():
        s = Satellite(podbay.settings.SATELLITE_URL,
                      podbay.settings.SATELLITE_USER,
                      podbay.settings.SATELLITE_PASS)
        cblr = Cobbler(podbay.settings.COBBLER_URL,
                       podbay.settings.SATELLITE_USER,
                       podbay.settings.SATELLITE_PASS)
        r = Realm(podbay.settings.QMF_BROKER)
        sat_systems = s.systems
        cblr_systems = cblr.systems
        vms = r.vms
        hprofiles = get_hardware_profiles()

Since podbay is stateless (with the exception of Hardware Profiles), it uses Cobbler for an authoritative system list. We iterate through the Cobbler system records and build a list of dictionaries containing system information.

    ...
    for system in cblr_systems:
        d = {'name': system['name'],
             'ip_address': system['interfaces']['eth0']['ip_address'],
             'mac_address': system['interfaces']['eth0']['mac_address'].lower(),
             'state': "Unknown",
             'hardware_profile_id': "Unknown",
             }

From the matching VirtualMachine object in podbay.virt, we get the virtual machine state and hardware information. We also match the hardware_profile_id from the description field in the libvirt domain XML to one of our HardwareProfile objects.

    for vm in vms:
        if vm.mac == d['mac_address']:
            d['state'] = vm.state
            d['memory'] = int(vm.maxmem) / 1024
            d['cpus'] = vm.nrvirtcpu
            for hp in hprofiles:
                if vm.description == hp.id:
                    d['hardware_profile_id'] = hp.id

From the Satellite system list, we get the rhnid, which we can use to make Satellite API calls.

    for system in sat_systems:
        if system['mac'] == d['mac_address']:
            d['lifecycle'] = "Managed"
            d['rhnid'] = system['id']

The resulting list of dictionaries is used to create the list of Instances for both the HTML interface and the REST API. Note that the superset of data is richer than its components - for example, if we have a Cobbler record and no libvirt record, we can assume that the Instance is physical. If we have a Cobbler record, a libvirt record, but no Satellite record, we can assume that the virtual machine has just been created and is in the process of provisioning an operating system.

Advanced Catalog Automation

One of the advantages of bridging best-of-breed technologies like Satellite and libvirt is being able to leverage the depth of capabilities that are available. While high-level tools often provide a quick implementation path and a low cost of ownership, low-level tools with open APIs allow mature IT organizations to fit their technological processes tightly with their business needs.

Since the RHN Satellite API allows for the automation of software deployment, one of the obvious areas where advanced automation could occur is in the software lifecycle management arena. One can imagine a system where a tag of a software version in a repository makes a remote call to a Service Catalog, which builds out an entire test environment on an infrastructure cloud and kicks off an automated test suite. When testing is finished for that tag, the results could be recorded and another call could destroy the entire environment.

Another area where Service Catalog automation could have a great impact is in the on-demand deployment of additional capacity. The podbay code as it stands provides this functionality with its REST API. Assuming that a number of hosts are available in the system and a service can be automatically deployed via Kickstart, a simple HTTP POST to the podbay API will provision or deprovision an instance of the service, depending on capacity requirements. This would allow system resources to be programmatically allocated or freed to different scalable services.
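As a closing illustration, such a capacity-driven call could look like the following sketch. The endpoint path and parameter names are hypothetical, modeled on the Deltacloud nomenclature rather than documented podbay URLs (the actual routes live in podbay's Django URL configuration), and the Image, Hardware Profile, and addressing values are placeholders.

    # Hypothetical REST call to provision an additional instance of a service.
    import urllib
    import urllib2

    params = urllib.urlencode({
        'image_id': 'WebServer:1:RedHatInc',     # Kickstart profile acting as the Image
        'hardware_profile_id': 'small',          # illustrative Hardware Profile name
        'realm_id': 'default',
        'name': 'webserver02',
        'ip': '192.168.122.94',
    })
    urllib2.urlopen('http://catalog.localdomain/api/instances', params)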
Notes

1. http://fedorahosted.org/podbay
2. http://fedorahosted.org/podbay/wiki/installation