Implementing a CMS: First Steps
A Case Study by National Australia Group, September 2008

Contents

Introduction
  Spreadsheet Cells and Brain Cells
  The Goal
Stage 1 Planning
  Sponsorship
  The Scope
  Determining the Detail
  Management Commitment
  Product Selection
  Assign Plenty of Time
Stage 2 Implementation
  Avoid customisation
  Test and Verify
  Security
  Capacity
  Promotion to Production
  Expectations
Conclusion

Introduction

A Configuration Management System (CMS) is crucial for the successful management of IT services, infrastructure and assets. Without it, important service decisions can and will be flawed. The foundation of any CMS is the accuracy and availability of its source data, and maintaining both can be difficult. Multiple data sources and an ever-changing environment contribute greatly to this challenge.

The CMS employed by National Australia Group (NAG) did assist the Service Management teams with their day-to-day decisions. However, three factors meant that our CMS was at best inefficient and at worst inaccurate: the ever-increasing complexity of infrastructure and services; a high dependency on people (subject matter experts and manual processes) for data input; and little, if any, integration between the various inputs. Automating the collection and maintenance of data on physical device CIs, software CIs and their relationships, and holding this data in an appropriate CMDB, would improve the efficiency and accuracy of the CMS.

Spreadsheet Cells and Brain Cells

The Change, Problem and Incident processes rely heavily on information about the services they help to support. Details of the technology, systems and processes used to support these services, and of how each relates to the others, are crucial in maintaining service levels and continually improving the service.

At NAG we held a lot of data in various disparate data stores: physical CMDBs (spreadsheets, databases, document stores etc.) and virtual CMDBs, that is, individual subject matter experts with a vast amount of experience and knowledge of the various services and technologies. While in some areas there was a surfeit of information, it could prove challenging to integrate the data.
The continuous improvement of services added complexity, particularly within environments where the infrastructure was shared among services. Together with the widespread data sources, this often made it difficult to get a coherent and meaningful picture of the various end-to-end business processes from a technology perspective. The diversity of data sources also resulted in conflict: where data sources differed, which was the reliable one?

The overall result was that ITIL processes suffered. Incidents could be categorised incorrectly, with severity and impact being incorrectly diagnosed. Problems could take longer to resolve due to a lack of understanding of the complete end-to-end service. Incidents due to change were common as risk assessments were inaccurate.

Twelve months ago NAG identified that, while the processes did protect the services in the main, there were breaches of SLAs that could be eradicated by improving the data available to the CMS. A project to automate the discovery of CIs and integrate CMDBs was instigated.

The Goal

To improve the quality of data provided to the CMS we determined that an accurate inventory of CIs was required. Initially these CIs would be limited to the physical devices attached to our network, the software CIs hosted on those physical CIs, and their relationships to each other. To achieve this, tools would be deployed that would automatically and regularly collect the required CI data.

A new CMDB would be built to contain integrated data reconciled from the discovery tools and other data sources as required. This CMDB would hold key CI data, while a federated approach would allow more detailed information to be stored in other databases. This federated data would be accessible from the integrated CMDB, giving us our initial goal: the single source of truth. On completion we would have a system on which we could build further enhancements to our CMS and achieve a secondary goal: the foundation for a fully integrated CMS.

Stage 1 Planning

Sponsorship

The need to improve the CMS data was clear and gaining sponsorship was relatively easy. However, satisfying the budget requirements was challenging and compromises had to be negotiated. This had a major impact on what would eventually be the scope of the project.

The Scope

Our requirements were to automate the collection of configuration data pertaining to the technology components that supported the services we provided to our business units. However, this in itself was not enough to determine the scope. Determining the consumers of the data, and how they would gain access to it, was a chief consideration. We made a conscious decision to collect data for the initial purpose of providing an inventory database. The data would be stored in a CMDB whose key requirement was the ability to integrate with the existing technologies used to support our CMS.
Integrating with other technologies at this stage was out of scope, but ensuring that this functionality was available was critical to the future development of the system. Removing this requirement from the original scope allowed the project to proceed quickly.

Determining the Detail

The level of detail to be contained within the CMDB was heavily debated, and getting it right was critical. Too much detail and the database(s) would be large and unmanageable; too little and vital information could be missed. People generally wanted as much detail as possible included. Because our project made use of multiple data sources, both new and existing, and the initial scope had been defined, the main challenge was determining whether a given piece of data was required in the integrated CMDB or was detail that could reside in a federated database.

Management Commitment

The implementation of automated tools of any kind results in a degree of cultural change. Knocking down the walls of the technology silos can be challenging and, without management support, nearly impossible. Ensuring the support of the management teams helped promote this project and contributed to its success.

Product Selection

Several companies were engaged in initial discussions regarding the tools available to achieve our aims. The selection process we used looked at the following:

- Current relationship with vendor
- Confidence in Professional Services
- Cost of tools and services
- Scope of discovery tools
  o Hardware
  o OS and application
  o Network Topology mapping
- Supports data federation
- Reporting capabilities
- Integration into CMS

Once the product was selected, a close relationship with the vendor was maintained to ensure that the deployment was successful. The vendor's professional services division was engaged to assist with the implementation. While we knew our systems and services, they knew their tool sets.

Assign Plenty of Time

Collecting data from automated discovery tools is in itself a fairly quick and easy process. However, analysing the data, classifying discovered CIs and filtering out unwanted details all take time to plan and implement. Identifying relationships between CIs and mapping them correctly to services is also a time-consuming process. While tools can detect lines of communication through the infrastructure, they cannot determine the relationship to the service provided. That requires manual effort, and this too can take a considerable amount of time. However, once the initial mapping is complete, maintaining the maps does not require much resource.

Stage 2 Implementation

Avoid customisation

While the tools we deployed supported customisation, deviating from the tool's standards could complicate the various data mappings going forward. After determining what data was required, we analysed how this data related to the tool's data model. This analysis determined that our requirements would fit into the data model provided by the tools. Our recommendation is that, if customisation cannot be avoided, it is kept to a minimum. It is a lot easier to maintain an out-of-the-box solution.

Test and Verify

During the implementation of the discovery tools it was quickly identified that a staged approach was required so that data verification could be carried out easily and quickly. Discovering a specific CI category and correcting anomalies before progressing to the next stage of discovery ensured that the collected data was accurate. Trying to validate the data collected from a single pass of our entire estate would have been difficult and time consuming.

After collecting the data, the reconciliation into the CMDB had to be tested. When multiple data sources are being used it is crucial that the correct order and priorities are assigned. The reconciliation of data also requires careful consideration of the unique key field on which data can be reconciled. At first serial numbers seemed to be the obvious choice, but it soon became apparent that this brought its own difficulties. In today's climate of virtualisation, several systems can have the same serial number, thus negating the uniqueness of this attribute as a key field!

Security

Discovery tools require a high level of access to computer systems in order to collect the greatest amount of data, and as such they can prove to be a security threat. The security of computer systems in any industry, not just the financial sector, is paramount, and it was of great importance that we ensured no vulnerabilities were introduced.
User account administration introduces its own problems, such as maintaining restricted access and complying with forced password change policies. We collaborated with security administrators to provide the best practical and compliant solution for our environment.

Capacity

Two new discovery tools were being deployed as part of this project. One would reside on a central system and interrogate the network for attached devices. The other required the installation of a small program on each computer system, which would interrogate its host and then send data back to a central administration system. While the latter could potentially impact service by using the resources of the hosting system, both tools would add to our network traffic and could therefore potentially impact our business services. Collaboration with our network administration and capacity management teams, and the execution of a series of test discoveries, allowed us to show that in both cases the impact on existing services was negligible and the project could proceed.
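
The reconciliation concerns raised under Test and Verify (source order, priorities and the choice of key field) can be illustrated with a minimal sketch. It is not the logic of the tools NAG deployed: the source names, the Record structure and the choice of serial number plus hostname as a composite key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical source names, listed from most to least trusted.
SOURCE_PRIORITY = ["agent_discovery", "network_discovery", "manual_register"]


@dataclass
class Record:
    """One CI as reported by a single data source (illustrative structure)."""
    source: str
    hostname: str
    serial: str
    attributes: dict = field(default_factory=dict)


def reconciliation_key(record):
    # A serial number alone is not unique: virtual machines can report the
    # serial number of their host. Assumption: pairing it with a normalised
    # hostname is unique enough for this sketch.
    return (record.serial.strip().upper(), record.hostname.strip().lower())


def reconcile(records):
    """Merge records per key; more trusted sources win attribute conflicts."""
    rank = {name: position for position, name in enumerate(SOURCE_PRIORITY)}
    merged = {}
    # Apply the least trusted source first so that later, more trusted
    # sources overwrite any attribute they also supply.
    # A source missing from SOURCE_PRIORITY raises KeyError by design.
    for record in sorted(records, key=lambda r: rank[r.source], reverse=True):
        ci = merged.setdefault(reconciliation_key(record), {})
        ci.update(record.attributes)
    return merged


sample = [
    Record("manual_register", "APP01", "SN123", {"location": "DC1", "os": "Unknown"}),
    Record("agent_discovery", "app01", "sn123", {"os": "Linux"}),
    # A second system sharing the same serial number, as a virtual guest might.
    Record("agent_discovery", "app02", "SN123", {"os": "Windows"}),
]
merged = reconcile(sample)
for key, attributes in sorted(merged.items()):
    print(key, attributes)
```

Keying on the serial number alone would have collapsed the two systems in the sample into one CI, which is the difficulty described above.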

Promotion to Production

At the conclusion of the project we had three new production systems. Two systems automatically discovered our IT infrastructure and recorded the results in their own separate CMDBs, while the third provided us with an integrated CMDB. Data from the two discovery CMDBs was being reconciled into the integrated CMDB along with data from two other manually maintained data sources.

The integrated CMDB provides data that can be used to map technology to services, providing up-to-date views of the relationships between a service and the CIs on which it depends. CIs can support multiple services and these relationships are also held in the integrated CMDB. This data is readily available, giving us our single source of truth.

Expectations

Now that the automated discovery tools and integrated CMDB have been deployed, it is expected that the data will be used to assist the change, problem and incident management processes.

Change:
- Better understanding of change overlap
- Improved risk assessment
- Assistance for the change owner in understanding dependencies

Problem:
- Simplified association of CIs with services
- Improved root cause analysis investigation
- Reduction in reoccurrences

Incident:
- Improved view of overall service
- Improved impact analysis
- Aid to rapid recovery

As our services become ever more complex, the task of managing them becomes increasingly difficult. A failure of a single component may on the surface look quite innocuous, but the knock-on effect downstream could potentially be disastrous. The automated discovery and integrated CMDB give a complete end-to-end view of our services and CIs in both directions (from service to CI and from CI to service). There will not be any more surprises. If a CI is impacted it is easy to determine the services impacted. All of these improvements ultimately improve the levels of service delivered.
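
The two-directional view described above, from service to CI and from CI to service, amounts to walking a dependency graph in either direction. The sketch below is one way to picture it, not a description of the integrated CMDB's data model; the ServiceMap class and the service and CI names are hypothetical.

```python
from collections import defaultdict


class ServiceMap:
    """Illustrative dependency map between services and CIs."""

    def __init__(self):
        self._services = set()
        self._dependencies = defaultdict(set)  # item -> items it depends on
        self._dependents = defaultdict(set)    # item -> items depending on it

    def add_service(self, name):
        self._services.add(name)

    def depends_on(self, dependent, dependency):
        """Record that `dependent` (a service or CI) relies on `dependency`."""
        self._dependencies[dependent].add(dependency)
        self._dependents[dependency].add(dependent)

    @staticmethod
    def _walk(start, edges):
        # Iterative traversal that also follows indirect relationships.
        seen, stack = set(), [start]
        while stack:
            for neighbour in edges.get(stack.pop(), ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        return seen

    def impacted_services(self, ci):
        """CI to service: every service that relies on this CI, even indirectly."""
        return self._walk(ci, self._dependents) & self._services

    def supporting_cis(self, service):
        """Service to CI: every CI the service relies on, even indirectly."""
        return self._walk(service, self._dependencies) - self._services


# Hypothetical services and CIs, for illustration only.
service_map = ServiceMap()
service_map.add_service("Internet Banking")
service_map.add_service("Payments")
service_map.depends_on("Internet Banking", "web01")
service_map.depends_on("web01", "db01")
service_map.depends_on("Payments", "db01")

print(sorted(service_map.impacted_services("db01")))
print(sorted(service_map.supporting_cis("Internet Banking")))
```

Because the traversal follows indirect relationships, a shared database server is reported against every service that sits above it, which is the kind of knock-on effect that impact analysis needs to expose.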

Conclusion

While the automation of discovery and the implementation of an integrated CMDB have resulted in an improved CMS, this is not the end of the story. Our secondary goal was to lay the foundation on which we could build our CMS, and I believe that we have achieved this also.

There are still multiple CMDBs in use within NAG, and I am sure that this will always be the case. However, there is less reliance on manual processes, and the integrated CMDB will allow us to reconcile the data from all of our CMDBs. It is our intention that key information taken from disparate CMDBs will be reconciled into the integrated CMDB while more detailed data remains in the relevant CMDB. Users will make requests of the integrated CMDB and, where more detail is required, gain access to it through automated federation processes.

Manual processes have not disappeared entirely, as there is a limit to what data automated discovery tools can collect. Physical location, for example, is an attribute of a physical CI that cannot be generated automatically unless the CIs are electronically tagged. A recent data centre migration included an extensive audit of one of our data centres. This audit provided valuable data on machine locations. Although this data is manually maintained, the migration has ensured that it is correct, and a new management process will ensure that its accuracy is maintained. In the coming months this process will be rolled out to our other data centre, which will result in an accurate database of all machine locations. This data, too, will be reconciled into our integrated CMDB.

The implementation of our integrated CMDB is not the end of our journey, just the beginning. While in its current guise the integrated CMDB provides benefit to other ITIL processes, its full potential can only be realised once the supporting systems for these processes are also integrated.
Asset, Change, Problem and Incident systems will in due course be fully integrated, turning our integrated CMDB into an integrated CMS.
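
The federated approach described in this paper, where key CI data sits in the integrated CMDB and fuller detail stays in the relevant source, can be sketched as follows. This is a sketch under stated assumptions rather than a description of NAG's federation processes: the IntegratedCMDB class, the location_register source and all identifiers are hypothetical.

```python
class IntegratedCMDB:
    """Illustrative store: key data held centrally, detail fetched on request."""

    def __init__(self):
        self._key_data = {}       # ci_id -> key attributes held centrally
        self._detail_source = {}  # ci_id -> name of the federated source
        self._sources = {}        # source name -> callable(ci_id) returning a dict

    def register_source(self, name, fetch):
        """Register a federated data source under a hypothetical name."""
        self._sources[name] = fetch

    def add_ci(self, ci_id, key_data, detail_source=None):
        self._key_data[ci_id] = dict(key_data)
        if detail_source is not None:
            self._detail_source[ci_id] = detail_source

    def get(self, ci_id, with_detail=False):
        """Return the key data, optionally extended with federated detail."""
        record = dict(self._key_data[ci_id])
        source = self._detail_source.get(ci_id)
        if with_detail and source is not None:
            # The detail is looked up on demand, not copied into the CMDB.
            record["detail"] = self._sources[source](ci_id)
        return record


# A manually maintained location register stands in for a federated database.
location_register = {"srv001": {"data_centre": "DC1", "rack": "R12"}}

cmdb = IntegratedCMDB()
cmdb.register_source("location_register", lambda ci_id: location_register.get(ci_id, {}))
cmdb.add_ci("srv001", {"type": "server", "hostname": "app01"}, "location_register")

print(cmdb.get("srv001"))
print(cmdb.get("srv001", with_detail=True))
```

Keeping only a pointer to the detail source keeps the integrated CMDB small, at the cost of depending on the federated source being available when the detail is requested.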