Widening the Configuration Management Perspective

Lars Bendix (bendix@cs.lth.se)
Department of Computer Science, Lund Institute of Technology, P. O. Box 118, S-221 00 Lund, Sweden

Abstract: A metainformatics symposium provides an excellent opportunity to explore, discuss and investigate how configuration management techniques and principles can be expanded into new fields and how configuration management itself can incorporate techniques from other fields.

1. Software Configuration Management

For a long time Software Configuration Management (SCM) has been considered an important part of software engineering and an indispensable activity on software development projects. SCM is the discipline of organising, controlling and managing the development and evolution of software systems. In particular, emphasis is put on: configuration identification, to uniquely identify a product and its parts; configuration control, to apply changes to a product in a disciplined and orderly way; configuration status accounting, to collect and report data about the status of the development of the product; and configuration audit, to verify and validate that the right product has been built and built the right way. Thus in general SCM is seen as a vehicle to ensure consistency and quality of the product being produced.

This is, however, just one face of SCM: something that satisfies the needs of a company that has to deliver products to customers. Another face of SCM is that presented by Babich [Babich] in 1986, where he considers SCM a means of obtaining team productivity by co-ordinating the efforts of a group of people. In his opinion the facilitation of collaboration and synchronisation of work should be a main focus of SCM. Thus focus is shifted, or widened, from consistency and quality of the product itself to consistency and quality in the workflow processes of the people that have to create the product.
Unfortunately the traditional SCM world has not been very responsive to these needs so far. At Lund University we have looked for some time at how developers can use SCM in their process of creating a common product. We have analysed how SCM is used in some Open Source Software projects [Asklund]. It turned out that the traditional concepts of SCM were not used, at least not explicitly, and that what was actually used were techniques for co-ordination of work, simple version control, build management, selection of configurations and handling of workspaces. Currently we are analysing how 12 groups of students (8-10 people in each group) used SCM in projects where they developed an application following the Extreme Programming methodology.
They used CVS as the tool for SCM and, when asked whether it had been useful, most of them said that it would not have been possible to carry out the project without it. Preliminary results, however, show that in most cases CVS was used only for simple synchronisation and integration of individual work.

In general, it is not just developers who collaborate to create a common product and as such can benefit from SCM techniques. In CSCW, people have been looking at how to facilitate collaborative writing, and their results have many things in common with traditional SCM mechanisms for version control. The hypertext community has also dealt with aspects of versioning and especially with problems regarding referential integrity, which is also a concern in SCM. In most software development environments, SCM is an important tool, which means that aspects such as information retrieval and visualisation have to be dealt with. Furthermore, aspects of programming languages also have an impact on the build management part of SCM. SCM can be considered the foundation for any kind of collaborative (or, for that matter, individual) effort and as such it has relations to many other fields. This means that SCM needs input from these fields to typecast its general solutions so they can be tailored for each specific context.

2. Conquering new territory

The above mentioned aspects of collaboration and co-ordination are one important face of SCM. This face has been sadly neglected by researchers and industry, and because of that practitioners are struggling. Another face of SCM, one that has had the interest of researchers and industry for years, is Product Data Management (PDM). It deals with the management and control of data about a product and with recording changes to these data. PDM is a mature discipline with well established techniques and principles and as such has the potential to be applied to other domains to conquer new territory.
The motive for collecting these data is to keep a trace of what has happened to a product, in case we need to investigate the cause of a certain problem. In that respect it is quite similar to the flight recorders used on board aircraft. In the case of a software disaster we just go and look for the black box (which in real life is orange) and pull out all the information necessary to analyse and establish the cause of the disaster. In its simplest case, it is plain version control and just records one parameter: what changed (and implicitly that something was changed). In a full-fledged SCM solution, it would record a whole range of parameters such as: when, why and by whom a change was made, the associated change request and/or bug report, related changes, and more.

Knowledge management is one domain of potential expansion for SCM systems. Rubart et al. [Rubart] point out that knowledge management is co-operative. So are SCM systems, and they can be used to create, disseminate and utilise any kind of knowledge. Most SCM systems also have built-in processes for the creation, dissemination and utilisation of knowledge, and in many cases these processes are
tailorable. The actual implementation of the SCM system might have to be changed to adapt to the needs of new domains like knowledge management, but the techniques and principles are ready and mature. Tochtermann [Tochtermann] states that knowledge management software is primarily used to mediate between people and to provide team work spaces; that is exactly what SCM tools can do, as explained in section 1. Tochtermann claims that as a consequence knowledge management tool developers should know what knowledge is. In their particular domain, SCM tool developers are a striking example of that, as in most cases they use the SCM tool to develop the SCM tool. This means that SCM is actually, as Tochtermann states knowledge management should be, driven by user needs and not technology, and therefore generates the highest benefit for the users with the least effort.

According to Dalcher [Dalcher], modern software development should be seen as a very change-intensive activity. Emphasis should be shifted from the product to the processes used to produce the product, and a life-time perspective should be applied to include also maintenance and further evolution. Even such dynamic scenarios can be supported by modern SCM systems. They provide for the continuity dimension that spans from the early developers to the maintenance people who arrive later. Acting as a common repository for all parties with an interest in the development phases, an SCM system allows a continuous, as opposed to discrete, view of the development effort. Such a shared, common repository can conserve the intellectual capital that Dalcher says will become a primary resource for software development companies.

3. Adopting new techniques and tricks

SCM is an expanding field and, when moving into other domains, it needs to incorporate new ideas from these domains. And even when it is not expanding, SCM needs to consolidate its own practices and thus can use inspiration from other fields.
SCM as a discipline has been around for only about three decades. However, CM without the software has existed for ages, and when it moved into SCM it also had to adapt to the characteristics of the new domain. One of the earliest applications of CM principles was probably when God put together man, discovered that He was not quite satisfied with that and went on to make some changes to create Woman. Using CM He would be able to roll back the product if the new version turned out to be worse. Fortunately He still has not undone His latest creation and reverted to the previous version.

So SCM, and in particular CM, is a mature discipline. Despite that, there are still some problems in SCM that could do with better solutions. Two such problems are: conceptual models for versions of configurations, and selection of configurations.

It is possible to provide the user, whether a software developer or someone else, with a simple conceptual model for versioning of an object. The same is true for the conceptual model for a configuration, but when we start to put versions and configurations together things start to get complex. A conceptual model for the fact that there may exist several versions of a particular configuration is not simple at all
and it gets even worse when we want to model the possible configurations that can be created from the versions of a given set of objects. It is possible to model these cases by the use of and/or-graphs [Tichy], but they become very complex. Another, more serious, problem with the and/or-graphs is that they do not have any abstraction capabilities, so that we always get the full picture. The containment models of Gordon and Whitehead [Gordon], on the other hand, are able to abstract away details that we may consider irrelevant at any time. Furthermore, containment models seem to convey important SCM information in a very intuitive way. So they might be used to give better conceptual models for some fundamental SCM concepts. However, it seems that they too suffer from some of the same limitations that the and/or-graphs have: the inability to model dynamic dependencies between objects and to model the fact that a versioned object might be split into two at some point. This could be caused by the restrictions that are placed on the specialised entity-relationship model and might be solved by relaxing the restrictions to obtain a richer, though still simple, model. It would be interesting future work to incorporate relations between atomic objects and containers to model dynamic dependencies, and to introduce a new splitfrom semantics for relations to model versioning of split objects.

During the co-operative work of a group, new versions of objects are endlessly added to the repository. At points in time, a developer may want to limit his view of the repository and create a context in which he can work or a configuration he can compile. In traditional SCM, this is done by using a generic configuration as a description and performing a selection from the repository based on the description and a number of attribute values. What is obtained is a fully or partially bound configuration. This model is simple and has the advantage that it allows selections to be performed in parallel.
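The traditional selection just described can be illustrated with a minimal sketch. It is not taken from any particular SCM tool: the names Version, select, is_fully_bound, REPOSITORY and GENERIC are hypothetical, and the rule "pick the newest version that carries all requested attribute values" is an assumption made only for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One stored version of an object (all field names are illustrative)."""
    name: str
    number: int
    attributes: tuple = ()  # (key, value) pairs, e.g. (("status", "tested"),)


def select(repository, generic_config, wanted):
    """Bind each component named in a generic configuration to a version.

    Assumed rule: for every component, take the newest version whose
    attributes include all wanted key/value pairs. A component with no
    matching version stays unbound (None).
    """
    bound = {}
    for name in generic_config:
        candidates = [
            v for v in repository
            if v.name == name and set(wanted.items()) <= set(v.attributes)
        ]
        bound[name] = max(candidates, key=lambda v: v.number, default=None)
    return bound


def is_fully_bound(configuration):
    """True when every component has been bound to a concrete version."""
    return all(v is not None for v in configuration.values())


# Hypothetical repository contents and generic configuration.
REPOSITORY = [
    Version("parser", 1, (("status", "tested"),)),
    Version("parser", 2, (("status", "draft"),)),
    Version("lexer", 1, (("status", "tested"),)),
    Version("lexer", 2, (("status", "tested"),)),
    Version("emitter", 1, (("status", "draft"),)),
]
GENERIC = ["parser", "lexer", "emitter"]

if __name__ == "__main__":
    tested = select(REPOSITORY, GENERIC, {"status": "tested"})
    latest = select(REPOSITORY, GENERIC, {})
    print(is_fully_bound(tested), is_fully_bound(latest))
```

In this sketch each component is bound independently of the others, which is what makes parallel selection possible; it also shows that the list of components itself is fixed before the selection starts.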
However, this selection model hardly reflects the needs of modern evolutionary software development methods. When the software evolves, the architecture (the structure) will also change, and generic configurations, being static by nature, are not able to model dynamic structures. To support such development methods, SCM must provide for versioning and selection on structures. Griffiths et al. [Griffiths] provide a solution for that in a hypermedia setting. Each object and relation comes with an attached context that can then be used during the query process to decide the visibility of data. More importantly, they change the selection process such that it is no longer performed statically and globally, but locally, step by step, varying dynamically based on the results from the previous steps. It should be possible to adapt such a process of pattern propagation to the software development domain to obtain a richer and more flexible selection mechanism than current SCM systems offer.

4. Reflections

A metainformatics symposium has proven to be an exciting and fertile breeding ground for interdisciplinary exchange of ideas. In this paper, we have pointed out some domains where it should be possible to use SCM techniques and principles, in the modest hope that people in those domains will find inspiration. Furthermore, we have seen some interesting and promising ideas that it should be possible to adopt and incorporate into SCM to make it stronger.
Hicks [Hicks] describes the A, B and C levels of work and asks the question: where are the B-users to use the C-level open hypermedia systems? They are probably home working on implementing the A-level tools directly! To cast the levels in a different domain: at the A-level you use compilers, and at the B-level you construct compilers using the compiler generators provided by the C-level people. Now the question can be rephrased: why do I hand-build a compiler instead of using a compiler generator? There could be many reasons: I might need the compiler to be optimised for speed and storage, or it might simply be that it is easier and faster for me to handcraft a one-off compiler to convert from one format to another than it is to work out how to get started with the compiler generator. Probably these C-level products will always either have all the bells and whistles and be unusable bloatware, or be slim and to the point, forcing me to implement or tailor all the things that I really do need.

Maybe we would be better off if we considered it a success when we were able to build metainformatical bridges by sharing best practices and reusing processes and principles between domains [Nurnberg], instead of trying to sell our C-level tools. Maybe those tools just serve as something to play around with in order to get hold of the right principles.

5. References

[Asklund]: Ulf Asklund, Lars Bendix: A Study of Configuration Management in Open Source Software, IEE Proceedings - Software, Vol. 149, No. 1, February 2002.
[Babich]: Wayne A. Babich: Software Configuration Management - Coordination for Team Productivity, Addison-Wesley, 1986.
[Bendix]: Lars Bendix, Otto Vinter: Configuration Management from a Developer's Perspective, in Proceedings of the EuroSTAR 2001 Conference, Stockholm, Sweden, November 19-23, 2001.
[Dalcher]: Darren Dalcher: Software Development for Dynamic Systems, in these
[Gordon]: D. Gordon, E. James Whitehead Jr.: Containment Modeling of Content Management Systems, in these
[Griffiths]: Jon Griffiths, David E. Millard, Hugh Davis, Danius T. Michaelides, Mark J. Weal: Reconciling Versioning and Context on Hypermedia Structure Servers, in these
[Hicks]: David Hicks: In Search of a User Base: Where are the B's?, in these
[Nurnberg]: Peter J. Nürnberg: Building metainformatical bridges, in these
[Rubart]: Jessica Rubart, Weigang Wang, Jörg M. Haake: A Meta-Modeling Environment for Cooperative Knowledge Management, in these
[Tochtermann]: Klaus Tochtermann: On the Role of Computer Science in Knowledge Management, in these
[Tichy]: Walter F. Tichy: A Data Model for Programming Support Environments and its Application, in Automated Tools for Information Systems Design, (H.-J. Schneider & A. L. Wasserman, eds.), North-Holland, 1982.