Modeling Coordination as Resource Flow: An Object-Based Approach

John Noll
Computer Engineering Department
Santa Clara University
500 El Camino Real
Santa Clara, CA 95053-0566
jnoll@cse.scu.edu

Bryce Billinger
Avaya Inc.
Denver, CO
bbillinger@avaya.com

Abstract

Workflow management systems provide guidance to individuals performing tasks in an organization. This is typically achieved via a central workflow engine that executes descriptions of organizational processes in order to guide and coordinate activities of individuals in the organization. In this paper, we present a process modeling approach in which processes are modeled as independent process fragments that represent activities performed by a single actor. Each fragment is a specification of the control flow from one activity to the next, that leads to the completion of a task. Coordination among concurrent activities performed by different actors is modeled as resource flow: dependencies among coordinated activities are represented by the resources shared by concurrent activities. This allows processes performed by autonomous process performers distributed across a network to be coordinated without sacrificing individual autonomy.

Keywords: Workflow Modeling, Cooperative Work Support, Coordination, Software Process Modeling

1 Introduction

Workflow is the automation of a business process, in whole or part, during which documents, information or tasks are passed from one participant to another for action, according to a set of procedural rules. [1]

Workflow management systems provide active guidance, by providing prompts, instructions, and supporting documents, to individual actors performing tasks in processes. Initially, this was achieved by delivering documents to actors via email or other electronic means, hence the name workflow; in this case the routing of documents from actor to actor is specified by a simple, sequential process specification.
Contemporary workflow systems can execute more complex processes than simple sequential workflows. These processes may involve iteration over a sequence of tasks, branching, and concurrent activities performed by multiple actors. The process descriptions are thus more complex, requiring specification of coordination and synchronization among concurrent activities, as well as flow of control.

The conventional model of a workflow system is based on a client-server architecture. A central workflow engine executes process descriptions to support the activities of actors that interact with the engine through client interfaces such as web browsers or task-specific tools. As a centralized resource with global visibility of an organization's active processes, the workflow engine can coordinate the activities of each actor in relation to other actors, thus supporting collaboration and cooperation. A database management system is often employed to manage the data produced and modified by the processes, and the data about the processes themselves. The advantages of this approach include synchronization of modifications to documents by concurrent activities, and the ability to perform queries over the entire set of currently active processes.

This model has been employed to great success in workflow systems to support the business processes of companies and organizations. However, this model assumes that all of an organization's processes are managed by the workflow engine (or set of engines), that all documents are accessible to the engine and clients, and that all actors are connected to the engine through a client interface. Thus, the conventional client-server model presents obstacles when documents or actors involved in a process are distributed in autonomous locations. In such cases, a central engine cannot control access to documents: this is the responsibility of the repository that contains the document (or other artifact).
Similarly, the engine cannot support coordination based on the flow of control among the various activities, since this would require autonomous actors to submit to a
global control authority.

For example, open source software development can involve the concurrent activities of a large number of people working on a set of shared artifacts. However, each person is an independent, autonomous actor who performs work voluntarily, according to his or her abilities and resources. While they share a common goal, and may have well defined processes, each actor works independently without direction of a central authority. Nevertheless, these actors could benefit from the kind of guidance and coordination support that workflow management systems provide.

In this paper, we present a process modeling approach that enables processes to be modeled as independent process fragments that represent activities performed by a single actor. Each fragment is a specification of the control flow from one activity to the next, that leads to the completion of a task. Coordination among concurrent activities performed by different actors is modeled as resource flow: dependencies among coordinated activities are represented by the resources shared by concurrent activities.

2 Approach

Our goal is to model both control flow and coordination with one simple, straightforward process modeling language. Our approach is based on the process programming concept [9]: processes are modeled using a process programming language, called PML, that resembles a conventional programming language in many ways. PML models control flow using familiar programming language constructs such as iteration, selection, and branching.

Processes in PML are modeled as sequences of actions. Actions are the atomic activities of a process. Control constructs allow actions to be combined into tasks, which are linear sequences of actions; iterations, which specify a sequence of actions to be repeated; branches, which specify sequences of actions that can be performed concurrently; and selections, which indicate a choice of one of several possible actions. An example is shown in Figure 1.
This is a PML description of the simple software development process depicted in Figure 2.

    process example {
      action Analyze { }
      branch {
        sequence code {
          action Design { }
          action Implement { }
          action Compile { }
          action Debug { }
          action Commit { }
        }
        sequence test {
          action WriteTestPlan { }
          action WriteTests { }
          action RunTests { }
        }
      }
    }

    Figure 1: Development Process in PML

2.1 Coordination and Resource Flow

The key to our approach is the observation that products are produced and consumed by concurrent activities. This is depicted in Figure 3. This figure shows the example process from Figure 2, augmented with important resources that are produced and used by some of the process activities. Note that these resource flows mimic the temporal dependencies among the tasks that produce and consume them. For example, the Analyze activity temporally precedes the Design and Write test plan activities. These relationships are represented by the solid arrows. The analysis product also flows from the Analyze task to both the Design and Write test plan activities. Similarly, the Commit task has a precedence relationship to the Run tests task, which is also reflected in the flow of the code resource.

These resource flows are significant because they can be used to coordinate activities when the explicit temporal relationships are removed from the process model. Suppose, for example, that the actors tasked with performing this process are widely distributed across the world: the Analyst is in Chicago, the Programmer in Hong Kong, and the Tester in India. It would still be possible for the process to be executed by a central workflow engine, with each actor connecting to the engine from a client interface or Web browser. However, they would be subject to disruptions caused by network partitions, server maintenance, and other interruptions in service. Further, if the actors belong to different organizations, there may be issues of autonomy that would prevent them from submitting to a central administration.
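To make the idea concrete, the following minimal Python sketch derives precedence relationships purely from which activities produce and consume each resource. The dictionaries and the function name `resource_flow_edges` are assumptions made for illustration; they are not part of PML or PEOS, and only the two resource flows named in the text (analysis and code) are encoded.

```python
# Hypothetical encoding of the resource flows described in the text.
# Only the flows explicitly named there are listed.
PRODUCES = {
    "Analyze": {"analysis"},
    "Commit": {"code"},
}
CONSUMES = {
    "Design": {"analysis"},
    "WriteTestPlan": {"analysis"},
    "RunTests": {"code"},
}


def resource_flow_edges(produces, consumes):
    """Return sorted (producer, resource, consumer) triples.

    An edge exists whenever one activity produces a resource that
    another activity consumes; no explicit control flow is consulted.
    """
    edges = set()
    for producer, outputs in produces.items():
        for consumer, inputs in consumes.items():
            for resource in outputs & inputs:
                edges.add((producer, resource, consumer))
    return sorted(edges)


EDGES = resource_flow_edges(PRODUCES, CONSUMES)

if __name__ == "__main__":
    for producer, resource, consumer in EDGES:
        print(f"{producer} --{resource}--> {consumer}")
```

The derived edges coincide with the temporal dependencies called out in the text, which is what allows the explicit control-flow links between actors to be dropped.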
If we remove the explicit control flow relationships that are reflected in resource relationships, we can partition the model into process fragments that can be performed by a single actor, independently of other actors, yet in coordination with their activities. Figure 4 depicts this decomposition. The process has been divided
into three fragments, each associated with one of the three actors in the original model (Analyst, Programmer, Test Engineer).

[Figure 2: A Simple Software Development Process. Activities shown: Analyze (Analyst); Design, Implement, Debug, Commit; Write Plan, Write Tests, Run Tests.]

[Figure 3: Resource Flow. The activities of Figure 2, annotated with the resources analysis and code.]

We add constructs to PML to specify pre- and post-conditions on the state of resources (documents, data, and other artifacts) that are produced and consumed by activities in a process. The result is a process modeling language that models both resource flow and control flow.

A generic object model is used to represent the artifacts and deliverables in a process. Each action may have requires and provides clauses. These specify pre- and post-conditions surrounding the performance of an action. The requires clause specifies the resources that are required by the action; the provides clause specifies the state these resources will be in when the action is completed.

The syntax for the object model is of the form object name.attribute. A unique object name specifies an object within the process. The requires and provides clauses take the form clause object name.attribute op value.

    process example {
      action Analyze {
        provides { analysis }
      }
      ...
      action Design {
        requires { analysis }
        provides { design }
      }
    }

    Figure 5: Modeling Resource Flow

Figure 5 shows part of the PML specification of Figure 1 augmented with requires and provides predicates. In this figure, the provides predicate of the Analyze action specifies that an analysis object (in this case, a document) will be produced by the Analyze activity. The requires predicate of the Design action specifies that an analysis document is required by this activity before it can begin; the Design activity produces a design object, as specified by its provides predicate.
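As a rough illustration of how requires predicates might gate an action, the Python sketch below evaluates predicates of the two shapes described above: a bare object name (the object must exist) and `name.attribute op value`. Everything here is an assumption made for the example: the in-memory `resources` dictionary stands in for real repositories, predicates are given as a list of whitespace-separated strings, and only the `==` and `!=` operators are handled. It is not the PEOS implementation.

```python
import operator

# Operators supported by this sketch (an assumption; the paper only
# says predicates have the form "name.attribute op value").
OPS = {"==": operator.eq, "!=": operator.ne}


def holds(predicate, resources):
    """Evaluate one predicate against a dict of {object name: attributes}."""
    parts = predicate.split()
    if len(parts) == 1:
        # Bare object name: satisfied when the object exists.
        return parts[0] in resources
    target, op, value = parts
    name, attribute = target.split(".")
    if name not in resources:
        return False
    return OPS[op](resources[name].get(attribute), value)


def is_ready(action, resources):
    """An action may begin only when all of its requires predicates hold."""
    return all(holds(p, resources) for p in action["requires"])


# Two actions written in the hypothetical dictionary form used here.
DESIGN = {"name": "Design", "requires": ["analysis"]}
DEBUG = {"name": "Debug", "requires": ["code.status == compiled"]}

if __name__ == "__main__":
    print(is_ready(DESIGN, {}))
    print(is_ready(DESIGN, {"analysis": {}}))
    print(is_ready(DEBUG, {"code": {"status": "compiled"}}))
```

Keeping each predicate as an independent check mirrors the idea that an action depends only on the state of the resources it names, not on which actor or action produced them.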
2.2 Process Enactment Operating System

The Process Enactment Operating System (PEOS) is the engine for executing PML process descriptions. PEOS comprises several components, as shown in Figure 6.

The virtual machine is the core of the system, responsible for executing compiled PML process descriptions. The virtual machine is a stack-based interpreter that runs a compiled PML program and computes which actions are available at any given time.

The kernel provides an interface to the outside environment for the virtual machine. It acts as an event dispatcher for events submitted by actors and for events resulting from changes to objects required by processes under execution. It does this by executing queries to the various
storage managers and repositories where resources are stored. These queries are simply compiled forms of the provides and requires predicates in a PML specification. Thus, if an action requires an object, and the artifact the object is bound to does not exist, the virtual machine will block execution of the process containing the action until the object (artifact) exists.

[Figure 4: Process Fragments. The activities of Figure 3 partitioned by actor, with the resources analysis and code linking the fragments.]

[Figure 6: PEOS Architecture. Components shown: User Interface, Tools, Resources, Resource I/F (CVS, WWW, Email, File System), and the Process Core, comprising the Kernel, Virtual Machine, Active Process Repository, and Process Model Repository. Labeled flows: process events, resource create/update, resource events, process state, process models.]

Each actor deploys a copy of PEOS to support his own processes. These copies monitor the environment for changes in resource state, in order to coordinate the activities of actors with other actors. The resulting enactment environment is depicted in Figure 7. In this figure, the Analyst runs an instance of the Analyze process to create a new analysis document, which he posts to a web site for distribution to other interested parties. This creation event (step 2) is detected by the Programmer's PEOS engine (step 3); this triggers the creation of an instance of the Design and Implement process (step 4) on the Programmer's behalf. After the Programmer completes the Commit action (step 5), a modification event (step 6) is detected by the Test Engineer's PEOS engine, which creates an instance of the Test process (step 7).

3 Related Work

Process modeling approaches can be divided into four main categories, based on the control-flow model used.

3.1 Procedural Control Flow

Procedural modeling approaches resemble high-level programming languages; some are even based on programming languages.
For example, APPL/A [11] extends the Ada programming language with constructs specific to process and coordination modeling. One of the main differences among procedural modeling languages is that they depend on specific types of repositories. JIL is closely associated with the Pleiades object management system [12]. In MVP-L, a relational database is used to store product (artifact) objects [4]. In APPL/A, the repository used is Triton, a persistent object system built on top of the EXODUS storage manager [11]. The resource model for all of these examples is dependent on the specific object manager. Thus, the resource model is tightly coupled to its underlying implementation. This makes it difficult to coordinate users across boundaries, such as those between organizations or networks.

A rule-based control flow model comes from rule-based programming languages. Rule-based models use logical statements to determine which actions can be executed at any given time. The Merlin project is based on PROLOG and uses these rules to model processes [7]. Marvel also uses rules to model processes, with an object-oriented database in the back end to keep the data [3]. A rule-based modeling language is a very powerful way to model processes; however, because the sequence of activities is not explicitly represented, rule-based models can be difficult to understand.

Graph-based process models represent the relationships among activities and resources with graphs. The Petri net is one of the most widely used graph-based modeling formalisms. The projects SPADE/SLANG and
PROMO are based on Petri nets [2, 6]. The benefit of graph-based models is that they are a close match to the conceptual notion of a process as a sequence of activities, and are thus easy to create and understand. The graph-based control flow model is the closest model to the procedural model. While graph-based models may be easy to read and understand, their flaw is that they are not as powerful as other process models: they do not have all of the capabilities of other process modeling languages.

[Figure 7: Distributed Workflow Execution. Participants: Analyst, Programmer, Test Engineer, Web Server, SCCS. Steps: 1: run Analysis_proc(); 2: create(analysis); 3: notify(analysis); 4: run Design_proc(analysis); 5: commit(code); 6: notify(code); 7: run Test_proc(code).]

Agent-based process modeling is a radically different approach to modeling process flow. An agent is a program that is designed to make an intelligent decision about what to do next based on its current state. Agents can be distributed and can act independently of other agents. In agent-based workflow, agents are given goals that conform to the objectives of the overall process; they then develop a work plan to achieve those goals. For example, the DartFlow project uses the World Wide Web and transportable agents to model process flow [5]. The Agent-based Process Management System (APMS) and the METEOR project use agents as a framework for workflow management [8, 10].

The work with agents has tended toward addressing control flow and has separated the data and resources from the model. The separation is due to the fact that agent-based models are focused on the process state and not the process model. For agent-based processes, the process model can be an ever-changing model. The agent should be able to use the process state and the current environment, such as a process model, to intelligently determine the next state.
4 Conclusion

We have described a process modeling approach that combines the advantages of procedural process descriptions with the ability to decouple concurrent activities so they can be distributed among independent, autonomous actors. This approach has several benefits. First, no direct communication is required between coordinated instances of execution engines; coordination is achieved by indirect communication through the resources shared among the concurrent activities. This means that an actor can even coordinate with another actor that is not following an explicit process, as long as the resources produced or modified by that actor are accessible. Also, products and other artifacts involved in the process are accessed from their native storage managers, making it possible for PEOS to co-exist with legacy systems, and to enact processes that cross organizational as well as geographic boundaries. Finally, because each actor runs a separate copy of the execution engine, PEOS maintains complete autonomy of participating individuals.
References

[1] Rob Allen. Workflow: An Introduction.
[2] S. Bandinelli, A. Fugetta, C. Ghezzi, and L. Lavazza. SPADE: An Environment for Software Process Analysis, Design, and Enactment. Research Studies Press Limited, 1994.
[3] Israel Z. Ben-Shaul, Gail E. Kaiser, and G. Heineman. An architecture for multi-user software development environments. Computing Systems, the Journal of the USENIX Association, 6(2):65-103, 1993.
[4] C. Brockers, C. Lott, H. D. Rombach, and M. Verlage. MVP-L language report version 2. Technical report, University of Kaiserslautern, February 1995.
[5] Ting Cai, Peter Gloor, and Saurab Nog. DartFlow: A workflow management system on the web using transportable agents. Technical Report PCS-TR96-283, Department of Computer Science, Dartmouth College, 1996.
[6] John C. Doppke. Software process modeling and execution within virtual environments. ACM Transactions on Software Engineering and Methodology, 7(1):1-40, January 1998.
[7] G. Junkermann, B. Peuschel, W. Schafer, and S. Wolf. MERLIN: Supporting Cooperation in Software Development through a Knowledge-based Environment. John Wiley, 1994.
[8] P. D. O'Brien and M. E. Wiegand. Agent based process management: applying intelligent agents to workflow. Knowledge Engineering Review, 13(2), September 1998.
[9] Leon Osterweil. Software processes are software too. In Proceedings of the 9th International Conference on Software Engineering, Monterey, CA, USA, March 1987.
[10] Amit Sheth. METEOR: Brief overview. Technical report, Large Scale Distributed Information Systems Lab, University of Georgia, and Infocosm, Inc.
[11] Stanley M. Sutton, Dennis M. Heimbigner, and Leon J. Osterweil. Language constructs for managing change in process-centered environments. pages 206-217, Irvine, CA, December 1990.
[12] Stanley M. Sutton Jr., Barbara Staudt Lerner, and Leon J. Osterweil. Experience using the JIL process programming language to specify design processes.
Technical report, Computer Science Department, University of Massachusetts, Amherst, September 1997.

A Example Process Specification

    process example {
      action Analyze {
        provides { analysis }
      }
      branch {
        sequence code {
          action Design {
            requires { analysis }
            provides { design }
          }
          action Implement {
            requires { design }
            provides { code }
          }
          action Compile {
            requires { code }
            provides { code.status == compiled }
          }
          action Debug {
            requires { code.status == compiled }
            provides { code.status == complete }
          }
          action Commit {
            requires { code.status == complete }
            provides { code.status == committed }
          }
        }
        sequence test {
          action WriteTestPlan {
            requires { analysis }
            provides { test_plan }
          }
          action WriteTests {
            requires { test_plan }
            provides { test_suite }
          }
          action RunTests {
            requires { test_suite && code }
            provides { code.status == tested }
          }
        }
      }
    }
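The specification above can be used to sketch, in Python, how independent fragments might coordinate through shared resources alone. This is a toy simulation under stated assumptions, not the PEOS implementation: the `FRAGMENTS` table is a hand transcription of the specification split by actor, a single dictionary stands in for the distributed repositories, and the scheduling policy (engines are polled in a fixed order and each advances until it blocks) is invented for the example.

```python
# Each step is (action, requires, provides). A requirement or effect is
# (object, attribute, value); attribute None means "the object exists".
FRAGMENTS = {
    "Analyst": [
        ("Analyze", [], ("analysis", None, None)),
    ],
    "Programmer": [
        ("Design", [("analysis", None, None)], ("design", None, None)),
        ("Implement", [("design", None, None)], ("code", None, None)),
        ("Compile", [("code", None, None)], ("code", "status", "compiled")),
        ("Debug", [("code", "status", "compiled")],
         ("code", "status", "complete")),
        ("Commit", [("code", "status", "complete")],
         ("code", "status", "committed")),
    ],
    "Test Engineer": [
        ("WriteTestPlan", [("analysis", None, None)],
         ("test_plan", None, None)),
        ("WriteTests", [("test_plan", None, None)],
         ("test_suite", None, None)),
        ("RunTests", [("test_suite", None, None), ("code", None, None)],
         ("code", "status", "tested")),
    ],
}


def satisfied(requirement, store):
    """Check one requirement against the shared resource store."""
    name, attribute, value = requirement
    if name not in store:
        return False
    return attribute is None or store[name].get(attribute) == value


def apply_effect(effect, store):
    """Record the resource state that a completed action provides."""
    name, attribute, value = effect
    obj = store.setdefault(name, {})
    if attribute is not None:
        obj[attribute] = value


def enact(fragments, poll_order):
    """Poll each actor's engine in turn until no engine can advance."""
    store = {}
    position = {actor: 0 for actor in fragments}
    trace = []
    progressed = True
    while progressed:
        progressed = False
        for actor in poll_order:
            steps = fragments[actor]
            while position[actor] < len(steps):
                action, requires, provides = steps[position[actor]]
                if not all(satisfied(r, store) for r in requires):
                    break  # blocked until another actor changes the store
                apply_effect(provides, store)
                trace.append((actor, action))
                position[actor] += 1
                progressed = True
    return trace, store


# Poll in an unfavorable order so the blocking behavior is exercised.
TRACE, STORE = enact(FRAGMENTS, ["Test Engineer", "Programmer", "Analyst"])

if __name__ == "__main__":
    for actor, action in TRACE:
        print(f"{actor}: {action}")
```

No engine in this sketch refers to another engine or to another actor's fragment; each one reads and writes only the shared store. A real deployment would interleave actions differently, so the ordering produced here is a property of the assumed polling policy rather than of the specification.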