Network Load Imposed by Software Agents in Distributed Plant Automation

Yoseba K. Penya
Vienna University of Technology, Institute of Computer Technology
Gusshausstrasse 27/E384, A-1040 Wien, Austria
yoseba@ict.tuwien.ac.at

Thilo Sauter
Vienna University of Technology, Institute of Computer Technology
Gusshausstrasse 27/E384, A-1040 Wien, Austria
sauter@ict.tuwien.ac.at

Abstract

The current state of the art in production planning is still dominated by largely centralized solutions. More distributed approaches require a substantial increase in communication, and network load becomes an important factor. This paper presents a distributed plant automation approach based on mobile software agents and investigates the network load generated by the agents in terms of communication and migration. Particular emphasis is given to the impact of the planning horizon and of prioritization, which leads to extensive rescheduling processes. It appears that scheduling approaches that plan several steps in advance incur substantially more communication overhead, but can be made less susceptible to the snowball effects associated with rescheduling.

I. INTRODUCTION

As far as production planning is concerned, the current state of the art in plant automation is still dominated by centralized solutions. Nevertheless, distributed approaches are the focus of much research work. The most obvious benefit of distributing functions of, e.g., the MES (manufacturing execution system) or the control level is a tremendous increase in flexibility, the ultimate goal being an autonomous self-organization of the entire production process. Another advantage, less often addressed, is scalability. Distributed approaches no longer need powerful central computers to execute the planning tasks, or at least they alleviate the performance requirements. This is beneficial especially in large plants.
On the other hand, the reduction of local computing power in distributed systems comes at the price of an increased demand for communication. As the elements of such a system gain flexibility, autonomy, and intelligence, they need to exchange more information in order to synchronize and coordinate themselves successfully. The network constitutes a shared resource within the plant automation system. This is particularly important for distributed solutions (such as the one addressed by the EU-funded project PABADIS, Plant Automation Based on Distributed Systems [1,2]) that make use of mobile software agents to enhance the control of the individual manufacturing process. In spite of their benefits, the migration required for every step of the production process increases the network traffic. Consequently, it is worthwhile to investigate the impact that a distributed approach has on the network load and where the limits of scalability are.

The remainder of the paper is organized as follows: Section II introduces the main features of the multi-agent system used in PABADIS, Section III describes the scheduling approach, Section IV explains the model used in the simulation, Section V presents and discusses the results, and Section VI draws conclusions and outlines future work.

II. PLANT AUTOMATION AS AN INFORMATION SYSTEM

The classical representation of plant automation is that of a three-layered pyramid, where the machinery is on the lowest (control) level and two layers of more complex plant-wide control functions are above it. This approach starts from the assumption that every production cell has a minimum degree of intelligence, so that it can supply information that the upper layer uses for scheduling, dispatching, and other management tasks. In other words, the plant is conceived as a network of information points.
Traditionally, the information stream flows to a central place, where it is processed by an ERP (enterprise resource planning) system that controls the plant as a whole, issues production orders, and logs the results. In contrast to this approach of concentrating all the responsibility, intelligence, and production management in one point, recent trends distribute these three tasks at least partly among several units that have smaller processing power but communicate and cooperate to enhance the overall plant automation functionality.

In PABADIS, a plant is composed of abstract entities called Cooperative Manufacturing Units (CMUs) that comprise both an intelligence module and a production module, which can involve one or more machines. All the CMUs are connected to a common network, so they can cooperate in the manufacturing process. There are three types of CMUs (in [2], they are called BIIOs, Basic Intelligent Independent Objects, which is essentially the same):

- Logical CMUs for computational services such as complex scheduling algorithms, database search, etc.
- Manufacturing CMUs for the physical processing of the products.
- SCADA CMUs providing an interface to the SCADA system that controls the plant.

Basically, PABADIS can be seen as an enhancement of an MES with a clear focus on a product-oriented view (this is why the approach is mainly targeted at single-piece production), rather than the more process-oriented approach of a classical MES. According to [3], the functions covered by the MES are the following:

Resource allocation, scheduling, and dispatching: This most complicated function of the MES manages the resources of the plant and performs ERP order execution. On the one hand, the goal of this function is to optimize the usage of resources, such as machines, tools, labor skills, or materials. On the other hand, the product creation must be optimized as well.
In PABADIS, this functionality is distributed across the community of software agents, which act individually with respect to their own tasks, such as product creation or machine optimization.
Document control, data collection, product tracking and genealogy, process management, and performance analysis: This function provides information to the ERP system regarding the current situation of the production and product processing.

Maintenance management: This function tracks and directs the activities to maintain the equipment and tools, in order to ensure their availability for manufacturing and to ensure scheduling for periodic or preventive maintenance.

III. SCHEDULING IN A MULTI-AGENT SYSTEM

As mentioned before, the CMUs in PABADIS are composed of a manufacturing module and an intelligence module. The latter is in fact a stationary agent called Residential Agent (RA) that represents the CMU in the network. It provides the network with an interface to the physical resources and functionalities the CMU offers. The interface is used by the Product Agents (PAs), mobile software agents that guide the product through its manufacturing process. In the same way that the Residential Agents know everything about the CMU they represent, a Product Agent possesses all the data necessary to optimize the development of the product it is in charge of. Therefore, when a PA arrives at a CMU, it knows exactly which parameters must be indicated to the RA in order to get its workpiece processed.

Product Agents are created in an entity called Agency, which is the interface of the PABADIS agent community to an ERP system. It processes the demand and transforms it into documents, one per work order. The Agency parses the documents and creates, for each of them, a Product Agent that will be responsible for the creation of the corresponding workpiece. The Agency also collects the reports of the Product Agents.

Finally, there is another key element within PABADIS: the Look-up Service (LUS). It acts as a white-pages mechanism for the Product Agents, so they know where to migrate when they need a specific functionality. Every CMU notifies the LUS when it joins the network and provides it with data about the functionalities it offers.
The LUS periodically renews its data, so that when a PA requests some information, it is up to date. Fig. 1 illustrates the main components of the PABADIS system.

A. Scheduling

Much has been said, written, and discussed about constrained scheduling and the benefits of using software agents in planning processes, both in general [4,5,6] and for specific use cases that implement different solutions to the problem [7,8]. In PABADIS, the scheduling is distributed (physically and operatively); the problem is divided into small pieces, since it is carried out in local operations at CMU level between Product Agents and Residential Agents. PAs represent the resource consumers, whereas RAs represent the resource owners. Consequently, they have to negotiate with each other to reach an agreement. Specifically, the scheduling is performed in two ways:

- Horizontal fashion: Each RA collects the incoming requests for processing and generates a schedule of the tasks from different PAs that need to be processed on the respective CMU.
- Vertical fashion: Each PA generates a list of all possible CMUs where its tasks could be carried out.

The combination of both tables forms a matrix representing the whole schedule.

B. Depth of Scheduling and Priorities

In its simplest form, the scheduling process is a kind of first-come, first-served model, which is easy to solve. Nevertheless, this simple scheme does not fit the requirements imposed by current plant automation systems. There must be a mechanism to prioritize the manufacturing of special products to account for urgent demands. Therefore, PABADIS supports priority-based scheduling. When a PA requests a time slot that has already been reserved by a lower-priority PA, the RA arbitrates the conflict by assigning the time slot to the higher-priority PA. Since the lower-priority PA must reallocate its resources, the RA notifies it of the new situation so that the PA can start a rescheduling process.
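The arbitration rule just described can be illustrated with a minimal sketch. It is not the PABADIS implementation: the class `SlotTable`, its method `request`, the agent names, and the convention that a larger integer means a higher priority are all assumptions made for illustration. The sketch also assumes that a PA with equal priority cannot displace the current holder, and it leaves out the timing restriction on rescheduling.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class SlotTable:
    """Hypothetical reservation table kept by one Residential Agent."""

    # time slot -> (product agent id, priority); larger number = more urgent
    reservations: Dict[int, Tuple[str, int]] = field(default_factory=dict)

    def request(self, slot: int, pa_id: str, priority: int) -> Tuple[bool, Optional[str]]:
        """Try to reserve `slot`; return (granted, id of displaced PA or None)."""
        holder = self.reservations.get(slot)
        if holder is None:
            # Free slot: first come, first served.
            self.reservations[slot] = (pa_id, priority)
            return True, None
        holder_id, holder_priority = holder
        if priority > holder_priority:
            # The higher-priority PA wins; the displaced PA has to be
            # notified so that it can start a rescheduling process.
            self.reservations[slot] = (pa_id, priority)
            return True, holder_id
        # A holder with equal or higher priority keeps the slot.
        return False, None


table = SlotTable()
print(table.request(3, "PA-1", priority=0))  # free slot
print(table.request(3, "PA-2", priority=0))  # same priority: refused
print(table.request(3, "PA-9", priority=1))  # higher priority: PA-1 displaced
```

In this sketch, the second element of the returned pair tells the caller which PA must be notified to reschedule.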
If the displaced PA still has a higher priority than other PAs, this may lead to a rescheduling cascade effect [9] propagating through the whole agent community. The solution that PABADIS promotes to reduce this effect is the introduction of a depth of scheduling parameter. This variable gives the number of tasks a Product Agent plans before starting the manufacturing process. Depending on the value of this parameter, the scheduling can be classified into three categories:

- Advance scheduling: The PA allocates the resources for all tasks in the work order. Thus, the PA analyses the whole work order and chooses the best possible way of execution. The depth of scheduling parameter is equal to the number of tasks the work order comprises.
- Step-by-step scheduling: In this case, a PA allocates a resource for the next task only when the current one is completed. In fact, real scheduling is not necessary for the CMU, which makes the optimization of resource usage almost impossible. The depth of scheduling parameter is equal to 1.
- Hybrid approach: In this case, a PA analyses a work order several steps in advance (the number specified by the depth of scheduling parameter) and allocates resources for this number of tasks, but not for the whole work order.

Fig. 1. Components of the PABADIS system.
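The effect of the depth of scheduling parameter on the allocation pattern can be sketched as follows. The function names `classify` and `allocation_rounds` are hypothetical, and the sketch assumes that a PA always reserves as many tasks as the depth allows in each round and starts the next round only when these are done.

```python
from typing import List


def classify(depth: int, num_tasks: int) -> str:
    """Name the scheduling category for a given depth of scheduling."""
    if not 1 <= depth <= num_tasks:
        raise ValueError("depth must be between 1 and the number of tasks")
    if depth == 1:
        # For a single-task work order this coincides with advance scheduling.
        return "step-by-step"
    if depth == num_tasks:
        return "advance"
    return "hybrid"


def allocation_rounds(depth: int, num_tasks: int) -> List[int]:
    """Number of tasks reserved in each successive allocation round."""
    classify(depth, num_tasks)  # reuse the range check
    rounds = []
    remaining = num_tasks
    while remaining > 0:
        batch = min(depth, remaining)
        rounds.append(batch)
        remaining -= batch
    return rounds


for depth in (1, 5, 10):
    print(depth, classify(depth, 10), allocation_rounds(depth, 10))
```

For a work order of ten tasks, a depth of 1 gives ten rounds of one task each, a depth of 5 gives two rounds of five, and a depth of 10 gives a single round covering the whole work order.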
C. Life Cycle of a Product Agent

Fig. 2 illustrates the program flow of a Product Agent. After parsing the document with the work order, it contacts the Look-up Service to find out which CMUs provide the functionalities needed for the execution of the work order. The LUS returns a list of matching CMUs and also adds a rough estimate of the time required by each CMU to finish the task. The PA uses this information to outline a graph in which all the possible paths for the accomplishment of the work order are represented and can be ordered with respect to their overall processing time. Of course, other criteria for the decision might be used as well: workload of the CMU, cost, status of the stocks, etc.

Subsequently, the PA selects the best path and starts allocating a depth of scheduling number of tasks on the CMUs that compose this path. This process includes the negotiation with the RAs about which time slot is available and which PA with lower priority can be forced to give way. Afterwards, the PA starts the migration and the physical manufacturing of its associated workpiece. If the depth of scheduling is less than the total number of tasks needed, it starts a new allocation procedure, and so on, until the workpiece is finished.

Fig. 2. Program flow of a Product Agent (flowchart steps: parse work order; ask LUS for the list of needed CMUs; allocate depth of scheduling tasks; move to next CMU; perform task; repeat until all tasks are finished; move to Agency and report; separate branches handle allocation errors and migration errors).

IV. MODELING OF THE SYSTEM

The objective of the simulation is to evaluate the influence that the behavior of the agents has on the network load. For this purpose, a model representing a worst-case PABADIS environment with the following characteristics is investigated:

- The objective is to measure the migrations of the agents and the messages exchanged, as well as the time when they are performed or sent.
- All the parameters (number of agents, number of tasks that compose a work order, etc.) remain fixed, whereas the depth of scheduling and the number of Product Agents with higher priority are the variables to be tuned and observed.
- Only one kind of product is manufactured, hence the work orders of all Product Agents are identical. Moreover, all Residential Agents represent similar CMUs, therefore each work order is composed of a fixed number of identical tasks.
- Migration time is assumed to be negligible. Hence, there is no need to include time slots for migrations in the scheduling. Furthermore, the migration is performed before the task but in the same time slot.
- Although all CMUs are alike, the PAs cannot reserve two consecutive time slots with the same CMU. They must perform a task and then move to a different machine.
- Each PA starts its life cycle at the Agency and must return there upon completion of the work order to report on the operations that were carried out. Thus, there is always an extra movement (not necessary for the manufacturing of the product) that must be made by the Product Agent.

The simulation itself is not based on a real agent platform (like the Grasshopper platform used in the PABADIS project) and several machines, but, for the sake of simplicity, on a multi-threaded Java program running on a single PC. The flow of the program is as follows: a daemon thread starts four Residential Agent threads and afterwards ten Product Agent threads, consecutively and with a small pause in between so that each has enough time to finish the allocation of resources. The agents allocate their required resources from the first time step on, so the first time interval is devoted to initial operations. Furthermore, all the high-priority PAs are launched when the ordinary PAs have finished their respective allocation procedures. Since by that time the previous PAs have already reserved time slots in the CMUs, the highest possible number of rescheduling processes will happen (one per task and high-priority PA). Table I shows the set-up details of the individual experiments.
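As an illustration of this worst-case set-up, the following sketch enumerates the nine experiments of Table I and the number of rescheduling processes to be expected in each, using the rule of one rescheduling per task and high-priority PA. The dictionary layout and the function name `worst_case_reschedulings` are assumptions made for illustration.

```python
from itertools import product

NUM_TASKS = 10  # tasks per work order in the simulation

# Series letter -> depth of scheduling, as used in the experiment identifiers.
DEPTHS = {"A": 1, "B": 5, "C": 10}
HIGH_PRIORITY_COUNTS = (0, 2, 4)


def worst_case_reschedulings(high_priority_pas: int, num_tasks: int = NUM_TASKS) -> int:
    """One rescheduling per task and high-priority PA (worst case)."""
    return high_priority_pas * num_tasks


experiments = {
    f"{series}{hp}": {
        "depth": depth,
        "high_priority": hp,
        "reschedulings": worst_case_reschedulings(hp),
    }
    for (series, depth), hp in product(DEPTHS.items(), HIGH_PRIORITY_COUNTS)
}

for name, setup in experiments.items():
    print(name, setup)
```

Under this rule, the experiments with two and four high-priority PAs would see 20 and 40 rescheduling processes respectively, independent of the depth of scheduling.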
The work order length is set to ten tasks, each of which requires one time step for execution. The daemon thread also acts as the clock of the system, so every time it ticks, all the PAs and RAs are notified. After the simulation starts, the Product Agents contact the LUS and then begin to allocate the number of tasks defined by the depth of scheduling parameter. According to the program flow shown in Fig. 2, they then wait until they are able to move and perform an operation on a CMU. If the task was the last one of the work order, they quit and return to the Agency. If the task was the last one allocated but more remain to be allocated (that is, the depth of scheduling is less than the total number of tasks), they restart the allocation process.

TABLE I
EXPERIMENTS PERFORMED IN THE SIMULATION

Identifier   Depth of scheduling   High-priority PAs
A0           1 (step by step)      0
A2           1 (step by step)      2
A4           1 (step by step)      4
B0           5 (50% of tasks)      0
B2           5 (50% of tasks)      2
B4           5 (50% of tasks)      4
C0           10 (full advance)     0
C2           10 (full advance)     2
C4           10 (full advance)     4

To account for the network traffic, the daemon thread registers all messages sent and migrations performed upon notification by the entity responsible. Effectively, only the numbers are counted. The final value of the network traffic is then estimated. Taking into account that the maximum length of an Ethernet packet is 1500 bytes, a segmentation of the agent is indispensable. Furthermore, the header information for Ethernet (11 bytes), IP (20 bytes), and TCP (20 bytes) must also be included. The agent size assumed here is 3,000 bytes for the code and 1,200 bytes for the data of the work order. This makes three packets and in total 4123 bytes that need to be moved each time the agent migrates from one CMU to another. The messages exchanged between the agents are very short; they are assumed to have just 10 bytes each, which is enough to convey information about status, time slot availability, or schedule details. However, the header overhead is significant, and so a total of 61 bytes must be sent over the network.

Concerning the rescheduling processes that can be initiated by Product Agents with a higher priority, an additional restriction was imposed on the timing. It is essential to avoid rescheduling when the affected agent is migrating or is about to migrate. Therefore, the time slot that is going to be reassigned must be at least one time step after the request from the high-priority agent is received.

V. RESULTS

Table II shows the main results of the experiments. At first sight, it can be seen that migrations are not affected by variations in the depth of scheduling or in the number of PAs with higher priority.
They are always 11 per PA, that is, one for each task and one more for the return to the Agency (the high-priority agents are not counted to ensure comparability). Fig. 3 shows the average of the experiments (A for A0, A2 and A4; B for B0, B2 and B4, etc.). It illustrates the behavior of the system over time.

The number of rescheduling processes depends only on the number of high-priority PAs and the number of tasks to be performed. Thus, with 8 high-priority PAs and 10 tasks, 80 rescheduling processes happen. There is no snowball effect because there are only two different priorities, and the affected PAs are not able to cause more rescheduling processes (the other PAs have at least the same priority).

Another interesting effect regarding the rescheduling is that, if the PAs do some advance scheduling, a higher number of high-priority agents results in a decrease in the total number of messages needed in the experiment. The reason is the scheduling strategy of the agents. The PA aims to allocate all the tasks consecutively. If there is a gap in its schedule, this means that other PAs reserved the slots in between. Thus, when rescheduling, the PA assumes that earlier time slots are already reserved and therefore does not try to allocate them, which saves messages. For instance, if a PA has allocated the resources t = 10, t = 11, and t = 13, and it receives a notification of rescheduling at t = 10, the PA will start rescheduling from t = 13 on. It supposes that if it previously did not succeed in allocating resources for t = 12, it will not succeed in the future. This effect is especially relevant with advance scheduling, since the saving is bigger than in step-by-step or hybrid scheduling (see Table II).

Fig. 3 clearly shows the network load for each scheduling modality. Advance scheduling (type C) requires most of the network traffic at the beginning, when the PAs send all the messages. Afterwards, they migrate to the CMUs without any further exchange of information.
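As a side check on the totals in Table II, the sketch below tests them against a simple linear traffic model. The model is an assumption: each message is taken to cost 61 bytes as stated in Section IV, and each migration a fixed number of bytes that is inferred here from row A0 rather than taken from the text. The names `TABLE_II`, `infer_migration_bytes`, and `predicted_total` are hypothetical.

```python
# Payload plus Ethernet, IP, and TCP headers, with the sizes given in Section IV.
MESSAGE_BYTES = 10 + 11 + 20 + 20

# identifier -> (messages, migrations, bytes transmitted), as reported in Table II
TABLE_II = {
    "A0": (1840, 110, 615270),
    "A2": (1834, 110, 614904),
    "A4": (1982, 110, 623932),
    "B0": (4772, 110, 794122),
    "B2": (3699, 110, 728699),
    "B4": (2544, 110, 658214),
    "C0": (6981, 110, 928871),
    "C2": (5567, 110, 842617),
    "C4": (4414, 110, 772284),
}


def infer_migration_bytes(row):
    """Bytes per migration implied by one table row under the linear model."""
    messages, migrations, total = row
    remainder = total - messages * MESSAGE_BYTES
    if remainder % migrations:
        raise ValueError("row is not consistent with the linear model")
    return remainder // migrations


# Assumption: the per-migration cost is the same in every experiment.
migration_bytes = infer_migration_bytes(TABLE_II["A0"])


def predicted_total(messages, migrations):
    return messages * MESSAGE_BYTES + migrations * migration_bytes


# Rows whose reported total differs from the model, with the difference in bytes.
deviations = {
    name: total - predicted_total(messages, migrations)
    for name, (messages, migrations, total) in TABLE_II.items()
    if total != predicted_total(messages, migrations)
}

print(migration_bytes)  # 4573
print(deviations)       # {'B2': 30}
```

Under these assumptions, eight of the nine rows are reproduced exactly with 4573 bytes per migration, and row B2 deviates by 30 bytes. The inferred per-migration size is not the 4123 bytes quoted in Section IV, so both points would need to be checked against the original simulation data.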
The hybrid approach (type B) presents a diagram with pronounced peaks, corresponding to the moments when the allocation is performed. Since the depth of scheduling is 5 and the PAs start allocating from t = 1, the time slots with higher network traffic are 6, 11, and then 16, 21, etc., although the later peaks are smaller since there are fewer agents. Finally, the type A experiments (step-by-step scheduling) do not show significant peaks (only when, in the same time slot, some PAs return to the Agency and others migrate to perform a task) and distribute the network load uniformly.

Fig. 4 shows the average of the results with the number of high-priority PAs as parameter. It confirms what was observed for Fig. 3: experiments with a lower number of high-priority PAs need to exchange a larger amount of data, due to the previously explained saving in the allocation procedure.

TABLE II
RESULTS OF THE MEASUREMENTS

Identifier   Finishing time   Messages   Migrations   Bytes transmitted
A0           26               1840       110          615270
A2           28               1834       110          614904
A4           27               1982       110          623932
B0           27               4772       110          794122
B2           28               3699       110          728699
B4           28               2544       110          658214
C0           30               6981       110          928871
C2           30               5567       110          842617
C4           30               4414       110          772284

Fig. 3. Average behavior of the experiments regarding the depth of scheduling (series A, B, and C; x-axis: time elapsed in seconds; y-axis: data transmitted in bytes).

Fig. 4. Average behavior of the experiments regarding the number of high-priority Product Agents (series for 0, 2, and 4 high-priority PAs; x-axis: time elapsed in seconds; y-axis: data transmitted in bytes).

VI. CONCLUSION AND FUTURE WORK

The model presented in this paper illustrates the impact that different scheduling modalities have on the network load of a plant automation system based on mobile software agents. The simulation demonstrated that snowball effects due to rescheduling processes and extra migrations could be avoided by using only two different priority levels, and by ensuring that an agent that is going to be rescheduled is not about to arrive at the targeted CMU. Evidently, the simulation model used was rather coarse. In the course of the project, however, it will be refined to account for, e.g., non-uniform task execution times per CMU, which will open new possibilities for optimization.

The simulations seem to suggest that full in-advance scheduling creates an extremely high network load at the beginning of the process and is therefore inferior to step-by-step scheduling or to an approach with a low depth of scheduling, where the load is more balanced. From the viewpoint of network load, this is certainly true. However, the simulation cannot judge the benefits of a larger scheduling depth, such as better optimization results, which might outweigh the temporarily higher network traffic. Still, future investigations have to explore the impact that the number of agents has on the load. Particularly during the negotiation phase, it is anticipated that the load will increase dramatically with the number of agents. The available network bandwidth may therefore easily put an upper limit on the practically achievable scheduling depth. Of course, it has to be noted that it is highly unlikely that all PAs start at the same time. In practice, they can be released with short delays in between. The only critical situation will then be a rescheduling process affecting a large number of agents.

REFERENCES

[1] PABADIS, IST-1999-60016, www.pabadis.org
[2] T. Sauter and P. Palensky, "Network Technology for Distributed Plant Automation", in Proceedings of the IEEE International Conference on Intelligent Engineering Systems (INES) 2001, pp. 407-412.
[3] MES Association, www.mesa.org
[4] C. Le Pape, "Classification of scheduling problems and selection of corresponding constraint-based techniques", in Proceedings of the IEE Colloquium on Advanced Software Technologies for Scheduling 1993, pp. 1/1-1/3.
[5] P. Prosser, "The future of scheduling-DAI?", in Proceedings of the IEE Colloquium on Advanced Software Technologies for Scheduling 1993, pp. 8/1-8/2.
[6] T. Grant, "Overview and conclusions", in Proceedings of the IEE Colloquium on Intelligent Planning and Scheduling Solutions 1996, pp. 7/1-7/4.
[7] B. Drabble, "Modern planning and scheduling technologies", in Proceedings of the IEE Colloquium on Intelligent Planning and Scheduling Solutions 1996, pp. 3/1-3/6.
[8] J. Reaidy, P. Massotte, L. Yingjiu and D. Diep, "Product and process reconfiguration based on intelligent agents", in Proceedings of the 2001 IEEE International Conference on Systems, Man, and Cybernetics, pp. 3397-3402.
[9] M. Fletcher and S. M. Deen, "Task rescheduling in multi-agent manufacturing", in Proceedings of the 1999 International Workshop on Database and Expert Systems Applications, pp. 689-694.