Methods for the specification and verification of business processes MPB (6 cfu, 295AA)

Transcription

1 Methods for the specification and verification of business processes MPB (6 cfu, 295AA) Roberto Bruni Process Mining 1

2 Object We overview the key principles of process mining 2

3 Process Mining Process mining is a relatively young research discipline that sits between machine learning and data mining on the one hand and process modeling and analysis on the other hand. The idea of process mining is to discover, monitor and improve real processes (i.e., not assumed processes) by extracting knowledge from event logs readily available in today's systems. 3

4 Processes, Cases, Events, Attributes A process consists of cases. A case consists of events such that each event relates to precisely one case. Events within a case are ordered. Events can have attributes. Examples of typical attribute names are activity, time, costs, and resource. 4

5 Event Logs Let us assume that it is possible to sequentially record events such that each event: refers to an activity (i.e., a well-defined step in the process) and is related to a particular case (i.e., a process instance). 5

6 Event Log Example Table 1.1 A fragment of some event log: each line corresponds to an event
Case id  Time   Activity            Resource
1        11.02  Register request    Pete
1        10.06  Examine thoroughly  Sue
1        15.12  Check ticket        Mike
1        11.18  Decide              Sara
1        14.24  Reject request      Pete
2        11.32  Register request    Mike
2        12.12  Check ticket        Mike
2        14.16  Examine casually    Pete
2        11.22  Decide              Sara
2        12.05  Pay compensation    Ellen
3        14.32  Register request    Pete
3        15.06  Examine casually    Mike

7 Mining Scheme Fig. 1.4 Positioning of the three main types of process mining: discovery, conformance, and enhancement 7 (Thursday 13 December 2012)

8 Discovery A discovery technique takes an event log and produces a model without using any a-priori information. If the event log contains information about resources, one can also discover resource-related models, e.g., a social network showing how people work together in an organization. 8

9 Conformance An existing process model is compared with an event log of the same process. Conformance checking can be used to check if reality, as recorded in the log, conforms to the model and vice versa. Conformance checking may be used to detect, locate and explain deviations, and to measure the severity of these deviations. 9

10 Enhancement The idea is to extend/improve an existing process model using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a-priori model. 10

11 Enhancement: Repair One type of enhancement is repair, i.e., modifying the model to better reflect reality. For example, if two activities are modeled sequentially but in reality can happen in any order, then the model may be corrected to reflect this. 11

12 Four Perspectives 12

13 Control-Flow Perspective The control-flow perspective focuses on the control-flow, i.e., the ordering of activities. The goal of mining this perspective is to find a good characterization of all possible paths, e.g., expressed in terms of a Petri net or some other notation (e.g., EPC, BPMN, and UML AD). We shall focus on this perspective. 13

14 Organizational Perspective The organizational perspective focuses on information about resources hidden in the log, i.e., which actors (e.g., people, systems, roles, and departments) are involved and how they are related. The goal is either to structure the organization by classifying people in terms of roles and organizational units or to show the social network. 14

15 Case Perspective The case perspective focuses on properties of cases. Obviously, a case can be characterized by its path in the process or by the originators working on it. However, cases can also be characterized by the values of the corresponding data elements. For example, if a case represents a replenishment order, it may be interesting to know the supplier or the number of products ordered. 15

16 Time Perspective The time perspective is concerned with the timing and frequency of events (performance checking). When events bear timestamps it is possible to discover bottlenecks, measure service levels, monitor the utilization of resources, and predict the remaining processing time of running cases. 16

17 Play-in, Play-out, Replay 17

18 Play-in 18

19 Play-out 19

20 Replay Fig. 1.8 Three ways of relating event logs (or other sources of information containing example behavior) and process models: Play-in, Play-out, and Replay. … than 56 cigarettes tend to die young") and association rules ("people that buy diapers also buy beer"). Unfortunately, it is not possible to use conventional data mining techniques to Play-in process models. Only recently, process mining techniques have become readily available to discover process models based on event … 20

21 An Example 21

22 Event Log Example Table 1.1 A fragment of some event log: each line corresponds to an event (rows as on slide 6) 22

23 Event Log Example Table 1.1 (Continued) A fragment of some event log: each line corresponds to an event
Case id  Time   Activity            Resource
3        16.34  Check ticket        Ellen
3        09.18  Decide              Sara
3        12.18  Reinitiate request  Sara
3        13.06  Examine thoroughly  Sean
3        11.43  Check ticket        Pete
3        09.55  Decide              Sara
3        10.45  Pay compensation    Ellen
4        15.02  Register request    Pete
4        12.06  Check ticket        Mike
4        14.43  Examine thoroughly  Sean
4        12.02  Decide              Sara
4        15.44  Reject request      Ellen
5        09.02  Register request    Ellen
5        10.16  Examine casually    Mike
5        11.22  Check ticket        Pete
5        13.28  Decide              Sara
5        16.18  Reinitiate request  Sara
5        14.33  Check ticket        Ellen
5        15.50  Examine casually    Mike
5        11.18  Decide              Sara
5        12.48  Reinitiate request  Sara
5        09.06  Examine casually    Sue
5        11.34  Check ticket        Pete
5        13.12  Decide              Sara
5        14.56  Reject request      Mike
6        15.02  Register request    Mike
6        16.06  Examine casually    Ellen
6        16.22  Check ticket        Mike
6        16.52  Decide              Sara
6        11.47  Pay compensation    Mike
Table 1.2 A more compact representation of log shown in Table 1.1: a = register request, b = examine thoroughly, c = examine casually, d = check ticket, e = decide, f = reinitiate request, g = pay compensation, and h = reject request
Case id  Trace
1        a, b, d, e, h
2        a, d, c, e, g
3        a, c, d, e, f, b, d, e, g
4        a, d, b, e, h
5        a, c, d, e, f, d, c, e, f, c, d, e, h
6        a, c, d, e, g
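The step from Table 1.1 to Table 1.2 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `traces_from_events` and the `(case id, activity)` tuple layout are assumptions made here, not part of the slides, and the events are assumed to be already listed in log order.

```python
from collections import defaultdict

# Activity codes as given in the caption of Table 1.2.
CODES = {
    "register request": "a", "examine thoroughly": "b",
    "examine casually": "c", "check ticket": "d", "decide": "e",
    "reinitiate request": "f", "pay compensation": "g",
    "reject request": "h",
}

def traces_from_events(events):
    """Group (case_id, activity) pairs, given in log order, into one trace per case."""
    traces = defaultdict(list)
    for case_id, activity in events:
        traces[case_id].append(CODES[activity.lower()])
    return dict(traces)

# Events of Cases 1 and 2 from the first rows of Table 1.1.
events = [
    (1, "Register request"), (1, "Examine thoroughly"), (1, "Check ticket"),
    (1, "Decide"), (1, "Reject request"),
    (2, "Register request"), (2, "Check ticket"), (2, "Examine casually"),
    (2, "Decide"), (2, "Pay compensation"),
]

traces = traces_from_events(events)
for case_id, trace in traces.items():
    print(case_id, ", ".join(trace))
```

Only the activity attribute is used here; timestamps, resources and costs would be needed for the other perspectives.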

24 Discovery Example Fig. 1.5 The process model discovered by the α-algorithm [103] based on the set of traces {<a, b, d, e, h>, <a, d, c, e, g>, <a, c, d, e, f, b, d, e, g>, <a, d, b, e, h>, <a, c, d, e, f, d, c, e, f, c, d, e, h>, <a, c, d, e, g>}. After executing h, the case ends in the desired final marking with just a token in place end. Similarly, it can be checked that the other five traces shown in Table 1.2 are also possible in the model and that all of these traces result in the marking with just a token in place end. 24

25 Discovery Example All cases start with a and end with either g or h. Every e is preceded by d and one of the examination activities (b or c). 25

26 Discovery Example Moreover, e is followed by f, g, or h. The repeated execution of b or c, d, and e suggests the presence of a loop. 26

27 Discovery Example These characteristics are adequately captured by the net. When comparing the event log and the model, there seems to be a good balance between overfitting and underfitting. 27

28 Overfitting and Underfitting One of the challenges of process mining is to balance between overfitting (the model is too specific and only allows for the accidental behavior observed) and underfitting (the model is too general and allows for behavior unrelated to the behavior observed). 28

29 Discussion The Petri net shown also allows for traces not in the log. For example, other possible traces are <a, d, c, e, f, b, d, e, g> and <a, c, d, e, f, c, d, e, f, c, d, e, f, c, d, e, f, b, d, e, g>. This is a desired phenomenon as the goal is not to represent just the particular set of example traces in the event log. Process mining algorithms need to generalize the behavior contained in the log to show the most likely underlying model that is not invalidated by the next set of observations. 29

30 Mining Other Models We used Petri nets to represent the discovered process models, because Petri nets are a succinct way of representing processes and have unambiguous but intuitive semantics. However, some mining techniques are independent of the desired representation. Fig. 1.2 The same process modeled in terms of BPMN 30

31 Another Discovery Example Let us now consider an event log consisting of only two traces <a, b, d, e, h> and <a, d, b, e, h>, i.e., Cases 1 and 4 of the original log. For this log, the α-algorithm constructs the Petri net shown in Fig. 1.6. This model only allows for two traces and these are exactly the ones in the small event log. Fig. 1.6 The process model discovered by the α-algorithm based on Cases 1 and 4, i.e., the set of traces {<a, b, d, e, h>, <a, d, b, e, h>} 31

32 Conformance Example Table 1.3 Another event log: Cases 7, 8, and 10 are not possible according to Fig. 1.5
Case id  Trace
1        a, b, d, e, h
2        a, d, c, e, g
3        a, c, d, e, f, b, d, e, g
4        a, d, b, e, h
5        a, c, d, e, f, d, c, e, f, c, d, e, h
6        a, c, d, e, g
7        a, b, e, g
8        a, b, d, e
9        a, d, c, e, f, d, c, e, f, b, d, e, h
10       a, c, d, e, f, b, d, g
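The conformance check behind Table 1.3 can be illustrated by replaying each trace on a Petri net. The sketch below is not the net of Fig. 1.5 itself, which is not reproduced in this transcript: the place names (`start`, `c1` … `c5`, `end`) and the arcs are assumptions chosen to match the characteristics listed on the discovery slides (every case starts with a, e needs both d and one of b or c, e is followed by f, g, or h, and f loops back).

```python
from collections import Counter

# Assumed encoding: transition -> (input places, output places).
NET = {
    "a": (["start"], ["c1", "c2"]),
    "b": (["c1"], ["c3"]),
    "c": (["c1"], ["c3"]),
    "d": (["c2"], ["c4"]),
    "e": (["c3", "c4"], ["c5"]),
    "f": (["c5"], ["c1", "c2"]),
    "g": (["c5"], ["end"]),
    "h": (["c5"], ["end"]),
}

def fits(trace, net=NET):
    """Replay a trace; it fits if every step is enabled and only 'end' is marked afterwards."""
    marking = Counter({"start": 1})
    for activity in trace:
        inputs, outputs = net[activity]
        if any(marking[p] < 1 for p in inputs):
            return False  # a required token is missing
        for p in inputs:
            marking[p] -= 1
        for p in outputs:
            marking[p] += 1
    # Unary plus drops places with zero tokens before comparing.
    return +marking == Counter({"end": 1})

# Traces of Table 1.3, written as strings of activity codes.
LOG = {
    1: "abdeh", 2: "adceg", 3: "acdefbdeg", 4: "adbeh",
    5: "acdefdcefcdeh", 6: "acdeg", 7: "abeg", 8: "abde",
    9: "adcefdcefbdeh", 10: "acdefbdg",
}

non_fitting = [case for case, trace in LOG.items() if not fits(trace)]
print(non_fitting)
```

Under these assumptions the three failures have different causes: Case 7 skips d, so e is not enabled; Case 8 stops before reaching the final marking; Case 10 executes g without a preceding e.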

33 Process Discovery: α-algorithm 33

34 Process Discovery Process discovery is the activity that combines Discovery with the Control-flow Perspective. The general problem: A process discovery algorithm is a function that maps an event log L onto a process model M such that the model M is representative for the behavior seen in the event log L. We focus on simple event logs and Petri net models (possibly sound workflow nets). 34

35 Simple Event Log Let A be a set of activities. A simple trace over A is a finite sequence of activities. A simple event log over A is a multiset of traces, i.e., $L \in \mathcal{B}(A^*)$.
$L_1 = [\langle a, b, c, d \rangle^3, \langle a, c, b, d \rangle^2, \langle a, e, d \rangle]$ is a simple log describing the history of six cases. The goal is now to discover a Petri net that can replay event log $L_1$. Ideally, the Petri net is a sound WF-net as defined in Sect. … Based on these choices, we reformulate the process discovery problem and make it more concrete.
Definition 5.2 (Specific process discovery problem) A process discovery algorithm is a function $\gamma$ that maps a log $L \in \mathcal{B}(A^*)$ onto a marked Petri net $\gamma(L) = (N, M)$. Ideally, $N$ is a sound WF-net and all traces in $L$ correspond to possible firing sequences of $(N, M)$.
To make things more concrete, we define the target to be a Petri net and use a simple event log as input (cf. Definition 4.4). Function $\gamma$ defines a so-called Play-in technique as described in Chap. 1. Based on $L_1$, a process discovery algorithm $\gamma$ could discover the WF-net $N_1$ (Fig. 5.…), i.e., $\gamma(L_1) = (N_1, [start])$. Each trace in $L_1$ corresponds to a possible firing sequence of WF-net $N_1$. Therefore, it is easy to see that the WF-net can indeed replay all traces in the event log. In fact, each of the three possible firing sequences of WF-net $N_1$ appears in $L_1$.
Let us now consider another event log:
$L_2 = [\langle a, b, c, d \rangle^3, \langle a, c, b, d \rangle^4, \langle a, b, c, e, f, b, c, d \rangle^2, \langle a, b, c, e, f, c, b, d \rangle, \langle a, c, b, e, f, b, c, d \rangle^2, \langle a, c, b, e, f, b, c, e, f, c, b, d \rangle]$
$L_2$ is a simple event log consisting of 13 cases represented by 6 different traces.
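A simple event log is a multiset of traces, which maps naturally onto a counter keyed by trace. The following sketch uses Python's `collections.Counter` for this; the variable name `L2` and the choice to encode each trace as a tuple of activity codes are illustrative assumptions.

```python
from collections import Counter

# L2 as a multiset: each key is a trace, each value its number of cases.
L2 = Counter({
    tuple("abcd"): 3,
    tuple("acbd"): 4,
    tuple("abcefbcd"): 2,
    tuple("abcefcbd"): 1,
    tuple("acbefbcd"): 2,
    tuple("acbefbcefcbd"): 1,
})

total_cases = sum(L2.values())   # number of cases in the log
distinct_traces = len(L2)        # number of different traces
activities = sorted({a for trace in L2 for a in trace})

print(total_cases, distinct_traces, activities)
```

The two counts reproduce the statement that the log consists of 13 cases represented by 6 different traces.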

36 Challenges Simple structure. Other behaviours allowed. No completely unrelated behaviour. Fig. Balancing the four quality dimensions: fitness, simplicity, precision, and generalization 36

37 Appropriateness Fig. Balancing the four quality dimensions: fitness, simplicity, precision, and generalization.
… For example: What is the penalty if a step needs to be skipped and what is the penalty if tokens remain in the WF-net after replay? Later, we will give concrete definitions for fitness. In Sect. …, we defined performance measures like error, accuracy, … fp-rate, precision, recall, and F1 score. Recall, also known as the tp-rate, … the proportion of positive instances indeed classified as positive (tp/p). The traces in the log are positive instances. When such an instance can be replayed by the model, then the instance is indeed classified as positive. Hence, the various notions of fitness can be seen as variants of the recall measure. Most of the notions … in Sect. … cannot be used because there are no negative examples, i.e., … are unknown (see Fig. 3.14). Since the event log does not contain information about events that could not happen at a particular point in time, other notations are …
The simplicity dimension refers to Occam's Razor. This principle was … discussed in Sect. … In the context of process discovery, this means that the simplest model that can explain the behavior seen in the log is the best model. The complexity of the model could be defined by the number of nodes … in the underlying graph. Also more sophisticated metrics can be used, e.g., … that take the structuredness or entropy of the model into account. See … for an empirical evaluation of the model complexity metrics defined in literature. … Sect. …, we also mentioned that this principle can be operationalized … 37

38 α-algorithm The α-algorithm was one of the first process discovery algorithms that could adequately deal with concurrency. It has several limitations, but it provides a good introduction into the topic: The α-algorithm is simple and many of its ideas have been embedded in more complex and robust techniques. The α-algorithm scans the event log for particular patterns, called log-based ordering relations, to create a footprint of the log. 38

39 Log-based Ordering Relations Definition 5.3 (Log-based ordering relations) Let $L$ be an event log over $A$, i.e., $L \in \mathcal{B}(A^*)$. Let $a, b \in A$:
$a >_L b$ if and only if there is a trace $\sigma = \langle t_1, t_2, t_3, \ldots, t_n \rangle$ and $i \in \{1, \ldots, n-1\}$ such that $\sigma \in L$ and $t_i = a$ and $t_{i+1} = b$
$a \rightarrow_L b$ if and only if $a >_L b$ and $b \not>_L a$
$a \#_L b$ if and only if $a \not>_L b$ and $b \not>_L a$
$a \parallel_L b$ if and only if $a >_L b$ and $b >_L a$
39

40 Log-based Ordering Relations: Example Consider for instance $L_1 = [\langle a, b, c, d \rangle^3, \langle a, c, b, d \rangle^2, \langle a, e, d \rangle]$ again. For this event log, the following log-based ordering relations can be found:
$>_{L_1} = \{(a, b), (a, c), (a, e), (b, c), (c, b), (b, d), (c, d), (e, d)\}$
$\rightarrow_{L_1} = \{(a, b), (a, c), (a, e), (b, d), (c, d), (e, d)\}$
$\#_{L_1} = \{(a, a), (a, d), (b, b), (b, e), (c, c), (c, e), (d, a), (d, d), (e, b), (e, c), (e, e)\}$
$\parallel_{L_1} = \{(b, c), (c, b)\}$
Relation $>_{L_1}$ contains all pairs of activities in a "directly follows" relation. $c >_{L_1} d$ because d directly follows c in trace $\langle a, b, c, d \rangle$. However, $d \not>_{L_1} c$ because c never directly follows d in any trace in the log. $\rightarrow_{L_1}$ contains all pairs of activities in a "causality" relation, e.g., $c \rightarrow_{L_1} d$ because sometimes d directly follows c and never the other way around ($c >_{L_1} d$ and $d \not>_{L_1} c$). $b \parallel_{L_1} c$ because $b >_{L_1} c$ and $c >_{L_1} b$, i.e., sometimes c follows b and sometimes the other way around. $b \#_{L_1} e$ because $b \not>_{L_1} e$ and $e \not>_{L_1} b$.
For any log $L$ over $A$ and $x, y \in A$: $x \rightarrow_L y$, $y \rightarrow_L x$, $x \#_L y$, or $x \parallel_L y$, i.e., precisely one of these relations holds for any pair of activities. Therefore, the footprint of a log can be captured in a matrix as shown in Table 5.1. 40
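The four relations of Definition 5.3 can be computed directly from the traces. The sketch below is a plain-Python illustration under stated assumptions: the function names `ordering_relations` and `footprint` are hypothetical, traces are given as strings of activity codes, and trace frequencies are ignored because the relations depend only on which traces occur.

```python
from itertools import product

def ordering_relations(traces):
    """Return the relations >, ->, # and || of Definition 5.3 as sets of pairs."""
    activities = sorted({x for t in traces for x in t})
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    causal = {(a, b) for (a, b) in follows if (b, a) not in follows}
    parallel = {(a, b) for (a, b) in follows if (b, a) in follows}
    unrelated = {(a, b) for a, b in product(activities, repeat=2)
                 if (a, b) not in follows and (b, a) not in follows}
    return follows, causal, unrelated, parallel

def footprint(traces):
    """Footprint matrix as nested dicts: exactly one symbol per pair of activities."""
    activities = sorted({x for t in traces for x in t})
    _, causal, _, parallel = ordering_relations(traces)

    def symbol(a, b):
        if (a, b) in causal:
            return "->"
        if (b, a) in causal:
            return "<-"
        if (a, b) in parallel:
            return "||"
        return "#"

    return {a: {b: symbol(a, b) for b in activities} for a in activities}

L1 = ["abcd", "acbd", "aed"]  # distinct traces of L1
follows, causal, unrelated, parallel = ordering_relations(L1)
fp = footprint(L1)
for a, row in fp.items():
    print(a, " ".join(f"{s:>2}" for s in row.values()))
```

Each row of the printed matrix corresponds to one activity, so the output plays the role of the footprint matrix that the slide refers to as Table 5.1.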