Lluis Belanche + Alfredo Vellido Intelligent Data Analysis and Data Mining or Data Analysis and Knowledge Discovery a.k.a. Data Mining II
An insider's view
Geoff Holmes: WEKA founder
Process Mining
Process Mining (PM) PM sits between computational intelligence (CI) and data mining (DM) on the one hand, and process modeling and analysis on the other. PM aims to discover, monitor and improve real processes by extracting knowledge from event logs.
Why PM? An ever-increasing number of events is being recorded, providing detailed information about the history of processes. On the other hand, there is a need to improve and support business processes in rapidly changing and aggressively competitive environments.
PM includes (automated) process discovery (extracting process models from an event log), conformance checking (monitoring deviations between model and log), organizational mining (including social networks), automated construction of simulation models, model extension, model repair, case prediction, and history-based recommendations.
Event logs are extracted from data sources (e.g., databases, transaction logs, audit trails). Examples of formats are MXML (Mining eXtensible Markup Language) and XES (eXtensible Event Stream). XES was selected by the IEEE Task Force on Process Mining as the standard format for logging events. There are several tools to extract MXML or XES logs from various data sources; see, for example, XESame, ProMimport and Nitro.
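To make the XES format mentioned on this slide a little more concrete, here is a minimal sketch of reading a log written in the style of XES with Python's standard library. The snippet is hand-written for illustration: the case, activity and resource names are invented, and it omits the XML namespace and extension declarations that real XES files normally carry, so the function `read_traces` is an assumption-laden toy rather than a general XES reader.

```python
import xml.etree.ElementTree as ET

# Hand-written log in the style of XES (names and dates are invented).
# Real XES files usually declare a namespace and extensions, omitted here.
XES_SNIPPET = """
<log>
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="register request"/>
      <string key="org:resource" value="Pete"/>
      <date key="time:timestamp" value="2010-12-30T11:02:00"/>
    </event>
    <event>
      <string key="concept:name" value="decide"/>
      <string key="org:resource" value="Sara"/>
      <date key="time:timestamp" value="2011-01-06T11:18:00"/>
    </event>
  </trace>
</log>
"""


def read_traces(xes_text):
    """Return {case id: [(activity, resource, timestamp), ...]}."""
    root = ET.fromstring(xes_text.strip())
    traces = {}
    for trace in root.findall("trace"):
        # The case identifier is a direct child attribute of the trace.
        case_id = None
        for attr in trace.findall("string"):
            if attr.get("key") == "concept:name":
                case_id = attr.get("value")
        events = []
        for event in trace.findall("event"):
            # Each child of <event> is one key/value attribute.
            attrs = {child.get("key"): child.get("value") for child in event}
            events.append((
                attrs.get("concept:name"),
                attrs.get("org:resource"),
                attrs.get("time:timestamp"),
            ))
        traces[case_id] = events
    return traces


if __name__ == "__main__":
    for case, events in read_traces(XES_SNIPPET).items():
        print(case, events)
```

In practice one would rely on one of the extraction tools listed above or on a dedicated process mining library; the point here is only that a trace is a case and each event carries an activity name plus optional attributes.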
Process Mining (PM) PM could be a bridge between DM and business process modeling and analysis, under the umbrella concept of Business Intelligence (BI). It can also be seen as the "missing link" between DM and traditional model-driven BPM. Most DM techniques are not fit as such for process analysis.
Co-existing analytical concepts: Business Activity Monitoring (BAM): technologies enabling the real-time monitoring of business processes. Complex Event Processing (CEP): technologies to process large amounts of events for optimizing the business in real time. Corporate Performance Management (CPM): measuring the performance of a process or organization.
Co-existing management concepts: such as Continuous Process Improvement (CPI), Business Process Improvement (BPI), Total Quality Management (TQM), and Six Sigma. PM enables all these within a single framework.
Six Sigma is a set of strategies, techniques, and tools for process improvement. It was developed by Motorola in 1981 and became famous when it became a successful business strategy at General Electric in 1995. Today, it is used in many industrial sectors. It seeks to improve the quality of process outputs by identifying and removing the causes of defects (errors) and minimizing variability in business processes. It uses a set of quality management methods, including statistical methods. Each Six Sigma project carried out within an organization follows a defined sequence of steps and has quantified value targets, for example: reduce process cycle time, reduce pollution, reduce costs, increase customer satisfaction, or increase profits.
Process Mining (PM) Event logs: All PM techniques assume that events can be recorded sequentially, such that each event refers to an activity (a well-defined step in some process) and is related to a particular case (a process instance). An event log (EL) may store additional information about events: the resource (person or device) executing the activity, the timestamp of the event, or data elements recorded together with the event.
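The assumptions on this slide can be sketched as a small data structure: each event names a case and an activity, and may carry a timestamp and a resource. The class `Event`, the helper `traces_by_case` and the sample records below are all hypothetical, chosen only to illustrate how a flat list of events becomes one ordered trace per case.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    case_id: str         # the process instance this event belongs to
    activity: str        # the well-defined step that was executed
    timestamp: datetime  # when the event was recorded
    resource: str = ""   # optional: person or device executing the activity


def traces_by_case(events):
    """Group events per case, ordering each trace by timestamp."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        grouped[event.case_id].append(event.activity)
    return dict(grouped)


# Invented sample log; note the events are not stored in time order.
LOG = [
    Event("case-1", "register request", datetime(2010, 12, 30, 11, 2), "Pete"),
    Event("case-2", "register request", datetime(2010, 12, 30, 11, 32), "Mike"),
    Event("case-1", "decide", datetime(2011, 1, 6, 11, 18), "Sara"),
    Event("case-1", "examine casually", datetime(2010, 12, 31, 10, 6), "Sue"),
    Event("case-2", "decide", datetime(2011, 1, 5, 9, 0), "Sara"),
]

if __name__ == "__main__":
    for case, trace in traces_by_case(LOG).items():
        print(case, trace)
```

Sorting by timestamp before grouping is one simple way to obtain the sequential order that PM techniques rely on; logs without timestamps would have to preserve recording order instead.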
Process Mining (PM) Discovery: The first type of PM is discovery. A discovery technique takes an event log and produces a model without using any a priori information. Conformance: The second type is conformance: an existing process model is compared with an event log of the same process. Conformance checking can be used to check whether reality, as recorded in the EL, conforms to the model and vice versa. It can be applied to procedural models, organizational models, declarative process models, etc. Enhancement: The third type extends or improves an existing process model using information about the actual process recorded in some EL. It aims at changing or extending the a priori model.
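As a rough illustration of discovery and conformance, the sketch below builds a directly-follows graph from a log and then checks a second log against it. This is a deliberately simple stand-in, not one of the Petri-net discovery algorithms used in practice, and the function names and activity labels are invented for the example.

```python
from collections import Counter


def discover_dfg(traces):
    """Discovery: count how often activity a is directly followed by b."""
    dfg = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg


def deviations(trace, allowed):
    """Conformance: list the steps of a trace that the model does not allow."""
    return [(a, b) for a, b in zip(trace, trace[1:]) if (a, b) not in allowed]


def fitness(traces, allowed):
    """Fraction of traces that show no deviation from the model."""
    conforming = sum(1 for trace in traces if not deviations(trace, allowed))
    return conforming / len(traces)


# Invented log used to discover the model.
HISTORY = [
    ["register", "check", "decide", "pay"],
    ["register", "check", "decide", "reject"],
    ["register", "check", "check", "decide", "pay"],
]

# Invented log to be checked against the discovered model.
NEW_LOG = [
    ["register", "check", "decide", "pay"],
    ["register", "decide", "pay"],  # skips the check
]

if __name__ == "__main__":
    model = discover_dfg(HISTORY)
    for trace in NEW_LOG:
        print(trace, "deviations:", deviations(trace, model))
    print("fitness:", fitness(NEW_LOG, model))
```

Enhancement would go one step further, for instance by annotating each edge of the graph with the average waiting time taken from event timestamps.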
Process Mining (PM): perspectives Control-flow perspective: focuses on the ordering of activities. The goal of mining this perspective is to find a good characterization of all possible paths. The result is typically expressed in terms of a Petri net or some other process notation (EPCs, BPMN, or UML activity diagrams). Organizational perspective: focuses on information about resources hidden in the event log, i.e., which actors (people, systems, roles, or departments) are involved and how they are related. The goal is either to structure the organization by classifying people in terms of roles and organizational units, or to map a social network. Case perspective: focuses on properties of cases. A case can be characterized by its path in the process or by the actors working on it.
Business Process Model and Notation (BPMN) example: a graphical representation for specifying business processes in a business process model.
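For the organizational perspective, one common way to map a social network from an event log is to count handovers of work, i.e., how often one resource is directly followed by another within the same case. The following is a minimal sketch under that assumption; the function name, case identifiers and resource names are invented.

```python
from collections import Counter


def handover_of_work(cases):
    """Count handovers between resources.

    cases maps a case id to its events in execution order,
    each event given as an (activity, resource) pair.
    """
    handovers = Counter()
    for events in cases.values():
        for (_, giver), (_, receiver) in zip(events, events[1:]):
            if giver != receiver:  # ignore consecutive work by the same resource
                handovers[(giver, receiver)] += 1
    return handovers


# Invented sample cases.
CASES = {
    "c1": [("register", "Pete"), ("check", "Sue"), ("decide", "Sara")],
    "c2": [("register", "Pete"), ("check", "Sue"), ("check", "Sue"), ("decide", "Sara")],
}

if __name__ == "__main__":
    for (giver, receiver), count in handover_of_work(CASES).items():
        print(f"{giver} -> {receiver}: {count}")
```

The resulting weighted pairs can be read as the edges of a directed social network, with the counts as edge weights.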
Process Mining (PM): BPM vs. PM Business Process Modeling: 7 phases: In the (re)design phase, a new process model is created or an existing process model is adapted. In the analysis phase, a candidate model and its alternatives are analyzed. Then, the model is implemented (implementation phase) or an existing system is (re)configured (reconfiguration phase). In the execution phase, the designed model is enacted and the process is monitored. Moreover, smaller adjustments may be made without redesigning the process (adjustment phase). In the diagnosis phase, the enacted process is analyzed, and the output of this phase may trigger a new process redesign phase.
Process Mining (PM): BPM vs. PM Process Mining: 5 stages: Plan and justify: includes understanding the available data and process domain. Extract: event data, models, objectives, and questions need to be extracted from systems, domain experts, and management. Control-flow modelling: a control-flow model is constructed and linked to the event log. Here automated process discovery techniques can be used. The event log may be filtered or adapted using the model (e.g., removing outlier cases and inserting missing events). Integrated process model: the control-flow model may be extended with other perspectives (e.g., data, time, and resources). Operational support: the resulting models are used to support running cases (e.g., case prediction and history-based recommendations).
Process Mining (PM): Guiding principles
Process Mining (PM) PM as a building block of BI
Process Mining (PM) PM book
Process Mining (PM) PM IEEE Task Force