Summary and Outlook. Business Process Intelligence Course Lecture 8. prof.dr.ir. Wil van der Aalst.



Transcription

1 Business Process Intelligence Course Lecture 8 Summary and Outlook prof.dr.ir. Wil van der Aalst

2 Overview
Chapter 1 Introduction
Part I: Preliminaries. Chapter 2 Process Modeling and Analysis; Chapter 3 Data Mining
Part II: From Event Logs to Process Models. Chapter 4 Getting the Data; Chapter 5 Process Discovery: An Introduction; Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery. Chapter 7 Conformance Checking; Chapter 8 Mining Additional Perspectives; Chapter 9 Operational Support
Part IV: Putting Process Mining to Work. Chapter 10 Tool Support; Chapter 11 Analyzing Lasagna Processes; Chapter 12 Analyzing Spaghetti Processes
Part V: Reflection. Chapter 13 Cartography and Navigation; Chapter 14 Epilogue

3 Clive Humby (dunnhumby) 2006 Wil van der Aalst TU/e (use only with permission & acknowledgements)

4 data HW/SW systems processes

5 process models as maps

6 Business process maps. The first geographical maps date back to the 7th Millennium BC. Since then cartographers have improved their skills and techniques to create maps, thereby addressing problems such as clearly representing desired traits, eliminating irrelevant details, reducing complexity, and improving understandability.

7 Example of a map: Road map of NL. The map abstracts from smaller cities and less significant roads. Only the bigger cities, highways, and other important roads are shown. Cities aggregate local roads and local districts. Also note the use of color, size, etc.


9 Charles Joseph Minard's 1869 chart showing the number of men in Napoleon's 1812 Russian campaign army at different locations and times, their movements, and the temperatures they encountered on the return path.


11 Illustrating the problem [Figure: process model from start to end with places p1 to p10, activities a to l, and groups x, y, z; elements are annotated with values 0.3, 0.4, 0.6, and 1.0.]

12 Classical top-level view: low-level connections still exist [Figure: top-level model over x, y, z shown together with the detailed model over activities a to l and places p2 to p11.]

13 Seamless zoom (activities visible per threshold, shown within groups x, y, z)
Threshold 1.0: a, e, f, i, j
Threshold 0.6: a, e, f, h, i, j, k
Threshold 0.4: a, b, e, f, g, h, i, j, k, l
Threshold 0.3: a, b, c, d, e, f, g, h, i, j, k, l
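The zoom levels on this slide can be read as a simple significance filter. The sketch below is only an illustration: the `SIGNIFICANCE` values are an assumption, obtained by giving each activity the highest threshold at which it appears on the slide, and `visible_activities` is a hypothetical helper name.

```python
# Assumed significance per activity: an activity that first becomes visible
# at threshold t on the slide is given significance t.
SIGNIFICANCE = {
    "a": 1.0, "e": 1.0, "f": 1.0, "i": 1.0, "j": 1.0,
    "h": 0.6, "k": 0.6,
    "b": 0.4, "g": 0.4, "l": 0.4,
    "c": 0.3, "d": 0.3,
}


def visible_activities(threshold, significance=SIGNIFICANCE):
    """Return the activities shown at a zoom level, sorted by name."""
    return sorted(a for a, s in significance.items() if s >= threshold)


for t in (1.0, 0.6, 0.4, 0.3):
    print(t, visible_activities(t))
```

Lowering the threshold only ever adds activities, which is what makes the zoom seamless rather than tied to a fixed hierarchy.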

14 Most process modeling notations assume a fixed hierarchy: no seamless zoom-in and zoom-out! Traditional hierarchy concepts don't support "Google Maps" abstraction.

15 Example: Reviewing papers (100 cases generating 3730 events). WF-net discovered using the α-algorithm.

16 Fuzzy miner: two views on the same process. Left: fuzzy model showing all activities. Right: fuzzy model showing only two activities. Color and width of an arc indicate the significance of the connection.

17 Balancing between both extremes. Fuzzy model showing all activities; fuzzy model showing only two activities; color and width of an arc indicate the significance of the connection. In between: an aggregated node containing 10 activities, and the inner structure of that aggregated node.

18 Projecting dynamic information on business process maps

19 Projecting traffic jams on maps

20 Business process movies

21 information system as a navigation device

22 Navigation. Whereas a TomTom device continuously shows the expected arrival time, users of today's information systems are often left clueless about likely outcomes of the cases they are working on. Car navigation systems provide directions and guidance without controlling the driver. The driver is still in control, but, given a goal (e.g., to get from A to B as fast as possible), the navigation system recommends the next action to be taken. Operational support provides TomTom functionality for business processes.

23 Recommend: How to get home ASAP? Take a left turn! Detect: You drive too fast! Predict: When will I be home? At 11.26!

24 Relating the process mining framework to cartography and navigation
[Figure: refined process mining framework.]
World: people, machines, business processes, documents, organizations.
Information system(s): record event logs with provenance, split into current data ("pre mortem") and historic data ("post mortem").
Navigation: explore, predict, recommend.
Auditing: detect, check, compare, promote.
Cartography: discover, enhance, diagnose.
Models: de jure models and de facto models, each covering control-flow, data/rules, and resources/organization.

25 What should I have learned from this course?

26 Lecture 1. Understand that process mining combines process model analysis (BPM) and data-oriented analysis (e.g., data mining). Understand the link to data science. Understand the link to data mining (supervised and unsupervised learning). Understand the relation between models and event data: play-out, play-in, and replay. Able to interpret a decision tree. Able to compute entropy (per node and for the whole tree). Understand the concept of information gain.

27 Information Gain Based on Entropy. Note: there can be information gain even though the classification does not change.
Before the split: young (860/314), #young=546, #old=314.
Split on attribute smoker:
smoker = yes: young (195/11), #young=184, #old=11.
smoker = no: young (665/303), #young=362, #old=303.
Information gain is the overall entropy before the split minus the overall entropy after the split.
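The entropy values for this slide can be recomputed from the class counts. The sketch below assumes the usual definitions (entropy in bits, and overall tree entropy as the size-weighted average of the leaf entropies); the function names are illustrative.

```python
from math import log2


def entropy(counts):
    """Entropy (in bits) of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def weighted_entropy(leaves):
    """Overall entropy of a tree: leaf entropies weighted by leaf size."""
    total = sum(sum(leaf) for leaf in leaves)
    return sum(sum(leaf) / total * entropy(leaf) for leaf in leaves)


root = [546, 314]        # [#young, #old] before the split
smoker_yes = [184, 11]   # [#young, #old] for smoker = yes
smoker_no = [362, 303]   # [#young, #old] for smoker = no

e_before = entropy(root)
e_after = weighted_entropy([smoker_yes, smoker_no])
gain = e_before - e_after
print(f"E(before)={e_before:.4f}  E(after)={e_after:.4f}  gain={gain:.4f}")
```

Both leaves still predict "young", yet the split lowers the overall entropy, which is the point of the note on the slide.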

28 Lecture 1 (cont'd). Interpret the results of clustering. Understand the k-means algorithm. Read a dendrogram produced by agglomerative hierarchical clustering. Understand frequent item sets and association rules. Compute the support, confidence, and lift of an association rule. Able to create a confusion matrix (tp, fn, fp, tn) and compute the F1 score.
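As a small worked example of support, confidence, and lift for a rule X ⇒ Y, the sketch below uses the standard count-based definitions. The `orders` data set and the function name `rule_metrics` are made up for illustration.

```python
def rule_metrics(transactions, x, y):
    """Return (support, confidence, lift) of the rule x => y.

    Assumes at least one transaction contains x and at least one contains y.
    """
    x, y = set(x), set(y)
    n = len(transactions)
    n_x = sum(1 for t in transactions if x <= t)
    n_y = sum(1 for t in transactions if y <= t)
    n_xy = sum(1 for t in transactions if (x | y) <= t)
    support = n_xy / n                   # fraction of transactions with X and Y
    confidence = n_xy / n_x              # fraction of X-transactions that also have Y
    lift = (n_xy * n) / (n_x * n_y)      # > 1 means X and Y are positively correlated
    return support, confidence, lift


# Hypothetical transactions (each one is a set of items).
orders = [
    {"tea", "latte"},
    {"tea", "muffin"},
    {"tea", "latte", "muffin"},
    {"latte"},
    {"latte", "muffin"},
]
support, confidence, lift = rule_metrics(orders, {"tea"}, {"muffin"})
print(support, confidence, lift)
```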

29 Association rules and confusion matrix

                 predicted +   predicted -   total
actual +         tp            fn            p
actual -         fp            tn            n
total            p'            n'            N

name         formula
error        (fp+fn)/N
accuracy     (tp+tn)/N
tp-rate      tp/p
fp-rate      fp/n
precision    tp/p'
recall       tp/p
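The measures on this slide, plus the F1 score mentioned in the Lecture 1 goals, can be sketched as follows. F1 is taken here as the harmonic mean of precision and recall; the example counts are hypothetical.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Compute the standard measures from the four confusion matrix cells."""
    p, n = tp + fn, fp + tn     # actual positives and negatives
    p_pred = tp + fp            # predicted positives (p' on the slide)
    total = p + n               # N on the slide
    precision = tp / p_pred
    recall = tp / p
    return {
        "error": (fp + fn) / total,
        "accuracy": (tp + tn) / total,
        "tp-rate": tp / p,
        "fp-rate": fp / n,
        "precision": precision,
        "recall": recall,
        "F1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for illustration.
metrics = confusion_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```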

30 Lecture 2. Understand the limitations of pure model-based analysis. Understand the notion of an event log and process discovery. Understand basic Petri net concepts (marking, liveness, boundedness, soundness). Able to read a simple BPMN diagram. Intuitive understanding of the four basic quality dimensions of process discovery: fitness, precision, generalization, and simplicity. Able to derive the alpha (α) relations (>, →, ||, #) for models and event logs.
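Deriving the α relations from an event log amounts to building a footprint. The following is a minimal sketch, assuming a log given as a collection of traces; the example log is illustrative and the string labels are just one possible encoding of the relations.

```python
from itertools import product


def footprint(log):
    """Map each ordered activity pair to '->', '<-', '||' or '#'."""
    activities = sorted({a for trace in log for a in trace})
    # a > b: a is directly followed by b somewhere in the log
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}
    rel = {}
    for a, b in product(activities, repeat=2):
        ab, ba = (a, b) in follows, (b, a) in follows
        rel[a, b] = "||" if ab and ba else "->" if ab else "<-" if ba else "#"
    return rel


# Illustrative log: each string is a trace, each character an activity.
fp = footprint(["abcd", "acbd", "aed"])
print(fp["a", "b"], fp["b", "c"], fp["b", "e"], fp["d", "b"])
```

Comparing two such footprints cell by cell (model versus log, or log versus log) is also the basis of the footprint comparison covered under conformance checking.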

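31 α algorithm. Let $L$ be an event log over $T$. $\alpha(L)$ is defined as follows.
1. $T_L = \{ t \in T \mid \exists_{\sigma \in L}\ t \in \sigma \}$,
2. $T_I = \{ t \in T \mid \exists_{\sigma \in L}\ t = \mathit{first}(\sigma) \}$,
3. $T_O = \{ t \in T \mid \exists_{\sigma \in L}\ t = \mathit{last}(\sigma) \}$,
4. $X_L = \{ (A,B) \mid A \subseteq T_L \wedge A \neq \emptyset \wedge B \subseteq T_L \wedge B \neq \emptyset \wedge \forall_{a \in A} \forall_{b \in B}\ a \rightarrow_L b \wedge \forall_{a_1,a_2 \in A}\ a_1 \#_L a_2 \wedge \forall_{b_1,b_2 \in B}\ b_1 \#_L b_2 \}$,
5. $Y_L = \{ (A,B) \in X_L \mid \forall_{(A',B') \in X_L}\ A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \}$,
6. $P_L = \{ p_{(A,B)} \mid (A,B) \in Y_L \} \cup \{ i_L, o_L \}$,
7. $F_L = \{ (a, p_{(A,B)}) \mid (A,B) \in Y_L \wedge a \in A \} \cup \{ (p_{(A,B)}, b) \mid (A,B) \in Y_L \wedge b \in B \} \cup \{ (i_L, t) \mid t \in T_I \} \cup \{ (t, o_L) \mid t \in T_O \}$, and
8. $\alpha(L) = (P_L, T_L, F_L)$.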
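The eight steps can be transcribed almost literally into code. The sketch below is a naive illustration, not an efficient implementation: it enumerates all subsets of activities, so it is only suitable for small logs. Places are represented by their (A, B) pairs and by the strings "i_L" and "o_L", which is a representation chosen here for convenience.

```python
from itertools import combinations


def alpha(log):
    """Return (P_L, T_L, F_L) for an event log given as a list of traces."""
    log = [tuple(t) for t in log]
    T_L = {a for t in log for a in t}                      # step 1
    T_I = {t[0] for t in log}                              # step 2
    T_O = {t[-1] for t in log}                             # step 3
    follows = {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}

    def causal(a, b):       # a ->_L b
        return (a, b) in follows and (b, a) not in follows

    def unrelated(a, b):    # a #_L b
        return (a, b) not in follows and (b, a) not in follows

    def independent_subsets():
        """Non-empty subsets of T_L whose members are pairwise unrelated."""
        items = sorted(T_L)
        for k in range(1, len(items) + 1):
            for s in combinations(items, k):
                if all(unrelated(x, y) for x in s for y in s):
                    yield frozenset(s)

    candidates = list(independent_subsets())
    X_L = {(A, B) for A in candidates for B in candidates  # step 4
           if all(causal(a, b) for a in A for b in B)}
    Y_L = {(A, B) for (A, B) in X_L                        # step 5: keep maximal pairs
           if not any((A, B) != (A2, B2) and A <= A2 and B <= B2
                      for (A2, B2) in X_L)}
    P_L = set(Y_L) | {"i_L", "o_L"}                        # step 6
    F_L = ({(a, (A, B)) for (A, B) in Y_L for a in A}      # step 7
           | {((A, B), b) for (A, B) in Y_L for b in B}
           | {("i_L", t) for t in T_I}
           | {(t, "o_L") for t in T_O})
    return P_L, T_L, F_L                                   # step 8


# Illustrative log: each string is a trace, each character an activity.
places, transitions, arcs = alpha(["abcd", "acbd", "aed"])
print(len(places), sorted(transitions), len(arcs))
```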

32 Lecture 2 (cont'd). Able to apply the α algorithm to any event log and interpret the result. Know the limitations of the α algorithm (able to construct event logs resulting in particular problems). Able to show overfitting and underfitting models. Process mining has to balance four forces, pictured like the forces acting on an aircraft (lift, thrust, drag, gravity): fitness (ability to explain observed behavior), generalization (avoiding overfitting), simplicity (Occam's Razor), and precision (avoiding underfitting).

33 Lecture 3. Understand the challenges of process discovery (balancing the four forces and incomplete event logs). Able to read and construct C-nets. Able to convert C-nets into Petri nets (if possible) and vice-versa. Understand the different phases of the heuristic mining approach. Given an event log, compute the dependency measure. Determine the dependency graph based on two thresholds.
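The dependency measure and the two-threshold dependency graph can be sketched as below. The measure used is the common heuristic-mining definition, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1) for a ≠ b and |a>a| / (|a>a| + 1) for self-loops. The log `LOG` is an assumption: it was chosen so that its direct-succession counts agree with arc labels such as 11(0.92) and 13(0.93) on the dependency graph slide.

```python
from collections import Counter


def direct_succession_counts(log):
    """log maps each trace (a sequence of activities) to its frequency."""
    counts = Counter()
    for trace, freq in log.items():
        for a, b in zip(trace, trace[1:]):
            counts[a, b] += freq
    return counts


def dependency(counts, a, b):
    """Dependency measure between a and b, a value in (-1, 1)."""
    ab, ba = counts[a, b], counts[b, a]
    if a == b:
        return ab / (ab + 1)
    return (ab - ba) / (ab + ba + 1)


def dependency_graph(log, min_count, min_dependency):
    """Keep arcs that meet both the frequency and the dependency threshold."""
    counts = direct_succession_counts(log)
    return {(a, b) for (a, b), n in counts.items()
            if n >= min_count and dependency(counts, a, b) >= min_dependency}


# Assumed example log: trace -> frequency (each character is an activity).
LOG = {"ae": 5, "abce": 10, "acbe": 10, "abe": 1, "ace": 1,
       "ade": 10, "adde": 2, "addde": 1}

counts = direct_succession_counts(LOG)
strict = dependency_graph(LOG, min_count=5, min_dependency=0.9)
loose = dependency_graph(LOG, min_count=2, min_dependency=0.7)
print(sorted(strict))
print(sorted(loose - strict))
```

With the stricter thresholds the direct arc from a to e and the self-loop on d drop out, because their frequencies (5 and 4) and dependency values (0.83 and 0.80) do not meet both thresholds.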

34 Dependency graph using a higher threshold (at least 5 direct successions and a dependency of at least 0.9) [Figure: dependency graphs over activities a, b, c, d, e; arcs are labeled frequency(dependency), with labels 11(0.92), 13(0.93), 5(0.83), and 4(0.80).]

35 Lecture 3 (cont'd). Understand the different phases of the two-phase approach based on state-based regions. Able to construct a transition system based on an event log and particular abstraction (past/future, set/bag/sequence, etc.). Able to determine and check state-based regions. Know the limitations of the state-based region approach (able to construct event logs resulting in particular problems).

36 Example of State-Based Region [Figure: transition system with states [ ], [a], [a,b], [a,c], [a,e], [a,b,c], [a,d,e], [a,b,c,d]. Region: enter: b, e; leave: d; do-not-cross: a, c. Petri net with places start, p1, p2, p3, p4, end and transitions a, b, c, d, e.]
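Building a transition system and checking a state-based region can be sketched as follows. The abstraction used is "set of past activities". The example log and the candidate region are assumptions, chosen to be consistent with the states and the enter/leave/do-not-cross labels listed on this slide.

```python
def transition_system(log):
    """States are the sets of activities seen so far (past, set abstraction)."""
    transitions = set()
    for trace in log:
        state = frozenset()
        for activity in trace:
            nxt = state | {activity}
            transitions.add((state, activity, nxt))
            state = nxt
    return transitions


def classify(transitions, region):
    """Classify each activity as enter/leave/do-not-cross, or None if mixed."""
    kinds = {}
    for src, act, dst in transitions:
        if src not in region and dst in region:
            kind = "enter"
        elif src in region and dst not in region:
            kind = "leave"
        else:
            kind = "do-not-cross"
        kinds.setdefault(act, set()).add(kind)
    return {a: k.pop() if len(k) == 1 else None for a, k in kinds.items()}


def is_region(transitions, region):
    """A set of states is a region if every activity behaves uniformly."""
    return None not in classify(transitions, region).values()


# Assumed log: each string is a trace, each character an activity.
ts = transition_system(["abcd", "acbd", "aed"])
region = {frozenset("ab"), frozenset("abc"), frozenset("ae")}
print(classify(ts, region))
print(is_region(ts, region), is_region(ts, {frozenset("ab")}))
```

In the second phase of the approach, each region found this way becomes a place: entering activities produce a token in it and leaving activities consume one.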

37 Lecture 4. Have an overview of additional process mining approaches (genetic, language-based regions, etc.). Comprehend the minimal requirements for event data. Understand the elements of the XES format (not just control-flow). Able to name data quality problems (e.g., imprecise timestamps). Understand that, given a data set, different event logs can be extracted based on different viewpoints. Have a good understanding of available tooling (ProM, Disco, Celonis, Perceptive process mining).

38 Lecture 5. Understand the concept of conformance checking. Able to name the different applications of conformance checking. Able to compute the produced, consumed, missing and remaining tokens given a single trace or whole log. Compute fitness based on counting missing and remaining tokens. Able to interpret the diagnostics of such a fitness computation. Able to compute and compare footprints based on models and logs. Understand the notion of alignments.

39 Fitness = 0.8 [Table: for traces abefcd and abbefccd, the frequency and the produced (p), remaining (r), consumed (c), and missing (m) tokens per trace and over all traces, with sums of p, r, c, and m. Figure: Petri net with places start, p1, p2, p3, p4, p5, end and transitions a, b, c, d, e, f; resulting fitness 0.8.]
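Token-based replay can be sketched in a few lines. The sketch assumes the usual fitness formula, 1/2 (1 - m/c) + 1/2 (1 - r/p), with the environment producing the initial token and consuming the final one. The net `NET` is a hypothetical three-step sequence, not the net shown on the slide, and it assumes every activity in the log has a transition in the net.

```python
from collections import Counter


def replay(net, trace, source="start", sink="end"):
    """Replay one trace; net maps activity -> (input places, output places).

    Returns the token counts (p, c, m, r).
    """
    marking = Counter({source: 1})
    p, c, m = 1, 0, 0                    # the environment produces the first token
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:      # token missing: pretend it was there
                m += 1
            else:
                marking[place] -= 1
            c += 1
        for place in outputs:
            marking[place] += 1
            p += 1
    if marking[sink] == 0:               # the environment consumes the final token
        m += 1
    else:
        marking[sink] -= 1
    c += 1
    r = sum(marking.values())            # tokens left behind
    return p, c, m, r


def fitness(net, log):
    """log maps each trace to its frequency."""
    totals = Counter()
    for trace, freq in log.items():
        for key, value in zip("pcmr", replay(net, trace)):
            totals[key] += freq * value
    return (0.5 * (1 - totals["m"] / totals["c"])
            + 0.5 * (1 - totals["r"] / totals["p"]))


# Hypothetical net: start -a-> p1 -b-> p2 -c-> end
NET = {"a": (["start"], ["p1"]),
       "b": (["p1"], ["p2"]),
       "c": (["p2"], ["end"])}

print(replay(NET, "abc"), replay(NET, "ac"))
print(fitness(NET, {"abc": 3, "ac": 1}))
```

For the trace "ac", activity c fires without a token in p2 (one missing token) and the token in p1 is never consumed (one remaining token), which is the kind of diagnostic the lecture asks students to interpret.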

40 Lecture 6. Understand the concepts of model repair and model extension. Able to interpret the different types of dotted charts. Able to convert a decision point into a classification problem. Able to convert a decision tree for a decision point into guards. Able to replay a timed event log and compute waiting times, service times, and routing probabilities. Able to construct the resource-activity matrix given an event log. Able to construct the handover of work matrix.
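The two matrices can be sketched as follows, assuming the log is a time-ordered list of (case, activity, resource) events. The resource-activity matrix is taken here as the mean number of times per case that a resource performs an activity, and handover of work as a count of direct successions between resources within a case; other normalizations are possible. The events and names are made up.

```python
from collections import Counter, defaultdict


def resource_activity_matrix(events):
    """Average number of times per case that a resource performs an activity."""
    cases = {case for case, _, _ in events}
    counts = Counter((res, act) for _, act, res in events)
    return {key: n / len(cases) for key, n in counts.items()}


def handover_of_work(events):
    """Count direct handovers from one resource to the next within a case."""
    by_case = defaultdict(list)
    for case, _, res in events:          # events are assumed to be time-ordered
        by_case[case].append(res)
    handovers = Counter()
    for resources in by_case.values():
        handovers.update(zip(resources, resources[1:]))
    return handovers


# Hypothetical events: (case id, activity, resource)
EVENTS = [
    (1, "register", "Pete"), (1, "check", "Sue"), (1, "decide", "Sara"),
    (2, "register", "Pete"), (2, "check", "Mike"), (2, "decide", "Sara"),
]

ram = resource_activity_matrix(EVENTS)
how = handover_of_work(EVENTS)
print(ram)
print(dict(how))
```

The handover counts can be read directly as weighted arcs of a social network, and the rows of the resource-activity matrix as profiles for clustering resources.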

41 Lecture 6 (cont'd). Able to create a social network based on the handover of work matrix. Understand how the resource-activity matrix can be used to cluster resources and construct organizational models. Understand the process cube notion as a means to do comparative process mining. Understand how the different types of process mining can be combined to create models covering all perspectives (control-flow, data, resources, time, etc.).

42 Lecture 7. Able to reproduce the refined process mining framework (listing 10 activities). Understand the difference between "pre mortem" and "post mortem" event data and "de jure" and "de facto" models. Understand the three types of operational support: detect, predict, and recommend. Able to explain these concepts using a timed event log, e.g., constructing an annotated transition system to compute the remaining flow time. Understand the difference between declarative and procedural languages.
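Predicting the remaining flow time with an annotated transition system can be sketched as below. States are the sets of activities seen so far, and each state is annotated with the remaining times observed in historic cases; the prediction is the mean of those annotations. The `HISTORY` log, its numeric timestamps, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def annotate(log):
    """Annotate each state with the remaining times seen in historic cases."""
    annotations = defaultdict(list)
    for trace in log:                    # trace: [(activity, timestamp), ...]
        end_time = trace[-1][1]
        state = frozenset()
        for activity, timestamp in trace:
            state = state | {activity}
            annotations[state].append(end_time - timestamp)
    return annotations


def predict_remaining(annotations, partial_trace):
    """Mean remaining time of historic cases in the same state, or None."""
    state = frozenset(activity for activity, _ in partial_trace)
    observed = annotations.get(state)
    return mean(observed) if observed else None


# Hypothetical completed cases ("post mortem" data), times in arbitrary units.
HISTORY = [
    [("a", 0), ("b", 6), ("c", 8), ("d", 13)],
    [("a", 10), ("c", 14), ("b", 26), ("d", 36)],
    [("a", 12), ("e", 22), ("d", 56)],
]

annotations = annotate(HISTORY)
running_case = [("a", 100), ("c", 103), ("b", 110)]   # "pre mortem" data
print(predict_remaining(annotations, running_case))
```

The same annotated states could support detection (flagging a running case whose elapsed time is unusual for its state) and recommendation (choosing the next activity that leads to the state with the lowest expected remaining time).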

43 Lecture 7 (cont'd). Understand the process spectrum (from Lasagna to Spaghetti processes). Able to reproduce the L* life-cycle model for process mining projects. Have an overview of the wide range of possible applications and understand the different opportunities depending on the type of process (Lasagna versus Spaghetti).

44 Lecture 8. Understand that process models can be viewed as maps. Multiple maps for the same reality. Fixed decomposition does not work. Projecting information on maps. Consolidation of the different lectures.

45 Difference between 2IIE0 and 2IIF0. As you know, there are two variants of the course: 2IIE0 (5 ECTS) and 2IIF0 (6 ECTS). The final written test on Wednesday 9/4/2014 will have two variants. The 2IIF0 (6 ECTS) variant includes the content of Lecture 6 and Chapter 8 of the book. The 2IIE0 (5 ECTS) variant does not include the content of Lecture 6 and Chapter 8.

46 closing

47 Overview
Chapter 1 Introduction
Part I: Preliminaries. Chapter 2 Process Modeling and Analysis; Chapter 3 Data Mining
Part II: From Event Logs to Process Models. Chapter 4 Getting the Data; Chapter 5 Process Discovery: An Introduction; Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery. Chapter 7 Conformance Checking; Chapter 8 Mining Additional Perspectives; Chapter 9 Operational Support
Part IV: Putting Process Mining to Work. Chapter 10 Tool Support; Chapter 11 Analyzing Lasagna Processes; Chapter 12 Analyzing Spaghetti Processes
Part V: Reflection. Chapter 13 Cartography and Navigation; Chapter 14 Epilogue

48 Process Mining: A bridge between data mining and business process management

49 Experience the magic of process mining, i.e., discovering and improving processes based on facts rather than fiction!

Process Mining. ^J Springer. Discovery, Conformance and Enhancement of Business Processes. Wil M.R van der Aalst Q UNIVERS1TAT.

Process Mining. ^J Springer. Discovery, Conformance and Enhancement of Business Processes. Wil M.R van der Aalst Q UNIVERS1TAT. Wil M.R van der Aalst Process Mining Discovery, Conformance and Enhancement of Business Processes Q UNIVERS1TAT m LIECHTENSTEIN Bibliothek ^J Springer Contents 1 Introduction I 1.1 Data Explosion I 1.2

More information

Process Mining Data Science in Action

Process Mining Data Science in Action Process Mining Data Science in Action Wil van der Aalst Scientific director of the DSC/e Dutch Data Science Summit, Eindhoven, 4-5-2014. Process Mining Data Science in Action https://www.coursera.org/course/procmin

More information

Process Mining. Data science in action

Process Mining. Data science in action Process Mining. Data science in action Julia Rudnitckaia Brno, University of Technology, Faculty of Information Technology, irudnickaia@fit.vutbr.cz 1 Abstract. At last decades people have to accumulate

More information

Chapter 12 Analyzing Spaghetti Processes

Chapter 12 Analyzing Spaghetti Processes Chapter 12 Analyzing Spaghetti Processes prof.dr.ir. Wil van der Aalst www.processmining.org Overview Chapter 1 Introduction Part I: Preliminaries Chapter 2 Process Modeling and Analysis Chapter 3 Data

More information

Using Process Mining to Bridge the Gap between BI and BPM

Using Process Mining to Bridge the Gap between BI and BPM Using Process Mining to Bridge the Gap between BI and BPM Wil van der alst Eindhoven University of Technology, The Netherlands Process mining techniques enable process-centric analytics through automated

More information

Implementing Heuristic Miner for Different Types of Event Logs

Implementing Heuristic Miner for Different Types of Event Logs Implementing Heuristic Miner for Different Types of Event Logs Angelina Prima Kurniati 1, GunturPrabawa Kusuma 2, GedeAgungAry Wisudiawan 3 1,3 School of Compuing, Telkom University, Indonesia. 2 School

More information

Business Process Modeling

Business Process Modeling Business Process Concepts Process Mining Kelly Rosa Braghetto Instituto de Matemática e Estatística Universidade de São Paulo kellyrb@ime.usp.br January 30, 2009 1 / 41 Business Process Concepts Process

More information

Process Mining and Fraud Detection

Process Mining and Fraud Detection Process Mining and Fraud Detection A case study on the theoretical and practical value of using process mining for the detection of fraudulent behavior in the procurement process Masters of Science Thesis

More information

Using Trace Clustering for Configurable Process Discovery Explained by Event Log Data

Using Trace Clustering for Configurable Process Discovery Explained by Event Log Data Master of Business Information Systems, Department of Mathematics and Computer Science Using Trace Clustering for Configurable Process Discovery Explained by Event Log Data Master Thesis Author: ing. Y.P.J.M.

More information

Chapter 4 Getting the Data

Chapter 4 Getting the Data Chapter 4 Getting the Data prof.dr.ir. Wil van der Aalst www.processmining.org Overview Chapter 1 Introduction Part I: Preliminaries Chapter 2 Process Modeling and Analysis Chapter 3 Data Mining Part II:

More information

Process Mining and Visual Analytics: Breathing Life into Business Process Models

Process Mining and Visual Analytics: Breathing Life into Business Process Models Process Mining and Visual Analytics: Breathing Life into Business Process Models Wil M.P. van der Aalst 1, Massimiliano de Leoni 1, and Arthur H.M. ter Hofstede 1,2 1 Eindhoven University of Technology,

More information

Mercy Health System. St. Louis, MO. Process Mining of Clinical Workflows for Quality and Process Improvement

Mercy Health System. St. Louis, MO. Process Mining of Clinical Workflows for Quality and Process Improvement Mercy Health System St. Louis, MO Process Mining of Clinical Workflows for Quality and Process Improvement Paul Helmering, Executive Director, Enterprise Architecture Pete Harrison, Data Analyst, Mercy

More information

ProM 6 Exercises. J.C.A.M. (Joos) Buijs and J.J.C.L. (Jan) Vogelaar {j.c.a.m.buijs,j.j.c.l.vogelaar}@tue.nl. August 2010

ProM 6 Exercises. J.C.A.M. (Joos) Buijs and J.J.C.L. (Jan) Vogelaar {j.c.a.m.buijs,j.j.c.l.vogelaar}@tue.nl. August 2010 ProM 6 Exercises J.C.A.M. (Joos) Buijs and J.J.C.L. (Jan) Vogelaar {j.c.a.m.buijs,j.j.c.l.vogelaar}@tue.nl August 2010 The exercises provided in this section are meant to become more familiar with ProM

More information

Process Modelling from Insurance Event Log

Process Modelling from Insurance Event Log Process Modelling from Insurance Event Log P.V. Kumaraguru Research scholar, Dr.M.G.R Educational and Research Institute University Chennai- 600 095 India Dr. S.P. Rajagopalan Professor Emeritus, Dr. M.G.R

More information

BIS 3106: Business Process Management. Lecture Two: Modelling the Control-flow Perspective

BIS 3106: Business Process Management. Lecture Two: Modelling the Control-flow Perspective BIS 3106: Business Process Management Lecture Two: Modelling the Control-flow Perspective Makerere University School of Computing and Informatics Technology Department of Computer Science SEM I 2015/2016

More information

Trace Clustering in Process Mining

Trace Clustering in Process Mining Trace Clustering in Process Mining M. Song, C.W. Günther, and W.M.P. van der Aalst Eindhoven University of Technology P.O.Box 513, NL-5600 MB, Eindhoven, The Netherlands. {m.s.song,c.w.gunther,w.m.p.v.d.aalst}@tue.nl

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Dotted Chart and Control-Flow Analysis for a Loan Application Process

Dotted Chart and Control-Flow Analysis for a Loan Application Process Dotted Chart and Control-Flow Analysis for a Loan Application Process Thomas Molka 1,2, Wasif Gilani 1 and Xiao-Jun Zeng 2 Business Intelligence Practice, SAP Research, Belfast, UK The University of Manchester,

More information

Data Science. Research Theme: Process Mining

Data Science. Research Theme: Process Mining Data Science Research Theme: Process Mining Process mining is a relatively young research discipline that sits between computational intelligence and data mining on the one hand and process modeling and

More information

Investigating Clinical Care Pathways Correlated with Outcomes

Investigating Clinical Care Pathways Correlated with Outcomes Investigating Clinical Care Pathways Correlated with Outcomes Geetika T. Lakshmanan, Szabolcs Rozsnyai, Fei Wang IBM T. J. Watson Research Center, NY, USA August 2013 Outline Care Pathways Typical Challenges

More information

Process Mining and the ProM Framework: An Exploratory Survey - Extended report

Process Mining and the ProM Framework: An Exploratory Survey - Extended report Process Mining and the ProM Framework: An Exploratory Survey - Extended report Jan Claes and Geert Poels Department of Management Information Science and Operations Management, Faculty of Economics and

More information

CHAPTER 1 INTRODUCTION

CHAPTER 1 INTRODUCTION CHAPTER 1 INTRODUCTION 1.1 Research Motivation In today s modern digital environment with or without our notice we are leaving our digital footprints in various data repositories through our daily activities,

More information

Process Mining Tools: A Comparative Analysis

Process Mining Tools: A Comparative Analysis EINDHOVEN UNIVERSITY OF TECHNOLOGY Department of Mathematics and Computer Science Process Mining Tools: A Comparative Analysis Irina-Maria Ailenei in partial fulfillment of the requirements for the degree

More information

Process Mining Using BPMN: Relating Event Logs and Process Models

Process Mining Using BPMN: Relating Event Logs and Process Models Noname manuscript No. (will be inserted by the editor) Process Mining Using BPMN: Relating Event Logs and Process Models Anna A. Kalenkova W. M. P. van der Aalst Irina A. Lomazova Vladimir A. Rubin Received:

More information

ProM Framework Tutorial

ProM Framework Tutorial ProM Framework Tutorial Authors: Ana Karla Alves de Medeiros (a.k.medeiros@.tue.nl) A.J.M.M. (Ton) Weijters (a.j.m.m.weijters@tue.nl) Technische Universiteit Eindhoven Eindhoven, The Netherlands February

More information

Analysis of Service Level Agreements using Process Mining techniques

Analysis of Service Level Agreements using Process Mining techniques Analysis of Service Level Agreements using Process Mining techniques CHRISTIAN MAGER University of Applied Sciences Wuerzburg-Schweinfurt Process Mining offers powerful methods to extract knowledge from

More information

Chapter 5 Process Discovery: An Introduction

Chapter 5 Process Discovery: An Introduction Chapter 5 Process Discovery: An Introduction Process discovery is one of the most challenging process mining tasks. Based on an event log, a process model is constructed thus capturing the behavior seen

More information

Combination of Process Mining and Simulation Techniques for Business Process Redesign: A Methodological Approach

Combination of Process Mining and Simulation Techniques for Business Process Redesign: A Methodological Approach Combination of Process Mining and Simulation Techniques for Business Process Redesign: A Methodological Approach Santiago Aguirre, Carlos Parra, and Jorge Alvarado Industrial Engineering Department, Pontificia

More information

Mining Configurable Process Models from Collections of Event Logs

Mining Configurable Process Models from Collections of Event Logs Mining Configurable Models from Collections of Event Logs J.C.A.M. Buijs, B.F. van Dongen, and W.M.P. van der Aalst Eindhoven University of Technology, The Netherlands {j.c.a.m.buijs,b.f.v.dongen,w.m.p.v.d.aalst}@tue.nl

More information

Article. Abstract. This is a pre-print version. For the printed version please refer to www.wisu.de

Article. Abstract. This is a pre-print version. For the printed version please refer to www.wisu.de Article StB Prof. Dr. Nick Gehrke Nordakademie Chair for Information Systems Köllner Chaussee 11 D-25337 Elmshorn nick.gehrke@nordakademie.de Michael Werner, Dipl.-Wirt.-Inf. University of Hamburg Chair

More information

Process Mining The influence of big data (and the internet of things) on the supply chain

Process Mining The influence of big data (and the internet of things) on the supply chain September 16, 2015 Process Mining The influence of big data (and the internet of things) on the supply chain Wil van der Aalst www.vdaalst.com @wvdaalst www.processmining.org http://www.engineersjournal.ie/factory-of-thefuture-will-see-merging-of-virtual-and-real-worlds/

More information

Process Mining: Making Knowledge Discovery Process Centric

Process Mining: Making Knowledge Discovery Process Centric Process Mining: Making Knowledge Discovery Process Centric Wil van der alst Department of Mathematics and Computer Science Eindhoven University of Technology PO Box 513, 5600 MB, Eindhoven, The Netherlands

More information

Master Thesis September 2010 ALGORITHMS FOR PROCESS CONFORMANCE AND PROCESS REFINEMENT

Master Thesis September 2010 ALGORITHMS FOR PROCESS CONFORMANCE AND PROCESS REFINEMENT Master in Computing Llenguatges i Sistemes Informàtics Master Thesis September 2010 ALGORITHMS FOR PROCESS CONFORMANCE AND PROCESS REFINEMENT Student: Advisor/Director: Jorge Muñoz-Gama Josep Carmona Vargas

More information

Lluis Belanche + Alfredo Vellido. Intelligent Data Analysis and Data Mining. Data Analysis and Knowledge Discovery

Lluis Belanche + Alfredo Vellido. Intelligent Data Analysis and Data Mining. Data Analysis and Knowledge Discovery Lluis Belanche + Alfredo Vellido Intelligent Data Analysis and Data Mining or Data Analysis and Knowledge Discovery a.k.a. Data Mining II An insider s view Geoff Holmes: WEKA founder Process Mining

More information

Discovering User Communities in Large Event Logs

Discovering User Communities in Large Event Logs Discovering User Communities in Large Event Logs Diogo R. Ferreira, Cláudia Alves IST Technical University of Lisbon, Portugal {diogo.ferreira,claudia.alves}@ist.utl.pt Abstract. The organizational perspective

More information

Handling Big(ger) Logs: Connecting ProM 6 to Apache Hadoop

Handling Big(ger) Logs: Connecting ProM 6 to Apache Hadoop Handling Big(ger) Logs: Connecting ProM 6 to Apache Hadoop Sergio Hernández 1, S.J. van Zelst 2, Joaquín Ezpeleta 1, and Wil M.P. van der Aalst 2 1 Department of Computer Science and Systems Engineering

More information

Model Discovery from Motor Claim Process Using Process Mining Technique

Model Discovery from Motor Claim Process Using Process Mining Technique International Journal of Scientific and Research Publications, Volume 3, Issue 1, January 2013 1 Model Discovery from Motor Claim Process Using Process Mining Technique P.V.Kumaraguru *, Dr.S.P.Rajagopalan

More information

Relational XES: Data Management for Process Mining

Relational XES: Data Management for Process Mining Relational XES: Data Management for Process Mining B.F. van Dongen and Sh. Shabani Department of Mathematics and Computer Science, Eindhoven University of Technology, The Netherlands. B.F.v.Dongen, S.Shabaninejad@tue.nl

More information

BUsiness process mining, or process mining in a short

BUsiness process mining, or process mining in a short , July 2-4, 2014, London, U.K. A Process Mining Approach in Software Development and Testing Process: A Case Study Rabia Saylam, Ozgur Koray Sahingoz Abstract Process mining is a relatively new and emerging

More information

ProM 6 Tutorial. H.M.W. (Eric) Verbeek mailto:h.m.w.verbeek@tue.nl R. P. Jagadeesh Chandra Bose mailto:j.c.b.rantham.prabhakara@tue.

ProM 6 Tutorial. H.M.W. (Eric) Verbeek mailto:h.m.w.verbeek@tue.nl R. P. Jagadeesh Chandra Bose mailto:j.c.b.rantham.prabhakara@tue. ProM 6 Tutorial H.M.W. (Eric) Verbeek mailto:h.m.w.verbeek@tue.nl R. P. Jagadeesh Chandra Bose mailto:j.c.b.rantham.prabhakara@tue.nl August 2010 1 Introduction This document shows how to use ProM 6 to

More information

Business Intelligence and Process Modelling

Business Intelligence and Process Modelling Business Intelligence and Process Modelling F.W. Takes Universiteit Leiden Lecture 7: Network Analytics & Process Modelling Introduction BIPM Lecture 7: Network Analytics & Process Modelling Introduction

More information

Feature. Applications of Business Process Analytics and Mining for Internal Control. World

Feature. Applications of Business Process Analytics and Mining for Internal Control. World Feature Filip Caron is a doctoral researcher in the Department of Decision Sciences and Information Management, Information Systems Group, at the Katholieke Universiteit Leuven (Flanders, Belgium). Jan

More information

Decision Mining in Business Processes

Decision Mining in Business Processes Decision Mining in Business Processes A. Rozinat and W.M.P. van der Aalst Department of Technology Management, Eindhoven University of Technology P.O. Box 513, NL-5600 MB, Eindhoven, The Netherlands {a.rozinat,w.m.p.v.d.aalst}@tm.tue.nl

More information

Process Mining Online Assessment Data

Process Mining Online Assessment Data Process Mining Online Assessment Data Mykola Pechenizkiy, Nikola Trčka, Ekaterina Vasilyeva, Wil van der Aalst, Paul De Bra {m.pechenizkiy, e.vasilyeva, n.trcka, w.m.p.v.d.aalst}@tue.nl, debra@win.tue.nl

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10 1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom

More information

Data Mining Applications in Higher Education
