Data Science. Research Theme: Process Mining




Process mining is a relatively young research discipline that sits between computational intelligence and data mining on the one hand and process modeling and analysis on the other. The idea of process mining is to discover, monitor, and improve real processes (not assumed processes) by extracting knowledge from event logs readily available in today's information systems. Process mining research is one of the cornerstones of NIRICT's Data Analytics for Smart Services (DSS) initiative.

True Business Intelligence

The term business intelligence (BI) refers to a broad collection of tools and methods that use data to support decision making. BI is, unfortunately, an oxymoron in many companies, which use primitive tools to monitor and analyze processes. Moreover, most BI vendors offer products that are data-centric and focus on rather simplistic forms of analysis, such as dashboards and scorecards. Mainstream BI tools are not as intelligent as the term suggests, and end users are easily confused by the marketing terms used by BI vendors. Nevertheless, the market for BI products is steadily growing, which shows BI's practical relevance.

Within DSS we aim to develop truly intelligent approaches that unlock the intelligence hidden inside the massive amounts of data created and generated in the business and social processes of our current and future networked world. Process mining techniques can be used to achieve this and thus facilitate the engineering of smart services for the networked world.

Research:
- automated process discovery (extracting process models from an event log)
- conformance checking (monitoring deviations by comparing model and log)
- social network and organizational mining
- automated construction of simulation models
- model extension and repair
- case prediction
- history-based recommendations

[Figure: the scope of process mining, shown on an example process with the activities register, examine thoroughly, examine casually, check ticket, decide, reinitiate, pay compensation, and reject, running from start to end. The figure carries the following annotations:
- The starting point is an event log. Each event refers to a process instance (case) and an activity. Events are ordered, and additional properties (e.g., timestamp or resource data) may be present.
- The event log can be used to discover roles in the organization (e.g., groups of people with similar work patterns). These roles can be used to relate individuals and activities. Role A: Assistant (Pete, Mike, Ellen); Role E: Expert (Sue, Sean); Role M: Manager (Sara).
- Decision rules (e.g., a decision tree based on data known at the time a particular choice was made) can be learned from the event log and used to annotate decisions.
- Discovery techniques can be used to find a control-flow model (in this case a BPMN model) that best describes the observed behavior.
- Performance information (e.g., the average time between two subsequent activities) can be extracted from the event log and visualized on top of the model.]

The figure illustrates the scope of process mining. The starting point for process mining is an event log. All process-mining techniques assume that it's possible to sequentially record events such that each event refers to an activity (a well-defined step in some process) and is related to a particular case, or process instance. Event logs can store additional information about events. In fact, whenever possible, process-mining techniques use extra information such as the resource (person or device) executing or initiating the activity, the timestamp of the event, or data elements recorded with the event (for instance, the size of an order).

We can use event logs to conduct three types of process mining. The first type is discovery. A discovery technique produces a model from an event log without using any a priori information. Process discovery is the most prominent process-mining technique.
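To make the notion of an event log concrete, the following minimal Python sketch groups events into cases, counts which activity directly follows which, and computes the average time between two subsequent activities. The toy log, the tuple layout, and the function names are assumptions made for illustration, and counting directly-follows pairs is only a simplified first step of discovery, not one of the discovery algorithms referred to in the text.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy event log: (case id, activity, resource, timestamp).
EVENT_LOG = [
    ("case1", "register", "Pete", "2011-01-03 09:00"),
    ("case1", "examine casually", "Sue", "2011-01-03 10:00"),
    ("case1", "check ticket", "Mike", "2011-01-03 11:00"),
    ("case1", "decide", "Sara", "2011-01-03 13:00"),
    ("case1", "pay compensation", "Ellen", "2011-01-03 14:00"),
    ("case2", "register", "Mike", "2011-01-03 09:30"),
    ("case2", "check ticket", "Ellen", "2011-01-03 10:30"),
    ("case2", "examine thoroughly", "Sean", "2011-01-03 12:30"),
    ("case2", "decide", "Sara", "2011-01-03 13:30"),
    ("case2", "reject", "Pete", "2011-01-03 14:00"),
    ("case3", "register", "Pete", "2011-01-03 10:00"),
    ("case3", "examine casually", "Sue", "2011-01-03 12:00"),
    ("case3", "check ticket", "Mike", "2011-01-03 12:30"),
    ("case3", "decide", "Sara", "2011-01-03 15:00"),
    ("case3", "pay compensation", "Ellen", "2011-01-03 15:30"),
]


def traces(log):
    """Group events by case and order each case's events by timestamp."""
    by_case = defaultdict(list)
    for case, activity, _resource, stamp in log:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        by_case[case].append((when, activity))
    return {case: sorted(events) for case, events in by_case.items()}


def directly_follows(log):
    """Return (pair counts, average hours between the two activities of each pair)."""
    counts = Counter()
    hours = defaultdict(list)
    for events in traces(log).values():
        # Walk over consecutive events within one case.
        for (t1, a), (t2, b) in zip(events, events[1:]):
            counts[(a, b)] += 1
            hours[(a, b)].append((t2 - t1).total_seconds() / 3600)
    average = {pair: sum(values) / len(values) for pair, values in hours.items()}
    return counts, average


counts, average = directly_follows(EVENT_LOG)
for pair, n in sorted(counts.items()):
    print(pair, n, round(average[pair], 2))
```

The resource column is carried along but not used here; it is the kind of extra information that organizational mining would build on.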
In many organizations, people are surprised to see that existing techniques are indeed able to discover real processes merely based on example executions in event logs.

The second type of process mining is conformance. Here, an existing process model is compared with an event log of the same process. For example, there are various algorithms to compute the percentage of events that can be explained by the model. Conformance checking can confirm whether reality, as recorded in the log, conforms to the model and vice versa.

The third type of process mining is enhancement. Here, the idea is to extend or improve an existing process model using information about the actual process recorded in some event log. Whereas conformance checking measures the alignment between model and reality, this third type of process mining aims at changing or extending the a priori model. For instance, by using timestamps in the event log, you can extend the model to show bottlenecks, service levels, throughput times, and frequencies.

Applications and Relevance

Process mining techniques can be applied in all top sectors identified by the Dutch government. Event data are available in most organizations, and in all sectors there is a continuous need to improve and adapt operational processes. Examples are:

- Top Sector High Tech Systems. Most high-tech systems (wafer steppers, medical equipment, etc.) already record events for remote diagnostics and servicing. Process mining can be used to understand how systems are used in the field, why and when they fail, and how they can be improved.
- Top Sector Logistics. Event data around the physical movement of goods can come from different data sources. Tagging of products (e.g., RFID) and integrated supply chains are generating torrents of data that can be used to improve processes from source to sink.
- Top Sector Health. There is a need to reduce costs in care processes. Today's hospitals and other care providers collect detailed data about individuals. These data can be used to optimize care, both in terms of quality and costs.

Challenges

Process mining is an important tool for modern organizations that must manage nontrivial operational processes.
On the one hand, more and more event data are becoming available. On the other hand, processes and information must be aligned perfectly to meet compliance, efficiency, and customer service requirements. Despite the applicability of process mining, we must still address important challenges; these illustrate that process mining is an emerging discipline. For example, the Process Mining Manifesto lists the following challenges:

C1: Finding, merging, and cleaning event data
    When extracting event data suitable for process mining, we must address several challenges: data can be distributed over a variety of sources, event data might be incomplete, an event log could contain outliers, logs could contain events at different levels of granularity, and so on.

C2: Dealing with complex event logs having diverse characteristics
    Event logs can have very different characteristics. Some event logs might be extremely large, making them difficult to handle, whereas others are so small that they don't provide enough data to draw reliable conclusions.

C3: Creating representative benchmarks
    We need good benchmarks consisting of example data sets and representative quality criteria to compare and improve the various tools and algorithms.

C4: Dealing with concept drift
    The process might be changing while under analysis. Understanding such concept drifts is of prime importance for process management.

C5: Improving the representational bias used for process discovery
    A careful and refined selection of the representational bias is necessary to ensure high-quality process-mining results.

C6: Balancing between quality criteria such as fitness, simplicity, precision, and generalization
    Four competing quality dimensions exist: fitness, simplicity, precision, and generalization. The challenge is to find models that can balance all four dimensions.

C7: Cross-organizational mining
    In some use cases, event logs from multiple organizations are available for analysis. Some organizations, such as supply chain partners, work together to handle process instances; other organizations execute essentially the same process while sharing experiences, knowledge, or a common infrastructure. However, traditional process-mining techniques typically consider one event log in one organization.

C8: Providing operational support
    Process mining isn't restricted to offline analysis; it can also provide online operational support. Detection, prediction, and recommendation are examples of operational support activities.

C9: Combining process mining with other types of analysis
    The challenge is to combine automated process-mining techniques with other analysis approaches (optimization techniques, data mining, simulation, visual analytics, and so on) to extract more insights from event data.

C10: Improving usability for non-experts
    The challenge is to hide the sophisticated process-mining algorithms behind user-friendly interfaces that automatically set parameters and suggest suitable types of analysis.

C11: Improving understandability for non-experts
    The user might have problems understanding the output or be tempted to infer incorrect conclusions. To avoid such problems, process mining tools should present results using a suitable representation, and the trustworthiness of the results should always be clearly indicated.

As an example, consider Challenge C4: Dealing with Concept Drift. The term concept drift refers to a situation in which the process is changing while we're analyzing it. For instance, in the beginning of the event log, two activities might be concurrent, whereas later in the log, they become sequential.
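The concurrent-then-sequential example can be illustrated with a small Python sketch that splits a chronologically ordered list of traces in half and compares the directly-follows pairs seen in each half. The traces, the fixed midpoint split, and the function names are assumptions chosen for illustration; this is not a technique prescribed by the text, and a real analysis would also have to choose where to split and check whether the differences are more than noise.

```python
# Hypothetical traces, ordered by the start time of each case.
# Early cases run "examine" and "check ticket" in either order (concurrent);
# later cases always run "examine" first (sequential).
TRACES = [
    ["register", "examine", "check ticket", "decide"],
    ["register", "check ticket", "examine", "decide"],
    ["register", "examine", "check ticket", "decide"],
    ["register", "check ticket", "examine", "decide"],
    ["register", "examine", "check ticket", "decide"],
    ["register", "examine", "check ticket", "decide"],
    ["register", "examine", "check ticket", "decide"],
    ["register", "examine", "check ticket", "decide"],
]


def follows_pairs(traces):
    """Collect every (a, b) where b directly follows a in some trace."""
    return {(a, b) for trace in traces for a, b in zip(trace, trace[1:])}


def drift_report(traces):
    """Compare directly-follows pairs of the first and second half of the log."""
    mid = len(traces) // 2
    early = follows_pairs(traces[:mid])
    late = follows_pairs(traces[mid:])
    return {"disappeared": early - late, "appeared": late - early}


report = drift_report(TRACES)
print("disappeared:", sorted(report["disappeared"]))
print("appeared:", sorted(report["appeared"]))
```

In this toy log, the pair ("check ticket", "examine") is observed only in the first half, which is the footprint of two activities that were concurrent and then became sequential.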
Processes might change because of periodic or seasonal changes (for example, "in December, there is more demand" or "on Friday afternoon, fewer employees are available") or changing conditions ("the market is getting more competitive"). Such changes impact processes, and detecting and analyzing them is vital. However, most process-mining techniques analyze processes as if they're in steady state.

Impact

Based on the research conducted within NIRICT, powerful process mining tools have been developed. For example, the open-source ProM tool developed at Eindhoven University of Technology provides a high-performing pluggable architecture and a common basis for all kinds of process-mining techniques. Hundreds of plugins are available; for instance, ProM supports dozens of process-discovery algorithms as plugins. ProM is available for download from prom.sf.net and www.processmining.org. Various commercial process mining tools have also been developed based on the research done at Eindhoven University of Technology, e.g., Disco by Fluxicon and Reflect by Perceptive.

ProM has been applied in more than 100 organizations, including municipalities such as Alkmaar, Heusden, and Harderwijk; government agencies such as Rijkswaterstaat, Centraal Justitieel Incasso Bureau, and the Dutch Justice department; insurance-related agencies such as UWV; banks such as ING; hospitals such as AMC and Catharina hospitals; multinational corporations such as DSM and Deloitte; high-tech system manufacturers, such as Philips Healthcare, ASML, Ricoh, and Thales, and their customers; and media companies such as Winkwaves. This illustrates the broad spectrum of situations to which we can apply process mining.

More information

For more information about process mining, visit www.processmining.org or read the book W. van der Aalst, Process Mining: Discovery, Conformance and Enhancement of Business Processes, Springer-Verlag, 2011 (http://springer.com/978-3-642-19344-6). The website provides sample logs, videos, slides, articles, and software. The Process Mining Manifesto can be found on the home page of the IEEE Task Force on Process Mining: http://www.win.tue.nl/ieeetfpm/. The Task Force was established by the IEEE to promote the research, development, education, and understanding of process mining.