CHAPTER 1 INTRODUCTION




1.1 Research Motivation

In today's digital environment, with or without noticing it, we leave digital footprints in various data repositories through our daily activities, whether making a mobile call, using an ATM or swiping a credit card. Similarly, every business transaction leaves a digital trace with the business enterprises that use Enterprise Application Software (EAS) for their activities. In an EAS environment, the data for every transaction is captured and stored automatically in the corresponding event logs. These data repositories have grown to enormous volumes, and modern business enterprises are all looking for ways and means to manage the situation. The growth has also raised many challenges for the scientific community, particularly for researchers in business intelligence and for providers of enterprise data storage solutions, who must handle this data explosion.

One dimension of the solution is enhanced infrastructure for data storage. Global business giants such as IBM and HP have come up with solutions for big data storage, the IBM DS3500 and HP 3PAR StoreServ respectively. The need to handle big digital data has given rise to a new dimension in storage infrastructure and platforms called cloud computing. Both IBM and HP also focus on cloud infrastructure as the next technology trend, not only for storage provision but also for storage management. Merely providing storage or storage management is not enough. The real challenge of the day is to convert this burden of excessive data into an opportunity for business understanding, business re-engineering and optimization of business operations: providing insight, identifying bottlenecks, anticipating problems, recording policy violations, recommending countermeasures, streamlining processes and enhancing existing process standards.
The real challenge with digital data is to trace the footprints of the process and then convert them into visual models. These models can be enhanced further, which leads to process evolution. Machine learning and data mining offer the means to handle this challenge properly. The main goal of process mining is to use enterprise event logs to extract process-related information. For example, by modeling and analyzing a business process, management may get ideas on how to improve the quality and reduce the duration of the activities in the process, which in turn improves operational efficiency.

1.2 Review of Literature

Process mining is a relatively young research field that uses event data to extract process-related information, e.g., to automatically discover a process model by observing the events recorded by an enterprise system over time. Digging deeper into enterprise data has twin advantages. The first and foremost is scientific knowledge that can be used to develop standard operating procedures (SOPs) by removing non-value-adding tasks from the process flow currently in practice. The second is a firm base for developing a practical application system that professionally supports and controls the business processes. This thesis presents the verification, conformance checking and process enhancement of an insurance motor claim process. Since the work presented here builds on prior work in different areas of process mining, the related research is explained in detail below.

Laura Maruster (2006) has shown the application of a discovery method to data from different domains: simulated workflow data, real data registered by an enterprise-specific information system, and hospital data. The discovery method captures the general process model better than a process model containing exceptional paths. It can provide reasons to question an existing process design or reveal new insights into the process under consideration.
The usefulness of the discovered process model shows especially when it is combined with the designed model.

A.J.M.M. Weijters and W.M.P. van der Aalst (2003) propose a process mining technique that uses workflow logs to discover the workflow process as it is actually executed. The technique deals with noise and can also be used to validate workflow processes by uncovering and measuring the discrepancies between prescriptive models and actual process executions.

Henricus Marinus Wilhelmus Verbeek (2008) presents the workflow process definition verification tool Woflan and its supporting concepts. Woflan maps a workflow process definition onto a workflow net (WF-net), which is a Petri net with some additional requirements, and can verify the soundness property and four inheritance relations for the resulting WF-net before the workflow process definition is taken into production. Verbeek concludes that Woflan was the best tool for checking these inheritance relations and the soundness property of any workflow net.

Anne Rozinat (2009) proposes an incremental approach to checking the conformance of a process model and an event log. First, the fitness between the log and the model is ensured, and then the appropriateness of the model is analyzed with respect to the log. One metric (f) is defined to address fitness, and two metrics each are established for structural appropriateness and behavioral appropriateness. Together they allow conformance to be quantified, with the proviso that fitness should be ensured before appropriateness is analyzed. To verify these ideas, a Conformance Checker was implemented within the ProM framework.

W.M.P. van der Aalst (2005) discusses the application of process mining to business alignment. The first assumption is that events are actually logged by some information system. The second, fundamental assumption is that people are not completely controlled by the system, i.e., process mining gives no insight if all decisions are made by the system and users cannot deviate from the default path. Although some systems (e.g., traditional production workflow systems) limit the degree of freedom, the trend is towards more flexible systems.
In a technical sense, the work of Havelund et al. (2004) is highly related. Havelund et al. propose three ways to evaluate linear temporal logic (LTL) formulas: (1) automata-based, (2) using rewriting (based on Maude) and (3) using dynamic programming. More recent work by Van der Aalst and Pesic (2006) shows that LTL formulas can be used not only for verifying properties on process logs but also for executing business processes. In their approach, a process is specified by giving desired and undesired properties in terms of LTL formulas. The system then lets its users perform any task as long as the undesired properties are not violated, while making sure that a case is completed only if all desired properties are fulfilled.

In 1996, Sadiq and Orlowska were among the first to point out that modeling a business process (or workflow) can lead to problems such as livelock and deadlock. In their paper, they present a way to overcome syntactical errors but ignore semantic errors, i.e., they focus on the syntax to show that no deadlocks and livelocks occur.

Agrawal et al. (1998) introduced the idea of applying process mining in the context of workflow management. Their work is based on workflow graphs, which are inspired by workflow products such as IBM MQSeries Workflow (formerly known as FlowMark) and InConcert. The paper defines two problems. The first is to find a workflow graph that generates the events appearing in a given workflow log, and a concrete algorithm is given for tackling it. The approach is quite different from other approaches. Pinter et al. extended the work of Agrawal et al. by assuming that each event in the log refers either to the start or to the completion of an activity. This information is used to derive explicit parallel relations between activities.

Datta (1998) considers process mining as a tool for Business Process Redesign (BPR). In BPR, the starting point is typically assumed to be a set of process models that describe the current process. These models are then analyzed and better process models are constructed. The question Datta addresses is how to obtain this initial set of models. Herbst (2000) also addresses process mining in the context of workflow management, using an inductive approach. The work presented is limited to sequential models.
Schimm (2003) developed a mining tool suitable for discovering hierarchically structured workflow processes, which requires all splits and joins to be balanced. In contrast to other approaches, however, he tries to generate a complete and minimal model, i.e., the model can reproduce the log and no smaller model can do so. Greco et al. (2006) presented a process mining algorithm tailored towards discovering process models that describe the process at different levels of abstraction. Of the two-step approach they presented, the first step is implemented as the Disjunctive Workflow Miner plug-in in the process mining framework ProM.

The approach of Weijters et al. (2003) extends the first step of the alpha-algorithm with a heuristic for determining causal and parallel dependencies between events. These dependencies then serve as input for the alpha-algorithm, making it applicable when the process log contains noise, i.e., exceptional behavior that should not appear in the process model. The work of Weijters et al. (2003) was again implemented in the ProM framework.

1.3 Objectives and Scope

The core objective of this research is to convert the data handling challenge into an opportunity for business improvement. To do so, the thesis employs machine learning techniques to obtain insights into the business processes. This approach is referred to as process discovery. Machine learning is used because human learning is a long and slow process. The most distinctive feature of human learning is that there is no copy process: what one person has learned cannot simply be copied to another, whereas once one machine has been made to learn something, all machines have in principle learned it. Furthermore, the volume of data to be handled is very large.

The first objective of the thesis is to convert the real-world process into visual process models. By using standard notations that are tangible and structured, these models bring clarity and make the business process easier to understand without ambiguity. The models can be tuned and enhanced further to improve the efficiency of the existing process. Every insurance business activity leaves the footprints of the process in its event logs. An event log contains many cases, and every case contains many events, each with a corresponding timestamp. The main goal of process mining is to use event logs to extract process-related information.
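
The log structure described above (a log of cases, each case a time-ordered sequence of events) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the thesis: the case identifiers, activity names and timestamps are invented, and the helpers `build_traces` and `throughput_hours` are hypothetical names introduced here.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical motor-claim events: (case id, activity, timestamp).
# The rows are deliberately out of order, as they may be in a raw log.
EVENTS = [
    ("claim-1", "Register claim", "2013-01-05 09:00"),
    ("claim-2", "Register claim", "2013-01-05 09:30"),
    ("claim-1", "Settle claim", "2013-01-09 16:00"),
    ("claim-1", "Assess damage", "2013-01-07 11:15"),
    ("claim-2", "Reject claim", "2013-01-06 14:00"),
]


def parse(stamp):
    """Turn a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M")


def build_traces(events):
    """Group events by case and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, stamp in events:
        by_case[case_id].append((parse(stamp), activity))
    return {
        case: [activity for _, activity in sorted(rows)]
        for case, rows in by_case.items()
    }


def throughput_hours(events):
    """Hours between the first and the last event of each case."""
    stamps = defaultdict(list)
    for case_id, _, stamp in events:
        stamps[case_id].append(parse(stamp))
    return {
        case: (max(ts) - min(ts)).total_seconds() / 3600
        for case, ts in stamps.items()
    }


if __name__ == "__main__":
    for case, trace in build_traces(EVENTS).items():
        print(case, " -> ".join(trace))
    print(throughput_hours(EVENTS))
```

Each resulting trace is the ordered list of activities for one case, which is the kind of input a discovery technique works from. The per-case duration is included because time is one of the parameters analyzed later in the thesis.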
Data mining has wide scope in business because it can handle rich data sources. Until recently, many industries struggled to handle oversized databases. Data mining yields predictive models with which businesses can address tasks such as database segmentation to identify target customers, process optimization, new product development and marketing strategy.

1.4 Methodology

The research approach used in this thesis is the inductive method. The inductive method is based on observation of the real world and involves learning from examples. It aims to convert the real-world process into a model and tries to induce a general rule from a set of observed instances. In the inductive method, observations from the real world are the authority. The task of constructing class definitions is called induction or concept learning. The process of applying the inductive method is called inductive inference or inductive learning. Inductive learning can be viewed as a heuristic search through a space of symbolic descriptions generated by applying various inference rules to the initial observational statements. It is a process of acquiring knowledge by drawing inductive inferences from the environment. Inductive learning programs could provide both an improvement on current techniques and a basis for developing alternative knowledge acquisition methods.

The basic methods of inference are inductive and deductive. Deductive inference uses generalizations or rules to learn about a specific example or activity and is therefore referred to as a top-down approach. Inductive inference uses specific examples or activities to formulate a generalization and is therefore referred to as a bottom-up approach. Inductive programs can be used in two basic modes: as inductive tools for acquiring knowledge from specific facts or examples, or as parts of machine learning.

In this thesis the inductive inference method is employed. The thesis uses a machine learning approach to discover models from data, and the discovered model can be formally analyzed. The model generator uses the machine learning approach called learning from instruction, which takes its input from the trace table and generates the visual model of the process.
The model is generated in such a way that there is at least one transition between any two activities. The model generator can generate different possible models for the same process. When generating an alternative model, it is very important to take sequence-sensitive activities into account: their order must not be altered, otherwise the objective and purpose of the process will be affected. In this thesis, time and quality are taken as the key parameters for further analysis.
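
To make the step from a trace table to a visual model concrete, the sketch below builds a directly-follows graph and prints it as Graphviz DOT text. A directly-follows graph is a deliberately simple stand-in and is not the model generator of this thesis. The trace table contents and the function names `directly_follows` and `to_dot` are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical trace table: one row per case, activities in execution order.
TRACE_TABLE = {
    "claim-1": ["Register claim", "Assess damage", "Settle claim"],
    "claim-2": ["Register claim", "Reject claim"],
    "claim-3": ["Register claim", "Assess damage", "Settle claim"],
}


def directly_follows(trace_table):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in trace_table.values():
        # zip pairs each activity with its successor in the same case.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def to_dot(counts):
    """Render the counts as Graphviz DOT text, one labelled edge per pair."""
    lines = ["digraph process {"]
    for (a, b), n in sorted(counts.items()):
        lines.append(f'  "{a}" -> "{b}" [label="{n}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(directly_follows(TRACE_TABLE)))
```

The order of activities within each trace is preserved when the pairs are formed, so sequence-sensitive activities keep their observed order in the resulting graph. The DOT output can be passed to any Graphviz renderer to obtain a picture.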

1.5 Contribution of this thesis

This thesis contributes to business optimization using machine learning techniques. Business optimization is the need of the hour in the present competitive global business scenario. The thesis converts the challenge of handling big digital data into an opportunity for business enterprises to understand their business flow better. It also paves the way to detect redundant activities and to identify and remove non-value-adding tasks in the flow of the business process. It attempts to reduce the time and cost of operating the process and to improve the quality of service to the customer. In the present business climate, customer retention is the most challenging task. The thesis uses machine learning techniques to convert the raw data into a process flow model; the activities of the process are then analyzed and enhanced using a process mining tool.