Development and application of lightweight cloud platform workflow system




Qing Hou*, Qingsheng Xie, Shaobo Li
Key Laboratory of Advanced Manufacturing Technology, Ministry of Education, Guizhou University, Guiyang, Guizhou 550003, China

Received 1 October 2014, www.cmnt.lv

Abstract: Based on the principle of the colored Petri net, this paper introduces a lightweight cloud platform workflow system that delivers BI services on Hadoop. The system aims to achieve quick development and deployment of business processes via the workflow engine and to provide personalized BI application construction and services on a rental basis. The paper expounds the workflow engine's creation principles, activity analysis, scheduling algorithm and specific application, and illustrates the deployment and implementation of the parallel BI cloud platform. The platform has been applied to the construction project management of a telecom operator and has achieved practical results.

Keywords: workflow engine, cloud computing, business process management, business intelligence (BI)

1 Introduction

The technical principles of workflow can be divided into Petri nets, DAGs, rule-based descriptions, etc. [1-3]. A workflow separates work into coloring and tasks and executes them according to a fixed standard, so that the regular activities of a fixed program of work can be carried out in the IT system, with whole-process monitoring and data analysis [4]. It is widely applied in fields such as project management and office automation. The traditional centralized workflow under the C/S mode can efficiently handle general data analysis with tools such as Clementine and SPSS. However, with the coming of the big-data age, the emergence of BI makes users pay more attention to the relationship between explicit data and implicit data.
Besides, with the explosive growth of the data scale, the increase of structured and semi-structured data, and the increasing demand for independent, bursty analysis, the traditional workflow fails to meet the requirements of the big-data age, such as massive data sets and cleaning, OLAP (On-Line Analytical Processing) and data mining [5]. Cloud computing uses distributed technologies to perform the computation required by workflows and to store resources on relatively cheap facilities, delivered as IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service) [6]. The open-source Hadoop has become the foundation of cloud computing. At present, Hadoop has grown into a large ecosystem covering massive data analysis and storage, unstructured data sets and processing, task scheduling and monitoring, etc. For instance, the Hadoop-based BI platform BC-PDM provides users with analysis and decision-making services over the Web after integrating ETL, OLAP, data mining and analysis [7].

Based on the principle of the colored Petri net, this paper proposes a lightweight cloud workflow engine and reports its successful application: the engine, deployed on the Hadoop platform, runs in the construction project management system of a telecom operator. The system accomplishes configurable management of a project through the workflow engine, and achieves distributed analysis of the process and real-time analysis of massive flow-processing data through processing-unit distribution and distributed processing techniques. Running stably online, the system efficiently supports the operator's internal control management and decision analysis.

2 Workflow engine creation

2.1 CREATION PRINCIPLES OF THE WORKFLOW ENGINE

When a user's request arrives, the workflow engine immediately creates a process instance together with the files that save the instance's operational information.
For one process instance, case-structure activities are instantiated at most once, loop-structure activities at least once (i.e., N times), and all other activities exactly once. Meanwhile, the engine instantiates the activities that satisfy the conditions, recording information such as the activity ID, initialization time, participants, application and execution state. The engine then saves this information into the process instance and produces the work list for users. The work list consists of a to-do task list and a done task list: the former provides executable work items, while the latter provides information on completed tasks. According to the scheduling process of the workflow engine, this paper divides the workflow net into a process instantiation module, an activity instantiation module, and a task assignment and execution module.

* Corresponding author's e-mail: 282102166@qq.com

2.2 WORKFLOW ACTIVITY DECOMPOSITION

According to the principles of the engine, the internal data of the workflow comprises five types: organization structure, activity instance, process instance, activity definition and process definition. The data structure of the workflow engine based on these definitions is as follows.
1) Activity definition. Activity information consists of the process the activity belongs to, the specific activity, and the user information for execution. Activities are divided into nine kinds: general activities, and-join, and-join predecessor activities, and-split, or-join, or-split, or-split ending activities, begin and end.
2) Process definition. It defines the list of specific activities, including each activity's predecessor and successor activities in the process.
3) Process instance. It records the instance's creator and the process definition the instance targets.
4) Activity instance. Saved in the task list, it mainly describes the task of a specific activity in the process instance.
5) Organization structure. It stores users' coloring and specialization, and allocates the privileges corresponding to activities in the work list.

2.3 WORKFLOW ENGINE SCHEDULING ALGORITHM

According to the process definition, the workflow engine controls the flow of the workflow, allocates the corresponding tasks to participants, and then automatically invokes applications to execute them. The algorithm comprises the process instantiation module, the activity instantiation module, and the task assignment and execution module. The steps of the algorithm are as follows.
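Before walking through the steps, the five internal data types of Section 2.2 can be sketched as plain Python records. This is a minimal illustration; every class and field name below is a hypothetical rendering of the paper's description, which gives no concrete schema.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class ActivityKind(Enum):
    """The nine activity kinds named in Section 2.2 (names are hypothetical)."""
    GENERAL = auto()
    AND_JOIN = auto()
    AND_JOIN_PREDECESSOR = auto()
    AND_SPLIT = auto()
    OR_JOIN = auto()
    OR_SPLIT = auto()
    OR_SPLIT_ENDING = auto()
    BEGIN = auto()
    END = auto()

@dataclass
class ActivityDefinition:            # 1) activity definition
    activity_id: str
    kind: ActivityKind
    executor: str                    # user information for execution

@dataclass
class ProcessDefinition:             # 2) process definition
    process_id: str
    activities: dict                 # activity_id -> ActivityDefinition
    predecessors: dict               # activity_id -> [activity_id, ...]
    successors: dict                 # activity_id -> [activity_id, ...]

@dataclass
class ProcessInstance:               # 3) process instance
    iid: str
    creator: str
    definition: ProcessDefinition

@dataclass
class ActivityInstance:              # 4) activity instance, kept in the task list
    iid: str                         # owning process instance
    activity_id: str
    state: str = "ready"

@dataclass
class OrgUnit:                       # 5) organization structure
    user: str
    coloring: str                    # the paper's term for the user's role/specialization
    privileges: list = field(default_factory=list)
```

Each record corresponds to one of the five data types; the engine's modules below operate over these structures.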
1) When a user makes a request, the process instantiation module adds the process instance required for execution to the queuing list.
2) Take out the first activity and instantiate it to create an activity instance.
3) Carry out work-item allocation, task assignment and module execution: take the activity allocation task from the process instantiation queuing list and store the work items in the users' work lists.
4) Users execute the tasks in their work lists and put the completed activities into the done-activity list.
5) The engine obtains the next activity from the process instance according to the completed task. It ends if it reaches the ending activity; otherwise it returns to step 3) and re-executes the instantiation activity until the process is completed or aborted from the outside.

The specific scheduling process is shown in Figure 1.

FIGURE 1 The scheduling process of the workflow engine

3 Colored Petri nets

3.1 PRINCIPLES OF THE COLORED PETRI NET

In the 1960s, a formalized modeling tool, the Petri net, came into being. It achieved visual representation in graphics and was strictly proved mathematically. However, it had the following defects.
1) It contained no data concepts, so when data control had to be converted into net structure, the complexity of the model increased.
2) It contained no hierarchy concepts, so large models could not be built from sub-modules.
These defects restricted Petri nets to small scales. In 1981, the Dane Kurt Jensen put forward the hierarchical colored Petri net (abbreviated as CP-net or CPN).
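Returning to the scheduling algorithm of Section 2.3, its five steps can be condensed into a toy dispatch loop. The process encoding and the callback are illustrative assumptions; the paper describes the modules only abstractly.

```python
from collections import deque

def run_engine(successor, execute):
    """Toy dispatch loop following the five steps of Section 2.3.

    `successor` maps an activity id to the next activity id (None after the
    ending activity); `execute` stands in for the user performing the work
    item. Both are illustrative, not the paper's implementation.
    """
    queue = deque(["begin"])          # step 1: enqueue the process instantiation
    done = []                         # the done-activity list
    while queue:
        act = queue.popleft()         # step 2: take out the first activity
        execute(act)                  # steps 3-4: assign and execute the work item
        done.append(act)              # step 4: record the completed activity
        nxt = successor.get(act)      # step 5: fetch the next activity
        if nxt is None:               # ending activity reached
            break
        queue.append(nxt)             # otherwise loop back to step 3
    return done
```

For a linear definition begin → audit → confirm, the loop visits the three activities in order and stops at the ending activity.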
CPN uses color assertions to represent the data type of a token, uses arc expressions to relate transition firing to colored identities, binds each place to an assigned color set, and designates what types of resources may be stored in the place. It combines a strictly formalized descriptive method with graphical visual representation and dynamic simulation, and has been widely applied in the modeling of systems such as concurrent systems, communication systems and distributed systems. CPN can take any complicated data type as a color set. With such representational capability, it can effectively solve the following problems.
1) The various state spaces generated by a dynamic workflow create excessive application nodes, which limits computer solidification.
2) During operation, there is uncertainty about the paths generated from several executive activities, and application problems about coloring.
3) During operation, token dissipation and chaos are caused by the processing of several instances.
A CPN is a nine-tuple [9], \( CPN = (\Sigma, P, T, A, N, C, G, E, I) \),

using the color of a token to describe the properties of the object space. In the following, p(a) denotes the place connected with the arc a, Var(exp) denotes the variable set of the expression exp, C_MS denotes the multiset over the set C, and Type(v) denotes the type of the variable v. The corresponding meanings are listed in Table 1.

TABLE 1 Nine tuples of the CPN workflow engine model

Tuple | Title                   | Meaning
Σ     | Color set               | Data types involved in the workflow engine
P     | Place set               | States of the workflow engine
T     | Transition set          | Operations of the workflow engine
A     | Directed arc set        | Data directions of the workflow engine, A ⊆ (P × T) ∪ (T × P)
N     | Node function           | N: A → (P × T) ∪ (T × P)
C     | Color function          | C: P → Σ, the data types stored in each place
G     | Guard function          | Input conditions for the engine's operation, ∀t ∈ T
E     | Arc expression function | ∀a ∈ A: E(a) ∈ C(p(a))_MS, with Type(Var(E(a))) ⊆ Σ
I     | Initialization function | Initial identity of the engine, ∀p ∈ P: I(p) ∈ C(p)_MS

4 CPN triggering analysis

m(P) is a multiset of color marks, used to represent the tokens in a place. For example, the statement that a place P with color set C(P) = {g, r} contains two tokens of color g and three tokens of color r is written as

    \( m(P) = 2\,g + 3\,r \in C(P)_{MS}. \)    (1)

Equation (1) indicates that the color set C(P) of every place defines the token colors that are granted access; the color set of every arc in A is included in C(P), and the token colors carried by an arc belong to that arc's color set. A transition fires when the trigger rule and the arc expression E permit it. For \( p_i \in {}^{\bullet}t_i \) and \( p_k \in t_i^{\bullet} \), if \( E(p_i, t_i) \le m(p_i) \), then \( t_i \) is enabled, and firing it creates the new marking \( m' \):

    \( m'(p_k) = m(p_k) + E(t_i, p_k), \quad m'(p_i) = m(p_i) - E(p_i, t_i), \)    (2)

subject to the guard condition

    \( \mathrm{Type}(G(t)) = \mathrm{Bool}, \quad \mathrm{Type}(\mathrm{Var}(G(t))) \subseteq \Sigma. \)    (3)

If and only if every importing place p of \( t_i \) contains at least as many tokens as the arc expression \( E(p, t_i) \) on the arc \( (p, t_i) \) demands can the transition \( t_i \) be triggered.
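The enabling test and the marking update of Equations (1) and (2) can be sketched with Python multisets (`collections.Counter`); the place names, colors and arc inscriptions below are illustrative, not taken from the paper.

```python
from collections import Counter

def enabled(marking, t, pre):
    """t is enabled iff each input place holds at least the tokens its arc
    expression E(p, t) demands (the enabling condition before Eq. (2))."""
    for p, need in pre.get(t, {}).items():
        have, demand = Counter(marking[p]), Counter(need)
        if any(have[color] < n for color, n in demand.items()):
            return False
    return True

def fire(marking, t, pre, post):
    """Firing removes E(p, t) tokens from the input places and adds E(t, p)
    tokens to the output places, yielding the new marking m' of Eq. (2)."""
    m = {p: Counter(tokens) for p, tokens in marking.items()}
    for p, need in pre.get(t, {}).items():
        m[p] = m[p] - Counter(need)     # Counter subtraction keeps counts >= 0
    for p, out in post.get(t, {}).items():
        m[p] = m[p] + Counter(out)
    return {p: sorted(c.elements()) for p, c in m.items()}
```

With m(P) = 2g + 3r as in Equation (1) and an arc demanding one g and one r, firing consumes those two tokens from P and deposits the arc expression's output tokens in the exporting place.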
When the trigger happens, \( t_i \) consumes the assigned number of colored tokens \( E(p, t_i) \) from each importing place p and stores tokens in the exporting place \( p_k \). That is to say, the arc \( (p, t_i) \) sets the exact number of tokens to be retrieved from p, and the arc \( (t_i, p_k) \) sets the exact number of tokens to be inserted into \( p_k \).

5 Cloud platform realization

5.1 CLOUD PLATFORM DEPLOYMENT

The algorithmic analysis units are combined sequentially to realize the application and analysis of BI on the cloud system. The cloud platform consists of the following three parts:
1) Web client: the interface used by users;
2) Workflow engine: realizes process analysis, distribution, execution and monitoring on the basis of CPN;
3) Cloud platform: integrates the BI operation-analysis algorithms and provides cloud storage and cloud computing services based on Hadoop; encapsulates all activity nodes of the workflow into the <invoke> nodes of Web services; invokes the corresponding abstract types to achieve dynamic mounting of different objects and finally completes the whole process.

When users submit their requirements from the Web client, the specific activity on the cloud platform is shown in Figure 2:
1) The workflow submits transaction requirements to the workflow engine for processing; the engine analyzes them as a DAG (Directed Acyclic Graph) based on parameter instantiation and saves it in the process list;
2) If the user's requirements are cloud transaction requirements, the deployment module analyzes and invokes the relevant BI algorithms and submits them to the cloud platform;
3) The JobTracker on the cloud platform arranges and executes the jobs, processes them through the computation mode over the Hadoop Distributed File System (HDFS), and stores the final results in the BigTable-style store HBase;
4) The system sends the results back to the users, completing the whole process.
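Steps 1 and 2 of this flow can be sketched as a small routing function. The DAG encoding, the `cloud` flag and both callbacks are assumptions; the paper describes the engine and deployment module only at the architecture level.

```python
def dispatch(dag, run_local, submit_cloud_job):
    """Route each node of the analyzed DAG either to the local executor or,
    for cloud transactions, to the deployment module that submits a Hadoop
    job (steps 1-2 above). Node encoding and `cloud` flag are assumed."""
    results = {}
    for node in dag:                  # nodes assumed already in topological order
        if node.get("cloud"):
            results[node["id"]] = submit_cloud_job(node)   # step 2: to the cloud
        else:
            results[node["id"]] = run_local(node)          # ordinary transaction
    return results
```

In the real system the cloud branch would hand the node to the JobTracker as in step 3; here the callbacks simply stand in for those services.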
FIGURE 2 The deployment and execution of the cloud platform

5.2 SYSTEM REALIZATION

Combining CPN with the enterprise's rapid information development platform, the researchers developed the lightweight workflow middleware. The middleware achieves graphical process configuration and route control. Process instantiation can dynamically judge the executing node and executing route according to parameters and rules. Dark-colored parts indicate executed nodes, and black parts indicate nodes that need not execute.

Other lighter-colored nodes show that the process has not yet been completed, as shown in Figure 3.

FIGURE 3 The process of an operator's construction project

The storage space of the cloud system is extended at a ratio of 3 to 1, totaling 24 TB. In the cloud platform environment, an IBM 3850 is used as the Web server and database server; an IBM 3650 is used as the workflow engine server; another IBM 3650 is used as the master control server of the Hadoop platform; and six older two-way, four-core Xeon E5405 @ 2 GHz servers, each with 16 GB of memory and a 4 TB hard disk, are used as the sub-nodes of the cloud platform. In the BI analysis, the cloud platform is refined down to node granularity to count, analyze and process the information of the different links in the process. The information involved includes the details of any single item, timeout details of multi-dimensional projects, the operational conditions of different departments in the process, timeout details of to-do tasks, and the corresponding evaluation of coloring.
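The storage sizing quoted in this section (about 6000 projects a year, 7.2 TB of data, a 3-to-1 extension, 24 TB in total across six 4 TB sub-nodes) is consistent with HDFS's default three-way replication; a quick arithmetic check, with all figures taken from the text:

```python
# All figures are quoted from the text; only the arithmetic is new.
projects_per_year = 6000
avg_size_gb = 1.2                  # 7.2 TB spread over about 6000 projects
replication = 3                    # HDFS default replication factor ("3 to 1")
raw_tb = round(projects_per_year * avg_size_gb / 1000, 1)   # 7.2 TB of data
needed_tb = round(raw_tb * replication, 1)                  # 21.6 TB replicated
cluster_tb = 6 * 4                 # six sub-nodes with a 4 TB disk each = 24 TB
```

The replicated 21.6 TB fits within the 24 TB of raw capacity across the six sub-nodes.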
The platform can then analyze the whole construction process and the proportion of the capital budget used, broken down by initiating department and location. Based on Hadoop, the cloud platform analysis provides real-time operational data and supports the fair management and quick decision-making of the enterprise, as shown in Figure 5.

FIGURE 5 The cloud platform of an operator's construction project management

The cloud service system for the operator's construction project management integrates the workflow middleware, the BI analysis module and Hadoop version 0.20.2. The system has been deployed and operated successfully and has passed a three-month environment and stress test. The operator's construction projects comprise five categories and 18 sub-categories, configured visually through the workflow middleware. The coloring process and the authority of the process nodes correspond to the categories of client information, are integrated into seamless communication among all the nodes, and handle personalized requirements (such as time limits), as shown in Figure 4.

FIGURE 4 The configuration middleware platform of the workflow

In the system, the size of a single construction project ranges from 200 MB to 6 GB, averaging about 1.2 GB. The projects for the whole province total about 6000 per year, taking up 7.2 TB; considering the requirements of HDFS and its highly fault-tolerant replication mechanism, the storage is provisioned accordingly.

6 Conclusions

Based on CPN, this paper proposes a lightweight cloud platform workflow system and realizes a SaaS application. The system combines parallel BI computation and analysis and integrates the distributed computation and storage capacity of the cloud platform, thereby providing an effective project management and operation analysis application for the operator's construction project management.
The application indicates that the system can support province-wide, highly concurrent multi-user access, invoke data efficiently for algorithmic analysis, and display real-time data analysis results. The next steps of research on the cloud platform can focus on: 1) further improving the efficiency of the cloud workflow; 2) establishing an ODS and effective data cleaning; 3) improving BI functions such as OLAP and data mining, and moving the analysis to the cloud.

Acknowledgments

This work was supported by the National Key Technology R&D Program, Guizhou cultural heritage digitalization protection and development key technology research and application, Project (2014BAH05F01); and the Ministry of Science and Technology Innovation Fund for Technology-Based Firms, Guizhou Communication Technology Public Service Platforms, Project (12C26245206305).

References

[1] Fan Y, Li X, Wang Q 2008 Bottom level based heuristic for workflow scheduling in grids Chinese Journal of Computers 31(2)
[2] Liang Y, Xu F 2010 Study on modeling of process ontology for enterprise patent resources management Computer Engineering and Applications 46(1)
[3] Wu S 2009 Workflow model of construction projects based on Petri net Computer Engineering and Applications 45(30)
[4] Fan Y 2001 Basic Technology of Workflow Management Systems Beijing: Tsinghua University Press
[5] Yu L, Zhao S, Zhang Y, et al. 2013 The application of cloud workflow technology in BI SaaS Computer Integrated Manufacturing Systems 19(9)
[6] Yan G, Yu J, Yang X 2013 Scientific two-phase workflow scheduling strategy under cloud computing environment Computer Application 33(4)
[7] Yu L, Zheng J, Wu B 2012 BC-PDM: data mining, social network analysis and text mining system based on cloud computing Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining New York, NY, USA: ACM 1496-9
[8] Jensen K 1994 An introduction to the theoretical aspects of colored Petri nets A Decade of Concurrency, Lecture Notes in Computer Science 803 230-72
[9] Li B, Zhang L, Ren L 2012 Typical characteristics, technologies and applications of cloud manufacturing Computer Integrated Manufacturing Systems 18(7)

Authors

Hou Qing, born 1980, Tianjin, China.
Current position, grades: senior engineer and doctoral candidate, vice president of Guizhou Planning & Design Institute of Posts & Telecommunications.
Scientific interest: computer networks, computer integrated manufacturing systems (CIMS).

Xie Qingsheng, born 1954, Guiyang, China.
Current position, grades: professor and doctoral supervisor.
Scientific interest: network manufacturing, CIMS and computer-aided innovation design.

Li Shaobo, born 1973, Shaoyang, Hunan Province, China.
Current position, grades: professor and doctoral supervisor.
Scientific interest: information management and information systems, cyber information security, E-Government, computer-aided innovation.