KNOWLEDGE DISCOVERY FOR SUPPLY CHAIN MANAGEMENT SYSTEMS: A SCHEMA COMPOSITION APPROACH




Shi-Ming Huang and Tsuei-Chun Hu*
Department of Accounting and Information Technology
*Department of Information Management
National Chung Cheng University, Taiwan
Email: smhuang@mis.ccu.edu.tw

ABSTRACT

Decision support and global knowledge discovery in SCM systems have become important issues, since SCM is now a common way of achieving competitive advantage. This research proposes a distributed data mining mechanism that uses relationship metadata to integrate the data sources provided by the partners of a supply chain, thus addressing the problem of excessive and varied data in current supply chains. With prior relevance detection, the mining process becomes more efficient. In addition, since the data provided by partners is rarely complete enough for traditional data mining, we use a peculiarity oriented data mining process to generate rules across organizations in order to achieve more competitive advantage, so that the important rules can be mined within our architecture. The results from the experimental system show that it can identify the relevant resources effectively while accommodating more user-defined requirements.

Keywords: Distributed Data Mining, Granule Computing, Peculiarity Oriented Data Mining, Contingency Table, Supply Chain Management.

INTRODUCTION

With the growing popularity of the Internet and e-commerce, individual companies no longer compete as sole entities, but rather as part of a supply chain. The concept of supply chain management (SCM) is to use efficient ways of integrating data and processes among partners to achieve higher customer satisfaction and lower cost. SCM is increasingly important for enterprises. It is considered a necessary IT investment: a company with better SCM capability tends to keep or gain competitive advantage (6). Figure 1 illustrates the supply chain process.
The information flow in SCM serves as the bridge between the various phases of a supply chain, allowing supply partners to coordinate their actions and increase inventory visibility (5). Information sharing in SCM is another important topic. The degree of information sharing and the availability of information at each level of the supply chain have an immediate influence on SCM activities. Operational performance is particularly driven by information. Some studies reveal that information sharing provides significant cost savings and inventory reduction (10). In addition, decision support and decision technologies have become increasingly important in SCM (11). They enable companies to engage in smart business (s-business), the next stage in the evolution of business beyond the supply chain (3). Currently, three technologies are widely adopted as decision tools: (i) data warehousing; (ii) OLAP (On-Line Analytical Processing); and (iii) data mining.

Volume V, No 2, 2004 523 Issues in Information Systems

However, information and decision support in SCM face two serious problems:

Resource variety: the sharable information is dispersed among different members of the supply chain.
Information overloading: according to the study in (7), as shown in Figure 2, the number of information sources available to technical and managerial executives is growing rapidly.

Figure 1 The Supply Chain Process

Figure 2 Growth in information sources available to corporate decision-makers (7)

To resolve these problems, some studies have suggested the distributed data mining approach. However, current distributed data mining algorithms focus on homogeneous data that is horizontally or vertically partitioned into multiple parts. They are not suitable for the heterogeneous data of the supply chain environment. In this study, we propose a distributed data mining mechanism with semantic composition, which can produce the decision rules and the reasons that influence the efficiency of the supply chain. The objectives of this paper are as follows:

1. To investigate a mechanism for building the relationship between different data sources;
2. To investigate a mechanism for retrieving the data for mining;
3. To investigate a mining mechanism for generating the decision rules.

With our approach, distributed knowledge spread over various resources and extremely large datasets can be properly found and integrated, which gives a basic assurance of the quality of the discovered knowledge. In the next section, we first introduce related work. In Section 3, we discuss our distributed data mining mechanism. Section 4 presents an implementation of the DDM (Distributed Data Mining) mechanism in Java and a real case study for the feasibility analysis.
The final section contains the conclusion.

RELATED WORKS

Decision Tools on SCM

Data mining refers to the process of sifting through a large amount of corporate data (1) to find nuggets of information that serve as decision support in enterprises. The operation of a supply chain has some particular characteristics: (i) the relationships between the data of different companies are weak; (ii) the data provided by a company, such as sales and forecast information, is fragmentary or predictive; (iii) the data sources are heterogeneous, meaning that the characteristics of the sources differ. These characteristics have become more and more important in the selection of decision tools for the supply chain. The decision-makers within the supply chain have to choose a better decision model in order to achieve competitive advantage. However, the decision models of SCM have the following drawbacks.

Simplicity: the models focus on a single period and a single echelon, and face computational difficulty.
Dependency: because the driving forces behind supply chain linkages differ, the members of the supply chain have to figure out which model is needed.
Incompleteness for recommendation: reinventing traditional analytical tools will not be the answer to many managerial issues.

Distributed Data Mining

Briefly, data mining, also known as knowledge discovery in databases, denotes extracting or mining knowledge from a large amount of data (8), and has been recognized as a new area of database research. It has been employed in many areas, including decision support, market strategy, and fraud detection, to extract useful information for decision making. The traditional method of data mining relies on centralized data, such as a data warehouse; this makes it fundamentally unsuitable for most distributed and ubiquitous data mining applications. A new architecture, distributed data mining (DDM), has been proposed to solve this problem. Many DDM algorithms have been developed, such as Count Distribution (2), Data Distribution (2) and Fast Distributed Mining (4), all designed for relational databases. Because supply chain data is so diverse, these algorithms cannot deal with heterogeneous data. In (12), a new approach, peculiarity oriented mining, was proposed to solve the problems of heterogeneous data. It focuses on exploring peculiarity rules from different data sources. Peculiarity rules are a typical kind of regularity hidden in some domains, such as scientific, statistical, and transaction databases. They are difficult to discover with standard association rule mining because of its requirement of large support.
MECHANISM FOR DATA MINING ON SCM

Overview

Our proposed mechanism for distributed data mining in the supply chain environment is depicted in Figure 3. The mechanism constructs the relationships between the different tables of the multi-database contributed by the members of the supply chain. It then generates decision rules together with their background knowledge. The mechanism consists of two basic process modules, or phases: Relationship Composition (RC) and Rule Discovery (RD).

Figure 3 An overview of our mechanism

Phase 1: Relationship Composition (RC)

The rules generated by a mining algorithm provide little background knowledge. To explore the background knowledge, the mechanism connects all of the tables and uses a graph structure to represent the relationships. There are three steps for implementing the RC in this model.

Step 1.1: Granule Definition

The objective of granule definition is to transform precise data into granules, each of which represents a specific domain. Each attribute is divided into different granules. There are two granule styles, and the user defines which one applies:

Qualitative style: transforms a quantitative value into a qualitative value.
Generalization style: transforms a particular value into a more general one.

Step 1.2: Relation Detection

The concept of Entity Relationship Diagrams (ERD) is adopted to detect the relationships between the tables dispersed over multiple databases. Primary and foreign keys are used to connect the tables. However, a key connection does not by itself mean that the attributes of two tables are mutually relevant. This module therefore uses two steps to determine which attributes are relevant.

Step 1.2.1: Local Detection

Its objective is to detect the relevance between two fields and calculate the degree of relation according to the ERD. We adopt Pawlak's rough sets, one of the granular computing techniques, to find the relevant fields in the tables.

Step 1.2.2: Global Detection

The relationship between a company and its suppliers or customers depends on the interactive messages (IM), such as purchase orders. Because the degree of normalization differs between companies, the relations between tables dispersed over different companies can be divided into the following cases.

One-to-One Association: all of the attributes come from one table at one site, and correspond to all or part of the attributes of one table at the other site.
One-to-Many Association: all of the attributes come from one table at one site, and are divided among multiple tables at the other site.
Many-to-Many Association: all of the attributes come from multiple tables at one site, and are arranged into multiple tables at the other site.

Step 1.3: Relation Construction

After relation detection, all relevant fields between tables have been recognized. The relationships are stored as Relationship Metadata (shown in Figure 4) for rule discovery.
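Steps 1.1 and 1.2.1 can be illustrated with a small sketch. It is not the authors' implementation: the attribute names (`stock`, `sales`), the cut points, and the toy records are hypothetical, and the degree of relation is assumed here to be Pawlak's dependency degree, the share of records whose granule on one field fully determines their granule on the other.

```python
from collections import defaultdict


def to_granule(value, cut_points, labels):
    """Qualitative-style granule: map a number to a label.

    cut_points must be ascending and labels must have one more entry
    than cut_points (the last label covers everything above the last cut).
    """
    for cut, label in zip(cut_points, labels):
        if value < cut:
            return label
    return labels[-1]


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def dependency_degree(rows, cond_attrs, dec_attrs):
    """Pawlak dependency degree: |POS_cond(dec)| / |U|.

    A cond-class belongs to the positive region when it lies entirely
    inside a single dec-class.
    """
    dec_classes = partition(rows, dec_attrs)
    positive = 0
    for block in partition(rows, cond_attrs):
        if any(block <= dec_class for dec_class in dec_classes):
            positive += len(block)
    return positive / len(rows)


# Hypothetical raw records joined from two partner tables.
raw = [
    {"stock": 5, "sales": 3},
    {"stock": 8, "sales": 4},
    {"stock": 40, "sales": 30},
    {"stock": 55, "sales": 35},
    {"stock": 60, "sales": 5},
    {"stock": 70, "sales": 6},
]

# Step 1.1: replace precise values with user-defined granules.
rows = [
    {
        "stock": to_granule(r["stock"], [20, 50], ["low", "medium", "high"]),
        "sales": to_granule(r["sales"], [10], ["low", "high"]),
    }
    for r in raw
]

# Step 1.2.1: how strongly does the stock granule determine the sales granule?
print(dependency_degree(rows, ["stock"], ["sales"]))
```

A degree near 1 would suggest that the two fields are relevant to each other and worth recording in the relationship metadata, while a degree near 0 would suggest that a key connection exists without real relevance. Where to place the cut-off is a design choice that the text leaves to the user.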
Phase 2: Rule Discovery (RD)

Rule discovery is responsible for discovering peculiarity rules, which can reveal further meaning in the data. Here, we refine the peculiarity oriented multidatabase mining proposed in (13). In this section, we illustrate each step of the module.

TABLENAM: BELONG
    TABLE, DATABASE, INSTANCE
TABLENAM: ELEMENT
    ELENAME
TABLENAM: EQUAL
    SYNNO, MAPNO
TABLENAM: MAP
    MAPNO, OPERATOR, CONT

Figure 4 The schema of relationship metadata (9)

Step 2.1: Peculiar Data Discovery

There are many ways of finding peculiar data, that is, data that appears with very low frequency and could lead the company to take emergency measures. We use an attribute-oriented method, which differs from traditional statistical methods. In this step, we calculate the peculiarity factor of each value and a threshold, and pick out the data whose peculiarity factor is over the threshold.

Step 2.2: Exploration of Background Knowledge

Two foundations, the relationship metadata and the peculiar data, assist in exploring the background knowledge in this step. To explore the background knowledge, the mechanism investigates the relevant fields and picks out the peculiar data.

Step 2.3: Rule Generation

After the exploration of background knowledge, the interesting information is discovered according to the relationship metadata. To generate and interpret the rules, we adopt the detailed analysis of probability-related measures associated with rules given in (12). The characteristics of a rule \phi \Rightarrow \psi can be represented by a contingency table (shown in Figure 5). The peculiarity rules represent only a subset of all rules, namely those with a high change of support (CS). The formula of CS is

\[ CS(\phi \Rightarrow \psi) = \frac{|m(\phi) \cap m(\psi)|}{|m(\phi)|} - \frac{|m(\psi)|}{|U|} \]

where m(\phi) denotes the set of records that satisfy \phi and U is the set of all records. The higher the value of CS, the more peculiar the rule.

Figure 5 Contingency Table

EVALUATION OF DDM MECHANISM

Our approach improves the quality of decisions, for example by providing rules derived from different organizations. The effectiveness of the rules is improved by using Pawlak's rough sets to determine the relations between the data and peculiarity mining to mine from fragmented data. In this paper, we evaluate our approach from two respects. The feasibility analysis is first performed with a prototype system and a real case study.
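To make the measures behind this evaluation concrete, the peculiarity factor of Step 2.1 and the change of support of Step 2.3 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the prototype's code. The text does not spell out the peculiarity factor, so the sketch assumes the form commonly used in peculiarity oriented mining (13): the sum over all other values of the absolute difference raised to a power `alpha`, with a threshold of mean plus `beta` times the standard deviation. The defaults `alpha=0.5` and `beta=1.0`, the use of the population standard deviation, the contingency-table cell names, and the toy numbers are all assumptions.

```python
from statistics import mean, pstdev


def peculiarity_factors(values, alpha=0.5):
    """Assumed peculiarity factor: PF(x) = sum over y of |x - y| ** alpha."""
    return [sum(abs(x - y) ** alpha for y in values) for x in values]


def peculiar_values(values, alpha=0.5, beta=1.0):
    """Return the values whose PF exceeds mean(PF) + beta * std(PF)."""
    pf = peculiarity_factors(values, alpha)
    threshold = mean(pf) + beta * pstdev(pf)
    return [x for x, factor in zip(values, pf) if factor > threshold]


def change_of_support(a, b, c, d):
    """Change of support of the rule phi => psi from contingency-table counts.

    a: records satisfying phi and psi
    b: records satisfying phi but not psi
    c: records satisfying psi but not phi
    d: records satisfying neither
    CS = |m(phi) & m(psi)| / |m(phi)| - |m(psi)| / |U|
    """
    total = a + b + c + d
    return a / (a + b) - (a + c) / total


# Hypothetical weekly stock levels with one unusual week.
weekly_stock = [10, 11, 12, 11, 10, 50]
print(peculiar_values(weekly_stock))

# Hypothetical counts: the rule is rare (4 of 100 records) but psi is far
# more likely when phi holds than it is overall.
print(change_of_support(3, 1, 22, 74))
```

The second call shows why a support threshold would miss such a rule: only 4 of 100 records satisfy the antecedent, yet the consequent is three times as likely among them as in the whole data set.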
System Implementation

We have developed a prototype system for the feasibility study. The system interface is a web-based application developed in the Java programming language. Sample screenshots are shown in Figure 6 and Figure 7.

Figure 6 The Granule Definition

Figure 7 The Rule Generator

Case Study: Scenario and Results with DDM Mechanism

Taiwan Uncle Sam's Apparel Company is a wholesaler and retailer of apparel goods. In order to reduce stock, the decision makers want to understand the relationships between supply and sales. That is, a decision tree for association rules is necessary for decision support. For the case study, we adopt the transaction log, export, and import data from the database, CRM and Quixote, of Uncle Sam's Apparel Company. In this case, we use relationship metadata to store the relevant fields for the later generation of business rules. Figure 8 and Figure 9 show the resulting relationship metadata and business rules.

Figure 8 The Relationship Metadata of real case

Figure 9 The rules of DDM Mechanism

CONCLUSION

This paper proposes a distributed data mining mechanism to resolve the problems of dispersed, heterogeneous and fragmented data, since traditional mining methodology cannot effectively resolve the problems of information variety and overloading. From the results of an experimental implementation of the system, we show that the proposed structure effectively generates useful rules for decision making.

The mechanism for distributed data mining in the supply chain environment has been introduced. In order to increase the efficiency of mining from multiple data sources, we detect the relevance of the data sources using Pawlak's rough sets. Furthermore, our approach can store the rule relationship metadata during the rule generation process and can filter out useless rules. With our approach, distributed knowledge spread over various resources and extremely large data sets can be properly found and integrated, which gives a basic assurance of the quality of the discovered knowledge.

ACKNOWLEDGEMENT

The National Science Council, Taiwan, has supported the work presented in this paper under Grant No. NSC92-2213-E-194-033. We greatly appreciate their financial support and encouragement.

REFERENCES

1. R. Agrawal, T. Imielinski, and A. Swami (1993). Database Mining: A Performance Perspective, IEEE Transactions on Knowledge and Data Engineering, 5(6), 912-925.
2. R. Agrawal and J. C. Shafer (1996). Parallel Mining of Association Rules, IEEE Transactions on Knowledge and Data Engineering, 8(6), 962-969.
3. P. S. Bender (2000). Debunking 5 Supply Chain Myths, Supply Chain Management Review, 4(1), 52-58.
4. D. W. Cheung, J. Han, V. T. Ng, A. W. Fu, and Y. Fu (1996). A Fast Distributed Algorithm for Mining Association Rules, In International Conference on Parallel and Distributed Information Systems, 31-42.
5. S. Chopra and P. Meindl (2000). Supply Chain Management: Strategy, Planning and Operation, 1st edition: Prentice Hall College Div.
6. R. W. Dik, H. v. Lewinski, J. D. Whitaker, and J. D. Brooks (2003). A Global Study of Supply Chain Leadership and Its Impact on Business Performance, Accenture Institute, www.accenture.com.
7. F. T. Edum-Fotwe, A. Thorpe, and R. McCaffer (2001). Information procurement practices of key actors in construction supply chain, European Journal of Purchasing & Supply Management, 7(3), 155-164.
8. J. Han and M. Kamber (2001). Data Mining: Concepts and Techniques, Hardcover ed: Morgan Kaufmann.
9. S.-M. Huang, I. Kawn, D. C. Yen, and Hsiang-Yuan Hsueh (2000). Developing an XML Gateway for Business-to-Business Commerce, In Proceeding of International Conference on Web Information Systems.
10. H. L. Lee, K. C. So, and C. S. Tang (2000). The Value of Information Sharing in a Two-Level Supply Chain, Management Science, 46(5), 626-643.
11. H.-J. Sebastian, T. Grunert, and M. E. Nissen (2002). Introduction to the Minitrack Decision Technologies for Supply Chain Management, In Proceedings of the 35th Hawaii International Conference on System Sciences, 867-868.
12. Y. Y. Yao and N. Zhong (1999). An Analysis of Quantitative Measures Associated with Rules, In Pacific-Asia Conference on Knowledge Discovery and Data Mining, 479-488.
13. N. Zhong, Y. Yao, and M. Ohshima (2003). Peculiarity Oriented Multidatabase Mining, IEEE Transactions on Knowledge and Data Engineering, 15(4), 952-960.