An Enhanced Quality Approach for Ecommerce Website using DEA and High Utility Item Set Mining

Similar documents
A QoS-Aware Web Service Selection Based on Clustering

A Survey on Product Aspect Ranking

Bisecting K-Means for Clustering Web Log data

A Review of Data Mining Techniques

Prediction of Heart Disease Using Naïve Bayes Algorithm

An Overview of Knowledge Discovery Database and Data mining Techniques

Hybrid Data Envelopment Analysis and Neural Networks for Suppliers Efficiency Prediction and Ranking

Query Optimization Approach in SQL to prepare Data Sets for Data Mining Analysis

A Survey on Web Mining From Web Server Log

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

Understanding Web personalization with Web Usage Mining and its Application: Recommender System

New Matrix Approach to Improve Apriori Algorithm

Assessing Learners Behavior by Monitoring Online Tests through Data Visualization

Search Result Optimization using Annotators

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

RULE-BASE DATA MINING SYSTEMS FOR

MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

Data Mining System, Functionalities and Applications: A Radical Review

Hadoop Technology for Flow Analysis of the Internet Traffic

Data Mining Solutions for the Business Environment

Association Technique on Prediction of Chronic Diseases Using Apriori Algorithm

Preprocessing Web Logs for Web Intrusion Detection

Indirect Positive and Negative Association Rules in Web Usage Mining

2.1. Data Mining for Biomedical and DNA data analysis

Binary Coded Web Access Pattern Tree in Education Domain

Data Mining Algorithms Part 1. Dejan Sarka

Detection of Distributed Denial of Service Attack with Hadoop on Live Network

A Survey of Customer Relationship Management

Healthcare Measurement Analysis Using Data mining Techniques

Future Trend Prediction of Indian IT Stock Market using Association Rule Mining of Transaction data

Exploitation of Server Log Files of User Behavior in Order to Inform Administrator

Comparative Performance Analysis of MySQL and SQL Server Relational Database Management Systems in Windows Environment

Importance of Online Product Reviews from a Consumer s Perspective

KEYWORD SEARCH IN RELATIONAL DATABASES

A Survey on Association Rule Mining in Market Basket Analysis

Client Perspective Based Documentation Related Over Query Outcomes from Numerous Web Databases

Mining for Web Engineering

International Journal of Innovative Research in Computer and Communication Engineering

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

Preparing Data Sets for the Data Mining Analysis using the Most Efficient Horizontal Aggregation Method in SQL

SPATIAL DATA CLASSIFICATION AND DATA MINING

Intrusion Detection System using Log Files and Reinforcement Learning

Sla Aware Load Balancing Algorithm Using Join-Idle Queue for Virtual Machines in Cloud Computing

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Comparison of K-means and Backpropagation Data Mining Algorithms

Sales and Invoice Management System with Analysis of Customer Behaviour

A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING

Network Selection Using TOPSIS in Vertical Handover Decision Schemes for Heterogeneous Wireless Networks

An Online Purchase Portal for Books and Seeds in Regional Language (Punjabi)

IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD

SPMF: a Java Open-Source Pattern Mining Library

An Evaluation of Machine Learning Method for Intrusion Detection System Using LOF on Jubatus

Operational Efficiency and Firm Life Cycle in the Korean Manufacturing Sector

Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10

Grid Density Clustering Algorithm

Framework model on enterprise information system based on Internet of things

Big Data Mining: Challenges and Opportunities to Forecast Future Scenario

KNOWLEDGE DISCOVERY FOR SUPPLY CHAIN MANAGEMENT SYSTEMS: A SCHEMA COMPOSITION APPROACH

PRIVACY PRESERVING ASSOCIATION RULE MINING

Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

DATA MINING TECHNIQUES AND APPLICATIONS

DEA implementation and clustering analysis using the K-Means algorithm

ESSENTIALS AND STRATEGIES OF E-MARKETING. Mr. R. Srinivasan, J.Jollyvinisheeba Assistant Manager, Novo Nordisk India Private Limited, Bangalore

AN EFFICIENT STRATEGY OF AGGREGATE SECURE DATA TRANSMISSION

A Review on Data Mining in Cloud Computing Environment

Mobile Image Offloading Using Cloud Computing

Automatic Recommendation for Online Users Using Web Usage Mining

Supply chain intelligence: benefits, techniques and future trends

A UPS Framework for Providing Privacy Protection in Personalized Web Search

Selection of Optimal Discount of Retail Assortments with Data Mining Approach

Business Lead Generation for Online Real Estate Services: A Case Study

Keywords: Data Warehouse, Data Warehouse testing, Lifecycle based testing, performance testing.

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

Data Mining Approach in Security Information and Event Management

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Improving Apriori Algorithm to get better performance with Cloud Computing

Efficiency in Software Development Projects

ANDROID BASED ERP SYSTEM

DATA MINING STRATEGIES AND TECHNIQUES FOR CRM SYSTEMS

A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan

A Review of Anomaly Detection Techniques in Network Intrusion Detection System

SEARCH ENGINE WITH PARALLEL PROCESSING AND INCREMENTAL K-MEANS FOR FAST SEARCH AND RETRIEVAL

ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

KEITH LEHNERT AND ERIC FRIEDRICH

Database Optimizing Services

Data Mining in Web Search Engine Optimization and User Assisted Rank Results

Novel Mining of Cancer via Mutation in Tumor Protein P53 using Quick Propagation Network

Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing

Transcription:

An Enhanced Quality Approach for Ecommerce Website using DEA and High Utility Item Set Mining Varsha Kulkarni 1, Vilas Jadhav 2. 1 PG Scholar, Dept. of Computer Engineering, MGMCET, Kamothe, Navi Mumbai, India 2 Professor, Dept. of Computer Engineering, MGMCET, Kamothe, Navi Mumbai, India ABSTRACT: Today, E-commerce is booming everywhere. Customer s satisfaction is an important issue in every Business-to-Consumer website. Sometimes it is difficult to evaluate performance of website and to find utility for the product. It depends on customer to visit, see or purchase any product. Entrepreneur is unaware of its nature. Here, we have adopted DEA (Data Envelopment Analysis) method which is working on dataset. DEA is used to evaluate performance of DMU s (Decision Making Units). Here, we have taken up Utility Pattern Growth algorithm for finding High utility item sets. Our scheme E-commerce website Quality and Utility Analysis System (EQUAS) will analyse the website and will results in popularity of website to an entrepreneur. Our scheme will contribute to the improvement of overall quality of website and promote the healthy development in e-commerce. The goal of utility mining is to discover all the high utility item sets whose utility values are beyond a user specified threshold in a transaction database. Utility Mining identifies the item sets with highest utilities, by considering profit, quantity, cost or other user preferences. DEA analyzes quality by calculating user popularity rates and utility item set mining investigates profit earning items. KEYWORDS: data envelopment analysis (DEA), decision making units (DMU s), E-commerce website Quality and Utility Analysis System (EQUAS). I. INTRODUCTION Quality of e-commerce website affects on its performance directly. With the rapid development and wide application of e-commerce, academia and business are desperately in need of an effective website evaluation technique to compare the relative merits of website which will be helpful for website construction [1]. DEA is one of the techniques to evaluate the quality of e-commerce website. DEA is relatively new data oriented approach for evaluating the performance of a set of entities called DMU s which convert multiple inputs into multiple outputs. Proposed model will calculate inefficiencies in units and results will be used for improvement of website. In addition to DEA, we have included utility mining technique for finding high utility item sets. Data mining is the extraction of hidden, previously unknown and potentially useful information from databases. Data mining is sometimes called as knowledge discovery. Based on user queries it analyzes the relationships in stored data. Data mining is mostly used in financial data analysis, retail industry, intrusion detection and other scientific applications. Data mining models can be applied to specific scenarios such as forecasting, risk and probability. To discover the useful patterns from database, frequent pattern mining has been applied to different databases. Searching of frequent patterns from large databases is very important in many applications. The goal of frequent item set mining is to identify all frequent item sets and it collects the set of items that occur frequently together. An item set is called high utility item set if its utility is not less than user specified minimum utility threshold [2]. The information of high utility item sets is maintained in a tree based structure called utility pattern tree. High utility item sets can be generated from UP-Tree efficiently. We have designed E-commerce website Quality and Utility Analysis System (EQUAS) to calculate inefficiency of a website as well as perform utility mining of high utility item sets. DEA will work on website log data and utility mining will work on transactional databases to form a tree structure. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2015. 0311087 10756

II. RELATED WORK Zijuan Liu [1] chose input oriented BCC model which is ideal for modelling of purchase situations. It is done by comparing with hypothetical DMU s to calculate constant rates. It highlights difference between CCR and BCC model. It defined input oriented BCC model. Vincent S. Tseng et Al [2] explains about discovery of high utility item sets like profit to eliminate unpromising items. It is mainly used for transactional databases. Here, they used UP-Growth and UP-Growth+ algorithms. They can effectively used for long transactions. Jyothi Pillai, O.P.Vyas [3] explained efficiency of utility mining in the field of data mining. Utility mining discovers item sets that frequently occur. Chuen Tse Kuah et al formalized significance of DEA and its different models. The popularity of DEA is due to its ability to measure relative efficiencies of multiple-input and multiple-output DMUs without prior weights on the inputs and outputs. Corrado lo Storto [4] evaluated website quality from user viewpoint using DEA. The framework is implemented to compare websites that sell products online. Jyothi Pillai, O.P.Vyas [5] defined emerging technologies in mining. It illustrated utility mining in terms of profit, cost and other expressions of user experiences. Vincent S. Tseng [6] defined UP-Growth(Utility Pattern Growth) algorithm for discovering high utility item sets and removing unpromising items by providing threshold like profit, quantity etc. A tree structure called UP-Tree is created to to generate candidate item sets to avoid repeated scanning of database. Smita R. Londhe et al [7] tried to avoid space consumption and execution time for long transactions by defining two algorithms namely, UP-Growth and UP- Growth+ algorithms. Adinarayanareddy B et al [8] mainly focus on Up- Growth algorithm and their production of large number candidate item sets which requires more space and affects mining performance. III. PROPOSED ALGORITHM A. Design Considerations: All the ports which are required should be free. B. Description of the Proposed Algorithm: We have designed E-commerce website Quality and Utility Analysis System (EQUAS). We have implemented this system using software Java 1.6 and database SQL server 2005. Our aim is to build a quality evaluation system which will diagnose datasets on the basis of their efficiency and investigates profit earning items. Our reference paper has built a system which uses BCC model. A BCC model is input oriented and used for evaluation purpose. A model is established based on input data and output data of derived from CCR [2]. DEA is used to evaluate efficiency of E-commerce website. In order to find unpromising items and to overcome the drawback of excess scanning of database, utility mining is also performed along with DEA. Proposed model will calculate inefficiencies in units and results will be used for improvement of website. In this the unit of analysis represents a production unit. We conceptualize consumer-website interaction during online shopping as a production process in which the customer conducts a purchase and inputs are converted into outputs. Finding Utility A novel algorithms as well as a compact data structure for efficiently discovering high utility item sets from transactional databases. Utility pattern growth (UP-Growth) [4] and a compact tree structure, called utility pattern tree (UP-Tree), for discovering high utility item sets and maintaining important information related to utility patterns within databases are proposed. High-utility item sets can be generated from UP-Tree efficiently with only two scans of original databases. To improve mining performance and avoid repeated scanning of database, we used here utility pattern tree [2]. To analyse performance of website we propose an enhanced method called, E-commerce website Quality and Utility Analysis System (EQUAS) which calculates inefficiency of website functionality and utility mining to obtain high quality item sets. In this section we discussed our EQUAS system s design and algorithm and in next section we focus implementation and results. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2015. 0311087 10757

Figure 1: Proposed System IV. SYSTEM FLOW 1. An e-commerce website will be developed on which DEA will work to analyze quality of the website and utility mining will be performed to calculate the profit earned by the website. 2. Collecting web log data from datasets to find out value of input and output variables through DMU. 3. Calculating unit inefficiencies to analyze website functionality. 4. Calculating conversion rates from one page to another and obtaining average value of each conversion rate 5. Organizing web log data to calculate profit by using item set and its unique identifier. 6. Remove unpromising data by extracting only utility item sets whose threshold is greater than or equal to minimum threshold. 7. Applying UP-Growth algorithm to generate UP-Tree which is representation of information of high utility item sets. V. RESULTS We can divide the result into two parts. One is the quality analysis system developed by using DEA and another one is utility mining based on transactional database. First is to enhance background performance of the system and another is to increase the profit. We can see that rate of people who tends to purchase product is lower than product list. So, here we have scope to enhance our website quality. Some ways are arrange products properly, transaction should not be complex. Also here in second graph Transaction unit (TU) and Transaction weighted unit (TWU) are shown. TU is addition of each item in single transaction. Each transaction is number of items purchased by single user. TWU is addition of TU s which contain same element. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2015. 0311087 10758

0.8 0.6 0.4 0.2 0 Total Average 60000 40000 20000 0 TU TWU Figure 2: Result of DEA Figure 3: TU and TWU calculation After calculation of TU and TWU we have inserted user defined threshold. It is owner specified means website owner can put in some value. System categorizes items into promising and unpromising items. Suppose threshold is 40000 then result could be: Figure 4: Output representation VI. CONCLUSION AND FUTURE WORK The overall performance of an e-commerce website can be analysed by this proposed technique. Using our approach EQUAS, we can verify website quality based on log data and utility mining helps to eliminate unpromising data by considering parameter profit as a threshold. Existing systems doesn t provide efficiency analysis as well as unpromising item removal mechanism. We used here both DEA and utility mining to check website popularity as well as removal of doubtful data to gain more profit. Implications of this study relate to a new model of a set of criteria that used to measure the efficiency of the design, usability and performance of websites. In case of DEA, it is only as good as the initial selection of input and output variables. With the help of utility mining, high utility item sets will be discovered whose values are beyond user specified threshold. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2015. 0311087 10759

REFERENCES 1. Zijuan Liu, Diagnosing Ecommerce website quality Based on DEA, 2nd International Conference on Computer Science and Network Technology, 13565627, 762 765, 2012. 2. Vincent S. Tseng, Bai-En Shie, Cheng-Wei Wu, and Philip S. Yu, Efficient Algorithms for Mining High Utility Item sets from Transactional Databases, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 25, NO. 8, AUGUST 2013. 3. Chuen Tse Kuah, Kuan Yew Wong, Farzad Behrouzi, A Review on Data Envelopment Analysis (DEA), 2010 Fourth Asia International Conference on Mathematical/Analytical Modelling and Computer Simulation, 978-1-4244-7196-6, 168 173, 2010. 4. Corrado lo Storto, Evaluating ecommerce websites cognitive efficiency: An integrative framework based on data envelopment analysis, Applied Ergonomics, 44, 1004-1014, 2013. 5. Jyothi Pillai, O.P.Vyas, Overview of Item set Utility Mining and its Applications, International Journal of Computer Applications (0975 8887) Volume 5 No.11, August 2010. 6. Vincent S. Tseng, Cheng-Wei Wu, Bai-En Shie, and Philip S. Yu, UP-Growth: An Efficient Algorithm for High Utility Item set Mining, KDD 10, July 25 28, 2010. 7. Smita R. Londhe, Rupali A. Mahajan, Bhagyashree J. Bhoyar, Overview on Methods for Mining High Utility Item set from Transactional Database, International Journal of Scientific Engineering and Research (IJSER), Volume 1 Issue 4, December 2013. 8. Adinarayanareddy B, O Srinivasa Rao, MHM Krishna Prasad, An Improved UP-Growth High Utility Item set Mining, IJCA, Volume 58 No.2, 25-28, November 2012. BIOGRAPHY Varsha Kulkarni is pursuing her Master degree of Engineering in Computer engineering from MGM s college of Engineering and Technology, Kamothe, Navi Mumbai, India. She also received Bachelor of Engineering in Computer engineering from MGM s college of Engineering and Technology, Kamothe, Navi Mumbai, India. Her research interests are data mining, web mining, web technology, website designing etc. Vilas Jadhav is an Assistant Professor in Computer Engineering department, MGM s college of Engineering and Technology, Kamothe, Navi Mumbai, India. His research interests are Data mining, Software Engineering, Web Engineering etc. Copyright to IJIRCCE DOI: 10.15680/IJIRCCE.2015. 0311087 10760