# Applied Mathematical Sciences, Vol. 7, 2013, no. 112, HIKARI Ltd,

Save this PDF as:

Size: px
Start display at page:

Download "Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013."

## Transcription

2 5592 Anirut Suebsing and Chakarin Vajiramedhin 1 Introduction Nowadays, data mining plays important in the highly competitive business world. That discovers the valuable information from a very large transaction database for supporting the business. The information has been applied to analysis, plan or manage [10]. In the financial industry, the data mining has been applied successfully in several fields such as the credit card approval. In the last several years, credit card business is the extremely competitive market share. The customers can sign up for credit cards easier using marketing strategies that focus on a few conditions. A number of credit card payment defaults are increasing. Therefore, data mining technique can be applied to filter customers data to reduce a number of default customers. Generally, the financial institutions have a screening of customers information before approving the credit card by using efficient techniques such as automatic predictive model. The predictive model is a supervised learning tool for predicting the customers behavior from their history data. The input data format is an important key of the accuracy performance besides the best predictive solution. If the format is suitable for processing, the predictive model would have the best performance. On the other hand, if the input data do not meet the agreement, the predictive model would make the experimental error. Thus the appropriate data help to enhance the efficiency and correctness of the algorithm. Data transformation is one of the processing data steps that converts the input data into suitable form. In this paper, we applied the data transformation with eight predictive model algorithms to predict credit card customer data. In this paper, we focus on how to increase an accuracy rate of predictive model for a credit approval to help banks to decrease a risk for a credit approval. The goal of scrutinization is to identify a group of customers who have a high probability to defraud the banks by ignoring to disburse money to the banks. In data mining based on the data transformation, it helps the banks to predict customers profiles or attributes are able to give the credit approval so that reduce the risk of bad debt. The remainders of this paper are organized as follows: Section 2 describes the research objectives. Section 3 reviews of the predictive model algorithms and data transformations. Section 4 describes the methodology in this paper. Section 5 illustrates the experimental results. Section 6 discusses the performance evaluation. Finally, conclusion is discussed in Section 7. 2 Related Work 2.1 Data transformation Data preprocessing is an important part of data mining preparing the input data before the data are processed [9] (see Fig.1). This step consists of several parts

3 Accuracy rate of predictive models 5593 such as data cleaning, data integration, data transformation and data reduction. The input data can be processed in their raw collected form but the accuracy percentage of predictive model may be reduced in case of incomplete data. That the quality of knowledge extracted from the data can be enhanced by its transformation [3]. Thus, data transformation is the phase in preprocessing that converts a source data format into appropriate data form. Fig.1 Data preprocessing for Predictive. 2.2 Predictive Model Predictive model is the area of data mining that is usually applied to classify the group of data such as a classification of customers credit card. The predictive model is a process to create a model of future behavior from the collected data. The model may be created from a simple or complex equation (see Fig. 2). In the financial industry, the future behavior predictive is very important thus many banks have adopted the predictive to predict the customer data for screening the customer information before approving the credit card. Thus, many predictive models ([2], [6], [7], [8]) have been proposed. Each model has its own advantages and disadvantages that the performance is measured by the percentage of accuracy predicting. Fig.2 Data preprocessing for Predictive. 3 Data Transform for Predictive Model In this section, an approach, which is used to transform a raw data to a new data by using the binary coding method, is a method for turning any characters into numerical values to help a predictive model for credit approval to enhance its accuracy rate. The data sets used in this paper must be changed numerical value.

4 5594 Anirut Suebsing and Chakarin Vajiramedhin The proposed approach is to divide a nominal attribute into n binary attributes, which is called binary coding or dummy coding, if there are possible values (here, n > 2), with 0/1 representing the absence or presence of each value. For instance, an attribute of sex in the data set which has two values as follows: man, woman, then this attribute can be turned into numeric based on binary coding as follows: Fig.3 An example of mapping nominal based on binary coding technique. In this paper, we focus on how to increase an accuracy rate of predictive model for a credit approval to help banks to decrease a risk for a credit approval. The goal of scrutinization is to identify a group of customers who have a high probability to defraud the banks by ignoring to disburse money to the banks. In data mining based on the data transformation, it helps the banks to predict customers profiles or attributes are able to give the credit approval so that reduce the risk of bad debt. After mapping character or nominal data into numerical data, we make all values of attributes in each data sets equivalent by using the Min-Max normalization. The Min-max normalization is the simplest normalization technique and the most commonly used to standardize the range of independent attributes or features of data sets ([4], [5]). Min-max normalization is the simplest normalization technique that is best-suited for the cases where the bounds of the scores produced by a classifier are known. V min = minimum of the value in the data set. V max = maximum of the value in the data set. In this case, given a set of matching values x i, i=1, 2,, M, the set of normalized scores is given by as follows:

5 Accuracy rate of predictive models 5595 x x i = V 1 min max V V min (1) Now, the experimental raw data sets have transformed by the proposed approach and they are ready to be used for increase the accuracy rate of the predictive model in credit screening. 4 Performance Evaluation In order to evaluate our proposed approach, the data sets from the UCI repository [1] (a Japanese credit screening data set) and KNN, Logistic, Smo, NaiveBayes, Kstar, VotedPerceptron, SGD, MultiClassClassifier algorithms, we used eight algorithms to evaluate the proposed approach because of increasing reliability of evaluating the proposed approach. Note that the measurement in this paper of the experimental results is based on the standard metrics for evaluations of accuracy rate for truth positive (TP) refers to the ratio between the number of correctly predicted values and the total number of values. This section shows the accuracy rate of different predictive models used the data set provided by the proposed approach and the original data set so that we compare the accuracy rate of each predictive model that used the different data sets. Table 1. Accuracy rate of different predictive models that used the proposed data set and original data set From Table 1, it shows all accuracy rates of each predictive model used the proposed data set are higher than the accuracy rates of each predictive model employed the original data set significantly especially the accuracy rates of KNN, Logistic, Kstar, and MultiClassClassifier models that are able to provide the rate more than 98 percent.

6 5596 Anirut Suebsing and Chakarin Vajiramedhin Fig.4 Accuracy rate of different predictive models. From the Result Section (see Fig.4), it shows the proposed approach can enhance the accuracy rate of the predictive models effectively. Because the proposed approach has transformed all characters of each attribute or column in the original data set to the numeric values in terms of binary code. Thus, it leads to many predictive model algorithms developed to support the numeric data sets can compute and a build the predictive model efficiently even if in the real-world applications, most of the predictive model algorithms can build the predictive model based on every type of data sets. Furthermore, the Min-Max method, which is used to make all values of attributes in the data set equivalent, is one of techniques proposed in the paper. After the data set was transformed to the numeric values, all values of attributes in the data set has to be made to standardization in order to avoid outlier values because this problem will make any predictive models built by any predictive model algorithms forecast ineffectively. Therefore, this method affects to the accuracy rate of the predictive model also. 5 Conclusion The proposed approach can reform the Japanese credit screening data set to the new one to improve the performance of accuracy rate of the predictive models. All accuracy rates of each predictive model shows the high performance compared with the accuracy rates of each predictive model used the original data set. When we focus on the accuracy rates of each predictive model based on the proposed data set in Table 1, it found that every accuracy rate of each predictive model is able to show performance impressively especially KNN, Logistic, Kstar, and MultiClassClassifier models providing the rate over than 98 percent.

7 Accuracy rate of predictive models 5597 Acknowledgement We would like to thank the Faculty of Management Science, Ubonratchathani University for the financial support. We are also grateful to the owner of the datasets, the UCI repository, for providing the available datasets. References [1] Asuncion, A., Newman, D., CA: University of California, School of Information and Computer Science, Uci machine learning repository. Irvine, February 8, [2] Hung, S., Yen, D. C., Wang H., Applying data mining to telecom churn manage-ment. Expert Systems with Applications, 2006, [3] Kusiak, A., Feature Transformation Methods in Data Mining. IEEE Transactions on Electronics Packaging Manufacturing, 24, 2001, [4] Li, C., Biswas, G., Unsupervised learning with mixed numeric and nominal data. IEEE Transactions on Knowledge and Data Engineering, 14, 2002, [5] Modugno, R., Pirlo, G., Impedovo, D., Score normalization by dynamic time warping, 2010 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications (CIMSA), 2010, [6] Singh, R., Aggarwal., R. R., Comparative Evaluation of Predictive Modeling Techniques on Credit Card Data. International Journal of Computer Theory and Engineering, 3, 2011, [7] Wang, G., Et al., Predicting credit card holder churn in banks of China using data mining and MCDM IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2010, [8] Yu, Y., Song, L., The Issue and Risk Analysis of the Credit card Fourth International Conference on Business Intelligence and Financial Engineering, 2011, [9] A. Suebsing, N. Hiransakolwong, A novel technique for feature subset selection based on cosine similarity, Applied Mathematical Sciences, vol.6, no. 133, 2012, [10] C. Vajiramedhin, J. Werapun, EBPA: An efficient data structure for frequent closed itemset mining Applied Mathematical Sciences, vol.7, no.30, 2013, Received: August 16, 2013

### Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### Data Mining Approach For Subscription-Fraud. Detection in Telecommunication Sector

Contemporary Engineering Sciences, Vol. 7, 2014, no. 11, 515-522 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2014.4431 Data Mining Approach For Subscription-Fraud Detection in Telecommunication

### Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control

Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;

### Data Mining: A Preprocessing Engine

Journal of Computer Science 2 (9): 735-739, 2006 ISSN 1549-3636 2005 Science Publications Data Mining: A Preprocessing Engine Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh Applied Science University,

### An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset

P P P Health An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang

### Random Forest Based Imbalanced Data Cleaning and Classification

Random Forest Based Imbalanced Data Cleaning and Classification Jie Gu Software School of Tsinghua University, China Abstract. The given task of PAKDD 2007 data mining competition is a typical problem

### Data Mining: Overview. What is Data Mining?

Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### New Ensemble Combination Scheme

New Ensemble Combination Scheme Namhyoung Kim, Youngdoo Son, and Jaewook Lee, Member, IEEE Abstract Recently many statistical learning techniques are successfully developed and used in several areas However,

### DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

### Beating the MLB Moneyline

Beating the MLB Moneyline Leland Chen llxchen@stanford.edu Andrew He andu@stanford.edu 1 Abstract Sports forecasting is a challenging task that has similarities to stock market prediction, requiring time-series

### Prediction Model for Crude Oil Price Using Artificial Neural Networks

Applied Mathematical Sciences, Vol. 8, 2014, no. 80, 3953-3965 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.43193 Prediction Model for Crude Oil Price Using Artificial Neural Networks

### BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

### A Survey on Outlier Detection Techniques for Credit Card Fraud Detection

IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud

### ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL

International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR

### Enhanced Boosted Trees Technique for Customer Churn Prediction Model

IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

### Standardization and Its Effects on K-Means Clustering Algorithm

Research Journal of Applied Sciences, Engineering and Technology 6(7): 399-3303, 03 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 03 Submitted: January 3, 03 Accepted: February 5, 03

### Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing

www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University

### Domain Classification of Technical Terms Using the Web

Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

### CS 6220: Data Mining Techniques Course Project Description

CS 6220: Data Mining Techniques Course Project Description College of Computer and Information Science Northeastern University Spring 2013 General Goal In this project, you will have an opportunity to

### ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

### Introduction to Data Mining

Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

### Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign

Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr.Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct

### FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

### IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION

http:// IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION Harinder Kaur 1, Raveen Bajwa 2 1 PG Student., CSE., Baba Banda Singh Bahadur Engg. College, Fatehgarh Sahib, (India) 2 Asstt. Prof.,

### Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

### A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries

A Regression Approach for Forecasting Vendor Revenue in Telecommunication Industries Aida Mustapha *1, Farhana M. Fadzil #2 * Faculty of Computer Science and Information Technology, Universiti Tun Hussein

### Tweaking Naïve Bayes classifier for intelligent spam detection

682 Tweaking Naïve Bayes classifier for intelligent spam detection Ankita Raturi 1 and Sunil Pranit Lal 2 1 University of California, Irvine, CA 92697, USA. araturi@uci.edu 2 School of Computing, Information

### Data, Measurements, Features

Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

### Implementation of Data Mining Techniques to Perform Market Analysis

Implementation of Data Mining Techniques to Perform Market Analysis B.Sabitha 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, P.Balasubramanian 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### Data Mining Part 5. Prediction

Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

### Comparison of K-means and Backpropagation Data Mining Algorithms

Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and

### Automatic Student Performance Analysis and Monitoring

Automatic Student Performance Analysis and Monitoring Snehal Kekane, Dipika Khairnar, Rohini Patil, Prof. S. R. Vispute, Prof. N. Gawande UG Students, Department of Computer Engineering, Pimpri Chinchwad

### A Neural Network and Web-Based Decision Support System for Forex Forecasting and Trading

A Neural Network and Web-Based Decision Support System for Forex Forecasting and Trading K.K. Lai 1, Lean Yu 2,3, and Shouyang Wang 2,4 1 Department of Management Sciences, City University of Hong Kong,

### Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

### International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

### Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr

Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr WEKA Gallirallus Zeland) australis : Endemic bird (New Characteristics Waikato university Weka is a collection

### Manjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India

Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone

### Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,

### Data Mining for Fun and Profit

Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

### A Review of Anomaly Detection Techniques in Network Intrusion Detection System

A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In

### Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

### A Fast and Efficient Method to Find the Conditional Functional Dependencies in Databases

International Journal of Engineering Research and Development e-issn: 2278-067X, p-issn: 2278-800X, www.ijerd.com Volume 3, Issue 5 (August 2012), PP. 56-61 A Fast and Efficient Method to Find the Conditional

### COURSE RECOMMENDER SYSTEM IN E-LEARNING

International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

Building A Smart Academic Advising System Using Association Rule Mining Raed Shatnawi +962795285056 raedamin@just.edu.jo Qutaibah Althebyan +962796536277 qaalthebyan@just.edu.jo Baraq Ghalib & Mohammed

### Decision Support System For A Customer Relationship Management Case Study

61 Decision Support System For A Customer Relationship Management Case Study Ozge Kart 1, Alp Kut 1, and Vladimir Radevski 2 1 Dokuz Eylul University, Izmir, Turkey {ozge, alp}@cs.deu.edu.tr 2 SEE University,

### Pattern Mining and Querying in Business Intelligence

Pattern Mining and Querying in Business Intelligence Nittaya Kerdprasop, Fonthip Koongaew, Phaichayon Kongchai, Kittisak Kerdprasop Data Engineering Research Unit, School of Computer Engineering, Suranaree

### Semi-Supervised and Unsupervised Machine Learning. Novel Strategies

Brochure More information from http://www.researchandmarkets.com/reports/2179190/ Semi-Supervised and Unsupervised Machine Learning. Novel Strategies Description: This book provides a detailed and up to

### End-User Satisfaction in ERP System: Application of Logit Modeling

Applied Mathematical Sciences, Vol. 8, 2014, no. 24, 1187-1192 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2014.4284 End-User Satisfaction in ERP System: Application of Logit Modeling Hashem

### Enhancing Quality of Data using Data Mining Method

JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2, ISSN 25-967 WWW.JOURNALOFCOMPUTING.ORG 9 Enhancing Quality of Data using Data Mining Method Fatemeh Ghorbanpour A., Mir M. Pedram, Kambiz Badie, Mohammad

### DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7 UNDER THE GUIDANCE Dr. N.P. DHAVALE, DGM, INFINET Department SUBMITTED TO INSTITUTE FOR DEVELOPMENT AND RESEARCH IN BANKING TECHNOLOGY

### Research on Clustering Analysis of Big Data Yuan Yuanming 1, 2, a, Wu Chanle 1, 2

Advanced Engineering Forum Vols. 6-7 (2012) pp 82-87 Online: 2012-09-26 (2012) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/aef.6-7.82 Research on Clustering Analysis of Big Data

### A Robustness Simulation Method of Project Schedule based on the Monte Carlo Method

Send Orders for Reprints to reprints@benthamscience.ae 254 The Open Cybernetics & Systemics Journal, 2014, 8, 254-258 Open Access A Robustness Simulation Method of Project Schedule based on the Monte Carlo

### Numerical Algorithms for Predicting Sports Results

Numerical Algorithms for Predicting Sports Results by Jack David Blundell, 1 School of Computing, Faculty of Engineering ABSTRACT Numerical models can help predict the outcome of sporting events. The features

### Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

### DATA PREPARATION FOR DATA MINING

Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 \$12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI

### A fast multi-class SVM learning method for huge databases

www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,

### Prediction of Stock Performance Using Analytical Techniques

136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

### arxiv:1506.04135v1 [cs.ir] 12 Jun 2015

Reducing offline evaluation bias of collaborative filtering algorithms Arnaud de Myttenaere 1,2, Boris Golden 1, Bénédicte Le Grand 3 & Fabrice Rossi 2 arxiv:1506.04135v1 [cs.ir] 12 Jun 2015 1 - Viadeo

### Impact of Boolean factorization as preprocessing methods for classification of Boolean data

Impact of Boolean factorization as preprocessing methods for classification of Boolean data Radim Belohlavek, Jan Outrata, Martin Trnecka Data Analysis and Modeling Lab (DAMOL) Dept. Computer Science,

### Methodology Framework for Analysis and Design of Business Intelligence Systems

Applied Mathematical Sciences, Vol. 7, 2013, no. 31, 1523-1528 HIKARI Ltd, www.m-hikari.com Methodology Framework for Analysis and Design of Business Intelligence Systems Martin Závodný Department of Information

### Text Opinion Mining to Analyze News for Stock Market Prediction

Int. J. Advance. Soft Comput. Appl., Vol. 6, No. 1, March 2014 ISSN 2074-8523; Copyright SCRG Publication, 2014 Text Opinion Mining to Analyze News for Stock Market Prediction Yoosin Kim 1, Seung Ryul

### Detection of Heart Diseases by Mathematical Artificial Intelligence Algorithm Using Phonocardiogram Signals

International Journal of Innovation and Applied Studies ISSN 2028-9324 Vol. 3 No. 1 May 2013, pp. 145-150 2013 Innovative Space of Scientific Research Journals http://www.issr-journals.org/ijias/ Detection

### Open Access A Facial Expression Recognition Algorithm Based on Local Binary Pattern and Empirical Mode Decomposition

Send Orders for Reprints to reprints@benthamscience.ae The Open Electrical & Electronic Engineering Journal, 2014, 8, 599-604 599 Open Access A Facial Expression Recognition Algorithm Based on Local Binary

### The Application of Association Rule Mining in CRM Based on Formal Concept Analysis

The Application of Association Rule Mining in CRM Based on Formal Concept Analysis HongSheng Xu * and Lan Wang College of Information Technology, Luoyang Normal University, Luoyang, 471022, China xhs_ls@sina.com

### Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction

Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction Huanjing Wang Western Kentucky University huanjing.wang@wku.edu Taghi M. Khoshgoftaar

### Intersection of a Line and a Convex. Hull of Points Cloud

Applied Mathematical Sciences, Vol. 7, 213, no. 13, 5139-5149 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/1.12988/ams.213.37372 Intersection of a Line and a Convex Hull of Points Cloud R. P. Koptelov

### An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

### Introduction to Learning & Decision Trees

Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing

### Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

### Internet of Things for Smart Crime Detection

Contemporary Engineering Sciences, Vol. 7, 2014, no. 15, 749-754 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ces.2014.4685 Internet of Things for Smart Crime Detection Jeong-Yong Byun, Aziz

### Web Document Clustering

Web Document Clustering Lab Project based on the MDL clustering suite http://www.cs.ccsu.edu/~markov/mdlclustering/ Zdravko Markov Computer Science Department Central Connecticut State University New Britain,

### The Role of Size Normalization on the Recognition Rate of Handwritten Numerals

The Role of Size Normalization on the Recognition Rate of Handwritten Numerals Chun Lei He, Ping Zhang, Jianxiong Dong, Ching Y. Suen, Tien D. Bui Centre for Pattern Recognition and Machine Intelligence,

### Forecasting Stock Prices using a Weightless Neural Network. Nontokozo Mpofu

Forecasting Stock Prices using a Weightless Neural Network Nontokozo Mpofu Abstract In this research work, we propose forecasting stock prices in the stock market industry in Zimbabwe using a Weightless

### A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode

A Novel Feature Selection Method Based on an Integrated Data Envelopment Analysis and Entropy Mode Seyed Mojtaba Hosseini Bamakan, Peyman Gholami RESEARCH CENTRE OF FICTITIOUS ECONOMY & DATA SCIENCE UNIVERSITY

### Neural Networks in Data Mining

IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department

### Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

### Rule based Classification of BSE Stock Data with Data Mining

International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification

### Sentiment analysis using emoticons

Sentiment analysis using emoticons Royden Kayhan Lewis Moharreri Steven Royden Ware Lewis Kayhan Steven Moharreri Ware Department of Computer Science, Ohio State University Problem definition Our aim was

### Performance Appraisal System using Multifactorial Evaluation Model

Performance Appraisal System using Multifactorial Evaluation Model C. C. Yee, and Y.Y.Chen Abstract Performance appraisal of employee is important in managing the human resource of an organization. With

### CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES

International Journal of Scientific and Research Publications, Volume 4, Issue 4, April 2014 1 CHURN PREDICTION IN MOBILE TELECOM SYSTEM USING DATA MINING TECHNIQUES DR. M.BALASUBRAMANIAN *, M.SELVARANI

### Combining SOM and GA-CBR for Flow Time Prediction in Semiconductor Manufacturing Factory

Combining SOM and GA-CBR for Flow Time Prediction in Semiconductor Manufacturing Factory Pei-Chann Chang 12, Yen-Wen Wang 3, Chen-Hao Liu 2 1 Department of Information Management, Yuan-Ze University, 2

### A Novel Approach for Network Traffic Summarization

A Novel Approach for Network Traffic Summarization Mohiuddin Ahmed, Abdun Naser Mahmood, Michael J. Maher School of Engineering and Information Technology, UNSW Canberra, ACT 2600, Australia, Mohiuddin.Ahmed@student.unsw.edu.au,A.Mahmood@unsw.edu.au,M.Maher@unsw.

### A Framework for Data Migration between Various Types of Relational Database Management Systems

A Framework for Data Migration between Various Types of Relational Database Management Systems Ahlam Mohammad Al Balushi Sultanate of Oman, International Maritime College Oman ABSTRACT Data Migration is

### A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

### Chapter 6. The stacking ensemble approach

82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described

### Data Mining Solutions for the Business Environment

Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

### Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,

### ISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining

A Review: Image Retrieval Using Web Multimedia Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,

### A Comparative Approach to Search Engine Ranking Strategies

26 A Comparative Approach to Search Engine Ranking Strategies Dharminder Singh 1, Ashwani Sethi 2 Guru Gobind Singh Collage of Engineering & Technology Guru Kashi University Talwandi Sabo, Bathinda, Punjab

### Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

### Research of Postal Data mining system based on big data

3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication

### Analysis Tools and Libraries for BigData

+ Analysis Tools and Libraries for BigData Lecture 02 Abhijit Bendale + Office Hours 2 n Terry Boult (Waiting to Confirm) n Abhijit Bendale (Tue 2:45 to 4:45 pm). Best if you email me in advance, but I

### Visualization of large data sets using MDS combined with LVQ.

Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk

### Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

### Mimicking human fake review detection on Trustpilot

Mimicking human fake review detection on Trustpilot [DTU Compute, special course, 2015] Ulf Aslak Jensen Master student, DTU Copenhagen, Denmark Ole Winther Associate professor, DTU Copenhagen, Denmark

### What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM

Relationship Management Analytics What is Relationship Management? CRM is a strategy which utilises a combination of Week 13: Summary information technology policies processes, employees to develop profitable

### In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine