Data Mining in Financial Application

Size: px
Start display at page:

Download "Data Mining in Financial Application"

Transcription

1 Journal of Modern Accounting and Auditing, ISSN December 2011, Vol. 7, No. 12, Data Mining in Financial Application G. Cenk Akkaya Dokuz Eylül University, Turkey Ceren Uzar Mugla University, Turkey Data mining enables us to form forecasts and models regarding future by making use of past data. Any method which helps to discover data can be used as a data mining method. Enterprises gain important competitive advantage by data mining methods. Data mining is used in different fields. In finance field, it is a specially used in portfolio management, fraud detection, payment prediction, loan risk analysis, mortgage scoring, determining transaction manipulation, determining financial risk management, determining customer profile and foreign exchange market. It can be costly, risky and time consuming for enterprises to gain knowledge. Thus today enterprises use data mining as an innovative competitive mean. The aim of the study is to determine the importance of data mining in financial applications. Keywords: data mining, data mining methods, prediction Introduction Data mining (DM) is defined as the process of discovering patterns in data. The process must be automatic or (more usually) semiautomatic (Witten & Frank, 2005, p. 5). DM may also be regarded as the natural development processed of the information technologies. Large scale data may be regarded as a data mine comprising valuable data within large scale databases in different areas. Ana DM is defined as the process of producing substantive information, which used to be unknown before these data (Albayrak & Yılmaz, 2009, p. 32). The purpose of data mining is to create decision making models for estimation of the behaviors in the future based on the analysis of the past activities (Koyuncugil & Özgülbaş, 2009, p. 24). At this point, it is a means which supports the decision process to be given for reaching to the solution rather than being a solution alone and provides the information required for solving the problem. Data mining refers to rendering assistance to the analyst for finding the patterns and relations between the data created in the stage of working (Akgöbek & Çakır, 2009, p. 802). Therefore, data mining can be used for a variety of purposes in private sector. Industries such as banking, insurance, and medicine commonly use data mining to reduce costs, enhance research, and increase sales. Data analysis techniques that have been traditionally used for such tasks include regression analysis, cluster analysis, numerical taxonomy, multidimensional analysis, other multivariate statistical methods, stochastic models, time series analysis, nonlinear estimation techniques, and others (Sumathi & Sivanandam, 2006, p. 24) Building Data Models Data mining builds models from data, using tools that vary both by the type of model built and, within G. Cenk Akkaya, Ph.D., associate professor, Faculty of Economics and Administrative Sciences, Dokuz Eylül University. Ceren Uzar, lecturer, Vocational School of Higher Education, Mugla University.

2 DATA MINING IN FINANCIAL APPLICATION 1363 each model domain, by the type of algorithm used. As Table 1 shows, at the highest level this taxonomy of data mining separates models into two classes: predictive and descriptive (Gerritsen, 1999, p. 17): Table1 Data Mining Taxonomy Predictive models Descriptive models Classification Regression Clustering Association One of the most common learning tasks in data mining is classification. Classification is a supervised way of learning. The database contains one or more attributes that denote the class of a tuple and these are known as predicted attributes, whereas the remaining attributes are called predicting attributes (Sumathi & Sivanandam, 2006, p. 577). The popular algorithms of supervised learning techniques include decision trees, artificial neural networks, and so on (Tsai, Lu, & Hsu, 2011, p. 131). When the data are unlabelled and each instance does not have a given class label the learning task is called unsupervised. If we still want to identify which instances belong together, that is, form natural clusters of instances, a clustering algorithm can be applied (Olafsson, Li, & Wu, 2008, p. 1439). Clustering techniques can be used to identify stable dependencies for risk management and investment management (Zhang & Zhou, 2004, p. 514). Another unsupervised learning approach is association rule discovery that aims to discover interesting correlation or other relationships among the attributes. Association rule mining was originally used for market basket analysis, where items are articles in the customer s shopping cart and the supermarket manager is looking for associations among these purchases (Olafsson et al., 2008, p. 1441). Suggestions for Implementation of Data Mining in Financial Application Data mining is able to uncover hidden patterns and predict future trends and behaviors in financial markets. It creates opportunities for companies to make proactive and knowledge-driven decisions in order to gain a competitive advantage (Zhang & Zhou, 2004, p. 513). In this respect, fraud detection, predicting credit risk, foreign exchange model forecasting accuracy, portfolio management and studies for financial failure predictions can be exemplified as follows. Fraud Detection Fraud is increasing dramatically with the expansion of modern technology and the global superhighways of communication, resulting in the loss of billions of dollars worldwide each year. Although prevention technologies are the best way of reducing fraud, fraudsters are adaptive and, given time, will usually find ways to circumvent such measures (Bhargava & Manik, 2011, p. 4). Fraud detection studies have now become widespread. Fraud detection must also be dealt across all industries, especially the sectors where many transactions are made more vulnerable, such as health care, retail, credit card services, and telecommunications (Koyuncugil & Ozgulbas, 2009, p. 234). Credit card fraud detection has drawn a number of techniques, with special emphasis on data mining. In this subject, the studies conducted by Stolfo, Fan, Lee, and Prodromidis (1997) may be summarized as follows. The first step was the collection of data. The members of the Financial Services Technology Consortium provided data with real credit card transactions. 500,000 records have been obtained. Each record has 30 fields

3 1364 DATA MINING IN FINANCIAL APPLICATION and a total of 137 bytes. The data were already labeled by the banks as fraud or non-fraud. Among the 500,000 records, 20% are fraud transactions. The data were sampled from a 12-month period. The aim of the study is to compute a classifier that predicts whether a transaction is fraud or non-fraud. In this respect, the training data were sampled from earlier months, the validation data (for meta-learning) and the testing data were sampled from later months. Data from months 1 to 10 are used for training, data in month 11 is for validation in meta-learning and data in month 12 is used for testing. Then, ID3 (Iterative Dichotomiser 3), CART (Classification and Regression Tree), BAYES, and RIPPER (Repeated Incremental Pruning to Produce Error Reduction), each is applied to every partition above. Stolfo et al. (1997) found that 50%/50% distribution of fraud/non-fraud training data will generate classifiers with the highest true positive rate (fraud catching rate) and low false positive rate (false alarm rate) so far. Meta-learning with BAYES as a meta-learner to combine base classifiers with the highest true positive rates learned from 50%/50% fraud distribution is the best method found so far. Predicting Credit Risk With the rapid growth in the credit industry, credit scoring models have been extensively used for the credit admission evaluation. In the last two decades, several quantitative methods have been developed for the credit admission decision (Huang, Chen, & Wang, 2007, p. 847). This problem, most commonly referred to as credit scoring, is an attempt to classify applicants for credit into either good or bad risk classes. It is to be distinguished from behavioral or performance scoring, in which the repayment behavior of the customer is tracked to make predictions, provided the credit was already accepted (Vellido, Lisboa, & Vaughan, 1999, p. 55). In this subject, the studies conducted by Kotsiantis (2007) may be summarized as follows. Kotsiantis (2007) used two public credit datasets: Credit-a dataset and Credit-g dataset. (1) In Credit-a dataset, each case out of 690 represents an application for credit card facilities described by eight discrete and six continuous attributes (sex, age, mean time at addresses, home status, current occupation, current job status, mean time with employers, other investments, bank account, time with bank, liability reference, account reference, monthly housing expense, savings account balance with two decision classes (accept/reject). In the UCI repository (University of California at Irvine), the original attribute names have been changed to meaningless symbols (A1-A14) with the purpose of protecting the confidentiality of the data and also, they make a distinction between discrete (nominal) and continuously valued attributes. (2) The German Credit dataset (Credit-g) contains observations on 20 variables (checking account status, duration of credit in months, credit history, purpose of credit, credit amount, average balance in savings account, present employment, installment rate as percentage of disposable income, personal status, other parties, present resident since (years, property magnitude, age in years, other payment plans, housing, number of existing credits at this bank, nature of job, number of people for whom liable to provide maintenance, applicant has phone in his or her name, foreign worker) for 1,000 past applicants for credit. Each applicant was rated as good credit (700 cases) or bad credit (300 cases). The K2 algorithm, C4.5 algorithm, the 3NN algorithm, The BP algorithm, the Ripper algorithm, the Sequential Minimal Optimization (SMO) algorithm and the LR algorithm were applied. As a result of these methods the Credit-a dataset and the Credit-g dataset classified good credit cases, bad credit cases and evaluation samples. These algorithms achieved different rates of correctly classified individuals.

4 DATA MINING IN FINANCIAL APPLICATION 1365 Then Kotsiantis (2007) compared the results proposed methodology SelVot (selecting voting rule) for the Credit-a and the Credit-g databases with the voting algorithm, The Best CV algorithm (B-cross validation), the Grading algorithm, the Stacking algorithm. Kotsiantis (2007) found that the proposed selective voting methodology achieves better performance than any examined simple and ensemble method. Neural Network Foreign Exchange Model Forecasting Accuracy Based on an early model of human brain function, neural networks do classification and regression equally well. More complex than other techniques, neural networks has often been described as a black box technology (Gerritsen, 1999, p. 18). In this respect, the dominant data mining method technique used foreign exchange market prediction so far is neural network modeling. In this subject, the studies conducted by Walczak (2001) may be summarized as follows. First, neural network models were developed to forecast the spot rates of the U.S. dollar against the British pound, German mark, and Japanese yen. Historic spot rate data values for each of these currency exchange rates were collected from March 1, 1973 through June 30, This data sample represented data from the time that a floating exchange rate policy was adopted until the near present. Eleven different neural network models were applied resulting in 33 different neural network forecasting models to forecast the three foreign exchanges. Walczak (2001) found that the dollar/yen neural network models are the only ones that have a similar (identical) performance result between the full 22-year training set and either of the 1- or 2-year training set models. Modifications to the training data set alone are responsible for 11.2% change in forecasting performance for the dollar/pound neural networks (comparing the 2-year training performance against the worst-case performance of larger training sets), 6.4% change for the dollar/mark networks, and 18.4% change in the dollar/yen neural network forecasting accuracy. Portfolio Management Portfolio management is a major issue in investment. Its goal is to choose a set of risk assets to form a portfolio that can maximize to return under a given risk or minimize the risk for obtaining a given return (Hung, Liang, & Liu, 1996, p. 301). With the data mining and optimization techniques investors are able to allocate capital across trading activities to maximize profit or minimize risk. This feature supports the ability to generate trade recommendations and portfolio structuring from user supplied profit and risk requirement (Dass, 2006, p. 6). In this respect, data mining has been applied portfolio management, investment risk analysis and so on. In this subject, the studies conducted by Hung et al. (1996) may be summarized as follows. The study includes three major components: APT (arbitrage price theory), ANN (artificial neural network) and portfolio constructor. The data were collected from Taiwan Stock Exchange. Fifty-one stocks were selected among the more than 300 traded stocks. The data used in this study are from November 25 to October 23, The daily return of each stock during the period was obtained from the econometrics programming system (EPS) database maintained by Ministry of education. In the next phase, the whole time period was divided into 12 sets of training and evaluation periods.

5 1366 DATA MINING IN FINANCIAL APPLICATION They first applied factor analysis to identify the key risk factors. Then multiple regression analysis was used to price the assets. ANN and ARIMA (Autoregressive Integrated Moving Average) methods were used to predict the trend of each risk factor in the future. One hundred and three weekly return data were used at this step to shorten the model building time at a slight cost of precision. IANN and IARIMA models were built. Finally, candidate portfolios were built based on the risk and the predicted return of each stock. Then optimal portfolio was selected from the candidates and its performance in the following four weeks. Predicting Corporate Failure Predicting corporate failure is a very critical task for government officials, stock holders, managers, employees, investors and researchers, especially in nowadays competitive economic environment. The corporate failure prediction is basically a dichotomous decision, either being failure or not. Most statistical and AI methods estimate the probability of corporate failure. In this subject, the studies conducted by Nguyen (2005) may be summarized as follows. Data were collected from the Corporate Scorecard Group s (CSG) databases in Australia in period Two sets of data collected. The first dataset consists of 32 Australian companies which is called the training or sample dataset. The second dataset consists of 200 companies that they are coded by the CSG and no extra defaulted information is revealed. There will be selected a set of financial ratios which have successfully predicted firm failure to build the models including liquidity, profitability, solvency and financial leverage. Three neutral network models were applied. These models were Probabilistic Neural Network, Multi-Layer Neural Network, and Logistic Regression. ANGOSS Knowledge Studio software used to test the predictive power of these models. As a result of these methods, Probabilistic neutral network, with the training process, the model correctly predicts 93.75%. Multi-layer neural network, with the training process, the model correctly predicts 95.45%. For Logistic regression model, with the training process, the model correctly predicts 100% both for one year and two years before failure. Nguyen (2005) found that three models are good at predicting probability of corporate failure. Moreover, probabilistic neural network model outperforms the others. Conclusions Data mining can be expressed as disclosure of hidden values and usable information among a large amount of data. The purpose of data mining is to create decision making models for future predictions based on the analysis of the past activities. Data mining can find an application usage field in almost every media in which the data is produced intensively and consequently the databases are created. It can be used in the field of finance, particularly for signature and bank note verification, mortgage scoring, country risk rating, credit risk analysis, foreign exchange rate forecasting, portfolio management, bond rating and trading early determination of financial failure, determination of financial information manipulation and internal learners, preventing law application in the initial public offerings, determining the market anomalies, analysis of investor risk perception depending on education, income level, gender and such factors, analysis of integration level in financial markets, analysis of the level of transmission of the capital provided from the initial public offerings to the reel investments,

6 DATA MINING IN FINANCIAL APPLICATION 1367 presupposition of bear and bull periods in the stock exchanges, creating risk (volatility) index for markets, company evaluation and equities price depth analysis of SMEs in entrepreneur capital market and forecasting of abnormal stock exchange revenues. In this study, information on data mining concept and data mining methods has been rendered. Furthermore, the suggestions for applying data mining on capital markets have been described by means of some related studies. References Akgöbek, Ö., & Çakır, F. (2009). Expert system design with data mining. Proceedings from: Akademic Information Conference (pp ). Harran University, Şanlıurfa. Albayrak, A. S., & Yılmaz, Ş. K. (2009). Data mining: Decision tree algorithms and an application on Ise data. The Journal of Faculty of Economics and Administrative Sciences, 14(1), Bhargava, N., & Manik, G. (2011). Application of Artificial neural networks in business applications. Retrieved from Dass, R. (2006). Data mining in banking and finance: A note for bankers. Retrieved from Gerritsen, R. (1999). Assessing loan risks: A data mining case study. IT Professional, 1, Huang, C. L., Chen, M. C., & Wang, C. J. (2007). Credit scoring with a data mining approach based on support vector machines. Expert Systems With Applications, 33, Hung, S. Y., Liang, T. P., & Liu, C. W. (1996). Integrating arbitrage pricing theory and artificial neural networks to support portfolio management. Decision Support System, 18(3/4), Kotsiantis, S. (2007). Credit risk analysis using a hybrid data mining model. International Journal of Intelligent Systems Technologies and Applications, 2(4), Koyuncugil, A. S., & Özgülbaş, N. (2009). Data mining: Using and applications in medicine and healthcare. Journal of Information Technology, 2(2), Nguyen, H. G. (2005). Using neutral network in predicting corporate failure. Journal of Social Sciences, 1(4), Olafsson, S., Li, X. N., & Wu, S. N. (2008). Operations research and data mining. European Journal of Operational Research, 187(3), Stolfo, S. J., Fan W. D., Lee, W., & Prodromidis, A. (1997). Credit card fraud detection using meta-learning issues and initial results. Proceedings from: AAAI-97 Workshop on Al Methods in Fraud and Risk Management (pp ). Menlo Park. Sumathi, S., & Sivanandam, S. N. (2006). Introduction to data mining and its applications (Vol. 29). Verlag Berlin Heidelberg: Springer. Tsai, C. F., Lu, Y. H., & Hsu, Y. F. (2011). Bankruptcy prediction by supervised machine learning techniques: A comparative study. In A. S. Koyuncugil, & N. Özgülbaş (Eds.), Surveillance technologies and early warning systems (pp ). Hershey New York: Information Science Reference. Vellido, A., Lisboa, P. J. G., & Vaughan, J. (1999). Neural networks in business: A survey of applications ( ). Expert Systems With Applications, 17, Walczak, S. (2001). An empirical analysis of data requirements for financial forecasting with neural networks. Journal of Management Information Systems, 17(4), Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). San Francisco: Morgan Kaufmann Publishers. Zhang, D. S., & Zhou, L. N. (2004). Discovering golden nuggets: Data mining in financial application. IEEE Transactions on Systems, Man, and Cybernetics Part C: Applications and Reviews, 34(4),

Review on Financial Forecasting using Neural Network and Data Mining Technique

Review on Financial Forecasting using Neural Network and Data Mining Technique ORIENTAL JOURNAL OF COMPUTER SCIENCE & TECHNOLOGY An International Open Free Access, Peer Reviewed Research Journal Published By: Oriental Scientific Publishing Co., India. www.computerscijournal.org ISSN:

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Data Mining Solutions for the Business Environment

Data Mining Solutions for the Business Environment Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Equity forecast: Predicting long term stock price movement using machine learning

Equity forecast: Predicting long term stock price movement using machine learning Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK Nikola.milosevic@manchester.ac.uk Abstract Long

More information

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES

DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 lakshmi.mahanra@gmail.com

More information

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

More information

Review on Financial Forecasting using Neural Network and Data Mining Technique

Review on Financial Forecasting using Neural Network and Data Mining Technique Global Journal of Computer Science and Technology Neural & Artificial Intelligence Volume 12 Issue 11 Version 1.0 Year 2012 Type: Double Blind Peer Reviewed International Research Journal Publisher: Global

More information

Clustering Marketing Datasets with Data Mining Techniques

Clustering Marketing Datasets with Data Mining Techniques Clustering Marketing Datasets with Data Mining Techniques Özgür Örnek International Burch University, Sarajevo oornek@ibu.edu.ba Abdülhamit Subaşı International Burch University, Sarajevo asubasi@ibu.edu.ba

More information

An Empirical Study of Application of Data Mining Techniques in Library System

An Empirical Study of Application of Data Mining Techniques in Library System An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani

More information

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

More information

Credit Risk Assessment of POS-Loans in the Big Data Era

Credit Risk Assessment of POS-Loans in the Big Data Era Credit Risk Assessment of POS-Loans in the Big Data Era Yiyang Bian 1,2, Shaokun Fan 1, Ryan Liying Ye 1, J. Leon Zhao 1 1 Department of Information Systems, City University of Hong Kong 2 School of Management,

More information

Rule based Classification of BSE Stock Data with Data Mining

Rule based Classification of BSE Stock Data with Data Mining International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015

International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015 RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering

More information

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013. Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing

More information

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

More information

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION

HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan

More information

Financial Trading System using Combination of Textual and Numerical Data

Financial Trading System using Combination of Textual and Numerical Data Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,

More information

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

More information

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM

AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo 71251911@mackenzie.br,nizam.omar@mackenzie.br

More information

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Enhanced Boosted Trees Technique for Customer Churn Prediction Model IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

Electronic Payment Fraud Detection Techniques

Electronic Payment Fraud Detection Techniques World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 2, No. 4, 137-141, 2012 Electronic Payment Fraud Detection Techniques Adnan M. Al-Khatib CIS Dept. Faculty of Information

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Use of Data Mining in Banking

Use of Data Mining in Banking Use of Data Mining in Banking Kazi Imran Moin*, Dr. Qazi Baseer Ahmed** *(Department of Computer Science, College of Computer Science & Information Technology, Latur, (M.S), India ** (Department of Commerce

More information

WITH THE INCREASE of economic globalization and

WITH THE INCREASE of economic globalization and IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART C: APPLICATIONS AND REVIEWS, VOL. 34, NO. 4, NOVEMBER 2004 513 Discovering Golden Nuggets: Data Mining in Financial Application Dongsong Zhang and

More information

Neural Network Applications in Stock Market Predictions - A Methodology Analysis

Neural Network Applications in Stock Market Predictions - A Methodology Analysis Neural Network Applications in Stock Market Predictions - A Methodology Analysis Marijana Zekic, MS University of Josip Juraj Strossmayer in Osijek Faculty of Economics Osijek Gajev trg 7, 31000 Osijek

More information

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH

EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One

More information

PREDICTING STOCK PRICES USING DATA MINING TECHNIQUES

PREDICTING STOCK PRICES USING DATA MINING TECHNIQUES The International Arab Conference on Information Technology (ACIT 2013) PREDICTING STOCK PRICES USING DATA MINING TECHNIQUES 1 QASEM A. AL-RADAIDEH, 2 ADEL ABU ASSAF 3 EMAN ALNAGI 1 Department of Computer

More information

An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset

An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset P P P Health An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang

More information

A Survey on Outlier Detection Techniques for Credit Card Fraud Detection

A Survey on Outlier Detection Techniques for Credit Card Fraud Detection IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud

More information

How To Make A Credit Risk Model For A Bank Account

How To Make A Credit Risk Model For A Bank Account TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző csaba.fozo@lloydsbanking.com 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions

More information

Prediction of Heart Disease Using Naïve Bayes Algorithm

Prediction of Heart Disease Using Naïve Bayes Algorithm Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,

More information

Towards applying Data Mining Techniques for Talent Mangement

Towards applying Data Mining Techniques for Talent Mangement 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,

More information

New Ensemble Combination Scheme

New Ensemble Combination Scheme New Ensemble Combination Scheme Namhyoung Kim, Youngdoo Son, and Jaewook Lee, Member, IEEE Abstract Recently many statistical learning techniques are successfully developed and used in several areas However,

More information

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors

Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann

More information

Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three

More information

Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1

Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1 Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results 1 Salvatore J. Stolfo, David W. Fan, Wenke Lee and Andreas L. Prodromidis Department of Computer Science Columbia University

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

Neural Networks and Back Propagation Algorithm

Neural Networks and Back Propagation Algorithm Neural Networks and Back Propagation Algorithm Mirza Cilimkovic Institute of Technology Blanchardstown Blanchardstown Road North Dublin 15 Ireland mirzac@gmail.com Abstract Neural Networks (NN) are important

More information

Roulette Sampling for Cost-Sensitive Learning

Roulette Sampling for Cost-Sensitive Learning Roulette Sampling for Cost-Sensitive Learning Victor S. Sheng and Charles X. Ling Department of Computer Science, University of Western Ontario, London, Ontario, Canada N6A 5B7 {ssheng,cling}@csd.uwo.ca

More information

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems

Impact of Feature Selection on the Performance of Wireless Intrusion Detection Systems 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Impact of Feature Selection on the Performance of ireless Intrusion Detection Systems

More information

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed

More information

Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad Email: Prasad_vungarala@yahoo.co.in

Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad Email: Prasad_vungarala@yahoo.co.in 96 Business Intelligence Journal January PREDICTION OF CHURN BEHAVIOR OF BANK CUSTOMERS USING DATA MINING TOOLS Dr. U. Devi Prasad Associate Professor Hyderabad Business School GITAM University, Hyderabad

More information

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA

ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,

More information

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

More information

Credit Card Fraud Detection Using Self Organised Map

Credit Card Fraud Detection Using Self Organised Map International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 13 (2014), pp. 1343-1348 International Research Publications House http://www. irphouse.com Credit Card Fraud

More information

Predictive time series analysis of stock prices using neural network classifier

Predictive time series analysis of stock prices using neural network classifier Predictive time series analysis of stock prices using neural network classifier Abhinav Pathak, National Institute of Technology, Karnataka, Surathkal, India abhi.pat93@gmail.com Abstract The work pertains

More information

How To Use Neural Networks In Data Mining

How To Use Neural Networks In Data Mining International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

More information

Machine Learning in FX Carry Basket Prediction

Machine Learning in FX Carry Basket Prediction Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines

More information

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE

EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate

More information

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com

INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.

More information

Predicting Student Performance by Using Data Mining Methods for Classification

Predicting Student Performance by Using Data Mining Methods for Classification BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/

More information

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model ABSTRACT Mrs. Arpana Bharani* Mrs. Mohini Rao** Consumer credit is one of the necessary processes but lending bears

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

NEURAL NETWORKS IN DATA MINING

NEURAL NETWORKS IN DATA MINING NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,

More information

Data Mining System, Functionalities and Applications: A Radical Review

Data Mining System, Functionalities and Applications: A Radical Review Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Experiments in Web Page Classification for Semantic Web

Experiments in Web Page Classification for Semantic Web Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address

More information

Data Mining Part 5. Prediction

Data Mining Part 5. Prediction Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

More information

REVIEW OF ENSEMBLE CLASSIFICATION

REVIEW OF ENSEMBLE CLASSIFICATION Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology ISSN 2320 088X IJCSMC, Vol. 2, Issue.

More information

Studying Achievement

Studying Achievement Journal of Business and Economics, ISSN 2155-7950, USA November 2014, Volume 5, No. 11, pp. 2052-2056 DOI: 10.15341/jbe(2155-7950)/11.05.2014/009 Academic Star Publishing Company, 2014 http://www.academicstar.us

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results

Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results Salvatore 2 J.

More information

Stock Portfolio Selection using Data Mining Approach

Stock Portfolio Selection using Data Mining Approach IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719 Vol. 3, Issue 11 (November. 2013), V1 PP 42-48 Stock Portfolio Selection using Data Mining Approach Carol Anne Hargreaves, Prateek

More information

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška A. Imbalanced Data Classification

More information

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE

A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data

More information

Introducing diversity among the models of multi-label classification ensemble

Introducing diversity among the models of multi-label classification ensemble Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and

More information

Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry

Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened

More information

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING

EFFICIENT DATA PRE-PROCESSING FOR DATA MINING EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College

More information

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

A Prediction Model for Taiwan Tourism Industry Stock Index

A Prediction Model for Taiwan Tourism Industry Stock Index A Prediction Model for Taiwan Tourism Industry Stock Index ABSTRACT Han-Chen Huang and Fang-Wei Chang Yu Da University of Science and Technology, Taiwan Investors and scholars pay continuous attention

More information

Comparison of Data Mining Techniques used for Financial Data Analysis

Comparison of Data Mining Techniques used for Financial Data Analysis Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract

More information

A Mechanism for Selecting Appropriate Data Mining Techniques

A Mechanism for Selecting Appropriate Data Mining Techniques A Mechanism for Selecting Appropriate Data Mining Techniques Rose Tinabo School of Computing Dublin institute of technology Kevin Street Dublin 8 Ireland rose.tinabo@mydit.ie ABSTRACT: Due to an increase

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore.

CI6227: Data Mining. Lesson 11b: Ensemble Learning. Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore. CI6227: Data Mining Lesson 11b: Ensemble Learning Sinno Jialin PAN Data Analytics Department, Institute for Infocomm Research, A*STAR, Singapore Acknowledgements: slides are adapted from the lecture notes

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540

U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540 U.P.B. Sci. Bull., Series C, Vol. 77, Iss. 1, 2015 ISSN 2286 3540 ENTERPRISE FINANCIAL DISTRESS PREDICTION BASED ON BACKWARD PROPAGATION NEURAL NETWORK: AN EMPIRICAL STUDY ON THE CHINESE LISTED EQUIPMENT

More information

Ensemble Data Mining Methods

Ensemble Data Mining Methods Ensemble Data Mining Methods Nikunj C. Oza, Ph.D., NASA Ames Research Center, USA INTRODUCTION Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods

More information

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring 714 Evaluation of Feature election Methods for Predictive Modeling Using Neural Networks in Credits coring Raghavendra B. K. Dr. M.G.R. Educational and Research Institute, Chennai-95 Email: raghavendra_bk@rediffmail.com

More information

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

More information

Classification algorithm in Data mining: An Overview

Classification algorithm in Data mining: An Overview Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department

More information

Working with telecommunications

Working with telecommunications Working with telecommunications Minimizing churn in the telecommunications industry Contents: 1 Churn analysis using data mining 2 Customer churn analysis with IBM SPSS Modeler 3 Types of analysis 3 Feature

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Visualization of large data sets using MDS combined with LVQ.

Visualization of large data sets using MDS combined with LVQ. Visualization of large data sets using MDS combined with LVQ. Antoine Naud and Włodzisław Duch Department of Informatics, Nicholas Copernicus University, Grudziądzka 5, 87-100 Toruń, Poland. www.phys.uni.torun.pl/kmk

More information

Data Mining Analysis (breast-cancer data)

Data Mining Analysis (breast-cancer data) Data Mining Analysis (breast-cancer data) Jung-Ying Wang Register number: D9115007, May, 2003 Abstract In this AI term project, we compare some world renowned machine learning tools. Including WEKA data

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

More information