Creditworthiness Analysis in E-Financing Businesses - A Cross-Business Approach

Kun Liang 1,2, Zhangxi Lin 2, Zelin Jia 2, Cuiqing Jiang 1, Jiangtao Qiu 2,3

1 School of Management, Hefei University of Technology, 193 Tunxi Road, Hefei, China
2 Jerry S. Rawls College of Business Administration, Texas Tech University, 703 Flint Ave, Lubbock, TX 79409, USA
3 School of Economic Information Engineering, Southwestern University of Finance and Economics, 55 Guanghuacun Road, Chengdu, China

liangkun_fd@163.com; zhangxi.lin@ttu.edu; zelin.jia@ttu.edu; jiangcuiq@163.com; qjt163@163.com

Abstract. To cope with the challenge of data scarcity in creditworthiness analysis for e-financing businesses, this paper proposes a cross-business analysis approach based on the assumption that a client's behavior is consistent across different e-commerce environments. With this approach, we can analyze individuals' creditworthiness by associating financial data on lending platforms with cross-business non-financial data on social media. We conceive three creditworthiness assessment models and conduct an experimental study on the Ant Financial Co-Creation Data Platform. The results verify that our cross-business creditworthiness analysis approach is effective.

Keywords: Online lending, Creditworthiness, Cross-business data, Modeling, Data mining

1 Introduction

Following the trend of financial disintermediation, innovative e-financing businesses, such as P2P (Peer-to-Peer), P2B (Peer-to-Business), P2G (Private-to-Government), and crowdfunding, provide diversified funding services directly to small businesses and consumers through various online platforms. For example, Alibaba launched its online microloan service, Aliloan, in 2010; by 2014 it had issued about 25 billion dollars in loans, benefiting more than one million small and micro-sized enterprises [1]. In reviewing an online loan application, a client's ability and willingness to fulfill contracts, i.e., his/her creditworthiness, is an important indicator [2], because creditworthiness analysis can effectively reduce the information asymmetry between financial suppliers and borrowers on the e-financing platform and improve the accuracy of loan decisions.

Existing creditworthiness analysis methods predict individuals' creditworthiness mainly from historical data generated in the lending business, such as payment history, credit usage, and length of credit history; most of these are transactional financial data [3, 4]. However, this kind of data is hard to obtain in e-financing businesses [3], while user-generated data, such as social networking data on social media platforms and online reputation scores in electronic markets, are abundant but non-financial. These cross-business non-financial data can effectively reflect individuals' creditworthiness from multiple perspectives [4]. For example, one person can play different roles in various e-commerce businesses: he or she can be a borrower in an online lending business and, at the same time, a seller in a C2C market. In this context, many cross-business data, such as the online reputation scores formed in trading businesses (C2C transactions), can be used to assess his or her creditworthiness in the lending business [4].

This paper investigates how individuals' creditworthiness-related non-financial data can be used in the e-financing business. Specifically, we study the following two questions. First, how effectively can cross-business non-financial data be used to analyze individuals' creditworthiness in the e-financing business? Second, how should reasonable cross-business non-financial indicators be selected for creditworthiness analysis in e-financing businesses?

The contributions of this study can be summarized as follows: (1) we propose a cross-business creditworthiness analysis approach to overcome the data scarcity problem of creditworthiness analysis in e-financing businesses; (2) we reveal that social capital theory can be helpful in selecting reasonable cross-business non-financial data to improve the performance of creditworthiness analysis for e-financing businesses.

2 A cross-business creditworthiness analysis approach

In this section, we propose a cross-business creditworthiness analysis approach based on the consistent attribute of creditworthiness.
Adelson et al. (2009) proposed that creditworthiness can be understood as the relative ranking of default frequency, and that this relative ranking is consistent across different businesses and scenarios [5]. In addition, individuals' creditworthiness is their ability and willingness to fulfill contracts. Although individuals may play different roles and have to fulfill different contracts in various businesses, their creditworthiness is influenced by some common factors, such as their morality and sense of responsibility. Therefore, one person's creditworthiness is consistent across different businesses.

We select individuals who are borrowers in the online lending market and also sellers in the C2C market. Assuming that an individual's creditworthiness is consistent across different online platforms [5], we analyze these individuals' creditworthiness in the lending business by considering their behavior in online commodity transactions, which is reflected by their reputation and social capital. An individual's default behavior could destroy the reputation that he/she previously accumulated as an intangible asset [6]. In this sense, online reputation can effectively reflect one's creditworthiness. Furthermore, maintaining good creditworthiness can help one gain more social resources, such as friends and job opportunities [4].

Based on this idea, we conceive three models to analyze individuals' creditworthiness in e-financing businesses with different indicator sets (see Figure 1). Model 1 analyzes individuals' creditworthiness based on an indicator set that contains only financial factors (Fa); Model 2 adds online reputation factors (Fb) to the indicator set; Model 3 further adds social capital factors (Fc). We study whether individuals' creditworthiness in the e-financing business can be analyzed more accurately by adding cross-business non-financial factors (Fb and Fc).

Fig. 1. A cross-business creditworthiness analysis approach

Table 1. The performance of different creditworthiness assessment models
Columns: Models, FN, TN, FP, TP, TPR, TNR, Cost
Rows: DT 1, DT 2, DT 3, NN 1, NN 2, NN 3, LR 1, LR 2, LR 3, SVM 1, SVM 2, SVM 3
Model n (including DT n, NN n, LR n, and SVM n), n = 1, 2, 3. Model 1, Model 2, and Model 3 refer to the models that use the indicator sets Fa, Fa+Fb, and Fa+Fb+Fc, respectively. For example, DT 1 is the decision tree model that adopts only financial indicators (Fa).
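
The nesting of the three indicator sets and the resulting grid of twelve model variants can be sketched as follows. This is only an illustration of the setup described above: the paper does not list the individual variables, so every indicator name below is hypothetical, as are the helper names `indicator_sets`, `project`, and `experiment_grid`.

```python
# Hypothetical indicator names; the paper does not enumerate its variables.
FA = ["loan_amount", "credit_history_length"]      # financial factors (Fa)
FB = ["seller_rating", "positive_feedback_rate"]   # online reputation factors (Fb)
FC = ["follower_count", "mutual_friend_count"]     # social capital factors (Fc)

# The four classification techniques named in the paper.
CLASSIFIERS = ["DT", "NN", "LR", "SVM"]


def indicator_sets(fa, fb, fc):
    """Return the nested indicator sets: Fa, Fa+Fb, Fa+Fb+Fc."""
    return {
        "Model 1": list(fa),
        "Model 2": list(fa) + list(fb),
        "Model 3": list(fa) + list(fb) + list(fc),
    }


def project(record, indicators):
    """Keep only the selected indicators from one observation (a dict)."""
    return [record[name] for name in indicators]


def experiment_grid(sets):
    """Enumerate the classifier / indicator-set combinations (DT 1 ... SVM 3)."""
    return [f"{clf} {n}" for clf in CLASSIFIERS for n in range(1, len(sets) + 1)]


sets = indicator_sets(FA, FB, FC)
grid = experiment_grid(sets)
```

Each entry of `grid` corresponds to one row label of Table 1; training the classifiers themselves is left out because the paper does not report their configuration.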

3 Experimental study

We conduct an experimental study on the Ant Financial Co-Creation Data Platform, which became available in June 2015 under the agreement between Ant Financial and Tongji University, to validate the feasibility of the proposed cross-business creditworthiness analysis approach. We compiled a dataset of 6,598 observations containing 366 variables, dated 20 April. Each observation represents an individual's records in both the online lending market and the C2C market. The target variable, i.e., the creditworthiness of an individual, is set to either good or bad, determined by whether there is a late payment record in his/her loan history.

We select four classification techniques to construct our creditworthiness assessment models: Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), and Neural Network (NN). We use three criteria to evaluate the performance of each model with different indicator sets: TPR, TNR, and Cost, as defined in [7]. TPR is the percentage of bad-creditworthiness individuals that are correctly classified. TNR is the percentage of good-creditworthiness individuals that are correctly classified. Cost is an aggregate criterion that considers both TPR and TNR.

The results of the comparison of different models and indicator sets are summarized in Table 1. In Table 1, the TPR of Model 2 is higher than that of Model 1. This result indicates that the model can identify more individuals with bad creditworthiness when the reputation factors are added. However, the TNR of Model 2 is slightly lower than that of Model 1. This result shows that the ability to recognize individuals with good creditworthiness decreases slightly when the reputation factors are added. Because the misclassification costs of good and bad creditworthiness differ [7], we use the Cost criterion to determine the models' overall ability to differentiate good and bad creditworthiness. In general, a model with a lower Cost is better at creditworthiness analysis.

In Table 1, the Cost of Model 1 is higher than that of Model 2, which indicates that individuals' reputation in the Alibaba C2C business has significant predictive ability for their creditworthiness in the Ali-loan business. Similarly, the Cost of Model 2 is higher than that of Model 3, which indicates that individuals' social capital on Sina micro-blog (reflecting the social capital accumulated in C2C transactions) has significant predictive ability for their creditworthiness in the Ali-loan business. Comparing the performance of Models 1-3, we find that individuals' creditworthiness in an e-financing business can be predicted by their creditworthiness in the trading business, which is analyzed through the reputation and social capital (cross-business non-financial factors) formed in C2C transaction activities. This result verifies the consistent attribute of creditworthiness [5].

Now we can answer the two questions posed in the Introduction. First, we can analyze individuals' creditworthiness in the e-financing business based on cross-business non-financial data. Second, we can select reasonable cross-business non-financial indicators according to reputation and social capital theory.
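
The evaluation criteria used in this section can be illustrated with a minimal sketch. TPR and TNR follow the definitions given in the text, with bad creditworthiness as the positive class. The exact Cost formula is defined in [7] and is not reproduced in this paper, so the weighted sum of error rates below, and the weights `c_fn` and `c_fp`, are assumptions made purely for illustration. The counts in the example are invented and are not taken from Table 1.

```python
def label(late_payment_count):
    """Target variable: 'bad' if the loan history contains any late payment."""
    return "bad" if late_payment_count > 0 else "good"


def confusion(actual, predicted):
    """Count TP, FN, TN, FP with 'bad' creditworthiness as the positive class."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == "bad":
            if p == "bad":
                tp += 1
            else:
                fn += 1
        else:
            if p == "good":
                tn += 1
            else:
                fp += 1
    return tp, fn, tn, fp


def rates(tp, fn, tn, fp):
    """TPR: share of bad cases caught; TNR: share of good cases recognized."""
    return tp / (tp + fn), tn / (tn + fp)


def cost(tp, fn, tn, fp, c_fn=5.0, c_fp=1.0):
    """ASSUMED cost: weighted sum of the two error rates (not the formula of [7]).

    c_fn > c_fp encodes that missing a bad borrower is more expensive
    than rejecting a good one; both weights are hypothetical.
    """
    fnr = fn / (tp + fn)
    fpr = fp / (tn + fp)
    return c_fn * fnr + c_fp * fpr


# Invented counts for two variants of the same classifier.
model_1 = dict(tp=60, fn=40, tn=920, fp=80)
model_2 = dict(tp=80, fn=20, tn=900, fp=100)
better = "Model 2" if cost(**model_2) < cost(**model_1) else "Model 1"
```

With these invented counts the second variant has a higher TPR, a slightly lower TNR, and a lower assumed Cost, which mirrors the pattern the text describes when reputation factors are added.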

4 Conclusions

To solve the data scarcity problem in creditworthiness analysis for the e-financing business, this study proposed a cross-business creditworthiness analysis approach based on the consistent attribute of creditworthiness. This paper makes several contributions to the literature. We proposed a new creditworthiness analysis approach for e-commerce businesses, which analyzes individuals' creditworthiness in one business by considering their creditworthiness in another business. In addition, we extend the application scope of reputation and social capital theory to cross-business creditworthiness analysis. From a practical perspective, we provide a better creditworthiness assessment model for e-financing businesses, which can effectively improve the accuracy of loan decisions. The methodology of cross-business creditworthiness analysis can also be used in other e-commerce businesses, such as P2P lending. In this sense, we can provide e-commerce platforms with more insight into individuals' creditworthiness from various business perspectives.

However, this research is limited and must be further expanded. In this paper, the evaluation indicators are restricted to structured data, and relevant indicators reflecting the relational aspect of social capital in creditworthiness analysis are lacking. As a next step, we will analyze creditworthiness through the relationship types of direct and indirect social network ties, which can effectively reflect the effect of the relational aspect of social capital on creditworthiness.

5 References

1. The actual financing cost is only 6.7%: Ali finance shocks the private lending.
2. Safi, R., & Lin, Z. (2014). Using non-financial data to assess the creditworthiness of businesses in online trade. PACIS Proceedings.
3. Wang, Y., Li, S., & Lin, Z. (2013, July). Revealing Key Non-financial Factors for Online Credit-Scoring in e-financing. In Service Systems and Service Management (ICSSSM), International Conference on. IEEE.
4. Lin, Z., Whinston, A. B., & Fan, S. (2015). Harnessing Internet finance with innovative cyber credit management. Financial Innovation, 1(1).
5. Adelson, M., Ravimohan, R., Griep, C., Jacob, D., Coughlin, P., Bukspan, N., & Wyss, D. (2009). Understanding Standard & Poor's Rating Definitions. Standard & Poor's. Rating_Definitions.pdf.
6. Van den Bogaerd, M., & Aerts, W. (2015). Does media reputation affect properties of accounts payable? European Management Journal, 33(1).
7. Lessmann, S., Baesens, B., Seow, H. V., & Thomas, L. C. (2015). Benchmarking state-of-the-art classification algorithms for credit scoring: A ten-year update. European Journal of Operational Research.

harpreet@utdallas.edu, {ram.gopal, xinxin.li}@business.uconn.edu

harpreet@utdallas.edu, {ram.gopal, xinxin.li}@business.uconn.edu Risk and Return of Investments in Online Peer-to-Peer Lending (Extended Abstract) Harpreet Singh a, Ram Gopal b, Xinxin Li b a School of Management, University of Texas at Dallas, Richardson, Texas 75083-0688

More information

Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction

Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction Choosing the Best Classification Performance Metric for Wrapper-based Software Metric Selection for Defect Prediction Huanjing Wang Western Kentucky University huanjing.wang@wku.edu Taghi M. Khoshgoftaar

More information

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago sagarikaprusty@gmail.com Keywords:

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION

ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical

More information

Business Analytics and Credit Scoring

Business Analytics and Credit Scoring Study Unit 5 Business Analytics and Credit Scoring ANL 309 Business Analytics Applications Introduction Process of credit scoring The role of business analytics in credit scoring Methods of logistic regression

More information

Behavior Model to Capture Bank Charge-off Risk for Next Periods Working Paper

Behavior Model to Capture Bank Charge-off Risk for Next Periods Working Paper 1 Behavior Model to Capture Bank Charge-off Risk for Next Periods Working Paper Spring 2007 Juan R. Castro * School of Business LeTourneau University 2100 Mobberly Ave. Longview, Texas 75607 Keywords:

More information

Stock Market Forecasting Using Machine Learning Algorithms

Stock Market Forecasting Using Machine Learning Algorithms Stock Market Forecasting Using Machine Learning Algorithms Shunrong Shen, Haomiao Jiang Department of Electrical Engineering Stanford University {conank,hjiang36}@stanford.edu Tongda Zhang Department of

More information

Predicting Customer Default Times using Survival Analysis Methods in SAS

Predicting Customer Default Times using Survival Analysis Methods in SAS Predicting Customer Default Times using Survival Analysis Methods in SAS Bart Baesens Bart.Baesens@econ.kuleuven.ac.be Overview The credit scoring survival analysis problem Statistical methods for Survival

More information

Towards applying Data Mining Techniques for Talent Mangement

Towards applying Data Mining Techniques for Talent Mangement 2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,

More information

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION

IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION http:// IDENTIFIC ATION OF SOFTWARE EROSION USING LOGISTIC REGRESSION Harinder Kaur 1, Raveen Bajwa 2 1 PG Student., CSE., Baba Banda Singh Bahadur Engg. College, Fatehgarh Sahib, (India) 2 Asstt. Prof.,

More information

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.

Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013. Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing

More information

How can we discover stocks that will

How can we discover stocks that will Algorithmic Trading Strategy Based On Massive Data Mining Haoming Li, Zhijun Yang and Tianlun Li Stanford University Abstract We believe that there is useful information hiding behind the noisy and massive

More information

A Content based Spam Filtering Using Optical Back Propagation Technique

A Content based Spam Filtering Using Optical Back Propagation Technique A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT

More information

ON THE HETEROGENEOUS EFFECTS OF NON- CREDIT-RELATED INFORMATION IN ONLINE P2P LENDING: A QUANTILE REGRESSION ANALYSIS

ON THE HETEROGENEOUS EFFECTS OF NON- CREDIT-RELATED INFORMATION IN ONLINE P2P LENDING: A QUANTILE REGRESSION ANALYSIS ON THE HETEROGENEOUS EFFECTS OF NON- CREDIT-RELATED INFORMATION IN ONLINE P2P LENDING: A QUANTILE REGRESSION ANALYSIS Sirong Luo, Dengpan Liu, Yinmin Ye Outline Introduction and Motivation Literature Review

More information

Random forest algorithm in big data environment

Random forest algorithm in big data environment Random forest algorithm in big data environment Yingchun Liu * School of Economics and Management, Beihang University, Beijing 100191, China Received 1 September 2014, www.cmnt.lv Abstract Random forest

More information

Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms

Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms Financial Statement Fraud Detection: An Analysis of Statistical and Machine Learning Algorithms Johan Perols Assistant Professor University of San Diego, San Diego, CA 92110 jperols@sandiego.edu April

More information

Expert Systems with Applications

Expert Systems with Applications Expert Systems with Applications 36 (2009) 5445 5449 Contents lists available at ScienceDirect Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa Customer churn prediction

More information

Enhanced Boosted Trees Technique for Customer Churn Prediction Model

Enhanced Boosted Trees Technique for Customer Churn Prediction Model IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction

More information

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three

More information

Loan Repayment and Credit Management of Small Businesses A CASE STUDY OF A SOUTH AFRICAN COMMERCIAL BANK

Loan Repayment and Credit Management of Small Businesses A CASE STUDY OF A SOUTH AFRICAN COMMERCIAL BANK Loan Repayment and Credit Management of Small Businesses A CASE STUDY OF A SOUTH AFRICAN COMMERCIAL BANK Clemence Hwarire 7 August 2012 Contents Introduction Obstacles hindering the growth of small businesses

More information

SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH

SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH 1 SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH Y, HONG, N. GAUTAM, S. R. T. KUMARA, A. SURANA, H. GUPTA, S. LEE, V. NARAYANAN, H. THADAKAMALLA The Dept. of Industrial Engineering,

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

A QoS-Aware Web Service Selection Based on Clustering

A QoS-Aware Web Service Selection Based on Clustering International Journal of Scientific and Research Publications, Volume 4, Issue 2, February 2014 1 A QoS-Aware Web Service Selection Based on Clustering R.Karthiban PG scholar, Computer Science and Engineering,

More information

A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique

A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique A new Approach for Intrusion Detection in Computer Networks Using Data Mining Technique Aida Parbaleh 1, Dr. Heirsh Soltanpanah 2* 1 Department of Computer Engineering, Islamic Azad University, Sanandaj

More information

ViviSight: A Sophisticated, Data-driven Business Intelligence Tool for Churn and Loan Default Prediction

ViviSight: A Sophisticated, Data-driven Business Intelligence Tool for Churn and Loan Default Prediction ViviSight: A Sophisticated, Data-driven Business Intelligence Tool for Churn and Loan Default Prediction Barun Paudel 1, T.H. Gopaluwewa 1, M.R.De. Waas Gunawardena 1, W.C.H. Wijerathna 1, Rohan Samarasinghe

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

Crowdfunding Support Tools: Predicting Success & Failure

Crowdfunding Support Tools: Predicting Success & Failure Crowdfunding Support Tools: Predicting Success & Failure Michael D. Greenberg Bryan Pardo mdgreenb@u.northwestern.edu pardo@northwestern.edu Karthic Hariharan karthichariharan2012@u.northwes tern.edu Elizabeth

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework

An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework An analysis of suitable parameters for efficiently applying K-means clustering to large TCPdump data set using Hadoop framework Jakrarin Therdphapiyanak Dept. of Computer Engineering Chulalongkorn University

More information

A Novel Classification Approach for C2C E-Commerce Fraud Detection

A Novel Classification Approach for C2C E-Commerce Fraud Detection A Novel Classification Approach for C2C E-Commerce Fraud Detection *1 Haitao Xiong, 2 Yufeng Ren, 2 Pan Jia *1 School of Computer and Information Engineering, Beijing Technology and Business University,

More information

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm

Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Constrained Classification of Large Imbalanced Data by Logistic Regression and Genetic Algorithm Martin Hlosta, Rostislav Stríž, Jan Kupčík, Jaroslav Zendulka, and Tomáš Hruška A. Imbalanced Data Classification

More information

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering

A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences

More information

Machine Learning at DIKU

Machine Learning at DIKU Faculty of Science Machine Learning at DIKU Christian Igel Department of Computer Science igel@diku.dk Slide 1/12 Machine learning Machine learning is a branch of computer science and applied statistics

More information

MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS

MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS MAXIMIZING RETURN ON DIRET MARKETING AMPAIGNS IN OMMERIAL BANKING S 229 Project: Final Report Oleksandra Onosova INTRODUTION Recent innovations in cloud computing and unified communications have made a

More information

USING LOGIT MODEL TO PREDICT CREDIT SCORE

USING LOGIT MODEL TO PREDICT CREDIT SCORE USING LOGIT MODEL TO PREDICT CREDIT SCORE Taiwo Amoo, Associate Professor of Business Statistics and Operation Management, Brooklyn College, City University of New York, (718) 951-5219, Tamoo@brooklyn.cuny.edu

More information

Feature Subset Selection in E-mail Spam Detection

Feature Subset Selection in E-mail Spam Detection Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature

More information

Finding Minimal Neural Networks for Business Intelligence Applications

Finding Minimal Neural Networks for Business Intelligence Applications Finding Minimal Neural Networks for Business Intelligence Applications Rudy Setiono School of Computing National University of Singapore www.comp.nus.edu.sg/~rudys d Outline Introduction Feed-forward neural

More information

Dynamic Predictive Modeling in Claims Management - Is it a Game Changer?

Dynamic Predictive Modeling in Claims Management - Is it a Game Changer? Dynamic Predictive Modeling in Claims Management - Is it a Game Changer? Anil Joshi Alan Josefsek Bob Mattison Anil Joshi is the President and CEO of AnalyticsPlus, Inc. (www.analyticsplus.com)- a Chicago

More information

Start-up Companies Predictive Models Analysis. Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov

Start-up Companies Predictive Models Analysis. Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov Start-up Companies Predictive Models Analysis Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov Abstract: A quantitative research is performed to derive a model for predicting the success of Bulgarian start-up

More information

Data Mining Yelp Data - Predicting rating stars from review text

Data Mining Yelp Data - Predicting rating stars from review text Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University rchada@cs.stonybrook.edu Chetan Naik Stony Brook University cnaik@cs.stonybrook.edu ABSTRACT The majority

More information

Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment

Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment Daniel Björkegren a and Darrell Grissen b July 8, 2015 Abstract: Many households in developing countries lack formal financial histories,

More information

CREDIT REPORTING FOR A SMALL BUSINESS

CREDIT REPORTING FOR A SMALL BUSINESS CREDIT REPORTING FOR A SMALL BUSINESS Objectives Northern Initiatives is committed to entrepreneurs like you, because you are the people who are creating jobs and enabling the communities of our region

More information

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring

Evaluation of Feature Selection Methods for Predictive Modeling Using Neural Networks in Credits Scoring 714 Evaluation of Feature election Methods for Predictive Modeling Using Neural Networks in Credits coring Raghavendra B. K. Dr. M.G.R. Educational and Research Institute, Chennai-95 Email: raghavendra_bk@rediffmail.com

More information

SVM Ensemble Model for Investment Prediction
