Fraud Risk Prediction in Merchant-Bank Relationship using Regression Modeling

Size: px
Start display at page:

Download "Fraud Risk Prediction in Merchant-Bank Relationship using Regression Modeling"

Transcription

1 R E S E A R C H includes research articles that focus on the analysis and resolution of managerial and academic issues based on analytical and empirical or case research Fraud Risk Prediction in Merchant-Bank Relationship using Regression Modeling Nishant Agarwal and Meghna Sharma Executive Summary Banking industry has gone through one of the worst crisis in recent times, and is still recovering from the after-shocks. However, there were a lot of learnings that banks would have taken away from this crisis. One of them is the need for a robust risk management system. The crisis dealt a blow to the banking system, catching them off guard when it came to foreseeing the risk. Banks, in the credit card business, face financial risk in the form of both credit risk and fraud risk. Sharma and Agarwal (2013) proposed a model for predicting the credit risk from the merchants. This paper builds upon their technique to predict the fraud risk posed by the merchants to the banks. Fraud risk is an important aspect of risk management systems, particularly in the credit space. The uncertainty surrounding the receipt of paybacks calls for designing robust risk prediction models. Fraud risk is very different from credit risk because fraud risk does not follow a pattern. It happens suddenly, and may not always have a trend before it happens. This creates a need for separate model for fraud risk prediction. This paper develops a fraud risk prediction model that uses logistic regression technique, deployed using SAS. The setup of the study is the merchant-bank relationship in the credit card industry. KEY WORDS Merchant Risk Acquiring Bank Risk Management Fraud Risk Management Logistic Regression SAS The model developed in this paper triggers on a transaction level, and assigns a probability score of default (PF) to each merchant for a possible fraud risk whenever a transaction is done at the merchant. Such a score warns the management in advance of probable future losses on merchant accounts. Banks can rank order merchants based on their PF score, and instead of working on the entire merchant portfolio, they can focus on the relatively riskier set of merchants. The PF model is validated by comparing the actual defaults with those predicted by the model and a good alignment is found between the two. The results show that the model can capture 62 percent frauds in the first decile when the transactions are sorted by the probability of fraud computed by the model. VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER

2 The merchant-acquirer relationship is often ignored by banks, when planning for risk management. The relationship, however, poses an invisible risk to the acquiring banks, because, when a transaction is recorded at a merchant today, the risk from this transaction arises a few months into the future. This makes the risk unforeseeable. The motivation behind this paper is to create an Early Warning System, to tackle the above-mentioned challenge posed by the merchant-acquirer relationship. This Early Warning system enables the banks to estimate the expected loss from the defaults predicted by the model, and strategize accordingly. Such an Early Warning System is possible to create because of the availability of live transactional data with the banks. This data can be leveraged by using tools like SAS and SQL Server to derive analytical insights and then create a statistical model. This model can then be run on live data. LITERATURE REVIEW Cressey (1953), a criminologist, first studied the motivation behind committing fraud. He formulated the wellknown fraud triangle, which has its three vertices as Pressure, Opportunity, and Rationalization. Cressey concluded from his research that Trust violators when they conceive of themselves as having a financial problem which is non-shareable, have knowledge or awareness that this problem can be secretly resolved by violation of the position of financial trust, and are able to apply to their own conduct in that situation verbalizations which enable them to adjust their conceptions of themselves as trusted persons with their conceptions of themselves as users of the entrusted funds or property (Cressey, 1953, p. 742). Credit card fraud risk prediction and modeling has been a research interest of many banks around the world and a number of techniques, with special emphasis on regression techniques. Molyneaux (2007) and George (1992) also discussed the ethics of banking and Clarke (1994), in his work, mentioned about the moral complexity of fraudulent behaviour. Ghosh and Reilly (1994) proposed a fraud detection system that used past fraud cases due to lost cards, stolen cards, and application frauds. Syeda (2002) used parallel granular neural networks (PGNNs) to improve the knowledge discovery process in credit card fraud detection. Fan, Prodromidis, and Stolfo (1999) suggested the application of distributed data mining in credit card fraud detection. Brause, Langsdorf, and Hepp (1999) developed an approach that involved advanced data mining techniques and neural network algorithms to obtain high fraud coverage. Chiu and Tsai (2004)proposed web services and data mining techniques to establish a collaborative scheme for fraud detection in the banking industry. Not much literature is available on addressing the risk faced by a bank from a merchant. This study aims to fill this existing gap. MERCHANT-ACQUIRER RELATIONSHIP Business Framework Merchants, who accept credit cards, get a Point of Sale (POS) device installed at their outlet. This POS device is installed through a bank. And this bank is known as the acquiring bank, or simply the acquirer. When a transaction is recorded by a merchant, a copy of the receipt signed by the customer is sent to the bank by the merchant for claiming the amount of transaction. The acquiring bank settles this within a stipulated number of days, after deducting a surcharge, known as discount rate. Figure 1 shows the flow of events once a transaction is done by a customer at a merchant. The merchant sends a copy of the signed receipt to the acquiring bank. The bank, in turn, settles the dues to the merchant, and also communicates the transaction details to the customer bank, via the Credit Card Association. Once this communication is received by the customer bank, it adds this transaction to the customer s statement of account. Figure 1: A Typical Credit Card Business Model Customer Bank Credit Card Association (Visa and Mastercard) Acquiring Bank Customer Merchant 68 FRAUD RISK PREDICTION IN MERCHANT-BANK RELATIONSHIP USING REGRESSION MODELING

3 Fraud Risk Fraud Risk is the possibility that a counter party engages in an activity which entails earning revenue through the means that act as a breach of contract with the signatory party. The authors discuss the cause of fraud risk in the merchant-acquirer relationship. Fraud risk is a challenge to the acquiring banks because the merchants always have a motivation to earn money through fraudulent ways, primarily because merchants look upon the discount rate charged by the banks as a cost, and they engage in activities in order to compensate for this perceived cost. These activities invariably breach the contract between the merchant and the bank, and are termed as fraud. To understand this, an example of a merchant selling high value product, with irregular business cycle is considered. Such a merchant could have a small customer base. This concept is termed as card member concentration, which means that most of the transactions at the merchant are initiated by a small group of customers and therefore the transactions are concentrated around a few customers. In such a scenario, the possibility of fraud increases tremendously, because the merchant and customer can tie up together for committing the fraud. Another scenario where fraud is possible is when a merchant uses his personal credit card, issued by the same bank as the acquiring bank, and swipes the card on his POS device. In this way, the merchant has not sold anything. He has just swiped his personal card, and submits the slip generated by the POS device to the acquiring bank for claiming the money. Typically, this money is credited in two to three days, after deducting 2-3 percent as the surcharge, also known as the discount rate. This amount will appear on the statement of the personal card of the merchant, which would need to be paid after 50 days approximately. Upon looking at this transaction from another perspective, it was realized that this amounted to taking a loan from the bank for 50 days, at a rate of 2-3 percent. This is a breach of contract by the merchant, and should be termed as fraud. The last scenario that is considered for fraud is the possibility of a merchant who has a low business volume under usual circumstances, but has some big spikes in the volume in the recent past. In such a scenario, the mathematical slope of the spike in volume is calculated, and if that slope is more than a certain cutoff value, then that merchant is tagged as a possible fraud case. FRAUD RISK MODELING IN MERCHANT- ACQUIRER RELATIONSHIP The Need An acquiring bank can have millions of merchants in its global portfolio. While all merchants would be risky at some level, all of them do not pose the same credit risk. The management of the bank cannot build risk management strategy for the entire merchant portfolio. Hence, it would be helpful for the bank if they can assign a probability of fraud (PF) score to the merchants based on the level of risk posed by them, and then sort the list of merchants in decreasing order of this probability score. The authors propose to build a statistical model, which would help the acquiring bank in the following ways: Generate PF scores for all merchants in the portfolio of the bank, which would help the bank in identifying the high risk merchants, and plan accordingly Act as an Early Warning System, which would give the acquiring bank some visibility into the potentially invisible fraud risk in the future The above-mentioned points enable the banks to create sufficient reserves in advance, in view of the fraud risk they foresee using the model. Also, the banks can do further analysis on the kind of industries, geographies, etc. that are posing more risk, and plan accordingly. Methodology As discussed earlier, the number of merchants in the global portfolio of the bank can be in millions. This creates a need for a statistical model. To make an effective model, the authors propose to first create segments of merchants, based on their perceived risk profile, and then create the required models for each of the segments. The methodology is described in Figure 2. Create Merchant Segments Figure 3 shows the Merchant Segments based on two attributes: Ability to pay and Willingness to pay. Merchants who are high on both these attributes are least risky, and a fraud risk model may not be required for them. Merchants who are low on willingness to pay are generally risky group of merchants. VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER

4 Figure 2: Fraud Risk Modeling Credit Risk Modelling Figure 3: Merchant Segmentation Framework Ability to Pay Low High Willingness to Pay High Low Lowest Risk Lower Risk Create Merchant Segments Identify the type of statistical model to be built Identify the variables suitable for the model for each segment Test the statistical significance of each variable Build models of each segment and validate the results using out of time-out of sample data High Risk Highest Risk While this classification is subjective in nature to some extent, it is not critical to get this classification absolutely right. This framework is more of a guiding tool, and even if some merchants do get misclassified, it does not impact the model because in the end, all merchants are risky, and models can be built for all the four segments in Figure 3. This paper has built the fraud risk model on merchants who have high ability and low willingness to pay. A model for the Highest Risk category needs to be created separately, but since the column of merchants in the High Risk category is the highest, it was decided to create a model for this segment. Identify the type of statistical model to be built In the fraud risk model built here, the dependent variable is whether a merchant will go into a fraudulent activity or not. This is a binary variable, with the value 1 representing the possibility that the merchant will do a fraud, and the value 0 representing the possibility that the merchant will not do a fraud. Hence, logistic regression was used to model the risk. Another choice that was made was based on the nature of model required. Broadly speaking, there are two types of models: Transaction level and Account level. The transaction level model gets triggered when transaction is recorded at a merchant and transaction variables are used. The drawback of this model is that if a merchant does not have too many transactions (As per industry standards, the merchant should have at least 100 transactions per day on an average in the last one year), or has seasonal transactions, then this model may not be useful because there will not be enough transactions to get a reliable PF score. To overcome this, one can use account level model which uses variables which are merchant specific, irrespective of whether a transaction is being recorded at the merchant or not. However, the drawback of this model is that the variables are not updated because they are based on merchant history rather than on fresh transactions. This paper builds a transaction level model to demonstrate its usefulness. Modeling Dataset This includes the time period for historical data that will be used in developing the model. For validation of the model, it is necessary to create an out of time-out of sample dataset. This paper uses the following timeframes: Model Development Dataset: It is based on the time period January December The time period chosen is one year assuming that most banks do not store data which is older than one year because of the enormity of data, and costs associated with storage. Model Validation Dataset: It begins from January 2012 and ends in December 2012, on a different sample of merchants. 70 FRAUD RISK PREDICTION IN MERCHANT-BANK RELATIONSHIP USING REGRESSION MODELING

5 Identify Suitable Variables Suitable variables for each segment need to be identified by combining business experience of the management, supported by statistical significance of the variable itself. Another thing that needs to be taken care of is the sign of the variable. The variable needs to have predictive power, but in the right direction Credit Risk PF Model Development A credit risk PF (Probability of Fraud) model was proposed to assess the risk associated with the merchantacquirer relationship, from an acquirer s perspective. As discussed, the logistic regression technique was adopted to build the model. Subsequently, the model was tested on out of time-out of sample data. Table 1 gives the summary of the model prerequisites that were used. Sample of Transaction level Data Tables 2a and 2b are only a snapshot of transaction level data. The exhaustive list of variables, which usually are hundreds in number, is not shown; only the variables finally used in the model are considered. Table 1: Summary of Modeling Framework Regression Technique Logistic Regression Timeframe Development Dataset January 2011 December 2011Validation Dataset January 2012 December 2012 Modeling Platform SAS, SQL Server Model Type Transaction level Model Merchant Segment High Risk (High Willingness and High Ability to Pay) The description of the variables is presented in Table 2b. Number of customers in the last five days as a variable was also tested for significance, and while it was found to be significant, it had high multi-collinearity with other variables, indicating that its presence in the model could inflate the error term in the regression. Hence, this variable was not included in the model. At the same time, this variable could be included in the model, but with the caveat that any bank including this variable in the model should test it across a much larger sample of data, to ensure that the influence of this variable is not exaggerated. After the identification of variables, logistic regression was applied on the framework discussed above. A typical logistic regression model is represented as: Table 2a: Variables used in PF Model Merchant_ID Trans_Dt Sub_Dt Trans_Amt Slope_Spike No_of_Cust_Last 5 Days CM_Conc /2/ /2/ /3/ /2/ /5/ /5/ /6/ /6/ /1/ /1/ /7/ /7/ /8/2011 1/9/ /9/ /9/ Table 2b: Description of Variables used in PF Model Variable Variable Description Variable Type Merchant_Id A unique identifier for the merchant Text Trans_dt Date of transaction between merchant and the customer Date Sub_dt Date on which the merchant submits the transaction slip to the acquiring bank Date Trans_amt Transaction amount Numeric CM_conc A measure of the concentration of card members at the merchant. Calculated as the ratio of Numeric number of transactions in the last 5 days at the merchant to the number of card members initiating those transactions Slope_spike The slope of the spike in current month s business volume over last month s volume Numeric No_of_cust_last 5 days Number of customers transacting at the merchant in the last 5 days Numeric VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER

6 z = β 0 + β 0 x 1 + β 1 x 2 + β 2 x 3 + β 3 x p = 1/(1 + e z ) 5.2 where, z is the linear function of the dependent variables. p is the probability of default, that is modelled. The SAS output of our model run is as shown in Table 2c. The full blown output is not shown here. It is clear from this output that: x 1 x 2 x 3 x 4 = Trans_amt = CM_conc = Slope_spike = No._of_cust_last5days AIC - This is the Akaike Information Criterion. It is calculated as AIC = -2 Log L + 2((k-1) + s), where k is the number of levels of the dependent variable and s is the number of predictors in the model. AIC is used for the comparison of non-nested models on the same sample. Ultimately, the model with the smallest AIC is considered the best, although the AIC value itself is not meaningful. SC - This is the Schwarz Criterion. It is defined as - 2 Log L + ((k-1) + s)*log ( fi), where fi s are the frequency values of the ith observation, and k and s were defined previously. Like AIC, SC penalizes for the number of predictors in the model and the smallest SC is most desirable and the value itself is not meaningful. -2 Log L - This is negative two times the log-likelihood. The -2 Log L is used in hypothesis tests for nested models and the value in itself is not meaningful. Likelihood Ratio - This is the Likelihood Ratio (LR) Chi- Square test that at least one of the predictors regression coefficients is not equal to zero in the model. The LR Chi- Square statistic can be calculated by -2 Log L (null model) - 2 Log L (fitted model) = = 71.05, where L(null model) refers to the Intercept Only model and L(fitted model) refers to the Intercept and Covariates model. Score - This is the Score Chi-Square Test that at least one of the predictors regression coefficients is not equal to zero in the model. Wald - This is the Wald Chi-Square Test that at least one of the predictors regression coefficient is not equal to zero in the model. McFadden s pseudo R 2 - This measure compares a random model with the model suggested in the paper. This R square is used to compare Table 2c: SAS Output of the Model Model Fit Statistics Criterion Intercept Only Intercept and Covariates AIC SC Log L Testing Global Null Hypothesis: BETA=0 Test Chi Square DF Pr>Chi Sq Likelihood ratio <.0001 Score <.0001 Wald <.0001 Analysis of Maximum Likelihood Estimates Parameter DF Estimate Standard Error Wald Chi-Square Pr>Chi Sq Intercept <.0001 Trans_amt <.0001 CM_conc <.0001 Slope_spike <.0001 No_of_cust_last 5 days < FRAUD RISK PREDICTION IN MERCHANT-BANK RELATIONSHIP USING REGRESSION MODELING

7 two models, and its value indicates the improvement achieved by one over the other. LL full model R 2 = 1 = 1 = LL intercept The value of McFadden s pseudo R 2 suggests that the model has a 34 percent improvement over a random model, or a model with no covariates. Logistic regression was run in SAS on the transaction level data. The SAS output showed the following values of the β coefficient. β 0 = , β 1 = , β 2 = , β 3 = , β 4 = In Figure 4, the graph, known as ROC (Receiver Operating Characteristics) curve shows that the model captures 63 percent defaults in the top 10 percent of the transaction data when the data is sorted in decreasing order of the PF score generated by the model. A high percentage of default capture in the top percentile of ROC curve indicates that the model is working as expected, and has been able to fit the data well. The steepness in ROC Curve indicates the amount by which using a model is better than not using a model at all. Figure 4: ROC (Receiver Operating Characteristics) Curve To understand this better, an example of the merchant having the id (row 1 in Table 2a) is considered. For this merchant, the z value comes out to be using equation 5.1. ROC Curve Using equation 5.2, the PF score of this merchant comes out to be 0.63 which means that this transaction has 63 percent probability to go into default. Similarly, this model can calculate the PF score of each transaction. Once the PF score is obtained from each transaction, the dataset is sorted in descending order of probability score and the data is then consolidated into deciles as shown in Table 3. Table 3: Percentile Distribution of Defaults captured by PF Model Percentile No. of Cumulative Percent Cumulative Defaults No. of Defaults No.of Defaults The 45 degree line in Figure 4 indicates the ROC curve when no model is used, and hence each merchant in the portfolio is assumed to have a 50 percent PF. The area between the actual ROC, originating from the use of the PF model, and the 45 degree line, indicates the incremental benefit emanating from the use of PF model. Once the probability score is obtained from each transaction, the dataset is sorted in descending order of probability score and then consolidated into deciles and the actual and predicted default rate calculated for each decile as shown in Table 4. Actual default rates are calculated as the ratio between the number of defaults in that decile and the total number of merchants in that decile. Predicted default rates are calculated in a similar way, but the numerator in this case represents the total number of merchants tagged as defaulters by the PF model. VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER

8 Table 4: Comparison of Actual Default Rates and Default Rates Predicted by the PF Model Percentile No. of No. of Actual Predicted Defaults Transactions Default Default Rate (%) Rate (%) Figure 5: Comparison of Actual vs Predicted Default Rate for the PF Model the merchant fraud risk. Moreover, managers would find it more efficient to strategize for a smaller set of risky merchants, rather than planning for the entire merchant portfolio. A possible obstacle in the use of this model could be the presence of millions of merchants in a bank s portfolio, thereby making the task of classification of these merchants using the Merchant Segmentation Framework very difficult. However, this can be easily overcome by creating geographical portfolios for merchants, instead of taking the entire global merchant base as a whole. As an example, a bank could have one million merchants in India, out of which a quarter of a million could be in the northern region. The bank can assess the northern portfolio first, and then the other regions one by one. Also, banks usually have regional teams which handle merchant relationships in their respective regions. These teams can help in using the Merchant Segmentation Framework. LIMITATIONS OF THE MODEL AND FUTURE RESEARCH Although this study has made some contribution towards a better understanding of the risk inherent in a merchantacquirer relationship, it is not without limitations. The PF model is sensitive to the explanatory variables chosen. It is also dependent, to an extent, on the modeling timeframe chosen, which makes it all the more important to test the model under different set of scenarios, and in different economies around the world. Figure 5 shows the comparison of actual default rate and the default rate predicted by the model for each decile and helps in understanding the model performance in real time. MANAGERIAL IMPLICATIONS This research on fraud risk arising from merchant acquirer relationship and the subsequent development of the PF model is founded on the concept of risk management. The approach employed to model the fraud risk from global credit card operations should be useful for modern day managers of a bank. With the help of the PF Model, managers should be able to strategize with the invisible fraud risk always in their mind. The PF model would enable the banks to keep in reserve a pool of funds in advance, to negate the losses arising from While the PF model can predict the possibility of a merchant going into default, it cannot predict the amount of default. Since this model is based on logistic regression, the dependent variable needs to be binary in nature. Therefore, to predict the amount of default, the PF model can be extended to another level, where a regression model can be built, with the dependent variable being the LGD (Loss Given Default). The model seems to have good explanatory abilities. However, it is pertinent to mention that the variables used in the PF are not the only set of variables that can be used. The choice of variables would depend upon the sample of data used, along with the nature of the data. Hence it becomes important to test this model under different conditions, and in different geographies. Also, it is important to mention that apart from disputes, other causes of 74 FRAUD RISK PREDICTION IN MERCHANT-BANK RELATIONSHIP USING REGRESSION MODELING

9 fraud risk could also be explored. While the PF model proposed in this paper is useful for both practitioners as well as researchers, caution needs to be exercised while generalizing the findings. In this paper, PF model was tested on sample data of a bank. Future research attempts should test the generalizability of the PF framework of different banks in different geographies. CONCLUSION The management of risk in global business scenario has taken more importance than ever. In the financial services sector, fraud risk is inherent in the nature of business. The aim of this research was to create an Early Warning System for banks to effectively manage the risk arising out of their merchant-acquirer relationship. To facilitate this, the PF (Probability of Fraud) Model was proposed. The PF model assigns a PF score to the entire merchant portfolio of the acquiring bank, which can be used by the banks to assess their merchant portfolio more effectively. In this paper, the business framework of the merchantacquirer relationship was explained, along with the ways in which fraud risk can arise from this association. The PF model was created in SAS and applied to the transactional data of a bank. It was observed that the model captured 63 percent defaults in the top decile of transactions, when sorted in decreasing order of PF scores. The PF model was also validated by comparing the actual defaults with those predicted by the model and it was found that there was a very good alignment between the two. This indicated that the model rank ordered the defaults quite close to reality. To summarize, the PF model is expected to act as a tool for acquiring banks in both analysing as well as strategizing for credit risk arising from the merchant side. REFERENCES Agarwal N., & Sharma M. (2013). Default risk modeling in credit cards: A study of merchant and aquiring bank relationship. IIMS Journal of Management Science, 4(1), Brause, R., Langsdorf, T., & Hepp, M. (1999). Neural data mining for credit card fraud detection. Proceedings of IEEE International Conference. Tools with Artificial Intelligence, pp Chiu, C., & Tsai, C. (2004). A web services-based collaborative scheme for credit card fraud detection. Proceedings of IEEE International Conference. e-technology, e-commerce and e Service, pp Clarke, M. (1994).Fraud and the politics of morality. Business Ethics: A European Review, 3(2), Cressey, D. R. (1953). Other people s money. Montclair, NJ: Patterson Smith. Fan, W., Prodromidis, A.L., & Stolfo, S.J. (1999). Distributed data mining in credit card fraud detection. IEEE Intelligent Systems, 14(6), pp George, E. (1992). Ethics in banking. Business Ethics: A European Review, 1(3), Ghosh, S., & Reilly, D.L., (1994). Credit card fraud detection with a neural-network. 27th Hawaii International Conference on Information Systems, Vol. 3 (2003), pp Molyneaux, D. (2007). Two case study scenarios in banking: A commentary on The Hutton Prize for Professional Ethics, 2004 and Business Ethics: A European Review, 16(4), Syeda, M., Zhang, Y. Q., & Pan, Y. (2002). Parallel granular networks for fast credit card fraud detection, Proceedings of IEEE International Conference on Fuzzy Systems, pp Nishant Agarwal is pursuing doctoral studies in the area of Accounting from Indian School of Business, Hyderabad. He has degrees in Engineering (Delhi College of Engineering) and MBA (Indian Institute of Management, Calcutta). Ha has more than 10 years of work experience spread across technology, consulting, risk management, and teaching. nishanta2006@ .iimcal.ac.in Meghna Sharma is an academician, researcher, and a consultant with over 18 years of corporate and teaching experience. She has done her Post Graduation in Economics, PGDBM in International Business, and a Ph.D. in Economics (Microfinance). She has several national and international publications to her credit. Currently, she is a faculty with Amity International Business School, Amity University, Uttar Pradesh. meghnaasharma@gmail.com VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER

10 76 VIKALPA VOLUME 39 NO 3 JULY - SEPTEMBER 2014

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Meta Learning Algorithms for Credit Card Fraud Detection

Meta Learning Algorithms for Credit Card Fraud Detection International Journal of Engineering Research and Development e-issn: 2278-67X, p-issn: 2278-8X, www.ijerd.com Volume 6, Issue 6 (March 213), PP. 16-2 Meta Learning Algorithms for Credit Card Fraud Detection

More information

Statistics in Retail Finance. Chapter 2: Statistical models of default

Statistics in Retail Finance. Chapter 2: Statistical models of default Statistics in Retail Finance 1 Overview > We consider how to build statistical models of default, or delinquency, and how such models are traditionally used for credit application scoring and decision

More information

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios

Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios Accurately and Efficiently Measuring Individual Account Credit Risk On Existing Portfolios By: Michael Banasiak & By: Daniel Tantum, Ph.D. What Are Statistical Based Behavior Scoring Models And How Are

More information

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541

Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 Using An Ordered Logistic Regression Model with SAS Vartanian: SW 541 libname in1 >c:\=; Data first; Set in1.extract; A=1; PROC LOGIST OUTEST=DD MAXITER=100 ORDER=DATA; OUTPUT OUT=CC XBETA=XB P=PROB; MODEL

More information

Logistic Regression. http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests

Logistic Regression. http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests Logistic Regression http://faculty.chass.ncsu.edu/garson/pa765/logistic.htm#sigtests Overview Binary (or binomial) logistic regression is a form of regression which is used when the dependent is a dichotomy

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Binary Logistic Regression

Binary Logistic Regression Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

More information

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation

More information

SUGI 29 Statistics and Data Analysis

SUGI 29 Statistics and Data Analysis Paper 194-29 Head of the CLASS: Impress your colleagues with a superior understanding of the CLASS statement in PROC LOGISTIC Michelle L. Pritchard and David J. Pasta Ovation Research Group, San Francisco,

More information

Cool Tools for PROC LOGISTIC

Cool Tools for PROC LOGISTIC Cool Tools for PROC LOGISTIC Paul D. Allison Statistical Horizons LLC and the University of Pennsylvania March 2013 www.statisticalhorizons.com 1 New Features in LOGISTIC ODDSRATIO statement EFFECTPLOT

More information

Virtual Site Event. Predictive Analytics: What Managers Need to Know. Presented by: Paul Arnest, MS, MBA, PMP February 11, 2015

Virtual Site Event. Predictive Analytics: What Managers Need to Know. Presented by: Paul Arnest, MS, MBA, PMP February 11, 2015 Virtual Site Event Predictive Analytics: What Managers Need to Know Presented by: Paul Arnest, MS, MBA, PMP February 11, 2015 1 Ground Rules Virtual Site Ground Rules PMI Code of Conduct applies for this

More information

VI. Introduction to Logistic Regression

VI. Introduction to Logistic Regression VI. Introduction to Logistic Regression We turn our attention now to the topic of modeling a categorical outcome as a function of (possibly) several factors. The framework of generalized linear models

More information

Ordinal Regression. Chapter

Ordinal Regression. Chapter Ordinal Regression Chapter 4 Many variables of interest are ordinal. That is, you can rank the values, but the real distance between categories is unknown. Diseases are graded on scales from least severe

More information

5. Multiple regression

5. Multiple regression 5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( ) Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

More information

Statistics, Data Analysis & Econometrics

Statistics, Data Analysis & Econometrics Using the LOGISTIC Procedure to Model Responses to Financial Services Direct Marketing David Marsh, Senior Credit Risk Modeler, Canadian Tire Financial Services, Welland, Ontario ABSTRACT It is more important

More information

Credit Card Fraud Detection Using Hidden Markov Model

Credit Card Fraud Detection Using Hidden Markov Model International Journal of Soft Computing and Engineering (IJSCE) Credit Card Fraud Detection Using Hidden Markov Model SHAILESH S. DHOK Abstract The most accepted payment mode is credit card for both online

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

More information

Data quality in Accounting Information Systems

Data quality in Accounting Information Systems Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

More information

Factors affecting online sales

Factors affecting online sales Factors affecting online sales Table of contents Summary... 1 Research questions... 1 The dataset... 2 Descriptive statistics: The exploratory stage... 3 Confidence intervals... 4 Hypothesis tests... 4

More information

Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management

Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management Credit Scorecards for SME Finance The Process of Improving Risk Measurement and Management April 2009 By Dean Caire, CFA Most of the literature on credit scoring discusses the various modelling techniques

More information

Auto Days 2011 Predictive Analytics in Auto Finance

Auto Days 2011 Predictive Analytics in Auto Finance Auto Days 2011 Predictive Analytics in Auto Finance Vick Panwar SAS Risk Practice Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Introduction Changing Risk Landscape - Key Drivers and Challenges

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

Despite its emphasis on credit-scoring/rating model validation,

Despite its emphasis on credit-scoring/rating model validation, RETAIL RISK MANAGEMENT Empirical Validation of Retail Always a good idea, development of a systematic, enterprise-wide method to continuously validate credit-scoring/rating models nonetheless received

More information

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through

More information

Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA

Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA Segmentation For Insurance Payments Michael Sherlock, Transcontinental Direct, Warminster, PA ABSTRACT An online insurance agency has built a base of names that responded to different offers from various

More information

Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM

Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Paper AA-08-2015 Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Delali Agbenyegah, Alliance Data Systems, Columbus, Ohio 0.0 ABSTRACT Traditional

More information

Modeling Lifetime Value in the Insurance Industry

Modeling Lifetime Value in the Insurance Industry Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting

More information

Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail.

Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail. Soft Computing Tools in Credit card fraud & Detection Rashmi G.Dukhi G.H.Raisoni Institute of Information & Technology, Nagpur rashmidukhi25@gmail.com Abstract Fraud is one of the major ethical issues

More information

(M.S.), INDIA. Keywords: Internet, SQL injection, Filters, Session tracking, E-commerce Security, Online shopping.

(M.S.), INDIA. Keywords: Internet, SQL injection, Filters, Session tracking, E-commerce Security, Online shopping. Securing Web Application from SQL Injection & Session Tracking 1 Pranjali Gondane, 2 Dinesh. S. Gawande, 3 R. D. Wagh, 4 S.B. Lanjewar, 5 S. Ugale 1 Lecturer, Department Computer Science & Engineering,

More information

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES

KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within

More information

Bachelor's Degree in Business Administration and Master's Degree course description

Bachelor's Degree in Business Administration and Master's Degree course description Bachelor's Degree in Business Administration and Master's Degree course description Bachelor's Degree in Business Administration Department s Compulsory Requirements Course Description (402102) Principles

More information

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS

ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS DATABASE MARKETING Fall 2015, max 24 credits Dead line 15.10. ASSIGNMENT 4 PREDICTIVE MODELING AND GAINS CHARTS PART A Gains chart with excel Prepare a gains chart from the data in \\work\courses\e\27\e20100\ass4b.xls.

More information

International Statistical Institute, 56th Session, 2007: Phil Everson

International Statistical Institute, 56th Session, 2007: Phil Everson Teaching Regression using American Football Scores Everson, Phil Swarthmore College Department of Mathematics and Statistics 5 College Avenue Swarthmore, PA198, USA E-mail: peverso1@swarthmore.edu 1. Introduction

More information

Credit Risk Analysis Using Logistic Regression Modeling

Credit Risk Analysis Using Logistic Regression Modeling Credit Risk Analysis Using Logistic Regression Modeling Introduction A loan officer at a bank wants to be able to identify characteristics that are indicative of people who are likely to default on loans,

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model

A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model A Study of Detecting Credit Card Delinquencies with Data Mining using Decision Tree Model ABSTRACT Mrs. Arpana Bharani* Mrs. Mohini Rao** Consumer credit is one of the necessary processes but lending bears

More information

Benchmarking of different classes of models used for credit scoring

Benchmarking of different classes of models used for credit scoring Benchmarking of different classes of models used for credit scoring We use this competition as an opportunity to compare the performance of different classes of predictive models. In particular we want

More information

Paper D10 2009. Ranking Predictors in Logistic Regression. Doug Thompson, Assurant Health, Milwaukee, WI

Paper D10 2009. Ranking Predictors in Logistic Regression. Doug Thompson, Assurant Health, Milwaukee, WI Paper D10 2009 Ranking Predictors in Logistic Regression Doug Thompson, Assurant Health, Milwaukee, WI ABSTRACT There is little consensus on how best to rank predictors in logistic regression. This paper

More information

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data

A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt

More information

Advanced analytics at your hands

Advanced analytics at your hands 2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously

More information

A Property & Casualty Insurance Predictive Modeling Process in SAS

A Property & Casualty Insurance Predictive Modeling Process in SAS Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

More information

Charles Secolsky County College of Morris. Sathasivam 'Kris' Krishnan The Richard Stockton College of New Jersey

Charles Secolsky County College of Morris. Sathasivam 'Kris' Krishnan The Richard Stockton College of New Jersey Using logistic regression for validating or invalidating initial statewide cut-off scores on basic skills placement tests at the community college level Abstract Charles Secolsky County College of Morris

More information

Penalized regression: Introduction

Penalized regression: Introduction Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood

More information

Business Intelligence. Tutorial for Rapid Miner (Advanced Decision Tree and CRISP-DM Model with an example of Market Segmentation*)

Business Intelligence. Tutorial for Rapid Miner (Advanced Decision Tree and CRISP-DM Model with an example of Market Segmentation*) Business Intelligence Professor Chen NAME: Due Date: Tutorial for Rapid Miner (Advanced Decision Tree and CRISP-DM Model with an example of Market Segmentation*) Tutorial Summary Objective: Richard would

More information

Identifying SPAM with Predictive Models

Identifying SPAM with Predictive Models Identifying SPAM with Predictive Models Dan Steinberg and Mikhaylo Golovnya Salford Systems 1 Introduction The ECML-PKDD 2006 Discovery Challenge posed a topical problem for predictive modelers: how to

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

Alex Vidras, David Tysinger. Merkle Inc.

Alex Vidras, David Tysinger. Merkle Inc. Using PROC LOGISTIC, SAS MACROS and ODS Output to evaluate the consistency of independent variables during the development of logistic regression models. An example from the retail banking industry ABSTRACT

More information

Predictive modelling around the world 28.11.13

Predictive modelling around the world 28.11.13 Predictive modelling around the world 28.11.13 Agenda Why this presentation is really interesting Introduction to predictive modelling Case studies Conclusions Why this presentation is really interesting

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

More information

Analyzing CRM Results: It s Not Just About Software and Technology

Analyzing CRM Results: It s Not Just About Software and Technology Analyzing CRM Results: It s Not Just About Software and Technology One of the key factors in conducting successful CRM programs is the ability to both track and interpret results. Many of the technological

More information

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia

The Impact of Big Data on Classic Machine Learning Algorithms. Thomas Jensen, Senior Business Analyst @ Expedia The Impact of Big Data on Classic Machine Learning Algorithms Thomas Jensen, Senior Business Analyst @ Expedia Who am I? Senior Business Analyst @ Expedia Working within the competitive intelligence unit

More information

Predicting Customer Default Times using Survival Analysis Methods in SAS

Predicting Customer Default Times using Survival Analysis Methods in SAS Predicting Customer Default Times using Survival Analysis Methods in SAS Bart Baesens Bart.Baesens@econ.kuleuven.ac.be Overview The credit scoring survival analysis problem Statistical methods for Survival

More information

Fraud Detection in Online Banking Using HMM

Fraud Detection in Online Banking Using HMM 2012 International Conference on Information and Network Technology (ICINT 2012) IPCSIT vol. 37 (2012) (2012) IACSIT Press, Singapore Fraud Detection in Online Banking Using HMM Sunil Mhamane + and L.M.R.J

More information

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@

Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Developing Risk Adjustment Techniques Using the SAS@ System for Assessing Health Care Quality in the lmsystem@ Yanchun Xu, Andrius Kubilius Joint Commission on Accreditation of Healthcare Organizations,

More information

A Comparison of Variable Selection Techniques for Credit Scoring

A Comparison of Variable Selection Techniques for Credit Scoring 1 A Comparison of Variable Selection Techniques for Credit Scoring K. Leung and F. Cheong and C. Cheong School of Business Information Technology, RMIT University, Melbourne, Victoria, Australia E-mail:

More information

Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

More information

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

More information

Statistical Machine Learning

Statistical Machine Learning Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

More information

HLM software has been one of the leading statistical packages for hierarchical

HLM software has been one of the leading statistical packages for hierarchical Introductory Guide to HLM With HLM 7 Software 3 G. David Garson HLM software has been one of the leading statistical packages for hierarchical linear modeling due to the pioneering work of Stephen Raudenbush

More information

VIANELLO FORENSIC CONSULTING, L.L.C.

VIANELLO FORENSIC CONSULTING, L.L.C. VIANELLO FORENSIC CONSULTING, L.L.C. 6811 Shawnee Mission Parkway, Suite 310 Overland Park, KS 66202 (913) 432-1331 THE MARKETING PERIOD OF PRIVATE SALES TRANSACTIONS By Marc Vianello, CPA, ABV, CFF 1

More information

Calculating the Probability of Returning a Loan with Binary Probability Models

Calculating the Probability of Returning a Loan with Binary Probability Models Calculating the Probability of Returning a Loan with Binary Probability Models Associate Professor PhD Julian VASILEV (e-mail: vasilev@ue-varna.bg) Varna University of Economics, Bulgaria ABSTRACT The

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

Table of Contents. June 2010

Table of Contents. June 2010 June 2010 From: StatSoft Analytics White Papers To: Internal release Re: Performance comparison of STATISTICA Version 9 on multi-core 64-bit machines with current 64-bit releases of SAS (Version 9.2) and

More information

SUMAN DUVVURU STAT 567 PROJECT REPORT

SUMAN DUVVURU STAT 567 PROJECT REPORT SUMAN DUVVURU STAT 567 PROJECT REPORT SURVIVAL ANALYSIS OF HEROIN ADDICTS Background and introduction: Current illicit drug use among teens is continuing to increase in many countries around the world.

More information

Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

Dr. Pushpa Bhatt, Sumangala JK Department of Commerce, Bangalore University, India pushpa_bhatt12@rediffmail.com; sumangalajkashok@gmail.

Dr. Pushpa Bhatt, Sumangala JK Department of Commerce, Bangalore University, India pushpa_bhatt12@rediffmail.com; sumangalajkashok@gmail. Journal of Finance, Accounting and Management, 3(2), 1-14, July 2012 1 Impact of Earnings per share on Market Value of an equity share: An Empirical study in Indian Capital Market Dr. Pushpa Bhatt, Sumangala

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Mobile Phone APP Software Browsing Behavior using Clustering Analysis Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis

More information

Lina Warrad. Applied Science University, Amman, Jordan

Lina Warrad. Applied Science University, Amman, Jordan Journal of Modern Accounting and Auditing, March 2015, Vol. 11, No. 3, 168-174 doi: 10.17265/1548-6583/2015.03.006 D DAVID PUBLISHING The Effect of Net Working Capital on Jordanian Industrial and Energy

More information

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I

Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting

More information

Chapter 29 The GENMOD Procedure. Chapter Table of Contents

Chapter 29 The GENMOD Procedure. Chapter Table of Contents Chapter 29 The GENMOD Procedure Chapter Table of Contents OVERVIEW...1365 WhatisaGeneralizedLinearModel?...1366 ExamplesofGeneralizedLinearModels...1367 TheGENMODProcedure...1368 GETTING STARTED...1370

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

Product recommendations and promotions (couponing and discounts) Cross-sell and Upsell strategies

Product recommendations and promotions (couponing and discounts) Cross-sell and Upsell strategies WHITEPAPER Today, leading companies are looking to improve business performance via faster, better decision making by applying advanced predictive modeling to their vast and growing volumes of data. Business

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring Non-life insurance mathematics Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring Overview Important issues Models treated Curriculum Duration (in lectures) What is driving the result of a

More information

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4. Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

More information

The Probit Link Function in Generalized Linear Models for Data Mining Applications

The Probit Link Function in Generalized Linear Models for Data Mining Applications Journal of Modern Applied Statistical Methods Copyright 2013 JMASM, Inc. May 2013, Vol. 12, No. 1, 164-169 1538 9472/13/$95.00 The Probit Link Function in Generalized Linear Models for Data Mining Applications

More information

Residential mortgage default risk and the loan-tovalue

Residential mortgage default risk and the loan-tovalue Residential mortgage default risk and the loan-tovalue ratio by Jim Wong, Laurence Fung, Tom Fong and Angela Sze of the Research Department Many residential mortgage holders in Hong Kong have experienced

More information

Executive Master's in Business Administration Program

Executive Master's in Business Administration Program Executive Master's in Business Administration Program College of Business Administration 1. Introduction \ Program Mission: The UOS EMBA program has been designed to deliver high quality management education

More information

Mobile Money in Developing Countries: Common Factors. Khyati Malik

Mobile Money in Developing Countries: Common Factors. Khyati Malik Mobile Money in Developing Countries: Common Factors Khyati Malik Abstract In this paper, we address many questions related to the adoption of mobile money in developing countries. Is mobile money being

More information

IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

More information

Fraud Detection in Credit Card Using DataMining Techniques Mr.P.Matheswaran 1,Mrs.E.Siva Sankari ME 2,Mr.R.Rajesh 3

Fraud Detection in Credit Card Using DataMining Techniques Mr.P.Matheswaran 1,Mrs.E.Siva Sankari ME 2,Mr.R.Rajesh 3 Fraud Detection in Credit Card Using DataMining Techniques Mr.P.Matheswaran 1,Mrs.E.Siva Sankari ME 2,Mr.R.Rajesh 3 1 P.G. Student, Department of CSE, Govt.College of Engineering, Thirunelveli, India.

More information

7 Time series analysis

7 Time series analysis 7 Time series analysis In Chapters 16, 17, 33 36 in Zuur, Ieno and Smith (2007), various time series techniques are discussed. Applying these methods in Brodgar is straightforward, and most choices are

More information

Chapter VIII Customers Perception Regarding Health Insurance

Chapter VIII Customers Perception Regarding Health Insurance Chapter VIII Customers Perception Regarding Health Insurance This chapter deals with the analysis of customers perception regarding health insurance and involves its examination at series of stages i.e.

More information

LOGIT AND PROBIT ANALYSIS

LOGIT AND PROBIT ANALYSIS LOGIT AND PROBIT ANALYSIS A.K. Vasisht I.A.S.R.I., Library Avenue, New Delhi 110 012 amitvasisht@iasri.res.in In dummy regression variable models, it is assumed implicitly that the dependent variable Y

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

Predictive Analytics Tools and Techniques

Predictive Analytics Tools and Techniques Global Journal of Finance and Management. ISSN 0975-6477 Volume 6, Number 1 (2014), pp. 59-66 Research India Publications http://www.ripublication.com Predictive Analytics Tools and Techniques Mr. Chandrashekar

More information

Module 14: Missing Data Stata Practical

Module 14: Missing Data Stata Practical Module 14: Missing Data Stata Practical Jonathan Bartlett & James Carpenter London School of Hygiene & Tropical Medicine www.missingdata.org.uk Supported by ESRC grant RES 189-25-0103 and MRC grant G0900724

More information

How to Get More Value from Your Survey Data

How to Get More Value from Your Survey Data Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2

More information

Regression 3: Logistic Regression

Regression 3: Logistic Regression Regression 3: Logistic Regression Marco Baroni Practical Statistics in R Outline Logistic regression Logistic regression in R Outline Logistic regression Introduction The model Looking at and comparing

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information