Elementary Data Analysis
  Group Comparison & One-way ANOVA
  Non-parametric Tests
  Correlations
General Linear Regression
Logistic Models
  Binary Logistic Model
  Ordinal Logistic Model
  Multinomial Logistic Model

The Explore procedure
Exploratory data analysis:
  Summary statistics
  Distribution plots
  Normality plots with tests
Analyze > Descriptive Statistics > Explore
The dependent variable must be a scale variable, while the grouping variables may be ordinal or nominal.

E.g.: demo.sav
Explore the household income for married and unmarried people.

The Crosstabulation table
Analysis of cross-classifications: to examine the relationship between two categorical variables.
  Nominal-by-nominal relationships
  Ordinal-by-ordinal relationships
  Nominal-by-interval relationships
  Relative risk measurement
  Agreement measurement
Analyze > Descriptive Statistics > Crosstabs

E.g.: Test and measure the relationship between job satisfaction and the number of years with the current employer.
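
As a rough illustration of the kind of test behind a crosstabulation, the sketch below computes the Pearson chi-square statistic for a contingency table of counts. The function name and the toy counts are assumptions made for illustration; they are not taken from the training data file.

```python
def chi_square(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical 2 x 2 table of counts (illustration only)
stat, df = chi_square([[10, 20], [20, 10]])
print(f"chi-square = {stat:.3f}, df = {df}")
```

A large statistic relative to its degrees of freedom suggests the two classifications are related; the p-value itself would come from the chi-square distribution, which this sketch leaves out.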

T Tests
  The one-sample T test
  The paired-samples T test (skip)
  The independent-samples T test
Analyze > Compare Means

E.g.:
1. Test if the average household income equals 75k.
2. Test if the average household income differs significantly between married and unmarried people.
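
The two examples above correspond to a one-sample and an independent-samples t statistic. The following is a minimal sketch of both, assuming equal variances for the two-sample case; the function names and the income values are hypothetical and are not drawn from demo.sav.

```python
import math
from statistics import mean, variance

def one_sample_t(sample, mu0):
    """Return (t, df) for H0: the population mean equals mu0."""
    n = len(sample)
    se = math.sqrt(variance(sample) / n)  # standard error of the mean
    return (mean(sample) - mu0) / se, n - 1

def pooled_two_sample_t(a, b):
    """Return (t, df) for H0: equal means, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2

# Hypothetical household incomes in thousands (illustration only)
married = [72.0, 80.0, 65.0, 90.0, 78.0]
unmarried = [60.0, 75.0, 58.0, 70.0, 67.0]
t1, df1 = one_sample_t(married, 75.0)
t2, df2 = pooled_two_sample_t(married, unmarried)
print(f"one-sample: t = {t1:.3f}, df = {df1}")
print(f"two-sample: t = {t2:.3f}, df = {df2}")
```

The p-value would be read from a t distribution with the returned degrees of freedom, which is what the software output reports alongside the statistic.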

One-way analysis of variance
Tests the hypothesis that the means of two or more groups are not significantly different.
E.g.: Test if the average household income differs among the five education groups. If there is a significant difference, identify which groups differ the most.
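
To make the hypothesis concrete, here is a small sketch of the F statistic that one-way ANOVA is built on: the ratio of between-group to within-group mean squares. The function name and the toy groups are assumptions; identifying which groups differ would need additional pairwise comparisons that are not shown here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three groups (illustration only)
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F = {f_stat:.3f} on ({df_b}, {df_w}) degrees of freedom")
```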

Non-parametric tests
  Two-independent-samples tests
  Tests for several independent samples
Analyze > Nonparametric Tests > 2 Independent Samples
Analyze > Nonparametric Tests > K Independent Samples

Two-independent-samples tests: Test if the household income varies between married and unmarried people.
Mann-Whitney and Wilcoxon tests
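
The Mann-Whitney and Wilcoxon statistics are both derived from the ranks of the pooled observations. The sketch below, with hypothetical helper names and toy data, ranks the combined samples (averaging tied ranks) and returns the U statistics together with the rank sum of the first sample.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(a, b):
    """Return (U_a, U_b, W_a), where W_a is the rank sum of sample a."""
    ranks = average_ranks(list(a) + list(b))
    w_a = sum(ranks[: len(a)])
    u_a = w_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b, w_a

# Hypothetical incomes for two groups (illustration only)
print(mann_whitney_u([52, 61, 48, 70], [45, 50, 39, 58]))
```

A U value near zero or near the product of the two sample sizes indicates that one group tends to rank consistently above the other.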

The two-sample Kolmogorov-Smirnov test

K-independent-samples tests: Test if the household income varies among people with different education levels.
Using the median test to detect the difference

Using Kruskal-Wallis to Test Ordinal Outcomes

Correlation
Analyze > Correlation
  Bivariate correlation: Describes the relationship between two variables.
  Partial correlation: Describes the relationship between two variables while controlling for the effects of one or more additional variables.
  Distances (skip): Similarities/dissimilarities between pairs of variables or cases.

E.g.: Compute the correlation coefficient between the household income and the price of the primary vehicle.

E.g.: Compute the correlation coefficient between the household income and the price of the primary vehicle after controlling for age, marital status and education.
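
The difference between a bivariate and a partial correlation can be sketched in a few lines. The example below assumes a single control variable (the slide example controls for three, which requires a more general formula); the function names and toy data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical data (illustration only): z is uncorrelated with x and y here,
# so controlling for it leaves the correlation unchanged.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
z = [1, -1, -1, 1]
print(pearson_r(x, y), partial_r(x, y, z))
```

When the control variable is correlated with both variables of interest, the partial correlation can differ substantially from the bivariate one, which is the point of the second example.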

The GLM Univariate procedure
Analyze > General Linear Model > Univariate
Analyze > Regression > Linear
In SPSS, there are two menus for linear regression under the Analyze pull-down menu. The difference between them is that Linear under Regression treats every predictor as a continuous variable, while General Linear Model provides options to define both categorical and continuous predictors.

Factors: Categorical predictors should be selected as factors in the model.
  Fixed-effects factors are generally thought of as variables whose values of interest are all represented in the data file.
  Random-effects factors are variables whose values in the data file can be considered a random sample from a larger population of values. They are useful for explaining excess variability in the dependent variable.
Covariates: Scale predictors should be selected as covariates in the model.
E.g.: grocery_1month.sav
Use the GLM Univariate procedure to fit a model with fixed and random effects to the amounts customers spent in grocery stores.

E.g.: A grocery store chain surveyed a set of customers concerning their purchasing habits. Given the survey results and how much each customer spent in the previous month, the store wants to identify the factors related to the amount spent.

Variable   Variable information
storeid    Store id
shopfor    Who the customer shops for: 1 = self; 2 = self and spouse; 3 = self and family
style      Shopping style: 1 = biweekly, in bulk; 2 = weekly, similar items; 3 = often, what's on sale
usecoup    Use coupons: 1 = no; 2 = from newspaper; 3 = from mailings; 4 = from both
amtspent   Amount spent last month

Logistic Models
Binary logistic model: dichotomous response outcomes, e.g. presence or absence of an event
  $\pi_i = E(y_i \mid x_i)$
  $\operatorname{logit}(\pi) = \log\left(\frac{\pi}{1-\pi}\right) = g(\pi) = \alpha + \beta X$
Ordinal logistic model: ordinal response variable with more than two ordered categories, e.g. a 5-point Likert scale
  $g(\Pr(Y \le i \mid X)) = \alpha_i + \beta' X, \quad i = 1, \dots, k$
Multinomial logistic model: nominal response variable with more than two categories, e.g. different types of programs in school
  $\log \frac{\Pr(Y = i \mid X)}{\Pr(Y = k+1 \mid X)} = \alpha_i + \beta_i' X, \quad i = 1, \dots, k$
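
For the binary model, the logit link and its inverse are easy to check numerically. The sketch below uses hypothetical values of α and β (not estimates from any data set) to show that a one-unit increase in X multiplies the odds by exp(β).

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

def inv_logit(eta):
    """Probability corresponding to a linear predictor eta."""
    return 1 / (1 + math.exp(-eta))

def odds(p):
    return p / (1 - p)

# Hypothetical coefficients (illustration only)
alpha, beta = -1.0, 0.5
for x in (0, 1, 2):
    p = inv_logit(alpha + beta * x)
    print(f"x = {x}: probability = {p:.3f}, odds = {odds(p):.3f}")

# The ratio of odds for consecutive x values equals exp(beta)
ratio = odds(inv_logit(alpha + beta)) / odds(inv_logit(alpha))
print(f"odds ratio = {ratio:.3f}, exp(beta) = {math.exp(beta):.3f}")
```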

Using binary logistic regression to assess credit risk
E.g.: bankloan.sav
Analyze > Regression > Binary Logistic
If you are a loan officer at a bank, you want to be able to identify characteristics that are indicative of people who are likely to default on loans, and use those characteristics to identify good and bad credit risks.
We have information on 850 past and prospective customers. The first 700 cases are customers who were previously given loans. Use these 700 customers to create a logistic regression model. Then use the model to classify the 150 prospective customers as good or bad credit risks.

Variable   Variable information
age        Age in years
ed         Level of education: 1 = didn't complete high school; 2 = high school degree; 3 = college degree; 4 = undergraduate; 5 = postgraduate
employ     Years with current employer
address    Years at current address
income     Household income in thousands
debtinc    Debt to income ratio (*100)
creddebt   Credit card debt in thousands
othdebt    Other debts in thousands
default    Previously defaulted: 1 = Yes; 0 = No

Step 1: Select the first 700 observations for the logistic model
Data > Select Cases

Step 2: Construct the logistic model

Step 3: Identify possible outlying observations
Scatter plot: predicted probability vs. Cook's D statistic

Step 4: Draw the ROC curve and find the optimal cut-off point
Analyze > ROC Curve
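
The slides do not state which criterion defines the "optimal" cut-off. One common choice, assumed here, is the cut-off that maximises Youden's index (sensitivity + specificity - 1). The sketch below builds the ROC points from observed outcomes and predicted probabilities; the function names and the toy data are hypothetical.

```python
def roc_points(y_true, scores):
    """Return a list of (cutoff, sensitivity, 1 - specificity) tuples."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = []
    for cutoff in sorted(set(scores)):
        # A case is predicted positive when its score is at or above the cutoff
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cutoff)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cutoff)
        points.append((cutoff, tp / positives, fp / negatives))
    return points

def youden_cutoff(y_true, scores):
    """Return the cutoff maximising sensitivity + specificity - 1."""
    return max(roc_points(y_true, scores), key=lambda p: p[1] - p[2])[0]

# Hypothetical outcomes (1 = default) and predicted probabilities
y = [0, 0, 1, 1]
p_hat = [0.1, 0.2, 0.8, 0.9]
print(roc_points(y, p_hat))
print("chosen cut-off:", youden_cutoff(y, p_hat))
```

Other criteria, such as weighting false positives and false negatives by their business cost, would lead to a different cut-off from the same curve.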

Step 5: Predict credit risk
Select the remaining 150 observations and compute their predicted probabilities from the model coefficients.
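
Scoring a new case amounts to plugging its values into the fitted linear predictor and applying the inverse logit. The sketch below uses made-up coefficients for three of the bankloan.sav variables; in practice they would be copied from the "Variables in the Equation" table of the fitted model, and the 0.5 cut-off would be replaced by the one chosen in Step 4.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT estimates
# from bankloan.sav.
INTERCEPT = -0.8
COEFS = {"employ": -0.25, "debtinc": 0.09, "creddebt": 0.55}

def default_probability(case):
    """Predicted probability of default for one case (a dict of predictors)."""
    eta = INTERCEPT + sum(COEFS[name] * case[name] for name in COEFS)
    return 1 / (1 + math.exp(-eta))

def classify(case, cutoff=0.5):
    """Label a case as a bad risk when its probability reaches the cutoff."""
    return "bad risk" if default_probability(case) >= cutoff else "good risk"

# Hypothetical prospective customer
applicant = {"employ": 0, "debtinc": 20, "creddebt": 2}
print(round(default_probability(applicant), 3), classify(applicant))
```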

Using Multinomial Logistic Regression to Classify Telecommunications Customers
E.g.: telco.sav
A telecommunications provider has segmented its customer base by service usage patterns, categorizing the customers into four groups. If demographic data can be used to predict group membership, you can customize offers for individual prospective customers.
Analyze > Regression > Multinomial Logistic
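
In the multinomial model, each non-reference category has its own linear predictor, and the last category serves as the baseline. A minimal sketch of how those linear predictors turn into group-membership probabilities follows; the function name and the numeric values are hypothetical.

```python
import math

def multinomial_probs(etas):
    """Convert linear predictors for categories 1..k into k+1 probabilities.

    The reference category k+1 has an implicit linear predictor of 0 and is
    returned as the last element.
    """
    exps = [math.exp(e) for e in etas]
    denom = 1 + sum(exps)
    return [e / denom for e in exps] + [1 / denom]

# Hypothetical linear predictors for three of four customer groups
probs = multinomial_probs([0.4, -0.2, 1.1])
print([round(p, 3) for p in probs])
# The predicted group is the one with the largest probability
print("predicted group:", probs.index(max(probs)) + 1)
```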

Using Ordinal Regression to Build a Credit Scoring Model
  $g(\Pr(Y \le i \mid X)) = \alpha_i + \beta' X, \quad i = 1, \dots, k$
E.g.: german_credit.sav
A creditor wants to be able to determine whether an applicant is a good credit risk, given various financial and personal characteristics. From their customer database, the response (dependent) variable is account status, with five ordinal levels: no debt history, no current debt, debt payments current, debt payments past due, and critical account. Potential predictors consist of various financial and personal characteristics of applicants, including age, number of credits at the bank, housing type, checking account status, and so on.

Analyze > Regression > Ordinal
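
The cumulative model gives probabilities of being at or below each level; individual category probabilities are obtained by differencing. The sketch below assumes a logit link for g and follows the sign convention of the formula on this slide. Sign conventions for β differ between packages, so check the software output before reusing coefficients. The function name and the threshold values are hypothetical.

```python
import math

def ordinal_probs(thresholds, eta):
    """Return k+1 category probabilities from a cumulative logit model.

    thresholds: increasing values alpha_1..alpha_k
    eta: the linear predictor beta'x for one case
    """
    # Pr(Y <= i) for i = 1..k, plus Pr(Y <= k+1) = 1
    cumulative = [1 / (1 + math.exp(-(a + eta))) for a in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # Pr(Y = i) = Pr(Y <= i) - Pr(Y <= i-1)
        previous = c
    return probs

# Hypothetical thresholds for five account-status levels and one applicant
probs = ordinal_probs([-2.0, -1.0, 0.5, 2.0], 0.3)
print([round(p, 3) for p in probs])
```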

Advanced Models
30 May, Friday, 10am-12pm, Training Library Level 5
  Curve Estimation
  Non-linear Regression
  Survival Analysis
  Linear Mixed Model
  Time-series Data Analysis