PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12




14 The Chi-squared distribution

If a normal variable X, having mean µ and variance σ², is standardised, the new variable Z has mean 0 and variance 1. When this standardised variable is squared, it is said to follow a chi-squared distribution with one degree of freedom, denoted by χ²(1). In general, the sum of the squares of n independent standardised normal variables follows a chi-squared distribution with n degrees of freedom.

The shape of the chi-squared distribution is a function of its number of degrees of freedom, denoted by the Greek letter ν (nu). The following diagram gives the shapes for ν = 1, ν = 2 and ν ≥ 3.

[Fig. 14.1: curves of f(x) against x for ν = 1, ν = 2 and ν ≥ 3 (generally)]

14.1 The chi-squared test for independence

The chi-squared distribution is very often used in hypothesis testing, especially when establishing whether two factors are independent or associated. This is done by building a contingency table involving the attributes of these factors. This type of testing differs from our usual way of testing for a population parameter (mean or proportion) in the sense that the statements of the null and alternative hypotheses are very straightforward:

H₀: the factors are independent
H₁: the factors are not independent (or associated)

Note: Independence is always assumed in the null hypothesis and, obviously, the factors have to be stated in the context of the problem.

The problem is normally presented as a table of observed frequencies of the factors in terms of their attributes (subfactors). We are required to test whether there is a significant difference between these observed frequencies and their theoretical (expected) counterparts. Since we always assume that the null hypothesis is true in any testing procedure, we calculate the expected frequencies on the assumption that the factors are independent. These expected frequencies are thus computed according to probability theory whereby, if two variables (factors) A and B are independent, P(A ∩ B) = P(A) P(B). It is then a matter of computing a test statistic to find the significance of the difference. This is known as the chi-squared test statistic and is defined as

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where n is the number of frequency cells in the contingency table, and O_i and E_i are the i-th observed and expected frequencies respectively. The test-statistic value is compared with a critical chi-squared value (from a table of chi-squared values) before deciding whether the null hypothesis is accepted or rejected; the significance level of the test is usually given.

Note: In this course, we assume that the shape of the chi-squared distribution will always resemble that for the case ν ≥ 3 (even though we may have ν = 1 or ν = 2 in the exams).

The above explanation is hard to follow without an example to illustrate it.

14.2 Example

The members of a sports team are interested in whether the weather has an effect on their results. They play 50 matches, with the following results.

              Weather
Result    Good    Bad    Total
Win         12      4       16
Draw         5      8       13
Lose         7     14       21
Total       24     26       50

Formulate suitable null and alternative hypotheses, and use a χ² test to test the claim, at the 1% significance level, that the weather has no effect on the team's results. State your conclusion clearly.

Solution

We start by formulating our null and alternative hypotheses:

H₀: results are independent of the weather
H₁: results are affected by the weather

The above contingency table is of order 3 × 2 (pronounced "3 by 2"): it has three result rows (win, draw, lose) and two weather columns (good, bad).

Note: The general formula for the number of degrees of freedom for an m × n contingency table is ν = (m − 1)(n − 1).

The expected frequencies are computed, as mentioned above, by using the multiplicative rule for independent events. Let us compute the expected frequency for the cell "good and win", which has an observed frequency of 12. If a result is selected at random, the probability that it is a win is 16/50 and the probability that the weather was good on that day is 24/50. Assuming that the null hypothesis is true, that is, result and weather are independent, the probability that a win was obtained when the weather was good is the product of these two probabilities, which is (16/50) × (24/50). The expected number of such results out of 50 matches is therefore

$$50 \times \frac{16}{50} \times \frac{24}{50} = \frac{16 \times 24}{50} = 7.68,$$

which, on inspection, is the product of the cell's row and column (marginal) totals over the grand total of 50. We can thus safely use the formula

$$\text{Expected frequency} = \frac{\text{Product of marginal totals}}{\text{Grand total}}$$

However, there are two conditions to be satisfied:

(1) All expected frequencies must be at least 5; if not, one or more cells should be merged.
(2) The total observed and expected frequencies must be equal.
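The expected-frequency formula and the two conditions above can be checked with a short Python sketch. The function name `expected_frequencies` and the variable names are illustrative choices, not part of the lecture; the observed counts are those of the example (rows Win, Draw, Lose; columns Good, Bad).

```python
# Observed frequencies from the example.
# Rows: Win, Draw, Lose. Columns: Good weather, Bad weather.
observed = [
    [12, 4],
    [5, 8],
    [7, 14],
]


def expected_frequencies(table):
    """Expected count per cell = (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)

# Condition (1): every expected frequency is at least 5.
all_at_least_5 = all(e >= 5 for row in expected for e in row)

# Condition (2): observed and expected totals agree (up to rounding error).
totals_match = abs(sum(map(sum, observed)) - sum(map(sum, expected))) < 1e-9

for label, row in zip(["Win", "Draw", "Lose"], expected):
    print(label, [round(e, 2) for e in row])
print("All expected >= 5:", all_at_least_5)
print("Totals match:", totals_match)
```

Both conditions hold for this table, so no cells need to be merged.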

                 Weather
Result    Good           Bad            Total
Win       12 (7.68)       4 (8.32)         16
Draw       5 (6.24)       8 (6.76)         13
Lose       7 (10.08)     14 (10.92)        21
Total     24             26                50

Table 14.2 (expected frequencies in parentheses)

We can calculate all the expected frequencies (in parentheses above) using the above formula, but there is no need to. Careful observation shows that, once we compute the expected frequencies of the cells with observed frequencies 12 and 5, all the remaining expected frequencies may be readily obtained from the marginal totals, since these are constant. In fact, we need only two of them (not in the same row) to complete the table. That is why we use two degrees of freedom for the critical chi-squared value. In simple English, two of these frequencies are free but the remaining ones depend on the first two and the marginal totals. This is confirmed by the formula for the number of degrees of freedom: ν = (3 − 1)(2 − 1) = 2 × 1 = 2.

We now calculate the statistic value using

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i},$$

that is,

$$\chi^2 = \frac{(12 - 7.68)^2}{7.68} + \frac{(4 - 8.32)^2}{8.32} + \cdots + \frac{(14 - 10.92)^2}{10.92}$$

$$= 2.43 + 2.243 + 0.246 + 0.227 + 0.941 + 0.869 = 6.956$$

Now, at a 1% level of significance, the critical chi-squared value for ν = 2 is 9.21034 (see table). Fig. 14.3 indicates the critical region (level of significance) and the critical chi-squared value.
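The arithmetic for the test statistic can be reproduced with a few lines of Python. This is only a sketch with illustrative variable names; the observed and expected values are those of Table 14.2, listed cell by cell. Summing the unrounded contributions gives about 6.957, whereas the 6.956 in the notes comes from adding terms already rounded to three decimals.

```python
# Cells in the order: Win/Good, Win/Bad, Draw/Good, Draw/Bad, Lose/Good, Lose/Bad.
observed = [12, 4, 5, 8, 7, 14]
expected = [7.68, 8.32, 6.24, 6.76, 10.08, 10.92]

# Contribution of each cell: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)

# Degrees of freedom for an m x n table: (m - 1)(n - 1).
rows, cols = 3, 2
degrees_of_freedom = (rows - 1) * (cols - 1)

print("Contributions:", [round(c, 3) for c in contributions])
print("Chi-squared statistic:", round(chi_squared, 3))
print("Degrees of freedom:", degrees_of_freedom)
```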

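[Fig. 14.3: chi-squared density f(x) against x; the test-statistic value 6.956 lies in the "Accept H₀" region, to the left of the critical value 9.21034; the critical region ("Reject H₀") is the right-hand tail of area 0.01]

Since 6.956 < 9.21034, we cannot reject H₀; we conclude that, at the 1% significance level, results and weather are independent. It is worth noting that the critical region for this test is always on the right-hand side of the curve (there is no such thing as a two-tailed chi-squared test of independence!).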
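The critical value read from the table can be cross-checked without a statistics library. The sketch below relies on a fact that is not stated in the lecture and holds only for ν = 2: in that case P(X > x) = exp(−x/2), so the critical value at level α is −2 ln α. The variable names are illustrative.

```python
import math

alpha = 0.01            # significance level given in the example
test_statistic = 6.956  # value obtained in the notes

# Assumption: exactly 2 degrees of freedom, where P(X > x) = exp(-x / 2).
# This shortcut does NOT apply to other numbers of degrees of freedom.
critical_value = -2 * math.log(alpha)
p_value = math.exp(-test_statistic / 2)

# Right-tailed test: reject H0 only if the statistic exceeds the critical value.
reject_h0 = test_statistic > critical_value

print("Critical value:", round(critical_value, 5))
print("p-value:", round(p_value, 4))
print("Reject H0:", reject_h0)
```

The p-value of about 0.031 is larger than 0.01, which agrees with the decision not to reject H₀ at the 1% level.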