Chapter 1. The Role of Statistics. Example 1.11 Bar Charts

Chapter 1 The Role of Statistics

Example 1.11 Bar Charts

The Problem: Why Students Drop Out

The article "So Close, Yet So Far: Predictors of Attrition in College Seniors" (J. College Student Development (1998): ) examined the reasons that college seniors leave their college programs before graduating. Forty-two college seniors at a large public university who dropped out prior to graduation were interviewed and asked the main reason for discontinuing enrollment at the university. Data consistent with those given in the article are summarized in the frequency distribution found in file Example 1.11.sav.

1. Steps to prepare the data:
   a. Open the data set Example 1.11.sav.
2. Steps to create a bar chart:
   a. To create the bar chart we use the Graphs menu.
   b. Scroll down to the Bar option.
   c. A window opens up. Select the Simple option.
   d. Under the Data in Chart Are heading, select the Values of individual cases option and click Define.
   e. A second window opens up. The window is shown below:
   f. Select the variable Frequency and move it into the Bars Represent box.
   g. Under the Category Labels heading, select the Variable option.
   h. Select the variable Reason and move it into the Variable box.
   i. Optional Step: to add a title to the chart, select the Titles option, enter your title, and click Continue.
   j. Click OK.

The SPSS Output

[Figure: bar chart of Frequency (Value) by Reason; categories: Acad., Advising, Break, Econ., Family, NewSch., Other, Pers.]

Example 1.12 Dot Plots

The Problem: Graduation Rates for NCAA Division I Schools in California and Texas

The Chronicle of Higher Education (Almanac Issue, Aug. 31, 2001) reported graduation rates for NCAA Division I schools. The rates reported are the percent of full-time freshmen in the fall of 1993 who had earned a bachelor's degree by Aug. Data from the two largest states, the 20 Division I schools in California and the 19 in Texas, are given in the file Example 1.12.sav.

1. Steps to prepare the data:
   a. Open the data set Example 1.12.sav.
2. Steps to create a dot plot:
   a. To create the dot plot, we use the Graphs menu.
   b. Scroll down to Interactive, and select the Dot option.
   c. The following window opens:

3 Count d. Select the variable Graduation R and drag it to the x-axis box located under the box containing the variable Count. e. Optional Step: to add a title to the plot, select the Titles tab enter your title, and click Continue. f. Click OK. Interactive Graph The SPSS Output Graduation R

4 3. Steps to split the data for individual state dot plots: a. To split the data, we use the Data menu. b. Scroll down to the Select Cases option. c. The following window opens: d. Under the Select heading, select the If condition is satisfied option and click If. e. A second window opens. The window is shown below: f. Select the variable State and click the button. Set the variable equal to 1 for California, or 2 for Texas. g. Click Continue. h. Click OK. i. Repeat steps 2.a-f. This time the plot will represent the data for either California or Texas only, depending on the condition set in step 3.f.

The SPSS Output

[Figure: two interactive dot plots of Count, one for California and one for Texas]

By Hasan Hamdan, Department of Mathematics and Statistics, James Madison University, Harrisonburg, VA

6 Chapter 2 Example 2.3: Selecting Random Samples The Problem- Breaking Strength of Glass Soda Bottles A random sample of size n=3 selected from four crates containing 100 bottles (population) numbered 1, 2,, 100 which is recorded in the file Example2.3. The crate number is given in the second column. A random sample of size 3 is to be drawn from the population of 100 bottles. In Example2.3, each bottle is identified by two numbers. The first number is the population number and the second number is the crate number. 1. Steps to prepare the data a. Open the data set Example2.3.sav 2. Steps to generate a random sample: a. To generate a random sample we use Data menu. b. Scroll down to the Select Cases option. c. A second window opens up. Select Random sample of cases. The window is shown below: d. Click on Sample. e. A second window opens up. The window is shown below:

   f. Change the option to Exactly.
   g. Type the sample size in the first box (i.e., 3) and the population size in the second box (i.e., 100).
   h. Click Continue.
   i. The random sample consists of those observations whose filter_$ value equals 1.

The SPSS Output: The sample consists of observation number 11, which is located in crate number 1; observation number 33, which is located in crate number 2; and observation number 78, which is located in crate number 4.
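The same draw can be sketched outside SPSS. The Python snippet below is a minimal illustration, not part of the SPSS procedure: it selects exactly 3 of the 100 bottle numbers without replacement. The function names and the rule that each crate holds 25 consecutively numbered bottles are assumptions made for the sketch (the rule agrees with the three observations reported above, but the data file is the authority).

```python
import random


def draw_sample(population_size=100, sample_size=3, seed=None):
    """Draw a simple random sample of bottle numbers without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


def crate_of(bottle, bottles_per_crate=25):
    """Crate number for a bottle, ASSUMING 25 consecutive bottles per crate."""
    return (bottle - 1) // bottles_per_crate + 1


sample = draw_sample(seed=2024)  # the seed only makes the draw repeatable
for bottle in sample:
    print(f"bottle {bottle} is in crate {crate_of(bottle)}")
```

A different seed (or no seed) gives a different sample, just as repeating the Select Cases steps does in SPSS.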

8 Chapter 3 Graphical Methods for Describing Data Example 3.1 Comparative Bar Chart The Problem Perceived Risk of Smoking The article Most Smokers Wish They Could Quit (Gallup Poll Analyses, Nov.21, 2002) noted that smokers and nonsmokers perceive the risks of smoking differently. The data found in file Example 3.1.sav summarizes responses regarding the perceived harm of smoking for each of 3 groups a sample of 241 smokers, a sample of 261 former smokers, and a sample of 502 nonsmokers. 1. Steps to prepare the data: a. Open the data set Example 3.1.sav. 2. Steps to create a comparative bar chart: a. To create the comparative bar chart we use the Graphs menu. b. Scroll down to the Bar option. c. A window opens up. Select the Clustered option. d. Under the Data in Chart Are heading, select the Values of individual cases option and click Define. e. A second window opens up. The window is shown below: f. Select variables Smoker, Former Smoker, and Nonsmoker and move them into the Bars Represent box. g. Under the Category Labels heading, select the Variable option. h. Select the variable Perceived Harm and move it into the Variable box. i. Click OK.

The SPSS Output

[Figure: clustered bar chart of Relative Frequency by Perceived Harm (Very Harmful, Not Too Harmful, Somewhat Harmful, Not Harmful) for Smoker, Former Smoker, and Nonsmoker]

Example 3.4 Pie Chart

The Problem: Birds That Fish

Night herons and cattle egrets are species of birds that feed on aquatic prey in shallow water. These birds stalk submerged prey while wading in shallow water, and then strike rapidly and downward through the water in an attempt to catch the prey. The article "Cattle Egrets Are Less Able to Cope with Light Refraction Than Are Other Herons" (Animal Behavior (1999): ) gave data on outcome when 240 cattle egrets and 180 night herons attempted to capture submerged prey. The data are summarized in file Example 3.4.sav.

1. Steps to prepare the data:
   a. Open the data set Example 3.4.sav.
2. Steps to create a pie chart:
   a. To create the pie chart we use the Graphs menu.
   b. Scroll down to the Pie option.
   c. A window opens up. Select the Values of individual cases option and click Define.
   d. A second window opens up. The window is shown below:

10 e. Select the variable Cattle Egret and move it into the Slices Represent box. f. Under the Slice Labels heading, select the Variable option. g. Select the variable Outcome and move it into the Variable box. h. Click OK. i. To display the values or percentages on the slice labels, double-click anywhere in the chart area. The SPSS chart editor window opens. j. Select the Chart menu, and scroll down to Options option. k. A second window opens. Under the Labels heading, check Values to display the frequency or Percents for the relative frequency. l. Click OK. m. Repeat steps e through l for the Night Heron variable. The SPSS Output Cattle Egret Not Caught 39.2% 1st Attempt 42.9% 3rd Attempt.8% 2nd Attempt 17.1%

11 Night Heron 3rd Attempt 2.8% 2nd Attempt 15.0% 1st Attempt 82.2% Example 3.8 Stem-and-Leaf Display The Problem Binge Drinking The use of alcohol by college students is of great concern, not only to those in the academic community, but also, because of potential health and safety consequences, to society at large. The article Health and Behavioral Consequences of Binge Drinking in College (Journal of the American Medical Association (1994): ) reported on a comprehensive study of heavy drinking on campuses across the country. A binge episode was defined as five or more drinks in a row for males and four or more for females. File Example 3.8.sav contains 140 values of the percentage of undergraduate students who are binge drinkers at various colleges. 1. Steps to prepare the data: a. Open the data set Example 3.8.sav. 2. Steps to create a stem-and-leaf display: a. To create the stem-and-leaf display we use the Analyze menu. b. Scroll down to Descriptive Statistics and select the Explore option. c. The following window opens up:

   d. Select the variable % Binge Drinkers and move it into the Dependent List box.
   e. Under the Display heading, select the Plots option.
   f. Click Plots.
   g. A second window opens up. The window is shown below:
   h. Under the Boxplots heading, select the None option.
   i. Under the Descriptive heading, check the Stem-and-Leaf option.
   j. Click Continue.
   k. Click OK.

The SPSS Output

[Stem-and-leaf plot of % Binge Drinkers, with columns Frequency and Stem & Leaf; stem width: 10; each leaf: 1 case(s)]

Example 3.17 Histogram

The Problem: Mercury Contamination

Mercury contamination is a serious environmental concern. Mercury levels are particularly high in certain types of fish. Citizens of the Republic of Seychelles, a group of islands in the Indian Ocean, are among those who consume the most fish in the world. The article "Mercury Content of Commercially Important Fish of the Seychelles, and Hair Mercury Levels of a Selected Part of the Population" (Environ. Research (1983): ) reported the observations on mercury content (ppm) in the hair of 40 fishermen found in file Example 3.17.sav.

1. Steps to prepare the data:
   a. Open the data set Example 3.17.sav.
2. Steps to create a histogram:
   a. To create the histogram we use the Graphs menu.
   b. Scroll down to the Histogram option.
   c. The following window opens up:
   d. Select the variable Mercury Content and move it into the Variable box.
   e. Click OK.
   f. To reset the interval width, double-click anywhere in the graph area. The SPSS chart editor window opens up.
   g. Double-click on the horizontal axis values.
   h. A second window opens. Under the Intervals heading, select the Custom option and click Define.
   i. A third window opens. Under the Definition heading, select the Interval width option and enter the desired width in the box.
   j. Under the Range heading, enter the desired Minimum and Maximum in the appropriate Displayed boxes.
   k. Click Continue.
   l. Click OK.

14 The SPSS Output Graph Frequency Mercury Content Example 3.21 Scatter Plot The Problem Vermont Sugarbushes The growth and decline of forests is a matter of great public and scientific interest. The article Relationships Among Crown Condition, Growth, and Stand Nutrition in Seven Northern Vermont Sugarbushes (Canad. J. of Forest Res. (1995): ) included a scatter plot of y = mean crown dieback (%), which is one indicator of growth retardation, and x = soil ph (higher ph corresponds to less acidic soil). Observations read from the scatter plot are found in file Example 3.21.sav. 1. Steps to prepare the data: a. Open the data set Example 3.21.sav. 2. Steps to create a scatter plot: a. To create the scatter plot we use the Graphs menu. b. Scroll down to the Scatter option. c. A window opens up. Select the Simple option. d. Click Define. e. A second window opens up. The window is shown below:

15 f. Select the variable Dieback % and move it into the Y Axis box. g. Select the variable Soil ph and move it into the X Axis box. h. Click OK. The SPSS Output Graph Dieback % Soil ph

Chapter 4

Example 4.3: Number of Visits to a Class Website

The Problem: Finding the mean number of visits per student in a given month for a class website.

1. Steps to prepare the data:
   a. Open the data set Example4.3.sav.
2. Dot plot:
   See Example 1.12 for the dot plot (Graphs - Interactive - Dot).
3. Steps to find the mean:
   a. Go to Analyze.
   b. Scroll down to Descriptive Statistics.
   c. Select Descriptives.
   d. A new window opens up.
   e. Move the variable number under Variable(s). You can change the default options by going to Options.
   f. Click Continue, then click OK.

Example 4.8: Computing the Sample Variance

The Problem: Finding the sample variance and sample standard deviation of acrylamide levels for a sample of size 7 of McDonald's French fries purchased at 7 different locations.

1. Steps to prepare the data:
   1. Open an SPSS data sheet.
   2. Go to the Variable View window.
   3. Type x under Name.
   4. Type the data under x.

Repeat Step 3 of the previous example, using the variable x instead of number.

There is a second way of finding the sample variance:
   1. Go to Analyze.
   2. Scroll down to Descriptive Statistics.
   3. Select Explore.
   4. The following window opens up.
   5. Move the variable under the Dependent List as shown.
   6. Under Display choose Statistics as shown, then click OK.
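As a cross-check on what the Variance and Std. Deviation rows of the SPSS output report, the sample variance (divisor n - 1) can be computed directly. The sketch below uses Python's standard library; the seven values in x are hypothetical placeholders, not the acrylamide data of the example.

```python
from statistics import mean, stdev, variance

# HYPOTHETICAL values standing in for the 7 observations typed under x.
x = [10, 12, 14, 16, 18, 20, 22]

x_bar = mean(x)
# Sample variance: sum of squared deviations divided by (n - 1).
s2_by_hand = sum((xi - x_bar) ** 2 for xi in x) / (len(x) - 1)

s2 = variance(x)  # library version, also uses the n - 1 divisor
s = stdev(x)      # sample standard deviation = square root of s2

print(f"mean = {x_bar}, variance = {s2:.4f}, std. deviation = {s:.4f}")
```

Both routes give the same number; the hand computation is included only to show which divisor is being used.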

Example 4.11: Quartiles and Boxplot of Golden Rectangles

The Problem: The data in Example4.11 came from an anthropological study of rectangular shapes. Observations were made on x = width/length for a sample of 20 beaded rectangles used in Shoshoni Indian leather handicrafts.

1. Steps to prepare the data:
   a. Open the data set Example4.11.sav.
2. Boxplot:
   a. Go to Graphs.
   b. Scroll down to Boxplot.
   c. Select Simple as shown.
   d. Select Summaries of separate variables as shown.
   e. Click on Define, move the variable x under Boxes Represent, and click OK.
   We can also create the boxplot using the interactive window.
3. Quartiles:
   a. Go to Analyze.
   b. Scroll down to Descriptive Statistics.
   c. Select Frequencies.
   d. A new window opens up.

19 e. Click on Statistics as shown. f. A new window opens up. g. Check all the statistics you need including the Quartiles as shown then click ok.

20 Chapter 5 Summarizing Bivariate Data Example 5.4 Correlation The Problem Is Foal Weight Related to Mare Weight? Foal weight at birth is an indicator of health, so it is of interest to breeders of thoroughbred horses. Is foal weight related to the weight of the mare (mother)? The data found in file Example 5.4.sav are from the article Suckling Behavior Does Not Measure Milk Intake in Horses (Animal Behavior (1999): ). 1. Steps to prepare the data: a. Open the data set Example 5.4.sav. 2. Steps to find the correlation coefficient: a. To find the correlation coefficient, we use the Analyze menu. b. Scroll down to the Correlate menu and select the Bivariate option. c. The following window opens up: d. Select the variables Mare weight and Foal weight, and move them into the Variables box. e. Under the Correlation Coefficients heading, check the Pearson option. f. Click OK. Correlations Correlations The SPSS Output Mare weight Foal weight Pearson Correlation Sig. (2-tailed) N Pearson Correlation Sig. (2-tailed) N Mare weight Foal weight

21 g. The correlation coefficient is given in the first entry of the first row, second column, and the first entry of the second row, first column. 3. Steps to create a scatter plot: a. To create the scatter plot, refer to the procedure outlined in Example Graph The SPSS Output Foal weight Mare weight Example 5.6 Regression Line The Problem Time to Defibrillator Shock and Heart Attack Survival Rate Studies have shown that people who suffer sudden cardiac arrest (SCA) have a better chance of survival if a defibrillator shock is administered very soon after cardiac arrest. How is survival rate related to the time between when cardiac arrest occurs and when the defibrillator shock is delivered? This question is addressed in the paper Improving Survival from Sudden Cardiac Arrest The Role of Home Defibrillators (University of Michigan, Feb. 2002). The data found in the file Example 5.6.sav gives y = survival rate (percent) and x = mean call-to-shock time (minutes) for a cardiac rehabilitation center (where cardiac arrests occurred while victims were hospitalized and so the call-to-shock time tends to be short) and for four communities of different sizes. 1. Steps to prepare the data: a. Open the data set Example 5.6.sav. 2. Steps to create a scatter plot with the regression line superimposed: a. To create the scatter plot, we use the Graphs menu. b. Scroll down to the Scatter option. c. A window opens up. Select the Simple option and click Define. d. A second window opens. The window is shown below:

22 e. Select the variable Survival Rate and move it into the Y Axis box. f. Select the variable Call-to-Shock Time and move it into the X Axis box. g. Click OK. h. Double-click on the graph. The SPSS Chart Editor opens up. i. Under the Chart menu, scroll down to the Options submenu. The following window opens up: j. Under the Fit Line heading, check the Total option. k. Click OK. The regression line will automatically be drawn on the graph. l. Close the SPSS Chart Editor.

23 The SPSS Output Graph Survival Rate Call-to-Shock Time 3. Steps to find the equation of the least squares regression line: a. To find the equation, we use the Analyze menu. b. Scroll down to the Regression submenu and select the Linear option. c. The following window opens up: d. Select the variable Survival Rate and move it into the Dependent box. e. Select the variable Call-to-Shock Time and move it into the Independent(s) box. f. Click OK.
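The B column that SPSS reports in the Coefficients box can be reproduced from the usual least squares formulas, slope = Sxy / Sxx and intercept = ȳ - slope · x̄. The following Python sketch shows the arithmetic; the five (x, y) pairs are illustrative values chosen for the sketch and are not read from Example 5.6.sav, and the function name least_squares is hypothetical.

```python
from statistics import mean


def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# ILLUSTRATIVE data: x = call-to-shock time (minutes), y = survival rate (%).
time_to_shock = [2, 6, 7, 9, 12]
survival_rate = [90, 45, 30, 5, 2]

a, b = least_squares(time_to_shock, survival_rate)
print(f"y-hat = {a:.2f} + ({b:.2f}) x")
```

The intercept corresponds to the (Constant) row and the slope to the row named after the independent variable.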

24 The SPSS Output Regression Model 1 Variables Entered/Removed b Variables Variables Entered Removed Method Call-to-Sh ock Time a. Enter a. All requested variables entered. b. Dependent Variable: Survival Rate Model 1 Model Summary Adjusted Std. Error of R R Square R Square the Estimate.960 a a. Predictors: (Constant), Call-to-Shock Time Model 1 Regression Residual Total ANOVA b Sum of Squares df Mean Square F Sig a a. Predictors: (Constant), Call-to-Shock Time b. Dependent Variable: Survival Rate Model 1 (Constant) Call-to-Shock Time a. Dependent Variable: Survival Rate Coefficients a Unstandardized Coefficients Standardized Coefficients B Std. Error Beta t Sig The output consists of 4 boxes. The least squares regression equation may be obtained from the last (fourth) box in the output labeled Coefficients. The estimates are both given in the first column labeled B. The y-intercept is given in the first row labeled (Constant) and the slope is given in the second row labeled Call-to-Shock Time. For this example the least squares regression equation is ŷ = x. Example 5.10 Residual Plot The Problem Tennis Elbow One factor in the development of tennis elbow is the impact-induced vibration of the racket and arm at ball contact. Tennis elbow is thought to be related to various properties of the tennis racket

25 used. The data in the file Example 5.10.sav is a subset of that analyzed in the article Transfer of Tennis Racket Vibrations into the Human Forearm (Med. and Sci. in Sports and Exercise (1992): ). Measurements on x = racket resonance frequency (Hz) and y = sum of peak-to-peak accelerations (a characteristic of arm vibration in m/sec/sec) are given for n = 14 different rackets. 1. Steps to prepare the data: a. Open the data set Example 5.10.sav. 2. Steps to find the residuals: a. To find the residuals, we use the Analyze menu. b. Scroll down to the Regression submenu and select the Linear option. c. Select the variable Acceleration and move it into the Dependent box. d. Select the variable Resonance and move it into the Independent(s) box. e. Click on Save. The following window opens up: f. Under the Residuals heading, select the Unstandardized option. Click Continue. g. Click OK. h. You will get the regression output window as in Example 5.6. The residuals will appear in the SPSS Data Editor window as a variable named res_1. 3. Steps to create a residual plot: a. To create the scatter plot for the full sample, refer to the procedure outlined in Example b. To create the residual plot for the full sample, we again use the Graphs menu. c. Scroll down to the Scatter option. d. A window opens up. Select the Simple option and click Define. e. A second window opens. The window is shown below:

   f. Select the variable Unstandardized Residual and move it into the Y Axis box.
   g. Select the variable Resonance and move it into the X Axis box.
   h. Click OK.

The SPSS Output

[Figures: Scatter Plot (full sample) of Acceleration versus Resonance; Residual Plot (full sample) of Unstandardized Residual versus Resonance]

4. Steps to delete an influential observation:
   a. To delete the influential observation, we return to the SPSS Data Editor.
   b. Click on number block 14 in the first column. Press the Delete key.
   c. Repeat Steps 1-3 above with the revised sample.

The SPSS Output

[Figures: Scatter Plot (revised sample) of Acceleration versus Resonance; Residual Plot (revised sample) of Unstandardized Residual versus Resonance]

Chapter 6

Example 6.32: One-Boy Family Planning

The Problem: Estimating the proportion of boys in a population where couples keep having children until they have a baby boy.

1. Steps to prepare the data:
   The data are simulated.
2. Steps to simulate a random sample of size 19 (families):
   a. Go to Variable View.
   b. Under Name, type a variable name such as family.
   c. Under Width, type 14.
   d. Under Decimals, type 12.
   e. Create a new variable named numofkids as shown below.
   f. Click on Data View at the bottom of the page.
   g. A second window opens up. Go to row number 19 (because the sample size is 19) and type 1.

   h. Go to Transform, then Compute, as shown below.
   i. Type family in the Target Variable box and RV.UNIFORM(0, 1) under Numeric Expression.
   j. Click OK, and click OK again when asked Change existing variable?
   k. Ignore the decimal point and count the number of digits up to and including the first odd digit (an odd digit represents a boy), as shown below:

To find the total number of children:
   a. Go to Analyze.
   b. Select Descriptive Statistics.
   c. Select Descriptives and move the numofkids variable under Variable(s).
   d. Click on Options and check Sum.

The estimated proportion of boys in the sample = (number of boys in the sample) / Sum = 19/40 = 0.475.
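The digit-counting procedure above can be mimicked in a few lines of Python. This is a sketch of the same idea rather than a translation of the SPSS steps: each child is represented by one random digit, an odd digit is treated as a boy (an assumption matching the counting rule above), and each family stops at its first boy. The function names are hypothetical.

```python
import random


def simulate_family(rng):
    """Number of children in one family that stops at the first boy."""
    children = 0
    while True:
        children += 1
        if rng.randrange(10) % 2 == 1:  # odd digit = boy (assumed coding)
            return children


def estimate_boy_proportion(n_families=19, seed=None):
    """Simulate n_families and return (family sizes, proportion of boys)."""
    rng = random.Random(seed)
    sizes = [simulate_family(rng) for _ in range(n_families)]
    # Every family has exactly one boy, so boys = n_families.
    return sizes, n_families / sum(sizes)


sizes, p_boys = estimate_boy_proportion(seed=7)
print("children per family:", sizes)
print("total children:", sum(sizes))
print(f"estimated proportion of boys: {p_boys:.3f}")
```

With 19 families and 40 children in total, the estimate would be 19/40 = 0.475, as in the example; other simulated samples give other totals.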

Chapter 7 Random Variables and Probability Distributions

Example 7.25 Finding Probabilities

The Problem: Children's Heights

In poor countries, the growth of children can be an important indicator of general levels of nutrition and health. Data from the article "The Osteological Paradox: Problems in Inferring Prehistoric Health from Skeletal Samples" (Current Anthropology (1992): ) suggest that a reasonable model for the probability distribution of the continuous numerical variable x = height of a randomly selected five-year-old child is a normal distribution with a mean of µ = 100 cm and standard deviation σ = 6 cm. What proportion of the heights is between 94 cm and 112 cm?

1. Steps to prepare the data:
   a. Open the data set Example 7.25.sav.
2. Steps to calculate probabilities for a normal distribution:
   a. To calculate the probability, we use the Transform menu.
   b. Scroll down to the Compute option.
   c. A window opens up. The window is shown below:
   d. Enter the variable p into the Target Variable box.
   e. In the box under the heading Functions, scroll down and select the function CDF.NORMAL and move it into the Numeric Expression box.
   f. In the parentheses after the function CDF.NORMAL, enter the upper endpoint, mean, and standard deviation, respectively.
   g. Click the subtraction key, and repeat steps (e) and (f) for the lower endpoint. The completed expression is CDF.NORMAL(112,100,6) - CDF.NORMAL(94,100,6).
   h. Click OK and then OK again when asked, Change Existing Variable?
   i. The output will appear in the SPSS Data Editor window as the first entry under the variable p. The probability is about 82%.
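The difference of two cumulative probabilities computed in step 2 can be verified independently. The short Python sketch below uses NormalDist from the standard library with the mean and standard deviation stated in the problem; the variable names are chosen only for illustration.

```python
from statistics import NormalDist

# Model from the problem statement: heights ~ Normal(mean 100 cm, sd 6 cm).
heights = NormalDist(mu=100, sigma=6)

# P(94 < x < 112) = F(112) - F(94), the same difference built in SPSS.
p = heights.cdf(112) - heights.cdf(94)

print(f"P(94 < x < 112) = {p:.4f}")
```

The endpoints are one standard deviation below and two standard deviations above the mean, which is why the result is close to 82%.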

Chapter 8

There are no examples for Chapter 8.

33 Chapter 9 Estimation Using a Single Sample Example 9.2 Point Estimates for the Population Mean, µ The Problem Internet Use by College Students The article Online Extracurricular Activity (USA Today, Mar.13, 2000) reported the results of a study of college students conducted by a polling organization called The Student Monitor. One aspect of computer use examined in this study was the number of hours per week spent on the Internet. The data found in file Example 9.2.sav represent observations of the number of Internet hours per week reported by twenty college students. A point estimate of µ, the true mean Internet time per week for college students, is desired. Three possible statistics are the sample mean, the sample median (data is relatively symmetric), and the 5% trimmed mean. 1. Steps to prepare the data: a. Open the data set Example 9.2.sav. 2. Steps to find the sample mean, the sample median, and the 5% trimmed mean: a. To find these statistics, we use the Analyze menu. b. Scroll down to the Descriptive Statistics submenu and select the Explore option. c. A window opens up. The window is shown below: Explore d. Select the variable Internet Time and move it into the Dependent List box. e. Under the Display heading, select the Statistics option. f. Click OK. The SPSS Output Case Processing Summary Internet Time Cases Valid Missing Total N Percent N Percent N Percent % 0.0% %

[Table: Descriptives for Internet Time, with Statistic and Std. Error columns and rows Mean, 95% Confidence Interval for Mean (Lower Bound, Upper Bound), 5% Trimmed Mean, Median, Variance, Std. Deviation, Minimum, Maximum, Range, Interquartile Range, Skewness, Kurtosis]

The output consists of two boxes. The sample mean, the sample median, and the 5% trimmed mean may be obtained from the second box in the output, labeled Descriptives. For this sample, the sample mean is 7.075, the sample median is 7.125, and the 5% trimmed mean is given in the row labeled 5% Trimmed Mean.

Example 9.5 Confidence Interval for a Population Proportion, π

The Problem: Violent Behavior in the Workplace

An Associated Press article on potential violent behavior reported the results of a survey of 750 workers who were employed full time (San Luis Obispo Tribune, Sept. 7, 1999). Of those surveyed, 125 indicated that they were so angered by a coworker during the past year that they felt like hitting the person (but didn't). Assuming that it is reasonable to regard this sample of 750 as a random sample from the population of full-time workers, we can use this information to construct an estimate of π, the true proportion of full-time workers so angered in the last year that they wanted to hit a colleague. For this sample, p = 125/750 = .167. Since np = 125 and n(1-p) = 625 are both greater than or equal to 10, the sample size is large enough to use the formula for a large-sample confidence interval. Find a 90% confidence interval for π.

1. Steps to prepare the data:
   a. Open the data set Example 9.5.sav.
2. Steps to find the confidence interval:
   a. To find the confidence interval, we use the Transform menu.
   b. Scroll down to the Compute option.
   c. A window opens up. The window is shown below:

   d. Enter the variable lower into the Target Variable box.
   e. In the box under the heading Numeric Expression, enter the following equation for the lower endpoint of the confidence interval: .167 - PROBIT(.95) * SQRT((.167*.833) / 750). A 90% interval leaves 5% in each tail, so the z critical value is PROBIT(.95), approximately 1.645.
   f. The functions PROBIT( ) and SQRT( ) can be found in the Functions scroll box, and are used to determine the z critical value and square root, respectively.
   g. Click OK and then OK again when asked, Change Existing Variable?
   h. The output will appear in the SPSS Data Editor window as the first entry under the variable lower. The lower endpoint is about .145.
   i. Repeat steps (a) through (g) with the target variable upper, using the following equation for the upper endpoint of the confidence interval: .167 + PROBIT(.95) * SQRT((.167*.833) / 750)
   j. The output will appear in the SPSS Data Editor window as the first entry under the variable upper. The upper endpoint is about .189.

Example 9.9 The One-Sample t Confidence Interval for µ

The Problem: Walking a Straight Line

A study of the ability of individuals to walk in a straight line ("Can We Really Walk Straight?" Amer. J. of Physical Anthropology (1992): 19-27) reported the data found in file Example 9.9.sav on cadence (strides per second) for a sample of n = 20 randomly selected healthy men.

1. Steps to prepare the data:
   a. Open the data set Example 9.9.sav.
2. Steps to find the 99% t confidence interval for the population mean:
   a. To find the 99% t confidence interval, we use the Analyze menu.
   b. Scroll down to the Compare Means submenu and select the One-Sample T Test option.
   c. A window opens up. The window is shown below:

36 d. Select the variable Cadence and move it into the Test Variable(s) box. e. Enter 0 in the Test Value box. f. Click Options. g. The following window opens up: h. In the Confidence Interval box, enter the value 99. i. Click Continue. j. Click OK. T-Test Cadence The SPSS Output One-Sample Statistics Std. Error N Mean Std. Deviation Mean One-Sample Test Cadence Test Value = 0 99% Confidence Interval of the Mean Difference t df Sig. (2-tailed) Difference Lower Upper The output consists of two boxes. The 99% t confidence interval may be obtained from the second box in the output labeled One-Sample Test. For this sample, the interval is (.8737,.9773).
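Returning briefly to the large-sample interval for a proportion in Example 9.5, the calculation can be checked outside SPSS. The Python sketch below is illustrative only: the function name proportion_ci is hypothetical, and it uses the unrounded sample proportion 125/750, so its endpoints may differ in the third decimal place from values based on .167. For a two-sided interval with confidence level c, the z critical value is the quantile at 1 - (1 - c)/2, which is 0.95 for a 90% interval.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Large-sample z confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value: e.g. the 0.95 quantile for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_ci(successes=125, n=750, confidence=0.90)
print(f"90% CI for the proportion: ({lower:.3f}, {upper:.3f})")
```

Changing the confidence argument shows how the interval widens as the confidence level increases.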

Chapter 10

Example 10.14: Personal Use of Company Technology

The Problem: To perform a hypothesis test about the mean time spent in personal use of company technology. It is believed that the average time spent in personal use of company technology is more than 75 minutes.

1. Steps to prepare the data:
   a. Open the data set Example10.14.sav.
2. Steps to test H0: µ = 75 minutes vs. Ha: µ > 75 minutes:
   a. Go to Analyze.
   b. Scroll down to Compare Means.
   c. Select One-Sample T Test. A new window opens up.
   d. Move the variable time under Test Variable(s) and enter 75 in the Test Value box. To change the default confidence level from .95 to any other level, go to Options.

The output is given below.

[Table: One-Sample Statistics for TIME, with columns N, Mean, Std. Deviation, Std. Error Mean]

[Table: One-Sample Test for TIME, Test Value = 75, with columns t, df, Sig. (2-tailed), Mean Difference, and 95% Confidence Interval of the Difference (Lower, Upper)]

Conclusion: The test we need is one-sided, but SPSS reports a two-sided significance level. Because the test statistic is negative (t = -.07) and the alternative is µ > 75, the p-value is Pr(T > -.07) = 1 - Sig. (2-tailed)/2 = .526. Since .526 > α = .05, we fail to reject H0; i.e., there is not convincing evidence that the average time spent in personal use of company technology is more than 75 minutes.
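Converting the two-sided Sig. value that SPSS prints into a one-sided p-value depends on whether the t statistic points in the direction of the alternative. The Python sketch below spells out that rule. The function name is hypothetical, and the Sig. (2-tailed) value of 0.948 is an assumption back-calculated from the p-value of .526 quoted in the conclusion, not a number read from the output.

```python
def one_sided_p(t_stat, sig_two_tailed, alternative):
    """One-sided p-value from a t statistic and a two-sided Sig. value.

    alternative: "greater" for Ha: mu > mu0, "less" for Ha: mu < mu0.
    """
    half = sig_two_tailed / 2
    if alternative == "greater":
        # t in the direction of Ha keeps the small tail; otherwise the large one.
        return half if t_stat > 0 else 1 - half
    if alternative == "less":
        return half if t_stat < 0 else 1 - half
    raise ValueError("alternative must be 'greater' or 'less'")


# ASSUMED inputs: t = -0.07 (from the text), Sig. (2-tailed) = 0.948 (back-calculated).
p_value = one_sided_p(t_stat=-0.07, sig_two_tailed=0.948, alternative="greater")
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"one-sided p-value = {p_value:.3f}: {decision}")
```

Simply halving the two-sided value is correct only when the sample result falls on the side favored by the alternative hypothesis.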

Chapter 11 Comparing Two Populations or Treatments

Example 11.2 Two-Sample t Test for Comparing Two Population Means

The Problem: Oral Contraceptive Use and Bone Mineral Density

To assess the impact of oral contraceptive use on bone mineral density (BMD), researchers in Canada carried out a study comparing BMD for women who had used oral contraceptives for at least three months to BMD for women who had never used oral contraceptives ("Oral Contraceptive Use and Bone Mineral Density in Premenopausal Women," Canadian Medical Association Journal (2001): ). Data consistent with summary quantities given in the paper appear in the file Example 11.2.sav (the actual sample sizes for the study were much larger).

1. Steps to prepare the data:
   a. Open the data set Example 11.2.sav.
2. Steps to conduct a two-sample t test for comparing two population means:
   a. To conduct the test, we use the Analyze menu.
   b. Scroll down to the Compare Means submenu and select the Independent-Samples T Test option.
   c. A window opens up. The window is shown below:
   d. Select the variable BMD and move it into the Test Variable(s) box.
   e. Select the variable Contraceptive and move it into the Grouping Variable box.
   f. Click on Define Groups. A second window opens up. The window is shown below:
   g. Select the Use specified values option.
   h. Enter 1 in the box labeled Group 1 (represents those women in the study who never used the contraceptive).

   i. Enter 0 in the box labeled Group 2 (represents those women in the study that used the contraceptive).
   j. Click Continue.
   k. Click OK.

The SPSS Output
[T-Test output: Group Statistics table (N, Mean, Std. Deviation, Std. Error Mean of BMD for Contraceptive = Never Used and Used) and Independent Samples Test table (Levene's Test for Equality of Variances: F, Sig.; t-test for Equality of Means: t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper; rows Equal variances assumed and Equal variances not assumed)]

The output consists of 2 boxes. The sample means and standard deviations may be obtained from the first box, labeled Group Statistics, under the Mean and Std. Deviation headings. The t test statistic can be found in the second box, labeled Independent Samples Test, under the t column, second row (Equal variances not assumed). The relevant degrees of freedom are in the column labeled df, again second row. Because this example is a one-tailed test, the P-value is calculated by dividing the value in the Sig. (2-tailed) column, second row, by two; this is valid when the sample difference lies in the direction of the alternative hypothesis. For this example, since the P-value is greater than 0.05, we fail to reject the null hypothesis; that is, there is not convincing evidence to support the claim that mean bone mineral density is lower for women who used oral contraceptives.

Example 11.4 Two-Sample t Confidence Interval for the Difference Between Two Population or Treatment Means

The Problem
Effect of Talking on Blood Pressure

Does talking elevate blood pressure, contributing to the tendency for blood pressure to be higher when measured in a doctor's office than when measured in a less stressful environment (called the white coat effect)? The article "The Talking Effect and White Coat Effect in Hypertensive Patients: Physical Effort or Emotional Content" (Behavioral Medicine (2001): ) describes a study in which patients with high blood pressure were randomly assigned to one of two groups. Those in the first group (the talking group) talked prior to having their blood pressure measured. Those in the second group (the counting group) were asked to count aloud from 1 to 100 four times prior to having their blood pressure measured. The data values for diastolic blood pressure (mm Hg) found in file Example 11.4.sav are consistent with summary quantities appearing in the paper.

1. Steps to prepare the data:
   a. Open the data set Example 11.4.sav.
2. Steps to find a two-sample t CI for the difference between two population means:
   a. To find the CI, we use the Analyze menu.

   b. Scroll down to the Compare Means submenu and select the Independent-Samples T Test option.
   c. A window opens up. The window is shown below:
   d. Select the variable Blood Pressure and move it into the Test Variable(s) box.
   e. Select the variable Activity and move it into the Grouping Variable box.
   f. Click on Define Groups. A second window opens up. The window is shown below:
   g. Select the Use specified values option.
   h. Enter 1 in the box labeled Group 1 (represents those in the study who were talking prior to having their blood pressure measured).
   i. Enter 0 in the box labeled Group 2 (represents those in the study who were counting prior to having their blood pressure measured).
   j. Click Continue.
   k. Click OK.

The SPSS Output
[T-Test output: Group Statistics table (N, Mean, Std. Deviation, Std. Error Mean of Blood Pressure for Activity = Talking and Counting)]

[T-Test output, continued: Independent Samples Test table for Blood Pressure (Levene's Test for Equality of Variances: F, Sig.; t-test for Equality of Means: t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper; rows Equal variances assumed and Equal variances not assumed)]

The output consists of 2 boxes. The number of observations, sample means, and sample standard deviations may be obtained from the first box, labeled Group Statistics, under the N, Mean, and Std. Deviation headings. The relevant degrees of freedom are in the second box, labeled Independent Samples Test, in the column labeled df, second row (Equal variances not assumed). The interval can be found under the column heading 95% Confidence Interval of the Difference, again second row. For this example, the 95% CI for the difference is (1.046, ).

Example 11.8 Paired t Confidence Interval for the Mean Value of the Difference Population

The Problem
Lactic Acid in the Blood After Exercise

The effect of exercise on the amount of lactic acid in the blood was examined in the article "A Descriptive Analysis of Elite-Level Racquetball" (Research Quarterly for Exercise and Sport (1991): ). Eight males were measured before and after playing three games of racquetball; the results are found in file Example 11.8.sav. We will use this data to estimate the mean change in blood lactate level using a 95% confidence interval.

1. Steps to prepare the data:
   a. Open the data set Example 11.8.sav.
2. Steps to find a paired t CI for $\mu_d$:
   a. To find the CI, we use the Analyze menu.
   b. Scroll down to the Compare Means submenu and select the Paired-Samples T Test option.
   c. A window opens up. The window is shown below:
   d. Select the variable Before and then the variable After. Move both variables into the Paired Variables box.
   e. Click OK.
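The interval reported by the Paired-Samples T Test procedure can be checked by hand. The following is a minimal Python sketch, not the SPSS implementation: the function name `paired_t_interval` is hypothetical, the before/after values are invented for illustration (they are not the values in Example 11.8.sav), and the t critical value 2.365 (95% confidence, df = 7) is taken from a t table.

```python
import math
import statistics


def paired_t_interval(before, after, t_crit):
    """Confidence interval for the mean of the differences (before - after)."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)      # sample SD of the differences (n - 1 denominator)
    se = sd_d / math.sqrt(n)            # standard error of the mean difference
    return mean_d - t_crit * se, mean_d + t_crit * se


# Hypothetical measurements for 8 subjects (illustration only).
before = [10, 12, 9, 11, 13, 10, 12, 11]
after = [15, 18, 13, 17, 20, 14, 19, 16]

T_CRIT_95_DF7 = 2.365                   # t critical value, 95% confidence, df = 8 - 1
lower, upper = paired_t_interval(before, after, T_CRIT_95_DF7)
print(f"95% CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

Because the differences are computed as Before minus After, both limits are negative when the measurement increases, which is the same sign convention as the Paired Samples Test box.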

The SPSS Output
[T-Test output: Paired Samples Statistics table (Pair 1, Before and After: Mean, N, Std. Deviation, Std. Error Mean), Paired Samples Correlations table (Pair 1, Before & After: N, Correlation, Sig.), and Paired Samples Test table (Pair 1, Before - After; Paired Differences: Mean, Std. Deviation, Std. Error Mean, 95% Confidence Interval of the Difference: Lower, Upper; t, df, Sig. (2-tailed))]

The output consists of 3 boxes. The interval may be obtained in the third box, labeled Paired Samples Test, under the column heading 95% Confidence Interval of the Difference. For this example, we can be 95% confident that the mean difference in blood lactate level is between and -6.7; that is, we are 95% confident that the mean increase in blood lactate level is somewhere between and after three games of racquetball.

Example Two-Sample z Test for Comparing Two Population Proportions

The Problem
AIDS and Housing Availability

The authors of the article "Accommodating Persons with AIDS: Acceptance and Rejection in Rental Situations" (J. Applied Social Psychology (1999): ) state that even though landlords participating in a telephone survey indicated that they would generally be willing to rent to persons with AIDS, they wondered whether this was true in actual practice. To investigate, two random samples of 80 advertisements for rooms for rent were independently selected from newspaper advertisements in three large cities. An adult male caller responded to each ad in the first sample of 80 and inquired about the availability of the room, and was told that the room was still available in 61 of these calls. The same caller also responded to each ad in the second sample. In these calls, the caller indicated that he was currently receiving some treatment for AIDS and was about to be released from the hospital and would require a place to live. The caller was told that a room was available in 32 of these calls. Based on this information, the authors concluded that reference to AIDS substantially decreased the likelihood of a room being described as available. Does the data support this conclusion?

1. Steps to prepare the data:
   a. Open the data set Example sav.
2. Steps to conduct a two-sample z test for comparing two population proportions:
   a. To conduct the test, we use the Analyze menu.
   b. Scroll down to the Descriptive Statistics submenu and select the Crosstabs option.
   c. A window opens up. The window is shown below:

   d. Select the variable No AIDS Reference and move it into the Row(s) box.
   e. Select the variable AIDS Reference and move it into the Column(s) box.
   f. Click on Statistics. A second window opens up. The window is shown below:
   g. Check the Chi-square option and click Continue.
   h. Click OK.

The SPSS Output
[Crosstabs output: Case Processing Summary table (No AIDS Reference * AIDS Reference; Cases Valid, Missing, Total: N, Percent)]

[Crosstabs output, continued: No AIDS Reference * AIDS Reference Crosstabulation (Count; Not Available, Available, Total) and Chi-Square Tests table (Pearson Chi-Square, Continuity Correction, Likelihood Ratio, Fisher's Exact Test, Linear-by-Linear Association, N of Valid Cases = 80; columns Value, df, Asymp. Sig. (2-sided), Exact Sig. (2-sided), Exact Sig. (1-sided); footnotes: a. Computed only for a 2x2 table; b. 0 cells (.0%) have expected count less than 5)]

The output consists of 3 boxes. The z test statistic may be obtained in the third box, labeled Chi-Square Tests, by using the Pearson Chi-Square value in the following equation:

$z = \sqrt{\text{Pearson Chi-Square}}$

The relevant P-value can be found in the first row, third column, under the heading Asymp. Sig. (2-sided), as follows:

$P\text{-value} = \tfrac{1}{2}\,\text{Asymp. Sig. (2-sided)}$

In this example, since P-value ≤ α, the null hypothesis is rejected at level .01.
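The relationship between the z statistic and the Pearson Chi-Square value can be verified directly from the counts given in the problem (61 of 80 and 32 of 80 calls). The following is a minimal Python sketch using only the standard library; the function names are hypothetical, and it works from the summary counts rather than from the SPSS data file.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Large-sample z statistic for H0: p1 = p2, using the combined proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                       # combined sample proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def pearson_chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square (no continuity correction) for the same 2x2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def upper_tail_p(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = two_proportion_z(61, 80, 32, 80)      # counts stated in the problem
chi2 = pearson_chi_square_2x2(61, 80, 32, 80)
p_one_sided = upper_tail_p(z)
print(f"z = {z:.3f}, z squared = {z * z:.3f}, Pearson chi-square = {chi2:.3f}")
print(f"one-sided P-value = {p_one_sided:.2e}")
```

The square root of the chi-square value gives only the magnitude of z; the sign follows the difference in sample proportions.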

Chapter 12

Example 12.7: Risky Behavior

The Problem- The relationship between gender and contraceptive use by sexually active teens

Each person in a random sample of sexually active teens was classified according to gender and contraceptive use (with three categories: rarely or never use, use sometimes or most of the time, and always use).

1. Steps to prepare the data:
   a. Open the data set Example12.7.sav.
2. Steps to generate the 3×2 cross-table:
   a. Go to Analyze.
   b. Scroll down to the Descriptive Statistics option.
   c. A second window opens up. Select Crosstabs. The window is shown below:
   d. Click on Statistics and choose Chi-Square.
   e. Click on Cells and choose Expected.

Note that the Pearson Chi-Square statistic has a P-value of 0.037, so at the .05 level of significance we reject the null hypothesis and conclude that Gender and Contraceptive use are not independent.
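The expected counts and the Pearson Chi-Square statistic that Crosstabs reports can be sketched in a few lines of Python. This is an illustration under stated assumptions: the counts below are hypothetical (they are not the counts in Example12.7.sav), the function name is invented, and the P-value shortcut exp(-X²/2) applies only when df = 2, which is the case for a 3×2 table.

```python
import math


def chi_square_independence(table):
    """Return (chi-square, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected


# Hypothetical counts: rows = contraceptive use category, columns = gender.
counts = [
    [20, 30],   # rarely or never use
    [30, 30],   # use sometimes or most of the time
    [50, 40],   # always use
]

chi2, df, expected = chi_square_independence(counts)
p_value = math.exp(-chi2 / 2)   # exact upper-tail area for a chi-square with df = 2
print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.3f}")
```

With these invented counts the P-value is well above .05, so the null hypothesis of independence would not be rejected; the conclusion in the example depends on the actual data file.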

Chapter 13 Simple Linear Regression and Correlation: Inferential Methods

Example 13.2 Estimating the Population Regression Line and Point Estimates/Predictions

The Problem
Mother's Age and Babies' Birth Weight

Medical researchers have noted that adolescent females are much more likely to deliver low birth weight babies than are adult females. Because low birth weight babies have higher mortality rates, there have been a number of studies examining the relationship between birth weight and mother's age for babies born to young mothers. One such study is described in the article "The Risk of Teen Mothers Having Low Birth Weight Babies: Implications of Recent Medical Research for School Health Personnel" (J. of School Health (1998): ). The data found in file Example 13.2.sav is consistent with summary values given in the referenced article and also with data published by the National Center for Health Statistics.

1. Steps to prepare the data:
   a. Open the data set Example 13.2.sav.
2. Steps to create a scatter plot:
   a. Refer to Example
3. Steps to find an estimate for the population regression line:
   a. To find the estimate, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:
   d. Select the variable Birth Weight and move it into the Dependent box.
   e. Select the variable Maternal Age and move it into the Independent(s) box.
   f. Click OK.
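The B column that this procedure produces contains the ordinary least squares intercept and slope. As a rough cross-check, the same estimates can be computed from the usual formulas b = Sxy/Sxx and a = ȳ - b·x̄. The sketch below is an assumption-laden illustration: the function name is hypothetical and the ages and weights are invented, not the values in Example 13.2.sav.

```python
def least_squares_line(x, y):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: maternal age (years) and birth weight (grams).
ages = [15, 16, 17, 18, 19]
weights = [2300, 2600, 2700, 3000, 3100]

a, b = least_squares_line(ages, weights)
prediction_at_18 = a + b * 18       # point estimate/prediction for an 18-year-old mother
print(f"y-hat = {a:.1f} + {b:.1f}x; prediction at x = 18: {prediction_at_18:.1f}")
```

The value computed for x = 18 plays the same role as an entry of the saved prediction variable that SPSS adds to the Data Editor.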

The SPSS Output
[Regression output: Variables Entered/Removed table (Model 1: Maternal Age entered, method Enter; dependent variable: Birth Weight), Model Summary table (R = .884, R Square, Adjusted R Square, Std. Error of the Estimate; predictors: (Constant), Maternal Age), ANOVA table (Regression, Residual, Total: Sum of Squares, df, Mean Square, F, Sig.), and Coefficients table ((Constant), Maternal Age: Unstandardized Coefficients B and Std. Error, Standardized Coefficients Beta, t, Sig.)]

The output consists of 4 boxes. The least squares regression equation may be obtained from the last (fourth) box in the output, labeled Coefficients. The estimates are both given in the first column, labeled B. The y-intercept is given in the first row, labeled (Constant), and the slope is given in the second row, labeled Maternal Age. For this example the least squares regression equation is ŷ =

4. Steps to find point estimates/predictions:
   a. To find the point estimates/predictions, repeat Steps 3 (a)-(e).
   b. Click on Save. A second window opens up. The window is shown below:

   c. Check the Unstandardized option and click Continue.
   d. Click OK.
   e. The point estimates/predictions will appear in the SPSS Data Editor window as a variable named pre_1.

Example 13.4 Confidence Interval for the Slope of the Population Regression Line

The Problem
Athletic Performance and Cardiovascular Fitness

Is cardiovascular fitness (as measured by the time to exhaustion running on a treadmill) related to an athlete's performance in a 20-km ski race? The data in file Example 13.4.sav on x = treadmill time to exhaustion (min) and y = 20-km ski time (min) was taken from the article "Physiological Characteristics and Performance of Top U.S. Biathletes" (Medicine and Science in Sports and Exercise (1995): ).

1. Steps to prepare the data:
   a. Open the data set Example 13.4.sav.
2. Steps to create a scatter plot:
   a. Refer to Example
3. Steps to find a CI for β:
   a. To find the CI, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:

   d. Select the variable Ski Time and move it into the Dependent box.
   e. Select the variable Treadmill Time and move it into the Independent(s) box.
   f. Click on Statistics. A second window opens up. The window is shown below:
   g. Check the Confidence intervals option (also ensure the default selections, Estimates and Model fit, are checked) and click Continue.
   h. Click OK.

The SPSS Output
[Regression output: Variables Entered/Removed table (Model 1: Treadmill Time entered, method Enter; dependent variable: Ski Time)]

[Regression output, continued: Model Summary table (R = .796, R Square, Adjusted R Square, Std. Error of the Estimate; predictors: (Constant), Treadmill Time), ANOVA table (Regression, Residual, Total: Sum of Squares, df, Mean Square, F, Sig.), and Coefficients table ((Constant), Treadmill Time: B, Std. Error, Beta, t, Sig., 95% Confidence Interval for B: Lower Bound, Upper Bound; dependent variable: Ski Time)]

The output consists of 4 boxes. The desired interval can be found under the column heading 95% Confidence Interval for B, second row. For this example, the 95% CI for β is (-3.671, -.996).

Example 13.6 Residual Analysis

The Problem
Landslides and Timber Growth

Landslides are common events in tree-growing regions of the Pacific Northwest, so their effect on timber growth is of special concern to foresters. The article "Effects of Landslide Erosion on Subsequent Douglas Fir Growth and Stocking Levels in the Western Cascades, Oregon" (Soil Science Soc. of Amer. J. (1984): ) reported on the results of a study in which growth in a landslide area was compared with growth in a previously clear-cut area. We present data on clear-cut growth, with x = tree age (years) and y = 5-year height growth (cm). The data in file Example 13.6.sav is consistent with the study results.

1. Steps to prepare the data:
   a. Open the data set Example 13.6.sav.
2. Steps to create a scatter plot:
   a. Refer to Example
3. Steps to find the point predictions, residuals, and the standardized residuals:
   a. To find the values, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:

   d. Select the variable Growth and move it into the Dependent box.
   e. Select the variable Age and move it into the Independent(s) box.
   f. Click on Save. A second window opens up. The window is shown below:
   g. Under the Predicted Values heading, select the Unstandardized option.
   h. Under the Residuals heading, select the Unstandardized and Studentized options.
   i. Click Continue.
   j. Click OK.
   k. You will get the regression output window as in Example 13.2 and the other regression examples above. In the SPSS Data Editor window the point predictions will appear as the variable pre_1, the residuals as res_1, and the standardized residuals as sre_1.
4. Steps to create a normal probability plot of the residuals and standardized residuals:
   a. To create the plots, we use the Graphs menu.
   b. Scroll down to the P-P option.
   c. The following window opens up:

   d. Select the variables Unstandardized Residual and Studentized Residual, and move them into the Variables box.
   e. Click OK.

The SPSS Output
[PPlot output: Normal P-P Plot of Unstandardized Residual and Normal P-P Plot of Studentized Residual; axes: Observed Cum Prob (horizontal), Expected Cum Prob (vertical)]

Example Confidence Interval for a Mean y Value

The Problem
Shark Length and Jaw Width

Physical characteristics of sharks are of interest to surfers and scuba divers, as well as marine researchers. The data in file Example on x = length (ft) and y = jaw width (in) for 44 sharks was found in various articles appearing in the magazines Skin Diver and Scuba News. Use the data to compute a 90% confidence interval for the mean jaw width for 15-ft long sharks.

1. Steps to prepare the data:
   a. Open the data set Example 13.4.sav.
2. Steps to create a scatter plot:
   a. Refer to Example 3.21.

3. Steps to find a CI for a mean y value:
   a. To find the CI, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:
   d. Select the variable Jaw Width and move it into the Dependent box.
   e. Select the variable Length and move it into the Independent(s) box.
   f. Click on Save. A second window opens up. The window is shown below:
   g. Under the Prediction Intervals heading, select the Mean option.
   h. In the box labeled Confidence Interval, enter the desired level of confidence.
   i. Click Continue.
   j. Click OK.

   k. You will get the regression output window as in Example 13.2 and the other regression examples above. In the SPSS Data Editor window the lower and upper limits of the confidence interval will appear as the variables lmci_1 and umci_1, respectively.
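The saved limits correspond to the standard interval for a mean y value at a chosen x*, namely ŷ ± (t critical value)·s_e·sqrt(1/n + (x* - x̄)²/Sxx). The following Python sketch illustrates that formula under stated assumptions: the function name is hypothetical, the five (length, jaw width) pairs are invented rather than taken from the shark data file, and the t critical value 2.353 (90% confidence, df = 3) comes from a t table.

```python
import math


def mean_response_interval(x, y, x_star, t_crit):
    """CI for the mean y value at x_star in simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))                 # residual standard deviation
    y_hat = a + b * x_star
    se_fit = s_e * math.sqrt(1 / n + (x_star - x_bar) ** 2 / s_xx)
    return y_hat - t_crit * se_fit, y_hat + t_crit * se_fit


# Hypothetical data: shark length (ft) and jaw width (in).
lengths = [10, 12, 14, 16, 18]
jaw_widths = [12, 15, 15, 18, 20]

T_CRIT_90_DF3 = 2.353      # t critical value, 90% confidence, df = 5 - 2
lmci, umci = mean_response_interval(lengths, jaw_widths, 15, T_CRIT_90_DF3)
print(f"90% CI for mean jaw width at 15 ft: ({lmci:.3f}, {umci:.3f})")
```

SPSS saves one such pair of limits for every case in the data file, so a 15-ft shark must be present as a case (possibly with a missing y value) for its interval to appear.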

Chapter 14 Multiple Regression Analysis

Example 14.6 Least Squares Estimates

The Problem
Soil and Sediment Characteristics

Soil and sediment adsorption, the extent to which chemicals collect in a condensed form on the surface, is an important characteristic because it influences the effectiveness of pesticides and various agricultural chemicals. The article "Adsorption of Phosphates, Arsenate, Methanearsenate, and Cacodylate by Lake and Stream Sediments: Comparisons with Soils" (J. of Environ. Qual. (1984): ) presented the data found in file Example 14.6.sav, consisting of n = 13 $(x_1, x_2, y)$ triples, and proposed the model $y = \alpha + \beta_1 x_1 + \beta_2 x_2 + e$ for relating y = phosphate adsorption index, $x_1$ = amount of extractable iron, and $x_2$ = amount of extractable aluminum.

1. Steps to prepare the data:
   a. Open the data set Example 14.6.sav.
2. Steps to find the least squares estimates:
   a. To find the estimates, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:
   d. Select the variable Phosphate and move it into the Dependent box.
   e. Select the variables Iron and Aluminum, and move them into the Independent(s) box.
   f. Click OK.
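For a model with two predictors, the least squares estimates solve the normal equations (XᵀX)β = Xᵀy. The sketch below shows one way to do this with plain Python; it is only an illustration, not how SPSS computes the estimates. The function names are hypothetical and the data are invented so that y = 1 + 2·x1 + 3·x2 holds exactly, which makes the result easy to check.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular (e.g. collinear predictors).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]     # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        tail = sum(m[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (m[r][n] - tail) / m[r][r]
    return solution


def least_squares_estimates(x1, x2, y):
    """Return [a, b1, b2] for the fitted equation y-hat = a + b1*x1 + b2*x2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]        # design matrix with intercept
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [9, 8, 19, 18, 29, 28]

a, b1, b2 = least_squares_estimates(x1, x2, y)
print(f"y-hat = {a:.3f} + {b1:.3f} x1 + {b2:.3f} x2")
```

The three values returned correspond to the (Constant), first-predictor, and second-predictor rows of the B column in the Coefficients box.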

The SPSS Output
[Regression output: Variables Entered/Removed table (Model 1: Aluminum, Iron entered, method Enter; dependent variable: Phosphate), Model Summary table (R, R Square, Adjusted R Square, Std. Error of the Estimate; predictors: (Constant), Aluminum, Iron), ANOVA table (Regression, Residual, Total: Sum of Squares, df, Mean Square, F, Sig.), and Coefficients table ((Constant), Iron, Aluminum: Unstandardized Coefficients B and Std. Error, Standardized Coefficients Beta, t, Sig.)]

The output consists of 4 boxes. The least squares estimates for the regression equation coefficients may be obtained from the last (fourth) box in the output, labeled Coefficients. The estimates are given in the first column, labeled B. The estimate of the constant term α is given in the first row, labeled (Constant). The estimate of the coefficient $\beta_1$ is given in the second row, labeled Iron. The estimate of the coefficient $\beta_2$ is given in the third row, labeled Aluminum. For this example the estimated regression equation is ŷ = x 1.349x 2.

Example Model Selection and the Coefficient of Multiple Determination, $R^2$

The Problem
Modeling the Price of Industrial Properties

The paper "Using Multiple Regression Analysis in Real Estate Appraisal" (The Appraisal Journal (2001): ) reported the data found in file Example sav for a random sample of nine large industrial properties. A primary objective was to relate the price of the property to various other characteristics of the property. The variables from the study are y = price per square foot, $x_1$ = size of building (square feet), $x_2$ = age of the building (years), $x_3$ = quality of location (measured on a scale of 1, very poor location, to 4, very good location), and $x_4$ = land to building ratio.

1. Steps to prepare the data:
   a. Open the data set Example sav.
2. Steps to find the best fit model based on $R^2$ and adjusted $R^2$:

   a. To examine the fit of the possible models, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:
   d. Select the variable Price and move it into the Dependent box.
   e. Select the variable Size and move it into the Independent(s) box.
   f. To the right of the heading Block 1 of 4, click Next.
   g. Select the variable Age and move it into the Independent(s) box.
   h. Click Next.
   i. Select the variable Location and move it into the Independent(s) box.
   j. Click Next.
   k. Select the variable Land/Building Ratio and move it into the Independent(s) box.
   l. Click OK.

The SPSS Output
[Regression output: Variables Entered/Removed table (Model 1: Size; Model 2: Age; Model 3: Location; Model 4: Land/Building Ratio; method Enter for each; dependent variable: Price)]

[Regression output, continued: Model Summary table (R, R Square, Adjusted R Square, Std. Error of the Estimate for Models 1-4; R = .497 for Model 1), ANOVA table (Regression, Residual, Total for each model: Sum of Squares, df, Mean Square, F, Sig.), and Coefficients table (Unstandardized Coefficients B and Std. Error, Standardized Coefficients Beta, t, Sig. for each model; dependent variable: Price). Predictors: Model 1: (Constant), Size; Model 2: (Constant), Size, Age; Model 3: (Constant), Size, Age, Location; Model 4: (Constant), Size, Age, Location, Land/Building Ratio]

[Regression output, continued: Excluded Variables table (Beta In, t, Sig., Partial Correlation, Collinearity Statistics Tolerance for the variables not yet in Models 1-3; dependent variable: Price)]

The output consists of 5 boxes. The coefficients of determination for the examined predictor models can be found in the box labeled Model Summary. The results suggest utilizing the two-predictor model that uses $x_1$ (size) and $x_2$ (age): this combination has the largest adjusted $R^2$ of the potential predictor combinations.

Example Another Example of Model Selection

The Problem
Durable Press Rating of Cotton Fabric

The data found in file Example sav were taken from the article "Applying Stepwise Multiple Regression Analysis to the Reaction of Formaldehyde with Cotton Cellulose" (Textile Research J. (1984): ). The dependent variable, y = durable press rating, is a quantitative measure of wrinkle resistance. The four independent variables used in the model building process are $x_1$ = HCHO (formaldehyde) concentration, $x_2$ = catalyst ratio, $x_3$ = curing temperature, and $x_4$ = curing time. In addition to these variables, the investigators considered as potential predictors $x_1^2$, $x_2^2$, $x_3^2$, and $x_4^2$ and all six interactions $x_1 x_2, \ldots, x_3 x_4$, a total of p = 14 candidates.

1. Steps to prepare the data:
   a. Open the data set Example sav.
2. Steps to find the best fit model based on $R^2$ and adjusted $R^2$:
   a. To examine the fit of the possible models, we use the Analyze menu.
   b. Scroll down to the Regression submenu and select the Linear option.
   c. The following window opens up:

   d. Select the dependent variable and move it into the Dependent box.
   e. Select the independent variables x1, x2, x3, and x4, and move them into the Independent(s) box.
   f. Click on Next for Block 1 of 1.
   g. Select sq_x1, sq_x2, sq_x3, and sq_x4, and move them into the Independent(s) box.
   h. Click on Next for Block 2 of 2.
   i. Select x1x2, x1x3, x1x4, x2x3, x2x4, and x3x4, and move them into the Independent(s) box.
   j. Click on Next for Block 3 of 3.
   k. Click on Statistics.
   l. Choose R squared change, as seen in the following window:

Here is the model summary:

[Model Summary output, Change Statistics: R Square Change, F Change, df1, df2, Sig. F Change for each model]
a Predictors: (Constant), Catalyst Ratio
b Predictors: (Constant), Catalyst Ratio, Curing Temp
c Predictors: (Constant), Catalyst Ratio, Curing Temp, SQ_X1, SQ_X4, SQ_X2, SQ_X3
d Predictors: (Constant), Catalyst Ratio, Curing Temp, SQ_X1, SQ_X4, SQ_X2, SQ_X3, X1X3

The best model is model 3, because adding its predictors produces a large increase in R square.
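The quantities used in these two model selection examples follow from simple formulas: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), and the F statistic for an R² change is (ΔR²/df1) / ((1 - R²_full)/df2), with df1 equal to the number of predictors added and df2 = n - k_full - 1. The Python sketch below illustrates both. The function names are hypothetical; the first call assumes that the surviving value .497 in the industrial-properties output is R for the one-predictor model with n = 9, and the numbers in the second call are invented.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for a model with k predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic and degrees of freedom for the increase in R-squared."""
    df1 = k_added
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2


# One-predictor model from the industrial-properties example (assumes R = .497, n = 9).
adj_r2_size_only = adjusted_r_squared(0.497 ** 2, n=9, k=1)
print(f"adjusted R-squared, Size only: {adj_r2_size_only:.3f}")

# Hypothetical block comparison: 4 predictors added to a 2-predictor model, n = 30.
f, df1, df2 = f_change(r2_reduced=0.60, r2_full=0.80, n=30, k_full=6, k_added=4)
print(f"F change = {f:.2f} on ({df1}, {df2}) df")
```

A large R Square Change alone does not settle the choice of model; the Sig. F Change column indicates whether the increase is larger than would be expected by chance for the number of predictors added.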

Chapter 15

Example 15.4: Musical Preferences and Reckless Behavior

The Problem- Determine whether adolescents who preferred certain types of music reported higher rates of reckless behaviors, such as speeding, drug use, shoplifting, and unprotected sex.

A sample of 20 students at a large high school was selected from each of four different musical preferences: 1) acoustic/pop, 2) mainstream rock, 3) hard rock, and 4) heavy metal. The numbers represent the number of times the selected student was engaged in various reckless activities.

1. Steps to prepare the data:
   a. Open the data set Example15.4.sav.
2. Side-by-side Boxplot:
   a. Go to Graphs.
   b. Scroll down to Boxplot.
   c. Click on Simple.
   d. Select Define.
   e. Move the variable Reckless[y] under Variable and move the variable Musical Preference[pr] under Category Axis.
   f. Then click OK.

[Boxplot output: side-by-side boxplots of Reckless score by Musical Preference (Acoustic/pop, Mainstream Rock, Hard Rock, Heavy Metal), with the group sizes N shown on the category axis]

3. Creating the ANOVA Table:
   a. Go to Analyze.
   b. Select Compare Means.
   c. Select One-Way ANOVA.
   d. Move the variable Reckless score [y] under the Dependent List and Musical Preference [pref] under Factor.
   e. Click OK.
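The entries of the ANOVA table that this procedure produces can be reproduced from the group data. The following is a minimal Python sketch of the single-factor ANOVA computations; the function name is hypothetical and the four small groups are invented for illustration, not taken from Example15.4.sav.

```python
def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ss_within": ss_within, "df_within": df_within,
        "f": f,
    }


# Hypothetical reckless-behavior scores for four musical preference groups.
scores = [
    [1, 2, 3],   # acoustic/pop
    [2, 3, 4],   # mainstream rock
    [4, 5, 6],   # hard rock
    [5, 6, 7],   # heavy metal
]

table = one_way_anova(scores)
print(table)
```

The Between Groups and Within Groups rows of the SPSS table correspond to `ss_between` and `ss_within` here, and F is the ratio of their mean squares.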

Here is the output:

[ANOVA output for Reckless score: rows Between Groups, Within Groups, Total; columns Sum of Squares, df, Mean Square, F, Sig.]

Example 15.6: Sleep Time

The Problem- The effects of ethanol on sleep time.

A sample of 20 rats was given an oral injection with a given concentration of ethanol per body weight. Then the REM (rapid eye movement) sleep time for each rat was recorded.

1. Steps to prepare the data:
   a. Open the data set Example15.6.sav.
2. Grouping the data based on significant differences:
   a. Repeat steps a through d of Creating the ANOVA Table in Example 15.4.
   b. Click on Post Hoc.
   c. Click on Tukey's-b and Tukey.
   d. Click on Continue.

Example 15.10: Comparing Four Stool Designs

The dataset in Example 15.10 has three variables. The first variable is stool type (stooltp), the second variable is the subject number (blook), and the third variable is the effort score (effort), measured on the Borg scale. The effort variable measures the effort required by a subject to rise from a sitting position.

1. Steps to prepare the data:
   a. Open the data set Example15.10.sav.
2. Preparing ANOVA Table:

   a. Go to Analyze.
   b. Scroll down to General Linear Model.
   c. Click on Univariate.
   d. Move the variables blook and stooltp under Fixed Factor(s) and move the variable effort under Dependent Variable.
   e. Click on Model.
   f. The following window opens up:
   g. Select Custom for Specify Model and Main effects under Build Terms; you may uncheck the intercept, and Sum of squares should be Type II.
   h. Click Continue.

[Tests of Between-Subjects Effects output, dependent variable EFFORT: rows Model, BLOOK, STOOLTYP, Error, Total; columns Type II Sum of Squares, df, Mean Square, F, Sig.; footnote a: R Squared = .993 (Adjusted R Squared = .989)]
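The block, treatment, and error rows of this table follow the usual randomized block decomposition. The Python sketch below illustrates it under stated assumptions: the function name is hypothetical, the 3-block by 4-treatment effort scores are invented (they are not the values in Example15.10.sav), and the sums of squares are computed about the grand mean, so the Model and Total rows that SPSS prints when the intercept is removed are not reproduced.

```python
def randomized_block_anova(table):
    """ANOVA for a randomized block design; table[i][j] = block i, treatment j."""
    b = len(table)          # number of blocks (subjects)
    k = len(table[0])       # number of treatments (stool types)
    grand_mean = sum(sum(row) for row in table) / (b * k)
    block_means = [sum(row) / k for row in table]
    treat_means = [sum(table[i][j] for i in range(b)) / b for j in range(k)]
    ss_total = sum((v - grand_mean) ** 2 for row in table for v in row)
    ss_blocks = k * sum((m - grand_mean) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand_mean) ** 2 for m in treat_means)
    ss_error = ss_total - ss_blocks - ss_treat
    df_treat, df_blocks = k - 1, b - 1
    df_error = df_treat * df_blocks
    ms_error = ss_error / df_error
    return {
        "ss_treat": ss_treat, "ss_blocks": ss_blocks, "ss_error": ss_error,
        "df_treat": df_treat, "df_blocks": df_blocks, "df_error": df_error,
        "f_treat": (ss_treat / df_treat) / ms_error,
        "f_blocks": (ss_blocks / df_blocks) / ms_error,
    }


# Hypothetical effort scores: rows = subjects (blocks), columns = stool types.
effort = [
    [8, 10, 12, 10],
    [9, 12, 13, 10],
    [10, 11, 14, 13],
]

result = randomized_block_anova(effort)
print(result)
```

In a balanced design like this one, each main effect's sum of squares is the same whether or not it is adjusted for the other factor, which is why the simple formulas above line up with the factor rows of a main-effects model.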