Exercises for Research Methods, Statistics Part (Opgaven Onderzoeksmethoden, Onderdeel Statistiek)




1. What is the measurement scale of the following variables?
   (a) Shoe size
   (b) Religion
   (c) Car brand
   (d) Score in a tennis game
   (e) Number of work hours per week
   (f) Salary in euros per month
   (g) Your opinion about this course in the Caracal system

2. What is the median of the distribution 123, 154, 160, 187?

3. What happens to the mean and the median in a set of five scores when the largest one is increased by several points?

4. Assume that for the variable gender we encode male with 0 and female with 1. We can compute the mean of this variable. What type of variable is this? What is the interpretation of the mean of this variable?

5. (Try this at home) A software company is about to release new spreadsheet software. To test the practicability of the software, it conducted a pilot study in which 10 participants were asked to use the new software to perform a set of pre-defined computation tasks. In the pilot study, the time required by the participants to complete the tasks was measured in minutes. The results of the study were:

   13.7  28.4  21.6  15.3  79.1  14.8  17.4  16.1  23.4  19.2

   (a) Open SPSS on your computer. Select "Type in data" in the splash screen, name a new variable "time", and type in the data as above. Construct a histogram to depict the frequency distribution of the results of the study as follows: Click "Graphs" on the toolbar and select the item "Chart Builder...". Click on the question mark "?". This will launch the Help function in your browser. In the left-hand sidebar of the Help window, open "Building charts", then open "Chart types", then click "Histograms". On the right-hand side of the Help window, now click "How to create a histogram" and follow the steps.

   Upon grouping the results, experiment with different interval sizes and see how the histogram changes. You can change the interval size and number of intervals in the Chart Builder as follows. In the Chart Builder dialog click on "Element Properties"; a new dialog window opens. In this new dialog click on "Bar1", select "Statistic: Histogram" in the pop-up menu, and click the button "Set Parameters...". A new dialog window opens. In the new window click on "Custom" for "Bin Sizes" and change the "Number of intervals". Finish the dialogs by clicking "Continue", "Apply", and "OK".
   (b) Compute the mode, the median and the mean of the results of the study. Also estimate these summary statistics from the histograms you constructed in part (a) and compare the results. Check your results with SPSS (try to find it in the SPSS Help system, which has very good step-by-step tutorials in the left-hand sidebar!).
   (c) The value 79.1 clearly is an outlier within the collection of results. Argue whether or not this outlier can have influenced the mode, the median and the mean which you have computed for the data.
   (d) Suppose that the 10 participants in the pilot study constitute the entire clientele of the software company. Compute the variance and standard deviation for the clientele.
   (e) Now suppose that the 10 participants in the pilot study constitute just a portion of the company's clientele. Can the variance and standard deviation computed above be used as estimates for the variance and standard deviation for the entire clientele?
   (f) Let SPSS calculate the variance (you can find this in "Analyze" and "Descriptives"). Which of the two variants is used by SPSS to calculate the variance?

   Perform the calculations involved in the exercise both by hand and using SPSS.

6. Let X be a random variable denoting the number of pips (NL: ogen) in the throw of a die (NL: dobbelsteen).
   (a) Describe the probability mass function of X.
   (b) Compute $E(X)$.
   (c) Compute $\mathrm{var}(X)$.

7. Prove that for any random variable X, and for any a and b:
   (a) $E(aX + b) = aE(X) + b$
   (b) $\mathrm{var}(X) = E(X^2) - \mu^2$
   (c) $\mathrm{var}(aX + b) = a^2\,\mathrm{var}(X)$

8. Prove that for a random variable X that has a binomial distribution with parameters n and p, $E(X) = np$ and $\mathrm{var}(X) = np(1 - p)$. Hint: Denote the outcome of the i-th binary trial as $X_i$, then $X = X_1 + X_2 + \dots + X_n$. Calculate $E(X_i)$ and $\mathrm{var}(X_i)$.

9. Prove that the mean of the geometric distribution is indeed $E(X) = \frac{1 - p}{p}$. Hint: use the fact that $\sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}$, for $|x| < 1$, and have a look at $\sum_{k=0}^{\infty} \frac{d}{dx} x^k$.
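For readers without SPSS at hand, the summary statistics of Exercise 5 can also be cross-checked with a few lines of Python. The following is only a sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the exercise. It computes both variance variants (dividing by N and by n - 1) so the two can be compared.

```python
import statistics

# Completion times in minutes from the pilot study in Exercise 5.
times = [13.7, 28.4, 21.6, 15.3, 79.1, 14.8, 17.4, 16.1, 23.4, 19.2]
n = len(times)

mean_time = statistics.mean(times)
median_time = statistics.median(times)

# Variant 1: sum of squared deviations divided by N (whole population).
pop_variance = statistics.pvariance(times)
# Variant 2: sum of squared deviations divided by n - 1 (estimate from a sample).
sample_variance = statistics.variance(times)

print(f"mean = {mean_time:.2f}, median = {median_time:.2f}")
print(f"population variance = {pop_variance:.2f}, sd = {pop_variance ** 0.5:.2f}")
print(f"sample variance     = {sample_variance:.2f}, sd = {sample_variance ** 0.5:.2f}")

# Recomputing without the outlier shows how strongly each statistic reacts to it.
without_outlier = [t for t in times if t != 79.1]
print(f"without 79.1: mean = {statistics.mean(without_outlier):.2f}, "
      f"median = {statistics.median(without_outlier):.2f}")
```

Comparing the two printed variances with the SPSS output is one way to approach part (f).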

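The identities in Exercises 6 and 7 can be checked numerically for the fair die before attempting the proofs. The sketch below uses exact fractions; the helper names `expectation` and `variance` and the constants `a` and `b` are assumptions made for this illustration only. A numerical check for one distribution is of course not a proof.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die: P(X = k) = 1/6 for k = 1..6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf, g=lambda x: x):
    """var(g(X)) computed as E[g(X)^2] - (E[g(X)])^2."""
    mu = expectation(pmf, g)
    return expectation(pmf, lambda x: g(x) ** 2) - mu ** 2

e_x = expectation(pmf)
var_x = variance(pmf)

# Hypothetical constants, chosen only for this check.
a, b = 3, -2
e_lin = expectation(pmf, lambda x: a * x + b)
var_lin = variance(pmf, lambda x: a * x + b)

print("E(X) =", e_x, " var(X) =", var_x)
print("E(aX + b) == a E(X) + b:", e_lin == a * e_x + b)
print("var(aX + b) == a^2 var(X):", var_lin == a ** 2 * var_x)
```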
10. The age of participants of a seminar on probability in business has a mean of 44.9 and a standard deviation of 13.48. Calculate the Z-scores for the ages 47, 23, and 52.

11. Calculate the first quartile, the second quartile and the third quartile for a standard normally distributed variable. Also calculate the ninetieth percentile.

12. Use the tables for the standard normal distribution to calculate the following probabilities:
   (a) P Z 2.00.
   (b) P Z 0.15.
   (c) P Z = 0.25.
   (d) P Z 1.22.
   (e) P Z 2.22.
   (f) P 1.33 Z 0.95.
   (g) P 2.51 Z 1.89.
   (h) P 0.51 Z 2.12.
   (i) P 0.83 Z 0.59.

13. An internet provider analyses monthly use by his customers and concludes that the use follows a normal distribution with a mean of 130 hours and a standard deviation of 12 hours.
   (a) How likely is it for a customer to use the internet for more than 148 hours a month?
   (b) The provider would like to give a bonus to his 20% best clients. For how many hours at least do these clients use the internet?
   (c) What proportion of the population is more than 0.7 standard deviations from the mean?
   (d) Is it possible to compute the percentage of clients who use the internet for exactly 135 hours per month? Support your answer with some argumentation.

14. With the notation t(df) we can represent a variable that has a t-distribution with df degrees of freedom. A formula like P(t(10) > 3.5) then represents the probability that a random variable that follows a t-distribution with 10 degrees of freedom takes a value larger than 3.5. In the tables for the t-distribution you can look up these one-tail probabilities for special cases. For instance, you can find for df = 27 that P(t(27) > 1.703) = .05. Check if the following statements are true or false:
   (a) P(t(27) > 1.638) > .05.
   (b) P(t(12) > 1.980) < .025.
   (c) P(t(21) < 1.634) < .10.
   (d) P(t(30) > 1.310) > P(t(18) > 1.734).
   (e) P(t(28) > 2.467) > P(t(14) > 2.977).
   (f) P(t(22) > 1.815) < P(t(16) > 2.133).

15. A group of consumers has been questioned about their satisfaction with a certain type of washing powder. This satisfaction level was measured by a series of questions which resulted per customer in a value between 0 and 10 (interval scale). The manufacturer wants to test if the mean measurement is significantly larger than 7.0, with a significance level α = 0.05. The result from interviews with 25 consumers gives a mean score of 7.48, with a sample standard deviation of 1.34.
   (a) Perform a test of the hypothesis for the manufacturer.
   (b) If the manufacturer had used a stricter test by choosing α = 0.01, would that change the conclusion of the test?

16. In a survey among 30 visitors of a country fair the average age was measured as 44.9 years with a standard deviation of 13.48. Test whether this mean is significantly different from 40 years (two-sided, α = 0.05).

17. A student has collected survey data on user interfaces and has found the following figures for a measurement on the completion time of a task: $\bar{X} = 39.8$, $s = 2.67$, $n = 9$. We consider the hypotheses:

   $H_0: \mu = 38$ (null hypothesis)
   $H_1: \mu \neq 38$ (alternative hypothesis)

   (a) Calculate the acceptance region for the hypothesis test with significance level α = 0.05. The acceptance region is the complement of the critical region: those values which make you retain the null hypothesis.
   (b) Calculate the 95% confidence interval for the mean. For a sample of size n with unknown variance this interval is $[\bar{X} - t_{crit}(df)\, s_{\bar{X}},\ \bar{X} + t_{crit}(df)\, s_{\bar{X}}]$, with $t_{crit}$ the two-sided critical value with α = .05.
   (c) Can you spot the relationship between the acceptance region of item (a) and the confidence interval of item (b)?

18. The following data are from a repeated-measures (or paired-samples) study examining the effect of a treatment by measuring a group of N = 4 participants before and after they receive the treatment.

   Participant   Before Treatment   After Treatment
   A                    7                 10
   B                    6                 13
   C                    9                 12
   D                    5                  8

   (a) Calculate the difference scores and $\bar{D}$.
   (b) Calculate SS, the sample variance of D, and the estimated standard error.
   (c) Is there a significant treatment effect? Use α = .05, two tails.
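As a complement to the printed tables for the standard normal distribution, the quantities asked for in Exercises 10, 11 and 13 can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch for checking table look-ups, not a replacement for them; the helper `z_score` and the variable names are illustrative assumptions.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardize a raw score: how many sd's x lies above the mean."""
    return (x - mean) / sd

# Exercise 10: ages with mean 44.9 and standard deviation 13.48.
for age in (47, 23, 52):
    print(f"age {age}: z = {z_score(age, 44.9, 13.48):.2f}")

# Exercise 11: quartiles and 90th percentile of the standard normal distribution.
z = NormalDist()  # mean 0, standard deviation 1
quartiles = [z.inv_cdf(p) for p in (0.25, 0.50, 0.75)]
p90 = z.inv_cdf(0.90)
print("quartiles:", [round(q, 3) for q in quartiles], "90th percentile:", round(p90, 3))

# Exercise 13: monthly use is normal with mean 130 hours and sd 12 hours.
usage = NormalDist(mu=130, sigma=12)
p_over_148 = 1 - usage.cdf(148)        # (a) upper-tail probability
top20_cutoff = usage.inv_cdf(0.80)     # (b) 80th percentile of usage
p_beyond = 2 * (1 - z.cdf(0.7))        # (c) both tails beyond 0.7 sd
print(f"P(X > 148) = {p_over_148:.4f}")
print(f"top-20% cutoff = {top20_cutoff:.1f} hours")
print(f"P(|Z| > 0.7) = {p_beyond:.4f}")
```

Because the tables are rounded to two decimals in z, small differences between the table-based answer and these values are to be expected.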

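The hand calculation for the paired-samples design of Exercise 18 follows a fixed sequence: difference scores, their mean, SS, sample variance, estimated standard error, and finally the t statistic. The sketch below walks through those steps in Python. The critical value is hard-coded from a standard t table (df = 3, two tails, α = .05) because the standard library has no t-distribution; all names are illustrative assumptions.

```python
import math
import statistics

# Before/after scores for participants A-D from Exercise 18.
before = [7, 6, 9, 5]
after = [10, 13, 12, 8]

# Step 1: difference scores D = after - before, and their mean.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_d = statistics.mean(diffs)

# Step 2: sum of squared deviations, sample variance, estimated standard error.
ss = sum((d - mean_d) ** 2 for d in diffs)
var_d = ss / (n - 1)
se_d = math.sqrt(var_d / n)

# Step 3: t statistic for H0: mean difference = 0, with df = n - 1.
t_stat = mean_d / se_d
df = n - 1

# Two-tailed critical value for df = 3 and alpha = .05, copied from a t table.
T_CRIT_DF3 = 3.182
significant = abs(t_stat) > T_CRIT_DF3

print(f"D = {diffs}, mean D = {mean_d}, SS = {ss}")
print(f"variance = {var_d}, standard error = {se_d}, t({df}) = {t_stat}")
print("reject H0:", significant)
```

The same steps apply to the sick-day data of Exercise 21 after replacing the two lists and looking up the critical value for df = 11.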
19. Siegel (1990) found that master students were less likely to buy books than bachelor students. The researcher had recorded the number of books bought per year per student in two samples and collected the following data:

   Bachelor students Master students 10 7 8 4 7 9 9 3 13 7 7 6 12

   (a) Is the number of books bought per master student significantly lower than for bachelor students? Use a one-tailed test with α = .05.
   (b) Compute the value of $\omega^2$ (percentage of variance accounted for) for these data.

20. In the TV quiz University Challenge, 10 students from Computer Science (CS) take on 10 students from Information Science (IS). Each student gets a quiz of 100 questions. The score of each student is the number of correct answers. The scores are:

   Computer Science:     55  23  43  43  39  52  53  62  27  23
   Information Science:  52  19  42  40  48  39  41  53  18  28

   (a) Test if the scores of the two groups are different (α = 0.05).
   (b) Calculate the percentage of variance in the scores explained by the difference between the two groups.
   (c) (Try this at home) Repeat the analysis of the previous exercise using SPSS. Construct appropriate variables. Note that you have to construct two variables and a total of 20 data subjects (lines) in your data set. One variable is used to denote which group the measurement comes from (e.g. use 1 for CS and 2 for IS), and the other variable is the test variable, which stores the measured data (number of correct answers). The analysis can be done with the Independent Samples t-test dialog.

21. A company wants to determine if a prevention campaign will decrease the amount of time that employees are unavailable for work due to illness. For a group of 12 employees, the number of sick days in one year is measured before and after the campaign. The measurements are reported in the following table:

   Before:  10  13  19  12   9   8  14  12  17  20   7  11
   After:    5   9  13  15   4   5  11  14  13  18   7  12

   (a) Test if the campaign has made a significant difference in absenteeism due to illness (α = 0.05).
   (b) Calculate the percentage of variance in the number of sick days explained by the effect.
   (c) (Try this at home) Repeat the analysis using SPSS. In this case you have to construct a data set with two variables: one variable X to measure the sick days before the campaign, and one variable Y to measure the sick days after the campaign. You will have 12 lines in your data set, one for each employee. As an extra, you might even compute the difference variable D = Y - X and perform a one-sample t-test on this. You can fill in the values for D by hand, or you could use SPSS to do this for you (click "Transform" in the toolbar, then select "Compute Variable..."). The analysis can be done with the Paired Samples t-test.

22. A computer company has a service call center. During one week the number of incoming calls was registered and counted, specified in numbers before lunch break (AM) and after lunch break (PM). The results were:

         Mon  Tue  Wed  Thu  Fri
   AM     39   16   19   15   21
   PM     24   19   31   14   37

   (a) Test whether the incoming calls are evenly distributed over the five working days of the week (α = 0.05).
   (b) Test whether the incoming calls are evenly distributed over the morning and afternoon (α = 0.05).

23. In a street interview people were asked whether they regularly practice some kind of exercise or sport. The interviewers received a negative answer from 563 women and from 382 men. These respondents could also choose from four options to provide a reason for their negative answer. This resulted in the following table:

            No time  No need  No fun  Bad health
   Male       134      149      57       42
   Female     208      118     152       85

   (a) Test whether the reason for not exercising differs between the two genders (α = 0.05). What kind of test do you need?
   (b) Report per gender the reason for not exercising in percentages.
   (c) If you had to condense the data to a 2x2 table, which reasons would you merge?

24. People who participate in some kind of sports have several reasons to do so. Two of those reasons are Competition and Mastery (increasing skills). An interview was carried out among 150 sporting students (75 men and 75 women). To each respondent two scores have been assigned: one for Competition (+C: enjoys competition; -C: does not enjoy competition) and one for Mastery (+M: finds mastery important; -M: does not find mastery important). By combining the two scores each respondent can be assigned to one of four groups. This resulted in the following crosstable:

            +C+M  +C-M  -C+M  -C-M
   Male       33    20     7    15
   Female     16     9    23    27

   (a) Test whether the two genders differ in their motives to participate in sports (α = 0.05). What type of test do you need?

25. For the data of the previous question, test if the score on Competition (+C, -C) is independent of the score on Mastery (+M, -M). Also, what type of test is this?
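Exercises 22 to 25 all rest on comparing observed with expected frequencies. The sketch below shows one way to compute the chi-square statistic by hand in Python, once for a goodness-of-fit test against equal frequencies and once for an r x c contingency table. The function names are assumptions made for this illustration, and the critical values are copied from a standard chi-square table for α = 0.05.

```python
def chi_square_gof(observed, expected=None):
    """Goodness-of-fit statistic; assumes equal expected frequencies by default."""
    total = sum(observed)
    if expected is None:
        expected = [total / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / N.
            exp = row_totals[i] * col_totals[j] / total
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Critical values for alpha = 0.05, copied from a chi-square table (keyed by df).
CRITICAL_05 = {1: 3.841, 3: 7.815, 4: 9.488}

# Exercise 22: calls per weekday (df = 4) and per half-day (df = 1).
am = [39, 16, 19, 15, 21]
pm = [24, 19, 31, 14, 37]
per_day = [a + p for a, p in zip(am, pm)]
chi2_days = chi_square_gof(per_day)
chi2_halves = chi_square_gof([sum(am), sum(pm)])
print(f"days:   chi2 = {chi2_days:.3f}, critical = {CRITICAL_05[4]}")
print(f"halves: chi2 = {chi2_halves:.3f}, critical = {CRITICAL_05[1]}")

# Exercise 24: gender by motive group.
motives = [[33, 20, 7, 15], [16, 9, 23, 27]]
chi2_motives, df_motives = chi_square_independence(motives)
print(f"motives: chi2 = {chi2_motives:.3f}, df = {df_motives}, "
      f"critical = {CRITICAL_05[df_motives]}")
```

For Exercise 25 the same `chi_square_independence` function can be applied after pooling both genders into a 2x2 table of Competition by Mastery.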