Math 1011 Homework Set 2


Due February 12, 2014

1. Suppose we have two lists: (i) 1, 3, 5, 7, 9, 11; and (ii) 1001, 1003, 1005, 1007, 1009, 1011.
(a) Find the average and standard deviation for each of the lists.
(b) From your result in (a), can you find any property of the average and the standard deviation? (Hint: what is the relation between (i) and (ii)?)

Solution.
(a) For list (i), the average is (1 + 3 + 5 + 7 + 9 + 11)/6 = 6. The list of deviations is -5, -3, -1, 1, 3, 5. Then the SD is

$$\sqrt{\frac{(-5)^2 + (-3)^2 + (-1)^2 + (1)^2 + (3)^2 + (5)^2}{6}} = \sqrt{\frac{35}{3}} \approx 3.42.$$

For list (ii), the average is (1001 + 1003 + 1005 + 1007 + 1009 + 1011)/6 = 1006. The list of deviations is again -5, -3, -1, 1, 3, 5, so the SD is again

$$\sqrt{\frac{(-5)^2 + (-3)^2 + (-1)^2 + (1)^2 + (3)^2 + (5)^2}{6}} = \sqrt{\frac{35}{3}} \approx 3.42.$$

(b) This shows the property of change of scale: adding a constant to each entry of the list adds the same constant to the average, but the SD remains the same.

2. Suppose we have two lists: (i) 1, 2, 3, 4, 5, 6, 7; and (ii) 3, 6, 9, 12, 15, 18, 21.
(a) Find the average and standard deviation for each of the lists.
(b) From your result in (a), can you find any property of the average and the standard deviation? (Hint: what is the relation between (i) and (ii)?)

Solution.
(a) For list (i), the average is (1 + 2 + 3 + 4 + 5 + 6 + 7)/7 = 4. The list of deviations is -3, -2, -1, 0, 1, 2, 3. Then the SD is

$$\sqrt{\frac{(-3)^2 + (-2)^2 + (-1)^2 + (0)^2 + (1)^2 + (2)^2 + (3)^2}{7}} = 2.$$

For list (ii), the average is (3 + 6 + 9 + 12 + 15 + 18 + 21)/7 = 12.

The list of deviations is -9, -6, -3, 0, 3, 6, 9. Then the SD is

$$\sqrt{\frac{(-9)^2 + (-6)^2 + (-3)^2 + (0)^2 + (3)^2 + (6)^2 + (9)^2}{7}} = 6.$$

(b) This shows the property of change of scale: multiplying each entry of the list by a positive constant multiplies the average by the same constant, and the SD as well.

3. (a) Find the average and SD of the list: 41, 48, 50, 50, 54, 57.
(b) Which numbers on the list are within 0.5 SDs of average? Within 1.5 SDs of average?

Solution.
(a) The average is (41 + 48 + 50 + 50 + 54 + 57)/6 = 50. So the list of deviations is -9, -2, 0, 0, 4, 7. The SD is

$$\sqrt{\frac{(-9)^2 + (-2)^2 + (0)^2 + (0)^2 + (4)^2 + (7)^2}{6}} = 5.$$

(b) 48, 50, 50 are within 0.5 SDs of average, i.e. in the range from 47.5 to 52.5. 48, 50, 50, 54, 57 are within 1.5 SDs of average, i.e. in the range from 42.5 to 57.5.

4. A study on college students found that the men had an average weight of about 66 kg and an SD of about 9 kg. The women had an average weight of about 55 kg and an SD of 9 kg.
(a) Find the averages and SDs, in pounds, for both men and women (1 kg = 2.2 lb).
(b) Just roughly, what percentage of the men weighed between 57 kg and 75 kg?
(c) If you took the men and women together, would the SD of their weights be smaller than 9 kg, just about 9 kg, or bigger than 9 kg? Why? (Hint: recall that the standard deviation indicates how the data spread around the average.)

Solution.
(a) Average weight of men = 66 × 2.2 ≈ 145 pounds, SD = 9 × 2.2 ≈ 20 pounds. Average weight of women = 55 × 2.2 = 121 pounds, SD ≈ 20 pounds.
(b) About 68%: the range from 57 kg to 75 kg is average ± 1 SD.
(c) Bigger than 9 kg: if you take the men and the women together, the spread in weights goes up.
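The shift and scale properties from Problems 1 and 2 (the same scale rule underlies the kg-to-lb conversion in Problem 4(a)) can be checked numerically. The sketch below is only an illustration: the helper name `summarize` is made up here, and it assumes the SD convention used in the solutions above, which divides by the number of entries n rather than n - 1 (this is what `statistics.pstdev` computes).

```python
from statistics import mean, pstdev


def summarize(values):
    """Return (average, SD), where the SD divides by n (population SD)."""
    return mean(values), pstdev(values)


# Problem 1: list (ii) is list (i) with 1000 added to every entry.
list_1i = [1, 3, 5, 7, 9, 11]
list_1ii = [x + 1000 for x in list_1i]

# Problem 2: list (ii) is list (i) with every entry multiplied by 3.
list_2i = [1, 2, 3, 4, 5, 6, 7]
list_2ii = [3 * x for x in list_2i]

for name, values in [
    ("1(i)", list_1i),
    ("1(ii)", list_1ii),
    ("2(i)", list_2i),
    ("2(ii)", list_2ii),
]:
    avg, sd = summarize(values)
    print(f"List {name}: average = {avg:.2f}, SD = {sd:.2f}")
```

Adding 1000 should move the average from 6 to 1006 and leave the SD alone, while multiplying by 3 should triple both the average and the SD.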

5. An investigator has a computer file showing family incomes for 1,000 subjects in a certain study. These range from $5,800 a year to $98,600 a year. By accident, the highest income in the file gets changed to $986,000.
(a) Does this affect the average? If so, by how much?
(b) Does this affect the median? If so, by how much? (Hint: think about whether the highest income will affect the percentage to the right of the median or not.)

Solution.
(a) Yes: the average goes up by ($986,000 - $98,600)/1000 = $887.40.
(b) No: one advantage of the median is that it is not thrown off by outliers.

6. Many observers think there is a permanent underclass in American society: most of those in poverty typically remain poor from year to year. Over the period 1970 to 2000, the percentage of the American population in poverty each year has been remarkably stable, at 12% or so. Income figures for each year were taken from the March Current Population Survey of that year; the cutoff for poverty was based on official government definitions. To what extent do these data support the theory of the permanent underclass? Discuss briefly. (Hint: the study draws conclusions about the effects of year; this is similar to the effects of age in the example discussed in class.)

Solution.
The data are cross-sectional, not longitudinal, so they provide only weak support for the theory. (Longitudinal data show that most spells of poverty are short.)

7. The following list of test scores has an average of 50 and an SD of 10:
39 41 47 58 65 37 37 49 56 59 62 36 48 52 64 29 44 47 49 52 53 54 72 50 50
(a) Use the normal approximation to estimate the number of scores within 1.25 SDs of the average.
(b) How many scores really were within 1.25 SDs of the average?

Solution.
(a) In standard units, the range is between -1.25 and 1.25. From the normal table, the area under the normal curve over this range is about 79%. The list has 25 entries, so the approximation is about 25 × 79% ≈ 20.
(b) Within 1.25 SDs means between 37.5 and 62.5. Counting the numbers in that range, we see there are 18 scores.
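The comparison in Problem 7 can be reproduced in a few lines. This is a minimal sketch, not the method used in class: instead of reading a normal table, it assumes the area under the standard normal curve between -z and z can be computed as erf(z/√2), and the function name `normal_area_within` is hypothetical.

```python
from math import erf, sqrt

scores = [39, 41, 47, 58, 65, 37, 37, 49, 56, 59, 62, 36, 48,
          52, 64, 29, 44, 47, 49, 52, 53, 54, 72, 50, 50]
average, sd = 50, 10  # as stated in the problem


def normal_area_within(z):
    """Area under the standard normal curve between -z and z."""
    return erf(z / sqrt(2))


z = 1.25

# (a) Normal approximation: fraction of the curve times the number of scores.
estimated = len(scores) * normal_area_within(z)

# (b) Actual count of scores between average - z*SD and average + z*SD.
low, high = average - z * sd, average + z * sd
actual = sum(low <= s <= high for s in scores)

print(f"Area within ±{z} SDs: {normal_area_within(z):.1%}")
print(f"Estimated count: {estimated:.1f}")
print(f"Actual count in [{low}, {high}]: {actual}")
```

The estimate and the true count differ by about two scores, which is unsurprising for a list of only 25 entries.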

8. The table below shows the distribution of adults by the last digit of their age, as reported in the Census of 1880 and the Census of 1970. You might expect each of the ten possible digits to turn up for 10% of the people, but this is not the case. For example, in 1880, 16.8% of all persons reported an age ending in 0, like 30 or 40 or 50. In 1970, this percentage was only 10.6%.
(a) Draw histograms for these two distributions. (Note: you may use the convention for selecting the class intervals when the variable is discrete.)
(b) In 1880, there was a strong preference for the digits 0 and 5. How can this be explained? (Hint: in the old days, did people know their ages accurately?)
(c) In 1970, the preference was much weaker. How can this be explained?

Digit   1880 (%)   1970 (%)
0       16.8       10.6
1        6.7        9.9
2        9.4       10.0
3        8.6        9.6
4        8.8        9.8
5       13.4       10.0
6        9.4        9.9
7        8.5       10.2
8       10.2       10.0
9        8.2       10.1
Source: United States Census.

Solution.
(a) See the graphs at the end. (2 points: each histogram is worth 1 point.)
(b) In 1880, people did not know their ages at all accurately, and rounded off.
(c) In 1970, people knew when they were born.

9. In a survey carried out at the University of California, Berkeley, a sample of students were interviewed and asked what their grade-point average was. A sketched histogram of the results is shown on the next page. (GPA ranges from 0 to 4, and 2 is a bare pass.)
(a) True or false: more students reported a GPA in the range 2.0 to 2.1 than in the range 1.5 to 1.6.
(b) True or false: more students reported a GPA in the range 2.0 to 2.1 than in the range 2.5 to 2.6.
(c) What accounts for the spike at 2? (Hint: recall the educational-level example we discussed in class, and think of the property of the peaks.)

Solution.
(a) True.
(b) True.
(c) People with failing GPAs may round them up; and 2 is such an important number for GPAs that people with GPAs just above 2 may round them down (1 point).

10. The histogram on the next page represents the birth weight of babies in some hospital. Suppose we know 30 babies weighed over 4.5 kg, and babies weighing under 2 kg are taken to the Special Care Baby Unit. (Notice that the vertical scale is not percentage per kg; that is, the vertical scale is not the density scale.)
(a) What is the total number of babies weighed? (Hint: in this case, the area of each block represents the frequency of the corresponding class interval, i.e. the number of babies with weight in that interval. To compute the total number of babies, find the total area of the blocks. Note that we already know the area of one of the blocks. By comparing the scale of the blocks, you can figure out the areas of the rest.)
(b) How many babies are taken to the Special Care Baby Unit?

Solution.
(a) Counting the squares with edge length 0.5, there are 6 such squares in the block for babies who weighed over 4.5 kg, while there are 30 such squares in total. So the percentage of babies who weighed over 4.5 kg is 6/30 = 20%. Since we know that 30 babies weighed over 4.5 kg, the total number of babies is 30/20% = 150.
(b) Again counting the squares with edge length 0.5, there are 4 such squares in the block for babies who weighed under 2 kg. So the number of babies who weighed under 2 kg (and were thus taken to the Special Care Baby Unit) is 150 × 4/30 = 20.

The total points are 25.
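The square-counting argument in Problem 10 amounts to working out how many babies one grid square represents. The sketch below restates that arithmetic; the variable names are made up for illustration, and the square counts are taken from the solution above rather than read from the histogram itself.

```python
# Square counts from the solution (squares with edge length 0.5).
squares_over_4_5_kg = 6
squares_under_2_kg = 4
squares_total = 30

# Known frequency: 30 babies weighed over 4.5 kg.
babies_over_4_5_kg = 30

# Area is proportional to frequency, so each square stands for
# the same number of babies.
babies_per_square = babies_over_4_5_kg / squares_over_4_5_kg

total_babies = babies_per_square * squares_total
special_care_babies = babies_per_square * squares_under_2_kg

print(f"Babies per square: {babies_per_square:g}")
print(f"(a) Total babies: {total_babies:g}")
print(f"(b) Babies under 2 kg: {special_care_babies:g}")
```

This route (babies per square) gives the same answers as the percentage route in the solution: 30/20% = 150 and 150 × 4/30 = 20.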

The histograms for problem 8 part (a): You can either draw the two histograms together, or draw them separately. You don't need to mark the units on the vertical scale, and you don't need to indicate the heights either. All you need to do for the vertical axis is to sketch the heights just like the graph above.
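As a rough stand-in for a hand-drawn sketch, the two distributions from Problem 8 can be rendered as text bars. This is only an illustration under stated assumptions: the function `text_histogram` and its `chars_per_percent` parameter are invented here, each digit is treated as its own class interval, and bar lengths are rounded to whole characters.

```python
# Percentages by last digit of reported age, from the table in Problem 8.
pct_1880 = [16.8, 6.7, 9.4, 8.6, 8.8, 13.4, 9.4, 8.5, 10.2, 8.2]
pct_1970 = [10.6, 9.9, 10.0, 9.6, 9.8, 10.0, 9.9, 10.2, 10.0, 10.1]


def text_histogram(title, percentages, chars_per_percent=2):
    """Build a horizontal bar chart with one row per last digit (0 to 9)."""
    lines = [title]
    for digit, pct in enumerate(percentages):
        bar = "#" * round(pct * chars_per_percent)
        lines.append(f"{digit} | {bar} {pct:.1f}%")
    return "\n".join(lines)


print(text_histogram("Census of 1880", pct_1880))
print()
print(text_histogram("Census of 1970", pct_1970))
```

In the 1880 chart the bars for digits 0 and 5 stand out clearly, while the 1970 bars are nearly level at about 10%.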