College of the Canyons Math 140 Exam 1 Amy Morrow. Name:


Name:

Answer the following questions NEATLY. Show all necessary work directly on the exam. Scratch paper will be discarded unread. One point each part unless otherwise marked.

1. Owners of an exercise gym believe that a Normal model is useful in projecting the number of clients who will exercise in their gym each week. They use a mean of 800 clients and a standard deviation of 90 clients.
(a) Draw and clearly label this model.
(b) What is the first quartile of the weekly number of clients? Answer: 739.

2. Dogs
(a) The SPCA collects the following data about the dogs they house. Which is categorical? (Choose the best answer)
A. age
B. number of days housed
C. veterinary costs
D. breed
E. weight
(b) Which of the variables in the previous question, collected for only German Shepherds, is most likely to be described by a Normal model? (Choose the best answer)
A. veterinary costs
B. weight
C. age
D. number of days housed
E. breed
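The answer to question 1(b) can be checked numerically. What follows is a minimal sketch, assuming Python 3.8 or later for the standard library's `statistics.NormalDist`; the variable names are illustrative and not part of the exam.

```python
from statistics import NormalDist

# Normal model from question 1: mean 800 clients, standard deviation 90 clients.
weekly_clients = NormalDist(mu=800, sigma=90)

# The first quartile is the value with 25% of the model below it.
first_quartile = weekly_clients.inv_cdf(0.25)

print(round(first_quartile))  # 739
```

The same value comes from a z-table: the 25th percentile of a Normal model sits about 0.67 standard deviations below the mean, and 800 - 0.67(90) is roughly 739.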

3. A statistics teacher kept track of the number of emails each student sent over the course of one term. Answer the following questions based on the histogram.
(a) Which is largest? (Choose the best answer)
A. Mean
B. Median
C. Location of the Mode
(b) Which is smallest? Circle one. (Choose the best answer)
A. Location of the Mode
B. Median
C. Mean
(c) Which measure of center is best for the data? Circle one. (Choose the best answer)
A. Median because it is closer to the higher numbers of emails
B. Mean because it is closer to the higher numbers of emails
C. Median because it is closer to the lower numbers of emails
D. Mean because it is closer to the lower numbers of emails
(d) Which measure of spread is best for the data? Circle one. (Choose the best answer)
A. Range because the histogram is unimodal
B. IQR because the data is skewed
C. Standard deviation because the data potentially has outliers
(e) Based on the histogram, create a sketch (by hand) of the corresponding box plot.
Key: Stretches from 0 to 8, left whisker shorter than right, median line in the box closer to the low side.
(f) Which is true of the data shown in the histogram? Circle one of A-E below.
I. The mean and median are approximately equal.
II. The data is symmetric.
III. The median and IQR summarize this data better than the mean and standard deviation.
(Choose the best answer)
A. I only
B. III only
C. I, II, and III
D. I and III
E. I and II

4. Suppose that a Normal model describes the acidity (pH) of rainwater, and that water tested after last week's storm had a z-score of 1.8. This means that the acidity of that rain (Choose the best answer)
A. had a pH 1.8 higher than that of average rainwater
B. had a pH 1.8 times that of average rainwater
C. had a pH of 1.8
D. had a pH 1.8 standard deviations higher than that of average rainwater
E. varied with a standard deviation of 1
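To make the z-score in question 4 concrete, here is a small sketch of the conversion between a raw value and its z-score. The exam gives no mean or standard deviation for rainwater pH, so the two constants below are hypothetical values chosen only for illustration, and the function names are likewise not from the exam.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


def value_from_z(z, mean, sd):
    """Raw value that corresponds to a given z-score."""
    return mean + z * sd


# HYPOTHETICAL model parameters, not given in the exam.
HYPOTHETICAL_MEAN_PH = 5.6
HYPOTHETICAL_SD_PH = 0.5

# A z-score of 1.8 places the storm water 1.8 standard deviations above the mean.
storm_ph = value_from_z(1.8, HYPOTHETICAL_MEAN_PH, HYPOTHETICAL_SD_PH)

print(f"storm pH under these assumed parameters: {storm_ph:.1f}")
print(f"recovered z-score: {z_score(storm_ph, HYPOTHETICAL_MEAN_PH, HYPOTHETICAL_SD_PH):.1f}")
```

Whatever parameters are used, the z-score itself only reports distance from the mean in units of standard deviations.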

5. 192 students in an Intro Stats course were asked to describe their politics as Liberal, Moderate, or Conservative. Here are the results:

          L    M    C
Female   35   36    6
Male     50   44   21

(a) What percent of conservative students are female? 6/27 = 0.222, or 22.2%
(b) What percent of female students are conservative? 6/77 = 0.078, or 7.8%
(c) What percent of the class is female? 77/192 = 0.401, or 40.1%
(d) What percent of all students in the class are females who consider themselves conservative? 6/192 = 0.031, or 3.1%
(e) Below is a graph of the data above, using percents. Based on this graph, can we conclude that the variables are dependent? Why or why not? Yes. The distribution of gender is NOT the same for all political orientations.

6. A taxi company monitoring the safety of its cabs kept track of the number of miles tires had been driven (in thousands) and the depth of the tread remaining (in mm). Their data are displayed in the scatterplot. They found the equation of the least squares regression line to be $\text{tread} = 36 - 0.6\,\text{miles}$, with $r^2 = 0.74$.
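The fitted line in question 6 can be written as a small function, which makes the prediction, the sign of the correlation, and the meaning of a residual easy to check. This is a sketch under one stated assumption: the slope is taken to be -0.6 (the minus sign does not survive in the transcription, but the key's own statement that the slope is negative and its 12 mm prediction at 40 thousand miles both require it). The function names and the observed tread depth of 10 mm are hypothetical.

```python
import math

INTERCEPT_MM = 36.0              # predicted tread depth at 0 miles
SLOPE_MM_PER_1000_MILES = -0.6   # assumed sign: tread decreases as miles increase
R_SQUARED = 0.74


def predicted_tread_mm(miles_thousands):
    """Tread depth predicted by the least squares line."""
    return INTERCEPT_MM + SLOPE_MM_PER_1000_MILES * miles_thousands


def residual_mm(observed_tread_mm, miles_thousands):
    """Residual = observed value minus predicted value."""
    return observed_tread_mm - predicted_tread_mm(miles_thousands)


# The correlation is the square root of r^2, carrying the sign of the slope.
r = math.copysign(math.sqrt(R_SQUARED), SLOPE_MM_PER_1000_MILES)

# HYPOTHETICAL observation: a taxi at 40 thousand miles with 10 mm of tread left.
HYPOTHETICAL_OBSERVED_MM = 10.0

print(f"predicted tread at 40 thousand miles: {predicted_tread_mm(40):.1f} mm")
print(f"correlation r: {r:.2f}")
print(f"residual for the hypothetical taxi: {residual_mm(HYPOTHETICAL_OBSERVED_MM, 40):.1f} mm")
```

A negative residual here means the observed tread depth falls below the line, that is, the tire has less tread than the model predicts for its mileage.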

(a) Draw the regression line on the graph.
(b) What is the explanatory variable? Miles
(c) Explain (in context) what the y-intercept of the line means. For a taxi that has not been driven any miles, the starting tread depth is 36 mm.
(d) Explain (in context) what the slope of the line means. For every extra thousand miles a taxi is driven, the tire tread depth decreases by 0.6 mm.
(e) Explain (in context) what $R^2$ means. 74% of the variation in tread depth is explained by the linear relationship with miles driven.
(f) The correlation: $|r| = \sqrt{r^2} = \sqrt{0.74} \approx 0.86$. Since the slope is negative, we have $r = -0.86$.
(g) Based on our regression, can we conclude that driving more miles will lead to a decrease in tire tread depth? Explain. No. Despite the fact that common sense might suggest this, we are not able to draw causal conclusions from a regression analysis.
(h) What is the best predicted tread depth for a car driven 40 thousand miles? $\text{tread} = 36 - 0.6(40) = 12$ mm
(i) In this context, what does a negative residual mean?

A taxi has a smaller tread depth for the miles driven than was predicted by the model.

7. The boxplots show prices of used cars (in thousands of dollars) advertised for sale at three different car dealers. For each question below, choose one from Ace, BuyIt, or Carz.
(a) Which dealer has the lowest median price? BuyIt
(b) Which dealer has the smallest price range? Ace
(c) Which dealer's prices have the smallest IQR, and what is it? (2 points) Carz, 4 or 5 thousand
(d) Which dealer offers the cheapest car? Carz
(e) Which dealer generally sells cars the cheapest? BuyIt

8. Light bulbs are measured in lumens (light output), watts (energy used), and hours (life). A standard white light bulb has a mean life of 675 hours and a standard deviation of 50 hours. A soft white light bulb has a mean life of 700 hours and a standard deviation of 35 hours. At a local science competition, both light bulbs lasted 750 hours. Which light bulb's life span was better? (Choose the best answer)

A. Relative to each other, the light bulbs performed the same.
B. Relatively, the soft white light bulb performed better.
C. There is no basis for comparison, since they are different kinds of light bulbs.
D. There is not enough information for comparison.
E. Relatively, the standard white light bulb performed better.

9. Data Analysis. Data are from the US Department of Health and Human Services, National Center for Health Statistics, Third National Health and Nutrition Examination Survey and include health data for 80 people. The variables are: GENDER is the person's gender, AGE is in years, HT is height in inches, WT is weight in pounds, WAIST is circumference in cm, PULSE is pulse rate in beats per minute, SYS is systolic blood pressure in mmHg, DIAS is diastolic blood pressure in mmHg, CHOL is cholesterol in mg, BMI is body mass index, LEG is upper leg length in cm, ELBOW is elbow breadth in cm, WRIST is wrist breadth in cm, ARM is arm circumference in cm.
The data and this prompt can be found at http://www.canyons.edu/faculty/morrowa/140/mozart/.
Use Word to answer the following questions. Print your solutions when you are ready. Your final write-ups should include ONLY the graphs/statistics that are relevant.
Suggested Discussion Points:
Describe the distribution (shape, center, spread, other features).
If it is appropriate, fit an appropriate model (Normal model, linear model).
Provide evidence (appropriate graphs and statistics) for all of your findings.
3 points each
(a) Analyze/summarize the weight of individuals in the study. Variable: WT.
A correct solution will include:
Graph: Boxplot/histogram.

Description of Shape: As shown in the histogram and boxplot, the distribution of weights is approximately symmetric, although it could be said there is a slight skew to the right. The distribution is unimodal, with a mode around 160 pounds. The histogram does not show any gaps, but the boxplot shows an outlier on the high end.
Center: The mean weight is 159.39 pounds. The median is 161 pounds. Because the data is roughly symmetric, the center is best measured with the mean.
Spread: The standard deviation is 34.88 pounds, and the IQR is 44.75 pounds. Because the data is roughly symmetric, the spread is best measured by the standard deviation.
Normal?: The Normal model is appropriate, and this can be verified by the probability plot shown below.
(b) Analyze/summarize the gender of individuals in the study. Variable: GENDER.

GENDER   Percent
female   50.00
male     50.00

A correct solution will include:
Graph: Bar chart (relative or frequency is okay)
Table: Relative or frequency
Comment on "Most": There were equal numbers of men and women in the study.
(c) Analyze/summarize the relationship between gender and weight of individuals in the study. Variables: WT, GENDER.

Variable  GENDER   Mean    StDev   Median  IQR
WT        female   146.22  37.62   135.80  47.17
          male     172.55  26.33   169.95  38.60

A correct solution will include:
Graph: Side-by-side boxplots and histograms of weight (separated by gender).
Comparison of the Shape: Weights of females are skewed to the right, whereas weights of males are more symmetric. Weights of males and females are both unimodal. The males do not show outliers in weights, but the females show 2 on the high end.
Comparison of Center: Males have a higher mean and median weight, and so generally weigh more.
Comparison of Spread: Females, however, have larger spread in weight as measured by both the standard deviation and IQR.
(d) Analyze/summarize the relationship between weight and waist size of individuals in the study. Does weight have an influence on waist size? Variables: WT, WAIST.
A correct solution will include:
Graph: Scatterplot with waist on the y-axis and weight on the x-axis.
Description of scatterplot: There appears to be a strong positive linear pattern.
Correlation: r = 0.908, which is fairly high.
Linear Regression: $\text{WAIST} = 33.2 + 0.345\,\text{WT}$
Analyze Fit: The residual plot shows no significant pattern, indicating that the pattern in the original scatterplot is linear. $R^2 = 82.5\%$, which is fairly high, indicating that the linear relationship is strong.
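The regression summary in part (d) can be cross-checked with a few lines of arithmetic. The sketch below uses only the coefficients and R-squared reported in the key; the function name and the example weight of 160 pounds are illustrative assumptions, not values taken from the data set.

```python
import math

INTERCEPT_CM = 33.2           # reported intercept of the WAIST on WT line
SLOPE_CM_PER_POUND = 0.345    # reported slope
R_SQUARED = 0.825             # reported R-squared (82.5%)


def predicted_waist_cm(weight_pounds):
    """Waist circumference (cm) predicted from weight (pounds) by the reported line."""
    return INTERCEPT_CM + SLOPE_CM_PER_POUND * weight_pounds


# The slope is positive, so r is the positive square root of R-squared.
r = math.sqrt(R_SQUARED)

# ILLUSTRATIVE input: 160 pounds, close to the mode described in part (a).
EXAMPLE_WEIGHT_POUNDS = 160

print(f"r recovered from R-squared: {r:.3f}")
print(f"predicted waist at {EXAMPLE_WEIGHT_POUNDS} lb: {predicted_waist_cm(EXAMPLE_WEIGHT_POUNDS):.1f} cm")
```

The recovered correlation agrees with the reported r = 0.908, which confirms that the correlation and R-squared in the key describe the same strong, positive linear relationship.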