Biology 300 Homework assignment #1 Solutions




Assignment:
Chapter 1, Problems 6, 15
Chapter 2, Problems 6, 8, 9, 12
Chapter 3, Problems 4, 6, 15
Chapter 4, Problem 16

Each answer follows its question, marked "Answer:".

Chapter 1

6. Identify whether the following variables are numerical or categorical. If numerical, state whether the variable is discrete or continuous. If categorical, state whether the variable is nominal or ordinal.

a) Number of sexual partners in a year by college students
Answer: numerical, discrete
b) Petal area of rose flowers
Answer: numerical, continuous
c) Key on the musical scale
Answer: categorical, nominal (not ordinal because there is no set starting point for any ordering)
d) Heart beats per minute of Tour de France cyclists
Answer: numerical, discrete
e) Stage of fruit ripeness (under ripe, ripe, over ripe)
Answer: categorical, ordinal
f) Angle of flower orientation relative to position of sun
Answer: numerical, continuous
g) Letter grade on high school report card
Answer: categorical, ordinal
h) Tree species
Answer: categorical, nominal
i) Year of birth
Answer: categorical, ordinal (not numerical because there is no zero point for the scale)
j) Sex
Answer: categorical, nominal
k) Birth weight
Answer: numerical, continuous

15. The average age of piñon juniper trees in the coastal range of California was investigated by placing a 10-hectare plot randomly on a distribution map of the tree in California using a computer. Researchers then recorded the location of the random plot, found it in the field, and flagged it using compass and measuring tape. They then proceeded to measure the age of every piñon juniper tree within the 10-hectare plot. The average age within the plot was used to estimate the average age of the whole California population.

a) What is the population of interest in this study?
Answer: The whole California population of juniper trees.

b) Were the trees sampled randomly from this population? Why or why not?
Answer: No. All of the trees selected were very close together, so the sampling of individual trees cannot be considered independent: trees are likely to be selected if their neighbors were selected.

Chapter 2

6. The data below are the occurrences of different taxa in the list of endangered and threatened species under the US Endangered Species Act by 2002, separately for each type of organism (U.S. Fish and Wildlife Service 2001). The taxa are listed in no particular order in the table below.

Taxon          Number of species
Birds          92
Clams          70
Reptiles       36
Fish           115
Crustaceans    21
Mammals        74
Snails         32
Plants         745
Amphibians     22
Insects        44
Arachnids      12

a) Rewrite the table, but list the taxa in a more revealing order. Explain your reasons behind the ordering you chose.
Answer:

Taxon          Number of species
Plants         745
Fish           115
Birds          92
Mammals        74
Clams          70
Insects        44
Reptiles       36
Snails         32
Amphibians     22
Crustaceans    21
Arachnids      12

I chose this ordering because the relative ranking of species numbers can then be seen easily.

b) What kind of table is this?
Answer: A frequency table.

c) Choosing the most appropriate graphical method, display the number of species listed in each taxon of organism in 2002. What kind of graph did you choose? Why?
Answer:
[Figure: "Histogram of species counts"; y-axis: Number of species (0 to 700); x-axis: Taxon, in the order Plants, Fish, Birds, Mammals, Clams, Insects, Reptiles, Snails, Amphibians, Crustaceans, Arachnids]
I created a histogram, because that's a good way to show a frequency distribution.

d) Should the baseline for number of species in your graph in (c) be 0 or 21, the smallest number in the data set for 2002? Why?
Answer: Zero, so that the relative areas of the bars are proportional to the species counts. This is the most accurate way to display this information visually.

8. Can environmental factors influence the incidence of schizophrenia? A recent project measured incidence of the disease among children born in a region of eastern China. 192 of 13,748 babies born in the midst of a severe famine in the region in 1960 later developed schizophrenia. This compared with 483 schizophrenics out of 59,088 births in 1956, before the famine, and 695 out of 83,536 births in 1965, after the famine (St Claire et al. 2005).

a) What two variables are compared in this example?
Answer: Year and incidence of schizophrenia.
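As a quick numerical check on Problem 8, the counts quoted in the problem statement can be converted into non-cases and per-year proportions. This is a minimal Python sketch; the variable names are illustrative and not part of the assignment.

```python
# Counts quoted in the Problem 8 statement:
# birth year -> (schizophrenia cases, total births).
births = {
    1956: (483, 59_088),   # before the famine
    1960: (192, 13_748),   # during the famine
    1965: (695, 83_536),   # after the famine
}

# Non-cases are total births minus cases; they fill the second row
# of a 2 x 3 contingency table.
non_cases = {year: total - cases for year, (cases, total) in births.items()}

# Relative frequency (proportion) of births that later developed schizophrenia.
proportions = {year: cases / total for year, (cases, total) in births.items()}

for year in births:
    print(year, non_cases[year], round(proportions[year], 4))
```

The proportions are the values that a line graph of incidence against birth year would plot.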

b) Are the variables numerical or categorical? If numerical, are they continuous or discrete; if categorical, are they nominal or ordinal?
Answer: Both are categorical; year is ordinal and schizophrenia is nominal.

c) Effectively display the findings in a table. What kind of table did you use?
Answer:

                     Year
                     1956      1960      1965      Row totals
Not schizophrenic    58,605    13,556    82,841    155,002
Schizophrenic        483       192       695       1,370
Column totals        59,088    13,748    83,536    156,372

This is a contingency table.

d) Calculate the relative frequency (proportion) of children born in each of the three years that later developed schizophrenia. Plot these proportions in a line graph. What pattern is revealed?
Answer:
[Figure: line graph "Proportion of Schizophrenia"; y-axis: Proportion (0.000 to 0.014); x-axis: Year (1956, 1960, 1965)]
There is a higher proportion of schizophrenia in the famine year.

9. Swordfish have a unique "heater organ" that maintains elevated eye and brain temperatures when hunting in deep cold water, but its function is unclear. The graph below illustrates the results of a study by Fritsches et al. (2005) that measured how the ability of swordfish retinas to detect rapid motion, measured by the flicker fusion frequency, changes with eye temperature.

a) What types of variables are displayed?
Answer: Both are numerical and continuous.

b) What type of graph is this?
Answer: A scatter plot.

c) Describe the association between the two variables. Is the relationship between flicker fusion frequency and temperature positive or negative? Is the relationship linear or nonlinear?
Answer: There is a positive nonlinear relationship.

d) The 20 points in the graph were obtained from measurements of 6 swordfish. Can we treat the 20 measurements as a random sample? Why or why not?
Answer: No, because more than one measurement was taken for at least some of the fish. Repeated measurements taken from the same individual are not independent - these data are pseudoreplicated.

12. The following graph shows the population growth rates of 204 countries recognized by the United Nations. Growth rate is measured as the average annual percent change in the total human population between 2000 and 2004 (United Nations Statistics Division 2004).

a) Identify the type of graph depicted.
Answer: A cumulative frequency distribution.

b) Explain the quantity along the Y-axis.
Answer: Y is cumulative relative frequency. It is the proportion of the sample that is less than a given X value. In this case, the Y-axis depicts the proportion of countries with a growth rate less than X.

c) Approximately what percentage of countries had a negative change in population?
Answer: Approximately 10%. (Read the value off the Y-axis above 0 on the X-axis.)

d) Identify by eye the 0.10, 0.50, and 0.90 quantiles of change in population size.
Answer: The 0.10 quantile is approximately 0. The 0.50 quantile is approximately 1.3. The 0.90 quantile is approximately 3. (You get these by finding 0.1, 0.5, or 0.9 on the Y-axis and finding the X value that corresponds.)

e) Identify by eye the 60th percentile of change in population size.
Answer: The 60th percentile is approximately 1.9.

Chapter 3

4. Birds of the Caribbean islands of the Lesser Antilles are descended from immigrants originating from larger islands and the nearby mainland. Here are the approximate dates of immigration, in millions of years, of each of 37 bird species now present on the Lesser Antilles (Ricklefs and Bermingham 2001). The dates were calculated from the difference in mitochondrial DNA sequence between each of the species and its closest living relative on larger islands or the mainland.

0.00, 0.00, 0.04, 0.21, 0.29, 0.54, 0.63, 0.88, 0.96, 1.25, 1.67, 1.75, 1.84, 1.96, 2.01, 2.51, 2.72, 3.30, 3.51, 4.05, 4.85, 6.94, 8.73, 10.57, 11.11, 12.45, 14.00, 17.30, 17.92, 18.05, 18.43, 22.48, 22.48, 23.48, 26.32, 26.45, 28.87

a) Plot the data in a histogram and describe the shape of the frequency distribution.
Answer:
[Figure: "Histogram of bird immigration dates"; y-axis: Frequency (0 to 20); x-axis: Dates of immigration (m.y.) (0 to 30)]
These data are skewed to the right.

b) By viewing the graph alone, approximate the mean and median of the distribution. Which is greater? Explain your reasoning. (Here and forever after, provide units with your answer.)
Answer: Estimated mean: 10 m.y. Estimated median: 4 m.y. The mean will be greater because it is more strongly affected by outliers - that is, very old immigration times.

c) Calculate the mean and median. Was your intuition in (b) correct?
Answer: Mean: 8.67 m.y. Median: 3.51 m.y. Yes, my intuition was correct - the mean is much larger than the median.

d) Calculate the first and third quartiles and the interquartile range.
Answer: I'll use the method outlined in the book:
first quartile: (0.96+1.25)/2 = 1.105
third quartile: (17.30+17.92)/2 = 17.61
Interquartile range: 17.61-1.105 = 16.505

6. The data below are from an ecological study of the entire rainforest community at El Verde in Puerto Rico (Waide and Reagan 1996). Diet breadth is the number of types of prey eaten by the consumer species present. The number of species having each diet breadth is shown in the second column. The total number of consumer species is n = 127.

Diet breadth (No. prey types eaten)    Frequency
1                                      21
2                                      8
3                                      9
4                                      10
5                                      8
6                                      3
7                                      4
8                                      8
9                                      4
10                                     4
11                                     4
12                                     2
13                                     5
14                                     2
15                                     1
16                                     1
17                                     2
18                                     1
19                                     3
20                                     2
More than 20                           25
Total                                  127

a) Calculate the median number of prey types of species in the community.
Answer: The median is 8.

b) What is the interquartile range in the number of prey types? Use the simple method outlined in Section 3.2 to calculate the quartiles.
Answer:
First quartile: consider just the first 63 values, and find their median (the 32nd number). First quartile = 3
Third quartile: similarly, consider the last 63 values, and find their median. Third quartile = 17
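The quartile calculations in Problems 4 and 6 can be checked with a short Python sketch. It assumes, judging from the worked answers, that the book's simple method takes the median of the values below the overall median and the median of the values above it, leaving the middle value out of both halves when n is odd. The function name `simple_quartiles` and the placeholder value 21 for the "More than 20" category are illustrative choices, not part of the assignment.

```python
from statistics import median

def simple_quartiles(values):
    """Return (Q1, median, Q3), taking each quartile as the median of one half of the sorted data."""
    data = sorted(values)
    n = len(data)
    half = n // 2              # size of each half; the middle value is left out when n is odd
    lower = data[:half]
    upper = data[n - half:]
    return median(lower), median(data), median(upper)

# Problem 4: immigration dates (millions of years) for 37 bird species.
dates = [0.00, 0.00, 0.04, 0.21, 0.29, 0.54, 0.63, 0.88, 0.96, 1.25,
         1.67, 1.75, 1.84, 1.96, 2.01, 2.51, 2.72, 3.30, 3.51, 4.05,
         4.85, 6.94, 8.73, 10.57, 11.11, 12.45, 14.00, 17.30, 17.92,
         18.05, 18.43, 22.48, 22.48, 23.48, 26.32, 26.45, 28.87]

# Problem 6: diet breadth -> number of species. The key 21 stands in for
# "More than 20"; this is safe only because neither the median nor the
# quartiles fall in that open-ended category.
diet_counts = {1: 21, 2: 8, 3: 9, 4: 10, 5: 8, 6: 3, 7: 4, 8: 8, 9: 4,
               10: 4, 11: 4, 12: 2, 13: 5, 14: 2, 15: 1, 16: 1, 17: 2,
               18: 1, 19: 3, 20: 2, 21: 25}
# Expand the frequency table into one value per species.
diet = [breadth for breadth, count in diet_counts.items() for _ in range(count)]

q1, med, q3 = simple_quartiles(dates)
print(q1, med, q3, q3 - q1)
print(simple_quartiles(diet))
```

The same placeholder trick would not work for the mean, which depends on the actual values in the open-ended category.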

Interquartile range = 17-3 = 14

c) Can you calculate the mean number of prey types in the diet?
Answer: No, because you don't know how many prey types are represented by the last category, more than 20.

15. The gene for the vasopressin receptor V1a is expressed at higher levels in the forebrain of monogamous vole species than promiscuous vole species. Can expression of this one gene itself influence monogamy? To test this, Lim et al. (2004) experimentally enhanced V1a expression in the forebrain of 11 males of the meadow vole, a solitary promiscuous species. The percent of time each male spent huddling with the female provided to him was recorded. The same measurements were taken in 20 control males left untreated:

Control males: 98, 96, 94, 88, 86, 82, 77, 74, 70, 60, 59, 52, 50, 47, 40, 35, 29, 13, 6, 5
V1a enhanced males: 100, 97, 96, 97, 93, 89, 88, 84, 77, 67, 61

a) Display these data in a graph.
Answer: Two possibilities: a box plot
[Figure: box plot; y-axis: Percent of time huddling (20 to 100); groups: control, treatment]
or stacked histograms:

[Figure: stacked histograms, panels "Control" and "Treatment"; y-axis: Frequency; x-axis: Percent of time huddling (0 to 100)]

b) Which group has the higher mean percent time spent huddling with females?
Answer: The treatment group: mean = 86.3% versus 58.1% for the controls.

c) Which group has the higher standard deviation in percent time spent huddling with females?
Answer: The control group: s = 29.8 versus s = 12.9 for the treatment group.

Chapter 4

16. Examine the times to rigor mortis of 114 corpses displayed in Problem 12 of Chapter 3.

Hours    Number of bodies
1        0
2        2
3        14
4        31
5        14
6        20
7        11
8        7
9        4
10       7
11       1
12       1
13       2
Total    114

a) What is the standard error of the mean time to rigor mortis?
Answer:
First, calculate the mean:
x̄ = (0*1 + 2*2 + 14*3 + 31*4 + ... + 2*13)/114 = 648/114 = 5.68
Next, calculate the standard deviation:
s^2 = [0*(1-5.68)^2 + 2*(2-5.68)^2 + 14*(3-5.68)^2 + 31*(4-5.68)^2 + ... + 2*(13-5.68)^2]/(114-1)
s^2 = 630.63/113 = 5.58
s = sqrt(5.58) = 2.36
standard error = s / sqrt(n) = 2.36 / sqrt(114) = 0.22

b) The standard error calculated in (a) measures the width of what frequency distribution?
Answer: The sampling distribution of the mean.

c) What assumption does your calculation in (a) require?
Answer: That the cadavers used represent a random sample of the population.
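
The summary statistics in Chapter 3, Problem 15 and Chapter 4, Problem 16 can be reproduced with Python's standard library. This is a minimal sketch under the same convention as the worked answers (sample standard deviation with n - 1 in the denominator); the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Chapter 3, Problem 15: percent of time spent huddling.
control = [98, 96, 94, 88, 86, 82, 77, 74, 70, 60,
           59, 52, 50, 47, 40, 35, 29, 13, 6, 5]
enhanced = [100, 97, 96, 97, 93, 89, 88, 84, 77, 67, 61]

for name, group in (("control", control), ("V1a enhanced", enhanced)):
    # stdev() is the sample standard deviation (n - 1 denominator).
    print(name, round(mean(group), 2), round(stdev(group), 2))

# Chapter 4, Problem 16: hours to rigor mortis -> number of bodies.
rigor = {1: 0, 2: 2, 3: 14, 4: 31, 5: 14, 6: 20, 7: 11,
         8: 7, 9: 4, 10: 7, 11: 1, 12: 1, 13: 2}

n = sum(rigor.values())
# Weighted mean: each hour value counted once per body.
xbar = sum(hours * count for hours, count in rigor.items()) / n
# Sample variance from the frequency table.
variance = sum(count * (hours - xbar) ** 2 for hours, count in rigor.items()) / (n - 1)
s = sqrt(variance)
standard_error = s / sqrt(n)
print(n, round(xbar, 2), round(variance, 2), round(s, 2), round(standard_error, 2))
```

Working from the frequency table directly avoids expanding it into 114 individual values, although `statistics.stdev` on the expanded list would give the same standard deviation.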