Introduction to Statistics for the Life Sciences Fall 2014
Volunteer
Definition: A biased sample systematically overestimates or underestimates a characteristic of the population. Examples: paid subjects for drug testing; brood size sampling*; length of fish (fish net). Other ideas?
Types of Bias: Nonresponse bias is bias caused by a lack of response (typically to a survey). Example 1.3.11: 949 men were asked to submit to an HIV test. 782 agreed, and the HIV rate among them was 1.02%. Medical records were used to establish the HIV prevalence among the nonresponse group; it was 5.4%.
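To see how much the nonresponse group matters in Example 1.3.11, the two quoted rates can be combined into a rough overall rate. This is only a sketch: it assumes the rounded percentages on the slide can be used as-is and weights each group by its size; the variable names are illustrative and not from the textbook.

```python
# Figures quoted in Example 1.3.11.
n_asked = 949
n_agreed = 782
rate_responders = 0.0102     # HIV rate among the men who agreed to be tested
rate_nonresponders = 0.054   # HIV rate among the nonresponse group (from medical records)

# Men who did not agree to the test.
n_nonresponders = n_asked - n_agreed

# Assumption: weight each group's (rounded) rate by the size of the group.
overall_rate = (
    n_agreed * rate_responders + n_nonresponders * rate_nonresponders
) / n_asked

print(f"Nonresponders: {n_nonresponders}")
print(f"Rate among responders only: {rate_responders:.2%}")
print(f"Estimated rate among all men asked: {overall_rate:.2%}")
```

Under these assumptions the rate among all 949 men comes out noticeably higher than the 1.02% seen among responders alone, which is the sense in which the responder-only sample underestimates the population characteristic.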
Chapter 2: Description of Samples and Populations 2.1 Introduction 2.2
Definition (Variable): A variable is a characteristic of a specimen that can be assigned a number or a category. Examples: genre of books sampled from a library; gender of sampled students; shade temperature at noon.
Definition (Categorical Variable): If a variable takes values in a finite list of labels, it is called a categorical variable. Examples: blood type (A, B, etc.); major of students; species of tree in a forest.
Definition (Ordinal): If the labels of a categorical variable can be ranked in a meaningful way, the variable is an ordinal categorical variable. Examples: survey responses such as Strongly Disagree, Disagree, Slightly Disagree, ...; letter grades such as A, AB, B. Note: All ordinal categorical variables are categorical variables, but the reverse is not true.
Definition (Numeric Variable): If a variable takes values which are numbers, it is called a numeric variable. Examples: the height of people; the speed of passing cars; the price of a stock.
Continuous Versus Discrete: There are two types of numeric variables. Continuous: the values can be arbitrarily close together (for example, the temperature). Discrete: the values can only get so close together (for example, the number of pencils in your backpack).
Classify the following: the height of several patients; the height of several patients measured to two decimal places; the nation's terror alert.
Observational Units: Often, we will take various measurements for each item in a sample of size n. For example, if a sample of n dolphins is caught and released, the researchers may measure gender, length, and (estimated) age. In this case, the dolphin is referred to as the observational unit.
Definition (Frequency Distribution): A table that counts the number of occurrences (or frequency) of each value of a variable. Note: the textbook defines this as the table or the graph. For our purposes, it is just the table.
Example: Frequency table of cell phone types of a sample of UW-Madison students.

Cell Phone Type    Frequency
Old Flip Phone     36
iPhone             59
Android            37
No Cell Phone      12

Note: a row at the bottom giving the total is often useful, but not strictly necessary.
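A frequency table like this one can be tallied from raw labels in a few lines. The sketch below is illustrative only: the raw survey responses are not given on the slide, so a hypothetical list of labels is rebuilt from the published counts, and the optional total row is appended at the end.

```python
from collections import Counter

# Counts as shown on the slide.
counts_on_slide = {
    "Old Flip Phone": 36,
    "iPhone": 59,
    "Android": 37,
    "No Cell Phone": 12,
}

# Hypothetical raw sample: one label per student, rebuilt from the counts
# above purely for illustration (the original responses are not available).
sample = [label for label, k in counts_on_slide.items() for _ in range(k)]

# Tally how often each value of the variable occurs.
frequency = Counter(sample)
total = sum(frequency.values())

for label, count in frequency.items():
    print(f"{label:<15}{count:>5}")
print(f"{'Total':<15}{total:>5}")  # optional total row
```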
Bar Chart
[Figure: Bar Chart of Cell Phone Type. Vertical axis: Frequency. Categories: Old Flip Phone, iPhone, Android, No Cell Phone.]
This is always a bad idea.
[Figure: alternative chart of the cell phone data, labeled Old Flip Phone, iPhone, No Cell Phone, Android.]
Dotplot: Dotplots are constructed by drawing a number line and placing the data points on the number line. Duplicate values can be stacked. Imagine measuring the height (cm) of 20 plants after 3 weeks of growth.
[Figure: Plot of Height of Plants. Horizontal axis: Height (cm).]
Dotplot: What kinds of variables can we make dotplots for?
[Figure: Plot of Height of Plants. Horizontal axis: Height (cm).]
Relative Frequency: Bar charts can also track relative frequency, which is the number of observations in each category divided by the total number of observations.
[Figure: Bar Chart of Cell Phone Type. Vertical axis: Relative Frequency. Categories: Old Flip Phone, iPhone, Android, No Cell Phone.]
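The relative frequencies for the cell phone example can be worked out directly from the counts in the frequency table. This is a minimal sketch; the dictionary name and the output formatting are illustrative choices.

```python
# Counts from the cell phone frequency table.
frequency = {
    "Old Flip Phone": 36,
    "iPhone": 59,
    "Android": 37,
    "No Cell Phone": 12,
}

total = sum(frequency.values())

# Relative frequency = count in the category / total number of observations.
relative_frequency = {label: count / total for label, count in frequency.items()}

for label, rf in relative_frequency.items():
    print(f"{label:<15}{rf:.3f}")
```

The relative frequencies always sum to 1, which gives a quick check when computing them by hand.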
Histogram: A histogram is a bar chart for a numeric variable. The height (or sometimes area) of a bar over a range of values indicates the frequency or relative frequency of observations falling in that range.
[Figure: Histogram of Height of Plants. Vertical axis: Frequency. Horizontal axis: Height (cm).]
Histogram: How interval inclusion works. The number of plants with height in the interval [3,4) is represented by the height of the bar between 3 and 4. If a plant of height 4 cm were added, which bar would increase in height?
[Figure: Histogram of Height of Plants. Vertical axis: Frequency. Horizontal axis: Height (cm).]
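The half-open convention [a, b) can be checked with a small helper that reports which interval a given height falls into. This is a sketch under stated assumptions: the function name `bin_for` is hypothetical, and the bin range 3 to 8 with width 1 is taken from the histogram's axis.

```python
import math


def bin_for(height, low=3, high=8, width=1):
    """Return the half-open interval [a, b) containing height.

    Returns None if height lies outside [low, high).
    """
    if not (low <= height < high):
        return None
    # Left endpoints are included, right endpoints are excluded.
    a = low + width * math.floor((height - low) / width)
    return (a, a + width)


print(bin_for(3.99))  # (3, 4)
print(bin_for(4))     # (4, 5)
```

With left-closed intervals, a value sitting exactly on a boundary is counted in the bar to its right.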
Relative Frequency Histogram
[Figure: Histogram of Height of Plants. Vertical axis: Density. Horizontal axis: Height (cm).]
Grouped Frequency Distribution

Plant Height (cm)    Frequency
[3, 4)               1
[4, 5)               1
[5, 6)               6
[6, 7)               8
[7, 8)               4
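The grouped table can be turned into the bar heights of a relative frequency histogram. The sketch below assumes the usual convention that density is relative frequency divided by bin width, so that the bar areas sum to 1; with bins of width 1 the two coincide. Variable names are illustrative.

```python
# Grouped frequency distribution of plant heights: ((left, right), count),
# where each interval is half-open, [left, right).
grouped = [((3, 4), 1), ((4, 5), 1), ((5, 6), 6), ((6, 7), 8), ((7, 8), 4)]

n = sum(count for _, count in grouped)

rows = []
for (a, b), count in grouped:
    rel = count / n          # relative frequency of the interval
    density = rel / (b - a)  # assumed convention: bar area equals relative frequency
    rows.append(((a, b), count, rel, density))
    print(f"[{a}, {b})  count={count:>2}  rel. freq.={rel:.2f}  density={density:.2f}")

print(f"Total observations: {n}")
```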
Requirement You need to be able to make these plots by hand.