Day 1 MTH 490.433. Basic Probability Problem Set



Similar documents
WEEK #22: PDFs and CDFs, Measures of Center and Spread

Random Variables. Chapter 2. Random Variables 1

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Elementary Statistics Sample Exam #3

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Simple Regression Theory II 2010 Samuel L. Baker

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Using Excel for inferential statistics

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

Chapter 2: Linear Equations and Inequalities Lecture notes Math 1010

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

7 CONTINUOUS PROBABILITY DISTRIBUTIONS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Probability: Terminology and Examples Class 2, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom

Chapter 3 RANDOM VARIATE GENERATION

9. Sampling Distributions

Chapter 7 Section 7.1: Inference for the Mean of a Population

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Problem sets for BUEC 333 Part 1: Probability and Statistics

List of Examples. Examples 319

SECTION 2.5: FINDING ZEROS OF POLYNOMIAL FUNCTIONS

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Algebra 2: Q1 & Q2 Review

Chi Square Tests. Chapter Introduction

FEGYVERNEKI SÁNDOR, PROBABILITY THEORY AND MATHEmATICAL

Inference for two Population Means

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

AP Physics 1 and 2 Lab Investigations

Independent t- Test (Comparing Two Means)

Confidence Intervals in Public Health

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Survey, Statistics and Psychometrics Core Research Facility University of Nebraska-Lincoln. Log-Rank Test for More Than Two Groups

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

UNIVERSITY of MASSACHUSETTS DARTMOUTH Charlton College of Business Decision and Information Sciences Fall 2010

LOGNORMAL MODEL FOR STOCK PRICES

Permutation Tests for Comparing Two Populations

Lecture Notes Module 1

Math 370/408, Spring 2008 Prof. A.J. Hildebrand. Actuarial Exam Practice Problem Set 5 Solutions

How To Check For Differences In The One Way Anova

PRIORITY DISPATCH SCHEDULING IN AN AUTOMOBILE REPAIR AND MAINTENANCE WORKSHOP

Name: Math 29 Probability. Practice Second Midterm Exam Show all work. You may receive partial credit for partially completed problems.

Statistics 151 Practice Midterm 1 Mike Kowalski

MAT 155. Key Concept. September 27, S5.5_3 Poisson Probability Distributions. Chapter 5 Probability Distributions

Chapter 9. Two-Sample Tests. Effect Sizes and Power Paired t Test Calculation

seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

Activity 1: Using base ten blocks to model operations on decimals

WISE Power Tutorial All Exercises

Review #2. Statistics

How To Run Statistical Tests in Excel

COLLEGE ALGEBRA 10 TH EDITION LIAL HORNSBY SCHNEIDER 1.1-1

Non-Inferiority Tests for One Mean

A Quick Algebra Review

MA 1125 Lecture 14 - Expected Values. Friday, February 28, Objectives: Introduce expected values.

Math 370, Actuarial Problemsolving Spring 2008 A.J. Hildebrand. Practice Test, 1/28/2008 (with solutions)

CALCULATING THE AREA OF A FLOWER BED AND CALCULATING NUMBER OF PLANTS NEEDED

Chapter 4. Probability Distributions

2 GENETIC DATA ANALYSIS

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

Goodness of Fit. Proportional Model. Probability Models & Frequency Data

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

About Compound Interest

Probability Concepts Probability Distributions. Margaret Priest Nokuthaba Sibanda

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

7. Normal Distributions

Testing Research and Statistical Hypotheses

Statistical Functions in Excel

6 Scalar, Stochastic, Discrete Dynamic Systems

Point Biserial Correlation Tests

Statistical Rules of Thumb

Learning Objectives for Section 1.1 Linear Equations and Inequalities

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

Study Guide for the Final Exam

Recall this chart that showed how most of our course would be organized:

Experimental Analysis

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

11. Analysis of Case-control Studies Logistic Regression

Mean = (sum of the values / the number of the value) if probabilities are equal

Fairfield Public Schools

MTH 140 Statistics Videos

1.4 Compound Inequalities

Intro to GIS Winter Data Visualization Part I

DATA INTERPRETATION AND STATISTICS

Example: Find the expected value of the random variable X. X P(X)

ModuMath Basic Math Basic Math Naming Whole Numbers Basic Math The Number Line Basic Math Addition of Whole Numbers, Part I

Aachen Summer Simulation Seminar 2014

ALGEBRA I (Common Core)

FACTORING QUADRATICS and 8.1.2

Example: Boats and Manatees

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

HOW TO WRITE A LABORATORY REPORT

Quantitative Methods for Finance

Lecture 7: Continuous Random Variables

How To Understand The Scientific Theory Of Evolution

INVESTIGATION OF FALLING BALL VISCOMETRY AND ITS ACCURACY GROUP R1 Evelyn Chou, Julia Glaser, Bella Goyal, Sherri Wykosky

MATHS LEVEL DESCRIPTORS

Confidence Intervals for the Difference Between Two Means

Transcription:

Basic Probability Problem Set The five problems in this section can all be done by hand. 1) A roulette wheel has 37 positions numbered 0,1,,36. Assume that the ball comes to rest at each position with equal probabilities. What is the probability a) of an even number, b) of a number greater than 30, c) of a number which is at most 10? 2) In a population of plants, individuals can have white flowers instead of the normal purple flowers with a probability of 0.02. Independently, plants in this population can have short anthers with a probability of 0.013. What is the probability of finding one of these rare plants, in other words a plant having white flowers and/or short anthers? Note, this problem can be solved by two subtly different ways of thinking about the problem. 3) Derive the formulas for the mean and variance of a uniform distribution from the probability density function and the definition of mean and variance. The probability density function of a uniform distribution is f(z)=1/(b-a) where where a z b.

4) The body mass of a population of adult beetles is normally distributed with a mean mass of 18.7 and a standard deviation of 1.3. You want to select some large beetles for an experiment. Identify the size which defines the largest 7% of the beetles. Show how you would find this size, but you don t necessarily have to solve the equation. (Hint: a graph of the probability distribution function may help you think through this problem) 5) Now that you ve achieved statistical mastery, it is time to create your very own probability distribution function! Create a probability density function defined for z between 2 and 4. Make the function linear and proportional to the line (1 + 3x). Don t forget about the rule that must be true for all pdf s. Sketch a graph of this new distribution and write the formula for the pdf. You should be able to do this all by hand (Optional bonus: do this all symbolically, in other words proportional to the line m x + b defined on the interval [p,q]).

Exercises in Simulating Data Example 1 Imagine a plant with a single female flower. The plant will either be visited by a pollen bearing pollinator or not, with a probability of success p. Examine across a number of patches. In this example, the probability of being pollinated is 0.85. We have 20 flowers in each patch, and 10 patches. R code visits <- rbinom(10,20,0.85) plot(visits,ylim = c(0,20)) Mathematica code <<Statistics`DiscreteDistributions` visits=table[random[binomialdistribution[20,0.85]],{10}] ListPlot[visits,PlotStyle PointSize[0.02],PlotRange {0,20}]; Example 2 a) Imagine you now have a camera set-up to record the number of pollinator visits to each individual flower. Now a flower can be visited multiple times. The rate of pollinator visits is a continuous value, but the number of individual visits on any particular plant is a discrete value. The rate of pollinator visits times the period of time over which the flower is observed is 2.26. The outcome is the discrete number of visits. b) Now imagine the rate of visitation depends on the size of the flower. Bigger flowers get more visits, E[v] = 2.00 + 0.2 w, where E[v] is the expected number of visits and w is the width of the flower corolla. Example 3 Imagine you sample the temperature in 40 experimental ponds. The average temperature is 25.0. Example 4 You are trying to estimate the rate of parasitism by Chewing Bird Lice in two different species. In a preliminary study, birds had been parasitized with an average of 12.3 lice on each bird. You have captured 35 Kirtland Warblers and 83 Bell's Vireos to look at differences in lice parasitism between the species. Simulate your results, assuming the Warblers are parasitized at a higher rate than the Vireos. Repeat the simulation twice using a different distribution form each time (hint: one of the distributions being the assumed distribution in classical statistics).

Useful Code for Chapter 4 Day 1 MTH 490.433 Return to the pollination example. We watch 100 flowers and count the number of times each flower is visited by a pollinator. 31 flowers are never visited. 41 flowers are visited once, etc. visits flowers 0 31 1 41 2 15 3 12 4 1 <<Statistics`DataManipulation`; exampledata={{0,31},{1,41},{2,15},{3,12},{4,1}}; visits=column[exampledata,1]; flowers=column[exampledata,2]; exampledata = data.frame(list(visits = c(0,1,2,3,4), flowers = c(31,41,15,12,1))) Find the average number of pollinators that visited a flower Mathmatica Code N[Total[visits*flowers]/Total[flowers]] with(exampledata,sum(flowers*visits)/sum(flowers)) If we think this is Poisson distributed, we can simulate a Poisson data set. <<Statistics`DiscreteDistributions` simdata=randomarray[poissondistribution[1.11],{100}] simdata <- rpois(100,1.11) table(simdata) We can compare the observed data to the expected distribution one would get from a Poisson distribution with 100 samples and a mean of 1.11. Use the Poisson pdf and generate the probability of getting 0, 1, 2, 3, and 4 visits if the rate parameter is 1.11. Mathmatica Code Table[PDF[PoissonDistribution[1.11],i],{i,0,4}] dpois(0:4,1.11)

Multiply each fraction by the total number of flowers to get the expected. We will extend the range of expected out to 10 in order to capture the entire distribution. expected=100 Table[PDF[PoissonDistribution[1.11],i],{i,0,10}] observed=flowers expected <- 100*dpois(0:10,1.11) observed <- exampledata$flowers Now I can use the c 2 test to compare the observed and expected distributions. This will give me absolute measure of how well the Poisson distribution fits these data. X 2 = Hobserved-expectedL2 expected This statistic can then be compared to a c 2 distribution to yield the probability that the observed and the expected come from the same distribution. As a rule of thumb, the expected value for each class should be greater than 5. If the expected value is less than 5 for any category, it should be combined with an adjacent class. Combine all the small expected classes and any corresponding observed classes. expected=append[take[expected,3],total[drop[expected,3]]] observed=append[take[observed,3],observed[[4]]+observed[[5]]] teststat=total[(observed-expected)^2/expected] <<Statistics`HypothesisTests` ChiSquarePValue[testStat,3] expected = c(expected[1:3],sum(expected[4:10])) observed = c(observed[1:3],observed[4]+observed[5]) chisq.test(observed,p=expected,rescale.p=true) Because we are comparing 4 classes, we have 3 degrees of freedom. We cannot reject the null hypothesis at the 0.05 level, so we consider the Poisson distribution to be an adequate fit to the data. Finally, this last bit of code can be used to generate the original data from a contingency table. Flatten[Apply[Table[#1,{#2}]&,exampleData,1]] with(exampledata,rep(visits,flowers))

Example from Chapter 4 Day 1 MTH 490.433 Try to work through this exercise without reading Chapter 4. It will be more informative for you to think through the distributions before you read their suggestions. After you ve worked through the data here, read Chapter 4, pages 94-95, 98-103. It will give you a little more background. See if you agree with their analysis. The data are the number of New Zealand fish tows that had an incidental catch of seabirds. For example, 807 tows had no birds caught in the gear. 37 tows had 1 bird caught in the gear, etc. Find a distribution that fit these data. You can download the data on the course website. Under Day 1 it is called bird data <http://www.unl.edu/cbrassil/elme/2007/ch4.csv>. You can import the data using the following code. mydata=import["c:\ch4.csv","csv"]; mydata <- read.csv("ch4.csv") Birds Tows 0 807 1 37 2 27 3 8 4 4 5 4 6 1 7 3 8 1 9 0 10 0 11 2 12 1 13 1 14 0 15 0 16 0 17 1