
Probability and the Chi-Square Test
written by J. D. Hendrix

Learning Objectives

Upon completing the exercise, each student should be able:
to determine the chance that a given state will occur in a system that consists of a finite number of equivalent states;
to determine the chance of two or more independent events occurring simultaneously by using the product rule;
to determine the chance that either one or the other of two mutually exclusive events will occur by using the sum rule;
to test hypotheses based on expected frequencies using the chi-square test.

Background

Many genetic events are random processes. These include the segregation and assortment of genes during meiosis, the combination of gametes during fertilization, and crossover between homologous chromosomes. Scientists describe random natural processes using the mathematical tools of probability and statistics.

A. The Chance that an Event will Occur

Consider a system that consists of a finite number of equivalent states. The chance that a given state will occur is given by the equation

$$C = \frac{a}{t}$$

in which C is the chance (probability) of the state, a is the number of times the state is represented in the system, and t is the total number of equivalent states in the system.

For example, we can develop a mathematical model to describe a coin toss. We assume that a coin toss is a system with two equivalent states, heads-up and tails-up. We describe each state with a probability.

$$\text{Chance of heads} = \frac{\text{number of heads}}{\text{total sides on the coin}} = \frac{1}{2} = 0.5 = 50\%$$

$$\text{Chance of tails} = \frac{\text{number of tails}}{\text{total sides on the coin}} = \frac{1}{2} = 0.5 = 50\%$$

We can use probabilities to predict the frequency of an event, or how often an event will occur.
In an experiment with 50 coin tosses:

Expected number of heads = 50 x 0.5 = 25
Expected number of tails = 50 x 0.5 = 25

We test the validity of the hypothesis (that there is an equal chance of getting heads or tails) by comparing the observed number of heads and tails in a coin-toss experiment with the expected values calculated from the probabilities. If the original assumptions in the hypothesis are not valid (for example, if the coin is heavier on one side, or if it is deformed in some way), then there could be a significant difference between the observed and expected values.

It is customary to express the expected outcome of an experiment involving frequencies as a ratio. In the coin toss experiment, we expect a heads:tails ratio of 1:1.

Imagine a standard deck of 52 playing cards, randomly shuffled. What is the chance of drawing an ace of hearts (A♥) from the deck?

$$\text{Chance of } A\heartsuit = \frac{\text{Number of } A\heartsuit}{\text{Total number of cards}} = \frac{1}{52} \approx 0.0192$$

What is the chance of drawing any ace?

$$\text{Chance of any ace} = \frac{\text{Number of aces}}{\text{Total number of cards}} = \frac{4}{52} \approx 0.0769$$

Imagine that an ace of hearts was drawn from a standard deck of 52 cards and returned to the deck. Then, the deck was reshuffled and another card drawn. What is the chance that the card will be an ace of hearts?

$$\text{Chance of } A\heartsuit = \frac{\text{Number of } A\heartsuit}{\text{Total number of cards}} = \frac{1}{52} \approx 0.0192$$

Notice that, since the card was returned to the deck, the total number of cards and the chance remain the same.

Imagine that an ace of hearts was drawn from a standard deck of 52 cards and discarded. Then, another card was drawn. What is the probability that the card will be an ace of hearts?

$$\text{Chance of } A\heartsuit = \frac{\text{Number of } A\heartsuit}{\text{Total number of cards}} = \frac{0}{51} = 0$$

Notice that the number of cards and the chance have changed. Since there is no longer an ace of hearts in the deck, the probability of drawing an ace of hearts is zero. What is the chance of drawing one of the three remaining aces?

$$\text{Chance of drawing one of the three remaining aces} = \frac{3}{51} \approx 0.0588$$
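The card calculations above can be checked with a short Python sketch. It assumes the deck is modeled as a list of (rank, suit) pairs; the helper name `chance` and the variable names are illustrative only and are not part of the original exercise.

```python
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "spades", "diamonds", "clubs"]


def chance(a, t):
    """C = a / t: a states of interest out of t equivalent states."""
    return Fraction(a, t)


# A standard deck: 13 ranks x 4 suits = 52 equivalent states.
deck = [(rank, suit) for suit in SUITS for rank in RANKS]

ace_of_hearts = chance(deck.count(("A", "hearts")), len(deck))
any_ace = chance(sum(1 for rank, _ in deck if rank == "A"), len(deck))

# Discard the ace of hearts: 51 cards remain, none of them the ace of hearts.
remaining = [card for card in deck if card != ("A", "hearts")]
ace_of_hearts_after_discard = chance(remaining.count(("A", "hearts")), len(remaining))
one_of_remaining_aces = chance(
    sum(1 for rank, _ in remaining if rank == "A"), len(remaining)
)

# Fraction reduces automatically, so 4/52 is stored as 1/13 and 3/51 as 1/17.
for label, value in [
    ("ace of hearts", ace_of_hearts),
    ("any ace", any_ace),
    ("ace of hearts after discard", ace_of_hearts_after_discard),
    ("one of the three remaining aces", one_of_remaining_aces),
]:
    print(f"{label}: {value} = {float(value):.4f}")
```

Using `Fraction` keeps the values exact, so the decimal approximations are only produced at the point of printing.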

B. The Chance of Independent Events Occurring Together

The chance of two or more independent events occurring together is the product of their individual probabilities. An example is the simultaneous tossing of two coins. The outcome of the toss on one coin should not affect the outcome on the second coin (unless the coins are glued together). Therefore, the events are independent of each other.

Outcome of toss on       Probability of toss on     Probability of both
Coin #1     Coin #2      Coin #1     Coin #2        events occurring
Heads       Heads        0.5         0.5            0.5 x 0.5 = 0.25
Heads       Tails        0.5         0.5            0.5 x 0.5 = 0.25
Tails       Heads        0.5         0.5            0.5 x 0.5 = 0.25
Tails       Tails        0.5         0.5            0.5 x 0.5 = 0.25

What is the probability of rolling a pair of sixes on a standard set of dice? As you probably know, a die is a game cube with six sides, each side marked with one to six dots. Assuming that the mass of the cube is evenly distributed, the chance of rolling any of the numbers is 1/6. Therefore, the probability of rolling a pair of sixes on a pair of dice is

$$\frac{1}{6} \times \frac{1}{6} = \frac{1}{36}$$

C. Mutually Exclusive Events (Either/or situations)

The chance that either one or the other of two mutually exclusive events will occur is the sum of their individual probabilities. For example, consider a box containing two red beads, three white beads, and four blue beads. If one bead is randomly chosen, what is the chance that the bead will be either red or white?

$$\text{Chance of a red bead} = \frac{2}{9}$$

$$\text{Chance of a white bead} = \frac{3}{9}$$

$$\text{Chance of either red or white} = \frac{2}{9} + \frac{3}{9} = \frac{5}{9}$$
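The two rules in sections B and C can be sketched in Python as follows. This is only an illustration under the assumptions stated in the text (fair dice, one bead drawn at random); the variable names are made up for the example. The product-rule result is cross-checked by listing every equally likely outcome of two dice.

```python
from fractions import Fraction
from itertools import product

# Product rule: independent events multiply.
p_six = Fraction(1, 6)
p_pair_of_sixes = p_six * p_six

# Cross-check by enumerating all 36 equally likely (die 1, die 2) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
p_pair_of_sixes_enumerated = Fraction(outcomes.count((6, 6)), len(outcomes))

# Sum rule: mutually exclusive events add.
beads = {"red": 2, "white": 3, "blue": 4}
total_beads = sum(beads.values())
p_red = Fraction(beads["red"], total_beads)
p_white = Fraction(beads["white"], total_beads)
p_red_or_white = p_red + p_white

print("pair of sixes:", p_pair_of_sixes, "| by enumeration:", p_pair_of_sixes_enumerated)
print("red or white bead:", p_red_or_white)
```

The sum rule applies here only because a single bead cannot be both red and white; for events that can occur together, simple addition would count the overlap twice.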

D. Hypothesis Testing using the Chi-square Test

Let's develop a formal hypothesis for the coin toss experiment.

Hypothesis: If the mass of a coin is symmetrically distributed on both sides of the coin, then there is an equal probability of a coin toss resulting in heads or tails.

From this hypothesis we can make the following prediction.

Prediction: If a specific coin is tossed 50 times, then 25 of the tosses will result in heads and 25 of the tosses will result in tails.

The prediction can be tested by performing the following experiment.

Experiment: Toss the coin 50 times and count the number of heads and tails.
Independent variable: Number of times the coin is tossed.
Dependent variable: Number of heads or tails.

The observed results in an experiment are almost never exactly equal to the expected results. For example, in the coin toss experiment one expects 25 heads and 25 tails if a coin is tossed 50 times. However, what if the result is 27 heads and 23 tails? Is this a significant difference between the expected and the observed results, or can we attribute the difference to random chance? It seems to make sense that a result of 27 heads, 23 tails is reasonable, but how can we be sure? If we repeated the experiment 100 times, how often would we expect to see this much deviation from the expected value (25:25)?

The chi-square (χ²) test is a statistical test used to determine whether the difference between an expected result and an observed result is significant or whether the difference can be attributed to random chance. To analyze experimental data using the χ² test, the data must consist of a finite number of mutually exclusive outcomes or classes. Also, we must know the probability of each class in order to calculate the expected values.

The number of degrees of freedom in an experiment is the total number of classes minus one: df = k - 1, where k is the number of classes. In the coin experiment, there are two outcomes or classes of results, heads and tails. Therefore, there is one degree of freedom.

The value of χ² is given by the equation

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed number of items in a given class, E is the expected number of items in the class, and the summation sign (Σ) indicates the sum of all values of (O - E)²/E for every class in the system. Consider the following results of the coin toss experiment.

Toss     # obtained (O)   # expected (E)    O - E   (O - E)²   (O - E)²/E
Heads    27               50 x 0.5 = 25       2      4          0.160
Tails    23               50 x 0.5 = 25      -2      4          0.160
Total:   50                                              χ² =   0.320

Is the difference between O and E significant? If so, then we reject the hypothesis. If not, then we fail to reject the hypothesis. We evaluate the difference from a table of χ² values, such as the one shown below.

P value = probability that a deviation at least this large arises by chance alone when the hypothesis is true

df    P = 0.95   0.80     0.50     0.20     0.10     0.05     0.01
 1    0.00393    0.0642   0.455    1.642    2.706    3.841    6.635
 2    0.103      0.446    1.386    3.219    4.605    5.991    9.210
 3    0.352      1.005    2.366    4.642    6.251    7.815   11.341
 4    0.711      1.649    3.357    5.989    7.779    9.488   13.277
 5    1.145      2.343    4.351    7.289    9.236   11.070   15.086
 6    1.635      3.070    5.348    8.558   10.645   12.592   16.812
 7    2.167      3.822    6.346    9.803   12.017   14.067   18.475
 8    2.733      4.594    7.344   11.030   13.362   15.507   20.090
 9    3.325      5.380    8.343   12.242   14.684   16.919   21.666
10    3.940      6.179    9.342   13.442   15.987   18.307   23.209

Locate the value of χ² in the row corresponding to the appropriate df value. In this example, the value of χ² = 0.320, and the value of df = 2 - 1 = 1. Therefore, the χ² value is between 0.0642 and 0.455.

0.0642 < χ² < 0.455

The probability, P, of a deviation at least this large arising by random chance is read from the top row of the table.

0.80 > P > 0.50

How do we interpret this nonsense? In most genetics work, deviations are considered significant only if the probability value from the χ² table is 0.05 (5%) or less. This is called a 5% level of significance (or a 95% confidence level). If the probability is 0.05 or less, a deviation this large would arise by chance alone no more than 5% of the time if the hypothesis were true, so the deviation is considered significant and the hypothesis is rejected. If the probability is greater than 0.05, then we cannot reject the hypothesis based on the data. In our example, P is greater than 0.05, so the hypothesis is not rejected.

Whew! That's a lot of words. To put it simply:

If the P value from the χ² table is 0.05 or less, then the deviation of the observed values from the expected values is significant and the data do not support the hypothesis.

If the P value from the χ² table is greater than 0.05, then the deviation of the observed values from the expected values is not significant and the data support the hypothesis.
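The calculation for the 27 heads, 23 tails result can be sketched in Python as shown here. The function names are illustrative, and the P value helper is an assumption that goes beyond the handout: it uses the identity P = erfc(√(χ²/2)), which holds only for one degree of freedom, in place of the table lookup. The cutoff 3.841 is the standard critical value for df = 1 at P = 0.05.

```python
import math


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every class."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail probability of chi-square with ONE degree of freedom.

    Valid for df = 1 only: P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


tosses = 50
observed = [27, 23]                      # heads, tails
expected = [tosses * 0.5, tosses * 0.5]  # 25 and 25 under the hypothesis

chi2 = chi_square(observed, expected)
df = len(observed) - 1                   # number of classes minus one
p = p_value_df1(chi2)

CRITICAL_DF1_P005 = 3.841                # df = 1, P = 0.05
significant = chi2 > CRITICAL_DF1_P005

print(f"chi-square = {chi2:.3f}, df = {df}, P = {p:.3f}, significant: {significant}")
```

The computed P should fall inside the 0.50 to 0.80 bracket read from the table, which is all the table lookup can resolve.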

In this example, the P value is between 0.50 and 0.80. This means that a deviation at least this large would be expected by chance alone in between 50% and 80% of such experiments. Since the P value is greater than 0.05, the deviation is not significant at the 95% confidence level, and the data support the hypothesis.

Consider the results of another coin toss experiment, using a different coin.

Toss     # obtained (O)   # expected (E)    O - E   (O - E)²   (O - E)²/E
Heads    13               25                -12     144         5.76
Tails    37               25                 12     144         5.76
Total:   50                                              χ² =  11.52

As before, df = 2 - 1 = 1. At df = 1, it looks like the χ² value we calculated is off the chart! This simply means that the deviation is so big that the χ² value is larger than the largest value recorded in the chart. The P value must therefore be smaller than 0.01 (and therefore smaller than 0.05). Hence:

χ² > 6.635
P < 0.01

Since P < 0.05, the deviation of the observed values from the expected in this coin toss is significant and the data do not support the hypothesis. Can you suggest a reason why the coin toss experiment failed to support the expected 1:1 ratio in this case? (Here's a hint: read the first sentence of the hypothesis for an important assumption that led us to the 1:1 ratio.)
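As a cross-check on the two coin experiments (27 heads and 13 heads out of 50), the chance of a deviation at least as large as the one observed can also be computed exactly from the binomial distribution, without the χ² approximation. The sketch below is not part of the handout's method; it assumes a fair coin, and the function name is hypothetical.

```python
from math import comb


def prob_deviation_at_least(n, heads):
    """Exact chance, for a fair coin tossed n times, of a head count
    at least as far from n / 2 as the observed `heads`."""
    deviation = abs(heads - n / 2)
    # Each head count k occurs in comb(n, k) of the 2**n equally likely sequences.
    favourable = sum(
        comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= deviation
    )
    return favourable / 2 ** n


p_first = prob_deviation_at_least(50, 27)   # first coin: 27 heads, 23 tails
p_second = prob_deviation_at_least(50, 13)  # second coin: 13 heads, 37 tails

print(f"first coin:  P = {p_first:.4f}")
print(f"second coin: P = {p_second:.6f}")
```

The exact values need not equal the ones implied by the χ² table, because the χ² test is an approximation, but for these two experiments they lead to the same conclusions: the first deviation is not significant at the 0.05 level and the second is.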

Probability and the Chi-Square Test
Laboratory Report Sheet

Name
Lab Partners

1. A standard deck of 52 playing cards has 13 cards of each suit (hearts, spades, diamonds, or clubs). What is the probability of drawing a diamond?

2. If two coins are tossed, what is the probability that one coin will be heads and the other coin will be tails, with either of the two coins being heads? To solve this problem, start with the information given in section B under the Product Rule, then apply the Sum Rule to solve for an either/or situation.

3. What is the probability of rolling a seven in any combination on a pair of dice? To solve this problem, you will need to use a combination of the product rule and the summation rule. Try completing this table. Remember that each roll is a mutually exclusive event (that is, if you roll a 1 and a 6, you can't roll a 2 and a 5 at the same time).

Roll on Die 1    Roll on Die 2    Probability of this Roll
1                6                1/6 x 1/6 = 1/36
2

Probability of rolling a seven in any combination:

4. In corn, the genes for seed color (purple or yellow) and seed shape (smooth or wrinkled) assort independently of each other. This means that the expected frequencies (probabilities) of these traits in a cross can be treated as independent events occurring simultaneously, so the product rule applies.

If hybrid purple corn is self-fertilized, the following offspring are expected:
¾ Purple
¼ Yellow

If hybrid smooth corn is self-fertilized, the following offspring are expected:
¾ Smooth
¼ Wrinkled

Here's the question: If corn that is both hybrid purple and hybrid smooth is self-fertilized, what results do you expect? Use the product rule to figure out how many purple smooth, purple wrinkled, yellow smooth, and yellow wrinkled kernels you expect.

5. You will be provided a 6-sided die (game cube).

(a) Write a formal hypothesis, prediction, experiment, and variables about the probabilities of tossing numbers on the die.
(b) Perform your experiment. You should have a sufficient sample size (i.e. several hundred rolls) for a valid statistical sample.
(c) Use the χ² test to determine if your data support your hypothesis. Show your work.
(d) Write a brief conclusion summarizing your results. If the data do not support your hypothesis, you should suggest reasons in your conclusion.
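For the die experiment, the bookkeeping of the χ² test with six classes might be organized as in the following sketch. It is not a substitute for the experiment: the rolls are simulated with Python's random number generator purely as placeholder data, the sample size of 600 and the seed are arbitrary assumptions, and the cutoff 11.070 is the standard critical value for df = 5 at P = 0.05. Replace `observed` with your own counts.

```python
import random
from collections import Counter


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over every class."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Placeholder data: simulated rolls stand in for real, hand-recorded rolls.
rng = random.Random(2024)        # arbitrary fixed seed so the run is repeatable
n_rolls = 600                    # assumed sample size ("several hundred rolls")
rolls = [rng.randint(1, 6) for _ in range(n_rolls)]

counts = Counter(rolls)
observed = [counts[face] for face in range(1, 7)]
expected = [n_rolls / 6] * 6     # equal chance of each face under the hypothesis

chi2 = chi_square(observed, expected)
df = len(observed) - 1           # six classes, so five degrees of freedom

CRITICAL_DF5_P005 = 11.070       # df = 5, P = 0.05
reject = chi2 > CRITICAL_DF5_P005

print("observed counts:", observed)
print(f"chi-square = {chi2:.3f}, df = {df}, reject hypothesis: {reject}")
```

With more than two classes, the degrees of freedom and the critical value change, but the formula for χ² stays the same.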