2 GENETIC DATA ANALYSIS



Similar documents
GENETIC CROSSES. Monohybrid Crosses

Name: Class: Date: ID: A

AP: LAB 8: THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

LAB : THE CHI-SQUARE TEST. Probability, Random Chance, and Genetics

PRACTICE PROBLEMS - PEDIGREES AND PROBABILITIES

Hardy-Weinberg Equilibrium Problems

Chapter 9 Patterns of Inheritance

Chapter 5. Discrete Probability Distributions

Bio 102 Practice Problems Mendelian Genetics and Extensions

Mendelian and Non-Mendelian Heredity Grade Ten

Ex) A tall green pea plant (TTGG) is crossed with a short white pea plant (ttgg). TT or Tt = tall tt = short GG or Gg = green gg = white

The Making of the Fittest: Natural Selection in Humans

7 th Grade Life Science Name: Miss Thomas & Mrs. Wilkinson Lab: Superhero Genetics Due Date:

LAB : PAPER PET GENETICS. male (hat) female (hair bow) Skin color green or orange Eyes round or square Nose triangle or oval Teeth pointed or square

Incomplete Dominance and Codominance

Problems 1-6: In tomato fruit, red flesh color is dominant over yellow flesh color, Use R for the Red allele and r for the yellow allele.

Heredity. Sarah crosses a homozygous white flower and a homozygous purple flower. The cross results in all purple flowers.

CCR Biology - Chapter 7 Practice Test - Summer 2012

Variations on a Human Face Lab

Mendelian Genetics in Drosophila

Name: 4. A typical phenotypic ratio for a dihybrid cross is a) 9:1 b) 3:4 c) 9:3:3:1 d) 1:2:1:2:1 e) 6:3:3:6

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

5 GENETIC LINKAGE AND MAPPING

I. Genes found on the same chromosome = linked genes

Lesson Plan: GENOTYPE AND PHENOTYPE

Math 108 Exam 3 Solutions Spring 00

A trait is a variation of a particular character (e.g. color, height). Traits are passed from parents to offspring through genes.

DNA Determines Your Appearance!

The Genetics of Drosophila melanogaster

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Heredity - Patterns of Inheritance

7A The Origin of Modern Genetics

Likelihood: Frequentist vs Bayesian Reasoning

Genetics 1. Defective enzyme that does not make melanin. Very pale skin and hair color (albino)

somatic cell egg genotype gamete polar body phenotype homologous chromosome trait dominant autosome genetics recessive

Genetics for the Novice

Testing Research and Statistical Hypotheses

Is it statistically significant? The chi-square test

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section:

Biology Notes for exam 5 - Population genetics Ch 13, 14, 15

P1 Gold X Black. 100% Black X. 99 Black and 77 Gold. Critical Values

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

CCpp X ccpp. CcPp X CcPp. CP Cp cp cp. Purple. White. Purple CcPp. Purple Ccpp White. White. Summary: 9/16 purple, 7/16 white

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Chromosomes, Mapping, and the Meiosis Inheritance Connection

BIO 184 Page 1 Spring 2013 NAME VERSION 1 EXAM 3: KEY. Instructions: PRINT your Name and Exam version Number on your Scantron

MA 1125 Lecture 14 - Expected Values. Friday, February 28, Objectives: Introduce expected values.

STATISTICS PROJECT: Hypothesis Testing

statistics Chi-square tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals

Chi Square Tests. Chapter Introduction

Basic Principles of Forensic Molecular Biology and Genetics. Population Genetics

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Two copies of each autosomal gene affect phenotype.

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

EMPIRICAL FREQUENCY DISTRIBUTION

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

Bio 102 Practice Problems Mendelian Genetics: Beyond Pea Plants

Math 3C Homework 3 Solutions

2 18. If a boy s father has haemophilia and his mother has one gene for haemophilia. What is the chance that the boy will inherit the disease? 1. 0% 2

Genetics and Evolution: An ios Application to Supplement Introductory Courses in. Transmission and Evolutionary Genetics

5. The cells of a multicellular organism, other than gametes and the germ cells from which it develops, are known as

HLA data analysis in anthropology: basic theory and practice

Chi Squared and Fisher's Exact Tests. Observed vs Expected Distributions

Introduction to Hypothesis Testing

Appendix 2 Statistical Hypothesis Testing 1

4. Continuous Random Variables, the Pareto and Normal Distributions

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

I Have...Who Has... Multiplication Game

EXERCISE 11 MENDELIAN GENETICS PROBLEMS

17. A testcross A.is used to determine if an organism that is displaying a recessive trait is heterozygous or homozygous for that trait. B.

Chi-square test Fisher s Exact test

Correlational Research

Saffiyah Y. Manboard Biology Instructor Seagull Alternative High School

Simple Linear Regression Inference

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

A POPULATION MEAN, CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

COURSE DESCRIPTION. Course Number: NM: RISD: 13109A, 13109B. Successful completion of Forensics I (C or better)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

12: Analysis of Variance. Introduction

University of Chicago Graduate School of Business. Business 41000: Business Statistics

edtpa: Task 1 Secondary Science

If you crossed a homozygous, black guinea pig with a white guinea pig, what would be the phenotype(s)

Chi Square Distribution

Crosstabulation & Chi Square

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Recall this chart that showed how most of our course would be organized:

Biology 1406 Exam 4 Notes Cell Division and Genetics Ch. 8, 9

Use of the Chi-Square Statistic. Marie Diener-West, PhD Johns Hopkins University

CHAPTER 5 COMPARISON OF DIFFERENT TYPE OF ONLINE ADVERTSIEMENTS. Table: 8 Perceived Usefulness of Different Advertisement Types

Human Blood Types: Codominance and Multiple Alleles. Codominance: both alleles in the heterozygous genotype express themselves fully

Study Guide for the Final Exam

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Name: Date: Use the following to answer questions 3-4:

The correct answer is c A. Answer a is incorrect. The white-eye gene must be recessive since heterozygous females have red eyes.

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Roadmap to Data Analysis. Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies

November 08, S8.6_3 Testing a Claim About a Standard Deviation or Variance

MCAS Biology. Review Packet

Transcription:

2.1 Strategies for learning genetics 2 GENETIC DATA ANALYSIS We will begin this lecture by discussing some strategies for learning genetics. Genetics is different from most other biology courses you have taken in that memorization is not very important. You are expected to learn vocabulary and some examples of genetic disorders, formulae, etc. But learning and applying concepts is much more important. In particular, you need to be able to think about and analyze genetic data. Almost all the homework and exam questions in this part of the course will require you to examine and analyze data, and to make some conclusion based upon your analysis. In this way, the work you do in this course is very similar to work done by real geneticists. This is one reason why students who have always done well in biology sometimes do poorly in genetics. Learning how to approach a set of genetic data in a logical and consistent way takes patience and a good deal of practice. I will introduce you to ways to think about problems during lectures. It is therefore very important that you attend lecture. I will use several techniques of active learning, because these techniques have been proven to be more effective in increasing student learning than the traditional lecture-only format. I think these techniques also make class time more interesting for students. 2.2 Rules of Probability Sum rule: The combined probability of two events that are mutually exclusive is the sum of the individual probabilities. Genetic example of the sum rule: Parental genotypes (monohybrid cross): GG x gg x indicates a genetic cross F1 genotype: Gg F2 (produced by mating F1 individuals): 1/4 GG: 1/2 Gg: 1/4 gg (1:2:1 genotypic ratio) Q: What is the probability that an F2 offspring of a monohybrid cross has the dominant phenotype (is either GG OR Gg)? A: P[dominant phenotype] = P[GG] + P[Gg] = + = (Note: the P[x] stands for Probability of x ) Product rule: The probability of both of two independent events is the product of the individual probabilities. Genetic example using product rule and sum rule: Parental genotypes (dihybrid cross): GGww x ggww F1: GgWw F2: 9/16 G W : 3/16 G ww : 3/16 ggw : 1/16 ggww 1

(Note: the indicates either the dominant or recessive allele G indicates an individual with the dominant phenotype: either GG or Gg) Q: What is the probability that an F2 offspring of the following dihybrid cross will have at least one dominant allele for each trait (i.e. that it will be G W )? Solution: First note that segregation of the G locus is independent of segregation at the W locus (Mendel s law of independent assortment) So that P[G W ] = P[G ] * P[W ] As above, GG and Gg are mutually exclusive (an individual can be GG or Gg, but not both). WW and Ww are also mutually exclusive. P[G-] = P[Gg] or P[GG] = P[Gg] + P[GG] P[Gg] = 1/2 P[GG] = 1/4 So: P[G-] = 1/2 + 1/4 = 3/4 P[W-]= P[Ww] + P[WW] = 1/2 + 1/4 =3/4 P[G-W-] = P[G-] * P[W-] = 3/4 * 3/4 Answer: iii. Complex genetic problem--6 independent genes: Parental genotypes: AA bb CC DD ee ff x aa BB cc dd EE FF F1: Aa Bb Cc Dd Ee Ff x Aa Bb Cc Dd Ee Ff Q: What proportion of F2 progeny will be AA bb Cc DD ee Ff? A: iv. Genetic example using conditional probability Q: In the F2 progeny from monohybrid cross, what is the proportion of heterozygotes among dominant progeny? A: P[heterozygous] / P[dominant] = P [Gg] / P[GG or Gg] = 1/2 / [1/4 + 1/2] = 1/2 / 3/4 = 2/3 2.3 Genetic Data Analysis: Goodness of Fit 2

What if the F2 generation of a monohybrid cross has a phenotypic ratio close to 3:1, but not exactly 3:1? Q: How do we tell if deviations from expected proportions in a genetic experiment are due to chance or due to the fact that our genetic hypothesis is wrong? A: Repeat an experiment many times. E.g, the experiment is: toss a fair coin 100 times and count how many heads are throw. Repeat this experiment 100 times. Q: What is the most likely outcome of a single experiment? A: Q: Will all 100 experiments have the same outcome if the coin is fair? A: 100 Tosses of a Fair Coin Times out of 100 9 8 7 6 5 4 3 2 1 0 0 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100 Number of Heads Getting 65 H and 35 T is an unlikely result, but not impossible. How unlikely is it? The probability is actually 0.0018. In other words, this result is expected to occur 18 times out of every thousand times the experiment is repeated (or 1.8 times out of every one hundred trials). In same way, consider a cross of Aa to an individual of unknown genotype that yields 100 progeny. We hypothesize that the unknown individual has genotype aa. Q: If the progeny contain 65 of the dominant phenotype, and 35 of the recessive phenotype, can we reject our hypothesis that the unknown parent was aa and that the cross was therefore Aa x aa? Another way to phrase the question is, how likely are we to get a result this different from the expected result if our hypothesis is true? First, what is the expected result? What is the observed result? How could we measure the difference between the expected and observed results? 3

A: A test commonly used to measure the goodness of fit is the CHI-SQUARE TEST (pronounced with a hard C and to rhyme with pie ). The observed number (O) in each category is compared to the expected number (E) under your hypothesis. We use the square of the difference between observed and expected numbers, and standardize by dividing by the expected number in a category. (O-E) is referred to as the difference (d). the symbol means to sum the values over each category or class of the data χ 2 = [(O-E) 2 ]/E = (d 2 /E) So for our testcross example: class O E (O-E) (O-E) 2 = d 2 (O-E) 2 /E Aa 65 50 15 225 225/50 = 4.5 aa 35 50-15 225 225/50 = 4.5 χ 2 = 9 d.f. = (#classes -1) = 1 Compare χ 2 to a theoretical value from a chi-square table that has the same degrees of freedom. Theoretical value based on: Assuming the hypothesis (null hypothesis) we proposed is correct, if we repeated our genetic experiment, many times (counting 100 progeny each time) and calculated a χ 2 for each, we could obtain a distribution of chi-square values. In a certain number of these repeated experiments, we would get 65 or more Aa offspring out of 100, and the χ 2 value associated with these cases would be greater than or equal to the χ 2 value we obtained above (9). The proportion of times that this occurs, is the probability of getting a value of 9 or greater, when you have 1 degree of freedom. In this case the probability associated with a χ 2 value of 9 is less than 1%. χ 2 Table Probabilities df 0.90 0.50 0.20 0.05 0.01 0.001 1 0.02 0.46 1.64 3.84 6.64 10.83 2 0.21 1.39 3.22 5.99 9.21 13.82 3 0.58 2.37 4.64 7.82 11.35 16.27 4 1.06 3.36 5.99 9.49 13.28 18.47 0.001<P<0.01 < 1% chance 4

Degrees of freedom refers to integer number that is generally equal to the number of categories in the data, minus 1. In our coin-toss problem, the degrees of freedom is the number of phenotypic classes that are independent of each other. For example, if there are two categories, there is only one degree of freedom. This constraint occurs because, given the total number, 100 progeny, and the number in one class (65), the number in the other class (35) is fixed. Only one class can vary freely, the other is constrained. To reiterate, due to chance, χ 2 will be large for a few samples and small in most other samples. The Critical Values of the χ 2 distribution are exceeded by chance only rarely, and that rarity is indicated by the probability value. The common practice in genetics is to set the critical value at 5%. This means that an event that should occur by chance less than 5% of the time, is considered statistically significant, and justifies the rejection of the null hypothesis (in this example the null hypothesis is that the coin is fair, so that we expect 50% heads and 50% tails). So, we say that the standard for rejecting the null hypothesis is P < 0.05. Data from a genetic experiment: Assume you have crossed two pea plants with purple flowers. Your hypothesis is that both plants are heterozygous for a dominant allele at a single locus controlling flower color. You could write this in symbols as: H: P: (Ww X Ww) F1: 3/4 W- (purple) and 1/4 ww (white). H stands for genetic Hypothesis Or a 3:1 phenotypic ratio Observed Results: Of 166 progeny, 110 had purple flowers and 56 had white flowers. Expected Results: If your genetic hypothesis is true, then you expect 3/4 purple and 1/4 white out of 166 total offspring. That is, you expect 124.5 purple and 41.5 white. Note: If you have learned to think about statistical tests in terms of a null hypothesis and an alternative hypothesis, then the your null hypothesis is the F1 offspring numbers are consistent with a 3:1 phenotypic ratio. The alternative hypothesis is that the F1 offspring numbers are not consistent with a 3:1 ratio. Q: Is the difference between observed and expected too big to be due to chance? If it is, then you reject your proposed genetic hypothesis (you reject the null hypothesis in favor of the alternative hypothesis). χ 2 TEST Class O E (O-E) (O-E) 2 = d 2 (O-E) 2 /E Purple 110 124.5-14.5 210.25 210.25/124.5 = 1.69 White 56 41.5 14.5 210.25 210.25/41.5 = 5.07 χ 2 = 6.76 1 degree of freedom 5

This χ 2 value is a relatively large number indicating that the observed numbers are not very close to the expected, but we need an objective measure of how close to the expected it really is. Since 6.76 is larger than the value in the table for P=0.01 with one degree of freedom, we can be fairly confident that our hypothesis is not correct. We would expect this large a deviation from our expected results due to chance alone fewer than once out of every 100 repeated experiments. 6