TIEE Teaching Issues and Experiments in Ecology - Volume 1, January 2004


EXPERIMENTS

Environmental Correlates of Leaf Stomata Density

Bruce W. Grant and Itzick Vatnick
Biology, Widener University, Chester PA, 19013
grant@pop1.science.widener.edu vatnick@pop1.science.widener.edu

[Figure: stomata viewed at 400x in nail polish impression from leaf underside. Marc Brodkin, 2000]

Appendix 1. Guidelines for Statistical Analysis

Modern biological research emphasizes the collection of quantitative data on a variety of biological topics. Many of these data are highly variable. As a result, techniques of statistical analysis are very valuable in helping the biologist describe the variation within sets of data, express the degree of confidence that can be placed in average values, and objectively test hypotheses about data collected from different groups of subjects. This handout describes a number of techniques commonly used by biologists for these purposes and that you will use in the analysis of your stomata data.

Experiments in Ecology, TIEE Volume 1 2004 - Ecological Society of America. (www.tiee.ecoed.net)

A. Descriptive Statistics.

After a set of data is collected, it can then be analyzed statistically in order to better determine whether the data support or reject a given hypothesis. The first procedure that is usually done is to calculate a set of parameters that describe two aspects of the data: (1) central tendency and (2) dispersion.

[Figure labels: dispersion, central tendency]

(1) Measures of Central Tendency. One type of statistic determines the central tendency of the data. The central tendency provides information on how the values of the data you collected cluster around some single middle value. There are three measures of central tendency that are used in the analysis of data, which are described below:

MODE = the most frequently observed value of the data

MEDIAN = the middle value when the data set is ordered in sequential rank (i.e. highest to lowest, or lowest to highest)

MEAN = average value. The mean is the most commonly used measure of central tendency. It is estimated using the sum of all the individual values (x_i) divided by the total number of individuals in the sample (n):

MEAN: \bar{X} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + \dots + x_n}{n}
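As a minimal sketch of the three measures above, the following Python snippet uses the standard library `statistics` module on the quiz-score data from this handout's worked example. The variable names are illustrative and are not part of the original handout.

```python
import statistics

# Quiz-score data from the handout's worked example.
scores = [3, 3, 4, 5, 6, 6, 6, 6, 7, 8, 10]

mode = statistics.mode(scores)      # most frequently observed value
median = statistics.median(scores)  # middle value of the ranked data
mean = sum(scores) / len(scores)    # sum of all values divided by n

print(mode, median, round(mean, 2))
```

Computing the mean directly as a sum divided by n mirrors the formula above; `statistics.mean` would give the same result.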

(2) Measures of Dispersion. Another set of statistics describes how spread out the data are.

RANGE = the highest value minus the lowest value.

VARIANCE. The variance is the sum of the squared differences, or deviations, between the individual values and the mean value, divided by the number of individuals in the sample minus one.

VARIANCE: \sigma^2 = \frac{\sum (x_i - \bar{X})^2}{n - 1} = \frac{\sum x_i^2 - n \bar{X}^2}{n - 1}

STANDARD DEVIATION. The square root of the variance.

STANDARD DEVIATION: S = \sqrt{\sigma^2} = \sqrt{\frac{\sum x_i^2 - n \bar{X}^2}{n - 1}}

STANDARD ERROR. The standard error is the standard deviation divided by the square root of the sample size.

STANDARD ERROR = \frac{S}{\sqrt{n}}

e.g. for the data { 3, 3, 4, 5, 6, 6, 6, 6, 7, 8, 10 }, which could represent a set of quiz scores:

MODE = 6
MEDIAN = 6
MEAN = (3 + 3 + 4 + 5 + 6 + 6 + 6 + 6 + 7 + 8 + 10)/11 = 5.82
SAMPLE VARIANCE = 4.363636
STANDARD DEVIATION = 2.088932
STANDARD ERROR = 0.629837
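The dispersion measures for the quiz-score example can be checked with a short Python sketch. It computes the variance both from the squared deviations and from the equivalent sum-of-squares form, then derives the standard deviation and standard error. The variable names are assumptions made for illustration.

```python
import math
import statistics

scores = [3, 3, 4, 5, 6, 6, 6, 6, 7, 8, 10]
n = len(scores)
mean = sum(scores) / n

# Definitional form: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)

# Computational form: (sum of squares - n * mean^2) / (n - 1).
variance_alt = (sum(x * x for x in scores) - n * mean ** 2) / (n - 1)

# Cross-check against the standard library's sample variance.
library_variance = statistics.variance(scores)

std_dev = math.sqrt(variance)          # square root of the variance
std_err = std_dev / math.sqrt(n)       # standard deviation / sqrt(sample size)
data_range = max(scores) - min(scores)

print(round(variance, 6), round(std_dev, 6), round(std_err, 6), data_range)
```

Both forms of the variance agree up to floating-point rounding, which is why either may be used for hand calculation.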

FINDING DESCRIPTIVE STATISTICS USING MICROSOFT'S EXCEL

The computer makes data analysis easy. All you need is to enter your data into a spreadsheet and follow the simple steps below:

1. Under Tools click on Add-Ins and then click on Analysis ToolPak and OK.

2. Look again under the Tools menu and a new option, Data Analysis, will appear at the bottom of the menu. Click on Data Analysis, and click on Descriptive Statistics.

3. Highlight your column of data. Check the Summary Statistics box so that an X appears. Next, specify the Output Range, i.e. where you want to put the analysis output table, and finally hit OK.

4. The program will produce a table of statistics that will look something like this:

Variable 1
Mean                         5.818182
Standard Error               0.629837
Median                       6
Mode                         6
Standard Deviation           2.088932
Sample Variance              4.363636
Kurtosis                     0.338976
Skewness                     0.454113
Range                        7
Minimum                      3
Maximum                      10
Sum                          64
Count                        11
Confidence Level(95.000%)    1.234455

B. Statistical Testing: Comparisons of Means Using a STUDENT'S T-TEST

For this lab activity, we are going to carry the analysis of the data one step further and determine whether the hypothesis you proposed for the distribution of stomata in the two groups of leaves you collected should be accepted or rejected. To do this you should compare the means of your two experimental groups using a statistical test called Student's t-test.

The t-test is a statistical test used to determine if the means of two data sets are significantly different. In statistical terms, the t-test is used to determine if the two data sets you collected come from the same or different distributions. The t-test can only be used when comparing the means of two samples. More than two requires a different test.

To perform a t-test, one calculates a t-value from the two data sets you wish to compare. The t-value is a measure of the ratio of signal to noise in your data. The "signal" in the numerator represents the difference between the means. In other words, if the means of your two samples are very different then the signal is large.

t = \frac{\text{signal}}{\text{noise}}

The "noise" in the denominator represents the total amount of variation in both samples (i.e. the pooled variation). It is found by dividing the variance of each data set by that data set's sample size, summing the two results, and taking the square root. Admittedly, this is kind of a tricky idea; however, it makes sense when you think about it, because if there is a great deal of variation in either or both of your data sets, then it should be more difficult to tell their means apart. This is the whole idea behind the t-test (as well as behind a large class of statistical tests called parametric tests). The equation for the t-test (assuming unequal variances) is thus

t = \frac{\text{mean}_A - \text{mean}_B}{\text{pooled variation}} = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{\sigma^2_A}{n_A} + \dfrac{\sigma^2_B}{n_B}}}

Calculation of the t-value can be done by hand, on a calculator, or on a computer (such as the computer program MS-EXCEL). To calculate the t-value by hand, all that is required beforehand is that one know the sample sizes, means, and variances, \sigma^2, for each group. As one can see from the equation above, as the difference between the means of your groups gets bigger, the t-value gets bigger. Also, as the pooled variation gets smaller, the t-value will get bigger.
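The hand calculation described above can be sketched in a few lines of Python. The function name `t_value` is hypothetical, and the input numbers are the means, variances, and sample sizes reported in the example table later in this handout.

```python
import math

def t_value(mean_a, var_a, n_a, mean_b, var_b, n_b):
    """t statistic for two samples, without assuming equal variances."""
    signal = mean_a - mean_b                      # difference between the means
    noise = math.sqrt(var_a / n_a + var_b / n_b)  # pooled variation
    return signal / noise

# Summary statistics for data sets A and B from the handout's example table.
t = t_value(5.8181, 4.3636, 11, 3.8181, 4.3636, 11)
print(round(t, 3))
```

Only the summary statistics are needed here, which matches the point that the raw data are not required once the sample sizes, means, and variances are known.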

Now, the next question to ask is: how big does the t-value have to be in order for one to conclude that the means are significantly different? This is a really important question, and the answer is at the heart of all statistical analyses. The answer depends on two things: how large is your sample, and how confident do you want to be that the averages are in fact different?

The effect of sample size can be easily seen in the equation for the t-value above. Note that the sample sizes (n_A and n_B) appear in the denominator of the denominator. Thus, as the n's get larger, the pooled variation gets smaller, and as you recall, the effect of this is that the t-value gets larger. The other way to look at the sample size is more formal and involves a term called degrees of freedom. The number of degrees of freedom (df) for a t-test equals the pooled sample size minus 2 (however, if the variances are not equal, a more complicated approximation method is used). The df tells you in effect how well you can resolve your averages given the t-value you calculate: the higher the df, the greater the resolution.

The second issue mentioned above in determining how big a t-value you need depends on how confident you want to be. If you wanted to be really confident that these means differ, then you had better look for a very large t-value. But if you are satisfied with a rather marginal level of confidence (say a one in 20 chance that you're wrong when you say they differ), then you would be happy with a smaller t-value. This level is denoted in the test by the P value, which stands for probability. Probability is expressed as a decimal; P = 0.05 is the same as P = 5%. If you happened to do a stats test and get a P value of exactly 0.05, it means that if the two averages were in fact the same, a difference at least as large as the one you observed would arise by chance only 5% of the time. In other words, if you conclude that the averages differ, you accept a 5% risk of being wrong when they do not. The 0.05 cutoff really is the minimum criterion for significance; however, you can be more stringent if you hold out for a larger t-value and use P = 0.01 as your criterion for significance. The acceptable probability level is always determined BEFORE performing the t-test (most experimenters use P = 0.05).

The computer makes data analysis easy. Now it is time to consider an example. All you need is to enter the data below in a spreadsheet (which you already did to perform the descriptive statistical analyses above), and perform a t-test (following the directions on whatever spreadsheet or stats package is available to you).

data set A    data set B
3             1
3             1
4             2
5             3
6             4
6             4
6             4
6             4
7             5
8             6
10            8

t-test: Two-Sample Assuming Unequal Variances

                       Data set A    Data set B
Mean                   5.8181        3.8181
Variance               4.3636        4.3636
Observations           11            11
Ho Mean Difference     0
df                     20
t Stat                 2.2453
P(T<=t) one-tail       0.0181
t Critical one-tail    1.7247
P(T<=t) two-tail       0.0362
t Critical two-tail    2.0859

Q: Are the means for data set A and B significantly different, and exactly what information in the table above tells you this?
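The t Stat and df rows of the table can be reproduced from the raw data with the Python sketch below. It assumes that the "more complicated approximation" for the degrees of freedom is the Welch-Satterthwaite formula, which the handout does not name, and it takes the two-tailed critical value directly from the table rather than computing it. The function and variable names are illustrative.

```python
import math
import statistics

data_a = [3, 3, 4, 5, 6, 6, 6, 6, 7, 8, 10]
data_b = [1, 1, 2, 3, 4, 4, 4, 4, 5, 6, 8]

def welch_t_and_df(a, b):
    """Return (t, df) for a two-sample t-test assuming unequal variances."""
    # Each sample's variance divided by its sample size.
    term_a = statistics.variance(a) / len(a)
    term_b = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(term_a + term_b)
    # Welch-Satterthwaite approximation for df (an assumption, see lead-in).
    df = (term_a + term_b) ** 2 / (
        term_a ** 2 / (len(a) - 1) + term_b ** 2 / (len(b) - 1)
    )
    return t, df

t_stat, df = welch_t_and_df(data_a, data_b)

# Two-tailed critical value at P = 0.05 for df = 20, copied from the table.
T_CRITICAL_TWO_TAIL = 2.0859
significant = abs(t_stat) > T_CRITICAL_TWO_TAIL

print(round(t_stat, 3), round(df, 1), significant)
```

Comparing the absolute t-value with the critical value is equivalent to checking whether the two-tailed P value falls below the chosen 0.05 criterion.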