Biology 171L Environment and Ecology Lab Lab 2: Descriptive Statistics, Presenting Data and Graphing Relationships




Introduction

Long lists of data are often not very useful for identifying general trends in the data or the significance of a particular treatment in affecting the outcome of an experiment. There are, however, statistical procedures that facilitate the summarization, presentation, and analysis of the data. These procedures may allow a scientist to look through the noise in the data to see major trends. The use of these procedures is particularly beneficial in biological science, where the variability in the data may be very great. In this laboratory exercise you will learn how to calculate averages and standard deviations to summarize your koa haole seed pod data. In addition, you will learn how to graph data to summarize data trends and illustrate relationships.

Some Definitions

Statistics is the scientific study of numerical data based upon variation in nature. In statistics, we begin by making observations, or measurements. A single observation is a datum. A collection of observations is called data. The actual attribute being measured is called a variable. For example, a number of variables could be measured on a rat: weight, body length, tail length, sex, fur color, eye color, the number of whiskers on its snout, the number of toes on its right forefoot, aggressiveness, and health.

A statistical population represents the totality of individual observations for a variable about which we would like to make some inferences or generalizations. In this context, you should distinguish between a biological population and a statistical population. Thus our population could be the rat weights of the world or the tail lengths of the world. We could define our population more specifically, e.g., the rat weights of Honolulu, or the weights of rats fed a specific high protein diet.

Note that we rarely collect data on the entire population. Since it is impractical, not to mention costly, to collect data from an entire population, we usually collect a small subset, or sample, of observations from the population that hopefully represents the population. In general, a larger sample is more representative of the population than a smaller sample.

Descriptive Statistics

We can use statistics to summarize and describe the data. Thus we can give meaning to a long list of numbers. There are two basic types of descriptive statistics: statistics of location and statistics of dispersion.

Statistics of location tell us about the central tendency of the data, i.e., where the center of the data lies. A familiar statistic of location is the average, or arithmetic mean. The average is calculated by summing up the values of the individual observations and dividing by the number of observations. Other statistics of location include the median and mode.

Statistics of dispersion tell us about the spread of the data, indicating their variability. One simple statistic of dispersion is the range, which is the difference between the minimum value and the maximum value. Note that the range is highly influenced by sample size. A statistic of dispersion that is not biased by sample size is the standard deviation, which can be thought of as the average deviation of the individual observations from the mean. A high value for the standard deviation would indicate high variability in the data. Calculation of the standard deviation will be demonstrated in class.

Graphing Data

Besides calculating averages and standard deviations to summarize a data set, it is often useful to graph the values. In this way, a picture of the data may be made. Sometimes graphs allow us to see trends in the data that would not otherwise be apparent.

One type of graph is the frequency histogram. In this type of graph, the horizontal axis represents the possible values for a particular variable. The vertical axis represents the frequency, or number of times, a particular value was observed. This frequency is plotted as a vertical bar. For example, if we observed five seed pods bearing 1 seeds per pod, then the bar for this value would be five units high.

Graphs may be drafted to illustrate relationships between variables. For example, we may want to illustrate the relationship between weight and body length in rats. In this case, we could use the horizontal axis to represent the values for rat body length and the vertical axis to represent values for rat weight. The body length and weight of each rat could then be plotted. A line may then be drawn through the points to indicate the trend observed.

Presenting Tables and Figures

A table is a list of values arranged in columns and rows for presentation. Be aware that a useful table is one that does not require further explanation. In other words, a table should be able to stand alone for anyone to interpret without having to refer to more descriptive text elsewhere. Thus every table should have an identifying number, descriptive title, column headings, and row headings. The units for a particular variable should also be indicated (usually in the column or row headings). Any other information pertinent to the table may be placed in a legend below the table. An example of a table (Table I) is provided below.

TABLE I
The Effects of High Fat Versus Low Fat Diets on Rat Weights and Lengths

DIET TYPE    AVERAGE WEIGHT (g)    AVERAGE LENGTH (cm)
high fat     545.5                 55.3
low fat      346.7                 53.

The high fat diet consisted of 50% fat, while the low fat diet consisted of 5% fat. Each rat was fed ad libitum. Length was measured from the tip of the snout to the base of the tail. The values presented are averages for samples of 100 rats.

Figures may be graphs, photographs, or drawings. As with tables, figures should be able to stand alone without having to refer to text elsewhere. Figures require assigned figure numbers, descriptive titles, and legends if necessary. The axes of graphs should be adequately labeled with units indicated. Examples of graphs (Figs. 1 & 2) are presented below.

Figure 1. Frequency histogram of number of seeds per pod for koa haole, Leucaena leucocephala, seed pods.

Figure 2. Weight-length relationships of koa haole, Leucaena leucocephala, seed pods.

The following rules regarding the presentation of data in tables or figures should be followed:

1. Each figure or table must be identified by a figure or table reference number.
2. Each figure or table must have a clear descriptive title. If the figure or table refers to biological specimens, then the scientific names of these specimens are generally referred to in the title.
3. In tables, column/row headings should clearly identify the variable and its units of measure.
4. In figures such as graphs, the axes should be clearly labeled and the units of measure indicated.
5. In figures such as diagrams or maps, the pertinent features should be clearly labeled.
6. Do not cram figures and graphs into a small space on a sheet of paper; try to fill up a whole sheet of paper.
7. A general rule of thumb is that each figure/table should be able to stand alone without forcing the reader to read additional text to understand it.

Procedures and Assignments

I. AVERAGES AND STANDARD DEVIATIONS

Using your koa haole seed pod data from the previous lab activity (pod length in cm, pod mass in g, and number of seeds per pod), calculate the averages and standard deviations for each of your variables (thus you will have three means and three standard deviations to calculate). These values should be presented in a table along with the pooled class data (see Part II below). Don't forget to provide a title and proper row and column headings for this table.

II. TABLE OF AVERAGES AND STANDARD DEVIATIONS

You will be given a copy of the averages and standard deviations for the data pooled over the entire class. Include these class values, along with your own values, in a table (Table I) that allows the reader to compare your individual values with the pooled values. Don't forget to provide a title and proper row and column headings for this table.

III. FREQUENCY HISTOGRAM FOR NUMBER OF SEEDS PER POD USING CLASS DATA

Using the pooled class data for the number of seeds per pod, plot a frequency histogram (Figure 1) for this variable only. Be sure the axes are properly labeled and the graph has a descriptive title.

IV. GRAPHING RELATIONSHIPS: WEIGHT VERSUS LENGTH AND NUMBER OF SEEDS PER POD VERSUS LENGTH

Using Excel, prepare two graphs (using your data only) as described below. Be sure the axes are properly labeled and the graphs have descriptive titles.

A. Weight Versus Length
Draw a graph (Figure 2) that presents seed pod weight as a function of seed pod length.

B. Seeds/Pod Versus Length
Draw a graph (Figure 3) that presents the number of seeds per pod as a function of seed pod length.

C. Linear Trends in the Data
Using a ruler, draw the "best fit" straight line that illustrates the trend of the data in each graph described in Parts A & B above.

V. DATA INTERPRETATION AND ANALYSIS

A. Comparing Individual Data to Class Data
Were your values much different from the combined class data? Explain what effect might be responsible for any differences observed.

B. Description of Number of Seeds Per Pod
After examining the frequency histogram for the number of seeds per pod data, comment on the central tendency and variation of the data. Does your impression from this graph match the actual values calculated?

C. Quantitative Relationships Between Variables
After reviewing the graphs, describe how pod weight and length are related to each other and how the number of seeds per pod relates to pod length. Were these results expected? Explain.

VOCABULARY

statistics
datum/data
variable
statistical population
sample
descriptive statistics
statistics of location
statistics of dispersion
average (arithmetic mean)
range
standard deviation
frequency histogram
table
figure

Summary of Assignment to be Turned In

Your laboratory summary (submitted at the next lab meeting) should consist of the following elements:

1. Descriptive title for assignment.
2. Brief introduction describing the purpose of this lab activity and its objectives.
3. Table (Table I; created using Excel) of averages and standard deviations for your individual data and the class pooled data.
4. Frequency histogram (Figure 1) for number of seeds per pod (class pooled data only).
5. Weight-versus-length graph (Figure 2) for your koa haole seed pod data.
6. Seeds/Pod-versus-length graph (Figure 3) for your koa haole seed pod data.
7. Interpretation and analyses of the data and relationships observed (see Section V).

The assignment must be typed. Written text must utilize correct spelling, complete sentences, and correct grammar.
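The frequency histogram of Part III and the "best fit" line of Part IV.C can also be checked numerically. The following is only a minimal sketch under stated assumptions: the function names (`frequency_table`, `text_histogram`, `least_squares_line`) and all data values are hypothetical, and an ordinary least-squares fit is swapped in for the line the handout asks you to draw by eye with a ruler. The assignment itself still calls for Excel graphs.

```python
from collections import Counter


def frequency_table(values):
    """Count how many times each value occurs, sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


def text_histogram(values):
    """Return one line per observed value, with one '#' per observation."""
    return [f"{value:>3} | {'#' * count}" for value, count in frequency_table(values)]


def least_squares_line(x, y):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical values, for illustration only (not real class data).
seeds_per_pod = [14, 15, 15, 16, 16, 16, 17, 18]
pod_length_cm = [10.0, 12.0, 14.0, 16.0]
pod_mass_g = [0.8, 1.0, 1.3, 1.5]

# Each printed row plays the role of one histogram bar.
for line in text_histogram(seeds_per_pod):
    print(line)

slope, intercept = least_squares_line(pod_length_cm, pod_mass_g)
print(f"mass = {slope:.3f} * length + {intercept:.3f}")
```

A line drawn by eye will rarely match the least-squares line exactly, but the two should show the same direction of trend.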

CALCULATING MEANS AND STANDARD DEVIATIONS

MEANS

The average value or mean is calculated by summing up the values of the observations (the $X_i$'s) and dividing this total by the number of observations ($n$):

$\bar{X} = (\sum X_i) / n$

An example of how to make these calculations is illustrated below:

i    X_i
1    2
2    3
3    5
4    8
5    10

$n = 5$

$\sum X_i = X_1 + X_2 + X_3 + X_4 + X_5 = 2 + 3 + 5 + 8 + 10 = 28$

$\bar{X} = (\sum X_i) / n = 28 / 5 = 5.6$

STANDARD DEVIATIONS

The standard deviation is conventionally calculated by finding the difference between each value and the mean value, squaring each of these differences, summing up these squared differences, dividing this number by the number of observations minus one, and taking the square root of this result:

$\mathrm{S.D.} = \sqrt{\sum (X_i - \bar{X})^2 / (n - 1)}$

However, this formula can lead to substantial rounding error in hand calculation, because the subtraction step (which uses a mean that is often rounded) takes place before the differences are squared. A more robust formula is the following:

$\mathrm{S.D.} = \sqrt{[\sum (X_i^2) - ((\sum X_i)^2 / n)] / (n - 1)}$
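The mean and both standard deviation formulas above can be checked against each other on the five example observations. This is a minimal sketch, not part of the original handout; the function names (`mean`, `sd_from_deviations`, `sd_from_sums`) are hypothetical labels for the two formulas.

```python
import math


def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


def sd_from_deviations(values):
    """S.D. from squared deviations about the mean (first formula)."""
    n = len(values)
    x_bar = mean(values)
    sum_sq_dev = sum((x - x_bar) ** 2 for x in values)
    return math.sqrt(sum_sq_dev / (n - 1))


def sd_from_sums(values):
    """S.D. from the sum of squares and the squared sum (second formula)."""
    n = len(values)
    sum_x = sum(values)
    sum_x_sq = sum(x * x for x in values)
    return math.sqrt((sum_x_sq - sum_x ** 2 / n) / (n - 1))


data = [2, 3, 5, 8, 10]  # the five example observations from the text

print(mean(data))
print(round(sd_from_deviations(data), 2))
print(round(sd_from_sums(data), 2))
```

For small data sets like this one the two formulas agree. The rounding concern in the text applies to hand calculation with a rounded mean; in floating-point arithmetic on a computer, the sums-based form can itself lose precision when the values are large relative to their spread.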

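butterfly = $\sum (X_i^2)$
whale = $\sum X_i$
octopus = $n$

Using the symbols (butterfly, whale & octopus) to represent the quantities defined above, the formula presented above may be expressed as:

$\mathrm{S.D.} = \sqrt{[\text{butterfly} - (\text{whale}^2 / \text{octopus})] / (\text{octopus} - 1)}$

Arrange the data as follows:

X_i      (X_i)^2
2        4
3        9
5        25
8        64
10       100
-----------------
28       202

butterfly = 202
whale = 28
octopus = 5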
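Substituting the three totals from the arranged data into the second standard deviation formula completes the example. The arithmetic below is a worked check added for illustration; the final value does not appear in the handout text itself.

```latex
\mathrm{S.D.}
  = \sqrt{\frac{\sum (X_i^2) - (\sum X_i)^2 / n}{n - 1}}
  = \sqrt{\frac{202 - 28^2 / 5}{5 - 1}}
  = \sqrt{\frac{202 - 156.8}{4}}
  = \sqrt{11.3}
  \approx 3.36
```

The first formula gives the same result: the squared deviations from the mean of 5.6 sum to 45.2, and 45.2 / 4 = 11.3.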