1 Lecture 13. Clinical studies and basic biostatistics
ME Mass spectrometry in an omics world
December 10, 2012
Stefani Thomas, Ph.D.
2 Statistics and biostatistics
- Statistics: collection, organization, analysis, and interpretation of numerical data
- Objective: make an inference about a population based on information contained in a sample
- Biostatistics: application of statistical methods to medical and biological problems
3 Role of statistics in decision-making processes
- Analysis of data from clinical trials to determine efficacy of new drugs
- Should a mastectomy always be recommended to a patient with breast cancer?
- What factors increase the risk that an individual will develop coronary heart disease?
4 Numbers are more precise than words
- "There are three kinds of lies: lies, damned lies, and statistics." Benjamin Disraeli (British Prime Minister)
- "It is easy to lie with statistics, but it is easier to lie without them." Professor Frederick Mosteller (founding chairman of Harvard's statistics department, 1956)
5
1. Types of data (variables)
2. Descriptive statistics/numerical summary measures
3. Measures of dispersion/variability
4. Normal distribution and confidence intervals
5. Hypothesis testing
6. Correlation and regression analysis
7. Analysis of variance (ANOVA)
8. Experimental design
6 1. Types of data (variables)
7 Categorical data
- Nominal data: categories without a natural order (e.g., sex, race, country)
- Ordinal data: categories with a natural order (e.g., socioeconomic status (low, middle, high); type of bone break (hairline, simple, compound))
- Numbers can be assigned to specific values, but the value of the numbers is arbitrary
- Percentages and proportions are used to analyze categorical data
8 Discrete data
- Ordered numerical data restricted to integer values (e.g., number of deaths due to AIDS in 2011; eggs laid per chicken; number of new cases of tuberculosis reported in the U.S. during a one-year period)
- Both ordering and magnitude are important
- Numbers represent actual measurable quantities rather than mere labels
9 Continuous data
- Ordered numerical data that can theoretically take on any value
- Data that represent measurable quantities but are not restricted to taking on certain specified values (such as integers)
- The only limiting factor for a continuous observation is the degree of accuracy with which it can be measured
- e.g., serum cholesterol level of a patient, concentration of a pollutant, height, weight, age, temperature
10 2. Descriptive statistics/numerical summary measures
11 Measures of central tendency
- The most commonly investigated characteristic of a set of data is its center, or the point about which the observations tend to cluster
12 Mean
- Sum of all observations divided by n
- Pro: natural measure utilizing all the data
- Con: sensitive to extreme values
13 Median (m)
- Middle-most observation of ordered data
- Pro: insensitive to extreme values
- Con: determined mainly by the middle points of the sample
- Calculation:
  1. Order data from smallest to largest
  2. If n is odd: m = the ((n + 1)/2)th observation in the ordered data
  3. If n is even: m = average of the (n/2)th and (n/2 + 1)th observations
14 Mode
- Observation that occurs most frequently
- Pro: can be used with categorical data (e.g., most popular presidential candidate)
- Con: less useful with continuous data
- A data set may have no mode or more than one mode
15 Relationships
- Symmetric distribution: mean = median = mode
- Distribution skewed to the right: mean > median
- Distribution skewed to the left: mean < median
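The three measures of central tendency can be compared on a small sample. The sketch below uses Python's standard `statistics` module; the data values are hypothetical and chosen only to show how one extreme value pulls the mean above the median, as in a right-skewed distribution.

```python
from statistics import mean, median, multimode

# Hypothetical sample with one extreme value (illustrative only).
data = [2, 3, 3, 5, 7, 8, 40]

sample_mean = mean(data)        # sum of all observations divided by n
sample_median = median(data)    # middle observation of the ordered data
sample_modes = multimode(data)  # list of the most frequent values (can hold several)

print(sample_mean, sample_median, sample_modes)
```

Here the single value 40 raises the mean well above the median, while the median and mode are unaffected by it.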
16 3. Measures of dispersion/variability
17 Range
- Difference between the largest observation and the smallest
- Quick and dirty measure of variability
- Pro: easy to calculate
- Cons: sensitive to extreme values; tends to increase with increasing n
18 Interquartile range
- Difference between the 75th and the 25th percentiles (the third and first quartiles)
- Encompasses the middle 50% of observations
- Percentiles: the pth percentile is the value X(p) such that p percent of the data values are less than or equal to X(p)
19 Variance
- Quantifies the amount of variability, or spread, around the mean of the measurements
- Calculated by measuring the average squared distance of the observations from the mean
20 Standard deviation
- Square root of the variance
- More widely reported than the variance since the units are the same as for the data
21 Standard error of the mean (SEM)
- Indication of how the mean varies across different experiments measuring the same quantity
- If the effect of random variation is large, the SEM will be higher
- If the data points do not change as experiments are repeated, the SEM is zero
- SEM decreases as n increases
22 Coefficient of variation
- Standard deviation as a percentage of the mean
- Useful for comparing the variability of different samples, each with different means
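The dispersion measures above can be computed side by side. This is a minimal sketch with hypothetical measurements; it assumes the usual sample formulas (variance with an n − 1 denominator, SEM = s/√n) and the default "exclusive" quartile method of `statistics.quantiles`, which is only one of several percentile conventions.

```python
from math import sqrt
from statistics import mean, quantiles, stdev, variance

# Hypothetical measurements (illustrative only).
data = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 7.4]
n = len(data)

data_range = max(data) - min(data)   # largest minus smallest observation
q1, _, q3 = quantiles(data, n=4)     # 25th, 50th and 75th percentiles
iqr = q3 - q1                        # spread of the middle 50% of observations
s2 = variance(data)                  # sample variance (n - 1 denominator)
s = stdev(data)                      # standard deviation = sqrt(variance)
sem = s / sqrt(n)                    # standard error of the mean
cv = 100 * s / mean(data)            # SD as a percentage of the mean

print(data_range, iqr, s2, s, sem, cv)
```

Because the SEM divides by √n, it shrinks as the sample grows even when the standard deviation stays the same.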
23 4. Normal distribution and confidence intervals
24 Normal distribution
- Widely used continuous distribution (Gaussian distribution or bell-shaped curve)
- Mean = median = mode
- Standard normal distribution: mean = 0; s.d. = 1
- Central limit theorem: given certain conditions, the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed
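The central limit theorem can be illustrated with a small simulation. The sketch below is not from the lecture: the sample size, number of repetitions, seed, and the choice of a uniform distribution are all assumptions made for illustration.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is reproducible

n = 30       # observations per sample (assumed)
reps = 2000  # number of repeated samples (assumed)

# Each observation is uniform on [0, 1): mean 0.5, variance 1/12, not normal.
sample_means = [mean(random.random() for _ in range(n)) for _ in range(reps)]

print(mean(sample_means))   # close to the population mean, 0.5
print(stdev(sample_means))  # close to sqrt(1/12) / sqrt(n)
```

Although individual observations are uniformly distributed, a histogram of `sample_means` would look approximately bell-shaped, centered near 0.5.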
25 Normal range
- Applies to normally distributed data
- 68% normal range = µ ± 1σ
- 95% normal range = µ ± 1.96σ
- 99% normal range = µ ± 2.58σ
26 Confidence interval
- Range that describes where the true population parameter is likely to be with a certain level of confidence
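A normal range and a confidence interval for a mean can both be built from standard normal quantiles. This is a minimal sketch assuming a known population standard deviation; the population and sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal multipliers for two-sided 95% and 99% coverage.
z95 = NormalDist().inv_cdf(0.975)
z99 = NormalDist().inv_cdf(0.995)

# Hypothetical normally distributed population.
mu, sigma = 100.0, 15.0
normal_range_95 = (mu - z95 * sigma, mu + z95 * sigma)

# Hypothetical sample: 95% confidence interval for the mean, sigma assumed known.
x_bar, n = 103.0, 36
half_width = z95 * sigma / sqrt(n)
ci_95 = (x_bar - half_width, x_bar + half_width)

print(z95, z99)
print(normal_range_95)
print(ci_95)
```

The normal range describes where individual observations fall, whereas the confidence interval describes the uncertainty in the estimated mean, which is why it is narrower by a factor of √n.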
27 5. Hypothesis testing
28 Procedure for hypothesis testing
- Hypothesis testing: an objective framework for making scientific conclusions based on a sample of data
29 Procedure for hypothesis testing: Step 1
- Ask a question about a population parameter
- Is the mean CD4 count for HIV(+) patients less than 400?
- Does smoking increase the risk of lung cancer?
- Is there a difference in mean serum cholesterol levels between kids who eat oatmeal and kids who eat Frosted Cheerios?
30 Procedure for hypothesis testing: Step 2
- Translate the question into a hypothesis
- Null hypothesis (H0): no difference or no effect
  Mean CD4 levels in HIV(+) patients = 400 (µ = 400)
- Alternative hypothesis (H1): hypothesis that contradicts the null hypothesis; usually the research hypothesis of interest
- One-sided: used when interested in deviation from the null hypothesis in one direction
  Mean CD4 levels in HIV(+) patients < 400 (µ < 400)
- Two-sided: used when interested in any deviation from the null hypothesis
  Mean CD4 levels in HIV(+) patients ≠ 400 (µ ≠ 400)
31 Procedure for hypothesis testing: Step 3
- Pick a significance level

  Truth       Decision: accept H0    Decision: reject H0
  H0 true     No error               Type I error
  H0 false    Type II error          No error

- Type I error: incorrectly rejecting H0 when H0 is true
- α: probability of a Type I error; also called the significance level
- Type II error: incorrectly accepting H0 when H1 is true
- β: probability of a Type II error
- Power = 1 − β: probability of correctly rejecting H0 when H1 is true
32 Procedure for hypothesis testing: Steps 4-7
- Collect data
- Calculate the test statistic (differs depending on the sampling design and the type of outcome variable)
- Convert to a p-value: the probability of obtaining data at least as extreme as those observed if H0 is true
- Make a decision about the data based on the p-value
33 Test statistics for inferences about a population mean
- Z-test: known variance; the distribution of the test statistic under H0 can be approximated by a normal distribution
- The p-value for this test is given by the probability of obtaining a z-value equal to or more extreme than the computed z
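As an illustration of the z-test, the sketch below tests H0: µ = 400 against a lower-tailed alternative. The sample mean, sample size, and the "known" standard deviation are hypothetical values chosen for the example, not data from the lecture.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 400.0    # mean under the null hypothesis
sigma = 100.0  # population standard deviation, assumed known (hypothetical)
x_bar = 350.0  # observed sample mean (hypothetical)
n = 25         # sample size (hypothetical)

# Test statistic: distance of the sample mean from mu0 in standard-error units.
z = (x_bar - mu0) / (sigma / sqrt(n))

# One-sided p-value for H1: mu < mu0 is the lower-tail area below z.
p_one_sided = NormalDist().cdf(z)

# Two-sided p-value for H1: mu != mu0 doubles the tail area beyond |z|.
p_two_sided = 2 * NormalDist().cdf(-abs(z))

print(z, p_one_sided, p_two_sided)
```

With these numbers z = −2.5, so both p-values fall below 0.05 and H0 would be rejected at that significance level.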
34 Test statistics for inferences about a population mean
- t-test: unknown variance
- The p-value for this test is given by the probability of obtaining a t statistic with n − 1 degrees of freedom equal to or more extreme than the computed t
35 Example of hypothesis testing
1. Is the mean CD4 level of HIV(+) patients less than 400, assuming that CD4 levels are normally distributed?
2. H0: µ = 400; H1: µ < 400
3. α = 0.05
4. Collect random sample of 10 HIV(+) patients; mean CD4 level = 305.5; standard deviation = 100
5. t = (305.5 − 400)/[100/√10] = −2.99
6. 0.005 < p < 0.01
7. p < 0.05; therefore reject H0; the result is significant
8. Conclusion: These data show that the mean CD4 level of HIV(+) patients is statistically significantly less than 400 (p < 0.01).
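The worked example can be checked numerically. The sketch below uses the figures from the example (n = 10, sample mean 305.5, and the 100 in the denominator of the t formula, read here as the sample standard deviation). The helper functions `t_pdf` and `t_cdf` are written for this illustration: they integrate the Student's t density with Simpson's rule so that only the standard library is needed. A statistics package would normally supply this.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [0, |t|] and the symmetry of the density."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

mu0, x_bar, s, n = 400.0, 305.5, 100.0, 10

t_stat = (x_bar - mu0) / (s / sqrt(n))  # test statistic with n - 1 = 9 df
p_value = t_cdf(t_stat, n - 1)          # lower-tail p-value for H1: mu < 400

print(round(t_stat, 2), p_value)
```

The lower-tail p-value lands between 0.005 and 0.01, which agrees with the conclusion that the mean is significantly less than 400 (p < 0.01).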
36 6. Correlation and regression analysis
37 Correlation
- Quantification of the degree to which two random variables are related, provided that the relationship is linear
- Advantages: maintains continuity of data; models one variable as a function of the other
- Disadvantages: only measures linear relationships; requires a normality assumption for testing hypotheses; only useful when both variables are continuous
38 Two-way scatter plot
- Possible values of X are placed on the horizontal axis; X is used to predict Y; X is the independent variable
- Possible values of Y are placed on the vertical axis; Y is the dependent variable
- Figure: percentages of births attended by trained health care personnel and maternal mortality rate for 20 countries
39 Population correlation coefficient (ρ)
- Purpose of correlation analysis is to determine whether two continuous variables (X and Y) are linearly related
- Correlation coefficient:
  - Measures the linear relationship between X and Y
  - Ranges between -1 (perfect negative correlation) and 1 (perfect positive correlation)
  - When ρ = 0, X and Y are linearly unrelated
  - Strong correlation does not necessarily imply causation
- Pearson correlation coefficient (r) is an estimate of ρ based on a sample of data; both X and Y are assumed to be normally distributed
- Spearman correlation coefficient (r_s) is the non-parametric analog to the Pearson correlation; no assumptions are necessary about the distributions of X and Y
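The difference between the Pearson and Spearman coefficients shows up when a relationship is monotone but not linear. In this sketch the function names and the data are hypothetical; the Spearman coefficient is computed as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rs(x, y):
    """Spearman coefficient: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # increases with x, but not along a straight line

r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(r, rs)
```

The Spearman coefficient is exactly 1 because the ordering of y matches the ordering of x, while the Pearson coefficient is slightly below 1 because the points do not lie on a line.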
40 Simple linear regression
- Purpose is to model the change in Y as X changes
- Examples of uses:
  - Prediction (what is the predicted amount of time it will take you to get home from work given the time that you leave?)
  - Linear association (is there a linear relationship between CD4 levels and time since infection with HIV?)
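A least-squares line can be fitted with a few sums. This is a minimal sketch of ordinary least squares for one predictor; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx            # change in Y per unit change in X
    intercept = my - slope * mx  # the fitted line passes through (mean x, mean y)
    return intercept, slope

# Hypothetical data lying roughly along a straight line.
x = [0, 1, 2, 3, 4]
y = [1.0, 3.1, 4.9, 7.2, 8.8]

intercept, slope = fit_line(x, y)
predicted_at_5 = intercept + slope * 5  # prediction for a new X value

print(intercept, slope, predicted_at_5)
```

The slope estimates how much Y changes per unit of X, and the fitted equation can then be used for prediction at X values near the observed range.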
41 7. Analysis of variance (ANOVA)
42 ANOVA
- Used to model the means of one variable (Y) for the various levels of other variables
- Extension of the two-sample t-test to three or more samples
- The number of pairwise t-tests grows rapidly with the number of groups (k(k − 1)/2 comparisons for k groups); analysis becomes cognitively difficult; ANOVA organizes and directs the analysis
- Conducting a greater number of analyses greatly increases the probability of committing at least one Type I error somewhere in the analysis
- Performing fewer hypothesis tests reduces the experiment-wise error rate
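The growth in the number of comparisons and in the chance of at least one Type I error can be quantified. The sketch below assumes every pair of groups is compared with its own test at level α and, as a simplification, treats the tests as independent; real pairwise tests share data, so the error rate shown is an approximation. The function name is hypothetical.

```python
from math import comb

def familywise_error(k, alpha=0.05):
    """Return (number of pairwise tests, P(at least one Type I error)) for k groups.

    Assumes independent tests, which is only an approximation for pairwise t-tests.
    """
    m = comb(k, 2)                 # k(k - 1)/2 pairwise comparisons
    fwer = 1 - (1 - alpha) ** m    # complement of "no false rejection in m tests"
    return m, fwer

for k in (3, 4, 5, 10):
    m, fwer = familywise_error(k)
    print(k, m, round(fwer, 3))
```

Under these assumptions, three groups already push the chance of at least one false positive to about 14%, which is the motivation for a single overall ANOVA F-test.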
43 Completely randomized design; One-way ANOVA
- One-way implies that there is a single factor or characteristic that distinguishes the various populations from each other
- Applicable when the outcome variable (Y) is continuous, normally distributed, and has approximately equal variance in all treatment groups
- Notation: Let Y be a continuous variable under investigation in k populations. Let µ1, …, µk be the true means in the k populations. Let n1, …, nk be the numbers of subjects sampled from each population
44 Completely randomized design; One-way ANOVA
- Hypotheses: H0: µ1 = µ2 = … = µk; H1: µv ≠ µw for some v ≠ w (do not need to specify which means differ)
- Data layout: total sample size (n), grand total (T), grand mean (ȳ)
- Data presentation: tables of means and standard deviations for each group, along with sample sizes
- Test statistic: F-test arising from an ANOVA table; yields 2-sided p-values
45 Generating an ANOVA table (F-statistic)
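The entries of a one-way ANOVA table follow from the group means and the grand mean. The sketch below uses the standard sums-of-squares decomposition; the function name, dictionary keys, and data are hypothetical. It stops at the F-statistic, since converting F to a p-value needs the F distribution from a statistics package or a table.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups: variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups: variation of observations around their own group mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "ss_total": ss_between + ss_within, "df_total": n_total - 1,
        "F": ms_between / ms_within,
    }

# Hypothetical measurements from three groups.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

A large F means the group means differ by more than the within-group scatter would explain, which is evidence against H0 that all means are equal.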
46 One-way ANOVA example
- Study investigating the effects of carbon monoxide exposure on individuals with coronary artery disease
- Patients (men) subjected to a series of exercise tests; men recruited from 3 medical centers
- Before combining subjects into one large group to conduct the analysis, need to examine baseline characteristics to ensure that patients from the different centers were comparable
- Characteristic to test: FEV1 (forced expiratory volume in 1 sec)
- ANOVA table columns: Source of Variation, SS, df, MS, F, P-value; rows: Between Groups, Within Groups, Total
47 8. Experimental design
48 Sample size determination
- When designing a study with the goal of testing a hypothesis, we need to know how many subjects to study
- Five variables must be specified:
  1. α: level of significance
  2. One- or two-sided form of the alternative hypothesis
  3. δ: desired difference to detect
  4. Power: 1 − β (probability of detecting a difference of δ; power increases with increasing sample size)
  5. σ_D: standard deviation of the paired differences (typically estimated using published or pilot data)
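The five inputs above combine into a sample size. The sketch below uses a common normal-approximation formula for paired differences, n = ((z_α + z_β) σ_D / δ)², which is an assumption here because the lecture does not state the formula it uses. The function name and input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def paired_sample_size(alpha, power, delta, sigma_d, two_sided=True):
    """Approximate number of pairs needed (normal approximation, assumed formula)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)  # power = 1 - beta
    return ceil(((z_alpha + z_beta) * sigma_d / delta) ** 2)

# Hypothetical design: detect a mean difference of 5 when sigma_D is 10.
n_two_sided = paired_sample_size(alpha=0.05, power=0.80, delta=5.0, sigma_d=10.0)
n_one_sided = paired_sample_size(alpha=0.05, power=0.80, delta=5.0, sigma_d=10.0,
                                 two_sided=False)

print(n_two_sided, n_one_sided)
```

Halving δ roughly quadruples the required n, and a one-sided alternative needs fewer subjects than a two-sided one at the same α and power. An exact calculation based on the t distribution would give slightly larger numbers.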
49 Basic study designs (listed in order of increasing stringency)
1. Cross-sectional study: observation of a population, or a representative subset, at one specific point in time; descriptive study (not longitudinal or experimental)
2. Cohort (prospective/observational) study: identify cohort; measure exposure; follow for prolonged period of time; determine who develops disease; analyze to determine whether disease is associated with exposure
3. Case-control (retrospective) study: identify set of patients with disease and corresponding set of controls without disease; find out retrospectively about exposure; analyze data to determine whether associations exist
More informationExploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
More information1.5 Oneway Analysis of Variance
Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments
More informationDescriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
More informationAssociation Between Variables
Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi
More informationParametric and Nonparametric: Demystifying the Terms
Parametric and Nonparametric: Demystifying the Terms By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD
More informationChapter 7. One-way ANOVA
Chapter 7 One-way ANOVA One-way ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The t-test of Chapter 6 looks
More informationNCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationCurriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
More informationNon-Inferiority Tests for Two Means using Differences
Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous
More informationIntroduction to Analysis of Variance (ANOVA) Limitations of the t-test
Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only
More informationModule 3: Correlation and Covariance
Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationCorrelation and Regression
Correlation and Regression Scatterplots Correlation Explanatory and response variables Simple linear regression General Principles of Data Analysis First plot the data, then add numerical summaries Look
More informationCorrelation key concepts:
CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)
More informationFairfield Public Schools
Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity
More informationNONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)
NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions
More informationNon-Inferiority Tests for One Mean
Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random
More informationTHE KRUSKAL WALLLIS TEST
THE KRUSKAL WALLLIS TEST TEODORA H. MEHOTCHEVA Wednesday, 23 rd April 08 THE KRUSKAL-WALLIS TEST: The non-parametric alternative to ANOVA: testing for difference between several independent groups 2 NON
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationHow To Write A Data Analysis
Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction
More informationStatistical tests for SPSS
Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly
More informationUNDERSTANDING THE TWO-WAY ANOVA
UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables
More information430 Statistics and Financial Mathematics for Business
Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions
More informationChapter 4. Probability and Probability Distributions
Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the
More informationPie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.
Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of
More informationCORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA
We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical
More informationSENSITIVITY ANALYSIS AND INFERENCE. Lecture 12
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More information