Notes. Statistical consulting is like a final exam on steroids.


A statistical consultant usually works as part of a team on a project and provides statistical knowledge for the team. That team may comprise just the consultant and client, or it may include multiple members who bring overlapping expertise. The consultant's contribution may include problem formulation, designs for data collection and analysis, and writing a report that describes methods, results, and conclusions. Typically, conclusions are produced by the team after the statistical consultant discusses results with everyone.

Example: OAB, overactive bladder syndrome. Overactive bladder syndrome is a urological condition that is sometimes treated by lifestyle changes and sometimes by drugs, depending on the physician's evaluation and the severity of the patient's symptoms. This project involved two pharmaceutical companies and was initiated by an independent urologist. In addition to the statistical consultant, the team for this project included two urologists from the pharma companies, a statistician from one of the pharma companies who was responsible for providing data, and the vice president for the urology section of one of the pharma companies. The basic question for this project was why some patients in clinical trials who received a placebo responded as well as patients who received the drug treatment, while others on a placebo responded poorly. Methods involved classification and regression modeling that included variable selection and prediction.

Example. A project may involve only obtaining summaries of data. A large organization that provides enhanced training for Advanced Placement classes and teachers wanted to compare test scores of students in these courses. These comparisons were to be made across school districts, schools, and teachers. Here are some questions that arose. What should be the basis for comparisons? Means? Medians? Something else? Would a difference between two districts (or schools, or teachers) indicate that one district is better than the other?

Example. A company that sells medical supplies to physicians, clinics, and hospitals is audited by the State Comptroller's office for sales tax paid. Instead of conducting a complete audit of all invoices over the three-year audit period, sample-based auditing was used. Invoices from eight randomly selected days from the audit period were examined by the auditor. Total sales tax error in these invoices was divided by total invoice amount. This ratio was multiplied by the total invoice amount for the three-year audit period to give the estimated total sales tax error. This turned out to be $700,000. Sales tax errors can occur in several different ways. Supplies for Medicare/Medicaid use are not subject to state sales tax, but most customers of this company have both Medicare and non-Medicare patients. Some physicians have multiple offices whose locations may have different sales tax rates. Does the method used by the auditor give a proper estimate? If not, what is the appropriate method? How accurate is a proper estimate based on these data?
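
The auditor's calculation described above is a ratio estimate. The following is a minimal sketch of that arithmetic in R, together with one textbook approximation to its standard error that treats the sampled days as a simple random sample of clusters. All numbers and variable names (invoice, tax.err, total.invoice, N.days) are invented for illustration and are not the audit data; whether this estimator is appropriate for the audit is exactly the question posed above.

```r
# Hypothetical audit sample: one entry per sampled day (values invented for illustration)
invoice = c(41200, 38750, 45300, 39900, 47100, 36400, 42850, 44000)  # invoice dollars per day
tax.err = c(310, 95, 520, 0, 410, 150, 275, 330)                     # sales tax error per day
total.invoice = 46.5e6   # hypothetical total invoice amount for the audit period
N.days = 1095            # hypothetical number of days in the audit period

n = length(invoice)
ratio = sum(tax.err) / sum(invoice)   # auditor's ratio: total error / total invoice amount
est.total = ratio * total.invoice     # estimated total sales tax error

# Approximate standard error of the ratio estimate (days as sampling units)
dev = tax.err - ratio * invoice       # day-level deviations from the ratio line
fpc = 1 - n / N.days                  # finite population correction
se.ratio = sqrt(fpc * sum(dev^2) / (n - 1) / n) / mean(invoice)
se.total = total.invoice * se.ratio

cat("ratio =", signif(ratio, 4), "\n")
cat("estimated total error =", round(est.total), "\n")
cat("approximate s.e. =", round(se.total), "\n")
```

With only eight sampled days the standard error is itself poorly determined, so the sketch is better read as a way to see how much the estimate depends on day-to-day variation than as a final answer.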

Example. The Village Creek sewage treatment plant is located on the Trinity River just west of the Tarrant-Dallas county line. Water is discharged after treatment into the Trinity. During summer months when we have little rain, as much as 90% of the Trinity's flow comes from Village Creek. Chlorine used during treatment was part of the effluent discharged into the river. This chlorine is toxic to aquatic insects downstream of Village Creek. These insects are at the bottom of a food chain that includes fish, birds, and people. Under the Clean Water Act (CWA), Village Creek was required to dechlorinate its effluent before discharge. How can we assess the effect of dechlorination on the receiving stream? The CWA requires that discharge does no harm. Has this requirement been met? Whole Effluent Toxicity Tests form the basis for permitting in this situation.

As these examples show, the statistician often works with people and data from unfamiliar fields. He or she must be able to learn enough about these areas for intelligent discussion with experts in those fields. The statistician may not know details about the appropriate statistical methods for the problem, but he or she must be able to follow the proper path that leads to those methods, then learn all the details and pitfalls associated with their application.

The statistical consultant must:
1. understand and define the problem in statistical terms;
2. assess overall objectives and identify potential problems;
3. plan for data collection;
4. check data for errors or inconsistencies (never assume data are correct);
5. determine and implement appropriate statistical methods;
6. check, then recheck, and recheck again all code and programs used for the analysis;
7. perform analyses, check assumptions, deal with problems, recheck code;
8. discuss preliminary results, add any additional analyses, make changes to code;
9. check code and rerun, if necessary;
10. write final report.

Tools. R for statistical analyses and graphics. It's free, it's widely used and accepted in many fields, it has an extensive set of add-on packages that keeps R state-of-the-art, and it has unmatched graphical capabilities. Downside: it has a steep learning curve, and errors in code can be subtle and difficult to identify.

Reports: PDF is the most commonly used format for reports. Printed copies are no longer needed, so graphics should make extensive use of color. PDF files can be generated by LaTeX, which is strongly recommended if any mathematical notation is included. If Word is used, then a PDF version of the report is what should be delivered, not the Word document. This ensures the report can be viewed across all operating systems and devices, including tablets.

Presentation: PowerPoint and Keynote (Mac) are most common, but the LaTeX-based beamer class is useful if the presentation includes mathematical notation. beamer also includes navigation links to move easily among different sections of the presentation. The contributed package xtable provides an interface between LaTeX and the output of R tables and matrices.

Case study: Johannes Kepler and his third law of planetary motion.

Johannes Kepler (1571-1630) was a mathematician who derived the fundamental laws of planetary motion that were the basis for the theory of gravity presented by Isaac Newton in 1687. Kepler was employed by a Danish nobleman, Tycho Brahe, to analyze Brahe's extensive sets of planetary positions, which were very accurate for their time. Kepler tried to fit various models to the positions of Mars but was unsuccessful until he tried fitting an ellipse. He found a near-perfect fit and this became his first law of planetary motion: all planets move in ellipses, with the sun at one focus. His second law, that planets sweep out equal areas in equal times, was derived from the observation that a planet moves faster when it is closer to the sun, together with geometrical properties of ellipses. What is remarkable about these laws is that they were derived from data obtained before telescopes were invented. At that time distances of planets from the sun could be obtained only relative to the earth's distance from the sun, referred to as an astronomical unit (a.u.).

Kepler's third law relates the distance of a planet from the sun to its orbital period. He originally stated his third law as: a planet's period is proportional to the square of its distance from the sun. The data are the distances and orbital periods of the six planets known to Kepler: Mercury, Venus, Earth, Mars, Jupiter, and Saturn. Distance is given in a.u. and period is given in Earth years, so Earth has period 1 and distance 1.

Kepler's original model can be fit in R. The data are read and plotted first:

    Planets0 = read.table("...", header=TRUE, sep=",", row.names=1)  # data file location is elided
    png("planets1.png", width=600, height=600)
    Pnames = dimnames(Planets0)[[1]]
    tpos = rep(c(4,2), 3)   # label positions alternate right (4) and left (2)
    plot(Period ~ Distance, data=Planets0, pch=19, xlab="Distance (a.u.)")
    title("Orbital Period vs Distance for Planets Known to Kepler")
    text(Planets0$Distance, Planets0$Period, Pnames, pos=tpos, cex=.8)
    graphics.off()

    # Kepler first hypothesized that Period is proportional to Distance squared
    Period = Planets0$Period
    Distance2 = Planets0$Distance^2
    P2.lm = lm(Period ~ Distance2 - 1)   # note that this is a no-intercept model
    print(summary(P2.lm))
    png("planets2.png", width=600, height=600)
    plot(Period ~ Distance, data=Planets0, pch=19, xlab="Distance (a.u.)")
    D2 = seq(min(Planets0$Distance), max(Planets0$Distance), length=200)
    D2new = data.frame(Distance2=D2^2)
    P2.pred = predict(P2.lm, newdata=D2new)
    lines(D2, P2.pred, col="red")
    text(Planets0$Distance, Planets0$Period, Pnames, pos=tpos, cex=.8)
    title("Orbital Period vs Distance for Planets Known to Kepler\nwith Kepler's Original Third Law")
    title(sub="Model: period of a planet is proportional to square of its distance", cex=.9)
    graphics.off()
    # diagnostic plots
    png("planets3.png", width=600, height=600)
    par(mfrow=c(2,2))
    plot(P2.lm)
    mtext("Diagnostic Plots for Kepler's Original Third Law", outer=TRUE, line=-2)
    graphics.off()

The summary of this model fit, lm(formula = Period ~ Distance2 - 1), reports a highly significant coefficient for Distance2 (***), 5 residual degrees of freedom, multiple R-squared of 0.989, and an F-statistic on 1 and 5 DF with p-value 4.356e-06.


Kepler's original third law fits the data poorly. This model is an example of a power law. Power laws are fit best by log-log transformations. In R this is accomplished by

    # now consider log-log transformation
    logperiod = log(Planets0$Period)
    logdistance = log(Planets0$Distance)
    logp.lm = lm(logperiod ~ logdistance)   # this model includes intercept
    print(summary(logp.lm))
    png("planets4.png", width=600, height=600)
    par(mfrow=c(2,2))
    plot(logp.lm)
    mtext("Diagnostic Plots for log-log Transformed Data", outer=TRUE, line=-2)
    graphics.off()
    png("planets5.png", width=600, height=600)
    plot(logperiod ~ logdistance, pch=19)
    LD2 = seq(min(logdistance), max(logdistance), length=200)
    LD2new = data.frame(logdistance=LD2)
    LP2.pred = predict(logp.lm, newdata=LD2new)
    lines(LD2, LP2.pred, col="red")
    text(logdistance, logperiod, Pnames, pos=tpos, cex=.9)
    title("Orbital Period vs Distance for Planets Known to Kepler\nlog-log transformed")
    graphics.off()

The summary of this fit, lm(formula = logperiod ~ logdistance), reports a highly significant coefficient for logdistance (***), 4 residual degrees of freedom, multiple R-squared of 1, adjusted R-squared of 1, and an F-statistic of 4.35e+06 on 1 and 4 DF with p-value 3.17e-13.
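
Because the class data file is not reproduced here, the following self-contained sketch repeats the log-log fit with rounded modern values for the six planets. These numbers are an assumption standing in for the original file, and the object names (planets, fit) are illustrative only.

```r
# Rounded modern values (assumed here; not the class data file)
planets = data.frame(
  Period   = c(0.241, 0.615, 1, 1.881, 11.86, 29.46),   # Earth years
  Distance = c(0.387, 0.723, 1, 1.524, 5.203, 9.54),    # astronomical units
  row.names = c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn")
)

# Power law Period = c * Distance^k becomes linear after taking logs:
# log(Period) = log(c) + k * log(Distance)
fit = lm(log(Period) ~ log(Distance), data = planets)
intercept = unname(coef(fit)[1])   # estimate of log(c)
slope = unname(coef(fit)[2])       # estimate of the exponent k

print(round(c(intercept = intercept, slope = slope), 4))
```

The slope estimates the exponent of the power law and the intercept estimates the log of the proportionality constant, which is why the log-log scale is the natural one for checking a law of this form.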


This analysis shows that log(Period) is linear in log(Distance) with slope 1.5. In terms of the original variables, that relationship can be expressed as: Period squared is proportional to Distance cubed. This implies that $T^2 / a^3$ is constant for all planets, where $T$ is the orbital period and $a$ is the length of the semi-major axis of the planet's orbit (its distance from the sun). Although least squares was unknown in Kepler's time, he was not satisfied with his original formulation and changed it to this new version. Here is code to produce plots that show how his third law fits the data.

    # plot Kepler's third law
    png("planets6.png", width=600, height=600)
    plot(Period ~ Distance, data=Planets0, pch=19, xlab="Distance (a.u.)")
    title("Orbital Period vs Distance for Planets Known to Kepler\n with Kepler's Third Law")
    P2 = D2^1.5
    lines(D2, P2, col="red")
    text(Planets0$Distance, Planets0$Period, Pnames, pos=tpos, cex=.8)
    graphics.off()

    ### now use new data
    Planets = read.table("...", header=TRUE, sep=",", row.names=1)  # data file location is elided
    png("planets7.png", width=600, height=900)
    par(mfrow=c(2,1), mar=c(1,4,3,2), oma=c(3,0,0,0))
    ndx = 10:14   # rows left out of the upper panel and labeled in the lower panel
    D2 = seq(0, max(Planets$Distance), length=200)
    P2 = D2^1.5
    plot(Period ~ Distance, data=Planets[-ndx,], pch=19,
         xlab="Distance (a.u.)", xlim=c(0, 1.1*max(Planets$Distance[-ndx])))
    title("Orbital Period vs Distance\nPlanets, Minor Planets, Asteroids with Kepler's Third Law")
    lines(D2, P2, col="red")
    Pnames1 = dimnames(Planets)[[1]][-ndx]
    tpos = rep(4, length(Pnames1))
    names(tpos) = Pnames1
    tpos["Apophis"] = 2
    text(Planets$Distance[-ndx], Planets$Period[-ndx], Pnames1, pos=tpos)
    ###
    plot(Period ~ Distance, data=Planets, pch=19, xlab="Distance (a.u.)")
    lines(D2, P2, col="red")
    text(Planets$Distance[ndx], Planets$Period[ndx],
         dimnames(Planets)[[1]][ndx], pos=2)
    mtext("Distance (a.u.)", outer=TRUE, side=1, line=1.5, font=2, cex=1.2)
    graphics.off()
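
The second plot checks the law against bodies that were unknown to Kepler. A minimal numerical version of the same idea is sketched below for Uranus and Neptune. The distances and periods are rounded modern values assumed for illustration, not rows of the class data file, and kepler.period and newbodies are hypothetical names.

```r
# Kepler's third law with distance in a.u. and period in Earth years:
# T^2 = a^3, so T = a^1.5
kepler.period = function(distance) distance^1.5

# Rounded modern values for two planets discovered after Kepler (assumed here)
newbodies = data.frame(
  Distance = c(19.19, 30.07),   # semi-major axis, a.u.
  Period   = c(84.01, 164.8),   # observed orbital period, Earth years
  row.names = c("Uranus", "Neptune")
)

newbodies$Predicted = kepler.period(newbodies$Distance)
newbodies$RelError = (newbodies$Predicted - newbodies$Period) / newbodies$Period

print(round(newbodies, 4))
```

A law fitted to six planets and then applied well outside their range of distances is an extrapolation, so agreement for new bodies is a stronger check than the fit itself.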


Power function of two-sample t-test

The power function of the classical two-sample t-test is easy to obtain under the assumption of equal variances of the two populations. Assumptions: X and Y are independent random samples of sizes n and m, respectively, from normally distributed populations with means $\mu_1$, $\mu_2$ and the same s.d. $\sigma$. The one-sided hypotheses

$$H_0: \mu_1 \le \mu_2 \qquad H_1: \mu_1 > \mu_2$$

are described here. Power functions for other hypotheses are derived similarly. The pooled-variance test statistic is

$$T = \frac{\bar{X} - \bar{Y}}{s_p \sqrt{\frac{1}{n} + \frac{1}{m}}},$$

where $s_p^2$ is the pooled variance estimator,

$$s_p^2 = \frac{1}{n + m - 2} \left[ \sum_{i=1}^{n} (X_i - \bar{X})^2 + \sum_{j=1}^{m} (Y_j - \bar{Y})^2 \right].$$

Statistical theory shows that when $\mu_1 = \mu_2$, this test statistic has a t-distribution with $n+m-2$ d.f. Denote the critical value for a size $\alpha$ test with d.f. $d$ by $t_{d,\alpha}$. In R this is obtained by

    qt(1-alpha, n+m-2)

Therefore, the power function of this test is given by

$$\pi(\delta, n, m) = P(T \ge t_{d,\alpha}),$$

where $\delta = \mu_1 - \mu_2$, $T$ has a non-central t-distribution with $d$ d.f. and non-centrality parameter

$$\lambda = \frac{\delta}{\sigma \sqrt{\frac{1}{n} + \frac{1}{m}}},$$

and $\sigma$ is the common population s.d. In R this can be obtained with the function power.t.test() if the sample sizes are equal.

This power function also can be used to obtain observable differences and sample sizes. The observable difference for this test is the value of $\delta$ such that

$$\pi(\delta, n, m) = 1 - \beta,$$

where $\beta$ is the specified probability of making a Type II error with given sample sizes n, m. That is, we want to find the difference between population means that would result in probability $1 - \beta$ of rejecting the null hypothesis. Sample size determination is the same except that $\delta$ is specified and we need to find values for n, m that give the required power.

Suppose for example we have random samples each of size 15 and wish to determine what difference between means is detectable by a size 0.05 test with power 0.90. In R this is obtained by

    power.t.test(n=15, power=.9, type="two.sample", alternative="one.sided")

Because power.t.test() uses sd = 1 by default, the value of delta that is returned is the observable difference in units of the common s.d. That is, with independent random samples each of size 15, the means must differ by at least delta times the common s.d. for a size 0.05 test to reject with 90% probability.

Suppose instead that we plan to use equal sample sizes for the two groups and need to find the sample size such that a size 0.05 test has power 0.90 when $\delta = 0.5\sigma$. In R this is obtained by

    power.t.test(delta=.5, power=.9, type="two.sample", alternative="one.sided")

The result here is n = 69 per group.

If we want to obtain the observable difference when the group sample sizes are not equal, then we need to input an appropriate range of values for delta into the non-central t-distribution function and then find the value of delta that gives the target power. For example, suppose the sample sizes are 20 and 30 and we want to find the observable difference for alpha = 0.05 and beta = 0.10.

    n = c(20,30)
    df = sum(n)-2
    alpha = .05
    beta = .10
    delta = seq(.1,.9,length=81)       # candidate differences in units of sigma
    cv = qt(1-alpha,df)                # critical value of the size alpha test
    lambda = delta/sqrt(sum(1/n))      # non-centrality parameters
    pwr = 1 - pt(cv,df,lambda)
    if(max(pwr) < 1-beta || min(pwr) > 1-beta) {
      cat("no values of delta gave required power\n")
      obsdiff = NA
    } else {
      ndx = seq(pwr)[pwr >= 1-beta]
      obsdiff = delta[min(ndx)]        # smallest grid value reaching the target power
    }
    obsdiff

The last line of this code prints the observable difference, obsdiff, in units of the common s.d.

In practice, we never use the pooled-sample t-test because Welch's approximation works well when the population variances are unequal and performs about the same as the pooled-sample test when the population variances are equal. However, obtaining the power function for this test is more complicated. Let

$$V_1 = \frac{\sigma_1^2}{n}, \quad V_2 = \frac{\sigma_2^2}{m}, \quad \hat{V}_1 = \frac{s_1^2}{n}, \quad \hat{V}_2 = \frac{s_2^2}{m}.$$

Welch's approximation is based on the result

$$\frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sqrt{\hat{V}_1 + \hat{V}_2}} \approx t_d,$$

where the degrees of freedom are given by

$$d = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n-1} + \dfrac{V_2^2}{m-1}}.$$

In practice we replace $V_i$ by its estimate $\hat{V}_i$ to obtain the d.f. The power function for a size $\alpha$ test is then

$$\pi_w(\delta, n, m) = 1 - \mathrm{pt}(cv, d, \lambda),$$

where the non-centrality parameter is given by

$$\lambda = \frac{\delta}{\sqrt{V_1 + V_2}}.$$

To simplify, let

$$a = \frac{\sigma_2^2}{\sigma_1^2}, \quad b = \frac{m}{n}.$$

Then

$$\lambda = \frac{\delta}{\sigma_1} \sqrt{\frac{nb}{a+b}}$$

and the d.f. is

$$d = \frac{(a+b)^2}{\dfrac{b^2}{n-1} + \dfrac{a^2}{bn-1}}.$$

In practice we estimate $a$ by $\hat{a} = s_2^2 / s_1^2$. Here is a simple R function that evaluates this power function.

    power.welch.test = function(delta,n1,sig1,a,b,alpha=.05) {
      # n1 is sample size of group 1
      # b = n2/n1
      # sig1 is sd of group 1
      # a = (sig2/sig1)^2
      # either delta or n1 can be a vector but not both
      df = (a+b)^2/(b^2/(n1-1) + a^2/(b*n1-1))
      lambda = delta*sqrt(n1*b/(a+b))/sig1
      cv = qt(1-alpha,df)
      pwr = 1 - pt(cv,df,lambda)
      pwr
    }

As a test, this function with n1=20, a=1, b=1.5 should give nearly the same result as the equal-variance power function for sample sizes 20 and 30. The non-centrality parameter is identical in the two cases, but Welch's d.f. is about 40.9 rather than 48, so the two powers agree closely without being exactly equal.

Scripts: links to scripts used in class are here.
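
The checks described in this section can be sketched in one short script. The function power.pooled.test below is a hypothetical helper written for this illustration (it is not part of base R or of the class scripts), power.welch.test is copied from above so that the block runs on its own, and uniroot() replaces the grid search as one possible way to solve for the observable difference.

```r
# Pooled-variance power for the one-sided test; delta is in units of the common s.d.
power.pooled.test = function(delta, n, m, alpha = .05) {
  df = n + m - 2
  cv = qt(1 - alpha, df)               # critical value t_{d,alpha}
  lambda = delta / sqrt(1/n + 1/m)     # non-centrality parameter with sigma = 1
  1 - pt(cv, df, lambda)
}

# Copy of the Welch power function given above
power.welch.test = function(delta, n1, sig1, a, b, alpha = .05) {
  df = (a + b)^2 / (b^2/(n1 - 1) + a^2/(b*n1 - 1))
  lambda = delta * sqrt(n1*b/(a + b)) / sig1
  cv = qt(1 - alpha, df)
  1 - pt(cv, df, lambda)
}

# 1. Equal sample sizes: the helper should agree with power.t.test()
p.own = power.pooled.test(1, 15, 15)
p.builtin = power.t.test(n = 15, delta = 1, sd = 1, sig.level = .05,
                         type = "two.sample", alternative = "one.sided")$power

# 2. Observable difference for n = 20, m = 30 by root finding instead of a grid
obsdiff = uniroot(function(d) power.pooled.test(d, 20, 30) - 0.90,
                  interval = c(0.1, 2))$root

# 3. Welch versus pooled power at that difference when the variances are equal
p.pooled = power.pooled.test(obsdiff, 20, 30)
p.welch = power.welch.test(obsdiff, n1 = 20, sig1 = 1, a = 1, b = 1.5)

print(c(own = p.own, builtin = p.builtin))
print(c(obsdiff = obsdiff, pooled = p.pooled, welch = p.welch))
```

Root finding returns a value between grid points, so it will generally be slightly smaller than the grid answer, which is the first grid value at or above the target power.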

From Aristotle to Newton

From Aristotle to Newton From Aristotle to Newton The history of the Solar System (and the universe to some extent) from ancient Greek times through to the beginnings of modern physics. The Geocentric Model Ancient Greek astronomers

More information

Lecture 13. Gravity in the Solar System

Lecture 13. Gravity in the Solar System Lecture 13 Gravity in the Solar System Guiding Questions 1. How was the heliocentric model established? What are monumental steps in the history of the heliocentric model? 2. How do Kepler s three laws

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Exercise: Estimating the Mass of Jupiter Difficulty: Medium

Exercise: Estimating the Mass of Jupiter Difficulty: Medium Exercise: Estimating the Mass of Jupiter Difficulty: Medium OBJECTIVE The July / August observing notes for 010 state that Jupiter rises at dusk. The great planet is now starting its grand showing for

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

Unit 8 Lesson 2 Gravity and the Solar System

Unit 8 Lesson 2 Gravity and the Solar System Unit 8 Lesson 2 Gravity and the Solar System Gravity What is gravity? Gravity is a force of attraction between objects that is due to their masses and the distances between them. Every object in the universe

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

USING MS EXCEL FOR DATA ANALYSIS AND SIMULATION

USING MS EXCEL FOR DATA ANALYSIS AND SIMULATION USING MS EXCEL FOR DATA ANALYSIS AND SIMULATION Ian Cooper School of Physics The University of Sydney i.cooper@physics.usyd.edu.au Introduction The numerical calculations performed by scientists and engineers

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Orbital Dynamics with Maple (sll --- v1.0, February 2012)

Orbital Dynamics with Maple (sll --- v1.0, February 2012) Orbital Dynamics with Maple (sll --- v1.0, February 2012) Kepler s Laws of Orbital Motion Orbital theory is one of the great triumphs mathematical astronomy. The first understanding of orbits was published

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression A regression with two or more explanatory variables is called a multiple regression. Rather than modeling the mean response as a straight line, as in simple regression, it is

More information

We extended the additive model in two variables to the interaction model by adding a third term to the equation.

We extended the additive model in two variables to the interaction model by adding a third term to the equation. Quadratic Models We extended the additive model in two variables to the interaction model by adding a third term to the equation. Similarly, we can extend the linear model in one variable to the quadratic

More information

Chapter 25.1: Models of our Solar System

Chapter 25.1: Models of our Solar System Chapter 25.1: Models of our Solar System Objectives: Compare & Contrast geocentric and heliocentric models of the solar sytem. Describe the orbits of planets explain how gravity and inertia keep the planets

More information

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9

DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 DEPARTMENT OF PSYCHOLOGY UNIVERSITY OF LANCASTER MSC IN PSYCHOLOGICAL RESEARCH METHODS ANALYSING AND INTERPRETING DATA 2 PART 1 WEEK 9 Analysis of covariance and multiple regression So far in this course,

More information

Psychology 205: Research Methods in Psychology

Psychology 205: Research Methods in Psychology Psychology 205: Research Methods in Psychology Using R to analyze the data for study 2 Department of Psychology Northwestern University Evanston, Illinois USA November, 2012 1 / 38 Outline 1 Getting ready

More information

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption

Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Two-sample t-tests. - Independent samples - Pooled standard devation - The equal variance assumption Last time, we used the mean of one sample to test against the hypothesis that the true mean was a particular

More information

Statistical Models in R

Statistical Models in R Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Linear Models in R Regression Regression analysis is the appropriate

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

AE554 Applied Orbital Mechanics. Hafta 1 Egemen Đmre

AE554 Applied Orbital Mechanics. Hafta 1 Egemen Đmre AE554 Applied Orbital Mechanics Hafta 1 Egemen Đmre A bit of history the beginning Astronomy: Science of heavens. (Ancient Greeks). Astronomy existed several thousand years BC Perfect universe (like circles

More information

Using R for Linear Regression

Using R for Linear Regression Using R for Linear Regression In the following handout words and symbols in bold are R functions and words and symbols in italics are entries supplied by the user; underlined words and symbols are optional

More information

Name: Earth 110 Exploration of the Solar System Assignment 1: Celestial Motions and Forces Due in class Tuesday, Jan. 20, 2015

Name: Earth 110 Exploration of the Solar System Assignment 1: Celestial Motions and Forces Due in class Tuesday, Jan. 20, 2015 Name: Earth 110 Exploration of the Solar System Assignment 1: Celestial Motions and Forces Due in class Tuesday, Jan. 20, 2015 Why are celestial motions and forces important? They explain the world around

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

2. Orbits. FER-Zagreb, Satellite communication systems 2011/12

2. Orbits. FER-Zagreb, Satellite communication systems 2011/12 2. Orbits Topics Orbit types Kepler and Newton laws Coverage area Influence of Earth 1 Orbit types According to inclination angle Equatorial Polar Inclinational orbit According to shape Circular orbit

More information

Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

More information

Orbital Mechanics. Angular Momentum

Orbital Mechanics. Angular Momentum Orbital Mechanics The objects that orbit earth have only a few forces acting on them, the largest being the gravitational pull from the earth. The trajectories that satellites or rockets follow are largely

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

The orbit of Halley s Comet

The orbit of Halley s Comet The orbit of Halley s Comet Given this information Orbital period = 76 yrs Aphelion distance = 35.3 AU Observed comet in 1682 and predicted return 1758 Questions: How close does HC approach the Sun? What

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Chapter 13 Introduction to Linear Regression and Correlation Analysis Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing

More information

The Solar System. Unit 4 covers the following framework standards: ES 10 and PS 11. Content was adapted the following:

The Solar System. Unit 4 covers the following framework standards: ES 10 and PS 11. Content was adapted the following: Unit 4 The Solar System Chapter 7 ~ The History of the Solar System o Section 1 ~ The Formation of the Solar System o Section 2 ~ Observing the Solar System Chapter 8 ~ The Parts the Solar System o Section

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Planetary Orbit Simulator Student Guide

Planetary Orbit Simulator Student Guide Name: Planetary Orbit Simulator Student Guide Background Material Answer the following questions after reviewing the Kepler's Laws and Planetary Motion and Newton and Planetary Motion background pages.

More information

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups In analysis of variance, the main research question is whether the sample means are from different populations. The

More information

NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )
