Robust and Resistant Regression


Chapter 13 Robust and Resistant Regression

When the errors are normal, least squares regression is clearly best, but when the errors are nonnormal, other methods may be considered. A particular concern is long-tailed error distributions. One approach is to remove the largest residuals as outliers and still use least squares, but this may not be effective when there are several large residuals because of the leave-out-one nature of the outlier tests. Furthermore, the outlier test is an accept/reject procedure that is not smooth and may not be statistically efficient for the estimation of $\beta$. Robust regression provides an alternative.

There are several methods. M-estimates choose $\beta$ to minimize

$$\sum_{i=1}^{n} \rho\left(\frac{y_i - x_i^T \beta}{\sigma}\right)$$

Possible choices for $\rho$ are

1. $\rho(x) = x^2$ is just least squares.
2. $\rho(x) = |x|$ is called least absolute deviations regression (LAD). This is also called $L_1$ regression.
3. $$\rho(x) = \begin{cases} x^2/2 & \text{if } |x| \le c \\ c|x| - c^2/2 & \text{otherwise} \end{cases}$$ is called Huber's method and is a compromise between least squares and LAD regression. $c$ can be an estimate of $\sigma$ but not the usual one, which is not robust. Something $\propto \mathrm{median}\,|\hat\varepsilon_i|$, for example.

Robust regression is related to weighted least squares. The normal equations tell us that

$$X^T (y - X\hat\beta) = 0$$

With weights and in nonmatrix form this becomes:

$$\sum_{i=1}^{n} w_i x_{ij} \left( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \right) = 0, \quad j = 1, \dots, p$$

Now differentiating the M-estimate criterion with respect to $\beta_j$ and setting to zero we get

$$\sum_{i=1}^{n} \rho'\left( \frac{y_i - \sum_{j=1}^{p} x_{ij}\beta_j}{\sigma} \right) x_{ij} = 0, \quad j = 1, \dots, p$$
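The three choices of $\rho$ listed above, and the criterion that an M-estimate minimizes, can be sketched in R as follows. This is only an illustration: the function names (`rho_ls`, `rho_lad`, `rho_huber`, `m_criterion`), the argument name `cutoff` (standing in for $c$) and the toy data are assumptions made here, not part of the chapter.

```r
# Hypothetical helper names; `cutoff` plays the role of c in the text.
rho_ls    <- function(x) x^2                       # least squares
rho_lad   <- function(x) abs(x)                    # least absolute deviations (L1)
rho_huber <- function(x, cutoff = 1) {
  # quadratic near zero, linear in the tails
  ifelse(abs(x) <= cutoff, x^2 / 2, cutoff * abs(x) - cutoff^2 / 2)
}

# M-estimation criterion: sum over i of rho((y_i - x_i' beta) / sigma)
m_criterion <- function(beta, X, y, rho, sigma = 1, ...) {
  r <- as.vector(y - X %*% beta)
  sum(rho(r / sigma, ...))
}

# Toy data (assumed): four points on the line y = 2x and one gross outlier.
X <- cbind(1, 1:5)
y <- c(2, 4, 6, 8, 100)
beta <- c(0, 2)

ls_value    <- m_criterion(beta, X, y, rho_ls)
lad_value   <- m_criterion(beta, X, y, rho_lad)
huber_value <- m_criterion(beta, X, y, rho_huber, cutoff = 1)
print(c(LS = ls_value, LAD = lad_value, Huber = huber_value))
```

The single residual of 90 is squared under least squares but enters only linearly under LAD and Huber, which is why those criteria are less dominated by one long-tailed error.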
Now let $u_i = y_i - \sum_{j=1}^{p} x_{ij}\beta_j$ to get

$$\sum_{i=1}^{n} \frac{\rho'(u_i)}{u_i} x_{ij} \left( y_i - \sum_{j=1}^{p} x_{ij}\beta_j \right) = 0, \quad j = 1, \dots, p$$

so we can make the identification $w(u) = \rho'(u)/u$, and we find for our choices of $\rho$ above:

1. LS: $w(u)$ is constant.
2. LAD: $w(u) = 1/|u|$. Note the asymptote at 0; this makes a weighting approach difficult.
3. Huber: $$w(u) = \begin{cases} 1 & \text{if } |u| \le c \\ c/|u| & \text{otherwise} \end{cases}$$

There are many other choices that have been used. Because the weights depend on the residuals, an iteratively reweighted least squares approach to fitting must be used. We can sometimes get standard errors by $\widehat{\mathrm{var}}(\hat\beta) = \hat\sigma^2 (X^T W X)^{-1}$ (use a robust estimate of $\sigma^2$ also).

We demonstrate the methods on the Chicago insurance data, using least squares first.

> data(chicago)
> g <- lm(involact ~ race + fire + theft + age + log(income), chicago)
> summary(g)
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)
race
fire                                        e-05
theft
age
log(income)
Residual standard error:       on 41 degrees of freedom
Multiple R-Squared: 0.752, Adjusted R-squared:
F-statistic: 24.8 on 5 and 41 degrees of freedom, p-value: 2.01e-11

Least squares works well when there are normal errors but can be upset by long-tailed errors. A convenient way to apply the Huber method is the rlm() function, which is part of the MASS package (see the book Modern Applied Statistics in S+) and which also gives standard errors. The default is to use the Huber method but there are other choices.

> library(MASS)
> g <- rlm(involact ~ race + fire + theft + age + log(income), chicago)
Coefficients:
            Value Std. Error t value
(Intercept)
race
fire
theft
age
log(income)
Residual standard error:       on 41 degrees of freedom

The $R^2$ and F-statistics are not given because they cannot be calculated (at least not in the same way). The numerical values of the coefficients have changed a small amount but the general significance of the variables remains the same and our substantive conclusion would not be altered. Had we seen something different, we would need to find out the cause. Perhaps some group of observations were not being fit well and the robust regression excluded these points.

Another method that can be used is Least Trimmed Squares (LTS). Here one minimizes $\sum_{i=1}^{q} \hat\varepsilon^2_{(i)}$, where $q$ is some number less than $n$ and $(i)$ indicates sorting. This method has a high breakdown point because it can tolerate a large number of outliers, depending on how $q$ is chosen. The Huber and $L_1$ methods will still fail if some $\varepsilon_i \to \infty$. LTS is an example of a resistant regression method. Resistant methods are good at dealing with data where we expect there to be a certain number of "bad" observations that we want to have no weight in the analysis.

> library(MASS)  # ltsreg() is in MASS in current R; older versions used library(lqs)
> g <- ltsreg(involact ~ race + fire + theft + age + log(income), chicago)
> g$coef
(Intercept) race fire theft age log(income)
> g <- ltsreg(involact ~ race + fire + theft + age + log(income), chicago)
> g$coef
(Intercept) race fire theft age log(income)

The default choice of $q$ is $\lfloor n/2 \rfloor + \lfloor (p+1)/2 \rfloor$, where $\lfloor x \rfloor$ indicates the largest integer less than or equal to $x$. I repeated the command twice and you will notice that the results are somewhat different. This is because the default genetic algorithm used to compute the coefficients is nondeterministic. An exhaustive search method can be used:

> g <- ltsreg(involact ~ race + fire + theft + age + log(income), chicago, nsamp="exact")
> g$coef
(Intercept) race fire theft age log(income)

This takes about 20 minutes on a 400 MHz Intel Pentium II processor. For larger datasets it will take much longer, so this method might be impractical.
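To make the LTS criterion concrete, here is a minimal sketch of the quantity being minimized and of the default $q$ quoted above. The function names `lts_default_q` and `lts_criterion` and the toy data are hypothetical, and the sketch assumes that $p$ counts the columns of the design matrix including the intercept. It evaluates the criterion for a given $\beta$ only; it does not perform the search that ltsreg() carries out.

```r
# Default q from the text: floor(n/2) + floor((p+1)/2)
lts_default_q <- function(n, p) floor(n / 2) + floor((p + 1) / 2)

# LTS criterion: sum of the q smallest squared residuals
lts_criterion <- function(beta, X, y, q = lts_default_q(nrow(X), ncol(X))) {
  r2 <- sort(as.vector(y - X %*% beta)^2)   # squared residuals, ascending
  sum(r2[seq_len(q)])
}

# Toy data (assumed): five points on y = 2x plus one gross outlier.
X <- cbind(1, 1:6)
y <- c(2, 4, 6, 8, 10, 1000)
beta <- c(0, 2)

q_toy     <- lts_default_q(nrow(X), ncol(X))   # 3 + 1 = 4
lts_value <- lts_criterion(beta, X, y)         # the outlier is trimmed away
ls_value  <- sum((y - X %*% beta)^2)           # the outlier dominates
print(c(q = q_toy, LTS = lts_value, LS = ls_value))
```

For the Chicago data, with 47 observations and 6 coefficients, the same formula gives `lts_default_q(47, 6)`, which is 26 under the convention assumed here.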
The most notable difference from LS for the purposes of this data is the decrease in the race coefficient: if the same standard error applied, then it would verge on insignificance. However, we don't have the standard errors for the LTS regression coefficients. We now use a general method for inference which is especially useful when such theory is lacking: the Bootstrap.

To understand how this method works, think about how we might empirically determine the distribution of an estimator. We could repeatedly generate artificial data from the true model, compute the estimate each time and gather the results to study the distribution. This technique, called simulation, is not available to us for real data because we don't know the true model. The Bootstrap emulates the simulation procedure above except that, instead of sampling from the true model, it samples from the observed data itself. Remarkably, this technique is often effective. It sidesteps the need for theoretical calculations that may be extremely difficult or even impossible. The Bootstrap may be the single most important innovation in Statistics in the last 20 years.

To see how the bootstrap method compares with simulation, let's spell out the steps involved. In both cases, we consider $X$ fixed.

Simulation

In general the idea is to sample from the known distribution and compute the estimate, repeating many times to find as good an estimate of the sampling distribution of the estimator as we need. For the regression case, it is easiest to start with a sample from the error distribution since these are assumed to be independent and identically distributed:

1. Generate $\varepsilon$ from the known error distribution.
2. Form $y = X\beta + \varepsilon$ from the known $\beta$.
3. Compute $\hat\beta$.

We repeat these three steps many times. We can estimate the sampling distribution of $\hat\beta$ using the empirical distribution of the generated $\hat\beta$, which we can estimate as accurately as we please by simply running the simulation for long enough. This technique is useful for a theoretical investigation of the properties of a proposed new estimator. We can see how its performance compares to other estimators. However, it is of no value for the actual data since we don't know the true error distribution and we don't know the true $\beta$.

The bootstrap method mirrors the simulation method but uses quantities we do know. Instead of sampling from the population distribution, which we do not know in practice, we resample from the data itself.

Bootstrap

1. Generate $\varepsilon^*$ by sampling with replacement from $\hat\varepsilon_1, \dots, \hat\varepsilon_n$.
2. Form $y^* = X\hat\beta + \varepsilon^*$.
3. Compute $\hat\beta^*$ from $(X, y^*)$.

This time, we use only quantities that we know. For small $n$, it is possible to compute $\hat\beta^*$ for every possible sample from $\hat\varepsilon_1, \dots, \hat\varepsilon_n$, but usually we can only take as many samples as we have computing power available. This number of bootstrap samples can be as small as 50 if all we want is an estimate of the variance of our estimates but needs to be larger if confidence intervals are wanted.

To implement this, we need to be able to take a sample of residuals with replacement. sample() is good for generating random samples of indices:

> sample(10, rep=T)
[1]

and hence a random sample (with replacement) of LTS residuals is:

> g$res[sample(47, rep=T)]
(rest deleted)

You will notice that there is a repeated value even in this small snippet. We now execute the bootstrap: first we make a matrix to save the results in and then repeat the bootstrap process 1000 times. (This takes about 6 minutes to run on a 400 MHz Intel Pentium II processor.)

> x <- model.matrix(~ race + fire + theft + age + log(income), chicago)[,-1]
> bcoef <- matrix(0, 1000, 6)
> for(i in 1:1000){
+ newy <- g$fit + g$res[sample(47, rep=T)]
+ brg <- ltsreg(x, newy, nsamp="best")
+ bcoef[i,] <- brg$coef
+ }

It is not convenient to use nsamp="exact" since that would require 1000 times the 20 minutes it takes to make the original estimate. That's about two weeks, so I compromised and used the second best option of nsamp="best". This likely means that our bootstrap estimates of variability will be somewhat on the high side. This illustrates a common practical difficulty with the bootstrap: it can take a long time to compute. Fortunately, this problem recedes as processor speeds increase. It is notable that this calculation was the only one in this book that did not take a negligible amount of time. You typically do not need the latest and greatest computer to do statistics on the size of datasets encountered in this book.

To test the null hypothesis $H_0: \beta_{race} = 0$ against the alternative $H_1: \beta_{race} > 0$ we may figure what fraction of the bootstrap sampled $\beta_{race}$ were less than zero:

> length(bcoef[bcoef[,2]<0,2])/1000
[1] 0.019

So our p-value is 1.9% and we reject the null at the 5% level.

We can also make a 95% confidence interval for this parameter by taking the empirical quantiles:

> quantile(bcoef[,2], c(0.025,0.975))
2.5% 97.5%

We can get a better picture of the distribution by looking at the density and marking the confidence interval:

> plot(density(bcoef[,2]), xlab="Coefficient of Race", main="")
> abline(v=quantile(bcoef[,2], c(0.025,0.975)))

See Figure 13.1. We see that the distribution is approximately normal with perhaps some longish tails.
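The three bootstrap steps, the one-sided p-value and the percentile interval can be tried quickly on artificial data. The sketch below is not the chapter's analysis: it swaps in ordinary least squares (lm.fit) for ltsreg() so that it runs in well under a second, and the data, seed and variable names are all assumptions made for illustration.

```r
set.seed(123)                       # arbitrary seed, for reproducibility only

# Artificial data (assumed): a simple linear model with a positive slope.
n <- 40
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

fit      <- lm(y ~ x)
X        <- model.matrix(fit)       # X is held fixed throughout
res      <- resid(fit)
fitted_y <- fitted(fit)

B     <- 1000
bcoef <- matrix(0, B, ncol(X))
for (b in seq_len(B)) {
  # Steps 1 and 2: resample residuals with replacement, form y* = X betahat + e*
  ystar <- fitted_y + res[sample(n, replace = TRUE)]
  # Step 3: recompute the estimate from (X, y*)
  bcoef[b, ] <- lm.fit(X, ystar)$coefficients
}

# One-sided bootstrap p-value for H1: slope > 0, and a 95% percentile interval
p_value <- mean(bcoef[, 2] < 0)
ci      <- quantile(bcoef[, 2], c(0.025, 0.975))
print(p_value)
print(ci)
```

Here `mean(bcoef[, 2] < 0)` computes the same fraction as the `length(...)/1000` expression used above. Replacing the `lm.fit` call with a robust or resistant fitter gives the procedure used in the text, at a correspondingly higher computing cost.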
This would be more accurate if we took more than 1000 bootstrap resamples. The conclusion here would be that the race variable is significant but the effect is less than that estimated by least squares. Which is better? This depends on what the "true" model is, which we will never know, but since the QQ plot did not indicate any big problem with nonnormality I would tend to prefer the LS estimates. However, this does illustrate a general problem that occurs when more than one statistical method is available for a given dataset.

Figure 13.1: Bootstrap distribution of $\hat\beta_{race}$ with 95% confidence intervals (density plotted against the coefficient of race).

Summary

1. Robust estimators provide protection against long-tailed errors but they can't overcome problems with the choice of model and its variance structure. This is unfortunate because these problems are more serious than nonnormal error.
2. Robust estimates just give you $\hat\beta$ and possibly standard errors without the associated inferential methods. Software and methodology for this inference is not easy to come by. The bootstrap is a general purpose inferential method which is useful in these situations.
3. Robust methods can be used in addition to LS as a confirmatory method. You have cause to worry if the two estimates are far apart.
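The confirmatory use in point 3 can be sketched with a hand-rolled Huber fit by iteratively reweighted least squares, using the Huber weights $w(u)$ given earlier in the chapter. This is a rough illustration, not the algorithm inside rlm(): the function names `huber_weights` and `huber_irls`, the tuning constant `k = 1.345`, the scale estimate `median(|u|)/0.6745` (one choice of "something proportional to the median absolute residual") and the contaminated toy data are all assumptions made here.

```r
# Huber weights: 1 inside the cutoff, cutoff/|u| outside
huber_weights <- function(u, cutoff) ifelse(abs(u) <= cutoff, 1, cutoff / abs(u))

# Iteratively reweighted least squares with Huber weights (illustrative sketch)
huber_irls <- function(X, y, k = 1.345, tol = 1e-8, max_iter = 200) {
  beta <- lm.fit(X, y)$coefficients            # start from least squares
  for (iter in seq_len(max_iter)) {
    u      <- as.vector(y - X %*% beta)        # current residuals
    cutoff <- k * median(abs(u)) / 0.6745      # robust scale times tuning constant
    w      <- huber_weights(u, cutoff)
    beta_new  <- lm.wfit(X, y, w)$coefficients # weighted least squares step
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

# Toy data (assumed): true line y = 1 + 2x, with five shifted observations.
set.seed(42)
n <- 50
x <- seq(0, 10, length.out = n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
y[46:50] <- y[46:50] + 30

X <- cbind(Intercept = 1, x = x)
ls_coef    <- lm.fit(X, y)$coefficients
huber_coef <- huber_irls(X, y)
print(rbind(LS = ls_coef, Huber = huber_coef))
```

When the two rows of coefficients differ noticeably, as they should with this deliberately contaminated sample, that is the "cause to worry" signal: the next step is to find out which observations are driving the difference.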
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationChapter 5: Basic Linear Regression
Chapter 5: Basic Liear Regressio 1. Why Regressio Aalysis Has Domiated Ecoometrics By ow we have focused o formig estimates ad tests for fairly simple cases ivolvig oly oe variable at a time. But the core
More informationA modified KolmogorovSmirnov test for normality
MPRA Muich Persoal RePEc Archive A modified KolmogorovSmirov test for ormality Zvi Drezer ad Ofir Turel ad Dawit Zerom Califoria State UiversityFullerto 22. October 2008 Olie at http://mpra.ub.uimueche.de/14385/
More information(VCP310) 18004186789
Maual VMware Lesso 1: Uderstadig the VMware Product Lie I this lesso, you will first lear what virtualizatio is. Next, you ll explore the products offered by VMware that provide virtualizatio services.
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More information*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.
Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.
More informationAnalyzing Longitudinal Data from Complex Surveys Using SUDAAN
Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical
More informationTHE ARITHMETIC OF INTEGERS.  multiplication, exponentiation, division, addition, and subtraction
THE ARITHMETIC OF INTEGERS  multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,
More informationLecture 2: Karger s Min Cut Algorithm
priceto uiv. F 3 cos 5: Advaced Algorithm Desig Lecture : Karger s Mi Cut Algorithm Lecturer: Sajeev Arora Scribe:Sajeev Today s topic is simple but gorgeous: Karger s mi cut algorithm ad its extesio.
More informationCentral Limit Theorem and Its Applications to Baseball
Cetral Limit Theorem ad Its Applicatios to Baseball by Nicole Aderso A project submitted to the Departmet of Mathematical Scieces i coformity with the requiremets for Math 4301 (Hoours Semiar) Lakehead
More information.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth
Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,
More informationSection 11.3: The Integral Test
Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult
More informationSTA 2023 Practice Questions Exam 2 Chapter 7 sec 9.2. Case parameter estimator standard error Estimate of standard error
STA 2023 Practice Questios Exam 2 Chapter 7 sec 9.2 Formulas Give o the test: Case parameter estimator stadard error Estimate of stadard error Samplig Distributio oe mea x s t (1) oe p ( 1 p) CI: prop.
More informationA Review and Comparison of Methods for Detecting Outliers in Univariate Data Sets
A Review ad Compariso of Methods for Detectig Outliers i Uivariate Data Sets by Sogwo Seo BS, Kyughee Uiversity, Submitted to the Graduate Faculty of Graduate School of Public Health i partial fulfillmet
More informationSTATISTICAL METHODS FOR BUSINESS
STATISTICAL METHODS FOR BUSINESS UNIT 7: INFERENTIAL TOOLS. DISTRIBUTIONS ASSOCIATED WITH SAMPLING 7.1. Distributios associated with the samplig process. 7.2. Iferetial processes ad relevat distributios.
More informationPROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics
More informationThis document contains a collection of formulas and constants useful for SPC chart construction. It assumes you are already familiar with SPC.
SPC Formulas ad Tables 1 This documet cotais a collectio of formulas ad costats useful for SPC chart costructio. It assumes you are already familiar with SPC. Termiology Geerally, a bar draw over a symbol
More informationARTICLE IN PRESS. Statistics & Probability Letters ( ) A Kolmogorovtype test for monotonicity of regression. Cecile Durot
STAPRO 66 pp:  col.fig.: il ED: MG PROD. TYPE: COM PAGN: Usha.N  SCAN: il Statistics & Probability Letters 2 2 2 2 Abstract A Kolmogorovtype test for mootoicity of regressio Cecile Durot Laboratoire
More informationFactoring x n 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu>
(March 16, 004) Factorig x 1: cyclotomic ad Aurifeuillia polyomials Paul Garrett Polyomials of the form x 1, x 3 1, x 4 1 have at least oe systematic factorizatio x 1 = (x 1)(x 1
More information