Useful and unuseful summaries of regression models
Tutorial 5 by R. Todeschini

Roberto Todeschini
Milano Chemometrics and QSAR Research Group - Dept. of Environmental Sciences, University of Milano-Bicocca, P.za della Scienza, Milano (Italy)

In scientific papers, it is very common to present the summary of a regression model in the following form:

logP = … + … V1 + … V2    n = 15    r = …    s = …    F = 18.1

where logP is the studied experimental response and V1 and V2 are two independent variables for which a quantitative relationship with the response has been searched for. The numbers in the equation are called regression coefficients, the first one being the intercept of the regression model. n is the number of samples used for building the regression equation, r is the multiple correlation coefficient, s is the residual standard deviation (or error standard deviation), and F is the calculated value of the F-ratio test. In several cases, r = … is replaced by R = … or by R² = … or by R² = …%, the last two indices (coefficient of determination) being the squared r (R) value, and the last one the corresponding percentage of the variance explained by the regression model. To evaluate the quality of such a summary, the mathematical definitions of the quantities involved in the previous example are given below.

Residual Sum of Squares (RSS)

It is the sum of the squared differences between the experimental response and the response calculated by the regression model:

$$ RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{1} $$

If RSS is equal to zero the model is perfect, i.e. for all the samples the calculated responses coincide with the experimental responses. Obviously, RSS also depends on the measurement unit used for the response. In practice, for the same model, if you multiply the experimental response by 10, RSS becomes 100 times greater, being a squared quantity.

Total Sum of Squares (TSS)
It is the total variance that a regression model can explain and is used as a reference quantity to calculate standardized quality parameters. Also denoted as SSY, it is the sum of the squared differences between the experimental responses and the average experimental response:

$$ TSS \equiv SSY = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{2} $$

TSS is assumed as a theoretical reference model where, for each experimental response, a constant value is calculated as the average experimental response. As for RSS, TSS also depends on the measurement unit used for the response.

Derived regression parameters for evaluating the goodness of fit

From the previous definitions of RSS and TSS, the following quantities are usually defined:

$$ TSS = MSS + RSS \tag{3} $$

where TSS and RSS are the quantities defined above and MSS is the Model Sum of Squares. All three quantities are sums of squares and are therefore never negative:

TSS ≥ 0    MSS ≥ 0    RSS ≥ 0

The coefficient of determination R² and the multiple correlation coefficient R are defined as:

$$ R^2 = \frac{MSS}{TSS} = 1 - \frac{RSS}{TSS} = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}, \qquad r \equiv R = \sqrt{R^2}, \qquad 0 \le R^2 \le 1 \tag{4} $$

The mean square error (or residual mean square) s² and the residual standard deviation (or residual standard error) s are defined as:

$$ s^2 = \frac{RSS}{n - p'}, \qquad s = \sqrt{\frac{RSS}{n - p'}} \tag{5} $$

where n is the number of samples and p' the number of model parameters (often given by p variables plus the intercept). Other symbols commonly used for the residual standard deviation are SD, SE, and s.

Finally, the F-ratio test in regression is defined as the ratio of the variance explained by the model to the residual variance, both scaled by the corresponding degrees of freedom:

$$ F = \frac{MSS / (p' - 1)}{RSS / (n - p')} \tag{6} $$

It must be observed that, for the same number of objects and variables, if RSS decreases (i.e. a better model is obtained), then both R² and F increase monotonically, while the residual mean square s² decreases monotonically together with RSS. This means that the best goodness of fit can be equally evaluated by using any of the three parameters: max(R²), max(F) and min(s²).
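As a minimal sketch of equations (1) to (6), the snippet below fits a straight line by ordinary least squares and computes RSS, TSS, MSS, R², s and F. The data, the function names and the restriction to a single descriptor (so that p' = 2) are assumptions made only for illustration; they are not taken from the tutorial. The second call multiplies the response by 10 to show the unit dependence discussed above.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x (one descriptor plus intercept)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def fit_statistics(x, y):
    """Goodness-of-fit quantities of equations (1)-(6) for the fitted line."""
    n, p_prime = len(y), 2  # p' = 1 variable + intercept (assumption of this sketch)
    b0, b1 = fit_line(x, y)
    y_hat = [b0 + b1 * xi for xi in x]
    y_mean = sum(y) / n
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # eq. (1)
    tss = sum((yi - y_mean) ** 2 for yi in y)              # eq. (2)
    mss = tss - rss                                        # eq. (3)
    r2 = 1.0 - rss / tss                                   # eq. (4)
    s2 = rss / (n - p_prime)                               # eq. (5)
    f_ratio = (mss / (p_prime - 1)) / s2                   # eq. (6)
    return {"RSS": rss, "TSS": tss, "MSS": mss,
            "R2": r2, "s2": s2, "s": sqrt(s2), "F": f_ratio}


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

fit = fit_statistics(x, y)
fit_x10 = fit_statistics(x, [10.0 * yi for yi in y])  # response multiplied by 10

for name in ("RSS", "TSS", "R2", "s", "F"):
    print(f"{name:>3}: original = {fit[name]:.4f}   response x10 = {fit_x10[name]:.4f}")
```

In this sketch RSS and TSS grow by a factor of 100 and s by a factor of 10, while R² and F, being ratios of sums of squares, are unchanged.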
Moreover, it must be noted that none of these parameters is related in any way to the model prediction power: they are all related only to the goodness of fit.

About r, R and R²

First of all, it must be observed that, due to the mathematical properties of R², the goodness of fit of the following nested models

logP = f(V1, V2)    (model 1)
logP = f(V1, V2, V3)    (model 2)
logP = f(V1, V2, V3, V4)    (model 3)
…
logP = f(V1, V2, V3, V4, …, Vp)    (model p)

can always be ranked as:

R²(mod. p) ≥ … ≥ R²(mod. 3) ≥ R²(mod. 2) ≥ R²(mod. 1)

i.e. the R² value of a model A constituted by the same variables as another model B plus one more variable is always greater than (or at least equal to) the R² value of model B. This effect is due to the possible (and usually probable) presence of chance correlation, i.e. even a completely useless variable (or a random variable) can enhance the R² value, but never decreases it. Therefore, when the aim is to select the best model in a population of models, to select the best variables within a set of variables, to compare different models, or to get models containing only relevant variables, the R² parameter cannot be used (nor r or R, obviously). In order to overcome these drawbacks, Adjusted R² (R²_ADJ) can be used in place of R², since an increasing number of unnecessary variables gives a penalty to R²_ADJ. It is defined as:

$$ R^2_{ADJ} = 1 - (1 - R^2)\,\frac{n - 1}{n - p} \tag{7} $$

where n is the number of samples and p the number of model variables. Note that Adjusted R² coincides with R² for a model containing just one independent variable. However, remember that even adjusted R² is not related in any way to the real model prediction ability.

About the error standard deviation

The error standard deviation s is a statistical estimate of the error that also accounts for the model degrees of freedom. This quantity is in the same unit as the response and is based on the fitting performance of the model, i.e. RSS. A more direct measure of the average error of the response estimates is the Standard Deviation Error in Calculation (SDEC or SEC):

$$ SDEC \equiv SEC = \sqrt{\frac{RSS}{n}} \tag{8} $$
where n is the number of samples.

About the F-ratio test

In the last few decades, the majority of regression model summaries have included the F value. Analogously to adjusted R², the higher the F value, the better the model. This use of the F value in evaluating a regression model is correct, but some problems arise when one considers that the F value was originally proposed as a statistical test, that is, the calculated F needs to be compared with a critical value at some probability level in order to draw decisions. In effect, the null hypothesis H0 of the F-ratio test in regression states that all the regression coefficients are equal to zero (i.e. no regression model is obtained), against the alternative hypothesis H1 that at least one of the regression coefficients is different from zero (i.e. a regression model is obtained). Therefore, in multivariate analysis, where more than one independent variable is used, this test is very weak and, in practice, completely useless: in fact, we want all the variables included in the model to be relevant (or statistically significant) for the response we are modelling.

Regression parameters related to the prediction power (goodness of prediction)

In statistics and chemometrics several validation techniques have been proposed in the last 20 years in order to estimate the model prediction capability. The model prediction capability, i.e. the ability of the model to estimate the response for objects that did not participate in the calculation of the model, is something different from the model fitting capability. Here, attention is focused on the simplest way to obtain some measures of model prediction ability, i.e. the leave-one-out cross-validation technique. This technique is often too optimistic with respect to the "true" prediction power of the models; however, it can be easily carried out and allows model comparison as well as variable selection.
The measures of model predictive ability are very similar to those defined for the goodness of fit and are based on the Predictive Error Sum of Squares (PRESS), which takes the place of the Residual Sum of Squares (RSS).

Predictive Error Sum of Squares (PRESS)

It is the sum of the squared differences between the experimental response and the response predicted by the regression model, i.e. for an object that was not used for model estimation. It is defined as:

$$ PRESS = \sum_{i=1}^{n} (y_i - \hat{y}_{i/i})^2 \tag{9} $$

where the notation i/i indicates that the response is predicted by a model estimated when the i-th sample was left out from the training set.
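The leave-one-out definition in equation (9) can be sketched as an explicit loop: each sample is removed in turn, the model is refitted on the remaining samples, and the left-out response is predicted. The straight-line model, the data and the function names below are hypothetical choices made only to keep the example short.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def rss(x, y):
    """Residual sum of squares, eq. (1): all samples used for fitting."""
    b0, b1 = fit_line(x, y)
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


def press_loo(x, y):
    """Leave-one-out PRESS, eq. (9): sample i is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(y)):
        x_train = x[:i] + x[i + 1:]   # training set without the i-th sample
        y_train = y[:i] + y[i + 1:]
        b0, b1 = fit_line(x_train, y_train)
        y_pred = b0 + b1 * x[i]       # prediction for the left-out sample
        press += (y[i] - y_pred) ** 2
    return press


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]

rss_value = rss(x, y)
press_value = press_loo(x, y)
print(f"RSS = {rss_value:.4f}   PRESS = {press_value:.4f}")
```

For a least-squares model, PRESS obtained this way is never smaller than RSS, because each left-out sample is predicted without having contributed to the fit.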
Therefore, a parameter equivalent to R² can be defined, using the quantity PRESS in place of RSS. This parameter is called cross-validated R² and the accepted symbols are R²_CV or Q²:

$$ R^2_{CV} \equiv Q^2 = 1 - \frac{PRESS}{TSS} = 1 - \frac{\sum_i (y_i - \hat{y}_{i/i})^2}{\sum_i (y_i - \bar{y})^2}, \qquad Q^2 \le 1 \tag{10} $$

For bad predictive models, Q² can even assume negative values when PRESS is greater than TSS, meaning that in prediction the model performs worse than the no-model estimate, i.e. the mean response of the training set.

Note. A discussion about different Q² functions was recently published in the references given below, where a modified general Q² function is also proposed.

1) Consonni, V., Ballabio, D. and Todeschini, R. (2009) Comments on the definition of the Q² parameter for QSAR validation. Journal of Chemical Information and Modeling, 49, …
2) Consonni, V., Ballabio, D. and Todeschini, R. (2010) Evaluation of model predictive ability by external validation techniques. J. Chemometrics, 24, …

The predictive squared error (PSE) and the standard deviation error in prediction (SDEP or SEP) on the modeled response are given by:

$$ PSE = \frac{PRESS}{n}, \qquad SDEP \equiv SEP = \sqrt{\frac{PRESS}{n}} \tag{11} $$

The characteristic behaviour of R² and Q² is shown in the figure below.

As can be noted, R² values always increase when one variable at a time is added to the previous model variables; on the other hand, Q² values increase as long as useful variables are added to the
previous model variables; when noisy variables are added to the model, Q² values decrease, i.e. the prediction power of the model is lowered. Therefore, the optimal complexity of the model (the best model predictive power) can be chosen by simply looking at the maximum value of Q² (obviously, the same model ranking is obtained by looking at the minimum values of PSE or SDEP).

Examples of some questionable regression summaries

In order to better explain the different meanings of the parameters used for a summary of regression models, some examples taken from the literature are given below.

Example 1 - Comparison / selection of regression models

In a paper, five different models were calculated for modelling the binding affinities of 49 compounds, using five different sets of variables, each of ten different descriptors. The summary of the five models is collected in the table below. Here Q is the so-called "quality index" and not the cross-validated Q² parameter (compare equations 10 and 12 and the example given below).

Model   n    r      s      F      Q      R²    R²_CV
1       49   …      …      …      …      …     …
2       49   0.93   …      5.56   3.06   …     …
3       49   …      0.43   19.3   .160   …     …
4       49   …      …      8.45   .940   …     …
5       49   0.94   0.41   4.40   .86    …     …

First of all, there is redundant information due to the presence of both r and R² values. Moreover, since all the models have the same number of samples (49) and model parameters (10 + intercept), the degrees of freedom for the numerator and denominator of the F-ratio are 10 and 38, respectively, with F critical values (at the 5%, 1% and 0.1% significance levels) of 2.09, 2.81 and 3.90, respectively: using the F test, all the models are more than acceptable at any level of confidence (see also the paragraph About the F-ratio test above). On the basis of R² (and r), the models can be ranked (from the best to the worst) in the following order: 5, 4, 2, 3, 1. The same order should be obtained by ranking the models on the basis of s (the lower the value, the better the model) and by using the quality index Q (see example 2).
As for R², these two parameters also depend only on RSS, TSS being a constant, provided that the measurement unit of the response was not changed from one model to another. However, the reported model ranking is 2, 4, 5, 3, 1 in both cases, thus indicating the presence of some mistakes in reporting the results or in the calculations. Disregarding the previous wrong rankings, the only reasonable model ranking is that obtained by R²_CV: 4, 2, 5, 3, 1, i.e. the one that evaluates the model predictive power.
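The consistency argument of Example 1 can be checked with a few lines of arithmetic: when n, p' and TSS are the same for all models, R², s and the quality index Q = R/s (equation 12) are all monotonic functions of RSS and must therefore give the same ranking. The RSS and TSS values below are hypothetical, chosen only so that the R² ranking is 5, 4, 2, 3, 1; they are not the values of the paper under discussion.

```python
from math import sqrt

N_SAMPLES = 49   # as in Example 1
N_PARAMS = 11    # 10 descriptors + intercept, as in Example 1
TSS = 40.0       # hypothetical total sum of squares, identical for all models

# Hypothetical residual sums of squares for models 1..5.
rss_by_model = {1: 9.0, 2: 6.1, 3: 7.2, 4: 5.5, 5: 5.0}


def fit_summary(rss):
    """R2 (eq. 4), s (eq. 5) and quality index Q = R/s (eq. 12) from RSS alone."""
    r2 = 1.0 - rss / TSS
    s = sqrt(rss / (N_SAMPLES - N_PARAMS))
    return {"R2": r2, "s": s, "Q": sqrt(r2) / s}


table = {model: fit_summary(rss) for model, rss in rss_by_model.items()}

# Best model first: highest R2, lowest s, highest Q.
rank_r2 = sorted(table, key=lambda m: table[m]["R2"], reverse=True)
rank_s = sorted(table, key=lambda m: table[m]["s"])
rank_q = sorted(table, key=lambda m: table[m]["Q"], reverse=True)

print("ranking by R2:", rank_r2)
print("ranking by s: ", rank_s)
print("ranking by Q: ", rank_q)
```

Any summary in which these three rankings disagree, for models sharing n, p' and the response scale, must contain a reporting or calculation error.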
The Authors claim that model 5 is better than model 4 and that it is the best one because "Introduction of the new... [index]... in QSAR model for σ receptor increased the predictive value of the model.": this sentence is false.

Example 2 - The quality index

Another interesting model, reported in the literature, is the following:

logP = … + … W    n = 16    r = …    s = …    Q = …

In 1994, a quality index Q for regression was defined as:

$$ Q = \frac{R}{s} \tag{12} $$

where R and s are the measures of goodness of fit defined in equations (4) and (5), respectively. Several people have used this quality index without any criticism (see also example 1). However, it can be easily demonstrated that it is based on very doubtful statistics. In effect,

$$ Q^2 = \frac{R^2}{s^2} = \frac{MSS}{TSS} \cdot \frac{n - p'}{RSS} = \frac{MSS / (p' - 1)}{RSS / (n - p')} \cdot \frac{p' - 1}{TSS} = F \cdot \frac{p' - 1}{TSS} = \frac{F'}{TSS} $$

where F' is

$$ F' = \frac{MSS / (p' - 1)}{RSS / (n - p')} \cdot (p' - 1) = \frac{MSS}{RSS / (n - p')}, \qquad Q = \sqrt{\frac{F'}{TSS}} $$

Therefore, the Q index is related to a modified F test, which takes into account only the degrees of freedom of the model residual sum of squares. Moreover, Q also depends on the total sum of squares TSS, which is constant for a given response. However, if the measurement unit of the response is changed (for example, multiplied by 10), the TSS value will increase 100 times and the quality index decreases consequently. On the contrary, if the response is divided by 10, the quality index increases proportionally. In fact, if the response is multiplied by a scaling factor c, the total sum of squares is:

$$ TSS_c = \sum_i (c\,y_i - c\,\bar{y})^2 = c^2 \sum_i (y_i - \bar{y})^2 \tag{xx} $$

Therefore, the quality index Q of the scaled response can also be expressed in a simpler form as:

$$ Q_c = \frac{1}{c} \sqrt{\frac{F'}{TSS}}, \qquad Q_c^2 = \frac{R^2}{RSS_c / (n - p')} = \frac{R^2\,(n - p')}{(1 - R^2)\,TSS_c} = \frac{R^2\,(n - p')}{(1 - R^2)\,c^2 \sum_i (y_i - \bar{y})^2} $$
thus showing that, for n objects and p variables, Q increases monotonically (but not linearly) with R² (as well as with decreasing s²), TSS being constant. However, if a change in the response scale is performed, the quality index changes as 1/c.

But, in the case of the model shown above, another problem arises: in fact, the Authors confound the quality index Q with the cross-validated Q² (or R²_CV) parameter, thus attributing to the quality index predictive properties which, on the contrary, are completely absent: "This quality factor Q is defined as the ratio of the correlation coefficient (r) to the standard deviation (S) i.e., Q = r/s. This factor accounts for the predictive power of the model.". As can be easily observed, none of the parameters in the quality index definition is in any way related to the prediction power of the model; the index is (of course) related to R².

Example 3 - Negative r values

Look at the model summary below:

log k_i = … + … Sz    n = 16    r = −…    s = …    F = …

In this model, r (the multiple correlation coefficient) is presented with a minus sign, due to the inverse pairwise correlation between the variables log k_i (the response) and Sz (the descriptor). The reported r is wrong, since the multiple correlation coefficient (see equations 3 and 4) is always positive because it represents (or is related to) the explained variance of the model. The correct expression is r = … (the same value with a positive sign), while the information about the existing inverse correlation between log k_i and Sz is well highlighted by the sign of the regression coefficient. This is another error which sometimes appears in using r or R, due to confusing the multiple correlation coefficient (just r or R) with the bivariate correlation r, which represents the correlation between a pair of variables, ranges between −1 and +1, and is defined as:

$$ r(x, y) = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}} \tag{13} $$

Of course, for bivariate models, the absolute value of r(x, y) coincides with R.
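Both points of Examples 2 and 3 can be illustrated numerically. The sketch below, which assumes a one-descriptor model and invented data with an inverse trend, computes the pairwise correlation of equation (13), the multiple correlation coefficient R of equation (4) and the quality index Q = R/s of equation (12), and then repeats the calculation after multiplying the response by c = 10. All names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pairwise (bivariate) correlation between x and y, eq. (13)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def quality_summary(x, y):
    """Slope, multiple correlation R, residual standard deviation s and Q = R/s."""
    n, p_prime = len(y), 2  # one descriptor + intercept (assumption of this sketch)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_mean) ** 2 for yi in y)
    big_r = sqrt(1.0 - rss / tss)   # eq. (4): never negative
    s = sqrt(rss / (n - p_prime))   # eq. (5)
    return {"slope": slope, "R": big_r, "s": s, "Q": big_r / s}


# Hypothetical data with an inverse relationship, invented for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [9.8, 8.1, 6.2, 3.9, 2.2, 0.1]
c = 10.0  # scaling factor applied to the response

base = quality_summary(x, y)
scaled = quality_summary(x, [c * yi for yi in y])
r_xy = pearson_r(x, y)

print(f"pairwise r(x, y) = {r_xy:.4f}   multiple R = {base['R']:.4f}   slope = {base['slope']:.4f}")
print(f"Q = {base['Q']:.4f}   Q after scaling the response by {c:g} = {scaled['Q']:.4f}")
```

In this sketch the inverse relationship shows up in the negative slope and in the negative pairwise r(x, y), whereas R stays positive and equals |r(x, y)|; the quality index drops by the factor c although the model is exactly as good as before.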
A suggested summary of a regression model

From the above considerations, it seems quite reasonable to characterize the models with a simple but effective summary, separating the prediction and the fitting qualities of the models. Moreover, in order to avoid confusion, it is suggested to use capital R in place of r, leaving to the latter the meaning of pairwise correlation between two variables. In addition, the squared values of both R and Q give an explicit idea of the modeled variance. Once the model has been accepted, better
information about the average error (in fitting and prediction) on the modeled response is given by the parameters SDEC and SDEP. Finally, in order to avoid petty discussions with some referees, sometimes the F-ratio test can also be included, giving the degrees of freedom and the corresponding probability.

logP = … + … V1 + … V2
n = 15    Q²_LOO = 93.6%    SDEP = 0.79    R² = 97.65%    SDEC = 0.81

Obviously, other estimates of the prediction power performed on an external test set (reporting also the number of objects in the test set), or using leave-many-out (together with the percentage of objects left out in each step), bootstrap or other validation techniques are welcome and should also be reported, together with other specific information about the adopted validation technique.

n = 15    Q²_LMO(20%) = 91.13%    Q²_LMO(30%) = 90.3%    Q²_BOOT = 90.56%
n_EXT = 5    Q²_EXT = …    SDEP_EXT = 0.87
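A summary of the suggested kind (n, R², Q²_LOO, SDEC, SDEP) could be produced along the following lines. This is only a sketch for a one-descriptor least-squares model with invented data; the function name is hypothetical. Instead of refitting the model n times, it uses a standard algebraic identity of ordinary least squares that is not part of the tutorial: the leave-one-out prediction error of sample i equals its fitted residual divided by (1 − h_i), where h_i is the leverage of the sample.

```python
from math import sqrt


def regression_summary(x, y):
    """Fitting (R2, SDEC) and leave-one-out prediction (Q2_LOO, SDEP) statistics."""
    n = len(y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    b0 = y_mean - b1 * x_mean

    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Leverage of each sample for a straight-line model with intercept.
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    rss = sum(e ** 2 for e in residuals)                                   # eq. (1)
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(residuals, leverages))  # eq. (9)
    tss = sum((yi - y_mean) ** 2 for yi in y)                              # eq. (2)

    return {
        "n": n,
        "R2": 1.0 - rss / tss,       # eq. (4)
        "Q2_LOO": 1.0 - press / tss,  # eq. (10)
        "SDEC": sqrt(rss / n),       # eq. (8)
        "SDEP": sqrt(press / n),     # eq. (11)
    }


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9]

summary = regression_summary(x, y)
print(f"n = {summary['n']}   "
      f"Q2_LOO = {100 * summary['Q2_LOO']:.2f}%   SDEP = {summary['SDEP']:.3f}   "
      f"R2 = {100 * summary['R2']:.2f}%   SDEC = {summary['SDEC']:.3f}")
```

The shortcut gives the same PRESS as an explicit refit for every left-out sample, and with leave-one-out on a least-squares model Q² can never exceed R² and SDEP can never be smaller than SDEC.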
More informationMEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)
MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:
More information5 Boolean Decision Trees (February 11)
5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected
More informationCOMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS
COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat
More informationResearch Method (I) --Knowledge on Sampling (Simple Random Sampling)
Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact
More informationMath C067 Sampling Distributions
Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters
More informationODBC. Getting Started With Sage Timberline Office ODBC
ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.
More informationHCL Dynamic Spiking Protocol
ELI LILLY AND COMPANY TIPPECANOE LABORATORIES LAFAYETTE, IN Revisio 2.0 TABLE OF CONTENTS REVISION HISTORY... 2. REVISION.0... 2.2 REVISION 2.0... 2 2 OVERVIEW... 3 3 DEFINITIONS... 5 4 EQUIPMENT... 7
More informationThe following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles
The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationSystems Design Project: Indoor Location of Wireless Devices
Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 698-5295 Email: bcm1@cec.wustl.edu Supervised
More informationA Combined Continuous/Binary Genetic Algorithm for Microstrip Antenna Design
A Combied Cotiuous/Biary Geetic Algorithm for Microstrip Atea Desig Rady L. Haupt The Pesylvaia State Uiversity Applied Research Laboratory P. O. Box 30 State College, PA 16804-0030 haupt@ieee.org Abstract:
More informationCS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations
CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad
More informationWeek 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable
Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5
More information15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011
15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes high-defiitio
More informationCenter, Spread, and Shape in Inference: Claims, Caveats, and Insights
Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the
More informationConfidence Intervals for Linear Regression Slope
Chapter 856 Cofidece Iterval for Liear Regreio Slope Itroductio Thi routie calculate the ample ize eceary to achieve a pecified ditace from the lope to the cofidece limit at a tated cofidece level for
More information*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.
Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.
More informationFOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10
FOUNDATIONS OF MATHEMATICS AND PRE-CALCULUS GRADE 10 [C] Commuicatio Measuremet A1. Solve problems that ivolve liear measuremet, usig: SI ad imperial uits of measure estimatio strategies measuremet strategies.
More informationDescriptive Statistics
Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote
More informationForecasting. Forecasting Application. Practical Forecasting. Chapter 7 OVERVIEW KEY CONCEPTS. Chapter 7. Chapter 7
Forecastig Chapter 7 Chapter 7 OVERVIEW Forecastig Applicatios Qualitative Aalysis Tred Aalysis ad Projectio Busiess Cycle Expoetial Smoothig Ecoometric Forecastig Judgig Forecast Reliability Choosig the
More informationA Recursive Formula for Moments of a Binomial Distribution
A Recursive Formula for Momets of a Biomial Distributio Árpád Béyi beyi@mathumassedu, Uiversity of Massachusetts, Amherst, MA 01003 ad Saverio M Maago smmaago@psavymil Naval Postgraduate School, Moterey,
More informationCHAPTER 7: Central Limit Theorem: CLT for Averages (Means)
CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:
More informationOne-sample test of proportions
Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:
More informationVladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT
Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee
More informationwhere: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return
EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The
More informationMann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)
No-Parametric ivariate Statistics: Wilcoxo-Ma-Whitey 2 Sample Test 1 Ma-Whitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo-) Ma-Whitey (WMW) test is the o-parametric equivalet of a pooled
More informationExploratory Data Analysis
1 Exploratory Data Aalysis Exploratory data aalysis is ofte the rst step i a statistical aalysis, for it helps uderstadig the mai features of the particular sample that a aalyst is usig. Itelliget descriptios
More informationA Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as:
A Test of Normality Textbook Referece: Chapter. (eighth editio, pages 59 ; seveth editio, pages 6 6). The calculatio of p values for hypothesis testig typically is based o the assumptio that the populatio
More informationYour organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:
Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network
More informationBio-Plex Manager Software
Multiplex Suspesio Array Bio-Plex Maager Software Extract Kowledge Faster Move Your Research Forward Bio-Rad cotiues to iovate where it matters most. With Bio-Plex Maager 5.0 software, we offer valuable
More informationConfidence intervals and hypothesis tests
Chapter 2 Cofidece itervals ad hypothesis tests This chapter focuses o how to draw coclusios about populatios from sample data. We ll start by lookig at biary data (e.g., pollig), ad lear how to estimate
More informationInfinite Sequences and Series
CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationAutomatic Tuning for FOREX Trading System Using Fuzzy Time Series
utomatic Tuig for FOREX Tradig System Usig Fuzzy Time Series Kraimo Maeesilp ad Pitihate Soorasa bstract Efficiecy of the automatic currecy tradig system is time depedet due to usig fixed parameters which
More informationBasic Elements of Arithmetic Sequences and Series
MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic
More informationOutline. Determine Confidence Interval. EEC 686/785 Modeling & Performance Evaluation of Computer Systems. Confidence Interval for The Mean.
EEC 686/785 Modelig & Performace Evaluatio of Computer Systems Lecture 9 Departmet of Electrical ad Computer Egieerig Clevelad State Uiversity webig@ieee.org (based o Dr. Raj jai s lecture otes) Outlie
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More information