Chapter 12: Logistic Regression


12.1 Modeling Conditional Probabilities

So far, we either looked at estimating the conditional expectations of continuous variables (as in regression), or at estimating distributions. There are many situations, however, where we are interested in input-output relationships, as in regression, but the output variable is discrete rather than continuous. In particular there are many situations where we have binary outcomes (it snows in Pittsburgh on a given day, or it doesn't; this squirrel carries plague, or it doesn't; this loan will be paid back, or it won't; this person will get heart disease in the next five years, or they won't). In addition to the binary outcome, we have some input variables, which may or may not be continuous. How could we model and analyze such data?

We could try to come up with a rule which guesses the binary output from the input variables. This is called classification, and is an important topic in statistics and machine learning. However, simply guessing "yes" or "no" is pretty crude, especially if there is no perfect rule. (Why should there be?) Something which takes noise into account, and doesn't just give a binary answer, will often be useful. In short, we want probabilities, which means we need to fit a stochastic model.

What would be nice, in fact, would be to have the conditional distribution of the response $Y$, given the input variables, $\Pr(Y \mid X)$. This would tell us about how precise our predictions are. If our model says that there's a 51% chance of snow and it doesn't snow, that's better than if it had said there was a 99% chance of snow (though even a 99% chance is not a sure thing). We have seen how to estimate conditional probabilities non-parametrically, and could do this using the kernels for discrete variables from lecture 6. While there are a lot of merits to this approach, it does involve coming up with a model for the joint distribution of outputs $Y$ and inputs $X$, which can be quite time-consuming.

Let's pick one of the classes and call it "1" and the other "0". (It doesn't matter which is which.) Then $Y$ becomes an indicator variable, and you can convince yourself that $\Pr(Y = 1) = E[Y]$. Similarly, $\Pr(Y = 1 \mid X = x) = E[Y \mid X = x]$. (In a phrase, "conditional probability is the conditional expectation of the indicator".)
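Spelled out, the identity is just the definition of expectation applied to a variable which only takes the values 0 and 1:

$$E[Y] = 1 \cdot \Pr(Y = 1) + 0 \cdot \Pr(Y = 0) = \Pr(Y = 1)$$

and the same calculation, done conditionally on $X = x$, gives

$$E[Y \mid X = x] = 1 \cdot \Pr(Y = 1 \mid X = x) + 0 \cdot \Pr(Y = 0 \mid X = x) = \Pr(Y = 1 \mid X = x)$$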

This helps us because by this point we know all about estimating conditional expectations. The most straightforward thing for us to do at this point would be to pick out our favorite smoother and estimate the regression function for the indicator variable; this will be an estimate of the conditional probability function.

There are two reasons not to just plunge ahead with that idea. One is that probabilities must be between 0 and 1, but our smoothers will not necessarily respect that, even if all the observed $y_i$ they get are either 0 or 1. The other is that we might be better off making more use of the fact that we are trying to estimate probabilities, by more explicitly modeling the probability.

Assume that $\Pr(Y = 1 \mid X = x) = p(x;\theta)$, for some function $p$ parameterized by $\theta$, and further assume that observations are independent of each other. Then the (conditional) likelihood function is

$$\prod_{i=1}^{n} \Pr(Y = y_i \mid X = x_i) = \prod_{i=1}^{n} p(x_i;\theta)^{y_i} \left(1 - p(x_i;\theta)\right)^{1-y_i} \tag{12.1}$$

Recall that in a sequence of Bernoulli trials $y_1, \ldots, y_n$, where there is a constant probability of success $p$, the likelihood is

$$\prod_{i=1}^{n} p^{y_i} (1-p)^{1-y_i} \tag{12.2}$$

As you learned in intro. stats, this likelihood is maximized when $p = \hat{p} = n^{-1}\sum_{i=1}^{n} y_i$. If each trial had its own success probability $p_i$, this likelihood becomes

$$\prod_{i=1}^{n} p_i^{y_i} (1-p_i)^{1-y_i} \tag{12.3}$$

Without some constraints, estimating the "inhomogeneous Bernoulli" model by maximum likelihood doesn't work; we'd get $\hat{p}_i = 1$ when $y_i = 1$, $\hat{p}_i = 0$ when $y_i = 0$, and learn nothing. If on the other hand we assume that the $p_i$ aren't just arbitrary numbers but are linked together, those constraints give non-trivial parameter estimates, and let us generalize. In the kind of model we are talking about, the constraint, $p_i = p(x_i;\theta)$, tells us that $p_i$ must be the same whenever $x_i$ is the same, and if $p$ is a continuous function, then similar values of $x_i$ must lead to similar values of $p_i$. Assuming $p$ is known (up to parameters), the likelihood is a function of $\theta$, and we can estimate $\theta$ by maximizing the likelihood. This lecture will be about this approach.

12.2 Logistic Regression

To sum up: we have a binary output variable $Y$, and we want to model the conditional probability $\Pr(Y = 1 \mid X = x)$ as a function of $x$; any unknown parameters in the function are to be estimated by maximum likelihood. By now, it will not surprise you to learn that statisticians have approached this problem by asking themselves "how can we use linear regression to solve this?"

1. The most obvious idea is to let $p(x)$ be a linear function of $x$. Every increment of a component of $x$ would add or subtract so much to the probability. The conceptual problem here is that $p$ must be between 0 and 1, and linear functions are unbounded. Moreover, in many situations we empirically see "diminishing returns": changing $p$ by the same amount requires a bigger change in $x$ when $p$ is already large (or small) than when $p$ is close to 1/2. Linear models can't do this.

2. The next most obvious idea is to let $\log p(x)$ be a linear function of $x$, so that changing an input variable multiplies the probability by a fixed amount. The problem is that logarithms are unbounded in only one direction, and linear functions are not.

3. Finally, the easiest modification of $\log p$ which has an unbounded range is the logistic (or logit) transformation, $\log \frac{p}{1-p}$. We can make this a linear function of $x$ without fear of nonsensical results. (Of course the results could still happen to be wrong, but they're not guaranteed to be wrong.)

This last alternative is logistic regression. Formally, the logistic regression model is that

$$\log \frac{p(x)}{1 - p(x)} = \beta_0 + x \cdot \beta \tag{12.4}$$

Solving for $p$, this gives

$$p(x; \beta_0, \beta) = \frac{e^{\beta_0 + x \cdot \beta}}{1 + e^{\beta_0 + x \cdot \beta}} = \frac{1}{1 + e^{-(\beta_0 + x \cdot \beta)}} \tag{12.5}$$

Notice that the overall specification is a lot easier to grasp in terms of the transformed probability than in terms of the untransformed probability. [1]

[1] Unless you've taken statistical mechanics, in which case you recognize that this is the Boltzmann distribution for a system with two states, which differ in energy by $\beta_0 + x \cdot \beta$.

To minimize the mis-classification rate, we should predict $Y = 1$ when $p \geq 0.5$ and $Y = 0$ when $p < 0.5$. This means guessing 1 whenever $\beta_0 + x \cdot \beta$ is non-negative, and 0 otherwise. So logistic regression gives us a linear classifier. The decision boundary separating the two predicted classes is the solution of $\beta_0 + x \cdot \beta = 0$, which is a point if $x$ is one dimensional, a line if it is two dimensional, etc. One can show (exercise!) that the distance from the decision boundary is $\beta_0/\|\beta\| + x \cdot \beta/\|\beta\|$. Logistic regression not only says where the boundary between the classes is, but also says (via Eq. 12.5) that the class probabilities depend on distance from the boundary, in a particular way, and that they go towards the extremes (0 and 1) more rapidly when $\|\beta\|$ is larger. It's these statements about probabilities which make logistic regression more than just a classifier. It makes stronger, more detailed predictions, and can be fit in a different way; but those strong predictions could be wrong.

Using logistic regression to predict class probabilities is a modeling choice, just like it's a modeling choice to predict quantitative variables with linear regression.
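To make Eq. 12.5 and the remark about the size of $\beta$ concrete, here is a minimal R sketch. The function names `logistic.prob` and `boundary.distance`, and the three input points, are made up for this illustration and are not part of any package. It evaluates the class probabilities under two parameter settings which share a decision boundary but differ in $\|\beta\|$.

```r
# Class probability from Eq. 12.5: p = 1 / (1 + exp(-(beta0 + x . beta)))
# x is a matrix with one row per point (hypothetical helper, for illustration)
logistic.prob <- function(x, beta0, beta) {
  linear.part <- beta0 + as.vector(x %*% beta)
  1 / (1 + exp(-linear.part))
}

# Signed distance of each row of x from the boundary beta0 + x . beta = 0
boundary.distance <- function(x, beta0, beta) {
  (beta0 + as.vector(x %*% beta)) / sqrt(sum(beta^2))
}

# Three made-up points; the third lies exactly on the boundary
x <- rbind(c(0.5, 0.5), c(-0.5, -0.5), c(0.25, -0.25))

# Same boundary (x1 + x2 = 0), different magnitudes of beta
p.small <- logistic.prob(x, beta0 = 0, beta = c(1, 1))
p.large <- logistic.prob(x, beta0 = 0, beta = c(5, 5))

# Predict class 1 whenever p >= 0.5, i.e. whenever the linear part is non-negative
class.small <- as.integer(p.small >= 0.5)
class.large <- as.integer(p.large >= 0.5)

d.small <- boundary.distance(x, beta0 = 0, beta = c(1, 1))
d.large <- boundary.distance(x, beta0 = 0, beta = c(5, 5))
```

Both settings predict the same classes and put each point at the same distance from the boundary, but the probabilities under $\beta = (5, 5)$ sit further from 1/2 for every point which is off the boundary.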

[Figure 12.1: four scatterplot panels, each with horizontal axis x[,1] and vertical axis x[,2].]

Figure 12.1: Effects of scaling logistic regression parameters. Values of $x_1$ and $x_2$ are the same in all plots ($\sim \mathrm{Unif}(-1, 1)$ for both coordinates), but labels were generated randomly from logistic regressions with $\beta_0 = -0.1$, $\beta = (-0.2, 0.2)$ (top left); from $\beta_0 = -0.5$, $\beta = (-1, 1)$ (top right); from $\beta_0 = -2.5$, $\beta = (-5, 5)$ (bottom left); and from a perfect linear classifier with the same boundary. The large black dot is the origin.

In neither case is the appropriateness of the model guaranteed by the gods, nature, mathematical necessity, etc. We begin by positing the model, to get something to work with, and we end (if we know what we're doing) by checking whether it really does match the data, or whether it has systematic flaws.

Logistic regression is one of the most commonly used tools for applied statistics and discrete data analysis. There are basically four reasons for this.

1. Tradition.

2. In addition to the heuristic approach above, the quantity $\log p/(1-p)$ plays an important role in the analysis of contingency tables (the "log odds"). Classification is a bit like having a contingency table with two columns (classes) and infinitely many rows (values of $x$). With a finite contingency table, we can estimate the log-odds for each row empirically, by just taking counts in the table. With infinitely many rows, we need some sort of interpolation scheme; logistic regression is linear interpolation for the log-odds.

3. It's closely related to "exponential family" distributions, where the probability of some vector $v$ is proportional to $\exp\left\{\beta_0 + \sum_{j=1}^{m} f_j(v)\beta_j\right\}$. If one of the components of $v$ is binary, and the functions $f_j$ are all the identity function, then we get a logistic regression. Exponential families arise in many contexts in statistical theory (and in physics!), so there are lots of problems which can be turned into logistic regression.

4. It often works surprisingly well as a classifier. But, many simple techniques often work surprisingly well as classifiers, and this doesn't really testify to logistic regression getting the probabilities right.

12.2.1 Likelihood Function for Logistic Regression

Because logistic regression predicts probabilities, rather than just classes, we can fit it using likelihood. For each training data-point, we have a vector of features, $x_i$, and an observed class, $y_i$. The probability of that class was either $p$, if $y_i = 1$, or $1 - p$, if $y_i = 0$. The likelihood is then

$$L(\beta_0, \beta) = \prod_{i=1}^{n} p(x_i)^{y_i} \left(1 - p(x_i)\right)^{1-y_i} \tag{12.6}$$

(I could substitute in the actual equation for $p$, but things will be clearer in a moment if I don't.) The log-likelihood turns products into sums:

$$\ell(\beta_0, \beta) = \sum_{i=1}^{n} y_i \log p(x_i) + (1 - y_i) \log\left(1 - p(x_i)\right) \tag{12.7}$$

$$= \sum_{i=1}^{n} \log\left(1 - p(x_i)\right) + \sum_{i=1}^{n} y_i \log \frac{p(x_i)}{1 - p(x_i)} \tag{12.8}$$

$$= \sum_{i=1}^{n} \log\left(1 - p(x_i)\right) + \sum_{i=1}^{n} y_i (\beta_0 + x_i \cdot \beta) \tag{12.9}$$

$$= \sum_{i=1}^{n} -\log\left(1 + e^{\beta_0 + x_i \cdot \beta}\right) + \sum_{i=1}^{n} y_i (\beta_0 + x_i \cdot \beta) \tag{12.10}$$

where in the next-to-last step we finally use equation 12.4.

Typically, to find the maximum likelihood estimates we'd differentiate the log likelihood with respect to the parameters, set the derivatives equal to zero, and solve. To start that, take the derivative with respect to one component of $\beta$, say $\beta_j$.

$$\frac{\partial \ell}{\partial \beta_j} = -\sum_{i=1}^{n} \frac{1}{1 + e^{\beta_0 + x_i \cdot \beta}} e^{\beta_0 + x_i \cdot \beta} x_{ij} + \sum_{i=1}^{n} y_i x_{ij} \tag{12.11}$$

$$= \sum_{i=1}^{n} \left(y_i - p(x_i; \beta_0, \beta)\right) x_{ij} \tag{12.12}$$

We are not going to be able to set this to zero and solve exactly. (That's a transcendental equation, and there is no closed-form solution.) We can however approximately solve it numerically.

12.2.2 Logistic Regression with More Than Two Classes

If $Y$ can take on more than two values, say $k$ of them, we can still use logistic regression. Instead of having one set of parameters $\beta_0, \beta$, each class $c$ in $0 : (k-1)$ will have its own offset $\beta_0^{(c)}$ and vector $\beta^{(c)}$, and the predicted conditional probabilities will be

$$\Pr(Y = c \mid X = x) = \frac{e^{\beta_0^{(c)} + x \cdot \beta^{(c)}}}{\sum_{c'} e^{\beta_0^{(c')} + x \cdot \beta^{(c')}}} \tag{12.13}$$

You can check that when there are only two classes (say, 0 and 1), equation 12.13 reduces to equation 12.5, with $\beta_0 = \beta_0^{(1)} - \beta_0^{(0)}$ and $\beta = \beta^{(1)} - \beta^{(0)}$. In fact, no matter how many classes there are, we can always pick one of them, say $c = 0$, and fix its parameters at exactly zero, without any loss of generality. [2]

[2] Since we can arbitrarily choose which class's parameters to zero out without affecting the predicted probabilities, strictly speaking the model in Eq. 12.13 is unidentified. That is, different parameter settings lead to exactly the same outcome, so we can't use the data to tell which one is right. The usual response here is to deal with this by a convention: we decide to zero out the parameters of the first class, and then estimate the contrasting parameters for the others.
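The lack of identification is easy to see numerically. The following R sketch is only an illustration: the function name `multiclass.prob` and all the parameter values are invented for the purpose. It evaluates Eq. 12.13 for three classes, then subtracts the first class's parameters from every class, so that the first class's parameters become exactly zero.

```r
# Multiclass logistic probabilities (Eq. 12.13) for one input vector x.
# beta0: vector of k offsets; beta: matrix with one column of slopes per class.
# (Hypothetical helper written for this illustration.)
multiclass.prob <- function(x, beta0, beta) {
  linear.part <- beta0 + as.vector(x %*% beta)
  # Subtracting the maximum changes no ratio, and guards against overflow
  weights <- exp(linear.part - max(linear.part))
  weights / sum(weights)
}

# Made-up parameters: k = 3 classes, 2 input variables
beta0 <- c(0.3, -1.0, 0.5)
beta <- cbind(c(1, -2), c(0.5, 0.5), c(-1, 3))
x <- c(0.2, -0.4)

p.original <- multiclass.prob(x, beta0, beta)

# Zero out the first class by subtracting its parameters from every class
beta0.contrast <- beta0 - beta0[1]
beta.contrast <- beta - beta[, 1]   # recycles column 1 down every column
p.contrast <- multiclass.prob(x, beta0.contrast, beta.contrast)
```

The two parameter settings are different, but `p.original` and `p.contrast` agree, which is why fixing one class's parameters at zero costs nothing.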

Calculation of the likelihood now proceeds as before (only with more book-keeping), and so does maximum likelihood estimation.

12.3 Newton's Method for Numerical Optimization

There are a huge number of methods for numerical optimization; we can't cover all bases, and there is no magical method which will always work better than anything else. However, there are some methods which work very well on an awful lot of the problems which keep coming up, and it's worth spending a moment to sketch how they work. One of the most ancient yet important of them is Newton's method (alias "Newton-Raphson").

Let's start with the simplest case of minimizing a function of one scalar variable, say $f(\beta)$. We want to find the location of the global minimum, $\beta^*$. We suppose that $f$ is smooth, and that $\beta^*$ is a regular interior minimum, meaning that the derivative at $\beta^*$ is zero and the second derivative is positive. Near the minimum we could make a Taylor expansion:

$$f(\beta) \approx f(\beta^*) + \frac{1}{2}(\beta - \beta^*)^2 \left.\frac{d^2 f}{d\beta^2}\right|_{\beta = \beta^*} \tag{12.14}$$

(We can see here that the second derivative has to be positive to ensure that $f(\beta) > f(\beta^*)$.) In words, $f(\beta)$ is close to quadratic near the minimum.

Newton's method uses this fact, and minimizes a quadratic approximation to the function we are really interested in. (In other words, Newton's method is to replace the problem we want to solve, with a problem which we can solve.) Guess an initial point $\beta^{(0)}$. If this is close to the minimum, we can take a second order Taylor expansion around $\beta^{(0)}$ and it will still be accurate:

$$f(\beta) \approx f(\beta^{(0)}) + (\beta - \beta^{(0)}) \left.\frac{df}{d\beta}\right|_{\beta = \beta^{(0)}} + \frac{1}{2}\left(\beta - \beta^{(0)}\right)^2 \left.\frac{d^2 f}{d\beta^2}\right|_{\beta = \beta^{(0)}} \tag{12.15}$$

Now it's easy to minimize the right-hand side of equation 12.15. Let's abbreviate the derivatives, because they get tiresome to keep writing out: $\left.\frac{df}{d\beta}\right|_{\beta = \beta^{(0)}} = f'(\beta^{(0)})$, $\left.\frac{d^2 f}{d\beta^2}\right|_{\beta = \beta^{(0)}} = f''(\beta^{(0)})$. We just take the derivative with respect to $\beta$, and set it equal to zero at a point we'll call $\beta^{(1)}$:

$$0 = f'(\beta^{(0)}) + \frac{1}{2} f''(\beta^{(0)}) \, 2\left(\beta^{(1)} - \beta^{(0)}\right) \tag{12.16}$$

$$\beta^{(1)} = \beta^{(0)} - \frac{f'(\beta^{(0)})}{f''(\beta^{(0)})} \tag{12.17}$$

The value $\beta^{(1)}$ should be a better guess at the minimum $\beta^*$ than the initial one $\beta^{(0)}$ was. So if we use it to make a quadratic approximation to $f$, we'll get a better approximation, and so we can iterate this procedure, minimizing one approximation and then using that to get a new approximation:

$$\beta^{(n+1)} = \beta^{(n)} - \frac{f'(\beta^{(n)})}{f''(\beta^{(n)})} \tag{12.18}$$

Notice that the true minimum $\beta^*$ is a fixed point of equation 12.18: if we happen to land on it, we'll stay there (since $f'(\beta^*) = 0$). We won't show it, but it can be proved that if $\beta^{(0)}$ is close enough to $\beta^*$, then $\beta^{(n)} \to \beta^*$, and that in general $|\beta^{(n)} - \beta^*| = O(n^{-2})$, a very rapid rate of convergence. (Doubling the number of iterations we use doesn't reduce the error by a factor of two, but by a factor of four.)

Let's put this together in an algorithm.

    my.newton = function(f,f.prime,f.prime2,beta0,tolerance=1e-3,max.iter=50) {
      beta = beta0
      old.f = f(beta)
      iterations = 0
      made.changes = TRUE
      while(made.changes && (iterations < max.iter)) {
        iterations <- iterations + 1
        made.changes <- FALSE
        # Newton update, Eq. 12.18
        new.beta = beta - f.prime(beta)/f.prime2(beta)
        new.f = f(new.beta)
        # Relative change in the function value (undefined if old.f is zero)
        relative.change = abs(new.f - old.f)/abs(old.f)
        made.changes = (relative.change > tolerance)
        beta = new.beta
        old.f = new.f
      }
      if (made.changes) {
        warning("Newton's method terminated before convergence")
      }
      return(list(minimum=beta,value=f(beta),deriv=f.prime(beta),
                  deriv2=f.prime2(beta),iterations=iterations,
                  converged=!made.changes))
    }

The first three arguments here have to all be functions. The fourth argument is our initial guess for the minimum, $\beta^{(0)}$. The last arguments keep Newton's method from cycling forever: tolerance tells it to stop when the function stops changing very much (the relative difference between $f(\beta^{(n)})$ and $f(\beta^{(n+1)})$ is small), and max.iter tells it to never do more than a certain number of steps no matter what. The return value includes the estimated minimum, the value of the function there, and some diagnostics: the derivative should be very small, the second derivative should be positive, etc.

You may have noticed some potential problems: what if we land on a point where $f''$ is zero? What if $f(\beta^{(n+1)}) > f(\beta^{(n)})$? Etc. There are ways of handling these issues, and more, which are incorporated into real optimization algorithms from numerical analysis, such as the optim function in R [3]; I strongly recommend you use that, or something like that, rather than trying to roll your own optimization code.

[3] optim actually is a wrapper for several different optimization methods; method=BFGS selects a Newtonian method; BFGS is an acronym for the names of the algorithm's inventors.

12.3.1 Newton's Method in More than One Dimension

Suppose that the objective $f$ is a function of multiple arguments, $f(\beta_1, \beta_2, \ldots, \beta_p)$. Let's bundle the parameters into a single vector, $\beta$. Then the Newton update is

$$\beta^{(n+1)} = \beta^{(n)} - H^{-1}(\beta^{(n)}) \nabla f(\beta^{(n)}) \tag{12.19}$$

where $\nabla f$ is the gradient of $f$, its vector of partial derivatives $[\partial f/\partial \beta_1, \partial f/\partial \beta_2, \ldots, \partial f/\partial \beta_p]$, and $H$ is the Hessian of $f$, its matrix of second partial derivatives, $H_{ij} = \partial^2 f/\partial \beta_i \partial \beta_j$.

Calculating $H$ and $\nabla f$ isn't usually very time-consuming, but taking the inverse of $H$ is, unless it happens to be a diagonal matrix. This leads to various quasi-Newton methods, which either approximate $H$ by a diagonal matrix, or take a proper inverse of $H$ only rarely (maybe just once), and then try to update an estimate of $H^{-1}(\beta^{(n)})$ as $\beta^{(n)}$ changes.

12.3.2 Iteratively Re-Weighted Least Squares

This discussion of Newton's method is quite general, and therefore abstract. In the particular case of logistic regression, we can make everything look much more "statistical".

Logistic regression, after all, is a linear model for a transformation of the probability. Let's call this transformation $g$:

$$g(p) \equiv \log \frac{p}{1 - p} \tag{12.20}$$

So the model is

$$g(p) = \beta_0 + x \cdot \beta \tag{12.21}$$

and $Y \mid X = x \sim \mathrm{Binom}(1, g^{-1}(\beta_0 + x \cdot \beta))$. It seems that what we should want to do is take $g(y)$ and regress it linearly on $x$. Of course, the variance of $Y$, according to the model, is going to change depending on $x$ (it will be $g^{-1}(\beta_0 + x \cdot \beta)\left(1 - g^{-1}(\beta_0 + x \cdot \beta)\right)$), so we really ought to do a weighted linear regression, with weights inversely proportional to that variance. Since writing $g^{-1}(\beta_0 + x \cdot \beta)$ is getting annoying, let's abbreviate it by $\mu$ (for "mean"), and let's abbreviate that variance as $V(\mu)$.

The problem is that $y$ is either 0 or 1, so $g(y)$ is either $-\infty$ or $+\infty$. We will evade this by using Taylor expansion.

$$g(y) \approx g(\mu) + (y - \mu) g'(\mu) \equiv z \tag{12.22}$$

The right hand side, $z$, will be our effective response variable. To regress it, we need its variance, which by propagation of error will be $(g'(\mu))^2 V(\mu)$.
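Here is a minimal R sketch of the resulting procedure, with the weighted regression repeated, and $z$ and the weights recomputed from the current coefficients each time, until the coefficients stop changing. The function name `irls.logistic`, the simulated data, the starting point at zero and the stopping rule are all assumptions made for illustration; `lm.wfit` is base R's weighted least-squares routine. This is a sketch for seeing the mechanics, not a substitute for `glm`.

```r
# Iteratively re-weighted least squares for logistic regression (a sketch)
irls.logistic <- function(x, y, max.iter = 25, tolerance = 1e-8) {
  design <- cbind(1, x)                     # column of ones carries beta0
  coefs <- rep(0, ncol(design))             # start from beta0 = 0, beta = 0
  converged <- FALSE
  for (iteration in 1:max.iter) {
    eta <- as.vector(design %*% coefs)      # linear predictor, g(mu)
    mu <- 1 / (1 + exp(-eta))               # mu = g^{-1}(eta)
    g.prime <- 1 / (mu * (1 - mu))          # derivative of the logit at mu
    z <- eta + (y - mu) * g.prime           # effective response, Eq. 12.22
    w <- 1 / (g.prime^2 * mu * (1 - mu))    # inverse of (g'(mu))^2 V(mu)
    new.coefs <- lm.wfit(design, z, w)$coefficients
    converged <- max(abs(new.coefs - coefs)) < tolerance
    coefs <- new.coefs
    if (converged) break
  }
  list(coefficients = as.vector(coefs), iterations = iteration,
       converged = converged)
}

# Simulated data, for illustration only
set.seed(1)
n <- 200
x <- matrix(runif(2 * n, -1, 1), ncol = 2)
p <- 1 / (1 + exp(-(-0.5 - x[, 1] + x[, 2])))
y <- rbinom(n, size = 1, prob = p)

fit.irls <- irls.logistic(x, y)
fit.glm <- glm(y ~ x[, 1] + x[, 2], family = binomial)

# At the fixed point, the derivatives in Eq. 12.12 should all be near zero
mu.hat <- 1 / (1 + exp(-as.vector(cbind(1, x) %*% fit.irls$coefficients)))
score <- colSums((y - mu.hat) * cbind(1, x))
```

Comparing `fit.irls$coefficients` with `coef(fit.glm)` is a quick check that the hand-rolled iteration and R's own fitting routine land on the same estimates.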

Notice that both the weights and $z$ depend on the parameters of our logistic regression, through $\mu$. So having done this once, we should really use the new parameters to update $z$ and the weights, and do it again. Eventually, we come to a fixed point, where the parameter estimates no longer change.

The treatment above is rather heuristic [4], but it turns out to be equivalent to using Newton's method, with the expected second derivative of the log likelihood, instead of its actual value. [5] Since, with a large number of observations, the observed second derivative should be close to the expected second derivative, this is only a small approximation.

[4] That is, mathematically incorrect.

[5] This takes a reasonable amount of algebra to show, so we'll skip it. The key point however is the following. Take a single Bernoulli observation with success probability $p$. The log-likelihood is $Y \log p + (1 - Y)\log(1 - p)$. The first derivative with respect to $p$ is $Y/p - (1 - Y)/(1 - p)$, and the second derivative is $-Y/p^2 - (1 - Y)/(1 - p)^2$. Taking expectations of the second derivative gives $-1/p - 1/(1 - p) = -1/p(1 - p)$. In other words, $V(p) = -1/E[\ell'']$. Using weights inversely proportional to the variance thus turns out to be equivalent to dividing by the expected second derivative.

12.4 Generalized Linear Models and Generalized Additive Models

Logistic regression is part of a broader family of generalized linear models (GLMs), where the conditional distribution of the response falls in some parametric family, and the parameters are set by the linear predictor. Ordinary, least-squares regression is the case where the response is Gaussian, with mean equal to the linear predictor, and constant variance. Logistic regression is the case where the response is binomial, with $n$ equal to the number of data-points with the given $x$ (often but not always 1), and $p$ is given by Equation 12.5. Changing the relationship between the parameters and the linear predictor is called changing the link function. For computational reasons, the link function is actually the function you apply to the mean response to get back the linear predictor, rather than the other way around: (12.4) rather than (12.5). There are thus other forms of binomial regression besides logistic regression. [6] There is also Poisson regression (appropriate when the data are counts without any upper limit), gamma regression, etc.; we will say more about these in Chapter 13.

[6] My experience is that these tend to give similar error rates as classifiers, but have rather different guesses about the underlying probabilities.

In R, any standard GLM can be fit using the (base) glm function, whose syntax is very similar to that of lm. The major wrinkle is that, of course, you need to specify the family of probability distributions to use, by the family option; family=binomial defaults to logistic regression. (See help(glm) for the gory details on how to do, say, probit regression.) All of these are fit by the same sort of numerical likelihood maximization.

One caution about using maximum likelihood to fit logistic regression is that it can seem to work badly when the training data can be linearly separated. The reason is that, to make the likelihood large, $p(x_i)$ should be large when $y_i = 1$, and $p$ should be small when $y_i = 0$. If $\beta_0, \beta$ is a set of parameters which perfectly classifies the training data, then $c\beta_0, c\beta$ is too, for any $c > 1$, but in a logistic regression the second set of parameters will have more extreme probabilities, and so a higher likelihood. For linearly separable data, then, there is no parameter vector which maximizes the likelihood, since $\ell$ can always be increased by making the vector larger but keeping it pointed in the same direction.

You should, of course, be so lucky as to have this problem.

12.4.1 Generalized Additive Models

A natural step beyond generalized linear models is generalized additive models (GAMs), where instead of making the transformed mean response a linear function of the inputs, we make it an additive function of the inputs. This means combining a function for fitting additive models with likelihood maximization. The R function here is gam, from the CRAN package of the same name. (Alternately, use the function gam in the package mgcv, which is part of the default R installation.) We will look at how this works in some detail in Chapter 13.

GAMs can be used to check GLMs in much the same way that smoothers can be used to check parametric regressions: fit a GAM and a GLM to the same data, then simulate from the GLM, and re-fit both models to the simulated data. Repeated many times, this gives a distribution for how much better the GAM will seem to fit than the GLM does, even when the GLM is true. You can then read a p-value off of this distribution.

12.4.2 An Example (Including Model Checking)

Here's a worked R example, using the data from the upper right panel of Figure 12.1. The 50 × 2 matrix x holds the input variables (the coordinates are independently and uniformly distributed on $[-1, 1]$), and y.1 the corresponding class labels, themselves generated from a logistic regression with $\beta_0 = -0.5$, $\beta = (-1, 1)$.

    > logr = glm(y.1 ~ x[,1] + x[,2], family=binomial)
    > logr

    Call:  glm(formula = y.1 ~ x[, 1] + x[, 2], family = binomial)

    Coefficients:
    (Intercept)       x[, 1]       x[, 2]

    Degrees of Freedom: 49 Total (i.e. Null);  47 Residual
    Null Deviance:
    Residual Deviance:          AIC:

    > sum(ifelse(logr$fitted.values<0.5,0,1) != y.1)/length(y.1)
    [1] 0.32

The deviance of a model fitted by maximum likelihood is (twice) the difference between its log likelihood and the maximum log likelihood for a saturated model, i.e., a model with one parameter per observation. Hopefully, the saturated model can give a perfect fit. [7] Here the saturated model would assign probability 1 to the observed outcomes [8], and the logarithm of 1 is zero, so $D = -2\ell(\hat{\beta}_0, \hat{\beta})$. The null deviance is what's achievable by using just a constant bias $\beta_0$ and setting $\beta = 0$. The fitted model definitely improves on that. [9]

[7] The factor of two is so that the deviance will have a $\chi^2$ distribution. Specifically, if the model with $p$ parameters is right, the deviance will have a $\chi^2$ distribution with $n - p$ degrees of freedom.

[8] This is not possible when there are multiple observations with the same input features, but different classes.

[9] AIC is of course the Akaike information criterion, $-2\ell + 2q$, with $q$ being the number of parameters (here, $q = 3$). AIC has some truly devoted adherents, especially among non-statisticians, but I have been deliberately ignoring it and will continue to do so. Basically, to the extent AIC succeeds, it works as a fast, large-sample approximation to doing leave-one-out cross-validation. Claeskens and Hjort (2008) is a thorough, modern treatment of AIC and related model-selection criteria from a statistical viewpoint.

The fitted values of the logistic regression are the class probabilities; this shows that the error rate of the logistic regression, if you force it to predict actual classes, is 32%. This sounds bad, but notice from the contour lines in the figure that lots of the probabilities are near 0.5, meaning that the classes are just genuinely hard to predict.

To see how well the logistic regression assumption holds up, let's compare this to a GAM. [10]

[10] Previous examples of using GAMs have mostly used the mgcv package and spline smoothing. There is no particular reason to switch to the gam library and lowess smoothing here, but there's also no real reason not to.

    > library(gam)
    > gam.1 = gam(y.1~lo(x[,1])+lo(x[,2]),family="binomial")
    > gam.1
    Call:
    gam(formula = y.1 ~ lo(x[, 1]) + lo(x[, 2]), family = "binomial")

    Degrees of Freedom: 49 total;  Residual
    Residual Deviance:

This fits a GAM to the same data, using lowess smoothing of both input variables. Notice that the residual deviance is lower. That is, the GAM fits better. We expect this; the question is whether the difference is significant, or within the range of what we should expect when logistic regression is valid. To test this, we need to simulate from the logistic regression model.

    simulate.from.logr = function(x, coefs) {
      require(faraway) # For accessible logit and inverse-logit functions
      n = nrow(x)
      # Intercept plus the inputs times the remaining coefficients
      linear.part = coefs[1] + x %*% coefs[-1]
      probs = ilogit(linear.part) # Inverse logit
      y = rbinom(n,size=1,prob=probs)
      return(y)
    }

Now we simulate from our fitted model, and re-fit both the logistic regression and the GAM.

    delta.deviance.sim = function (x,logistic.model) {
      y.new = simulate.from.logr(x,logistic.model$coefficients)
      GLM.dev = glm(y.new ~ x[,1] + x[,2], family="binomial")$deviance
      GAM.dev = gam(y.new ~ lo(x[,1]) + lo(x[,2]), family="binomial")$deviance
      return(GLM.dev - GAM.dev)
    }

Notice that in this simulation we are not generating new $X$ values. The logistic regression and the GAM are both models for the response conditional on the inputs, and are agnostic about how the inputs are distributed, or even whether it's meaningful to talk about their distribution.

Finally, we repeat the simulation a bunch of times, and see where the observed difference in deviances falls in the sampling distribution.

    > delta.dev = replicate(1000,delta.deviance.sim(x,logr))
    > delta.dev.observed = logr$deviance - gam.1$deviance # 9.64
    > sum(delta.dev.observed > delta.dev)/1000
    [1]

In other words, the amount by which a GAM fits the data better than logistic regression is pretty near the middle of the null distribution. Since the example data really did come from a logistic regression, this is a relief.

[Figure 12.2: density plot (N = 1000) titled "Amount by which GAM fits better than logistic regression"; the axes are labeled "Sampling distribution under logistic regression" and "Density".]

Figure 12.2: Sampling distribution for the difference in deviance between a GAM and a logistic regression, on data generated from a logistic regression. The observed difference in deviances is shown by the dashed horizontal line.

12.5 Exercises

To think through, not to hand in.

1. A multiclass logistic regression, as in Eq. 12.13, has parameters $\beta_0^{(c)}$ and $\beta^{(c)}$ for each class $c$. Show that we can always get the same predicted probabilities by setting $\beta_0^{(c)} = 0$, $\beta^{(c)} = 0$ for any one class $c$, and adjusting the parameters for the other classes appropriately.

2. Find the first and second derivatives of the log-likelihood for logistic regression with one predictor variable. Explicitly write out the formula for doing one step of Newton's method. Explain how this relates to re-weighted least squares.


University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork

Solutions to Selected Problems In: Pattern Classification by Duda, Hart, Stork Solutios to Selected Problems I: Patter Classificatio by Duda, Hart, Stork Joh L. Weatherwax February 4, 008 Problem Solutios Chapter Bayesia Decisio Theory Problem radomized rules Part a: Let Rx be the

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information
