Part 2: Oneparameter models


 Griffin Preston
 2 years ago
 Views:
Transcription
1 Part 2: Oneparameter models
2 Bernoilli/binomial models Return to iid Y 1,...,Y n Bin(1, θ). The sampling model/likelihood is p(y 1,...,y n θ) =θ P y i (1 θ) n P y i When combined with a prior p(θ), Bayes rule gives the posterior p(θ y 1,...,y n )= θp y i (1 θ) n P yi p(θ) p(y 1,...,y n )
3 Posterior odds ratio If we compare the relative probabilities of any two values, say and, we see that θ θ a θ b p(θ a y 1,...,y n ) p(θ a y 1,...,y n ) P = θ yi a (1 θ a ) n P y i p(θ a )/p(y 1,...,y n ) P θ yi (1 θ b b ) n P y i p(θb )/p(y 1,...,y n ) P y P i n θa 1 yi θa p(θ a ) = θ b 1 θ b p(θ b ) This shows that the probability density at relative that at θ depends on y b 1,...,y n only through yi θ a
4 Sufficient statistics From this it is possible to show that P (θ A Y 1 = y 1,...,Y n = y n ) n = P θ A Y i = i=1 n i=1 y i We interpret this as meaning that yi contains all the information about θ available from the data yi we say that is a sufficient statistic for θ and p(y 1,...,y n θ)
5 The Binomial model When Y 1,...,Y n iid we saw that the Bin(1, θ) n sufficient statistic has a Bin(n, θ) distribution i=1 Y i Having observed {Y = y} our task is to obtain the posterior distribution of θ: p(θ y) = p(y θ)p(θ) p(y) = n y θ y (1 θ) n y p(θ) p(y) = c(y)θ y (1 θ) n y p(θ) where c(y) is a function of y and not θ
6 A uniform prior The parameter 0 and 1 θ is some unknown number between Suppose our prior information for θ is such that all subintervals of [0, 1] having the same length also have the same probability P (a θ b) =P (a + c θ b + c), for 0 a<b<b+ c 1 This condition implies a uniform density for θ: p(θ) = 1 for all θ [0, 1]
7 Normalizing constant Using the following result from calculus 1 θ a 1 (1 θ) b 1 dθ = Γ(a)Γ(b) 0 Γ(a + b) where Γ(n) =(n 1)! is evaluated in R as gamma(n) we can find out what 1 c(y) is under the uniform prior p(θ y) dy =1 0 c(y) = Γ(n + 2) Γ(y + 1)Γ(n y + 1)
8 Beta posterior So the posterior distribution is p(θ y) = = To wrap up: Γ(n + 2) Γ(y + 1)Γ(n y + 1) θy (1 θ) n y Γ(n + 2) Γ(y + 1)Γ(n y + 1) θ(y+1) 1 (1 θ) (n y+1) 1 = Beta(y +1,n y + 1) a uniform prior, plus a Bernoilli/Binomial likelihood (sampling model) gives a Beta posterior
9 Example: Happiness Each female of age 65 or over in the 1998 General Social Survey was asked whether or not the were generally happy Let Y i =1if respondent i reported being generally happy, and Y i =0otherwise, for i =1,...,n= 129 individuals Since 129 N, the total size of the female senior citizen population, our joint beliefs about Y 1,...,Y 129 are well approximated by (the sampling model) our beliefs about θ = N i=1 Y i/n the model that, conditional on θ, the Y i s are IID Bernoilli RVs with expectation θ
10 Example: Happiness IID Bernoulli likelihood This sampling model says that the probability for any potential outcome {y 1,...,y n }, conditional on θ, is given by p(y 1,...,y 129 θ) =θ P 129 i=1 y i (1 θ) 129 P 129 i=1 y i The survey revealed that 118 individuals report being generally happy (91%) 11 individuals do not (9%) So the probability of these data for a given value of p(y 1,...,y 129 θ) =θ 118 (1 θ) 11 θ is
11 Example: Happiness Binomial likelihood We could work with this IID Bernoulli likelihood directly. But for a sense of the bigger picture, we ll use that we know that n Y = Y i Bin(n, θ) i=1 In this case, our observed data is y = 118 So (now) the probability of these data for a given value of θ is 129 p(y θ) = 118 θ 118 (1 θ) 11 Notice that this is within a constant factor of the previous Bernoilli likelihood
12 Example: Happiness Posterior If we take a uniform prior for θ, expressing our ignorance, then we know the posterior is p(θ y) = Beta(y +1,n y + 1) For n = 129 and y = 118, this gives θ {Y = 118} Beta(119, 12) density posterior prior theta
13 Uniform is Beta The uniform prior has p(θ) =1for all θ [0, 1] This can be thought of as a Beta distribution with parameters a =1,b=1 p(θ) = Γ(2) Γ(1)Γ(1) θ1 1 (1 θ) 1 1 = (under the convention that Γ(1) = 1) θ Beta(1, 1) We saw that if Y θ Bin(n, θ) then θ {Y = y} Beta(1 + y, 1+n y)
14 General Beta prior Suppose θ Beta(a, b) and Y θ Bin(n, θ) p(θ y) θ a+y 1 (1 θ) b+n y 1 θ y Beta(a + y, b + n y) In other words, the constant of proportionality must be: c(n, y, a, b) = Γ(a + b + n) Γ(a + y)γ(b + n y) How do I know? p(θ y) (and the beta density) must integrate to 1
15 Posterior proportionality We will use this trick over and over again! I.e., if we recognize that the posterior distribution is proportional to a known probability density, then it must be identical to that density But Careful! The constant of proportionality must be constant with respect to θ
16 Conjugacy We have shown that a beta prior distribution and a binomial sampling model lead to a beta posterior distribution To reflect this, we say that the class of beta priors is conjugate for the binomial sampling model Formally, we say that a class P of prior distributions for θ is conjugate for a sampling model p(y θ) if p(θ) P p(θ y) P Conjugate priors make posterior calculations easy, but might not actually represent our prior information
17 Posterior summaries If we are interested in pointsummaries of our posterior inference, then the (full) distribution gives us many points to choose from E.g., if θ {Y = y} Beta(a + y, b + n y) then E{θ y} = a + y a + b + n mode(θ y) = a + y 1 a + b + n 2 Var[θ y] = E{θ y}e{1 θ y} a + b + n +1 not including quantiles, etc.
18 Combining information The posterior expectation E{θ y} is easily recognized as a combination of prior and data information: E{θ y} = a + b a + b + n prior expectation + n a + b + n data average I.e., for this model and prior distribution, the posterior mean is a weighted average of the prior mean and the sample average with weights proportional to a + b and n respectively
19 Prior sensitivity This leads to the interpretation of a and b as prior data : a prior number of 1 s b prior number of 0 s a + b prior sample size If n a + b, then it seems reasonable that most of our information should be coming from the data as opposed to the prior Indeed: a + b a + b + n 0 E{θ y} y n & Var[θ y] 1 n y n 1 y n
20 Prediction An important feature of Bayesian inference is the existence of a predictive distribution for new observations Revert, for the moment, to our notation for Bernoilli data. Let y 1,...,y n, be the outcomes of a sample of n binary RVs Let Ỹ {0, 1} be an additional outcome from the same population that has yet to be observed The predictive distribution of Ỹ is the conditional distribution of Ỹ given {Y 1 = y 1,...,Y n = y n }
21 Predictive distribution For conditionally IID binary variables the predictive distribution can be derived from the distribution of Ỹ given θ and the posterior distribution of θ p(ỹ =1 y 1,...,y n )=E{θ y 1,...,y n } = a + N i=1 y i a + b + n p(ỹ =0 y 1,...,y n )=1 p(ỹ =1 y 1,...,y n ) b + n N i=1 y i a + b + n
22 Two points about prediction 1. The predictive distribution does not depend upon any unknown quantities if it did, we would not be able to use it to make predictions 2. The predictive distribution depends on our observed data i.e. Ỹ is not independent of Y 1,...,Y n this is because observing Y 1,...,Y n gives information about θỹ, which in turn gives information about
23 Example: Posterior predictive Consider a uniform prior, or Y = n i=1 y i P (Ỹ Beta(1, 1), using =1 Y = y) =E{θ Y = y} = 2 2+n n 2+n y n but mode(θ Y = y) = y n Does this discrepancy between these two posterior summaries of our information make sense? Consider the case in which Y =0, for which mode(θ Y = 0) = 0 but P (Ỹ =1 Y = 0) = 1/(2 + n)
24 Confidence regions An interval [l(y),u(y)], based on the observed data Y = y, has 95% Bayesian coverage for θ if P (l(y) < θ <u(y) Y = y) =0.95 The interpretation of this interval is that it describes your information about the true value of θ after you have observed Y = y Such intervals are typically called credible intervals, to distinguish them from frequentist confidence intervals which is an interval that describes a region (based upon ˆθ ) wherein the true θ 0 lies 95% of the time Both, confusingly, use the acronym CI
25 Quantilebased (Bayesian) CI Perhaps the easiest way to obtain a credible interval is to use the posterior quantiles To make a 100 (1 α) % quantilebased CI, find numbers θ α/2 < θ 1 α/2 such that (1) (2) P (θ < θ α/2 Y = y) =α/2 P (θ > θ 1 α/2 Y = y) =α/2 The numbers θ α/2, θ 1 α/2 are the α/2 and 1 α/2 posterior quantiles of θ
26 Example: Binomial sampling and uniform prior Suppose out of n = 10 conditionally independent draws of a binary random variable we observe Y =2 ones Using a uniform prior distribution for θ, the posterior distribution is θ {Y =2} Beta(1 + 2, 1 + 8) A 95% CI can be obtained from the and quantiles of this beta distribution These quantiles are 0.06 and 0.52, respectively, and so the posterior probability that is 95% θ [0.06, 0.52]
27 Example: a quantilebased CI drawback Notice that there are θvalues outside the quantilebased CI that have higher probability [density] than some points inside the interval density posterior 95% CI theta
28 HPD region The quantilebased drawback suggests a more restrictive type of interval may be warranted A 100 (1 α) % high posterior density (HPD) region consists of a subset of the parameter space such that 1. P (θ s(y) Y = y) =1 α 2. If θ a s(y), and θ b / s(y), then p(θ a Y = y) >p(θ b Y = y) All points in an HPD region have a higher posterior density than points outside the region
29 Constructing a HPD region However, a HPD region might not be an interval if the posterior density is multimodal When it is an interval, the basic idea behind its construction is as follows: 1. Gradually move a horizontal line down across the density, including in the HPD region all of the θvalues having a density above the horizontal line 2. Stop when the posterior probability of the θvalues in the region reaches 1 α
30 Example: HPD region Numerically, we may collect the highest density points whose cumulative density is greater than 1 α density % q CI 50% HPD 75% HPD 95% HPD theta The 95% HPD region is [0.04, 0.048] is narrower than the quantilebased CI, yet both contain 95% probability
31 Example: Estimating the probability of a female birth The proportion of births that are female has long been a topic of interest both scientifically and to the lay public E.g., Laplace, perhaps the first Bayesian, analysed Parisian birth data from 1745 with a uniform prior and a binomial sampling model 241,945 girls, and 251,527 boys
32 Example: female birth posterior The posterior distribution for θ is (n = ) θ {Y = } Beta( , ) Now, we can use the posterior CDF to calculate P (θ > 0.5 Y = ) = pbeta(0.5, , , lower.tail=false) =
33 Example: female birth with placenta previa Placenta previa is an unusual condition of pregnancy in which the placenta is implanted very low in the uterus, obstructing the fetus from a normal vaginal delivery An early study concerning the sex of placenta previa births in Germany found that of a total of 980 births, 437 were female How much evidence does this provide for the claim that the proportion of the female births in the population of placenta previa births is less than 0.485, the proportion of female births in the general population?
34 Example: female birth with placenta previa The posterior distribution is θ {Y = 437} Beta( , ) The summary statistics are E{θ {Y = 437}} =0.446 Var[θ {Y = 437}] =0.016 med(θ {Y = 437}) =0.446 (sd) The central 95% CI is [0.415, 0.476] P (θ > Y = 437) = 0.007
35 The Poisson model Some measurements, such as a person s number of children or number of friends, have values that are whole numbers In these cases the sample space is Y = {0, 1, 2,...} Perhaps the simplest probability model on Poisson model Y is the A RV Y has a Poisson distribution with mean θ if P (Y = y θ) =θ y e θ y! for y {0, 1, 2,...}
36 Mean variance relationship Poisson RVs have the property that E{Y θ} = Var[Y θ] =θ People sometimes say that the Poisson family of distributions has a mean variance relationship because if one Poisson distribution has a larger mean than another, it will have a larger variance as well
37 IID Sampling model If we take is: Y 1,...,Y n iid Pois(θ) then the joint PDF n P (Y 1 = y 1,...,Y n = y n θ) = p(y i θ) = i=1 n i=1 1 y! θy i e nθ = c(y 1,...,y n )θ P y i e nθ
38 Posterior odds ratio Comparing two values of θ a posteriori, we have P p(θ a y 1,...,y n ) p(θ b y 1,...,y n ) = c(y 1,...,y n )e nθa θ yi a p(θ a ) P c(y 1,...,y n )e nθ b θ yi p(θ b b ) = e nθ a e nθ b θa θ b P y i p(θ a) p(θ b ) n As in the case of the IID Bernoulli model, i=1 Y i, is a sufficient statistic Furthermore n i=1 Y i Pois(nθ)
39 Conjugate prior Recall that a class of priors is conjugate for a sampling model if the posterior is also in that class For the Poisson model, our posterior distribution for θ has the following form p(θ y 1,...,y m ) p(y 1,...,y n θ)p(θ) θ P y i e nθ p(θ) This means that whatever our conjugate class of densities is, it will have to include terms like θ c 1 e c 2θ for numbers c 1 and c 2
40 Conjugate gamma prior The simplest class of probability distributions matching this description is the gamma family p(θ) = ba Γ(a) θa 1 e bθ for θ,a,b>0 Some properties include E{θ} = a b Var[θ] = a b 2 mode(θ) = (a 1)/b if a>1 0 if a 1 We shall denote this family as G(a, b)
41 Posterior distribution Suppose Y 1,...Y n θ iid Pois(θ) and θ G(a, b) Then p(θ y 1,...,y n ) θ a+p y i 1 e (b+n)θ This is evidently a gamma distribution, and we have confirmed the conjugacy of the gamma family for the Poisson sampling model θ G(a, b) Y 1,...,Y n Pois(θ) n {θ Y 1,...,Y n } G a + Y i,b+ n i=1
42 Combining information Estimation and prediction proceed in a manner similar to that in the binomial model The posterior expectation of θ is a convex combination of the prior expectation and the sample average b a n yi E{θ y 1,...,y n } = b + n b + b + n n b is the number of prior observations a is the sum of the counts from b prior observations For large n, the information in the data dominates n b E{θ y 1,...y n } ȳ, Var[θ y 1,...,y n ] ȳ n
43 Posterior predictive Predictions about additional data can be obtained with the posterior predictive distribution p(ỹ y 1,...,y n ) (b + n) a+p y i = Γ(ỹ + 1)Γ(a + y i ) (b + n) a+p y i = Γ(ỹ + 1)Γ(a + y i ) 0 θ a+p y i +ỹ 1 e (b n)θ dθ P a+ b + n yi 1 b + n +1 b + n +1 ȳ for ỹ {0, 1, 2,...,}
44 Posterior predictive This is a negative binomial distribution with parameters (a + y i,b+ n) for which E{Ỹ y 1,...,y n } = a + y i b + n = E{θ y 1,...,y n } Var[Ỹ y 1,...,y n ]= a + y i b + n b + n +1 b + n = Var[θ y 1,...,y n ] (b + n + 1) = E{θ y 1,...,y n } b + n +1 b + n
45 Example: birth rates and education Over the course of the 1990s the General Soceital Survey gathered data on the educational attainment and number of children of 155 women who were 40 years of age at the time of their participation in the survey These women were in their 20s in the 1970s, a period of historically low fertility rates in the United States In this example we will compare the women with college degrees to those without in terms of their numbers of children
46 Example: sampling model(s) Let Y 1,1,...,Y n1,1 denote the numbers of children for the n 1 women without college degrees and n 2 Y 1,2,...,Y n2,2 degrees be the data for the women with For this example, we will use the following sampling models Y 1,1,...,Y n1,1 Y 1,2,...,Y n2,2 iid Pois(θ1 ) iid Pois(θ2 ) The (sufficient) data are no bachelors: bachelors: n 1 = 111, n 1 i=1 y i,1 = 217, ȳ 1 =1.95 n 2 = 44, n 2 i=1 y i,2 = 66, ȳ 2 =1.50
47 Example: prior(s) and posterior(s) In the case were {θ 1, θ 2 } iid G(a =1,b= 2) we have the following posterior distributions: θ 1 {n 1 = 111, Y i,1 = 217} G( , ) θ 2 {n 1 = 44, Y i,2 = 66} G(2 + 66, ) Posterior means, modes and 95% quantilebased CIs for θ 1 and θ 2 can be obtained from their gamma posterior distributions
48 Example: prior(s) and posterior(s) density prior post none post bach theta The posterior(s) provide substantial evidence that θ 1 > θ 2. E.g., P (θ 1 > θ 2 Y i,1 = 217, Y i,2 = 66) = 0.97
49 Example: posterior predictive(s) Consider a randomly sampled individual from each population. To what extent do we expect the uneducated one to have more children than the other? probability none bach P (Ỹ1 > Ỹ2 Y i,1 = 217, Y i,2 = 66) = 0.48 P (Ỹ1 = Ỹ2 Y i,1 = 217, Y i,2 = 66) = children The distinction between {θ 1 > θ 2 } and {Ỹ1 > Ỹ2} is important: strong evidence of a difference between two populations does not mean the distance is large
50 Noninformative priors Priors can be difficult to construct There has long been a desire for prior distributions that can be guaranteed to play a minimal role in the posterior distribution Such distributions are sometimes called reference priors, and described as vague, flat, diffuse, or noninformative The rationale for using noninformative priors is often said to be to let the data speak for themselves, so that inferences are unaffected by the information external to the current data
51 Jeffreys invariance principle One approach that is sometimes used to define a noninformative prior distribution was introduced by Jeffreys (1946), based on considering onetoone transformations of the parameter Jeffrey s general principle is that any rule for determining the prior density should yield an equivalent posterior if applied to the transformed parameter Naturally, one must take the sampling model into account in order to study this invariance
52 The Jeffreys prior Jeffreys principle leads to defining the noninformative prior as p(θ) [i(θ)] 1/2, where i(θ) is the Fisher information for θ Recall that i(θ) =E θ d dθ log p(y θ) 2 = E θ d 2 log p(y θ) dθ2 A prior soconstructed is called the Jeffreys prior for the sampling model p(y θ)
53 Example: Jeffreys prior for the binomial sampling model Consider the binomial distribution Y Bin(n, θ) which has loglikelihood log p(y θ) =c + y log θ +(n y) log(1 θ) Therefore i(θ) = E θ d 2 log p(y θ) dθ2 = n θ(1 θ) So the Jeffreys prior density is then p(θ) θ 1/2 (1 θ) 1/2 which is the same as θ Beta( 1 2, 1 2 )
54 Example: noninformative? This θ Beta( 1 2, 1 2 ) Jeffreys prior is noninformative in some sense, but not in all senses The posterior expectation would give sample(s) worth of weight to the prior mean This Jeffreys prior less informative than the uniform prior which gives a weight of 2 samples to the prior mean (also 1) a + b =1 a a+b =1 In this sense, the least informative prior is θ Beta(0, 0) But is this a valid prior distribution?
55 Propriety of priors We call a prior density p(θ) proper if it does not depend on data and is integrable, i.e., Θ p(θ) dθ < Proper priors lead to valid joint probability models p(θ,y) and proper posteriors ( p(θ y) dθ < ) Θ Improper priors do not provide valid joint probability models, but they can lead to proper posterior by proceeding with the Bayesian algebra p(θ y) p(y θ)p(θ)
56 Example: continued The θ Beta(0, 0) 1 prior is improper since 0 θ 1 (1 θ) 1 dθ = However, the Beta(0 + y, 0+n y) posterior that results is proper as long as y = 0,n A θ Beta(, ) prior, for 0 < 1 is always proper, and (in the limit) noninformative In practice, the difference between these alternatives (Beta(0, 0), Beta( 1 2, 1 2 ), Beta(1, 1)) is small; all three allow the data (likelihood) to dominate in the posterior
57 Example: Jeffreys prior for the Poisson sampling model Consider the Poisson distribution has loglikelihood Y Pois(θ) which log p(y θ) =c + y log θ θ Therefore i(θ) = E θ d 2 log p(y θ) dθ2 = 1 θ So the Jeffreys prior density is then p(θ) θ 1/2
58 Example: Improper Jeffreys prior Since 0 θ 1/2 dθ = this prior is improper However, we can interpret is as a G( 1, i.e., within 2, 0) the conjugate gamma family, to see that the posterior is G 1 n 2 + i=1 y i, 0+n This is proper as long as we have observed one data point, i.e., n 1, since 0 θ α e β dθ < for all α > 0, β 1
59 Example: noninformative? Again, this G( 1 2, 0) Jeffreys prior is noninformative in some sense, but not in all senses The (improper) prior p(θ) θ 1, or θ G(0, 0), is in some sense equivalent since it gives the same weight (b = 0) to the prior mean (although different) It can be shown that, in the limit as the prior parameters (a, b) (0, 0) the posterior expectation approaches the MLE p(θ) θ 1 is a typical default for this reason As before, the practical difference between these choices is small
60 Notes of caution The search for noninformative priors has several problems, including it can be misguided: if the likelihood (data) is truly dominant, the the choice among relatively flat priors cannot matter for many problems there is no clear choice difficulties arise when averaging over competing models (Lindley s paradox) Bottom line: If so few data are available that the choice of noninformative prior matters, then one should put relevant information into the prior
61 Example: Heart transplant mortality rate For a particular hospital, we observe n : the number of transplant surgeries y : the number of deaths within 30 days of surgery In addition, one can predict the probability of death for each patient based upon a model that uses information such as the patients medical condition before the surgery, gender, and race. From this, one can obtain e : the expected number of deaths (exposure) A standard model assumes that the number of deaths follows a Poisson distribution with mean eλ, and the objective is to estimate the mortality rate per unit exposure λ
62 Example: problems with the MLE The standard estimate of λ is the MLE ˆλ = y/e. Unfortunately, this estimate can be poor when the number of deaths y is close to zero In this situation, when small death counts are possible, it is desirable to use a Bayesian estimate that uses prior knowledge about the size of the mortality rate
63 Example: prior information A convenient source of prior information is heart transplant data from a small group of hospitals that we believe has the same rate of mortality as the rate from the hospital of interest Suppose we observe z j : the number of deaths o j : the exposure where Z j iid Pois(oj λ) for ten hospitals (j =1,...,10) If we assign λ the noninformative prior then the posterior is p(λ) λ 1 G( 10 j=1 z i, n j=1 o i) (check this)
64 Example: the prior from data Suppose that, in this example, we have the prior data 10 z i = 16, n o i = j=1 j=1 So for our prior (to analyse the data for our hospital) we shall use λ G(16, 15174) Now, we consider inference about the heart transplant death rate for two hospitals A. small exposure (e a = 66) and y a =1death B. larger exposure (e b = 1767) and y b =4deaths
65 Example: posterior The posterior distributions are as follows A. B. λ a y a,e a G(16 + 1, ) λ b y b,e b G(16 + 4, ) density prior post A post B lambda P (λ a < λ b y a,e a,y b,e b )=0.57
Basics of Statistical Machine Learning
CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar
More informationParametric Models Part I: Maximum Likelihood and Bayesian Density Estimation
Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2015 CS 551, Fall 2015
More information1 Prior Probability and Posterior Probability
Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which
More informationCHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
More informationMaximum Likelihood Estimation
Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for
More informationGeneralized Linear Models. Today: definition of GLM, maximum likelihood estimation. Involves choice of a link function (systematic component)
Generalized Linear Models Last time: definition of exponential family, derivation of mean and variance (memorize) Today: definition of GLM, maximum likelihood estimation Include predictors x i through
More informationErrata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page
Errata for ASM Exam C/4 Study Manual (Sixteenth Edition) Sorted by Page 1 Errata and updates for ASM Exam C/Exam 4 Manual (Sixteenth Edition) sorted by page Practice exam 1:9, 1:22, 1:29, 9:5, and 10:8
More informationPrinciple of Data Reduction
Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then
More informationNotes on the Negative Binomial Distribution
Notes on the Negative Binomial Distribution John D. Cook October 28, 2009 Abstract These notes give several properties of the negative binomial distribution. 1. Parameterizations 2. The connection between
More informationMATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 20092016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationBayesian Analysis for the Social Sciences
Bayesian Analysis for the Social Sciences Simon Jackman Stanford University http://jackman.stanford.edu/bass November 9, 2012 Simon Jackman (Stanford) Bayesian Analysis for the Social Sciences November
More informationBasic Bayesian Methods
6 Basic Bayesian Methods Mark E. Glickman and David A. van Dyk Summary In this chapter, we introduce the basics of Bayesian data analysis. The key ingredients to a Bayesian analysis are the likelihood
More informationFor a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )
Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (19031987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 TwoSided Alternatives 9.5 The t Test 9.6
More informationα α λ α = = λ λ α ψ = = α α α λ λ ψ α = + β = > θ θ β > β β θ θ θ β θ β γ θ β = γ θ > β > γ θ β γ = θ β = θ β = θ β = β θ = β β θ = = = β β θ = + α α α α α = = λ λ λ λ λ λ λ = λ λ α α α α λ ψ + α =
More informationModels for Count Data With Overdispersion
Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extrapoisson variation and the negative binomial model, with brief appearances
More information15.0 More Hypothesis Testing
15.0 More Hypothesis Testing 1 Answer Questions Type I and Type II Error Power Calculation Bayesian Hypothesis Testing 15.1 Type I and Type II Error In the philosophy of hypothesis testing, the null hypothesis
More informationMAS2317/3317. Introduction to Bayesian Statistics. More revision material
MAS2317/3317 Introduction to Bayesian Statistics More revision material Dr. Lee Fawcett, 2014 2015 1 Section A style questions 1. Describe briefly the frequency, classical and Bayesian interpretations
More informationBayes and Naïve Bayes. cs534machine Learning
Bayes and aïve Bayes cs534machine Learning Bayes Classifier Generative model learns Prediction is made by and where This is often referred to as the Bayes Classifier, because of the use of the Bayes rule
More informationStatistics 100A Homework 8 Solutions
Part : Chapter 7 Statistics A Homework 8 Solutions Ryan Rosario. A player throws a fair die and simultaneously flips a fair coin. If the coin lands heads, then she wins twice, and if tails, the onehalf
More informationBayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
More informationChapter 3 RANDOM VARIATE GENERATION
Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.
More informationBayesian probability: P. State of the World: X. P(X your information I)
Bayesian probability: P State of the World: X P(X your information I) 1 First example: bag of balls Every probability is conditional to your background knowledge I : P(A I) What is the (your) probability
More informationThe DirichletMultinomial and DirichletCategorical models for Bayesian inference
The DirichletMultinomial and DirichletCategorical models for Bayesian inference Stephen Tu tu.stephenl@gmail.com 1 Introduction This document collects in one place various results for both the Dirichletmultinomial
More informationSummary of Probability
Summary of Probability Mathematical Physics I Rules of Probability The probability of an event is called P(A), which is a positive number less than or equal to 1. The total probability for all possible
More informationThe Exponential Family
The Exponential Family David M. Blei Columbia University November 3, 2015 Definition A probability density in the exponential family has this form where p.x j / D h.x/ expf > t.x/ a./g; (1) is the natural
More informationUNIT I: RANDOM VARIABLES PART A TWO MARKS
UNIT I: RANDOM VARIABLES PART A TWO MARKS 1. Given the probability density function of a continuous random variable X as follows f(x) = 6x (1x) 0
More informationExam C, Fall 2006 PRELIMINARY ANSWER KEY
Exam C, Fall 2006 PRELIMINARY ANSWER KEY Question # Answer Question # Answer 1 E 19 B 2 D 20 D 3 B 21 A 4 C 22 A 5 A 23 E 6 D 24 E 7 B 25 D 8 C 26 A 9 E 27 C 10 D 28 C 11 E 29 C 12 B 30 B 13 C 31 C 14
More informationLab 8: Introduction to WinBUGS
40.656 Lab 8 008 Lab 8: Introduction to WinBUGS Goals:. Introduce the concepts of Bayesian data analysis.. Learn the basic syntax of WinBUGS. 3. Learn the basics of using WinBUGS in a simple example. Next
More informationLinear Classification. Volker Tresp Summer 2015
Linear Classification Volker Tresp Summer 2015 1 Classification Classification is the central task of pattern recognition Sensors supply information about an object: to which class do the object belong
More informationHomework 4  KEY. Jeff Brenion. June 16, 2004. Note: Many problems can be solved in more than one way; we present only a single solution here.
Homework 4  KEY Jeff Brenion June 16, 2004 Note: Many problems can be solved in more than one way; we present only a single solution here. 1 Problem 21 Since there can be anywhere from 0 to 4 aces, the
More informationAggregate Loss Models
Aggregate Loss Models Chapter 9 Stat 477  Loss Models Chapter 9 (Stat 477) Aggregate Loss Models Brian Hartman  BYU 1 / 22 Objectives Objectives Individual risk model Collective risk model Computing
More informationComparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom
Comparison of frequentist and Bayesian inference. Class 20, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1 Learning Goals 1. Be able to explain the difference between the pvalue and a posterior
More informationHypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...
Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................
More informationChapter 4 Statistical Inference in Quality Control and Improvement. Statistical Quality Control (D. C. Montgomery)
Chapter 4 Statistical Inference in Quality Control and Improvement 許 湘 伶 Statistical Quality Control (D. C. Montgomery) Sampling distribution I a random sample of size n: if it is selected so that the
More informationInference of Probability Distributions for Trust and Security applications
Inference of Probability Distributions for Trust and Security applications Vladimiro Sassone Based on joint work with Mogens Nielsen & Catuscia Palamidessi Outline 2 Outline Motivations 2 Outline Motivations
More informationTHE MULTINOMIAL DISTRIBUTION. Throwing Dice and the Multinomial Distribution
THE MULTINOMIAL DISTRIBUTION Discrete distribution  The Outcomes Are Discrete. A generalization of the binomial distribution from only 2 outcomes to k outcomes. Typical Multinomial Outcomes: red A area1
More information4. Introduction to Statistics
Statistics for Engineers 41 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation
More informationHypothesis Testing. 1 Introduction. 2 Hypotheses. 2.1 Null and Alternative Hypotheses. 2.2 Simple vs. Composite. 2.3 OneSided and TwoSided Tests
Hypothesis Testing 1 Introduction This document is a simple tutorial on hypothesis testing. It presents the basic concepts and definitions as well as some frequently asked questions associated with hypothesis
More informationThe Distribution of Aggregate Life Insurance Claims
The Distribution of ggregate Life Insurance laims Tom Edwalds, FS Munich merican Reassurance o. BSTRT This paper demonstrates the calculation of the moments of the distribution of aggregate life insurance
More informationDongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
More informationIntroduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.
Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.
More informationTime Series Analysis
Time Series Analysis hm@imm.dtu.dk Informatics and Mathematical Modelling Technical University of Denmark DK2800 Kgs. Lyngby 1 Outline of the lecture Identification of univariate time series models, cont.:
More informationSufficient Statistics and Exponential Family. 1 Statistics and Sufficient Statistics. Math 541: Statistical Theory II. Lecturer: Songfeng Zheng
Math 541: Statistical Theory II Lecturer: Songfeng Zheng Sufficient Statistics and Exponential Family 1 Statistics and Sufficient Statistics Suppose we have a random sample X 1,, X n taken from a distribution
More informationChapter 4 Lecture Notes
Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a realvalued function defined on the sample space of some experiment. For instance,
More informationL10: Probability, statistics, and estimation theory
L10: Probability, statistics, and estimation theory Review of probability theory Bayes theorem Statistics and the Normal distribution Least Squares Error estimation Maximum Likelihood estimation Bayesian
More informationProbabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
More informationProbability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur
Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special DistributionsVI Today, I am going to introduce
More informationCHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is.
Some Continuous Probability Distributions CHAPTER 6: Continuous Uniform Distribution: 6. Definition: The density function of the continuous random variable X on the interval [A, B] is B A A x B f(x; A,
More informationJoint Exam 1/P Sample Exam 1
Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question
More informationMath Review. for the Quantitative Reasoning Measure of the GRE revised General Test
Math Review for the Quantitative Reasoning Measure of the GRE revised General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important
More informationA crash course in probability and Naïve Bayes classification
Probability theory A crash course in probability and Naïve Bayes classification Chapter 9 Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s
More informationTest Volume 14, Number 2. December 2005
Sociedad de Estadística e Investigación Operativa Test Volume 14, Number 2. December 2005 Intrinsic Credible Regions: An Objective Bayesian Approach to Interval Estimation José M. Bernardo Departamento
More informationBayesian logistic betting strategy against probability forecasting. Akimichi Takemura, Univ. Tokyo. November 12, 2012
Bayesian logistic betting strategy against probability forecasting Akimichi Takemura, Univ. Tokyo (joint with Masayuki Kumon, Jing Li and Kei Takeuchi) November 12, 2012 arxiv:1204.3496. To appear in Stochastic
More informationPattern Analysis. Logistic Regression. 12. Mai 2009. Joachim Hornegger. Chair of Pattern Recognition Erlangen University
Pattern Analysis Logistic Regression 12. Mai 2009 Joachim Hornegger Chair of Pattern Recognition Erlangen University Pattern Analysis 2 / 43 1 Logistic Regression Posteriors and the Logistic Function Decision
More informationReject Inference in Credit Scoring. JieMen Mok
Reject Inference in Credit Scoring JieMen Mok BMI paper January 2009 ii Preface In the Master programme of Business Mathematics and Informatics (BMI), it is required to perform research on a business
More informationUniversity of California, Berkeley, Statistics 134: Concepts of Probability
University of California, Berkeley, Statistics 134: Concepts of Probability Michael Lugo, Spring 211 Exam 2 solutions 1. A fair twentysided die has its faces labeled 1, 2, 3,..., 2. The die is rolled
More information38. STATISTICS. 38. Statistics 1. 38.1. Fundamental concepts
38. Statistics 1 38. STATISTICS Revised September 2013 by G. Cowan (RHUL). This chapter gives an overview of statistical methods used in highenergy physics. In statistics, we are interested in using a
More information0 x = 0.30 x = 1.10 x = 3.05 x = 4.15 x = 6 0.4 x = 12. f(x) =
. A mailorder computer business has si telephone lines. Let X denote the number of lines in use at a specified time. Suppose the pmf of X is as given in the accompanying table. 0 2 3 4 5 6 p(.0.5.20.25.20.06.04
More information1 Sufficient statistics
1 Sufficient statistics A statistic is a function T = rx 1, X 2,, X n of the random sample X 1, X 2,, X n. Examples are X n = 1 n s 2 = = X i, 1 n 1 the sample mean X i X n 2, the sample variance T 1 =
More informationStatistics 104: Section 6!
Page 1 Statistics 104: Section 6! TF: Deirdre (say: Deardra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm3pm in SC 109, Thursday 5pm6pm in SC 705 Office Hours: Thursday 6pm7pm SC
More informationDetection of changes in variance using binary segmentation and optimal partitioning
Detection of changes in variance using binary segmentation and optimal partitioning Christian Rohrbeck Abstract This work explores the performance of binary segmentation and optimal partitioning in the
More informationCONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationWorked examples Multiple Random Variables
Worked eamples Multiple Random Variables Eample Let X and Y be random variables that take on values from the set,, } (a) Find a joint probability mass assignment for which X and Y are independent, and
More informationSTT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables
Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random
More informationPractice problems for Homework 11  Point Estimation
Practice problems for Homework 11  Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:
More information6. Jointly Distributed Random Variables
6. Jointly Distributed Random Variables We are often interested in the relationship between two or more random variables. Example: A randomly chosen person may be a smoker and/or may get cancer. Definition.
More informationSOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS
SOCIETY OF ACTUARIES/CASUALTY ACTUARIAL SOCIETY EXAM C CONSTRUCTION AND EVALUATION OF ACTUARIAL MODELS EXAM C SAMPLE QUESTIONS Copyright 005 by the Society of Actuaries and the Casualty Actuarial Society
More informationThe Two Envelopes Problem
1 The Two Envelopes Problem Rich Turner and Tom Quilter The Two Envelopes Problem, like its better known cousin, the Monty Hall problem, is seemingly paradoxical if you are not careful with your analysis.
More informationA Predictive Probability Design Software for Phase II Cancer Clinical Trials Version 1.0.0
A Predictive Probability Design Software for Phase II Cancer Clinical Trials Version 1.0.0 Nan Chen Diane Liu J. Jack Lee University of Texas M.D. Anderson Cancer Center March 23 2010 1. Calculation Method
More informationLecture 14. Chapter 7: Probability. Rule 1: Rule 2: Rule 3: Nancy Pfenning Stats 1000
Lecture 4 Nancy Pfenning Stats 000 Chapter 7: Probability Last time we established some basic definitions and rules of probability: Rule : P (A C ) = P (A). Rule 2: In general, the probability of one event
More informationChapter 2: Data quantifiers: sample mean, sample variance, sample standard deviation Quartiles, percentiles, median, interquartile range Dot diagrams
Review for Final Chapter 2: Data quantifiers: sample mean, sample variance, sample standard deviation Quartiles, percentiles, median, interquartile range Dot diagrams Histogram Boxplots Chapter 3: Set
More informationMINITAB ASSISTANT WHITE PAPER
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationConstrained Bayes and Empirical Bayes Estimator Applications in Insurance Pricing
Communications for Statistical Applications and Methods 2013, Vol 20, No 4, 321 327 DOI: http://dxdoiorg/105351/csam2013204321 Constrained Bayes and Empirical Bayes Estimator Applications in Insurance
More informationAn introduction to the BetaBinomial model
An introduction to the BetaBinomial model COMPSCI 316: Computational Cognitive Science Dan Navarro & Amy Perfors University of Adelaide Abstract This note provides a detailed discussion of the betabinomial
More informationOverview of Monte Carlo Simulation, Probability Review and Introduction to Matlab
Monte Carlo Simulation: IEOR E4703 Fall 2004 c 2004 by Martin Haugh Overview of Monte Carlo Simulation, Probability Review and Introduction to Matlab 1 Overview of Monte Carlo Simulation 1.1 Why use simulation?
More informationSTATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI
STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members
More informationBayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom
1 Learning Goals Bayesian Updating with Discrete Priors Class 11, 18.05, Spring 2014 Jeremy Orloff and Jonathan Bloom 1. Be able to apply Bayes theorem to compute probabilities. 2. Be able to identify
More informationPoisson Models for Count Data
Chapter 4 Poisson Models for Count Data In this chapter we study loglinear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the
More informationDepartment of Mathematics, Indian Institute of Technology, Kharagpur Assignment 23, Probability and Statistics, March 2015. Due:March 25, 2015.
Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 3, Probability and Statistics, March 05. Due:March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 14)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 14) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationReview of Random Variables
Chapter 1 Review of Random Variables Updated: January 16, 2015 This chapter reviews basic probability concepts that are necessary for the modeling and statistical analysis of financial data. 1.1 Random
More information7.1 The Hazard and Survival Functions
Chapter 7 Survival Models Our final chapter concerns models for the analysis of data which have three main characteristics: (1) the dependent variable or response is the waiting time until the occurrence
More informationMaster s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
More informationThe Exponential Distribution
21 The Exponential Distribution From DiscreteTime to ContinuousTime: In Chapter 6 of the text we will be considering Markov processes in continuous time. In a sense, we already have a very good understanding
More informationLecture notes for Choice Under Uncertainty
Lecture notes for Choice Under Uncertainty 1. Introduction In this lecture we examine the theory of decisionmaking under uncertainty and its application to the demand for insurance. The undergraduate
More information2DI36 Statistics. 2DI36 Part II (Chapter 7 of MR)
2DI36 Statistics 2DI36 Part II (Chapter 7 of MR) What Have we Done so Far? Last time we introduced the concept of a dataset and seen how we can represent it in various ways But, how did this dataset came
More information, for x = 0, 1, 2, 3,... (4.1) (1 + 1/n) n = 2.71828... b x /x! = e b, x=0
Chapter 4 The Poisson Distribution 4.1 The Fish Distribution? The Poisson distribution is named after SimeonDenis Poisson (1781 1840). In addition, poisson is French for fish. In this chapter we will
More informationWhat is Statistics? Lecture 1. Introduction and probability review. Idea of parametric inference
0. 1. Introduction and probability review 1.1. What is Statistics? What is Statistics? Lecture 1. Introduction and probability review There are many definitions: I will use A set of principle and procedures
More information7 Hypothesis testing  one sample tests
7 Hypothesis testing  one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X
More informationLecture 2 ESTIMATING THE SURVIVAL FUNCTION. Onesample nonparametric methods
Lecture 2 ESTIMATING THE SURVIVAL FUNCTION Onesample nonparametric methods There are commonly three methods for estimating a survivorship function S(t) = P (T > t) without resorting to parametric models:
More informationThe Delta Method and Applications
Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and
More informationExperimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test
Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely
More informationLikelihood Approaches for Trial Designs in Early Phase Oncology
Likelihood Approaches for Trial Designs in Early Phase Oncology Clinical Trials Elizabeth GarrettMayer, PhD Cody Chiuzan, PhD Hollings Cancer Center Department of Public Health Sciences Medical University
More informationSTAT 315: HOW TO CHOOSE A DISTRIBUTION FOR A RANDOM VARIABLE
STAT 315: HOW TO CHOOSE A DISTRIBUTION FOR A RANDOM VARIABLE TROY BUTLER 1. Random variables and distributions We are often presented with descriptions of problems involving some level of uncertainty about
More informationUsing the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, and Discrete Changes
Using the Delta Method to Construct Confidence Intervals for Predicted Probabilities, Rates, Discrete Changes JunXuJ.ScottLong Indiana University August 22, 2005 The paper provides technical details on
More information1.1 Introduction, and Review of Probability Theory... 3. 1.1.1 Random Variable, Range, Types of Random Variables... 3. 1.1.2 CDF, PDF, Quantiles...
MATH4427 Notebook 1 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 20092016 by Jenny A. Baglivo. All Rights Reserved. Contents 1 MATH4427 Notebook 1 3 1.1 Introduction, and Review of Probability
More informationChoice under Uncertainty
Choice under Uncertainty Part 1: Expected Utility Function, Attitudes towards Risk, Demand for Insurance Slide 1 Choice under Uncertainty We ll analyze the underlying assumptions of expected utility theory
More informationMultivariate Logistic Regression
1 Multivariate Logistic Regression As in univariate logistic regression, let π(x) represent the probability of an event that depends on p covariates or independent variables. Then, using an inv.logit formulation
More information