Repeated sampling in Successive Survey


Statistics institution
Repeated sampling in Successive Survey (RSSS)
Xiaolu Cao
15-credit thesis in Statistics III (15-högskolepoängsuppsats inom Statistik III), HT2011
Supervisor: Mikael Möller

Abstract: In this thesis, the idea of sampling design for successive surveys is investigated. A good sampling design can influence the result of a survey and can decrease the bias of the survey estimates in practice. The author compares a new sampling design with a classical sampling design. The thesis is concerned with using repeated sampling within the realm of stratified random sampling. It presents the theory and shows how the theory is applied. The data for the application are generated by a Monte-Carlo method. The simulation results show that using repeated sampling to estimate the population mean gives higher precision and lower variance than using a fixed sample.

Thanks: I would like to express my gratitude to Mikael Möller for his support and guidance during the writing of this thesis.

Contents
1 Introduction
1.1 Background
1.2 Related work
1.3 Data set
2 Theoretical Framework
2.1 Notations and concept of repeated sampling
2.2 Repeated sampling
2.3 Simple random sampling
2.4 Repeated sampling within realm of simple random sampling
2.5 Repeated sampling within realm of stratified random sampling
2.6 Estimation of sample size
3 Experimental results
3.1 Implementation issues
3.2 Experimental result for data set 1
3.2.1 Sample size
3.2.2 Comparison for fixed sample and repeated sample
3.2.3 Estimated variance
3.3 Optimality
3.4 Experimental result for data set 2
3.4.1 Experiment result for testing scenario
3.4.2 Sample size
3.4.3 Comparison for fixed sample and repeated sample
3.4.4 Estimate variance
3.5 Summary of experiment
4 Conclusion and discussion
References

1 Introduction

1.1 Background

A successive survey is a survey carried out on repeated occasions. Many fields use survey methods, e.g. sociology, economics and medicine. The objective is to estimate characteristics of a population on repeated occasions in order to measure time trends as well as current values of the characteristics (see [Rao and Graham]). An example is the estimation of the yearly average income of people living in Stockholm county. When changes in time-dependent population values are to be examined, several sampling alternatives, as listed by Yates (1960), can be used. These involve:
(i) a new sample on each occasion;
(ii) a fixed sample used on all occasions;
(iii) a subsample of the original sample on the second occasion;
(iv) a partial replacement of units from occasion to occasion.

In this thesis we concentrate on comparing (ii), a fixed sample used on all occasions, and (iv), a partial replacement of units from occasion to occasion (see [Manoussakis]). A partial replacement of units from occasion to occasion is usually known as repeated sampling; i.e. part of the sample is replaced at equidistant times. During this process, on the $h$th occasion we may have parts of the sample that are matched with the $(h-1)$th occasion, parts that are matched with both the $(h-1)$th and the $(h-2)$th occasion, and so on (see [Patterson]). Finding the optimal replacement strategy for two or more occasions is very important. The thesis compares the conventional method (ii) with repeated sampling within the realm of stratified random sampling.

Stratified sampling divides the population $\Omega_N$ into $K$ homogeneous subpopulations of sizes $N_1, N_2, \ldots, N_K$. Each subpopulation is called a stratum. A sample is drawn from each stratum, and the drawings must be independent in different strata. If the process of drawing a sample is random, the whole procedure is called stratified random sampling. Stratified sampling is a common technique and works well for populations with a variety of attributes. The principal reasons are (see [Cochran]):
1. If data of known precision are wanted for certain subdivisions of the population, it is advisable to treat each subdivision as a population in its own right.
2. Administrative convenience may dictate the use of stratification; for example, the agency conducting the survey may have field offices, each of which can supervise the survey for a part of the population.
3. Sampling problems may differ markedly in different parts of the population. With human populations, people living in institutions (e.g., hotels, hospitals, prisons) are often placed in a different stratum from people living in ordinary homes because a different approach to the sampling is appropriate for the two situations. In sampling businesses we may possess a list of the large firms, which are placed in a separate stratum. Some type of area sampling may have to be used for the smaller firms.
4. Stratification may produce a gain in precision in the estimates of characteristics of the whole population. If it is possible to divide a heterogeneous population into subpopulations, each of which is internally homogeneous, stratification may produce a gain in precision in the estimates of the characteristics. This is suggested by the name stratum, with its implication of a division into layers. If each stratum is homogeneous, in that the measurements vary little from one unit to another, a precise estimate of any stratum mean can be obtained from a small sample in that stratum. These estimates can then be combined into a precise estimate for the whole population.

For the reasons above, this thesis focuses on using repeated sampling within the realm of stratified random sampling. The improvement from using a repeated sample is investigated by both theoretical derivation and data simulation.

1.2 Related work

Random sampling, stratified sampling, cluster sampling, two-stage stratified sampling, etc. usually use fixed samples. In these cases, every occasion uses the same units. However, measuring the same units on successive occasions might not be a wise choice, because using the same fixed sample may negatively influence the investigated units (e.g. people), and this might influence the final result. This motivates methods that draw a new sample on each occasion. On the other hand, when a new sample is used on each occasion, the continuity of statistical features between the current sample and the new sample cannot be guaranteed, and new samples cost extra. For these reasons the repeated sampling technique was invented. This technique consists of partial replacement of units from occasion to occasion: on the $(h+1)$th occasion, part of the units is replaced. The process of repeated sampling may thus be summarized as follows: the sample for the $(h+1)$th occasion includes some units which match with the $h$th occasion, parts which are matched with both the $h$th and the $(h-1)$th occasion, and so on.

The method of repeated sampling within the realm of simple random sampling is mature. But there are only a few papers about the use of repeated sampling within the realm of more complex sampling, e.g. stratified random sampling, stratified cluster sampling, two-stage stratified sampling, etc. Stratified sampling is a method of sampling from a heterogeneous population. Stratification is the process of dividing this population into homogeneous strata before sampling. Here stratified sampling comprises stratified random sampling, stratified cluster sampling and two-stage stratified sampling. In this thesis we focus on repeated sampling within the realm of stratified random sampling.

1.3 Data set

Since real data were not available we decided to use the Monte-Carlo method to generate datasets. Hence we first simulate our population, from which we then take a sample. Examples of characteristics are the income of all workers at Stockholm University during 2007, 2008, 2009, etc., or the BMI (body mass index) of all people living in Stockholm county during 2006, 2007, 2008, etc. We simulate income data since individual incomes cannot be obtained from public official statistics.

The expression "Monte-Carlo method" is actually very general. Monte-Carlo methods, also called MC methods, are stochastic techniques based on pseudo-random generators and probability theory. Monte Carlo methods are used in many different areas, from economics to nuclear physics to regulating traffic flow. Of course the way they are applied varies widely from field to field, and there are dozens of subsets of Monte Carlo methods even within chemistry. But, strictly speaking, to call something a "Monte Carlo" experiment, all you need to do is to use random numbers when examining your problem (see [Woller]).

Generally, we assume that the data follow a Gaussian distribution. Thus, we use the function normrnd in MATLAB to generate our dataset. From Statistics Sweden we obtain the average income for males and females living in Stockholm county.

Table: Average income (tkr), by gender (male, female).

With the aid of the MATLAB function normrnd(µ, σ, N), we generate N Gaussian random numbers with mean µ and standard deviation σ. In our case, we generate different datasets by using the given mean values (as shown in the table Average income) and varying the standard deviation. Common standard deviation values are in the range of 10 to 50.
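The data-generation step can be sketched in a few lines. The following is a minimal Python sketch, not the thesis's MATLAB code: `random.gauss` stands in for `normrnd`, and the function name `simulate_population`, the means, the standard deviation and the population sizes are hypothetical placeholders rather than the Statistics Sweden figures.

```python
import random
import statistics


def simulate_population(mean, sd, size, seed=None):
    """Draw `size` Gaussian incomes with the given mean and standard deviation."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(size)]


# Hypothetical parameters (tkr); the real stratum means are not reproduced here.
males = simulate_population(mean=300.0, sd=30.0, size=10_000, seed=1)
females = simulate_population(mean=250.0, sd=30.0, size=10_000, seed=2)

# Population mean and standard deviation of one simulated stratum
print(statistics.fmean(males), statistics.pstdev(males))
```

Fixing the seed makes a simulated population reproducible, which helps when the fixed and repeated designs are compared on the same data.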

2 Theoretical Framework

2.1 Notations and concept of repeated sampling

In the sequel we are interested in measuring the individual income in Stockholm county. More exactly, we want to estimate the mean income and its standard deviation for two different sampling schemes. We introduce the following notation:

$Y_j(h)$ = income of individual $j$ at occasion $h$ in Stockholm county, where $j \in \{1, 2, \ldots, N\}$.

The size of the population $N$ is the same at every occasion $h$. Let

$$\Omega = \{Y_1(h), Y_2(h), \ldots, Y_N(h)\}$$

$$\mu(h) = \bar{Y}(h) = \frac{1}{N}\sum_{j=1}^{N} Y_j(h)$$

$$\sigma^2(h) = \frac{1}{N-1}\sum_{j=1}^{N}\left(Y_j(h) - \bar{Y}(h)\right)^2$$

Here occasion $h$ stands for one of the years investigated ($h$ = 2006, 2007, 2008, 2009). From here on we use small $y$ to denote a sample value, and our samples are of size $n$. For partial replacement of units, we use $m$ to denote the kept part and $u$ to denote the replacement part. $\hat\mu_m(h)$ is the estimate of the population mean that uses the kept part of the original sample at the $h$th occasion, and $\hat\mu_{u+m}(h)$ is the estimate of the population mean that uses repeated sampling at the $h$th occasion.

2.2 Repeated sampling

In a stratified random sampling process, we do simple random sampling in each stratum. Using repeated sampling within the realm of stratified random sampling means using repeated sampling in each stratum. Repeated sampling is a process that changes the sample from occasion to occasion.

Figure 1: Repeated sampling (dropped units, kept units, new units).
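One rotation step, in which some units are dropped, the rest are kept and new units are drawn, can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's procedure: the function name `rotate_sample` and the unit labels are hypothetical, and the new units are assumed to be drawn at random from units that were not in the previous sample.

```python
import random


def rotate_sample(population_ids, previous_sample, u, rng):
    """Drop u units from the previous sample and draw u new ones.

    Assumption: new units come from outside the previous sample.
    Returns (kept, new); together they form the current sample.
    """
    if not 0 <= u <= len(previous_sample):
        raise ValueError("u must lie between 0 and the sample size")
    dropped = set(rng.sample(previous_sample, u))
    kept = [unit for unit in previous_sample if unit not in dropped]
    previous = set(previous_sample)
    candidates = [unit for unit in population_ids if unit not in previous]
    new = rng.sample(candidates, u)
    return kept, new


rng = random.Random(0)
population_ids = list(range(1000))             # hypothetical unit labels
sample_prev = rng.sample(population_ids, 50)   # sample at occasion h-1
kept, new = rotate_sample(population_ids, sample_prev, u=35, rng=rng)
sample_now = kept + new                        # sample at occasion h, same size
```

The sample size stays fixed at every occasion because exactly as many units are drawn as are dropped.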

We write $(h-1)$th occasion instead of previous occasion and $h$th occasion instead of current occasion. On the $h$th occasion, we keep the units in the non-replacement part of the sample used at the $(h-1)$th occasion and drop the rest. We draw new observations to replace the dropped ones. This process is called repeated sampling and is described by Figure 1.

2.3 Simple random sampling

The idea behind simple random sampling is that the values of the sample characteristics are approximately the same as the corresponding values in the population. If we use a fixed sample for a survey, the accuracy of the measurements in this sample will decrease over time due to population changes. It is believed that a repeated sample is better than a fixed sample. In repeated sampling, we need to take into account the correlation between the current occasion $h$ and the previous occasion $h-1$. We use the correlation coefficient $\rho$ to denote this relationship. Generally, the value of $\rho$ is between 0 and 1. When $\rho = 1$, the previous sample is not good for estimation of the current population characteristics, and the proportion of replacement is 100%. Conversely, if $\rho = 0$, the data from the current occasion have no relationship with the previous occasion. The sample is nevertheless useful for estimating the population characteristics at the current occasion, so we need not change any units in the previous sample, and the proportion of replacement is 0%.

In the real world, we normally use the standard deviation $\sigma(\bar y)$ to measure how close the estimate $\bar y$ of a sample is to $\mu$ of the population. The bigger the standard deviation $\sigma(\bar y)$, the flatter the distribution curve and the more spread out the data. In other words, the estimate of the mean has low precision. Conversely, if $\sigma(\bar y)$ is small, the distribution curve is peaked and all data are close to the mean. Hence, the estimate of the mean has high precision. In simple random sampling, the standard deviation of the mean $\bar y$ is, in sampling from
1. an infinite population: $\sigma(\bar y) = \dfrac{\sigma}{\sqrt{n}}$
2. a finite population: $\sigma(\bar y) = \dfrac{\sigma}{\sqrt{n}}\sqrt{1 - \dfrac{n}{N}}$

From the formulas above we see that the standard deviation, the sample size and the population size affect the standard deviation of the mean $\bar y$. But these formulas are not good enough to calculate the standard deviation of the mean $\bar y$ in repeated sampling. In repeated sampling, the sample size and the population size are the same at all occasions, but the sample is not the same from occasion to occasion. The correlation between the two occasions has an effect on the sample estimate, and hence also on the sample characteristics for the current occasion. Further, the proportion of replacement also affects the characteristics of the sample. The correlation helps us to decide how many units to replace to get an estimate of highest precision. If we want to estimate the characteristics of a population well, we need a sample that resembles the population. These are the reasons why we should add the correlation coefficient $\rho$ and the optimal proportion of replacement $p$ to the formulas above.
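The two standard-deviation formulas for the sample mean can be evaluated directly. The sketch below is only an illustration; the function name `se_mean` and the numbers are hypothetical.

```python
import math


def se_mean(sigma, n, N=None):
    """Standard deviation of the sample mean under simple random sampling.

    With N=None the population is treated as infinite; otherwise the
    finite population correction sqrt(1 - n/N) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt(1.0 - n / N)
    return se


se_infinite = se_mean(sigma=30.0, n=100)        # sigma / sqrt(n)
se_finite = se_mean(sigma=30.0, n=100, N=400)   # with finite population correction
```

The correction matters only when the sample is a noticeable fraction of the population; for $n/N$ close to zero the two values coincide.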

2.4 Repeated sampling within realm of simple random sampling

Firstly, consider a survey on only two successive occasions, $h$ and $h-1$. The sample size $n$ is the same on all occasions and the measurement variable is denoted by $y$. So $y(h)$ is the variable at the current occasion and $y(h-1)$ the variable at the previous occasion. Let $m$ be the number of kept units ($m \le n$) and $u$ the number of dropped units ($u = n - m$). Then we have the following sampling schemes for occasion $h$ given occasion $h-1$ (where we write $r \mid s$ for measures at time $r$ given the sample at time $s$).

1. All units are newly sampled ($h$) and their values are measured at the current occasion ($h$). In this case

$$\hat\mu(h \mid h) = \bar y(h \mid h), \qquad V(\hat\mu(h \mid h)) = \frac{s^2(h \mid h)}{n}$$

In the sequel we write $\bar y(h)$ and $s^2(h)$ when the sample time is the same as the measure time.

2. All units are from the previous occasion ($h-1$) but their values are measured at the current occasion ($h$).

$$\hat\mu(h \mid h-1) = \bar y(h \mid h-1), \qquad V(\hat\mu(h \mid h-1)) = \frac{s^2(h \mid h-1)}{n}$$

3. The sample is regarded as a population in its own right and we take a subsample, of size $m$, of it. The sample at the current occasion is a subsample of the original sample at the previous occasion, which means $m < n$. The linear regression estimate is designed to increase precision by use of an auxiliary variate $Y_j(h-1)$ that is correlated with $Y_j(h)$. When the relation between $Y_j(h)$ and $Y_j(h-1)$ is examined, it may be found that although the relation is approximately linear, the line does not go through the origin. This suggests an estimate based on the linear regression of $y_j(h)$ on $y_j(h-1)$ rather than on the ratio of the two variables. We suppose that values $y_j(h)$ and $y_j(h-1)$ are each obtained for every unit in the sample and that the population mean $\mu(h-1)$ of $\{Y_j(h-1)\}$ is known. The linear regression estimate of $\mu(h)$, the population mean of $\{Y_j(h)\}$, is, for the kept part and some $\beta$ (to be decided) (see [Cochran]),

$$\hat\mu_m(h \mid h-1) = \bar y_m(h \mid h-1) + \beta\left[\bar y(h-1) - \bar y_m(h-1)\right] \tag{1}$$

Here $\hat\mu_m(h \mid h-1)$ is the estimate of $\mu(h)$ given by the kept part, at time $h$, with the sample at time $h-1$. We let $f = m/n$; the estimate of the variance is

$$V_{\min}(\hat\mu_m(h \mid h-1)) = \frac{1-f}{m}\cdot\frac{1}{n-1}\sum_{j=1}^{n}\left[\left(y_j(h) - \bar y_m(h \mid h-1)\right) - \beta\left(y_j(h-1) - \bar y(h-1)\right)\right]^2 = \frac{1-f}{m}\left[s^2(h) - 2\beta\, s(h, h-1) + \beta^2 s^2(h-1)\right] \tag{2}$$

where $s(h, h-1)$ is the covariance between the $h$th and the $(h-1)$th occasion. Here, the regression coefficient is estimated as

$$\hat\beta = \frac{\sum_{j=1}^{n}\left(y_j(h) - \bar y_m(h \mid h-1)\right)\left(y_j(h-1) - \bar y(h-1)\right)}{\sum_{j=1}^{n}\left(y_j(h-1) - \bar y(h-1)\right)^2} = \frac{s(h, h-1)}{s^2(h-1)} \tag{3}$$

$\rho$ is the population correlation coefficient between the $h$th and the $(h-1)$th occasion, and is estimated as

$$\hat\rho = \frac{s(h, h-1)}{s(h)\, s(h-1)} \tag{4}$$

where $\hat\rho$ depends on the occasions $h$ and $h-1$. We combine equations (2), (3) and (4) and get

$$V_{\min}(\hat\mu_m(h \mid h-1)) = \frac{s^2(h)\left(1 - \hat\rho^2\right)}{m} + \frac{\hat\rho^2 s^2(h)}{n} \tag{5}$$

4. The corresponding calculations for the replacement part are as follows. All observations in the replacement part at the current occasion are new and have no correlation with observations at the previous occasion. So the sample mean is

$$\hat\mu_u(h) = \bar y_u(h) = \frac{1}{u}\sum_{j=1}^{u} y_j(h)$$

and

$$V(\hat\mu_u(h)) = \frac{s^2(h)}{u}$$

5. From 3 we get $\hat\mu_m(h \mid h-1)$ and from 4 we get $\hat\mu_u(h)$. Then, at time $h$, all units are measured and combined into the estimate

$$\hat\mu_{u+m}(h) = \varphi\, \hat\mu_u(h) + (1 - \varphi)\, \hat\mu_m(h \mid h-1) \tag{6}$$
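The regression estimate (1) for the kept part can be illustrated with a small sketch. It makes one simplifying assumption that is not spelled out in the thesis: $\beta$ is estimated from the $m$ kept units only, since those are the units observed on both occasions. The function name `regression_estimate` and the toy data are hypothetical.

```python
def regression_estimate(y_now_kept, y_prev_kept, prev_mean_full):
    """Kept-part estimate of the current mean, in the spirit of equation (1).

    y_now_kept     : values of the kept units at occasion h
    y_prev_kept    : values of the same units at occasion h-1
    prev_mean_full : mean of the full sample at occasion h-1
    Returns (estimate, beta), with beta estimated from the kept units only.
    """
    m = len(y_now_kept)
    mean_now = sum(y_now_kept) / m
    mean_prev = sum(y_prev_kept) / m
    cross = sum((a - mean_now) * (b - mean_prev)
                for a, b in zip(y_now_kept, y_prev_kept))
    spread = sum((b - mean_prev) ** 2 for b in y_prev_kept)
    beta = cross / spread
    return mean_now + beta * (prev_mean_full - mean_prev), beta


# Toy data: every kept unit gained exactly 1 between the two occasions.
estimate, beta = regression_estimate(
    y_now_kept=[11.0, 13.0, 15.0, 17.0],
    y_prev_kept=[10.0, 12.0, 14.0, 16.0],
    prev_mean_full=14.0,
)
```

In the toy data the kept units had a lower previous mean (13) than the full previous sample (14), so the regression step shifts the raw kept-part mean upward by $\beta$ times that difference.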

In (6), $\varphi$ is a constant weight factor; observe that $\bar y_u(h \mid h)$ is independent of $\hat\mu_m(h \mid h-1)$. In the following equations, $g(h)$ is short for $g(h \mid h)$. Substituting equation (1) into (6) gives

$$\hat\mu_{u+m}(h) = \varphi\, \hat\mu_u(h) + (1 - \varphi)\left\{\bar y_m(h \mid h-1) + \beta\left[\bar y(h-1) - \bar y_m(h-1)\right]\right\} \tag{7}$$

To calculate the variance we use equation (5) together with the definitions (8) and (9):

$$V(\hat\mu_u(h)) = \frac{s^2(h)}{u} = \frac{1}{w_u(h)} \tag{8}$$

$$\frac{s^2(h)\left(1 - \hat\rho^2\right)}{m} + \frac{\hat\rho^2 s^2(h)}{n} = \frac{1}{w_m(h)} \tag{9}$$

$$V(\hat\mu_{u+m}(h)) = \varphi^2\, V(\hat\mu_u(h)) + (1 - \varphi)^2\, V_{\min}(\hat\mu_m(h \mid h-1)) \tag{10}$$

$$= \varphi^2 \frac{s^2(h)}{u} + (1 - \varphi)^2\left(\frac{s^2(h)\left(1 - \hat\rho^2\right)}{m} + \frac{\hat\rho^2 s^2(h)}{n}\right) = \varphi^2 \frac{1}{w_u(h)} + (1 - \varphi)^2 \frac{1}{w_m(h)} \tag{11}$$

We want the minimum variance and the corresponding $\varphi$. We take the derivative of $V(\hat\mu_{u+m}(h))$ with respect to $\varphi$ and set it equal to zero. This gives the optimal weight factor

$$\varphi = \frac{w_u(h)}{w_m(h) + w_u(h)} \tag{12}$$

which also means

$$\varphi = \frac{V(\hat\mu_m(h \mid h-1))}{V(\hat\mu_m(h \mid h-1)) + V(\hat\mu_u(h))} \tag{13}$$

Then we use (13) in place of $\varphi$ in (11) to get the variance under the optimal $\varphi$:

$$V_{\text{optimal}}(\hat\mu_{u+m}(h)) = \frac{1}{w_u(h) + w_m(h)} = s^2(h)\,\frac{n - u\hat\rho^2}{n^2 - u^2\hat\rho^2} \tag{14}$$

The optimal proportion of replacement is an important quantity. We want the proportion of replacement that gives the minimum variance for the estimate of the population mean; in other words, the proportion that lets us estimate the population mean with highest precision. To obtain it, we differentiate $V_{\text{optimal}}(\hat\mu_{u+m}(h))$ in (14) with respect to $u$ and set the derivative equal to zero. After some calculations we get

$$p = \frac{u}{n} = \frac{1}{1 + \sqrt{1 - \hat\rho^2}} \tag{15}$$
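Equations (8), (9), (12), (14) and (15) can be checked numerically against each other. The sketch below uses hypothetical values for $s^2$, $n$ and $\hat\rho$; the function names are illustrative and not taken from the thesis.

```python
import math


def weights(s2, n, u, rho):
    """Return (w_u, w_m) from equations (8) and (9); requires 0 < u < n."""
    m = n - u
    w_u = u / s2
    w_m = 1.0 / (s2 * (1.0 - rho**2) / m + rho**2 * s2 / n)
    return w_u, w_m


def optimal_phi(s2, n, u, rho):
    """Weight factor of equation (12)."""
    w_u, w_m = weights(s2, n, u, rho)
    return w_u / (w_u + w_m)


def optimal_variance(s2, n, u, rho):
    """Closed form of equation (14)."""
    return s2 * (n - u * rho**2) / (n**2 - u**2 * rho**2)


def optimal_replacement_fraction(rho):
    """Equation (15): p = u/n minimising (14)."""
    return 1.0 / (1.0 + math.sqrt(1.0 - rho**2))


s2, n, rho = 900.0, 100, 0.9          # hypothetical values
p = optimal_replacement_fraction(rho)
u = round(p * n)                      # number of units to replace
phi = optimal_phi(s2, n, u, rho)
variance = optimal_variance(s2, n, u, rho)
```

For comparison, a single occasion with no matching gives $s^2/n$, so any value of `variance` below that reflects the gain from using the previous occasion.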

Substituting (15) into equation (14) gives the minimum variance under the optimal $\varphi$ and $p$:

$$V_{\min}(\hat\mu_{u+m}(h)) = \frac{s^2(h)}{2n}\left(1 + \sqrt{1 - \hat\rho^2}\right)$$

Secondly, we consider partial replacement of units on more than two occasions. We assume that these are the $(h-2)$th, the $(h-1)$th and the $h$th occasion. On the $h$th occasion, we use the estimate of the mean at the $(h-1)$th occasion as the auxiliary variable to help us estimate at the $h$th occasion. In the derivation above, the $(h-1)$th occasion is the first occasion, but in this case it is the second occasion. That is the reason why we use $\hat\mu_m(h-1 \mid h-2)$ instead of $\bar y(h-1)$ and $\varphi(h)$ instead of $\varphi$ in this case. We still assume that the population variance $S^2$ is the same for all occasions. The estimate of the population mean on the $h$th occasion is

$$\hat\mu_{u+m}(h) = \varphi(h)\, \hat\mu_u(h) + (1 - \varphi(h))\, \hat\mu_m(h \mid h-1) \qquad (h > 2) \tag{16}$$

From (13) we get

$$\varphi(h) = \frac{w_u(h)}{w_u(h) + w_m(h)} \tag{17}$$

$$\hat\mu_u(h) = \frac{1}{u}\sum_{i=1}^{u} y_i(h), \qquad V(\hat\mu_u(h)) = \frac{s^2(h)}{u(h)} = \frac{1}{w_u(h)} \tag{18}$$

$$\hat\mu_m(h \mid h-1) = \bar y_m(h) + \beta\left(\hat\mu_m(h-1 \mid h-2) - \bar y_m(h-1)\right) \tag{19}$$

Combining (16) and (19) gives the formula for the estimate of the population mean at the $h$th occasion, when $h > 2$:

$$\hat\mu_{u+m}(h) = \varphi(h)\, \hat\mu_u(h) + (1 - \varphi(h))\left\{\bar y_m(h) + \beta\left[\hat\mu_m(h-1 \mid h-2) - \bar y_m(h-1)\right]\right\}$$

Here

$$\hat\mu_m(h-1 \mid h-2) = \bar y_m(h-1) + \beta\left(\hat\mu_m(h-2 \mid h-3) - \bar y_m(h-2)\right)$$

so when we calculate the variance of $\hat\mu_m(h \mid h-1)$, it becomes

$$V_{\min}(\hat\mu_m(h \mid h-1)) = \frac{s^2(h)\left(1 - \hat\rho^2\right)}{m(h)} + \hat\rho^2\, V_{\min}(\hat\mu_m(h-1)) = \frac{1}{w_m(h)} \tag{20}$$

Combining (17), (18) and (20) in the variance formula (10) gives

$$V(\hat\mu_{u+m}(h)) = \varphi(h)^2\, \frac{s^2(h)}{u(h)} + (1 - \varphi(h))^2\left[\frac{s^2(h)\left(1 - \hat\rho^2\right)}{m(h)} + \hat\rho^2\, V_{\min}(\hat\mu_m(h-1))\right]$$

Combining equations (18) and (20) and using equation (17) in place of $\varphi(h)$, we get

$$V(\hat\mu_{u+m}(h)) = \frac{1}{w_u(h) + w_m(h)} = \frac{g(h)\, S^2}{n}$$

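We assume that $S^2$ is the same for all occasions, while the sample variance $s^2(h)$ changes from occasion to occasion. If we use $S^2$ to express $V(\hat\mu_{u+m}(h))$, we need a factor $g(h)$, which is the ratio of the variance of the estimate at the $h$th occasion to that at the first occasion. At the first occasion we assume that $s^2(1) \approx S^2$, so $g(1) = 1$ when $h = 1$.

$$g(h) = \frac{n\, V(\hat\mu_{u+m}(h))}{S^2}$$

$$Q = \frac{S^2}{V(\hat\mu_{u+m}(h))} = \frac{n}{g(h)} = S^2\left(w_u(h) + w_m(h)\right) = u(h) + \frac{1}{\dfrac{1 - \hat\rho^2}{m(h)} + \dfrac{\hat\rho^2\, g(h-1)}{n}} \tag{21}$$

We want the maximum of $Q$, which corresponds to a minimum variance estimate, so we differentiate equation (21) with respect to $m(h)$, using $u(h) = n - m(h)$, and set the derivative equal to zero. After some simplification this gives

$$\frac{1 - \hat\rho^2}{m(h)^2} = \left(\frac{1 - \hat\rho^2}{m(h)} + \frac{\hat\rho^2\, g(h-1)}{n}\right)^2$$

For the kept part this equation has the solution

$$\frac{\hat m(h)}{n} = \frac{\sqrt{1 - \hat\rho^2}}{g(h-1)\left(1 + \sqrt{1 - \hat\rho^2}\right)} \tag{22}$$

Now combining equations (21) and (22) we get

$$\frac{1}{g(h)} = 1 + \frac{1 - \sqrt{1 - \hat\rho^2}}{g(h-1)\left(1 + \sqrt{1 - \hat\rho^2}\right)} \tag{23}$$

Now define $r(h)$ and $b$ as

$$r(h) = \frac{1}{g(h)}, \qquad b = \frac{1 - \sqrt{1 - \hat\rho^2}}{1 + \sqrt{1 - \hat\rho^2}}$$

We may then rewrite equation (23) as

$$r(h) = 1 + b\, r(h-1)$$

Then it is readily found that

$$r(h) = 1 + b + b^2 + \cdots + b^{h-1} = \frac{1 - b^h}{1 - b}$$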
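The recursion can be iterated numerically to see how the optimal kept fraction evolves over the occasions. This is a small sketch using only equations (22) and (23) with $g(1) = 1$; the function name and the value of $\hat\rho$ are hypothetical.

```python
import math


def kept_fraction_by_occasion(rho, last_occasion):
    """Optimal kept fraction m(h)/n for h = 2, ..., last_occasion.

    Iterates equation (23) for g(h), starting from g(1) = 1, and
    evaluates equation (22) at each occasion.
    """
    root = math.sqrt(1.0 - rho**2)
    b = (1.0 - root) / (1.0 + root)
    g_prev = 1.0                                        # g(1) = 1
    fractions = {}
    for h in range(2, last_occasion + 1):
        fractions[h] = root / (g_prev * (1.0 + root))   # equation (22)
        g_prev = 1.0 / (1.0 + b / g_prev)               # equation (23)
    return fractions


fractions = kept_fraction_by_occasion(rho=0.9, last_occasion=6)
for h, fraction in fractions.items():
    print(h, round(fraction, 4), round(1.0 - fraction, 4))  # kept, replaced
```

The kept fraction increases towards one half as $h$ grows, so after a few occasions roughly half of the sample is replaced each time.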

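Since

$$r(h) = \frac{1}{g(h)} = 1 + b + b^2 + \cdots + b^{h-1} = \frac{1 - b^h}{1 - b}$$

it follows that

$$g(h) = \frac{1 - b}{1 - b^h}, \qquad g(h-1) = \frac{1 - b}{1 - b^{h-1}} \tag{24}$$

Combining equations (22) and (24),

$$\frac{\hat m(h)}{n} = \frac{\left(1 - b^{h-1}\right)\sqrt{1 - \hat\rho^2}}{(1 - b)\left(1 + \sqrt{1 - \hat\rho^2}\right)} = \frac{1 - b^{h-1}}{2}$$

From this follows

$$\frac{\hat u(h)}{n} = 1 - \frac{\hat m(h)}{n} = \frac{1 + b^{h-1}}{2}$$

which gives us the optimal proportion of replacement

$$p = \frac{\hat u(h)}{n} = \frac{1}{2}\left[1 + \left(\frac{1 - \sqrt{1 - \hat\rho^2}}{1 + \sqrt{1 - \hat\rho^2}}\right)^{h-1}\right] \tag{25}$$

2.5 Repeated sampling within realm of stratified random sampling

With the aid of the equations above, we can derive formulas for stratified random sampling with partial replacement. $N$ is the size of the whole population and $K$ is the number of strata. For $i = 1, 2, \ldots, K$, the stratum sizes are $N_1, N_2, \ldots, N_K$ with $\sum_{i=1}^{K} N_i = N$. The sample size for stratum $i$ is $n_i$, $m_i$ is the number of kept units and $u_i$ is the number of replacement units in stratum $i$, so that $m_i + u_i = n_i$, $i = 1, 2, \ldots, K$. The total sample size is $n$, so $\sum_{i=1}^{K} n_i = n$. The value of the $j$th observation in the $i$th stratum is denoted by $Y_{ij}$, and $Y_{ij}(h)$ is the value of the $j$th observation in the $i$th stratum on the $h$th occasion. The weight for each stratum is

$$W_i = \frac{N_i}{N}$$

and the sample size for each stratum is $n_i = W_i\, n$. The estimate of the mean for stratum $i$ is $\bar y_i(h)$, where

$$\bar y_i(h) = \frac{1}{n_i}\sum_{j=1}^{n_i} y_{ij}(h), \qquad i = 1, 2, \ldots, K$$

The variance for stratum $i$ is $S_i^2(h)$, where

$$S_i^2(h) = \frac{1}{N_i}\sum_{j=1}^{N_i}\left(Y_{ij}(h) - \mu_i(h)\right)^2$$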

The sample variance for stratum $i$ is $s_i^2(h)$, where

$$s_i^2(h) = \frac{1}{n_i - 1}\sum_{j=1}^{n_i}\left(y_{ij}(h) - \bar y_i(h)\right)^2$$

The estimate of the population mean in stratified random sampling is

$$\hat\mu_{u+m}(h) = \sum_{i=1}^{K} W_i\, \hat\mu_i(h) \tag{26}$$

Combining equations (7), (16) and (26),

$$\hat\mu_{u+m}(h) = \sum_{i=1}^{K} W_i\left\{\varphi_i(h)\, \bar y_{u_i}(h) + (1 - \varphi_i(h))\left[\bar y_{m_i}(h) + \beta\left(\hat\mu_{m_i}(h-1) - \bar y_{m_i}(h-1)\right)\right]\right\}$$

This is the formula for the minimum variance linear unbiased estimator $\hat\mu_{u+m}(h)$ of the true population mean $\bar Y(h)$ on the $h$th occasion. If the proportion of replacement is not equal to zero, the estimate of the mean in each stratum, $\hat\mu_i(h)$, is a minimum variance linear unbiased estimator. So in the stratified sampling case, if $\hat\mu_i(h)$ is a minimum variance linear unbiased estimator for $i = 1, 2, \ldots, K$, then $\hat\mu_{u+m}(h) = \sum_{i=1}^{K} W_i\, \hat\mu_i(h)$ is a minimum variance linear unbiased estimator of $\bar Y(h)$ (see [Prabhu Ajgaonkar]).

The variance of $\hat\mu_{u+m}(h)$ is

$$V(\hat\mu_{u+m}(h)) = \sum_{i=1}^{K} W_i^2\, V(\hat\mu_i(h)) \tag{27}$$

Combining equations (11) and (27) gives

$$V(\hat\mu_{u+m}(h)) = \sum_{i=1}^{K} W_i^2\left\{\varphi_i^2(h)\, \frac{S_i^2(h)}{u_i(h)} + (1 - \varphi_i(h))^2\left[\frac{S_i^2(h)\left(1 - \hat\rho_i^2(h)\right)}{m_i(h)} + \frac{\hat\rho_i^2(h)\, S_i^2(h)}{n_i}\right]\right\}$$

Using the sample variance $s^2$ instead of the true variance $S^2$ for the stratum variance in equation (11), we get

$$\hat v(\bar y_i(h)) = \varphi_i^2(h)\, \hat v(\bar y_{u_i}(h)) + (1 - \varphi_i(h))^2\, \hat v(\bar y_{m_i}(h)) \tag{28}$$

$$= \varphi_i^2(h)\, \frac{s_i^2(h)}{u_i(h)} + (1 - \varphi_i(h))^2\left(\frac{s_i^2(h)\left(1 - \hat\rho_i^2(h)\right)}{m_i(h)} + \frac{\hat\rho_i^2(h)\, s_i^2(h)}{n_i}\right)$$

2.6 Estimation of sample size

When stratified random sampling is used on successive occasions, the sample size $n$ and the population size $N$ are the same for all occasions. We only need to consider the total sample size when $h = 1$, because when $h = 1$ the method of sampling is plain stratified random sampling. The estimation of the sample size for repeated sampling, within the realm of stratified random sampling, is therefore the same as in stratified random sampling. Here it is assumed that the estimate has a specified variance $V$. If instead the margin of error $d$ has been specified, $V = (d/t)^2$, where $t$ is the normal deviate corresponding to the allowable probability that the error will exceed the desired margin (see [Cochran]). As a reminder, $W_i = N_i/N$ and $w_i = n_i/n$.

$$V = \frac{1}{n}\sum_i \frac{W_i^2 S_i^2}{w_i} - \frac{1}{N}\sum_i W_i S_i^2$$

From this follows

$$n = \frac{\sum_i W_i^2 S_i^2 / w_i}{V + \frac{1}{N}\sum_i W_i S_i^2}$$

Presumed optimum allocation (for the fixed sample size $n$), $w_i \propto W_i S_i$:

$$n = \frac{\left(\sum_i W_i S_i\right)^2}{V + \frac{1}{N}\sum_i W_i S_i^2}$$

Proportional allocation, $w_i = W_i$:

$$n = \frac{\sum_i W_i S_i^2}{V + \frac{1}{N}\sum_i W_i S_i^2} \tag{29}$$
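The two sample-size formulas of Section 2.6 can be sketched as follows. The stratum weights, stratum standard deviations, margin of error $d$, normal deviate $t$ and population size used here are hypothetical, and the function name `sample_size` is illustrative.

```python
import math


def sample_size(W, S, V, N, allocation="proportional"):
    """Total sample size n for a target variance V of the stratified mean.

    W : stratum weights N_i / N (summing to 1)
    S : stratum standard deviations S_i
    N : population size
    """
    correction = sum(w * s**2 for w, s in zip(W, S)) / N
    if allocation == "proportional":          # w_i = W_i, equation (29)
        numerator = sum(w * s**2 for w, s in zip(W, S))
    elif allocation == "optimum":             # w_i proportional to W_i * S_i
        numerator = sum(w * s for w, s in zip(W, S)) ** 2
    else:
        raise ValueError("allocation must be 'proportional' or 'optimum'")
    return numerator / (V + correction)


d, t = 2.0, 1.96                 # hypothetical margin of error and normal deviate
V = (d / t) ** 2
W, S, N = [0.5, 0.5], [10.0, 50.0], 100_000

n_prop = math.ceil(sample_size(W, S, V, N, "proportional"))
n_opt = math.ceil(sample_size(W, S, V, N, "optimum"))
```

When the stratum standard deviations differ, the optimum allocation needs no more units than the proportional one; when they are equal, the two formulas agree.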

3 Experimental results

3.1 Implementation issues

This thesis focuses on a minimum variance estimation problem for a time-dependent population mean. We restrict our investigation to the case of linear unbiased estimators, and to fixed sample and population sizes at all occasions. This section is the experimental part.

Firstly, we use the Monte-Carlo method to generate data. The data simulation is based on the mean value of people's average yearly income in Stockholm county during 2006, 2007, 2008 and 2009. The income population consists of all people living in Stockholm county; we write it $\Omega = \{Y_1, Y_2, \ldots, Y_N\}$. First, we calculate the true population mean for 2006, 2007, 2008 and 2009. The value in 2006 is the base information; we only use it to decide the sample size. We use equation (29) to calculate the sample size.

The data are separated into two strata by gender: one stratum is Male and one is Female. The strata have $N_1$ and $N_2$ individuals. We denote them $\Omega_{male} = \{Y_1^m, Y_2^m, \ldots, Y_{N_1}^m\}$ and $\Omega_{female} = \{Y_1^f, Y_2^f, \ldots, Y_{N_2}^f\}$. Then we use random sampling in each stratum to get the samples, whose sizes are $n_1$ and $n_2$. The samples are $\{y_1^m, y_2^m, \ldots, y_{n_1}^m\}$ and $\{y_1^f, y_2^f, \ldots, y_{n_2}^f\}$.

After that, we use the fixed sample to estimate the mean and variance for the whole population in 2007, 2008 and 2009; these are $\hat\mu(07)_{fixed}$, $\hat\mu(08)_{fixed}$ and $\hat\mu(09)_{fixed}$. Then we use repeated sampling to estimate the mean and variance for the population in 2008 and 2009, because 2007 is the first occasion. At the first occasion the mean and variance are the same as the fixed sample estimates. We use equation (25) to calculate the optimal size of the replacement in each stratum. These are $u_{male}(08)$, $u_{female}(08)$, $u_{male}(09)$ and $u_{female}(09)$. The number of kept units is $m$, with $n_1 - u_{male}(08) = m_{male}(08)$, $n_2 - u_{female}(08) = m_{female}(08)$, $n_1 - u_{male}(09) = m_{male}(09)$ and $n_2 - u_{female}(09) = m_{female}(09)$. We estimate the sample mean and sample variance for the new sample, which is equal to the replacement part plus the kept part, in each stratum.
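Once each stratum has its own estimate and variance, the stratum results are combined with the weights $W_i$ as in equations (26) and (27). A minimal sketch of that combination step, with hypothetical stratum values and an illustrative function name, might look like this:

```python
def stratified_estimate(weights, stratum_means, stratum_variances):
    """Combine stratum-level estimates as in equations (26) and (27).

    weights           : W_i = N_i / N
    stratum_means     : estimated mean of each stratum
    stratum_variances : variance of each stratum estimate
    Returns (population mean estimate, its variance).
    """
    mean = sum(w * m for w, m in zip(weights, stratum_means))
    variance = sum(w**2 * v for w, v in zip(weights, stratum_variances))
    return mean, variance


# Hypothetical values for the two gender strata (tkr)
mean_hat, var_hat = stratified_estimate(
    weights=[0.5, 0.5],
    stratum_means=[300.0, 250.0],
    stratum_variances=[4.0, 4.0],
)
```

The same function serves both designs: for the fixed sample the stratum variances come from the ordinary sample-mean formula, for the repeated sample from equation (28).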
After that, equations (26) and (27) are used to calculate the population mean and variance for 2008 and 2009. We can estimate the population mean with the fixed sample and with the repeated sampling method. The method whose estimate is closer to the true mean is the one that gives the estimate with the highest precision, and the ratio of the two variances shows how large the difference in precision between the two methods is. During this experiment we only use the yearly income data for 2007, 2008 and 2009. When we use repeated sampling to estimate the mean and variance for the whole population, we also estimate the optimal proportion of replacement.

3.2 Experimental result for data set 1

3.2.1 Sample size

The sample size is calculated from base year 2006 and is the same for all occasions. The table gives $N$, $N_{male}$, $N_{female}$, $n$, $n_{male}$ and $n_{female}$.

3.2.2 Comparison for fixed sample and repeated sample

The table reports, for each year: the true mean $\bar Y$ (tkr); the estimate of $\bar Y$ from the fixed sample (tkr); the estimate of $\bar Y$ from the repeated sample (tkr); the estimates of $V(\bar Y_{fixed})$ and $V(\bar Y_{repeated})$; and the replacement proportions $(u/n)_{male}$ and $(u/n)_{female}$. The bias rows are:

                                        2007      2008      2009
bias of estimation (fixed sample)       0.014%    0.04%     0.106%
bias of estimation (repeated sample)              0.017%    0.06%

3.2.3 Estimated variance

The table reports the ratio $V(\bar Y_{fixed}) / V(\bar Y_{repeated})$. From this table we see that the estimated bias is always smaller for repeated sampling than for "fixed" sampling. It is also found that $V(\hat\mu_{fixed}) > V(\hat\mu_{repeated})$ for both 2008 and 2009. Hence repeated sampling is found to be the better choice.

3.3 Optimality

Here we check whether the estimated optimal proportion of replacement is a real optimum or not. The method for estimating the optimal proportion of replacement is the same in 2008 and 2009. So we test whether the estimated proportion is optimal for 2008 ($h = 2$). If it is an optimal proportion in 2008, it will also be an optimal proportion in 2009. During this experiment we only change the proportion of replacement, since the correlation coefficient, the regression coefficient, the optimum stratum weight, etc. do not change. We found estimates of the optimum proportion of replacement for males and for females. We use the proportions 0.6 and 0.8 to check the bias. If both 0.6 and 0.8 give a higher bias, then the proportion of replacement obtained in the last experiment is the better one. Then we also check 0.65 and 0.75.

$(u/n)_{male}$ at second occasion    0.6       0.65      0.71      0.75      0.8
bias of estimation                   0.036%    0.0%      0.017%    0.07%     0.06%

In this experiment we change the proportion of replacement to values lower and higher than the one estimated in the last experiment. As the proportion moves closer to 0.71, the bias of the estimate decreases towards the 0.017% obtained in the last experiment. This means that the proportion of replacement estimated in the last experiment is the optimum proportion of replacement.
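A complementary, purely theoretical check is to evaluate the variance formula (14) on the same grid of replacement proportions. This differs from the thesis's simulation, which compares the bias of the estimates; the sketch below only looks at the formula. The values of $s^2$, $n$ and $\hat\rho$ are hypothetical, with $\hat\rho$ chosen so that equation (15) gives a proportion near 0.71.

```python
import math


def variance_14(s2, n, u, rho):
    """Variance of the combined estimate under the optimal weight, equation (14)."""
    return s2 * (n - u * rho**2) / (n**2 - u**2 * rho**2)


def scan_replacement(s2, n, rho, proportions):
    """Evaluate equation (14) for each candidate replacement proportion."""
    return {p: variance_14(s2, n, round(p * n), rho) for p in proportions}


s2, n, rho = 1.0, 1000, 0.913                        # hypothetical values
p_formula = 1.0 / (1.0 + math.sqrt(1.0 - rho**2))    # equation (15)
scan = scan_replacement(s2, n, rho, [0.6, 0.65, 0.71, 0.75, 0.8])
best = min(scan, key=scan.get)                       # proportion with smallest variance
```

On this grid the smallest variance is attained at the grid point closest to the value from equation (15), which is the pattern the bias comparison in the experiment is looking for.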

3.4 Experimental result for data set 2

We again use simulated data. We simulate once more with a bigger variance ($\sigma = 50$), keeping the total number, the strata and the stratum sizes the same, and use the new data to repeat the experiment above.

3.4.1 Experiment result for testing scenario

3.4.2 Sample size

The sample size is calculated from base year 2006 and is the same for all occasions. The table gives $N$, $N_{male}$, $N_{female}$, $n$, $n_{male}$ and $n_{female}$.

3.4.3 Comparison for fixed sample and repeated sample

The table has the same rows as in 3.2.2. The bias rows are:

                                        2007      2008      2009
bias of estimation (fixed sample)       …%        …%        0.185%
bias of estimation (repeated sample)              …%        0.039%

3.4.4 Estimate variance

The table reports the ratio $V(\bar Y_{fixed}) / V(\bar Y_{repeated})$. The results also show that repeated sampling gives estimates of higher precision than fixed sampling. Hence repeated sampling is found to be the better choice, regardless of whether the variance of the data is large or small.

3.5 Summary of experiment

The results of these experiments show that the estimate of the mean of the whole population obtained by repeated sampling, within the realm of stratified sampling, is closer to the true population mean than the estimate obtained by fixed sampling. In other words, repeated sampling gives higher precision than fixed sampling. The formulas for repeated sampling that we derived in the theory part are useful for estimating the population mean with higher precision. They also help us to get the optimal proportion of replacement.

4 Conclusion and discussion

This thesis has derived formulas to estimate the population mean and variance when using repeated sampling within the realm of stratified random sampling. After deriving the formulas we used data from SCB to simulate the whole process. The results of the simulation for data set 1 show that the use of repeated sampling can improve the precision considerably (the bias falls from 0.04% to 0.017%) and lower the variance compared to fixed sampling when estimating the population mean for 2008. For 2009 we obtain comparable improvements.

For these simulations, the result is that stratified random sampling with partial replacement gives a lower variance when estimating the population mean than stratified random sampling without replacement. This is because the sample with partial replacement on the $h$th occasion includes some units that match with the $(h-1)$th occasion and some units that match with both the $(h-1)$th and the $(h-2)$th occasion. Using this sample to estimate the population mean on the $h$th occasion gives higher precision than using a sample that is matched only with the $(h-2)$th occasion.

During the whole process, the optimal proportion of replacement is a very important characteristic. In the section Optimality, we test whether the estimated proportion is optimal or not. We change the proportion of replacement around our estimate from the experiment for data set 1; since our estimate is about 0.71, we choose 0.6, 0.65 and 0.75, 0.8. The results show that when the proportion is closer to the estimated proportion, the bias of the estimate of the population mean does decrease. That means the proportion estimated in the experiment for data set 1 is a real optimal proportion of replacement.

Using repeated sampling within the realm of stratified random sampling is one sampling design among many. Sampling design is an art; a successful sampling design has a direct positive influence on the result of the survey. In this thesis the results of the experimental part show that our assumption is correct: using repeated sampling within the realm of stratified random sampling can help us to get estimates with higher precision than fixed sampling. This is a good use of sampling design. But there are also alternatives to stratified random sampling, such as single-stage cluster sampling, systematic sampling, double sampling, stratified cluster sampling and two-stage stratified sampling. We are not sure whether repeated sampling also gives estimates with higher precision within these methods, since in this thesis we only studied repeated sampling within the realm of stratified random sampling. Using repeated sampling within other sampling methods may also give good sampling designs.

References

[Cochran] W.G. Cochran, Sampling Techniques, third edition, Wiley, New York.

[Eckler] A.R. Eckler, Rotation Sampling, Ann. Math. Statist.

[Gbur and Sielken] E.E. Gbur and R.L. Sielken, Rotation Sampling Design, Texas A&M University.

[Manoussakis] E. Manoussakis, Repeated Sampling with Partial Replacement of Units, The Annals of Statistics, Vol. 5, No. 4 (Jul., 1977).

[Patterson] H.D. Patterson, Sampling on successive occasions with partial replacement of units, Journal of the Royal Statistical Society, Series B, 12 (1950).

[Prabhu-Ajgaonkar] S.G. Prabhu-Ajgaonkar and B.D. Tikkiwal, On classes of estimators of the variance function of a linear estimator, Volume 18, Number 1, 15-0, DOI: /BF06143.

[Rao and Graham] J.N.K. Rao and Jack E. Graham, Designs for Sampling on Repeated Occasions, Journal of the American Statistical Association, Vol. 59, No. 306 (Jun., 1964).

[Ulam] N.M.S. Ulam, The Monte Carlo Method, Journal of the American Statistical Association, Vol. 44, No. 247 (Sep., 1949).

[Woller] J. Woller, The Basics of Monte Carlo Simulations, University of Nebraska-Lincoln Physical Chemistry Lab, Spring.


More information

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal)

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal) 6 Parametric (theoretical) probability distributios. (Wilks, Ch. 4) Note: parametric: assume a theoretical distributio (e.g., Gauss) No-parametric: o assumptio made about the distributio Advatages of assumig

More information

BENEFIT-COST ANALYSIS Financial and Economic Appraisal using Spreadsheets

BENEFIT-COST ANALYSIS Financial and Economic Appraisal using Spreadsheets BENEIT-CST ANALYSIS iacial ad Ecoomic Appraisal usig Spreadsheets Ch. 2: Ivestmet Appraisal - Priciples Harry Campbell & Richard Brow School of Ecoomics The Uiversity of Queeslad Review of basic cocepts

More information

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu Multi-server Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio -coectio

More information

Department of Computer Science, University of Otago

Department of Computer Science, University of Otago Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS-2006-09 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized? 5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso

More information

A modified Kolmogorov-Smirnov test for normality

A modified Kolmogorov-Smirnov test for normality MPRA Muich Persoal RePEc Archive A modified Kolmogorov-Smirov test for ormality Zvi Drezer ad Ofir Turel ad Dawit Zerom Califoria State Uiversity-Fullerto 22. October 2008 Olie at http://mpra.ub.ui-mueche.de/14385/

More information