Sampling online social networks by random walk


Jianguo Lu (1), Dingding Li (2)
(1) School of Computer Science, University of Windsor
(2) Department of Economics, University of Windsor
{jlu, dli}@uwindsor.ca
401 Sunset Avenue, Windsor, Ontario N9B 3P4, Canada

ABSTRACT

This paper proposes to use simple random walk, a sampling method supported by most online social networks (OSNs), to estimate a variety of properties of large OSNs. We show that, because OSNs are scale-free, estimators derived from random walk samples are much more accurate than estimators based on uniform random samples, even when uniform random samples are available and their notoriously high acquisition cost is disregarded. The paper first proposes to use the harmonic mean to estimate the average degree of OSNs. The accurate estimation of the average degree leads to the discovery of other properties, such as the population size, the heterogeneity of the degrees, the number of friends of friends, the threshold value for messages to reach a large component, and the Gini coefficient of the population. The method is validated on complete Twitter data dated 2009 that contains 42 million nodes and 1.5 billion edges.

Keywords

OSN, Online Social Network, Hansen-Hurwitz, Estimator, Scale-free network, Harmonic mean

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 2012 ACM ...$15.00.

1. INTRODUCTION

The properties of online social networks are of great interest to the general public as well as to IT professionals. Yet the raw data are usually not available to the public, and the summaries released by the service providers are sketchy. Thus sampling is needed to reveal the hidden properties or structure of the underlying data [5, 20, 13]. For instance, we may want to learn the average number of friends in a network, or the average degree of a graph. One obvious but often impractical method is to select a set of users $\{U_1, U_2, \ldots, U_n\}$ at random, count the degree of each user, $\{d_1, \ldots, d_n\}$, and use the sample mean as the estimate of the population mean:

$$\hat{d}_{SM} = \frac{1}{n}\sum_{i=1}^{n} d_i \tag{1}$$

The sample mean estimator $\hat{d}_{SM}$ is an unbiased estimator of the population mean if the users can be selected randomly with equal probability. Unfortunately this is rarely the case in practice. When micro bloggers are selected, they are often not picked uniformly at random because of the limited access methods provided by OSN sites. Rather, more popular bloggers tend to have a higher probability of being sampled if users are crawled by following links.

There are studies on sampling methods for OSNs [5, 20] and in related areas such as social networks [22, 26], graphs [13, 25], web URLs [8], and search engine indexes and the deep web [1, 17, 16]. The typical underlying techniques include the Metropolis-Hastings Random Walk (MHRW) [18] for uniform sampling and the Random Walk (RW) [14] for unequal probability sampling. A random walk on a graph follows, at each step, one of the links of the current node, chosen with equal probability among all its links. A blogger with more followers therefore has a higher probability of being sampled. It is well known that the asymptotic probability of a node being sampled is proportional to its degree [14]. Therefore, the sample mean tends to overestimate the population average degree.

MHRW is reported to be rather good at obtaining a random sample of random networks. However, in the sampling process many nodes are retrieved, examined, and rejected. The cost is rather high, especially for OSNs, where node retrieval requires network traffic and there are usually quotas on daily accesses. Even when uniform random samples are obtained, the sample mean estimator has a high variance because the degree distribution of OSNs usually follows a power law. Many nodes have small degrees, while some nodes may have very large degrees. The inclusion or exclusion of a single very large node in a sample makes the estimates diverge.

When uniform random samples are hard to obtain, it is rather common to use PPS (Probability Proportional to Size) sampling and Hansen-Hurwitz related estimators [7]. In particular, the harmonic mean instead of the arithmetic mean of the sample can be used as the estimator of the average degree of an OSN:

$$\hat{d}_{H} = n\left[\sum_{i=1}^{n} \frac{1}{d_i}\right]^{-1} \tag{2}$$

Here the subscript H indicates the harmonic mean. The estimator can be derived from the traditional Hansen-Hurwitz estimator, as described in the next section. For this estimator the sample is obtained by simple random walk, so that the node selection probability is proportional to the node degree.

This estimator was first derived and studied in depth by Salganik et al. [22] to estimate the properties of hidden populations such as drug addicts. In that setting the true values are unknown and assumptions such as the sampling probability are flimsy, so the veracity of the estimator is impossible to evaluate. In the context of OSNs, Kurant et al. [11, 5, 6] studied various sampling methods, including random walk, to discover network properties such as the distribution of node degrees. [5] studied the sampling of Facebook, in particular the Re-Weighted Random Walk, which can also be traced back to the Hansen-Hurwitz estimator. [11] mentioned the harmonic mean estimator, but stopped short of analyzing it and comparing it with other estimators. Rasti et al. [21] studied re-weighted random walk sampling in peer-to-peer networks. Both [5] and [21] compare their methods with the Metropolis-Hastings random walk, not with uniform random samples. The comparison to uniform random samples was conducted in [10] for the estimation of population size, not average degree. This is the first paper to show that in a real large network the harmonic mean estimator is much better than the sample mean estimator over uniform random samples, even ignoring the cost of obtaining the uniform samples.
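
The claim can be illustrated on synthetic data. The following is a minimal sketch, not the paper's Twitter experiment: it assumes a made-up population whose degrees follow Zipf's law with exponent one, and it replaces the random walk by idealized independent draws with probability proportional to degree. All function names and parameter values are hypothetical.

```python
import itertools
import random
import statistics


def zipf_degrees(num_nodes, scale):
    """Synthetic degrees d_i = scale / i (Zipf exponent 1), floored at 1."""
    return [max(1, round(scale / rank)) for rank in range(1, num_nodes + 1)]


def sample_mean(degrees):
    """Equation 1: arithmetic mean of the sampled degrees."""
    return sum(degrees) / len(degrees)


def harmonic_mean(degrees):
    """Equation 2: harmonic mean of the sampled degrees."""
    return len(degrees) / sum(1.0 / d for d in degrees)


def compare_estimators(num_nodes=100_000, scale=20_000,
                       sample_size=500, runs=200, seed=1):
    rng = random.Random(seed)
    population = zipf_degrees(num_nodes, scale)
    # Cumulative degrees let random.choices draw proportionally to degree.
    cum_weights = list(itertools.accumulate(population))
    sm_estimates, hm_estimates = [], []
    for _ in range(runs):
        uniform = rng.choices(population, k=sample_size)
        # Assumption: independent degree-proportional draws stand in for
        # a well-mixed random walk.
        pps = rng.choices(population, cum_weights=cum_weights, k=sample_size)
        sm_estimates.append(sample_mean(uniform))
        hm_estimates.append(harmonic_mean(pps))
    return {
        "true_mean": sum(population) / num_nodes,
        "mean_sample_mean": statistics.mean(sm_estimates),
        "se_sample_mean": statistics.stdev(sm_estimates),
        "mean_harmonic": statistics.mean(hm_estimates),
        "se_harmonic": statistics.stdev(hm_estimates),
    }


if __name__ == "__main__":
    for name, value in compare_estimators().items():
        print(f"{name}: {value:.3f}")
```

With these settings the spread of the harmonic mean estimates should be far smaller than that of the uniform sample mean, because a few very large degrees dominate the variance of the latter.
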
In practice, as demonstrated on the Twitter network, the sample size can be thousands of times smaller than that of uniform random sampling while achieving similar accuracy. In theory, the improvement can be unlimited as the network size grows.

The contributions of this paper are: 1) the properties of the estimator (bias and variance) are analyzed and empirically verified on a large real network; 2) the advantage over uniform random sampling is analyzed and compared. In particular, we found that on the Twitter data the estimator is much better: it has a very small bias, and its variance is orders of magnitude smaller than that of the sample mean estimator; 3) the cause is identified as the heterogeneity of the data induced by the scale-free nature of the network, and the coefficient of variation is proposed to quantify the heterogeneity; 4) the accurate estimation of the average degree can lead to the discovery of a string of other network properties, such as the network size, the heterogeneity of the degrees, the threshold value for message diffusion, and the inequality of friendships in the network.

We want to emphasize that our method is not limited to the estimation of direct connections between users in an OSN. The average degree can be the average number of friends in the case of Facebook or LinkedIn, or the average number of followers and followees in the Twitter and Weibo networks. In addition to such explicit graphs, where edges represent following (or friend) relationships, OSNs contain implicitly derived graphs where an edge exists if two nodes share messages, groups, etc., resulting in message networks and group networks. In a message network, two persons are linked if they shared a message. In a group network, two persons are connected if they belong to the same group. Thus, the degree can represent the direct connections to friends, the number of message reposts on the network, or the number of groups people are associated with.

2. ESTIMATORS

2.1 Sample mean estimator

Suppose that there are N users in the population.
Each user has a property $Y_i$, $i \in \{1, 2, \ldots, N\}$, which can be age, number of friends, number of messages, etc. Let the population total be $\tau = \sum_{i=1}^{N} Y_i$ and the population mean be $\bar{Y} = \tau/N$. Our task is to estimate $\bar{Y}$ using a sample. In particular, this paper focuses on the degree property, i.e., estimating the average degree $\bar{d}$ using a sample $\{d_1, d_2, \ldots, d_n\}$. If a uniform random sample $Y_1, \ldots, Y_n$ is obtained, the sample mean is an unbiased estimator, defined as:

$$\hat{Y}_{SM} = \frac{1}{n}\sum_{i=1}^{n} Y_i \tag{3}$$

When $Y_i$ is the degree of node i, i.e., $Y_i = d_i$, the above equation becomes the sample mean estimator for degrees:

$$\hat{d}_{SM} = \frac{1}{n}\sum_{i=1}^{n} d_i \tag{4}$$

The variance of the estimator $\hat{d}_{SM}$ is [24]

$$\mathrm{var}(\hat{d}_{SM}) = \frac{N-n}{N}\,\frac{\sigma^2}{n} \tag{5}$$

where $\sigma^2$ is the population variance of the degrees, which can be calculated by

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} d_i^2 - \left(\frac{1}{N}\sum_{i=1}^{N} d_i\right)^2 = \frac{1}{N}\sum_{i=1}^{N} d_i^2 - \bar{d}^2 \tag{6}$$

where $\bar{d}$ is the arithmetic mean of all the degrees in the total population. The estimated variance of the estimator $\hat{d}_{SM}$ is

$$\widehat{\mathrm{var}}(\hat{d}_{SM}) = \frac{N-n}{N}\,\frac{s^2}{n} \tag{7}$$

where $s^2$ is the sample variance of $d_1, d_2, \ldots, d_n$.

The problem with the sample mean estimator is that a uniform sample is not easy to obtain. Moreover, the population variance $\sigma^2$, and consequently the estimator variance, is large due to the scale-free nature of the network. The degree distribution in online social networks follows a power law, or Zipf's law. That is, if we rank all the nodes by degree in decreasing order $(d_1, d_2, \ldots, d_N)$, then

$$d_i = \frac{A}{i^{\alpha}} \tag{8}$$

where A and $\alpha$ are constants. $\alpha$ is called the exponent or slope and is typically around one in various scale-free networks. With such a degree distribution the population variance is very large, leading to a large variance of the sample mean estimator. Suppose that $\alpha = 1$, which is typical for many scale-free networks [19] including the Twitter network [12]. $\sigma^2$ can be approximated by combining Equations 8 and 6:

$$\begin{aligned}
\sigma^2 &= E(X^2) - E^2(X) = \left(\frac{E(X^2)}{E^2(X)} - 1\right)E^2(X) \\
&= \left(\frac{N\sum_{i=1}^{N} d_i^2}{\left(\sum_{i=1}^{N} d_i\right)^2} - 1\right)\bar{d}^2
 = \left(\frac{N\sum_{i=1}^{N} i^{-2}}{\left(\sum_{i=1}^{N} i^{-1}\right)^2} - 1\right)\bar{d}^2 \\
&\approx \left(\frac{N}{\ln^2 N} - 1\right)\bar{d}^2
\end{aligned} \tag{9}$$

The last step uses $\sum_{i=1}^{N} i^{-1} \approx \ln N$ and treats the bounded sum $\sum_{i=1}^{N} i^{-2}$ as a constant. Equation 9 shows that the variance does not converge as the network size N grows without limit. Note that there are two ways to describe a power law: the Zipfian (rank-based) approach used here, and the frequency distribution of the degrees, which is equivalent except that its exponent is greater by one.

2.2 Harmonic mean estimator

When the sampling probability is not equal for each unit, a common approach is to use Hansen-Hurwitz estimators. One of them estimates the population total [24]:

$$\hat{\tau}_{HH} = \frac{1}{n}\sum_{i=1}^{n} \frac{Y_i}{p_i} \tag{10}$$

where $p_i$ is the selection probability of unit i, $\tau = \sum_{i=1}^{N} Y_i$ is the population total, and $\sum_{i=1}^{N} p_i = 1$. The selection probability of unit i is the probability that it is selected in one draw of the sample elements.
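
As an illustration of the Hansen-Hurwitz total estimator (Equation 10), here is a minimal sketch on a made-up four-unit population; the function name and the numbers are hypothetical. It uses the special case where the selection probabilities are exactly proportional to the values, in which every term $Y_i/p_i$ equals the population total and the estimate does not depend on which units are drawn.

```python
import random


def hansen_hurwitz_total(values, probabilities):
    """Equation 10: mean of y_i / p_i over draws made with replacement."""
    n = len(values)
    return sum(y / p for y, p in zip(values, probabilities)) / n


def demo(seed=0, draws=50):
    rng = random.Random(seed)
    y = [10.0, 20.0, 30.0, 40.0]          # hypothetical population values
    total = sum(y)
    p = [value / total for value in y]    # selection probability ∝ value
    # Draw unit indices with replacement according to p.
    picked = rng.choices(range(len(y)), weights=p, k=draws)
    estimate = hansen_hurwitz_total([y[i] for i in picked],
                                    [p[i] for i in picked])
    return estimate, total


if __name__ == "__main__":
    estimate, total = demo()
    print(estimate, total)
```

When the probabilities are only roughly proportional to the values, the estimate varies from sample to sample but remains unbiased for the total.
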
Note that the Hansen-Hurwitz estimator applies to sampling with replacement, i.e., a unit can be sampled multiple times, just as in random walk sampling. When $Y_i = 1$ for all $i \in \{1, 2, \ldots, N\}$, the above estimator reduces to another version of the Hansen-Hurwitz estimator that estimates the total number of nodes $N = \sum_{i=1}^{N} Y_i$:

$$\hat{N}_{HH} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{p_i} \tag{11}$$

In our OSN case, samples are often obtained by random walk. It is well known that a random walk produces a biased sample. Asymptotically, the probability of a user being visited in a random walk is proportional to its degree, i.e., in the case of random walk,

$$p_i = \frac{d_i}{\sum_{j=1}^{N} d_j} = \frac{d_i}{\tau} \tag{12}$$

Therefore an estimator of the mean degree, $\hat{d}_H$, can be derived from the unbiased Hansen-Hurwitz estimator for N as follows:

$$\hat{d}_H = \frac{\tau}{\hat{N}_{HH}} = \tau\left[\frac{1}{n}\sum_{i=1}^{n} \frac{\tau}{d_i}\right]^{-1} = n\left[\sum_{i=1}^{n} \frac{1}{d_i}\right]^{-1} \tag{13}$$

The estimator of the arithmetic mean degree turns out to be the harmonic mean of the degrees in the sample. Salganik et al. [22] gave a similar derivation using the ratio of two estimators in the setting of respondent-driven sampling. Although $\hat{N}_{HH}$ is an unbiased estimator, its inverse may not be unbiased. Cochran [3] showed that the bias is on the order of $1/n$. Since the sample size in social network sampling is generally rather large, the bias is negligible.
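
The derivation of Equation 13 can be tried out on a small synthetic graph. The sketch below is an illustration under stated assumptions, not the authors' implementation: the graph comes from a toy preferential-attachment generator written for this example, the burn-in length is an arbitrary choice, and every function name is hypothetical.

```python
import random


def preferential_attachment_graph(num_nodes, edges_per_node, rng):
    """Toy scale-free graph: each new node links to `edges_per_node` existing
    nodes chosen with probability proportional to their current degree."""
    core = edges_per_node + 1
    adjacency = {node: set() for node in range(num_nodes)}
    endpoints = []  # every node appears once per incident edge

    def add_edge(u, v):
        adjacency[u].add(v)
        adjacency[v].add(u)
        endpoints.extend((u, v))

    for u in range(core):  # seed clique; its triangles keep the walk aperiodic
        for v in range(u + 1, core):
            add_edge(u, v)
    for new_node in range(core, num_nodes):
        targets = set()
        while len(targets) < edges_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            add_edge(new_node, target)
    return {node: sorted(nbrs) for node, nbrs in adjacency.items()}


def random_walk_degrees(neighbors, steps, burn_in, rng):
    """Degrees of the nodes visited by a simple random walk (with repeats)."""
    node = rng.randrange(len(neighbors))
    degrees = []
    for step in range(burn_in + steps):
        node = rng.choice(neighbors[node])  # follow a uniformly chosen link
        if step >= burn_in:
            degrees.append(len(neighbors[node]))
    return degrees


def harmonic_mean(degrees):
    """Equation 13: n / sum(1 / d_i)."""
    return len(degrees) / sum(1.0 / d for d in degrees)


def demo(seed=42):
    rng = random.Random(seed)
    graph = preferential_attachment_graph(2000, 3, rng)
    true_mean = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
    walk = random_walk_degrees(graph, steps=20000, burn_in=1000, rng=rng)
    return true_mean, harmonic_mean(walk), sum(walk) / len(walk)


if __name__ == "__main__":
    true_mean, harmonic, arithmetic = demo()
    print(f"true mean degree      : {true_mean:.3f}")
    print(f"harmonic mean of walk : {harmonic:.3f}")
    print(f"arithmetic mean of walk: {arithmetic:.3f}")
```

The arithmetic mean of the visited degrees should overshoot the true mean, since high-degree nodes are visited more often, while the harmonic mean should land close to it.
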

The variance of $\hat{N}_{HH}$ is

$$\mathrm{var}(\hat{N}_{HH}) = \frac{1}{n}\sum_{i=1}^{N} p_i\left(\frac{1}{p_i} - N\right)^2 \tag{14}$$

It can be estimated from a sample using

$$\widehat{\mathrm{var}}(\hat{N}_{HH}) = \frac{1}{n(n-1)}\sum_{i=1}^{n}\left(\frac{1}{p_i} - \hat{N}_{HH}\right)^2 \tag{15}$$

Using the Delta method, the variance of the estimator $\hat{d}_H$ is

$$\widehat{\mathrm{var}}(\hat{d}_H) = \frac{s_v^2}{n\,\bar{v}^4} \tag{16}$$

where $v_i = 1/d_i$, and $\bar{v}$ and $s_v^2$ are the sample mean and the sample variance of the $v_i$'s. This equation will be used to calculate the error bound in Figure 2.

3. EXPERIMENTS

3.1 Data

The estimator is verified on the Twitter network data provided by Kwak et al. [12], which characterize the complete Twitter network as of July 2009. The data contain about 1.47 billion edges and 41.7 million nodes (users), occupying around 20 gigabytes of hard drive space. Since the data are too large to fit into the memory of commodity computers, we index them using Lucene, a popular indexing engine. The random walk and the uniform random sampling are then performed on the index stored on the hard drive. Since our method is better suited to undirected graphs, we ignore the edge directions in the Twitter data.

3.2 Results

Two estimators, $\hat{d}_{SM}$ in Equation 1 and $\hat{d}_H$ in Equation 13, are tested on the data for five different sample sizes: 500, 1000, 2000, 4000, and [...]. For each sample size, 100 samples are selected using uniform random sampling and random walk sampling respectively. Their bias and standard errors are tabulated in Table 1. It shows that $\hat{d}_H$ indeed has a very small bias, as expected. What is striking is that its standard error is much smaller than that of $\hat{d}_{SM}$.

Table 1: Empirical bias and standard error of the two estimators over 100 runs for various sample sizes. Columns: bias (UR, RW) and standard error (UR, RW).

Figure 1: Comparison of $\hat{d}_{SM}$ in UR (Uniform Random) sampling and $\hat{d}_H$ in RW (Random Walk) sampling. Panels A (for UR) and B (for RW) show how the estimate fluctuates as the sample size increases. Panels C (for UR) and D (for RW) show the box plots of 100 estimations for sample sizes ranging between 500 and [...]. Axes: sample size (x), estimated $\bar{d}$ (y).

We use Figure 1 to explain the result further. Panel C is the box plot for $\hat{d}_{SM}$ using uniform sampling.
It shows that the estimate fluctuates widely and can go as high as 1000 when n = 500, where the true mean is [...]. The large variance is ameliorated slightly as the sample size grows, but remains large. On the other hand, the box plot for $\hat{d}_H$ in Panel D shows a much smaller variance. We also run five large samples, each of size $4 \times 10^5$, as depicted in panels A (for UR) and B (for RW). Note that with uniform random sampling the estimate jumps from time to time even when the sample size is rather large. Figure 2 shows four estimations bounded by the 95% confidence interval calculated by Equation 16.

Figure 2: 95% confidence interval and four RW (Random Walk) estimation processes using the $\hat{d}_H$ estimator. The error bound is drawn from Equation 16. Axes: sample size (x), estimated $\bar{d}$ (y).

Figure 3: The degree distributions of the samples obtained from UR (Uniform Random) and RW (Random Walk) sampling, n = 500,000. The nodes, including the ones sampled repeatedly, are ranked in decreasing order of their degrees, and their degrees are plotted against their ranks. Axes: rank (x), degree (y).

3.3 Discussions

This paper shows that biased sampling is much better than uniform sampling for the estimation of average degrees. In the past, people tried to obtain uniform samples whenever possible, and resorted to biased sampling such as PPS (Probability Proportional to Size) sampling only when uniform sampling was impossible [22] or costly. The results of this paper suggest that, in the context of online social networks, random walk sampling should be used instead of uniform sampling, even when uniform random samples are readily accessible.

It is easy to understand that the variance of the uniform random estimator $\hat{d}_{SM}$ is large, because online social networks are mostly scale-free, as shown in Equation 9. The smaller variance of $\hat{d}_H$ can be explained as follows. Let $d^W$ be the random variable for the degrees sampled by random walk. First we draw its empirical distribution and compare it with that of uniformly sampled degrees in Figure 3. Uniform random (UR) samples resemble the distribution of the total population [23], which obeys a power law with exponent around one. On the other hand, in the random walk (RW) sampling scheme $d^W$ has a flatter starting section and a drooping tail, which can be approximated by the Mandelbrot law:

$$d^W_i = \frac{B}{(a + i)^{b}} \tag{17}$$

where b is the exponent, B is a normalization constant, and a is a constant that corresponds to the position where the curve droops down. Let

$$v_i = \frac{1}{d^W_i} \tag{18}$$

The variance of the reciprocal of the variable is

$$\begin{aligned}
\mathrm{var}(1/d^W) &= \left(\frac{n\sum_{i=1}^{n}(i+a)^{2b}}{\left(\sum_{i=1}^{n}(i+a)^{b}\right)^2} - 1\right)\bar{v}^2 \\
&\approx \left(n\left[\frac{n^{2b+1}}{2b+1}\right]\left[\frac{n^{b+1}}{b+1}\right]^{-2} - 1\right)\bar{v}^2
 = \left(\frac{(b+1)^2}{2b+1} - 1\right)\bar{v}^2
\end{aligned}$$

where the sums are approximated by integrals for n much larger than a. Thus $\mathrm{var}(1/d^W)$, relative to $\bar{v}^2$, is the constant $(b+1)^2/(2b+1) - 1 = b^2/(2b+1)$, which does not grow with the population size as $\sigma^2$ does.

4. IMPLICATIONS

The average degree plays a pivotal role in discovering other properties of a large network. Its accurate estimation has ramifications for a string of other hidden properties of large networks.
One immediate result is the total number of edges in the graph when the number of users is known. However, the more profound consequence is that we can estimate the heterogeneity, measured by the CV (Coefficient of Variation), of the entire network from a small sample using the average degree. The CV in turn yields other properties, such as the total number of users and the inequality of degrees (friends of friends and the Gini coefficient).

4.1 Estimate heterogeneity

$\bar{d}$ can be used to estimate the CV, or Coefficient of Variation (denoted as $\gamma$), an important metric of the heterogeneity of the degree distribution. It is defined as the standard deviation normalized by the average degree: $\gamma^2 = \sigma^2/\bar{d}^2$. Expanding the definition of variance we have

$$\gamma^2 + 1 = \frac{\frac{1}{N}\sum_{i=1}^{N} d_i^2 - \bar{d}^2}{\bar{d}^2} + 1
= \frac{1}{N}\sum_{i=1}^{N} d_i^2\left[\frac{1}{N}\sum_{i=1}^{N} d_i\right]^{-2}
= N\left[\sum_{i=1}^{N} d_i\right]^{-2}\sum_{i=1}^{N} d_i^2$$

On the other hand, the sample mean of the degrees obtained by random walk is

$$\bar{d}_W = \frac{1}{n}\sum_{i=1}^{n} d^W_i \approx \sum_{i=1}^{N} p_i d_i = \frac{1}{N\bar{d}}\sum_{i=1}^{N} d_i^2 \tag{19}$$

Combining the two equations we derive the estimator for the CV as follows:

$$\gamma^2 + 1 = \frac{\bar{d}_W}{\bar{d}} \tag{20}$$

where $\bar{d}_W$ is the sample (arithmetic) mean of the degrees obtained by random walk, and $\bar{d}$ can be estimated by the harmonic mean of the same data (Equation 13). The convenience of the method is that only one random walk is needed. Figure 4 shows 5 estimates that converge quickly as the sample size grows.

Figure 4: 5 estimation processes of $\gamma^2$ in Twitter data using Equation 20. The red dotted line is the true value. Axes: sample size (x), estimated squared CV (y).

Figure 5: 5 estimation processes of the number of Twitter accounts N using Equation 21. The red dotted line is the true value. Axes: sample size (x), estimated N (y).

4.2 Population estimation

Once $\gamma^2$ is available, it can be used to estimate the population size as follows, which is a special case of Eq 3.20 in [2]:

$$\hat{N} = (\gamma^2 + 1)\binom{n}{2}\frac{1}{C} \tag{21}$$

where n is the sample size, C is the number of collisions, and the sample is obtained by random walk (see footnote 2). In capture-recapture research [2, 17, 15], population estimation for heterogeneous data whose capture probabilities are unequal has long been a perplexing problem, mainly because the heterogeneity is difficult to estimate. In the setting of OSNs, the problem is solved thanks to the estimator $\hat{d}_H$. Because the heterogeneity of the data ($\gamma^2$) is predicted accurately, the estimate of the population size is rather good, as shown in Figure 5. Since this estimator hinges on the number of collisions, extra caution should be taken to avoid spurious collisions caused by the random walk. For instance, if a node A is connected only to node B, a visit to A will cause node B to be visited twice.
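
Equations 20 and 21 can be combined in a few lines. The sketch below is an illustration under assumptions rather than the paper's code: the sample is taken to be a list of (already thinned) node identifiers with their degrees, drawn with probability proportional to degree, and the demo replaces the random walk by idealized independent draws from a made-up Zipf population. All names and parameter values are hypothetical.

```python
import itertools
import random
from collections import Counter


def estimate_cv_and_size(node_ids, degrees):
    """Return (gamma^2, N_hat) from one degree-proportional sample.

    node_ids[k] and degrees[k] describe the k-th sampled node.
    """
    n = len(node_ids)
    walk_mean = sum(degrees) / n                    # arithmetic mean, Eq. 19
    harmonic = n / sum(1.0 / d for d in degrees)    # harmonic mean, Eq. 13
    gamma_sq_plus_1 = walk_mean / harmonic          # Eq. 20
    # A collision is a pair of sample positions holding the same node.
    collisions = sum(c * (c - 1) // 2 for c in Counter(node_ids).values())
    if collisions == 0:
        raise ValueError("no collisions: sample too small to estimate N")
    size = gamma_sq_plus_1 * (n * (n - 1) / 2) / collisions  # Eq. 21
    return gamma_sq_plus_1 - 1, size


def demo(num_nodes=200_000, scale=20_000, sample_size=20_000, seed=7):
    rng = random.Random(seed)
    # Hypothetical population: Zipf degrees d_i = scale / i, floored at 1.
    population = [max(1, round(scale / rank))
                  for rank in range(1, num_nodes + 1)]
    cum_weights = list(itertools.accumulate(population))
    # Assumption: independent draws ∝ degree stand in for a thinned walk.
    ids = rng.choices(range(num_nodes), cum_weights=cum_weights, k=sample_size)
    gamma_sq, size = estimate_cv_and_size(ids, [population[i] for i in ids])
    mean = sum(population) / num_nodes
    true_gamma_sq = sum(d * d for d in population) / num_nodes / mean ** 2 - 1
    return gamma_sq, true_gamma_sq, size, num_nodes


if __name__ == "__main__":
    gamma_sq, true_gamma_sq, size, num_nodes = demo()
    print(f"gamma^2: estimated {gamma_sq:.1f}, true {true_gamma_sq:.1f}")
    print(f"N      : estimated {size:.0f}, true {num_nodes}")
```

The independent draws used here cannot produce the spurious back-and-forth collisions that a raw walk can, which is why a real walk should be thinned before its node identifiers are passed to such a function.
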
To avoid such loops, we take samples spaced a few steps apart.

Footnote 2: Here is a simple derivation of the estimator. The expected number of collisions is

$$E(C) = \binom{n}{2}\sum_{i=1}^{N} p_i^2 = \binom{n}{2}\frac{1}{\tau^2}\sum_{i=1}^{N} d_i^2 = \binom{n}{2}\frac{\gamma^2 + 1}{N}$$

Solving for N gives Equation 21.

4.3 Other properties

4.3.1 Friends of friends

$\gamma^2$ can also be used to measure the ratio between the number of friends of your friends and the number of your own friends. As the saying goes, your friends have more friends than you do. To be more precise, your friends have $\gamma^2 + 1$ times as many friends as you do. The mean number of friends of friends is [4]

$$\frac{\sum_{i=1}^{N} d_i^2}{\sum_{i=1}^{N} d_i} = \bar{d} + \frac{\sigma^2}{\bar{d}} \tag{22}$$

The above equation shows that your friends have no fewer friends than you have. Simply rearranging the equation gives:

$$\frac{\sum_{i=1}^{N} d_i^2 \,/\, \sum_{i=1}^{N} d_i}{\bar{d}} = 1 + \frac{\sigma^2}{\bar{d}^2} = 1 + \gamma^2 \tag{23}$$

In words, the equation says that on average your friends have $1 + \gamma^2$ times as many friends as you do. In a homogeneous network where everybody has the same number of connections, $\gamma = 0$, so your friends have the same number of friends as you do. In the Twitter society, $\gamma^2$ is around 1000, so your friends have about a thousand times as many friends as you do.

4.3.2 Message diffusion

Along the same line, $\gamma^2$ can be used to quantify the diffusion of messages, an idea borrowed from epidemiology. In particular, it can be derived that the threshold for the occurrence of a large component, or the occurrence of epidemics [9] (Eq 7.8), is

$$\pi = \frac{(\gamma^2 + 1)\bar{d} - 2}{(\gamma^2 + 1)\bar{d} - 1} \tag{24}$$

where $\pi$ is the proportion of the nodes that are immunized uniformly at random in the network.

4.3.3 Clustering coefficient

Some structural network properties can also be derived using $\gamma^2$. For instance, one important network property is the clustering coefficient, which indicates the proportion of the friends of your friends who are also your friends. It is hard to calculate directly for a large network, but can be estimated [19] (Eq 13.47) by

$$\bar{d}\,\gamma^4 / N \tag{25}$$

4.3.4 Gini coefficient

The Gini index is used to measure the inequality of wealth. It can also be used to measure the inequality of friendships in OSNs. Using $\bar{d}$, the Gini coefficient can be approximated by

$$\hat{G} = \frac{1}{2n(n-1)\bar{d}}\sum_{i=1}^{n}\sum_{j=1}^{n} |d_i - d_j| \tag{26}$$

The classic problem of Gini coefficient estimation is that the mean is hard to obtain. Thanks to the estimation of the average degree, we find that the Gini coefficient of the Twitter network is around [...].

5. CONCLUSIONS

This paper proposes to use random walk to sample a network and to use the harmonic mean to estimate the average degree. The empirical experiments show that the estimator is much better even than the sample mean over uniform random samples. The method is very practical in that, within thousands or even hundreds of steps of random walk, we can learn the average degree of a large network containing tens of millions of nodes and billions of edges.
The method works well because of the scale-free nature of the underlying network, where the variance tends to be very large, and potentially unlimited as the network size becomes infinitely large. For such networks, we showed analytically that the harmonic mean estimator removes the large variance problem. Therefore the estimator works not only for online social networks, but also for any scale-free networks, which are ubiquitous and more common than random networks. For instance, we also validated the estimator on a document-term graph, where documents and terms are nodes that are connected if a document contains a term.

The method relies on the assumption that random walk produces samples whose selection probability is proportional to their degrees. Theoretically this is true only asymptotically. Therefore the samples taken before the mixing time should be thrown away. Our experiments show little difference whether or not the first batch of samples in the random walk is included.

The degree estimation is not only important by itself but also crucial for discovering other network properties. The successful estimation of the average degree can lead to the discovery of the heterogeneity of the underlying data, the numbers of users and links, etc. The method is not restricted to the degrees of explicit networks whose edges are friendship relations. Instead, the edges can be formed by other implicit relations, such as sharing the same message.

6. ACKNOWLEDGEMENTS

We thank the reviewers for their detailed comments, and acknowledge the support from NSERC (Natural Sciences and Engineering Research Council of Canada) and SSHRC (Social Sciences and Humanities Research Council of Canada).

7. REFERENCES

[1] Z. Bar-Yossef and M. Gurevich. Random sampling from a search engine's index. Journal of the ACM (JACM), 55(5):24.
[2] A. Chao, S. Lee, and S. Jeng. Estimating population size for capture-recapture data when capture probabilities vary by time and individual animal. Biometrics, pages 201-216, 1992.
[3] W. Cochran. Sampling techniques. Wiley-India.
[4] S. Feld. Why your friends have more friends than you do. American Journal of Sociology, 1991.
[5] M. Gjoka, M. Kurant, C. Butts, and A. Markopoulou. A walk in Facebook: Uniform sampling of users in online social networks. arXiv preprint.
[6] M. Gjoka, M. Kurant, C. Butts, and A. Markopoulou. Practical recommendations on crawling online social networks. IEEE Journal on Selected Areas in Communications, 29(9), 2011.
[7] M. Hansen and W. Hurwitz. On the theory of sampling from finite populations. The Annals of Mathematical Statistics, 14(4), 1943.
[8] M. Henzinger, A. Heydon, M. Mitzenmacher, and M. Najork. On near-uniform URL sampling. Computer Networks, 33(1-6).
[9] M. Jackson. Social and economic networks. Princeton University Press.
[10] L. Katzir, E. Liberty, and O. Somekh. Estimating sizes of social networks via biased sampling. In Proceedings of the 20th international conference on World Wide Web. ACM, 2011.
[11] M. Kurant, A. Markopoulou, and P. Thiran. Towards unbiased BFS sampling. IEEE Journal on Selected Areas in Communications, 29(9), 2011.
[12] H. Kwak, C. Lee, H. Park, and S. Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th international conference on World Wide Web. ACM, 2010.
[13] J. Leskovec and C. Faloutsos. Sampling from large graphs. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM.
[14] L. Lovász. Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2(1):1-46, 1993.
[15] J. Lu. Efficient estimation of the size of text deep web data source. In Proceedings of the 17th ACM conference on Information and knowledge management, Napa Valley, California, USA. ACM.
[16] J. Lu. Ranking bias in deep web size estimation using capture recapture method. Data & Knowledge Engineering, 69(8), 2010.
[17] J. Lu and D. Li. Estimating deep web data source size by capture recapture method. Information Retrieval, 13(1):70-95, 2010.
[18] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller. Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21:1087, 1953.
[19] M. Newman. Networks: an introduction. Oxford University Press, Inc., 2010.
[20] M. Papagelis, G. Das, and N. Koudas. Sampling online social networks. IEEE Transactions on Knowledge and Data Engineering, (99):1, 2011.
[21] A. Rasti, M. Torkjazi, R. Rejaie, N. Duffield, W. Willinger, and D. Stutzbach. Respondent-driven sampling for characterizing unstructured overlays. In INFOCOM 2009. IEEE.
[22] M. Salganik and D. Heckathorn. Sampling and estimation in hidden populations using respondent-driven sampling. Sociological Methodology, 34(1):193-240.
[23] M. Stumpf, C. Wiuf, and R. May. Subnets of scale-free networks are not scale-free: sampling properties of networks. Proceedings of the National Academy of Sciences of the United States of America, 102(12):4221.
[24] S. Thompson. Sampling. Wiley, 2012.
[25] T. Wang, Y. Chen, Z. Zhang, T. Xu, L. Jin, P. Hui, B. Deng, and X. Li. Understanding graph sampling algorithms for social network analysis. In the 3rd ICDCS Workshop on Simplifying Complex Networks for Practitioners, 2011.
[26] C. Wejnert and D. Heckathorn. Web-based network sampling. Sociological Methods & Research, 37(1):105-134.


Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Quadrat Sampling in Population Ecology

Quadrat Sampling in Population Ecology Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may

More information

Professional Networking

Professional Networking Professioal Networkig 1. Lear from people who ve bee where you are. Oe of your best resources for etworkig is alumi from your school. They ve take the classes you have take, they have bee o the job market

More information

Lecture 2: Karger s Min Cut Algorithm

Lecture 2: Karger s Min Cut Algorithm priceto uiv. F 3 cos 5: Advaced Algorithm Desig Lecture : Karger s Mi Cut Algorithm Lecturer: Sajeev Arora Scribe:Sajeev Today s topic is simple but gorgeous: Karger s mi cut algorithm ad its extesio.

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number. GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea - add up all

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means) CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection

The analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments

Project Deliverables. CS 361, Lecture 28. Outline. Project Deliverables. Administrative. Project Comments Project Deliverables CS 361, Lecture 28 Jared Saia Uiversity of New Mexico Each Group should tur i oe group project cosistig of: About 6-12 pages of text (ca be loger with appedix) 6-12 figures (please

More information

Statistical inference: example 1. Inferential Statistics

Statistical inference: example 1. Inferential Statistics Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either

More information

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu Multi-server Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio -coectio

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

Confidence Intervals

Confidence Intervals Cofidece Itervals Cofidece Itervals are a extesio of the cocept of Margi of Error which we met earlier i this course. Remember we saw: The sample proportio will differ from the populatio proportio by more

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Inference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval

Inference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval Chapter 8 Tests of Statistical Hypotheses 8. Tests about Proportios HT - Iferece o Proportio Parameter: Populatio Proportio p (or π) (Percetage of people has o health isurace) x Statistic: Sample Proportio

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

Capacity of Wireless Networks with Heterogeneous Traffic

Capacity of Wireless Networks with Heterogeneous Traffic Capacity of Wireless Networks with Heterogeeous Traffic Migyue Ji, Zheg Wag, Hamid R. Sadjadpour, J.J. Garcia-Lua-Aceves Departmet of Electrical Egieerig ad Computer Egieerig Uiversity of Califoria, Sata

More information

Lesson 15 ANOVA (analysis of variance)

Lesson 15 ANOVA (analysis of variance) Outlie Variability -betwee group variability -withi group variability -total variability -F-ratio Computatio -sums of squares (betwee/withi/total -degrees of freedom (betwee/withi/total -mea square (betwee/withi

More information

THE HEIGHT OF q-binary SEARCH TREES

THE HEIGHT OF q-binary SEARCH TREES THE HEIGHT OF q-binary SEARCH TREES MICHAEL DRMOTA AND HELMUT PRODINGER Abstract. q biary search trees are obtaied from words, equipped with the geometric distributio istead of permutatios. The average

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem () Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test) No-Parametric ivariate Statistics: Wilcoxo-Ma-Whitey 2 Sample Test 1 Ma-Whitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo-) Ma-Whitey (WMW) test is the o-parametric equivalet of a pooled

More information

Is there employment discrimination against the disabled? Melanie K Jones i. University of Wales, Swansea

Is there employment discrimination against the disabled? Melanie K Jones i. University of Wales, Swansea Is there employmet discrimiatio agaist the disabled? Melaie K Joes i Uiversity of Wales, Swasea Abstract Whilst cotrollig for uobserved productivity differeces, the gap i employmet probabilities betwee

More information

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth

.04. This means $1000 is multiplied by 1.02 five times, once for each of the remaining sixmonth Questio 1: What is a ordiary auity? Let s look at a ordiary auity that is certai ad simple. By this, we mea a auity over a fixed term whose paymet period matches the iterest coversio period. Additioally,

More information

Elementary Theory of Russian Roulette

Elementary Theory of Russian Roulette Elemetary Theory of Russia Roulette -iterestig patters of fractios- Satoshi Hashiba Daisuke Miematsu Ryohei Miyadera Itroductio. Today we are goig to study mathematical theory of Russia roulette. If some

More information

Biology 171L Environment and Ecology Lab Lab 2: Descriptive Statistics, Presenting Data and Graphing Relationships

Biology 171L Environment and Ecology Lab Lab 2: Descriptive Statistics, Presenting Data and Graphing Relationships Biology 171L Eviromet ad Ecology Lab Lab : Descriptive Statistics, Presetig Data ad Graphig Relatioships Itroductio Log lists of data are ofte ot very useful for idetifyig geeral treds i the data or the

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

Using Four Types Of Notches For Comparison Between Chezy s Constant(C) And Manning s Constant (N)

Using Four Types Of Notches For Comparison Between Chezy s Constant(C) And Manning s Constant (N) INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH OLUME, ISSUE, OCTOBER ISSN - Usig Four Types Of Notches For Compariso Betwee Chezy s Costat(C) Ad Maig s Costat (N) Joyce Edwi Bategeleza, Deepak

More information

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions

Chapter 5 Unit 1. IET 350 Engineering Economics. Learning Objectives Chapter 5. Learning Objectives Unit 1. Annual Amount and Gradient Functions Chapter 5 Uit Aual Amout ad Gradiet Fuctios IET 350 Egieerig Ecoomics Learig Objectives Chapter 5 Upo completio of this chapter you should uderstad: Calculatig future values from aual amouts. Calculatig

More information

Present Values, Investment Returns and Discount Rates

Present Values, Investment Returns and Discount Rates Preset Values, Ivestmet Returs ad Discout Rates Dimitry Midli, ASA, MAAA, PhD Presidet CDI Advisors LLC dmidli@cdiadvisors.com May 2, 203 Copyright 20, CDI Advisors LLC The cocept of preset value lies

More information

Decomposition of Gini and the generalized entropy inequality measures. Abstract

Decomposition of Gini and the generalized entropy inequality measures. Abstract Decompositio of Gii ad the geeralized etropy iequality measures Stéphae Mussard LAMETA Uiversity of Motpellier I Fraçoise Seyte LAMETA Uiversity of Motpellier I Michel Terraza LAMETA Uiversity of Motpellier

More information

Optimal Adaptive Bandwidth Monitoring for QoS Based Retrieval

Optimal Adaptive Bandwidth Monitoring for QoS Based Retrieval 1 Optimal Adaptive Badwidth Moitorig for QoS Based Retrieval Yizhe Yu, Iree Cheg ad Aup Basu (Seior Member) Departmet of Computig Sciece Uiversity of Alberta Edmoto, AB, T6G E8, CAADA {yizhe, aup, li}@cs.ualberta.ca

More information

Chapter 14 Nonparametric Statistics

Chapter 14 Nonparametric Statistics Chapter 14 Noparametric Statistics A.K.A. distributio-free statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

Institute of Actuaries of India Subject CT1 Financial Mathematics

Institute of Actuaries of India Subject CT1 Financial Mathematics Istitute of Actuaries of Idia Subject CT1 Fiacial Mathematics For 2014 Examiatios Subject CT1 Fiacial Mathematics Core Techical Aim The aim of the Fiacial Mathematics subject is to provide a groudig i

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

Research Article Sign Data Derivative Recovery

Research Article Sign Data Derivative Recovery Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov

More information

Infinite Sequences and Series

Infinite Sequences and Series CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...

More information

INFINITE SERIES KEITH CONRAD

INFINITE SERIES KEITH CONRAD INFINITE SERIES KEITH CONRAD. Itroductio The two basic cocepts of calculus, differetiatio ad itegratio, are defied i terms of limits (Newto quotiets ad Riema sums). I additio to these is a third fudametal

More information

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized?

5.4 Amortization. Question 1: How do you find the present value of an annuity? Question 2: How is a loan amortized? 5.4 Amortizatio Questio 1: How do you fid the preset value of a auity? Questio 2: How is a loa amortized? Questio 3: How do you make a amortizatio table? Oe of the most commo fiacial istrumets a perso

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

The Stable Marriage Problem

The Stable Marriage Problem The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,

More information

Estimating Probability Distributions by Observing Betting Practices

Estimating Probability Distributions by Observing Betting Practices 5th Iteratioal Symposium o Imprecise Probability: Theories ad Applicatios, Prague, Czech Republic, 007 Estimatig Probability Distributios by Observig Bettig Practices Dr C Lych Natioal Uiversity of Irelad,

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies ( 3.1.1) Limitations of Experiments. Pseudocode ( 3.1.2) Theoretical Analysis Ruig Time ( 3.) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Research Method (I) --Knowledge on Sampling (Simple Random Sampling)

Research Method (I) --Knowledge on Sampling (Simple Random Sampling) Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact

More information

BENEFIT-COST ANALYSIS Financial and Economic Appraisal using Spreadsheets

BENEFIT-COST ANALYSIS Financial and Economic Appraisal using Spreadsheets BENEIT-CST ANALYSIS iacial ad Ecoomic Appraisal usig Spreadsheets Ch. 2: Ivestmet Appraisal - Priciples Harry Campbell & Richard Brow School of Ecoomics The Uiversity of Queeslad Review of basic cocepts

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT

Vladimir N. Burkov, Dmitri A. Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT Keywords: project maagemet, resource allocatio, etwork plaig Vladimir N Burkov, Dmitri A Novikov MODELS AND METHODS OF MULTIPROJECTS MANAGEMENT The paper deals with the problems of resource allocatio betwee

More information

Reliability Analysis in HPC clusters

Reliability Analysis in HPC clusters Reliability Aalysis i HPC clusters Narasimha Raju, Gottumukkala, Yuda Liu, Chokchai Box Leagsuksu 1, Raja Nassar, Stephe Scott 2 College of Egieerig & Sciece, Louisiaa ech Uiversity Oak Ridge Natioal Lab

More information

THE ROLE OF EXPORTS IN ECONOMIC GROWTH WITH REFERENCE TO ETHIOPIAN COUNTRY

THE ROLE OF EXPORTS IN ECONOMIC GROWTH WITH REFERENCE TO ETHIOPIAN COUNTRY - THE ROLE OF EXPORTS IN ECONOMIC GROWTH WITH REFERENCE TO ETHIOPIAN COUNTRY BY: FAYE ENSERMU CHEMEDA Ethio-Italia Cooperatio Arsi-Bale Rural developmet Project Paper Prepared for the Coferece o Aual Meetig

More information

Mathematical goals. Starting points. Materials required. Time needed

Mathematical goals. Starting points. Materials required. Time needed Level A1 of challege: C A1 Mathematical goals Startig poits Materials required Time eeded Iterpretig algebraic expressios To help learers to: traslate betwee words, symbols, tables, ad area represetatios

More information

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows:

Your organization has a Class B IP address of 166.144.0.0 Before you implement subnetting, the Network ID and Host ID are divided as follows: Subettig Subettig is used to subdivide a sigle class of etwork i to multiple smaller etworks. Example: Your orgaizatio has a Class B IP address of 166.144.0.0 Before you implemet subettig, the Network

More information

3 Basic Definitions of Probability Theory

3 Basic Definitions of Probability Theory 3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

One-sample test of proportions

One-sample test of proportions Oe-sample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information