Space-Efficient Estimation of Statistics over Sub-Sampled Streams
Andrew McGregor, A. Pavan, Srikanta Tirthapura, David Woodruff

Andrew McGregor, University of Massachusetts. E-mail: mcgregor@cs.umass.edu. Supported by NSF CAREER Award CCF.
A. Pavan, Iowa State University. E-mail: pavan@cs.iastate.edu. Supported in part by NSF CCF.
Srikanta Tirthapura, Iowa State University. E-mail: snt@iastate.edu. Supported in part by NSF CNS, CNS.
David P. Woodruff, IBM Almaden. E-mail: dpwoodru@us.ibm.com

Abstract In many stream monitoring situations, the data arrival rate is so high that it is not even possible to observe each element of the stream. The most common solution is to subsample the data stream and use the sample to infer properties and estimate aggregates of the original stream. However, in many cases, the estimation of aggregates on the original stream cannot be accomplished through simply estimating them on the sampled stream, followed by a normalization. We present algorithms for estimating frequency moments, support size, entropy, and heavy hitters of the original stream, through a single pass over the sampled stream.

Keywords data streams, frequency moments, sub-sampling

1 Introduction

In many stream monitoring situations, the data arrival rate is so high that it is not possible to observe each element in the stream. The most common solution is to sub-sample the data stream and use the sample to infer properties of the original stream. For example, in an IP router, aggregated statistics of the packet stream are maintained through a protocol such as Netflow [9]. In high-end routers, the load due to statistics maintenance can be so high that a variant of Netflow called sampled Netflow has been developed. In randomly sampled netflow, the monitor gets to view only a random sample of the packet stream, and must maintain statistics on the original stream, using this view.

In such scenarios of extreme data deluge, we are faced with two constraints on data processing. First, the entire data set is not seen by the monitor; only a random sample is seen. Second, even the random sample of the input is too large to be stored in main memory (or in secondary memory), and must be processed in a single pass through the data, as in the usual data stream model.

While there has been a large body of work that has dealt with data processing using a random sample (see for example, [3, 4]), and extensive work on the one-pass data stream model (see for example, [1, 9, 33]), there has been little work so far on data processing in the presence of both constraints, where only a random sample of the data set must be processed in a streaming fashion. We note that the estimation of frequency moments over a sampled stream is one of the open problems from [31], posed as Question 13, "Effects of Subsampling".

1.1 Problem Setting

We assume the setting of Bernoulli sampling, described as follows. Consider an input stream $P = \langle a_1, a_2, \ldots, a_n \rangle$ where $a_i \in \{1, 2, \ldots, m\}$. For a parameter $p$, $0 < p \le 1$, a sub-stream of $P$, denoted $L$, is constructed as follows. For $1 \le i \le n$, $a_i$ is included in $L$ with probability $p$. The stream processor is only allowed to see $L$, and cannot see $P$. The goal is to estimate properties of $P$ through processing stream $L$. In the following discussion, $L$ is called the sampled stream, and $P$ is called the original stream.

1.2 Our Results

We present algorithms and lower bounds for estimating key aggregates of a data stream by processing a randomly sampled substream. We consider the basic frequency related aggregates, including the number of distinct elements, the frequency moments, the empirical entropy of the frequency distribution, and the heavy hitters.

1. Frequency Moments: For the frequency moments $F_k$ for $k \ge 2$, we present $(1+\varepsilon, \delta)$-approximation algorithms with space complexity $\tilde{O}(p^{-1} m^{1-2/k})$ (see footnote 1). This result yields an interesting tradeoff between the sampling probability and the space used by the algorithm. The smaller the sampling probability (up to a certain minimum probability), the greater is the streaming space complexity of our algorithm. The algorithm is presented in Section 3.
2. Distinct Elements: For the number of distinct elements, $F_0$, we show that the current best offline methods for estimating $F_0$ from a random sample can be implemented in a streaming fashion using very small space. While it is known that random sampling can significantly reduce the accuracy of an estimate for $F_0$ [7], we show that the need to process this stream using small space does not. The upper and lower bounds are presented in Section 4.

3. Entropy: For estimating entropy we first show that no multiplicative approximation is possible in general even when $p$ is constant. However, we show that estimating the empirical entropy on the sampled stream yields a constant factor approximation to the entropy of the original stream if the entropy is larger than some vanishingly small function of $p$ and $n$. These results are presented in Section 5.

Footnote 1: The $\tilde{O}$ notation suppresses factors polynomial in $1/\varepsilon$ and $1/\delta$ and factors logarithmic in $m$ and $n$.
4. Heavy Hitters: We show tight bounds for identifying a set of $O(1/\alpha)$ elements whose frequency exceeds $\alpha F_k^{1/k}$ for $k \in \{1, 2\}$. In the case of $k = 1$, we show that existing heavy hitter algorithms can be used if the stream is sufficiently long compared with $p$. In the case of $k = 2$, we show how to adapt ideas used in Section 3 to arrive at an algorithm that uses space $\tilde{O}(1/p)$.

Another way of interpreting our results is in terms of time-space tradeoffs for data stream problems. Almost every streaming algorithm has a time complexity of at least $n$, since the algorithm reads and processes each stream update. We show that for estimating $F_k$ (and other problems) it is unnecessary to process each update; instead, it suffices for the algorithm to read each item independently with probability $p$, and maintain a data structure of size $\tilde{O}(p^{-1} m^{1-2/k})$. Interestingly, the time to update the data structure per sampled stream item is still only $\tilde{O}(1)$. The time to output an estimate at the end of observation is $\tilde{O}(p^{-1} m^{1-2/k})$, i.e., roughly linear in the size of the data structure. As an example of the type of tradeoffs that are achievable, for estimating $F_2$ if $n = \Theta(m)$ we can set $p = \tilde{\Theta}(1/\sqrt{n})$ and obtain an algorithm using $\tilde{O}(\sqrt{n})$ total processing time and $\tilde{O}(\sqrt{n})$ workspace.

1.3 Related Work

There is a large body of prior work at the intersection of random sampling and data stream processing. Some of this work is along the lines of methods for random sampling from a data stream, including the reservoir sampling algorithm, attributed to Waterman (also see [37]). There has been much follow up on variants and generalizations of reservoir sampling, see for example [2,16,20,30,36]. While this line of work focuses on how to efficiently sample from a stream, our work focuses on how to process a stream that has already been sampled.

Stream sampling is a well-researched method for managing the load on network monitors, while enabling accurate measurement. Packets are grouped into flows based on the values of certain attributes within the packet header.
One commonly used sampling method is the sampled netflow model (NF) [3], which is the same as the Bernoulli sampling that we consider here, where packets are sampled independent of each other. Other methods of sampling are also considered under the general umbrella of sampled netflow, such as deterministic sampling (one out of every $n$ packets). Another sampling method is the sample-and-hold model (SH) [], where, once a packet is sampled from a flow, all other packets belonging to that flow are also sampled. The priority sampling procedure [19] is a method for sampling from a weighted stream so that we can get unbiased estimators of individual weights with small variance. Szegedy [35] has shown that the priority sampling method of [19] essentially gets the smallest possible variance, given a fixed sample size.

In addition, various combinations and enhancements to these sampling mechanisms have been proposed [10-12, 21]. In particular, [1] presents methods for better tuning sampling parameters and for exporting partial summaries to slower storage, [1] presents methods that dynamically adapt the sampling rate to achieve a desired level of accuracy, [10] present structure-aware sampling methods that provide improved accuracy when compared with NF on specific range queries of interest, and [11] presents stream sampling schemes for variance-optimal estimation of the total weight of an arbitrary subset of the stream of a certain size. There is much other work along the lines of optimizing sampling methods for accurate estimation of a specific class of aggregates on the original stream. Typical aggregates of interest include the distribution of the number of packets in different flows, and aggregates over sub-populations of all flows. The above line of work tailors the sampling scheme towards specific goals, while we consider a simple but general sampling scheme, Bernoulli sampling, and explore how to efficiently process data under this sampling strategy. In many situations, including with sampled netflow, the sampling strategy is already decided by an external entity, such as the router, over which we may not have control.

Duffield et al. [17] consider the estimation of the sizes of IP flows and the number of IP flows in a packet stream through observing the sampled stream. In a follow up work [18], they provide methods for estimating the distribution of the sizes of the input flows by observing samples of the original stream; this can be viewed as constructing an approximate histogram. The techniques used there are maximum likelihood estimation, as well as protocol level detail at the IP and TCP level. Other work along these lines includes the work on inverting sampled traffic [6], which aims to recover the distribution of the original traffic through analyzing the sample, and work in [5, 13] which seeks to answer top-k queries and rank flows through analyzing the sample. While this line of work deals with inference from a random sample in detail, it does not consider the issue of processing the sample in a streaming manner using limited space, as we do here. Further, we consider aggregates such as frequency moments and entropy, which do not seem to have been investigated in detail on sampled streams in prior work on network monitoring. In particular, even when the space complexity of an algorithm is high, we present space lower bounds that help understand the extent to which these aggregates can be estimated.

Rusu and Dobra [34] consider the estimation of the second frequency moment of a stream, equivalently, the size of the self-join, through processing the sampled stream. Our work differs from theirs in the following ways. While [34] do not explicitly mention the space bound of their algorithm, we derived a $(1+\varepsilon, \delta)$ estimator for $F_2$ based on their algorithm and found that the estimator took $\tilde{O}(1/p^2)$ space.
We improve the dependence on the sampling probability and obtain an algorithm that only requires $\tilde{O}(1/p)$ space. This dependence on the sampling probability $p$ is optimal. Our technique is also different from theirs. Ours relies on counting the number of collisions in the sampled stream, while theirs relies on scaling an estimate of the second frequency moment of the sampled stream. We also consider higher frequency moments $F_k$, for $k > 2$, as well as the entropy, while they do not.

Bhattacharya et al. [6] consider stream processing in the model where the stream processor can adaptively "skip" past stream elements, and look at only a fraction of the input stream, thus speeding up stream computation. In their model, the stream processor has the power to decide which elements to see and which to skip past, hence it is "adaptive"; in our model, the stream processor does not have such power, and must deal with the randomly sampled stream that is presented to it. Our model reflects the setup in current network monitoring equipment, such as Randomly Sampled Netflow [9]. They present a constant factor approximation for $F_2$, while we present $(1+\varepsilon, \delta)$ approximations for all frequency moments $F_k$ for $k \ge 2$.

Bar-Yossef [3] presents lower bounds on the sampling probability, or equivalently, the number of samples needed to estimate certain properties of a data set, including the frequency moments. This yields a minimum sampling probability for the Bernoulli sampler that we consider, below which it is not possible to estimate aggregates accurately, whether streaming or otherwise. This is relevant to Theorem 1 in our paper, which assumes that the sampling probability must be at least a certain value.

There is work on probabilistic data streams [14,8], where the data stream itself consists of "probabilistic" data, and each element of the stream is a probability distribution over a set of possible events. Unlike in our model, the stream processor gets to see the entire input in the probabilistic streams model.

Remark. The preliminary conference version of this paper claimed matching lower bounds for estimating $F_k$ and heavy hitters [3]. The claimed lower bounds crucially depend on lower bounds obtained in an earlier work of Guha and Huang [4]. However, a problem has been found with the bounds of [4]. Thus the lower bound proofs that were presented in [3] do not hold.

2 Notation and Preliminaries

Throughout this paper, we will denote the original length-$n$ stream by $P = \langle a_1, a_2, \ldots, a_n \rangle$ and will assume that each element $a_i \in \{1, 2, \ldots, m\}$. We denote the sampling probability with $p$. The sampled stream $L$ is constructed by including each $a_i$ in $L$ with probability $p$, independent of the other elements. It is assumed that the sampling probability $p$ is fixed in advance and is known to the algorithm. Throughout let $f_i$ be the frequency of item $i$ in the original stream $P$. Let $g_i$ be the frequency in the sub-sampled stream and note that $g_i \sim \mathrm{Bin}(f_i, p)$. The streams $P$ and $L$ define frequency vectors $f = (f_1, f_2, \ldots, f_m)$ and $g = (g_1, g_2, \ldots, g_m)$ respectively. When considering a function $F$ on a stream (e.g., a frequency moment or the entropy) we will write $F(P)$ and $F(L)$ to indicate the value of the function on the original and sampled stream respectively. When the context is clear, we will also abuse notation and use $F$ to indicate $F(P)$.

We are primarily interested in randomized multiplicative approximations.

Definition 1 For $\alpha > 1$ and $\delta \in [0, 1]$, we say $\tilde{X}$ is an $(\alpha, \delta)$-estimator for $X$ if $\Pr[\alpha^{-1} \le X/\tilde{X} \le \alpha] \ge 1 - \delta$.

We use the notation $\tilde{O}$ to suppress factors polynomial in $1/\varepsilon$, $1/\delta$ and logarithmic in $n$. More precisely, given two functions $f$ and $g$ and constants $\varepsilon > 0$ and $\delta > 0$, we write $f \in \tilde{O}(g)$ to denote $f \in O(\mathrm{poly}(1/\varepsilon, 1/\delta, \log n) \cdot g)$. Similarly we write $f \in \tilde{\Omega}(g)$ to denote $f \in \Omega(\mathrm{poly}(1/\varepsilon, 1/\delta, \log n) \cdot g)$.

3 Frequency Moments

In this section, we present an algorithm for estimating the $k$th frequency moment $F_k$. The main theorem of this section is as follows.
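Theorem 1 For $k \ge 2$, there is a one pass streaming algorithm which observes $L$ and outputs a $(1+\varepsilon, \delta)$-estimator for $F_k(P)$ using $\tilde{O}(p^{-1} m^{1-2/k})$ space, assuming $p = \tilde{\Omega}(\min(m, n)^{-1/k})$.

For $p = \tilde{o}(\min(m, n)^{-1/k})$ there is not enough information in the sampled stream to obtain a $(1+\varepsilon, \delta)$ approximation to $F_k(P)$ with any amount of space, see Theorem 4.33 of [3].

Definition 2 For $1 \le \ell \le k$ define the number of $\ell$-wise collisions to be
\[ C_\ell(P) = \sum_{i=1}^{m} \binom{f_i}{\ell} \quad \text{and} \quad C_\ell(L) = \sum_{i=1}^{m} \binom{g_i}{\ell}. \]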
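The sampling model and the collision counts of Definition 2 can be made concrete with a small sketch. The function names `bernoulli_sample` and `collisions` are illustrative and not from the paper, and the sketch stores exact frequencies, so it ignores the small-space requirement entirely.

```python
import random
from collections import Counter
from math import comb


def bernoulli_sample(stream, p, rng):
    """Keep each element of `stream` independently with probability p."""
    return [a for a in stream if rng.random() < p]


def collisions(stream, ell):
    """Number of ell-wise collisions: sum over items of binom(frequency, ell)."""
    return sum(comb(c, ell) for c in Counter(stream).values())


# Original stream P with frequency vector f = (6, 3, 1).
P = [1] * 6 + [2] * 3 + [3]
L = bernoulli_sample(P, 0.5, random.Random(0))

print(collisions(P, 2))  # binom(6,2) + binom(3,2) + binom(1,2) = 15 + 3 + 0
print(collisions(P, 3))  # binom(6,3) + binom(3,3) + binom(1,3) = 20 + 1 + 0
print(len(L), collisions(L, 2))  # depends on the random sample
```

Exact counting is used here only to pin down the quantities; the algorithm in the paper estimates $C_\ell(L)$ in small space.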
Our algorithm is based on the following connection between the $\ell$th frequency moment of a stream and the $\ell$-wise collisions in the stream.

Lemma 1 For $1 \le \ell \le k$,
\[ F_\ell(P) = \ell!\, C_\ell(P) + \sum_{l=1}^{\ell-1} \beta^{\ell}_{l} F_l(P) \qquad (1) \]
where
\[ \beta^{\ell}_{l} = (-1)^{\ell-l+1} \sum_{1 \le j_1 < \ldots < j_{\ell-l} \le \ell-1} j_1 j_2 \cdots j_{\ell-l}. \]

Proof The relationship follows from
\[ \ell!\, C_\ell(P) = \sum_{i=1}^{m} f_i (f_i - 1) \cdots (f_i - \ell + 1) = \sum_{i=1}^{m} \Big( f_i^{\ell} - f_i^{\ell-1} \sum_{1 \le j_1 \le \ell-1} j_1 + f_i^{\ell-2} \sum_{1 \le j_1 < j_2 \le \ell-1} j_1 j_2 - \ldots \Big) = F_\ell(P) - \sum_{l=1}^{\ell-1} \beta^{\ell}_{l} F_l(P). \]

The following lemma relates the expectation of $C_\ell(L)$ to $C_\ell(P)$ and bounds the variance.

Lemma 2 For $1 \le \ell \le k$, $E[C_\ell(L)] = p^{\ell} C_\ell(P)$ and $V[C_\ell(L)] = O(p^{2\ell-1} F_\ell^{2-1/\ell})$.

Proof Let $C$ denote $C_\ell(L)$. Since each $\ell$-wise collision in $P$ appears in $L$ with probability $p^{\ell}$, we have $E[C] = p^{\ell} C_\ell(P)$. For each $i \in [m]$, let $C_i$ be the number of $\ell$-wise collisions in $L$ among items that equal $i$. Then $C = \sum_{i \in [m]} C_i$. By independence of the $C_i$,
\[ V[C] = \sum_{i \in [m]} V[C_i]. \]
Fix an $i \in [m]$. Let $S_i$ be the set of indices in the original stream equal to $i$. For each $J \subseteq S_i$ with $|J| = \ell$, let $X_J$ be an indicator random variable for the event that each of the stream elements in $J$ appears in the sampled stream. Then $C_i = \sum_J X_J$. Hence,
\[ V[C_i] = \sum_{J, J'} \big( E[X_J X_{J'}] - E[X_J] E[X_{J'}] \big) = \sum_{J, J'} \big( p^{|J \cup J'|} - p^{2\ell} \big) \le \sum_{j=1}^{\ell} \binom{f_i}{2\ell-j} \binom{2\ell-j}{j} \binom{2\ell-2j}{\ell-j} p^{2\ell-j} = O\Big( \sum_{j=1}^{\ell} f_i^{2\ell-j} p^{2\ell-j} \Big), \]
where $j = |J \cap J'|$ and pairs with $J \cap J' = \emptyset$ contribute zero.

Since $F_{2\ell-j}^{1/(2\ell-j)} \le F_\ell^{1/\ell}$ for all $j = 1, \ldots, \ell$, we have
\[ V[C] = O(1) \sum_{j=1}^{\ell} F_{2\ell-j}\, p^{2\ell-j} = O(1) \sum_{j=1}^{\ell} F_\ell^{(2\ell-j)/\ell}\, p^{2\ell-j}. \]
If we can show that the first term of this sum dominates, the desired variance bound follows. This is the case if $p F_\ell^{1/\ell} \ge 1$, since this is the ratio of two consecutive summands. Note that $F_\ell$ is minimized for a fixed $F_0$ and $F_1$ when there are $F_0$ frequencies each of value $F_1/F_0$. In this case,
\[ F_\ell^{1/\ell} = \big( F_0 (F_1/F_0)^{\ell} \big)^{1/\ell} = F_1 / F_0^{1-1/\ell}. \]
Hence, $p \ge 1/F_\ell^{1/\ell}$ if $p \ge F_0^{1-1/\ell}/F_1$, which holds by assumption.

We next describe the intuition behind our algorithm. To estimate $F_k(P)$, by Eq. 1, it suffices to obtain estimates for $F_1(P), F_2(P), \ldots, F_{k-1}(P)$ and $C_k(P)$ (one of the caveats is that some of the coefficients of $F_i(P)$ are negative, which we handle as explained below). Our algorithm attempts to estimate $F_\ell(P)$ for $\ell = 1, 2, \ldots$ inductively. Since, by Chernoff bounds, $F_1(P)$ is very close to $F_1(L)/p$, $F_1(P)$ can be estimated easily. Thus our problem reduces to estimating $C_k(P)$ by observing the sub-sampled stream $L$. Since the expected number of collisions in $L$ equals $p^k C_k(P)$, our algorithm will attempt to estimate $C_k(L)$, the number of $k$-wise collisions in the sub-sampled stream. It is not possible to find a good relative approximation of $C_k(L)$ in small space if $C_k(L)$ is small. However, when $C_k(L)$ is small, it does not contribute significantly to the final answer and we do not need a good relative error approximation. We only need that our estimator does not grossly overestimate $C_k(L)$. Our algorithm to estimate $C_k(L)$ will have the following property: if $C_k(L)$ is large, then it outputs a good relative error approximation, and if $C_k(L)$ is small then it outputs a value that is at most $3 C_k(L)$.

Another caveat is that some of the $\beta^{\ell}_{i}$'s could be negative. Thus a priori it is not clear that our strategy of estimating $F_\ell(P)$ by estimating $F_1(P), F_2(P), \ldots, F_{k-1}(P)$, $C_k(P)$, and applying Equation 1 works.
However, by using a careful choice of approximation errors and the fact that $F_i(P) \ge F_j(P)$ when $i > j$, we argue that this approach succeeds in obtaining a good approximation of $F_\ell(P)$.

3.1 The Algorithm

Define a sequence of random variables $\phi_\ell$:
\[ \phi_1 = \frac{F_1(L)}{p}, \quad \text{and} \quad \phi_\ell = \frac{C_\ell(L)\, \ell!}{p^{\ell}} + \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} \phi_i \ \text{ for } \ell > 1. \]
Algorithm 1 inductively computes an estimate $\tilde{\phi}_i$ for each $\phi_i$. Note that if $C_\ell(L)/p^{\ell}$ takes its expected value of $C_\ell(P)$ and we could compute $C_\ell(L)$ exactly, then Eq. 1 implies that the algorithm would return $F_k(P)$ exactly. While this is excessively optimistic, we will show that $C_\ell(L)/p^{\ell}$ is sufficiently close to $C_\ell(P)$ with high probability and that we can construct an estimate $\tilde{C}_\ell(L)$ for $C_\ell(L)$ such that the final result returned is still a $(1+\varepsilon)$ approximation for $F_k(P)$ with probability at least $1 - \delta$.
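To make the recurrence for $\phi_\ell$ concrete, here is a minimal sketch that evaluates it with exact collision counts of the sampled frequencies. This is the idealized quantity $\phi_k$, not the small-space estimate $\tilde{\phi}_k$; the names `beta` and `phi` are illustrative, and the sign convention for $\beta^{\ell}_{i}$ is the one obtained by expanding $f(f-1)\cdots(f-\ell+1)$, which is an assumption about the paper's exact notation.

```python
import random
from itertools import combinations
from math import comb, factorial, prod


def beta(ell, i):
    """Coefficient of F_i when F_ell is written via ell! * C_ell and lower moments."""
    size = ell - i
    # Elementary symmetric polynomial of degree `size` in 1, ..., ell-1.
    e = sum(prod(js) for js in combinations(range(1, ell), size))
    return (-1) ** (size + 1) * e


def phi(freqs, k, p):
    """Evaluate phi_k from sampled frequencies `freqs` and sampling probability p."""
    est = {1: sum(freqs) / p}
    for ell in range(2, k + 1):
        c = sum(comb(g, ell) for g in freqs)  # exact C_ell of the sample
        est[ell] = factorial(ell) * c / p**ell + sum(
            beta(ell, i) * est[i] for i in range(1, ell)
        )
    return est[k]


# With p = 1 the sample is the whole stream and phi_k equals F_k exactly.
f = [6, 3, 1]
print(phi(f, 2, 1.0), phi(f, 3, 1.0))

# With p < 1, phi_k is random; averaging many trials should land near F_k.
rng = random.Random(0)
p = 0.5
trials = [phi([sum(rng.random() < p for _ in range(fi)) for fi in f], 2, p)
          for _ in range(2000)]
print(sum(trials) / len(trials))
```

For `f = [6, 3, 1]` the true moments are $F_2 = 46$ and $F_3 = 244$, which is what the `p = 1` call reproduces.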
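Algorithm 1: $F_k(P)$
1 Compute $F_1(L)$ exactly and set $\tilde{\phi}_1 = F_1(L)/p$.
2 for $\ell = 2$ to $k$ do
3   Let $\tilde{C}_\ell(L)$ be an estimate for $C_\ell(L)$, computed as described in the text.
4   Compute $\tilde{\phi}_\ell = \tilde{C}_\ell(L)\, \ell! / p^{\ell} + \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} \tilde{\phi}_i$
5 end
6 Return $\tilde{\phi}_k$.

We compute our estimate of $C_\ell(L)$ via an algorithm by Indyk and Woodruff [7]. This algorithm attempts to obtain a $1 + \varepsilon_{\ell-1}$ approximation of $C_\ell(L)$ for some value of $\varepsilon_{\ell-1}$ to be determined. The estimator is as follows. For $i = 0, 1, 2, \ldots$ define
\[ S_i = \{ j \in [m] : \eta (1+\varepsilon')^{i} \le g_j < \eta (1+\varepsilon')^{i+1} \} \]
where $\eta$ is randomly chosen between 0 and 1 and $\varepsilon' = \varepsilon_{\ell-1}/4$. The algorithm of Indyk and Woodruff [7] returns an estimate $\tilde{s}_i$ for $|S_i|$ and our estimate for $C_\ell(L)$ is defined as
\[ \tilde{C}_\ell(L) := \sum_i \tilde{s}_i \binom{\eta (1+\varepsilon')^{i}}{\ell}. \]
The space used by the algorithm is $\tilde{O}(p^{-1} m^{1-2/\ell})$. We defer the details to Section 3.2.

We next define an event $E$ that corresponds to our collision estimates being sufficiently accurate and the sampled stream being "well-behaved." The next lemma establishes that $\Pr[E] \ge 1 - \delta$. We will defer the proof until Section 3.2.

Lemma 3 Define the event $E = E_1 \cap E_2 \cap \ldots \cap E_k$ where
\[ E_1 : \tilde{\phi}_1 \in (1 \pm \varepsilon_1) F_1(P), \qquad E_\ell : |\tilde{C}_\ell(L)/p^{\ell} - C_\ell(P)| \le \varepsilon_{\ell-1} F_\ell(P)/\ell! \ \text{ for } \ell \ge 2, \]
and where $\varepsilon_k = \varepsilon$, $\varepsilon_{\ell-1} = \varepsilon_\ell/(A_\ell + 1)$, and $A_\ell = \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}|$. Then $\Pr[E] \ge 1 - \delta$.

The next lemma establishes that, conditioned on the event $E$, the algorithm returns a $(1 \pm \varepsilon)$ approximation of $F_k(P)$ as required.

Lemma 4 Conditioned on $E$, we have $\tilde{\phi}_\ell \in (1 \pm \varepsilon_\ell) F_\ell(P)$ for all $\ell \in [k]$.

Proof The proof is by induction on $\ell$. Since we are conditioning on event $E$ and thus event $E_1$, we have that $\tilde{\phi}_1$ is a $(1 \pm \varepsilon_1)$ approximation of $F_1(P)$. The induction hypothesis ensures that $\tilde{\phi}_i$, $1 \le i \le \ell-1$, is a $(1 \pm \varepsilon_i)$ approximation of $F_i(P)$. Therefore,
\[ |\tilde{\phi}_\ell - F_\ell(P)| = \Big| \frac{\tilde{C}_\ell(L)\, \ell!}{p^{\ell}} + \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} \tilde{\phi}_i - F_\ell(P) \Big| \le \Big| \ell!\, C_\ell(P) + \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} F_i(P) - F_\ell(P) \Big| + \varepsilon_{\ell-1} F_\ell(P) + \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}| \varepsilon_i F_i(P) = \varepsilon_{\ell-1} F_\ell(P) + \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}| \varepsilon_i F_i(P) \]
where the first inequality follows since we are conditioning on event $E_\ell$, which ensures that
\[ \Big| \frac{\tilde{C}_\ell(L)\, \ell!}{p^{\ell}} - \ell!\, C_\ell(P) \Big| \le \varepsilon_{\ell-1} F_\ell(P), \]
and the induction hypothesis ensures that
\[ \Big| \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} \tilde{\phi}_i - \sum_{i=1}^{\ell-1} \beta^{\ell}_{i} F_i(P) \Big| \le \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}| \varepsilon_i F_i(P). \]
The second equality follows due to Equation 1. Note that $i \le j$ implies $\varepsilon_i \le \varepsilon_j$ and $F_i(P) \le F_j(P)$. Hence, by the definition of $\varepsilon_{\ell-1}$,
\[ \varepsilon_{\ell-1} F_\ell(P) + \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}| \varepsilon_i F_i(P) \le \varepsilon_{\ell-1} F_\ell(P) \Big( 1 + \sum_{i=1}^{\ell-1} |\beta^{\ell}_{i}| \Big) = \varepsilon_\ell F_\ell(P). \]
Therefore $\tilde{\phi}_\ell \in (1 \pm \varepsilon_\ell) F_\ell(P)$ as required.

3.2 Proof of Lemma 3

Our goal is to show that $\Pr[E_1 \cap E_2 \cap \ldots \cap E_k] \ge 1 - \delta$. To do this it will suffice to show that for each $\ell \in [k]$, $\Pr[E_\ell] \ge 1 - \delta/k$ and appeal to the union bound.

We first observe that, by Chernoff bounds, the event $E_1$ happens with probability at least $1 - \delta/k$. Let $X_i$ denote the 0-1 random variable whose value is 1 if the $i$th item of the original stream appears in the sampled stream. Note that $E[X_i] = p$, $1 \le i \le n$, and $F_1(L) = \sum_i X_i$. Since $\tilde{\phi}_1 = F_1(L)/p$, we have $\tilde{\phi}_1 = \sum_i X_i / p$. Recall that $n = F_1(P)$. Then
\[ \Pr[\neg E_1] = \Pr\big[ |\tilde{\phi}_1 - F_1(P)| \ge F_1(P) \varepsilon_1 \big] = \Pr\Big[ \Big| \frac{\sum_i X_i}{p} - F_1(P) \Big| \ge F_1(P) \varepsilon_1 \Big] = \Pr\Big[ \Big| \sum_i X_i - F_1(P)\, p \Big| \ge p\, \varepsilon_1 F_1(P) \Big] \le e^{-\varepsilon_1^2 F_1(P) p / 2} \ \text{(by the Chernoff bound)} \ \le \delta/k. \]
The last inequality follows because our condition on $p$ implies $p > \mathrm{poly}(1/\varepsilon) \log(1/\delta) / F_1(P)$.

To analyze $\Pr[E_\ell]$ for $2 \le \ell \le k$ we consider the events:
\[ E^1_\ell : |C_\ell(L)/p^{\ell} - C_\ell(P)| \le \frac{\varepsilon_{\ell-1} F_\ell(P)}{2\, \ell!}, \qquad E^2_\ell : |\tilde{C}_\ell(L)/p^{\ell} - C_\ell(L)/p^{\ell}| \le \frac{\varepsilon_{\ell-1} F_\ell(P)}{2\, \ell!}. \]
By the triangle inequality it is easy to see that $\Pr[E^1_\ell \cap E^2_\ell] \le \Pr[E_\ell]$ and hence it suffices to show that $\Pr[E^1_\ell] \ge 1 - \delta/2k$ and $\Pr[E^2_\ell] \ge 1 - \delta/2k$. The first part follows easily from the variance bound in Lemma 2.

Lemma 5 $\Pr[E^1_\ell] \ge 1 - \frac{\delta}{4k}$.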
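Proof There are two cases depending on the value of $E[C_\ell(L)]$.

Case I: First assume $E[C_\ell(L)] \le \frac{\delta \varepsilon_{\ell-1} p^{\ell} F_\ell}{8k\, \ell!}$. Therefore, by Lemma 2, we also know that
\[ C_\ell(P) \le \frac{\delta \varepsilon_{\ell-1} F_\ell}{8k\, \ell!}. \qquad (2) \]
By Markov's bound,
\[ \Pr\Big[ C_\ell(L) \le \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell}{2\, \ell!} \Big] \ge 1 - \frac{\delta}{4k}. \qquad (3) \]
Eq. 2 and Eq. 3 together imply that with probability at least $1 - \frac{\delta}{4k}$,
\[ |C_\ell(L)/p^{\ell} - C_\ell(P)| \le \max\big( C_\ell(L)/p^{\ell},\, C_\ell(P) \big) \le \frac{\varepsilon_{\ell-1} F_\ell}{2\, \ell!}. \]

Case II: Next assume $E[C_\ell(L)] > \frac{\delta \varepsilon_{\ell-1} p^{\ell} F_\ell}{8k\, \ell!}$. By Chebyshev's bound, and using Lemma 2, we get:
\[ \Pr\Big[ |C_\ell(L) - E[C_\ell(L)]| \ge \frac{\varepsilon_{\ell-1} E[C_\ell(L)]}{2} \Big] \le \frac{4 V[C_\ell(L)]}{\varepsilon_{\ell-1}^2 E[C_\ell(L)]^2} \le \frac{D k^2 (\ell!)^2}{\delta^2 \varepsilon_{\ell-1}^4\, p F_\ell^{1/\ell}} \le \frac{D k^2 (\ell!)^2 F_0^{1-1/\ell}}{\delta^2 \varepsilon_{\ell-1}^4\, p F_1} \le \frac{D k^2 (\ell!)^2}{\delta^2 \varepsilon_{\ell-1}^4\, p \min(F_0^{1/\ell}, F_1^{1/\ell})} = \frac{D k^2 (\ell!)^2 H^4}{\delta^2 \varepsilon^4\, p \min(F_0^{1/\ell}, F_1^{1/\ell})} \le \frac{\delta}{4k} \]
where $D$ and $H$ are sufficiently large constants. The third inequality follows because $F_\ell^{1/\ell} \ge F_1/F_0^{1-1/\ell}$. The equality follows because $\varepsilon = H \varepsilon_{\ell-1}$. The last inequality follows because our assumption on $p$ implies that $p \ge \mathrm{poly}(1/\varepsilon, 1/\delta) \min(F_0, F_1)^{-1/k}$.

Since $E[C_\ell(L)] = p^{\ell} C_\ell(P)$ and $C_\ell(P) \le F_\ell(P)/\ell!$, we have that
\[ \Pr\Big[ |C_\ell(L)/p^{\ell} - C_\ell(P)| \le \frac{\varepsilon_{\ell-1} F_\ell(P)}{2\, \ell!} \Big] \ge 1 - \frac{\delta}{4k} \]
as required.

We will now show that $E^2_\ell$ happens with high probability by analyzing the algorithm that computes $\tilde{C}_\ell(L)$. We need the following result due to Indyk and Woodruff [7]. Recall that $\varepsilon' = \varepsilon_{\ell-1}/4$.

Theorem 2 (Indyk and Woodruff [7]) Let $G$ be the set of indices $i$ for which
\[ |S_i| (1+\varepsilon')^{2i} \ge \frac{\gamma F_2(L)}{\mathrm{poly}(\varepsilon'^{-1} \log n)}, \qquad (4) \]
then
\[ \Pr\big[ \forall i \in G,\ \tilde{s}_i \in (1 \pm \varepsilon') |S_i| \big] \ge 1 - \frac{\delta}{8k}. \]
For every $i$ (whether it is in $G$ or not) $\tilde{s}_i \le 3 |S_i|$. Moreover, the algorithm runs in space $\tilde{O}(1/\gamma)$.

We say that a set $S_i$ contributes if
\[ (1+\varepsilon')^{i\ell} |S_i| > \frac{C_\ell(L)}{B} \]
where $B = \mathrm{poly}(\varepsilon'^{-1} \log n)$. Given $i$, the event that $S_i$ contributes holds with a certain (conceivably 0) probability. We first show that if $S_i$ contributes, then $S_i$ is a good set with high probability. More precisely, we show that for every $S_i$ that contributes, Eq. 4 holds with high probability with $\gamma = p\, m^{-1+2/\ell}$.

Lemma 6 Suppose that $C_\ell(L) > \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell(P)}{4\, \ell!}$, and also suppose that the event "$S_i$ contributes" happened. Then
\[ \Pr\Big[ |S_i| (1+\varepsilon')^{2i} \ge \frac{\delta\, p F_2(L)}{m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)} \Big] \ge 1 - \frac{\delta}{8k}. \]

Proof Consider a set $S_i$ that contributes. Note that the probability that $\eta < 1/\mathrm{poly}(\delta^{-1} \varepsilon'^{-1} \log n)$ is at most $1/\mathrm{poly}(\delta^{-1} \varepsilon'^{-1} \log n)$. Without loss of generality we can take this probability to be less than $\delta/16k$. By our assumption on $C_\ell(L)$ and the fact that $S_i$ contributes,
\[ |S_i| (1+\varepsilon')^{i\ell} \ge \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell(P)}{B\, \ell!} \]
holds with probability at least $1 - \delta/8k$. Thus
\[ |S_i| (1+\varepsilon')^{2i} \ge \frac{\varepsilon_{\ell-1}^{2/\ell}\, p^{2} F_\ell^{2/\ell}(P)}{(B\, \ell!)^{2/\ell}} \ge \frac{p^{2} F_2(P)}{m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)} \]
where the second inequality is an application of Hölder's inequality. Note that $E[F_2(L)] = p^2 F_2(P) + p(1-p) F_1(P) \le 2p F_2(P)$. Thus, by an application of the Markov bound,
\[ \Pr\Big[ F_2(L) \le \frac{16 k\, p F_2(P)}{\delta} \Big] \ge 1 - \frac{\delta}{16k}. \qquad (5) \]
The lemma follows as the following inequalities hold with probability at least $1 - \delta/8k$:
\[ |S_i| (1+\varepsilon')^{2i} \ge \frac{p^{2} F_2(P)}{m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)} = \frac{\delta\, p \cdot 16 k\, p F_2(P)}{16 k\, m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)\, \delta} \ge \frac{\delta\, p F_2(L)}{m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)} \quad \text{(by (5))}. \]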
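Now we are ready to prove that the event $E^2_\ell$ holds with high probability.

Lemma 7 $\Pr[E^2_\ell] \ge 1 - \frac{\delta}{2k}$.

Proof There are two cases depending on the size of $C_\ell(L)$.

Case I: Assume $C_\ell(L) \le \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell(P)}{4\, \ell!}$. By Theorem 2, it follows that $\tilde{C}_\ell(L) \le 3 C_\ell(L)$. Thus
\[ |\tilde{C}_\ell(L) - C_\ell(L)| \le 2 C_\ell(L) \le \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell(P)}{2\, \ell!}. \]

Case II: Assume $C_\ell(L) > \frac{\varepsilon_{\ell-1} p^{\ell} F_\ell}{4\, \ell!}$. By Lemma 6, for every $S_i$ that contributes,
\[ \Pr\Big[ |S_i| (1+\varepsilon')^{2i} \ge \frac{\delta\, p F_2(L)}{m^{1-2/\ell}\, \mathrm{poly}(\varepsilon'^{-1} \log n)} \Big] \ge 1 - \frac{\delta}{8k}. \]
Now by Theorem 2, for each $S_i$ that contributes, $\tilde{s}_i \in (1 \pm \varepsilon') |S_i|$ with probability at least $1 - \frac{\delta}{8k}$. Therefore,
\[ \Pr\big[ |\tilde{C}_\ell(L) - C_\ell(L)| \le \varepsilon' C_\ell(L) \big] \ge 1 - \frac{\delta}{4k}. \]
If $E^1_\ell$ is true, then:
\[ C_\ell(L) \in C_\ell(P)\, p^{\ell} \pm \frac{\varepsilon_{\ell-1} F_\ell(P)\, p^{\ell}}{2\, \ell!}. \]
Since $E^1_\ell$ holds with probability at least $1 - \frac{\delta}{4k}$, the following inequalities hold with probability at least $1 - \frac{\delta}{2k}$:
\[ |\tilde{C}_\ell(L) - C_\ell(L)| \le \varepsilon' C_\ell(L) \le \varepsilon' C_\ell(P)\, p^{\ell} + \frac{\varepsilon_{\ell-1} \varepsilon' F_\ell(P)\, p^{\ell}}{2\, \ell!} \le \frac{\varepsilon' F_\ell(P)\, p^{\ell}}{\ell!} + \frac{\varepsilon_{\ell-1} \varepsilon' F_\ell(P)\, p^{\ell}}{2\, \ell!} = \frac{F_\ell(P)\, p^{\ell}}{\ell!} \Big( \frac{\varepsilon_{\ell-1}}{4} + \frac{\varepsilon_{\ell-1}^2}{8} \Big) \le \frac{F_\ell(P)\, p^{\ell}\, \varepsilon_{\ell-1}}{2\, \ell!}. \]

4 Distinct Elements

There are strong lower bounds for the accuracy of estimating the number of distinct values through random sampling. The following theorem is from Charikar et al. [7], which we have restated slightly to fit our notation (the original theorem is about database tables). Let $F_0$ be the number of distinct elements in a data set $T$ of total size $n$. Note that $T$ may be a stored data set, and need not be processed in a one-pass streaming manner.

Theorem 3 (Charikar et al. [7]) Consider any (randomized) estimator $\hat{F}_0$ for the number of distinct values $F_0$ of $T$, that examines at most $r$ out of the $n$ elements in $T$. For any $\gamma > e^{-r}$, there exists a choice of the input $T$ such that with probability at least $\gamma$, the multiplicative error is at least $\sqrt{\frac{n-r}{2r} \ln \gamma^{-1}}$.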
The above theorem implies that if we observe $o(n)$ elements of $P$, then it is not possible to get even an estimate with a constant multiplicative error. This lower bound for the non-streaming model leads to the following lower bound for sampled streams.

Theorem 4 ($F_0$ Lower Bound) For sampling probability $p \in (0, 1/12]$ and any algorithm that estimates $F_0$ by observing $L$, there is an input stream such that the algorithm will have a multiplicative error of $\Omega(1/\sqrt{p})$ with probability at least $(1 - e^{-np})/2$.

Proof Let $E_1$ denote the event $|L| \le 6np$. Let $\beta$ denote the multiplicative error of any algorithm (perhaps non-streaming) that estimates $F_0(P)$ by observing $L$. Let $\alpha = \sqrt{\ln 2/(12p)}$. Let $E_2$ denote the event $\beta \ge \alpha$.

Note that $|L|$ is a binomial random variable. The expected size of the sampled stream is $E[|L|] = np$. By using a Chernoff bound:
\[ \Pr[E_1] = 1 - \Pr\big[ |L| > 6 E[|L|] \big] \ge 1 - 2^{-6 E[|L|]} > 1 - e^{-np}. \]
If $E_1$ is true, then the number of elements in the sampled stream is no more than $6np$. Substituting $r = 6np$ and $\gamma = 1/2$ in Theorem 3, we get:
\[ \Pr[E_2 \mid E_1] \ge \Pr\Big[ \beta > \sqrt{\frac{n - 6np}{12np} \ln 2} \ \Big|\ E_1 \Big] \ge \frac{1}{2}. \]
Simplifying, and using $p \le 1/12$, we get:
\[ \Pr[E_2] \ge \Pr[E_1 \wedge E_2] = \Pr[E_1] \cdot \Pr[E_2 \mid E_1] \ge \frac{1}{2} \big( 1 - e^{-np} \big). \]

We now describe a simple streaming algorithm for estimating $F_0(P)$ by observing $L(P, p)$, which has an error of $O(1/\sqrt{p})$ with high probability.

Algorithm 2: $F_0(P)$
1 Let $X$ denote a $(1/2, \delta)$-estimate of $F_0(L)$, derived using any streaming algorithm for $F_0$ (such as [9]).
2 Return $X/\sqrt{p}$.

Lemma 8 ($F_0$ Upper Bound) Algorithm 2 returns an estimate $Y$ for $F_0(P)$ such that the multiplicative error of $Y$ is no more than $4/\sqrt{p}$ with probability at least $1 - (\delta + e^{-p F_0(P)/8})$.

Proof Let $D = F_0(P)$, and $D_L = F_0(L)$. Let $E_1$ denote the event $D_L \ge pD/2$, $E_2$ denote $X \ge D_L/2$, and $E_3$ denote the event $X \le 3 D_L/2$. Let $E = \cap_{i=1}^{3} E_i$. Without loss of generality, let $1, 2, \ldots, D$ denote the distinct items that occurred in stream $P$. Define $X_i = 1$ if at least one copy of item $i$ appeared in $L$, and 0 otherwise. The different $X_i$'s are all independent. Thus $D_L = \sum_{i=1}^{D} X_i$ is the sum of independent Bernoulli random variables and
\[ E[D_L] = \sum_{i=1}^{D} \Pr[X_i = 1]. \]
Since each copy of item $i$ is included in $L$ with probability $p$, we have $\Pr[X_i = 1] \ge p$. Thus, $E[D_L] \ge pD$. Applying a Chernoff bound,
\[ \Pr[\neg E_1] = \Pr\Big[ D_L < \frac{pD}{2} \Big] \le \Pr\Big[ D_L < \frac{E[D_L]}{2} \Big] \le e^{-E[D_L]/8} \le e^{-pD/8}. \qquad (6) \]
Suppose $E$ is true. Then we have the following:
\[ \frac{pD}{4} \le \frac{D_L}{2} \le X \le \frac{3 D_L}{2} \le \frac{3D}{2}. \]
The last inequality is because $D_L$ is at most $D$. Therefore $X/\sqrt{p}$ has a multiplicative error of no more than $4/\sqrt{p}$. We now bound the probability that $E$ is false:
\[ \Pr[\neg E] \le \sum_{i=1}^{3} \Pr[\neg E_i] \le \delta + e^{-pD/8} \]
where we have used the union bound, Eq. 6, and the fact that $X$ is a $(1/2, \delta)$-estimator of $D_L$.

5 Entropy

In this section we consider approximating the entropy of a stream.

Definition 3 The entropy of a frequency vector $f = (f_1, f_2, \ldots, f_m)$ is defined as $H(f) = \sum_{i=1}^{m} \frac{f_i}{n} \lg \frac{n}{f_i}$ where $n = \sum_{i=1}^{m} f_i$.

Unfortunately, in contrast to $F_0$ and $F_k$, it is not possible to multiplicatively approximate $H(f)$ even if $p$ is constant.

Lemma 9 No multiplicative error approximation is possible with probability 9/10 even with $p > 1/2$. Furthermore,
1. There exists $f$ such that $H(f) = \Theta(\log n/(pn))$ but $H(g) = 0$ with probability at least 9/10.
2. There exists $f$ such that $|H(f) - H(g)| \ge |\lg 2p|$ with probability at least 9/10.

Proof First consider the following two scenarios for the contents of the stream. In Scenario 1, $f_1 = n$ and in Scenario 2, $f_1 = n - k$ and $f_2 = f_3 = \ldots = f_{k+1} = 1$. In the first case the entropy $H(f) = 0$ whereas in the second,
\[ H(f) = \frac{n-k}{n} \lg \frac{n}{n-k} + \frac{k}{n} \lg n = \frac{n-k}{n}\, \Theta\Big( \frac{k}{n-k} \Big) + \frac{k}{n} \lg n = \Theta\Big( \frac{(1 + \lg n)\, k}{n} \Big). \]
Distinguishing these streams requires that at least one value other than 1 is present in the subsampled stream. No such value is present with probability $(1-p)^k \ge 1 - pk$, and hence with $k = p^{-1}/10$ the two scenarios cannot be distinguished with probability at least 9/10.

For the second part of the lemma consider the stream with $f_1 = f_2 = \ldots = f_m = 1$ and hence $H(f) = \lg m$. But $H(g) = \lg |L|$ where $|L|$ is the number of elements in the sampled stream. By an application of the Chernoff bound $|L|$ is at most $2pm$ with probability at least 9/10 and the result follows.

Instead we will show that it is possible to approximate $H(f)$ up to a constant factor with an additional additive error term that tends to zero if $p = \omega(n^{-1/3})$. It will also be convenient to consider the following quantity:
\[ H_{pn}(g) = \sum_{i=1}^{m} \frac{g_i}{pn} \lg \frac{pn}{g_i}. \]
The following proposition establishes that $H_{pn}(g)$ is a very good approximation to $H(g)$.

Proposition 1 With probability 199/200, $|H_{pn}(g) - H(g)| = O(\log m/\sqrt{pn})$.

Proof By an application of the Chernoff bound, with probability 199/200,
\[ \Big| pn - \sum_{i=1}^{m} g_i \Big| \le c \sqrt{pn} \]
for some constant $c > 0$. Hence, if $n' = \sum_{i=1}^{m} g_i$ and $\gamma = n'/(pn)$ it follows that $\gamma = 1 \pm O(1/\sqrt{pn})$. Then
\[ H_{pn}(g) = \sum_{i=1}^{m} \frac{g_i}{pn} \lg \frac{pn}{g_i} = \sum_{i=1}^{m} \frac{\gamma g_i}{n'} \lg \frac{n'}{\gamma g_i} = H(g) + O(1/\sqrt{pn}) + O(H(g)/\sqrt{pn}). \]

The next lemma establishes that the entropy of $g$ is within a constant factor of the entropy of $f$ plus a small additive term.

Lemma 10 With probability 99/100, if $p = \omega(n^{-1/3})$,
1. $H_{pn}(g) \le O(H(f))$.
2. $H_{pn}(g) \ge H(f)/2 - O\big( \frac{1}{p^{1/2} n^{1/6}} \big)$.

Proof For the first part of the lemma, first note that
\[ E[H_{pn}(g)] = \sum_{i=1}^{m} E\Big[ \frac{g_i}{pn} \lg \frac{pn}{g_i} \Big] \le \sum_{i=1}^{m} \frac{E[g_i]}{pn} \lg \frac{pn}{E[g_i]} = \sum_{i=1}^{m} \frac{p f_i}{pn} \lg \frac{pn}{p f_i} = H(f) \]
where the inequality follows from Jensen's inequality since the function $x \lg x^{-1}$ is concave. Hence, by Markov's inequality,
\[ \Pr[H_{pn}(g) \le 100 H(f)] \ge 99/100. \]
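To prove the second part of the lemma, define $f^* = c p^{-1} \varepsilon^{-2} \log n$ for some sufficiently large constant $c$ and $\varepsilon \in (0, 1)$. We then partition $[m]$ into $A = \{i : f_i < f^*\}$ and $B = \{i : f_i \ge f^*\}$ and consider $H(f) = H^A(f) + H^B(f)$ where
\[ H^A(f) = \sum_{i \in A} \frac{f_i}{n} \lg \frac{n}{f_i} \quad \text{and} \quad H^B(f) = \sum_{i \in B} \frac{f_i}{n} \lg \frac{n}{f_i}. \]
By applications of the Chernoff and union bounds, with probability at least 299/300,
\[ |g_i - p f_i| \le \begin{cases} \varepsilon p f^* & \text{if } i \in A \\ \varepsilon p f_i & \text{if } i \in B. \end{cases} \]
Hence,
\[ H^B_{pn}(g) = \sum_{i \in B} \frac{g_i}{pn} \lg \frac{pn}{g_i} = \sum_{i \in B} \frac{(1 \pm \varepsilon) f_i}{n} \lg \frac{n}{(1 \pm \varepsilon) f_i} = (1 \pm \varepsilon) H^B(f) + O(\varepsilon). \]
For $H^A_{pn}(g)$ we have two cases depending on whether $\sum_{i \in A} f_i$ is smaller or larger than $\theta := c p^{-1} \varepsilon^{-2}$. If $\sum_{i \in A} f_i \le \theta$ then
\[ H^A(f) = \sum_{i \in A} \frac{f_i}{n} \lg \frac{n}{f_i} \le \frac{\theta \lg n}{n}. \]
On the other hand if $\sum_{i \in A} f_i \ge \theta$ then by an application of the Chernoff bound,
\[ \Big| \sum_{i \in A} g_i - p \sum_{i \in A} f_i \Big| \le \varepsilon p \sum_{i \in A} f_i \]
and hence
\[ H^A_{pn}(g) = \sum_{i \in A} \frac{g_i}{pn} \lg \frac{pn}{g_i} \ge \sum_{i \in A} \frac{g_i}{pn} \lg \frac{n}{(1+\varepsilon) f^*} \ge (1-\varepsilon) \lg \frac{n}{(1+\varepsilon) f^*} \sum_{i \in A} \frac{f_i}{n} \ge \frac{(1-\varepsilon) \lg \frac{n}{(1+\varepsilon) f^*}}{\lg n}\, H^A(f). \]
Combining the above cases we deduce that
\[ H_{pn}(g) \ge \Big( 1 - \varepsilon - \frac{\lg(p^{-1} \varepsilon^{-2} \log n)}{\lg n} \Big) H(f) - O(\varepsilon) - O\Big( \frac{\varepsilon^{-2} \lg n}{pn} \Big). \]
Setting $\varepsilon = p^{-1/2} n^{-1/6}$ we get
\[ H_{pn}(g) \ge \Big( 1 - p^{-1/2} n^{-1/6} - \frac{\lg(n^{1/3} \log n)}{\lg n} \Big) H(f) - O(p^{-1/2} n^{-1/6}) - O\Big( \frac{\lg n}{n^{2/3}} \Big) \ge H(f)/2 - O(p^{-1/2} n^{-1/6}). \]

Therefore, by using an existing entropy estimation algorithm (e.g., [5]) to multiplicatively estimate $H(g)$ we have a constant factor approximation to $H(f)$ if $H(f) = \omega(p^{-1/2} n^{-1/6})$. The next theorem follows directly from Proposition 1 and Lemma 10.

Theorem 5 It is possible to approximate $H(f)$ up to a constant factor in $O(\mathrm{polylog}(m, n))$ space if $H(f) = \omega(p^{-1/2} n^{-1/6})$.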
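The two entropy quantities used in this section can be written down directly. The following is a minimal sketch with exact frequency vectors, where `entropy` and `entropy_pn` are illustrative names; it shows only the definitions of $H$ and $H_{pn}$, not a small-space entropy estimator.

```python
import random
from math import log2


def entropy(freqs):
    """Empirical entropy H of a frequency vector, in bits."""
    n = sum(freqs)
    return sum(f / n * log2(n / f) for f in freqs if f > 0)


def entropy_pn(sample_freqs, p, n):
    """H_{pn}(g): the same sum, normalised by the expected sample length p*n."""
    pn = p * n
    return sum(g / pn * log2(pn / g) for g in sample_freqs if g > 0)


# When the sample has exactly p*n elements the two quantities coincide.
print(entropy([2, 2]), entropy_pn([2, 2], 0.5, 8))

# Sampled frequencies g_i ~ Bin(f_i, p) for a skewed frequency vector f.
rng = random.Random(0)
f = [500, 250, 125, 125]
p = 0.2
g = [sum(rng.random() < p for _ in range(fi)) for fi in f]
print(entropy(f), entropy(g), entropy_pn(g, p, sum(f)))
```

In the second example the three printed values should be close to one another, since every frequency is large relative to $1/p$; the gap widens for streams dominated by rare items, as in Lemma 9.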
6 Heavy Hitters

There are two common notions for finding heavy hitters in a stream: the $F_1$-heavy hitters, and the $F_2$-heavy hitters.

Definition 4 In the $F_k$-heavy hitters problem, $k \in \{1, 2\}$, we are given a stream of updates to an underlying frequency vector $f$ and parameters $\alpha$, $\varepsilon$, and $\delta$. The algorithm is required to output a set $S$ of $O(1/\alpha)$ items such that: (1) every item $i$ for which $f_i \ge \alpha (F_k)^{1/k}$ is included in $S$, and (2) any item $i$ for which $f_i < (1-\varepsilon) \alpha (F_k)^{1/k}$ is not included in $S$. The algorithm is additionally required to output approximations $f'_i$ with
\[ \forall i \in S,\ f'_i \in [(1-\varepsilon) f_i,\ (1+\varepsilon) f_i]. \]
The overall success probability should be at least $1 - \delta$.

The intuition behind the algorithm for heavy hitters is as follows. Suppose an item $i$ was an $F_k$ heavy hitter in the original stream $P$, i.e. $f_i \ge \alpha (F_k)^{1/k}$. Then, by a Chernoff bound, it can be argued that with high probability, $g_i$, the frequency of $i$ in the sampled stream, is close to $p f_i$. In such a case, it can be shown that $i$ is also a heavy hitter in the sampled stream and will be detected by an algorithm that identifies heavy hitters on the sampled stream (with the right choice of parameters). Similarly, it can be argued that an item $i$ such that $f_i < (1-\varepsilon) \alpha (F_k)^{1/k}$ cannot reach the required frequency threshold on the sampled stream, and will not be returned by the algorithm. We present the analysis below assuming that the heavy hitter algorithm on the sampled stream is the CountMin sketch. Other algorithms for heavy hitters can be used too, such as the Misra-Gries algorithm [33]; note that the Misra-Gries algorithm works on insert-only streams, while the CountMin sketch works on general update streams, with additions as well as deletions.

Theorem 6 Suppose that $F_1(P) \ge C p^{-1} \alpha^{-1} \varepsilon^{-2} \log(n/\delta)$ for a sufficiently large constant $C > 0$. There is a one pass streaming algorithm which observes the sampled stream $L$ and computes the $F_1$ heavy hitters of the original stream $P$ with probability at least $1 - \delta$. This algorithm uses $O(\varepsilon^{-1} \log^2(n/(\alpha\delta)))$ bits of space.
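The intuition above (find heavy items in the sample against a slightly lowered threshold, then rescale their counts by $1/p$) can be sketched as follows. This sketch uses an exact dictionary of counts where the paper uses a CountMin sketch, so it illustrates only the thresholding and rescaling, not the space bound; `heavy_hitters_from_sample` and its parameter `alpha_prime` are hypothetical names.

```python
from collections import Counter


def heavy_hitters_from_sample(sample, p, alpha_prime):
    """Return {item: estimated original frequency} for items heavy in the sample.

    An item is reported when its sampled count g_i is at least
    alpha_prime * F_1(L); its original frequency is estimated as g_i / p.
    """
    counts = Counter(sample)  # exact g_i; a sketch would approximate these
    threshold = alpha_prime * len(sample)  # alpha' * F_1(L)
    return {i: g / p for i, g in counts.items() if g >= threshold}


# With p = 1 the "sample" is the original stream, so the output is exact.
P = [1] * 6 + [2] * 3 + [3]
print(heavy_hitters_from_sample(P, 1.0, 0.25))
```

Choosing `alpha_prime` somewhat below the target $\alpha$ leaves room for the sampling error in both $g_i$ and $F_1(L)$, which is the role the adjusted parameters play in the analysis.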
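Proof The algorithm runs the CountMin($\alpha'$, $\epsilon'$, $\delta'$) algorithm of [15] for finding the $F_1$-heavy hitters on the sampled stream, for $\alpha' = (1 - 2\epsilon/5) \cdot \alpha$, $\epsilon' = \epsilon/2$, and $\delta' = \delta/4$. We return the set $S$ of items $i$ found by CountMin, and we scale each of the $\tilde{f}_i$ by $1/p$.

Recall that $g_i$ is the frequency of item $i$ in the sampled stream $L$. Then for sufficiently large $C > 0$ given in the theorem statement, for any $i$, by a Chernoff bound,

\[ \Pr\Bigl[ g_i > \max\Bigl\{ p \Bigl(1 + \frac{\epsilon}{5}\Bigr) f_i, \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\} \Bigr] \le \frac{\delta}{4n}. \]

By a union bound, with probability at least $1 - \delta/4$, for all $i \in [n]$,

\[ g_i \le \max\Bigl\{ p \Bigl(1 + \frac{\epsilon}{5}\Bigr) f_i, \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\}. \quad (7) \]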
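We also need the property that if $f_i \ge (1-\epsilon) \alpha F_1(P)$, then $g_i \ge p(1 - \epsilon/5) f_i$. For such $i$, by the premise of the theorem we have $E[g_i] \ge p(1-\epsilon) \alpha F_1(P) \ge C(1-\epsilon) \epsilon^{-2} \log(n/\delta)$. Hence, for sufficiently large $C$, applying a Chernoff and a union bound is enough to conclude that with probability at least $1 - \delta/4$, for all such $i$, $g_i \ge p(1 - \epsilon/5) f_i$.

We set the parameter $\delta'$ of CountMin to equal $\delta/4$, and so CountMin succeeds with probability at least $1 - \delta/4$.

Also, $E[F_1(L)] = p F_1(P) \ge C \alpha^{-1} \epsilon^{-2} \log(n/\delta)$, the inequality following from the premise of the theorem. By a Chernoff bound,

\[ \Pr\Bigl[ \Bigl(1 - \frac{\epsilon}{5}\Bigr) p F_1(P) \le F_1(L) \le \Bigl(1 + \frac{\epsilon}{5}\Bigr) p F_1(P) \Bigr] \ge 1 - \frac{\delta}{4}. \]

By a union bound, all events discussed thus far jointly occur with probability at least $1 - \delta$, and we condition on their joint occurrence in the remainder of the proof.

Lemma 11 If $f_i \ge \alpha F_1(P)$, then $g_i \ge (1 - 2\epsilon/5) \cdot \alpha F_1(L)$. If $f_i < (1-\epsilon) \alpha F_1(P)$, then $g_i \le (1 - \epsilon/2) \alpha F_1(L)$.

Proof First consider any $i$ for which $f_i \ge \alpha F_1(P)$. Since $g_i \ge p(1 - \epsilon/5) f_i$ and also $F_1(L) \le p(1 + \epsilon/5) F_1(P)$, we have

\[ g_i \ge \frac{1 - \epsilon/5}{1 + \epsilon/5} \alpha F_1(L) \ge \Bigl(1 - \frac{2\epsilon}{5}\Bigr) \alpha F_1(L). \]

Next consider any $i$ for which $f_i < (1-\epsilon) \alpha F_1(P)$. Then

\begin{align*}
g_i &\le \max\Bigl\{ p \Bigl(1 + \frac{\epsilon}{5}\Bigr) (1-\epsilon) \alpha F_1(P), \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\} \\
&\le \max\Bigl\{ \Bigl(1 - \frac{3\epsilon}{5}\Bigr) \alpha F_1(L), \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\} \\
&\le \max\Bigl\{ \Bigl(1 - \frac{\epsilon}{2}\Bigr) \alpha F_1(L), \; \frac{\alpha}{2} E[F_1(L)] \Bigr\} \\
&\le \max\Bigl\{ \Bigl(1 - \frac{\epsilon}{2}\Bigr) \alpha F_1(L), \; \Bigl(1 + \frac{\epsilon}{5}\Bigr) \frac{\alpha}{2} F_1(L) \Bigr\} \\
&\le \Bigl(1 - \frac{\epsilon}{2}\Bigr) \alpha F_1(L).
\end{align*}

It follows that by setting $\alpha' = (1 - 2\epsilon/5) \cdot \alpha$ and $\epsilon' = \epsilon/2$, CountMin($\alpha'$, $\epsilon'$, $\delta'$) does not return any $i$ for which $f_i < (1-\epsilon) \alpha F_1(P)$, since for such $i$ we have $g_i \le (1 - \epsilon/2) \alpha F_1(L)$, and so $g_i < (1 - \epsilon/10) \alpha' F_1(L)$. On the other hand, for every $i$ for which $f_i \ge \alpha F_1(P)$, we have $i \in S$, since for such $i$ we have $g_i \ge \alpha' F_1(L)$.

It remains to show that for every $i \in S$, we have $\tilde{f}_i \in [(1-\epsilon) f_i, (1+\epsilon) f_i]$. By the previous paragraph, for such $i$ we have $f_i \ge (1-\epsilon) \alpha F_1(P)$. By the above conditioning, this means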
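that $g_i \ge p(1 - \epsilon/5) f_i$. We will also have $g_i \le p(1 + \epsilon/5) f_i$ if

\[ p \Bigl(1 + \frac{\epsilon}{5}\Bigr) f_i \ge \frac{C}{2\epsilon^2} \log \frac{n}{\delta}. \]

Since $f_i \ge (1-\epsilon) \alpha F_1(P)$, this in turn holds if

\[ F_1(P) \ge \frac{1}{2(1-\epsilon)(1+\epsilon/5)} \cdot C p^{-1} \alpha^{-1} \epsilon^{-2} \log \frac{n}{\delta}, \]

which holds by the theorem premise provided $\epsilon$ is less than a sufficiently small constant. This completes the proof.

Theorem 7 Suppose that $F_2^{1/2} \ge C p^{-3/2} \alpha^{-1} \epsilon^{-2} \log(n/\delta)$ and $p = \Omega(m^{-1/2})$. There is a one pass streaming algorithm which observes the sampled stream $L$ and computes $(\alpha, 1 - p^{1/2}(1-\epsilon))$ $F_2$-heavy hitters of the original stream with high probability.

Proof The algorithm runs the CountSketch($\alpha'$, $\epsilon'$, $\delta'$) algorithm [8] for finding the $F_2$-heavy hitters on the sampled stream, for appropriate $\alpha'$, $\epsilon'$, and $\delta'$ specified below. We return the set $S$ of items $i$ found by CountSketch. As before we can show that if $f_i \ge (1-\epsilon) \alpha F_2^{1/2}$, then with probability at least $1 - \delta/4$, $g_i \ge p(1 - \epsilon/5) f_i$.

Next we bound the variance of $F_2(L)$. Since each $g_i$ is drawn from a binomial distribution $\mathrm{Bin}(f_i, p)$ on $f_i$ items with probability $p$,

\[ E[F_2(L)] = \sum_i E[g_i^2] = \sum_i \bigl( p^2 f_i^2 + p(1-p) f_i \bigr) = p^2 F_2(P) + p(1-p) F_1(P). \]

Moreover,

\[ \mathrm{Var}[F_2(L)] = \sum_i \mathrm{Var}[g_i^2] = \sum_i \Bigl( E[g_i^4] - \bigl( p^2 f_i^2 + p(1-p) f_i \bigr)^2 \Bigr) \le \sum_i \bigl( E[g_i^4] - p^4 f_i^4 \bigr). \]

It is known that the 4-th moment of $\mathrm{Bin}(f_i, p)$ is

\[ f_i p \bigl( 1 - 7p + 7 f_i p + 12 p^2 - 18 f_i p^2 + 6 f_i^2 p^2 - 6 p^3 + 11 f_i p^3 - 6 f_i^2 p^3 + f_i^3 p^3 \bigr), \]

and subtracting $p^4 f_i^4$ from this, we obtain

\[ f_i p - 7 f_i p^2 + 7 f_i^2 p^2 + 12 f_i p^3 - 18 f_i^2 p^3 + 6 f_i^3 p^3 - 6 f_i p^4 + 11 f_i^2 p^4 - 6 f_i^3 p^4 + f_i^4 p^4 - f_i^4 p^4, \]

which is $O(f_i p + f_i^2 p^2 + f_i^3 p^3)$. Hence,

\[ \mathrm{Var}[F_2(L)] = O\bigl( p F_1(P) + p^2 F_2(P) + p^3 F_3(P) \bigr). \]

By Chebyshev's inequality, using $F_1 \le F_2$ and $F_3 \le F_2^{3/2}$,

\begin{align*}
\Pr\bigl[ |F_2(L) - E[F_2(L)]| \ge \epsilon p^2 F_2 \bigr]
&= O\Bigl( \frac{p F_1 + p^2 F_2 + p^3 F_3}{\epsilon^2 p^4 F_2^2} \Bigr) \\
&= O\Bigl( \frac{F_1}{\epsilon^2 p^3 F_2^2} + \frac{1}{\epsilon^2 p^2 F_2} + \frac{F_3}{\epsilon^2 p F_2^2} \Bigr) \\
&= O\Bigl( \frac{1}{\epsilon^2 p^3 F_2} + \frac{1}{\epsilon^2 p^2 F_2} + \frac{F_2^{3/2}}{\epsilon^2 p F_2^2} \Bigr) \\
&= O\Bigl( \frac{1}{\epsilon^2 p^3 F_2} + \frac{1}{\epsilon^2 p^2 F_2} + \frac{1}{\epsilon^2 p F_2^{1/2}} \Bigr).
\end{align*}

Thus with probability at least $1 - \delta/4$,

\[ \Bigl(1 - \frac{\epsilon}{5}\Bigr) p F_2^{1/2} \le F_2^{1/2}(L) \le p^{1/2} F_2^{1/2}. \]

By a union bound, all events discussed so far jointly occur with probability at least $1 - \delta$, and we condition on them occurring in the remainder of the analysis.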
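The two moment facts used in this variance bound can be checked mechanically. The sketch below, which is an illustration and not part of the paper, compares the quoted closed form for the fourth moment of $\mathrm{Bin}(f_i, p)$ against a direct sum over the probability mass function, using exact rational arithmetic, and evaluates the identity $E[F_2(L)] = p^2 F_2(P) + p(1-p) F_1(P)$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_raw_moment(n, p, k):
    """Exact E[X^k] for X ~ Bin(n, p), summed over the probability mass function."""
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j) * j**k for j in range(n + 1)
    )


def fourth_moment_closed_form(n, p):
    """Closed form for E[X^4], X ~ Bin(n, p), as quoted in the text."""
    return n * p * (
        1 - 7 * p + 7 * n * p + 12 * p**2 - 18 * n * p**2 + 6 * n**2 * p**2
        - 6 * p**3 + 11 * n * p**3 - 6 * n**2 * p**3 + n**3 * p**3
    )


def expected_f2_of_sample(freqs, p):
    """E[F_2(L)] = p^2 F_2(P) + p(1-p) F_1(P) under Bernoulli sampling at rate p."""
    return p**2 * sum(f * f for f in freqs) + p * (1 - p) * sum(freqs)
```

Passing `p` as a `Fraction` keeps every comparison exact, so agreement between `binomial_raw_moment(n, p, 4)` and `fourth_moment_closed_form(n, p)` is an equality check rather than a floating-point approximation.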
Suppose that $f_i \ge \alpha F_2^{1/2}$ in the original stream. Then

\[ g_i \ge p \Bigl(1 - \frac{\epsilon}{5}\Bigr) f_i \ge \alpha F_2^{1/2} p \Bigl(1 - \frac{\epsilon}{5}\Bigr) \ge \alpha p^{1/2} \Bigl(1 - \frac{\epsilon}{5}\Bigr) F_2^{1/2}(L). \]

Next consider any $i$ for which $f_i < (1-\epsilon) p^{1/2} \alpha F_2^{1/2}$. Then

\begin{align*}
g_i &\le \max\Bigl\{ p \Bigl(1 + \frac{\epsilon}{5}\Bigr) (1-\epsilon) p^{1/2} \alpha F_2^{1/2}(P), \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\} \\
&\le \max\Bigl\{ \Bigl(1 + \frac{\epsilon}{5}\Bigr) (1-\epsilon) p^{3/2} \alpha F_2^{1/2}(P), \; \frac{C}{2\epsilon^2} \log \frac{n}{\delta} \Bigr\} \\
&\le \Bigl(1 - \frac{4\epsilon}{5} - \frac{\epsilon^2}{5}\Bigr) p^{3/2} \alpha F_2^{1/2}(P) \\
&\le \Bigl(1 - \frac{3\epsilon}{5}\Bigr) p^{1/2} \alpha F_2^{1/2}(L) \\
&\le \Bigl(1 - \frac{\epsilon}{2}\Bigr) p^{1/2} \alpha F_2^{1/2}(L).
\end{align*}

It follows that by setting $\alpha' = (1 - 2\epsilon/5) \cdot \alpha p^{1/2}$, $\delta' = \delta/4$, and $\epsilon' = \epsilon/10$, CountSketch($\alpha'$, $\epsilon'$, $\delta'$) does not return any $i$ for which $f_i < (1-\epsilon) p^{1/2} \alpha F_2^{1/2}(P)$, since for such $i$ we have $g_i \le (1 - \epsilon/2) p^{1/2} \alpha F_2^{1/2}(L)$. On the other hand, for every $i$ for which $f_i \ge \alpha F_2^{1/2}$, we have $i \in S$, since for such $i$ we have $g_i \ge \alpha' F_2^{1/2}(L)$.

7 Conclusion

We presented small-space stream algorithms and lower bounds for estimating functions of interest when observing a random sample of the original stream. There are numerous directions for future work, and we mention some of them. As we have seen, our results imply time/space tradeoffs for several natural streaming problems. What other data stream problems have interesting time/space tradeoffs? Also, we have so far assumed that the sampling probability $p$ is fixed, and that the algorithm has no control over it. Suppose this was not the case, and the algorithm can change the sampling probability in an adaptive manner, depending on the current state of the stream. Is it possible to get algorithms that can observe fewer elements overall and get the same accuracy as our algorithms? For which precise models and problems is adaptivity useful? It is also interesting to obtain matching space lower bounds for the case of estimating frequency moments.

References

1. Alon, N., Matias, Y., Szegedy, M.: The Space Complexity of Approximating the Frequency Moments. Journal of Computer and System Sciences 58(1)
2. Babcock, B., Datar, M., Motwani, R.: Sampling from a moving window over streaming data. In: Proc. ACM-SIAM Symposium on Discrete Algorithms (SODA)
3. Bar-Yossef, Z.: The complexity of massive dataset computations. Ph.D. thesis, University of California at Berkeley (2002)
4. Bar-Yossef, Z.: Sampling lower bounds via information theory. In: Proc. 35th Annual ACM Symposium on Theory Of Computing (STOC)
5. Barakat, C., Iannaccone, G., Diot, C.: Ranking flows from sampled traffic. In: Proc. ACM Conference on Emerging Network Experiment and Technology (CoNEXT)
6. Bhattacharyya, S., Madeira, A., Muthukrishnan, S., Ye, T.: How to scalably and accurately skip past streams. In: Proc. 23rd International Conference on Data Engineering (ICDE) Workshops
7. Charikar, M., Chaudhuri, S., Motwani, R., Narasayya, V.R.: Towards estimation error guarantees for distinct values. In: Proc. 19th ACM Symposium on Principles of Database Systems (PODS)
8. Charikar, M., Chen, K., Farach-Colton, M.: Finding frequent items in data streams. Theoretical Computer Science 312(1)
9. Cisco Systems: Random Sampled NetFlow. feature/guide/fstatsa.htm
10. Cohen, E., Cormode, G., Duffield, N.G.: Structure-aware sampling: Flexible and accurate summarization. Proceedings of the VLDB Endowment 4(11)
11. Cohen, E., Duffield, N.G., Kaplan, H., Lund, C., Thorup, M.: Efficient stream sampling for variance-optimal estimation of subset sums. SIAM J. Comput. 40(5)
12. Cohen, E., Duffield, N.G., Kaplan, H., Lund, C., Thorup, M.: Algorithms and estimators for summarization of unaggregated data streams. Journal of Computer and System Sciences 80(7)
13. Cohen, E., Grossaug, N., Kaplan, H.: Processing top-k queries from samples. Computer Networks 52(14)
14. Cormode, G., Garofalakis, M.: Sketching probabilistic data streams. In: Proc. 26th ACM International Conference on Management of Data (SIGMOD)
15. Cormode, G., Muthukrishnan, S.: An improved data stream summary: the count-min sketch and its applications. Journal of Algorithms 55(1)
16. Cormode, G., Muthukrishnan, S., Yi, K., Zhang, Q.: Optimal sampling from distributed streams. In: Proc. ACM Symposium on Principles of Database Systems (PODS)
17. Duffield, N.G., Lund, C., Thorup, M.: Properties and prediction of flow statistics from sampled packet streams. In: Proc. Internet Measurement Workshop
18. Duffield, N.G., Lund, C., Thorup, M.: Estimating flow distributions from sampled flow statistics. IEEE/ACM Transactions on Networking 13(5)
19. Duffield, N.G., Lund, C., Thorup, M.: Priority sampling for estimation of arbitrary subset sums. Journal of the ACM
20. Efraimidis, P., Spirakis, P.G.: Weighted random sampling with a reservoir. Information Processing Letters 97(5)
21. Estan, C., Keys, K., Moore, D., Varghese, G.: Building a better netflow. In: Proc. ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM)
22. Estan, C., Varghese, G.: New directions in traffic measurement and accounting. In: Proc. ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM)
23. Gibbons, P.B., Matias, Y.: New sampling-based summary statistics for improving approximate query answers. In: Proc. ACM SIGMOD International Conference on Management of Data
24. Guha, S., Huang, Z.: Revisiting the direct sum theorem and space lower bounds in random order streams. In: Automata, Languages and Programming, 36th International Colloquium, ICALP (1)
25. Harvey, N.J.A., Nelson, J., Onak, K.: Sketching and streaming entropy via approximation theory. In: Proc. 49th IEEE Conference on Foundations Of Computer Science (FOCS)
26. Hohn, N., Veitch, D.: Inverting sampled traffic. IEEE/ACM Transactions on Networking 14(1)
27. Indyk, P., Woodruff, D.P.: Optimal approximations of the frequency moments of data streams. In: Proc. 37th Annual ACM Symposium on Theory of Computing (STOC)
28. Jayram, T.S., McGregor, A., Muthukrishnan, S., Vee, E.: Estimating statistical aggregates on probabilistic data streams. ACM Transactions on Database Systems 33
29. Kane, D.M., Nelson, J., Woodruff, D.P.: On the exact space complexity of sketching and streaming small norms. In: Proc. 21st ACM-SIAM Symposium on Discrete Algorithms (SODA)
30. Lahiri, B., Tirthapura, S.: Stream sampling. In: L. Liu, M.T. Özsu (eds.) Encyclopedia of Database Systems. Springer US (2009)
31. McGregor, A. (ed.): Open Problems in Data Streams and Related Topics. iitk.ac.in/users/sganguly/data-stream-probs.pdf
32. McGregor, A., Pavan, A., Tirthapura, S., Woodruff, D.: Space-efficient estimation of statistics over subsampled streams. In: Proc. 31st ACM Symposium on Principles of Database Systems (PODS)
33. Misra, J., Gries, D.: Finding repeated elements. Science of Computer Programming
34. Rusu, F., Dobra, A.: Sketching sampled data streams. In: Proc. 25th IEEE International Conference on Data Engineering (ICDE)
35. Szegedy, M.: The DLT priority sampling is essentially optimal. In: Proc. Annual ACM Symposium on Theory of Computing (STOC)
36. Tirthapura, S., Woodruff, D.P.: Optimal random sampling from distributed streams revisited. In: Proc. International Symposium on Distributed Computing (DISC)
37. Vitter, J.S.: Random sampling with a reservoir. ACM Transactions on Mathematical Software 11(1)
The Fudametal Capacity-Delay Tradeoff i Large Mobile Ad Hoc Networks Xiaoju Li ad Ness B. Shroff School of Electrical ad Computer Egieerig, Purdue Uiversity West Lafayette, IN 47907, U.S.A. {lix, shroff}@ec.purdue.edu
More informationPerformance Modelling of W-CDMA Networks Supporting Elastic and Adaptive Traffic
Performace Modeig of W-CDMA Networks Supportig Eastic ad Adaptive Traffic Georgios A. Kaos, Vassiios G. Vassiakis, Ioais D. Moschoios ad Michae D. Logothetis* WCL, Dept. of Eectrica & Computer Egieerig,
More informationChapter 7: Confidence Interval and Sample Size
Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum
More informationMath C067 Sampling Distributions
Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters
More informationCapacity of Wireless Networks with Heterogeneous Traffic
Capacity of Wireless Networks with Heterogeeous Traffic Migyue Ji, Zheg Wag, Hamid R. Sadjadpour, J.J. Garcia-Lua-Aceves Departmet of Electrical Egieerig ad Computer Egieerig Uiversity of Califoria, Sata
More informationPlug-in martingales for testing exchangeability on-line
Plug-i martigales for testig exchageability o-lie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk
More informationODBC. Getting Started With Sage Timberline Office ODBC
ODBC Gettig Started With Sage Timberlie Office ODBC NOTICE This documet ad the Sage Timberlie Office software may be used oly i accordace with the accompayig Sage Timberlie Office Ed User Licese Agreemet.
More informationChapter 14 Nonparametric Statistics
Chapter 14 Noparametric Statistics A.K.A. distributio-free statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationLecture 4: Cheeger s Inequality
Spectral Graph Theory ad Applicatios WS 0/0 Lecture 4: Cheeger s Iequality Lecturer: Thomas Sauerwald & He Su Statemet of Cheeger s Iequality I this lecture we assume for simplicity that G is a d-regular
More informationSECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES
SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,
More informationExample 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).
BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly
More information3D Partitioning for Interference and Area Minimization
D Partitioig for Iterferece ad Area Miimizatio Hsi-Hsiug Huag ad Tsai-Mig Hsieh Abstract This work defies a ove probem i which a set of modues is assiged to a set of siico ayers i order to miimize the
More informationResearch Article Sign Data Derivative Recovery
Iteratioal Scholarly Research Network ISRN Applied Mathematics Volume 0, Article ID 63070, 7 pages doi:0.540/0/63070 Research Article Sig Data Derivative Recovery L. M. Housto, G. A. Glass, ad A. D. Dymikov
More informationBASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)
BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet
More informationA Mathematical Perspective on Gambling
A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal
More information