Using Elasticity to Improve Inline Data Deduplication Storage Systems


Yufeng Wang, Temple University, Philadelphia, PA, USA
Chiu C. Tan, Temple University, Philadelphia, PA, USA
Ningfang Mi, Northeastern University, Boston, Massachusetts, USA

Abstract

Elasticity is the ability to scale computing resources such as memory on demand, and is one of the main advantages of using cloud computing services. With the increasing popularity of cloud-based storage, it is natural that more deduplication-based storage systems will be migrated to the cloud. Existing deduplication systems, however, do not adequately take advantage of elasticity. In this paper, we illustrate how to use elasticity to improve deduplication-based systems, and propose EAD (elasticity-aware deduplication), an indexing algorithm that uses the ability to dynamically increase memory resources to improve overall deduplication performance. Our experimental results indicate that EAD is able to detect more than 98% of all duplicate data while consuming less than 5% of the expected memory space. Meanwhile, it achieves four times the deduplication efficiency of the state-of-the-art sampling technique while costing less than half the amount of memory.

I. INTRODUCTION

Data deduplication is a technique used to reduce storage and transmission overhead by identifying and eliminating redundant data segments. Data deduplication plays an important role in existing storage systems [1], and its importance will continue to grow as the amount of data increases (the growth of data is estimated to reach 35 zettabytes in the year 2020). The flexibility and cost advantages of cloud computing providers such as Azure [2] and Amazon [3] make deploying storage services in the cloud an attractive option. A key property of cloud computing is elasticity, the ability to dynamically adjust the amount of computing resources quickly. Elasticity can improve deduplication systems by allowing them to dynamically adjust the amount of memory resources as needed to detect a sufficient amount of duplicate data.
This is especially useful for inline deduplication systems [4], [5], where the index used for deduplication is often kept in memory to avoid the performance bottleneck of disk I/O operations. Intuitively, there is a basic tradeoff between the amount of duplicate data detected and the amount of memory required. Smaller memory resources lead to smaller indexes, which in turn lead to worse deduplication performance due to missed deduplication opportunities. Selecting too much memory, on the other hand, leads to wasted memory, since RAM allocated to the index cannot be used for other purposes. Elasticity provides the ability to scale memory resources as needed to improve deduplication performance without incurring wasted resources.

In this paper, we propose an elasticity-aware deduplication (EAD) algorithm that takes advantage of the elasticity of cloud computing. The key feature of our solution is that our deduplication algorithm is compatible with current deduplication techniques such as sampling to take advantage of locality [6], [7], and content-based chunking [8]-[10]. This means our solution can take advantage of state-of-the-art algorithms to improve performance. Furthermore, we also present a detailed analysis of our algorithm, as well as an evaluation using extensive experiments on a real-world dataset.

The rest of the paper is organized as follows: Section 2 contains the related work. Section 3 explores limitations of existing approaches. EAD is presented in Section 4. Section 5 evaluates our solution, and Section 6 concludes.

II. RELATED WORK

Cloud-based backup systems typically use inline data deduplication, where redundant data chunks are identified at run time to avoid transmitting the redundant data from the source to the cloud. This is opposed to offline deduplication, where the source transmits all the data to the cloud, which then runs the deduplication process to conserve storage space. For the remainder of the paper, deduplication will refer to inline deduplication. Much research has been done to improve the performance of finding duplicate data.
Work by [11] focused on techniques to speed up the deduplication process. Researchers have also proposed different chunking algorithms to improve the accuracy of detecting duplicates [12]-[16]. Other research considers the problem of deduplication of multiple data types [17], [18]. This line of research is complementary to our work and can be easily incorporated into our solution. The ever-increasing amount of data, coupled with the performance gap between in-memory searching and disk lookups, means that disk I/O has increasingly become the performance bottleneck. Recent deduplication research has focused on addressing the problem of limited memory. Work by [19] proposed integrated solutions that can avoid disk I/O on close to 99% of index lookups. However, [19] still keeps index data on disk instead of in memory. Estimation algorithms

like [20] can be used to improve performance by reducing the total number of chunks, but the fundamental problem remains as the amount of data increases. Other existing research in this area has proposed different sampling algorithms to index more data using less memory: [6] introduces a solution that keeps only part of the chunks' information in the index; [7] proposes a more advanced method based on the work in [6] that deletes chunks' fingerprints (FPs) from the index when it is approaching fullness. The fundamental limitation of sampling-based approaches is that it is impossible to maintain deduplication performance by reducing the sampling rate while the input data size increases but RAM resources remain fixed. Our solution builds upon earlier work on sampling by taking advantage of the elasticity property of cloud computing to carefully combine sampling with increasing memory resources.

Fig. 1: Intuitive test on the amount of duplicate data detected on two equal-sized (4.7 GB) VMs by using equal-size indexes. (Plot: duplicate data detected (MB) versus number of entries in the index (x 10^5), for VM1 and VM2.)

III. THE CASE FOR USING ELASTICITY

This section explores some alternatives for improving deduplication performance, and their limitations.

A. Why not pick the best memory size?

One alternative is to try to estimate the appropriate amount of memory needed prior to deploying the deduplication system. A straightforward approach is to perform simple profiling on a sample of data and compute the expected memory requirements based on the results. To illustrate why it is difficult to choose the right amount of RAM in practice, we conducted a simple experiment that represents a storage system used to archive virtual machine (VM) images (this is a common workload used in deduplication evaluations [7], [21]). We want to maximize the deduplication ratio to conserve bandwidth and storage costs. For simplicity, we assume that all VMs are running the same OS and are the same size.
A simple way to estimate memory requirements is to first estimate the index size for a single VM, and then use that to estimate the total RAM necessary for all users. Thus, given n users, each storing a VM of the same size, if we estimate that m amount of RAM is needed to index one user, our backup system will need n · m amount of RAM. We can derive m via experiments. Fig. 1 shows the results for two VMs. As far as we know, VM2 contains more text files while VM1 has more video files. The number of index entry slots indicates how much information about already stored data the system can provide for duplicate detection. We set a fixed number of index entries for duplicate detection and gradually increase it. We see that when the number of index entry slots increases to 270 thousand, both VMs exhibit the same amount of duplicate data. As we increase the index size further, VM1 shows limited improvement, while VM2 shows much better performance. Had we used VM1 to estimate m, the result would have been much lower bandwidth savings, especially if a significant number of VMs resemble VM2. Conversely, buying too much memory is wasteful if most of the data resembles VM1.

B. Using Locality and Downsampling

Storage systems that make use of data deduplication generally operate at the chunk level, and in order to quickly determine potential duplicate chunks, an index of existing chunks needs to be maintained in memory. For example, 100 TB of data will need about 800 GB of RAM for the index under standard deduplication parameters [22]. This makes keeping the entire index in memory challenging. The principle of locality is used to design sampling algorithms that use a smaller index while providing good performance [6]. The locality principle suggests that if chunk X was observed to be surrounded by chunks Y, Z, W in the past, then the next time chunk X appears, there is a high probability that chunks Y, Z, W will also appear. In sampling-based deduplication, the data is first divided into larger segments, each of which contains thousands of chunks.
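As a rough check on the 800 GB figure quoted above, the index size can be estimated as the number of chunks times the size of one index entry. The sketch below assumes an 8 KB average chunk and a 64-byte index entry, the values this paper uses in its own experiments; whether [22] assumes the same parameters is not stated here, and the function name and decimal units are illustrative choices.

```python
def full_index_bytes(data_bytes, avg_chunk_bytes=8_000, entry_bytes=64):
    """Estimate the RAM needed to index every chunk of a dataset.

    Assumptions (not taken from [22]): 8 KB average chunks, 64-byte
    index entries, decimal units (1 TB = 10**12 bytes).
    """
    num_chunks = data_bytes // avg_chunk_bytes
    return num_chunks * entry_bytes


TB = 10**12
GB = 10**9
print(full_index_bytes(100 * TB) / GB)  # index size in GB for 100 TB of data
```

Under these assumptions, one index entry is needed per 8 KB of unique data, so index RAM grows linearly with the stored data, which is why a fixed RAM budget eventually forces either sampling or more memory.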
Deduplication is executed on these segments by checking whether their sampled chunks' fingerprints exist in the index. If a chunk's fingerprint is found in the index, the corresponding segment that contains that chunk is located, and the fingerprint information of all the other chunks in this segment is pre-fetched from disk to the chunk cache in memory.

The downsampling algorithm [7] works as an optimized sampling approach that takes advantage of the locality principle. The difference is that the sampling rate is initialized to 1, which means it picks all the chunks in a segment as its sampled chunks. As the amount of incoming data increases, this value gradually decreases by dropping half of the index entries. Thus the indexing capacity doubles by accepting only a part of the chunks' fingerprints as samples to represent each segment. In other words, instead of indexing chunks X, Y, Z, and W in RAM, the downsampling algorithm will only index chunk X (or another one of the four) in RAM after two adjustments, and keep the rest on disk.

The above sampling-based approaches have two main drawbacks. The first (obvious) drawback is that not all data exhibits locality [17], and thus sampling algorithms do not work well with such datasets. The second drawback is that even for data that exhibits locality, it is difficult to select the correct

sampling rate or how to adjust it, due to the large variance in possible deduplication ratios [23], [24].

IV. EAD: ELASTICITY-AWARE DEDUPLICATION

Storage deduplication services in the cloud often run in virtual machines (VMs). Unlike a conventional OS, which runs directly on physical hardware, the OS in a VM runs on top of a hypervisor or virtual machine monitor, which in turn communicates with the underlying physical hardware. The hypervisor is responsible for increasing the RAM resources of the VM dynamically. This can be done in two generic ways. The first is to use a ballooning algorithm to reclaim memory from other VMs running on the same physical machine (PM) [25]. This is a relatively lightweight process that relies on the OS's memory management algorithm, but it can only add relatively small amounts of memory. Deduplication systems that require increasingly larger amounts of memory need to run a VM migration algorithm [26], [27]. In VM migration, the hypervisor migrates the RAM contents from one PM to another with sufficient memory resources [26]. Regardless of the migration algorithm used, some downtime inevitably occurs when switching over to a new VM [27].

A naive approach to incorporating elasticity is to increase the memory size once the index is close to being full. This naive approach does not perform well, since frequent migrations induce a high overhead. Furthermore, the naive approach always retains the entire old index during each migration, even those index entries that do not fingerprint many chunks. Such poorly performing index entries take up valuable index space without providing much benefit.

Our approach combines the benefits of downsampling [7] and VM migration to allow users to maintain a satisfactory level of performance by adjusting the sampling rate and memory size accordingly. Our system design consists of two components: an EAD client, which is responsible for file chunking, fingerprint computation, and sampling, and an EAD server, which controls index management and other memory management operations.
The EAD client runs on the client side, for instance at the gateway server of a large company. The EAD server can be executed by the cloud provider. The entire system design is shown in Fig. 2. Only unique data is stored in Physical Storage. The File Manager is responsible for data retrieval and maintenance; how it works is outside the scope of this paper.

Fig. 2: EAD infrastructure.

A. EAD Algorithm

Different types of users have different deduplication requirements. Some users will be willing to tolerate worse deduplication performance in exchange for lower costs, while others will not. To accommodate different requirements, EAD is designed to allow a user to specify a migration trigger, Γ (Γ ∈ (0, 1)), which specifies the level of deduplication performance the user is willing to accept. Deduplication performance is usually measured by the reduction ratio [18], [28], which is the size of the original dataset divided by the size of the dataset after deduplication. To help the user select the migration trigger, we define the Deduplication Ratio (DR) as

$$\mathrm{DR} = 1 - \frac{\text{Size after deduplication}}{\text{Size of original data}}.$$

Intuitively, we would like to first apply downsampling until the deduplication performance becomes unsatisfactory, and then migrate the index to larger memory in order to obtain better performance. EAD will migrate to larger RAM only when migration will result in deduplication performance better than Γ. This has an important but subtle implication. EAD will not always migrate when deduplication performance falls under Γ, but only when migration will improve performance. This is important because, given a dataset that inherently exhibits poor deduplication characteristics [29], adding more RAM will incur the migration overhead without improving deduplication performance. This means that EAD cannot simply compare the measured DR against Γ, because the measured DR may not necessarily reflect the amount of duplication that exists. To illustrate, let us assume that the deduplication system measures its DR and finds it to be less than Γ. There are two possibilities.
The first is that the system has performed overly aggressive downsampling and can benefit from increasing RAM. The second possibility is that the dataset itself has poor deduplication performance, e.g., data in multimedia or encrypted files. In this case, increasing RAM does not result in better performance.

How our EAD algorithm determines when to migrate to more RAM resources can be found in Alg. 1. It executes in two phases, as generic inline deduplication systems do. We use S_in and x to denote the incoming segment and the chunks inside it. FP_x represents the fingerprint of chunk x. In Phase I, the EAD Client sends the fingerprints (FP_x) of all chunks in each segment S_in (∀x ∈ S_in) to the EAD Server, labeling as FP_x^est and FP_x^dedup those of the sampled chunks used for estimation and duplicate detection, respectively. The EAD Server searches the index table T and the chunk cache to identify duplicates, and updates the estimation base B. In the results generated, each chunk x is marked as dup or unq, indicating whether it is a duplicate or a unique chunk. Based on these results, the EAD Client transmits only unique data chunks, along with the metadata of duplicate ones, to the EAD Server in Phase II, saving bandwidth and storage

space. In the meantime, the current sampling rate R_0 is subject to change to R based on deduplication performance. Details of the features of the EAD algorithm are presented next.

B. Estimating Possible Deduplication Performance

One of the key features of EAD is that the algorithm is able to determine whether migration will be beneficial. In order to distinguish whether poor deduplication performance is due to overly aggressive downsampling or is inherent in the dataset, we first need to be able to estimate the potential DR of the dataset. Obtaining the actual DR is impractical since it requires performing the entire deduplication process. Prior work [30] provided an algorithm to estimate the deduplication performance for static, fixed-size datasets. Their algorithm requires the actual data to be available in order to perform random sampling and comparisons. However, in our problem, the dataset can be viewed as a stream of data. There is no prior knowledge of the size or characteristics of the data to be stored. We also cannot scan back and forth over the complete dataset for estimation.

In our EAD algorithm, we let the EAD Server maintain an estimation base B. The EAD Client randomly selects κ fingerprints from each segment and sends them to the EAD Server to be stored in B. If n_s segments have come in, there will be κ · n_s samples, a number that grows with the amount of incoming data. Each entry slot in B includes a fingerprint as well as two counters, x_c1 and x_c2, where counter x_c1 records the number of occurrences of fingerprint FP_x in B, and x_c2 records the number of occurrences of fingerprint FP_x among the fingerprints of all the chunks uploaded. We integrate our estimation process into the regular deduplication operations so as to avoid the separate sampling and scanning phases of [30]. While the client sends the samples for duplicate searching to the storage server, the samples for estimation are transmitted at the same time for updating B.
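The estimation base described above can be sketched as a dictionary from fingerprint to a pair of counters. This is a minimal illustration rather than the authors' implementation: the class and method names are hypothetical, fingerprints are represented as plain hashable values, and a newly sampled fingerprint starts with c1 = 1 so that c1 counts every sampled occurrence, which is an assumption about how the counter is meant to be initialized.

```python
import random
from dataclasses import dataclass


@dataclass
class Counters:
    c1: int = 0  # occurrences of the fingerprint among the samples in B
    c2: int = 0  # occurrences among all uploaded chunk fingerprints


class EstimationBase:
    """Hypothetical sketch of the estimation base B kept by the server."""

    def __init__(self, kappa, seed=None):
        self.kappa = kappa            # samples drawn per segment
        self.entries = {}             # fingerprint -> Counters
        self.num_samples = 0          # equals kappa * n_s for full segments
        self._rng = random.Random(seed)

    def add_segment(self, fingerprints):
        """Sample kappa fingerprints from one segment, then count matches."""
        k = min(self.kappa, len(fingerprints))
        for fp in self._rng.sample(list(fingerprints), k):
            self.entries.setdefault(fp, Counters()).c1 += 1
            self.num_samples += 1
        # Every chunk of the segment is compared against B, as happens
        # during the regular chunk-cache comparison.
        for fp in fingerprints:
            entry = self.entries.get(fp)
            if entry is not None:
                entry.c2 += 1
```

Because the stream is processed once, c2 only counts occurrences seen from the moment a fingerprint enters B; earlier occurrences in already processed segments are not revisited in this sketch.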
During the comparison of incoming chunks' fingerprints against those in the chunk cache, we update B again, incrementing the counter x_c2 by one every time its corresponding fingerprint appears. Thus, there is no extra overhead for estimation. Using B, we can compute the estimated deduplication ratio, EDR, as

$$\mathrm{EDR} = 1 - \frac{1}{\kappa\, n_s} \sum_{x \in B} \frac{x_{c1}}{x_{c2}}.$$

The computation of EDR happens when the index size is approaching the memory limit. Only when DR is smaller than Γ · EDR is there a potential performance improvement from migration, and EAD will migrate the index to larger RAM. Otherwise, EAD will apply downsampling to the index in exchange for larger indexing capacity.

Algorithm 1: Elastic deduplication strategy
  Input: the incoming segment S_in
  Deduplication Phase I: Identify duplicate chunks
    ∀x ∈ S_in: EAD Client sends FP_x to EAD Server
    for all FP_x^dedup do
      if FP_x^dedup ∈ T then
        Locate its corresponding segment S_dup
        ∀x_j ∈ S_dup: fetch the information of x_j (FP_xj); set x_j ∈ chunk cache
      else
        Add FP_x^dedup to T
    for all FP_x^est do
      if FP_x^est ∈ B then
        x_c1 = x_c1 + 1
      else
        Add FP_x^est to B; set x_c1 = x_c2 = 0
    for all x ∈ S_in do
      ∀x_k ∈ chunk cache: compare FP_x with FP_xk
      if FP_x = FP_xk then
        Set x ∈ dup (x_dup)
      else
        Set x ∈ unq (x_unq)
      ∀x_l ∈ B: compare FP_x with FP_xl
      if FP_x = FP_xl then
        x_c2 = x_c2 + 1
  Deduplication Phase II: Data transmission
    for all x ∈ S_in do
      Transmit x_unq along with only the metadata of x_dup
    EAD finishes processing S_in
    if the index is approaching the RAM limit then
      if DR < Γ · EDR then
        if R_0 = 1 then
          EAD sets Γ = DR / EDR
        else
          EAD triggers migration and sets the new sampling rate R from R_0
      else
        EAD downsamples the index, lowering the sampling rate from R_0 to R

C. EAD Refinements

The performance of the EAD algorithm can be further improved by observing additional information obtained at run time and then adjusting the parameters of the algorithm.

Adjusting Γ. The parameter Γ is specified by the user, and indicates the user's desired level of deduplication performance.
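The EDR estimate and the decision taken when the index nears the RAM limit, including the adjustment of Γ discussed in this paragraph, can be sketched as follows. The function names and the string labels for the three outcomes are hypothetical, and entries with x_c2 = 0 are skipped to avoid division by zero, which is an assumption of this sketch rather than something the algorithm states.

```python
def estimated_dr(counters, num_samples):
    """EDR = 1 - (1 / (kappa * n_s)) * sum over B of c1 / c2.

    counters: iterable of (c1, c2) pairs, one per fingerprint in B.
    num_samples: total number of samples taken, i.e. kappa * n_s.
    """
    unique_fraction = sum(c1 / c2 for c1, c2 in counters if c2 > 0) / num_samples
    return 1.0 - unique_fraction


def next_action(dr, edr, gamma, sampling_rate):
    """Decide what to do when the index approaches the RAM limit.

    Returns (action, gamma), where gamma may have been lowered.
    """
    if dr < gamma * edr:
        if sampling_rate == 1:
            # Even a full-rate index cannot reach the target: lower gamma
            # to what the system can currently achieve.
            return "lower_gamma", dr / edr
        return "migrate", gamma
    return "downsample", gamma


# Stream a, b, a, c with every chunk sampled: 3 unique chunks out of 4.
print(estimated_dr([(2, 2), (1, 1), (1, 1)], 4))
print(next_action(dr=0.5, edr=0.8, gamma=0.9, sampling_rate=0.5))
```

Comparing DR against Γ · EDR rather than against Γ alone is what keeps the system from migrating when the data itself contains few duplicates.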
However, the user may sometimes be unaware of the underlying potential deduplication performance of the data and set an excessively high Γ value, resulting in unnecessary migration over time. We adjust the user's Γ value to DR / EDR after each migration, and also in the case where DR has not reached acceptable performance even though the sampling rate is one, so that Γ represents the current system's maximum deduplication ability. In this way, EAD is able to adapt elastically to variations in incoming data.

Amount of RAM and Sampling Rate post migration. A simple way to compute the amount of RAM to allocate after

migration is to use a fixed factor, e.g., doubling the RAM each time (a growth factor of 2). We then reset the sampling rate back to 1, and start all over again. We can improve on this process by observing the next-to-last sampling rate used prior to migration. This rate is the last known sampling rate that produced acceptable deduplication performance. This is valid because if it had not produced acceptable performance, EAD would have already triggered migration. Once we have this new sampling rate, we can compute the amount of RAM by introducing a new counter d (initialized to zero) that records occurrences of downsampling. We can then compute the new RAM, RAM_new, as

{ RAM org d = 1 RAM new = [1 d 1 =1 1 1 ] RAM org d 2

As the number of downsampling operations increases, EAD requires less RAM for the index table after migration. Compared with always requiring a fixed multiple of the original RAM, this optimized approach achieves higher memory utilization efficiency.

Managing Size of B. One concern with our estimation scheme is that the size of B may become too large. If we need a large amount of RAM to store B, we will be wasting RAM resources that could be used for the index. In practice, the size of B is relatively modest. Each entry in B consists of a fingerprint and two counters. Using SHA-1 to compute the fingerprint results in a 20-byte fingerprint. An additional four bytes are used for each counter. Thus, each B entry is 28 bytes, indicating that the total size of B would be at most approximately MB to support 1 TB of data. In our experiment, it only requires 4.32 MB for estimating GB dataset.

V. IMPLEMENTATION

A. Experimental setup

For our experiments, we collected a dataset consisting of VMs that all run the Ubuntu OS, but each VM has different types of software and utilities installed and contains different types of application data, the majority of which comes from Wikimedia Archives [31] and OpenfMRI [32]. The total size of our dataset is approximately GB.
While the dataset size is relatively modest compared to some prior work [5], [10], [18], we believe that it still adequately reflects real-world usage of backup systems, such as backing up employee laptops. To ensure a fair comparison, we have scaled down our index size to correspond to our dataset size, in order to better represent a large-scale environment. We have implemented our EAD algorithm in Java. For all experiments, we use variable-block deduplication with a minimum and maximum chunk size of 4 KB and 16 KB, respectively, and a corresponding average chunk size of 8 KB. We set the segment size to 16 MB. These are common parameters used in previous research [33], [19]. The experiments are carried out on a 4-core Intel T at 2.60GHz with 8 GB of RAM in total, running Linux.

We set the down-sampling trigger to 0.85, which means that when index usage approaches 85% of its current limit, the index will be down-sampled (half of its entries will be removed, e.g., by deleting index FPs with FP mod 2 = 0). We evaluate our solution, denoted as Elastic in the figures, against two alternative approaches. The first alternative, denoted as FullIndex, represents an ideal situation where there is unlimited RAM available. This serves as an upper bound on the total amount of space savings. The other alternative is denoted as DownSample, which is based on [7], a recent approach that dynamically adjusts the sampling rate to deal with insufficient RAM.

TABLE I: RAM deployment for the index under different deduplication strategies. (Columns: strategy, # of index entries, size of index (MB). Rows: Full index, With down-sample, EAD.)

B. Deduplication ratio

Here we compare our algorithm with a generic deduplication mechanism without sampling and a state-of-the-art high-performance deduplication strategy with a down-sampling mechanism [7]. Before deploying the deduplication process, we allocate a specific amount of RAM for the index in each strategy. Table I shows the amount of RAM allocated for the different deduplication strategies.
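The down-sampling step described in the experimental setup can be sketched as below. The first step follows the example given in the text (delete fingerprints with FP mod 2 = 0); using the next higher bit at each later step is an assumption made here so that repeated steps keep halving the index. Function names and the dictionary representation of the index are illustrative.

```python
import hashlib


def fingerprint(chunk: bytes) -> int:
    """SHA-1 fingerprint of a chunk as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(chunk).digest(), "big")


def should_downsample(used_slots, capacity, trigger=0.85):
    """True once index usage reaches the trigger fraction of its capacity."""
    return used_slots >= trigger * capacity


def downsample(index, step):
    """Drop roughly half of the index entries.

    step is 1 for the first down-sampling, 2 for the second, and so on.
    Step 1 deletes fingerprints with FP mod 2 == 0; step k tests bit k-1
    (an assumption of this sketch, not stated in the paper).
    """
    bit = 1 << (step - 1)
    return {fp: meta for fp, meta in index.items() if fp & bit}


index = {fp: None for fp in range(16)}   # toy fingerprints 0..15
index = downsample(index, step=1)         # keeps the odd fingerprints
index = downsample(index, step=2)         # keeps those with bit 1 set as well
print(sorted(index))
```

Because SHA-1 output bits are close to uniformly distributed, testing one bit removes about half of the entries on real fingerprints, and a new fingerprint can be checked against the same bit mask before being admitted to the index.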
We set the size of each entry slot in the index to 64 bytes, consisting of three parts: the FP, chunk metadata (storage address, chunk length, etc.), and a counter, which take 20 bytes (SHA-1 hash signature [34]), 40 bytes, and 4 bytes, respectively. These sizes may vary under different hash functions or addressing policies, but they will not differ much. We assume that the capacity is 75 GB. Nearly 10 million index entries are needed to index all the unique data if we do not use any sampling strategy. Under the down-sample strategy with a minimum sampling rate of 0.05, we need 500K index entries for 75 GB of unique data. EAD always picks a much more conservative index size, specifically only 100K entry slots in this case.

We use the Normalized Deduplication Ratio as the metric for comparing deduplication ratios. It is defined as the ratio of the measured Deduplication Ratio to the Deduplication Ratio of FullIndex deduplication. Note that FullIndex detects all the duplicate data chunks and achieves the highest deduplication ratio. Thus, the metric is meaningful because it indicates how close the measured deduplication ratio is to the ideal deduplication ratio achievable in the system.

Fig. 3(a) shows the Normalized Deduplication Ratio of the above deduplication strategies. Downsampling and the computation of EDR happen when index usage approaches 85% of its capacity. The down-sample strategy has a ratio higher than 99.5%, showing the benefit of taking advantage of locality. EAD does not achieve an equally high ratio, but the gap is less than 2%. Also consider that the performance requirement for EAD is defined by Γ, which is 0.95 in this case; the performance of Elastic is always higher than 98%, better than what is required.

However, comparing the deduplication ratio alone is not a fair way to evaluate performance, since these three strategies devote different amounts of RAM to the index from the start. Fig. 3(b) shows how the sampling rate and the number of index slots used vary in the above cases. Clearly, going without sampling incurs too much memory cost. We notice that both DownSample and Elastic have comparatively very low memory cost (a small number of index entry slots). We can also observe that when about 5% of the data has been processed, the sampling rate in Elastic increases, reflecting its elasticity. The above results show that EAD is able to use less RAM to achieve a satisfactory deduplication ratio, which is only slightly lower than that of the other two. Next we derive a more meaningful metric, Deduplication Efficiency, as a single utility measure that encompasses both deduplication ratio and RAM cost, to make a fairer comparison among these three strategies.

Fig. 3: (a) Deduplication ratio comparison (normalized deduplication ratio (%) for FullIndex, DownSample, and Elastic). (b) Index usage comparison (sampling rate and # of index slots for FullIndex, DownSample, and Elastic). The sampling rate is 1 for all of them at the start of the backup. The index migration in EAD is triggered when the normalized deduplication ratio drops below 95% (Γ = 0.95); after that, the sampling rate doubles (growth factor of 2).

C. Deduplication efficiency

As discussed in Section V-B, neither the deduplication ratio nor the memory cost alone can fully represent system performance. Therefore we define

$$\text{Deduplication Efficiency} = \frac{\text{Duplicate Data Detected}}{\text{Index Entry Slots}}$$

as a more advanced performance evaluation criterion. Using this criterion, we make a fairer comparison among EAD and the other two solutions, as shown in Fig. 4. It shows that Elastic outperforms both DownSample and FullIndex on efficiency. Notice that Elastic always yields a higher efficiency, almost 4 times that of DownSample and 30 times that of FullIndex.

Fig. 4: Deduplication efficiency performance. (Deduplication efficiency (MB/slot) for FullIndex, DownSample, and Elastic.)
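The three metrics used in this evaluation (Deduplication Ratio, Normalized Deduplication Ratio, and Deduplication Efficiency) are simple ratios, sketched below. The function names are illustrative, and the numbers in the usage lines are made-up inputs chosen only to show the units, not results from the paper.

```python
def deduplication_ratio(original_bytes, stored_bytes):
    """DR = 1 - size after deduplication / size of original data."""
    return 1.0 - stored_bytes / original_bytes


def normalized_ratio(measured_dr, full_index_dr):
    """Measured DR relative to the DR of an unlimited (FullIndex) index."""
    return measured_dr / full_index_dr


def deduplication_efficiency(duplicate_mb_detected, index_entry_slots):
    """Duplicate data detected per index entry slot, in MB per slot."""
    return duplicate_mb_detected / index_entry_slots


# Hypothetical inputs for illustration only.
dr = deduplication_ratio(original_bytes=100, stored_bytes=40)
print(dr, normalized_ratio(dr, full_index_dr=0.75))
print(deduplication_efficiency(duplicate_mb_detected=500.0, index_entry_slots=100_000))
```

Dividing by the number of index slots is what lets a strategy with a slightly lower ratio but a far smaller index come out ahead on efficiency.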
This is because its elasticity enables it to use as little memory as possible to detect as much duplicate data as required, avoiding the memory waste of the other two.

D. Elasticity Optimization

In this section, we explore variations of EAD from different aspects.

Monitoring accuracy. EAD can work properly only when it is able to accurately monitor the real-time deduplication efficiency. As the criterion for judging deduplication performance, the estimated deduplication ratio should be as accurate as possible. Otherwise, elasticity might have an unexpected effect on performance if it leads to an inappropriate decision on index migration. Fig. 5 shows the accuracy of the monitored deduplication ratios during the backup process. 500 independent tests were conducted on the dataset. We consider the ratio of the estimated deduplication ratio in EAD to that in FullIndex as the error deviation, which indicates the real-time accuracy of monitoring. From the figure we can see that initially the error deviation is at most 10%, but as more data comes in, the deviation reduces to 2%, which offers a reliable criterion for evaluating system performance. According to [30], the reduction ratio of a dataset of up to 7 TB can be estimated with an error of less than 1%. Notice that here we estimate the ratio dynamically, which is a very different situation (as elaborated in Section IV-A). It can be seen from Fig. 5 that the deviation is higher than expected when only part of the dataset has been estimated. We therefore reserve more RAM for estimation, even though it only costs approximately 4.32 MB of RAM for samples.

Impact of initial index size. There is no standardized criterion for selecting the amount of memory for the index table in deduplication systems. We only give examples of memory

size of one fifth of that of existing solutions in Section V-B. The elasticity of EAD is meant to tolerate low memory, which allows us to assign very little RAM to the index at the beginning, since EAD is able to migrate the index table if there is not enough space. We explore and verify its performance under different initial memory sizes for the index. Table II shows the normalized deduplication ratio as data comes in when the initial index entry slots are 100K, 150K, and 200K, respectively. It is no surprise that a smaller index table detects less duplicate data, but the gap is at most approximately 4%. A more interesting observation is that, with our algorithm, the case with an initially smaller index table is sometimes able to achieve an even higher deduplication ratio. This is because it has a higher probability of being migrated, so there are more index adjustments, which bring performance improvements. Fig. 6 shows the Deduplication Efficiency of EAD with different initial memory sizes. It shows that the most conservative RAM initialization achieves the highest deduplication efficiency, which also shows that EAD provides a good balance between memory and storage savings.

Fig. 5: Estimation accuracy verification (error deviation versus percentage of data processed (%)). 20 samples per segment are randomly picked from the incoming data.

Fig. 6: EAD performance under different initial index sizes (deduplication efficiency, log scale, for Elastic100k, Elastic150k, and Elastic200k).

Impact of Γ. We then go a step further to verify the effectiveness of EAD under different policies. Fig. 7 shows the performance when we apply different values of Γ, with 100K initial index entry slots. As we analyzed, Γ represents the system's tolerance to missed duplicate data. The higher the trigger is, the more sensitive the system is and the more easily migration is triggered, and vice versa. A higher Γ guarantees a higher deduplication ratio, as shown in Fig. 7(a), although not by much in this case. However, we notice that the Γ = 0.95 case also yields the highest overall efficiency, which implies that EAD is able to win on both deduplication ratio and efficiency.

Fig. 7: The performance under different migration triggering values. Panels: (a) deduplication ratio (%), (b) deduplication efficiency (log scale), each for Elastic(80), Elastic(85), Elastic(90), and Elastic(95). Shown are results of EAD performance when migration is triggered as the measured deduplication ratio falls below 80%, 85%, 90%, and 95% of the estimated one, with a growth factor of 2.

Memory usage comparison. Our goal is to introduce an elasticity-aware deduplication solution comparable to existing approaches. Based on the above results, we are able to estimate the overall memory overhead of EAD, and here we compare it to the state of the art. Aside from the memory needed for the index, EAD requires extra space for estimation. As shown in Table III, the extra memory overhead of EAD, compared with the other two, comes mainly from the estimation part. Even so, the total RAM cost of EAD is less than 50% and 5% of that of DownSample and FullIndex, respectively. Also note that the index grows by 0.1 MB by the end of deduplication because of EAD's conservative migration mechanism.

VI. CONCLUSION AND FUTURE WORK

As a significant technique for eliminating duplicate data, deduplication largely reduces storage usage and bandwidth

occupation in enterprise backup systems; the use of sampling further addresses both the chunk-lookup disk bottleneck and limited memory. However, setting the sampling rate with only memory size in mind cannot guarantee the performance of the whole system. We proposed an elasticity-aware deduplication solution in which deduplication performance and memory size are both considered. We showed in detail, by case analysis, EAD's efficient adjustment of the sampling rate, which shows that EAD achieves much better performance than existing algorithms, offering a complete guideline for its large-scale deployment.

Directions for future research mainly focus on the large-scale implementation of our proposed solution. We aim to build such a deduplication infrastructure, verifying its elasticity and explicitly demonstrating the space savings in both storage and memory. Another long-term goal is to explore its application in distributed environments at an even larger scale.

TABLE II: The normalized deduplication ratios when we deploy different amounts of memory for the index. The ratio is measured each time 200 incoming data segments have been processed (successive columns), Γ = 0.9.

Initial index   | Normalized deduplication ratio
100K (6.4 MB)   | 99.73% | 99.08% | 96.23% | 94.76% | 99.66% | 99.11% | 97.13% | 94.41% | 93.93%
150K (9.6 MB)   | 99.79% | 98.97% | 99.14% | 98.74% | 98.25% | 97.63% | 97.31% | 96.70% | 96.61%
200K (12.8 MB)  | 99.79% | 99.62% | 99.72% | 99.17% | 98.60% | 98.88% | 98.69% | 97.89% | 97.72%

TABLE III: The RAM cost, broken into index and estimation (Est.) parts, under different deduplication strategies. Γ = 0.95 and a growth factor of 2 in this case.

Strategy    | Initial Index (MB) | Final Index (MB) | Est. (MB) | Total (MB)
EAD         | 6.40               | 6.50             |           |
Down-sample | 32                 | 25.91            |           |
Full Dedup  | 640                |                  |           |

REFERENCES

[1] D. Geer, "Reducing the storage burden via data deduplication," Computer.
[2] B. Calder, J. Wang et al., "Windows Azure Storage: a highly available cloud storage service with strong consistency," in Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles.
[3] Amazon S3, Cloud Computing Storage for Files, Images, Videos. Accessed in 03/2013.
[4] T. T. Thwel and N. L. Thein, "An efficient indexing mechanism for data deduplication," in Current Trends in Information Technology (CTIT).
[5] K. Srinivasan, T. Bisson et al., "iDedup: Latency-aware, inline data deduplication for primary storage," in Proceedings of the 10th USENIX Conference on File and Storage Technologies.
[6] M. Lillibridge, K. Eshghi, D. Bhagwat, V. Deolalikar, G. Trezise, and P. Camble, "Sparse indexing: large scale, inline deduplication using sampling and locality," in Proceedings of the 7th Conference on File and Storage Technologies.
[7] F. Guo and P. Efstathopoulos, "Building a high performance deduplication system," in Proceedings of the 2011 USENIX Conference on USENIX Annual Technical Conference.
[8] A. Adya et al., "FARSITE: Federated, available, and reliable storage for an incompletely trusted environment," ACM SIGOPS Operating Systems Review.
[9] G. Forman, K. Eshghi, and S. Chiocchetti, "Finding similar files in large document repositories," in Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining.
[10] U. Manber et al., "Finding similar files in a large file system," in Proceedings of the USENIX Winter 1994 Technical Conference.
[11] A. Sabaa, P. Kumar et al., "Inline Wire Speed Deduplication System," 2010, US Patent App. 12/797,032.
[12] L. L. You and C. Karamanolis, "Evaluation of efficient archival storage techniques," in Proceedings of the 21st IEEE/12th NASA Goddard Conference on Mass Storage Systems and Technologies.
[13] E. Kruus, C. Ungureanu, and C. Dubnicki, "Bimodal content defined chunking for backup streams," in Proceedings of the 8th USENIX Conference on File and Storage Technologies.
[14] J. Min, D. Yoon, and Y. Won, "Efficient deduplication techniques for modern backup operation," IEEE Transactions on Computers.
[15] A. Muthitacharoen, B. Chen, and D. Mazieres, "A low-bandwidth network file system," in ACM SIGOPS Operating Systems Review.
[16] K. Eshghi and H. K. Tang, "A framework for analyzing and improving content-based chunking algorithms," Hewlett-Packard Labs Technical Report TR.
[17] W. Xia, H. Jiang et al., "SiLo: a similarity-locality based near-exact deduplication scheme with low RAM overhead and high throughput," in Proceedings of the USENIX Annual Technical Conference.
[18] D. Bhagwat, K. Eshghi, D. D. Long, and M. Lillibridge, "Extreme binning: Scalable, parallel deduplication for chunk-based file backup," in Modeling, Analysis & Simulation of Computer and Telecommunication Systems.
[19] B. Zhu, K. Li, and H. Patterson, "Avoiding the disk bottleneck in the Data Domain deduplication file system," in Proceedings of the 6th USENIX Conference on File and Storage Technologies.
[20] G. Lu, Y. Jin, and D. H. Du, "Frequency based chunking for data de-duplication," in Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS).
[21] C. Kim, Park et al., "Rethinking deduplication in cloud: From data profiling to blueprint," in Networked Computing and Advanced Information Management (NCM).
[22] G. Wallace, F. Douglis, H. Qian, P. Shilane, S. Smaldone, M. Chamness, and W. Hsu, "Characteristics of backup workloads in production systems," in Proceedings of the Tenth USENIX Conference on File and Storage Technologies (FAST12).
[23] P. Kulkarni, F. Douglis, J. LaVoie, and J. M. Tracey, "Redundancy elimination within large collections of files," in Proceedings of the USENIX Annual Technical Conference.
[24] D. T. Meyer and W. J. Bolosky, "A study of practical deduplication," ACM Transactions on Storage (TOS).
[25] C. A. Waldspurger, "Memory resource management in VMware ESX server," ACM SIGOPS Operating Systems Review.
[26] F. Travostino, P. Daspit et al., "Seamless live migration of virtual machines over the MAN/WAN," Future Generation Computer Systems.
[27] C. Clark, K. Fraser, S. Hand, J. G. Hansen, E. Jul, C. Limpach, I. Pratt, and A.
Warfeld, Lve mgraton of vrtual machnes, n Proceedngs of the 2nd conference on Symposum on Networked Systems Desgn & Implementaton-Volume 2, [28] M. Dutch, Understandng data deduplcaton ratos, n SNIA Data Management Forum, [29] M. Hbler, L. d. Stoller et al., Fast, Scalable Dsk Imagng wth Frsbee, n USENIX Annual Techncal Conference, General Track, [30] D. Harnk, O. Margalt, D. Naor, D. Sotnkov, and G. Vernk, Estmaton of deduplcaton ratos n large data sets, n Mass Storage Systems and Technologes (MSST), 2012 IEEE 28th Symposum on, [31] Wkmeda Downloads Hstorcal Archves, Accessed n 04/2013, http: //dumps.wkmeda.org/archve/. [32] OpenfMRI Datasets, Accessed n 05/2013, https://openfmr.org/ data-sets. [33] B. Debnath, S. Sengupta, and J. L, ChunkStash: speedng up nlne storage deduplcaton usng flash memory, n Proceedngs of the 2010 USENIX conference on USENIX annual techncal conference, [34] J. H. Burrows, Secure hash standard, DTIC Document, Tech. Rep., 1995.
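The trigger behaviour discussed in the evaluation (migration fires when the measured deduplication ratio falls below a fraction Γ of the estimated ratio, checked each time 200 incoming data segments have been processed) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the class name `MigrationTrigger`, its method names, and the way the two ratios are supplied by the caller are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MigrationTrigger:
    """Hypothetical helper that decides when to request more index memory."""

    gamma: float                # trigger threshold Γ, e.g. 0.95
    check_interval: int = 200   # segments between checks (from the Table II caption)
    _seen: int = 0              # segments processed so far

    def should_migrate(self, measured_ratio: float, estimated_ratio: float) -> bool:
        # Migration is triggered when the measured deduplication ratio
        # falls below gamma times the estimated ratio.
        return measured_ratio < self.gamma * estimated_ratio

    def on_segment(self, measured_ratio: float, estimated_ratio: float) -> bool:
        # Count the segment and evaluate the trigger only at interval boundaries.
        self._seen += 1
        if self._seen % self.check_interval != 0:
            return False
        return self.should_migrate(measured_ratio, estimated_ratio)
```

With a measured ratio at 90% of the estimate, a trigger built with Γ = 0.95 asks for migration while one built with Γ = 0.80 does not, which matches the observation that a higher Γ makes the system more sensitive.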


More information

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu

EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP. Kun-chan Lan and Tsung-hsun Wu EVALUATING THE PERCEIVED QUALITY OF INFRASTRUCTURE-LESS VOIP Kun-chan Lan and Tsung-hsun Wu Natonal Cheng Kung Unversty klan@cse.ncku.edu.tw, ryan@cse.ncku.edu.tw ABSTRACT Voce over IP (VoIP) s one of

More information

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000 Problem Set 5 Solutons 1 MIT s consderng buldng a new car park near Kendall Square. o unversty funds are avalable (overhead rates are under pressure and the new faclty would have to pay for tself from

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ

Efficient Striping Techniques for Variable Bit Rate Continuous Media File Servers æ Effcent Strpng Technques for Varable Bt Rate Contnuous Meda Fle Servers æ Prashant J. Shenoy Harrck M. Vn Department of Computer Scence, Department of Computer Scences, Unversty of Massachusetts at Amherst

More information

Fair Virtual Bandwidth Allocation Model in Virtual Data Centers

Fair Virtual Bandwidth Allocation Model in Virtual Data Centers Far Vrtual Bandwdth Allocaton Model n Vrtual Data Centers Yng Yuan, Cu-rong Wang, Cong Wang School of Informaton Scence and Engneerng ortheastern Unversty Shenyang, Chna School of Computer and Communcaton

More information

Nonlinear data mapping by neural networks

Nonlinear data mapping by neural networks Nonlnear data mappng by neural networks R.P.W. Dun Delft Unversty of Technology, Netherlands Abstract A revew s gven of the use of neural networks for nonlnear mappng of hgh dmensonal data on lower dmensonal

More information

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Study on Model of Risks Assessment of Standard Operation in Rural Power Network Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,

More information

2. SYSTEM MODEL. the SLA (unlike the only other related mechanism [15] we can compare it is never able to meet the SLA).

2. SYSTEM MODEL. the SLA (unlike the only other related mechanism [15] we can compare it is never able to meet the SLA). Managng Server Energy and Operatonal Costs n Hostng Centers Yyu Chen Dept. of IE Penn State Unversty Unversty Park, PA 16802 yzc107@psu.edu Anand Svasubramanam Dept. of CSE Penn State Unversty Unversty

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng

More information

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The

More information

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall SP 2005-02 August 2005 Staff Paper Department of Appled Economcs and Management Cornell Unversty, Ithaca, New York 14853-7801 USA Farm Savngs Accounts: Examnng Income Varablty, Elgblty, and Benefts Brent

More information

1 Approximation Algorithms

1 Approximation Algorithms CME 305: Dscrete Mathematcs and Algorthms 1 Approxmaton Algorthms In lght of the apparent ntractablty of the problems we beleve not to le n P, t makes sense to pursue deas other than complete solutons

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

8 Algorithm for Binary Searching in Trees

8 Algorithm for Binary Searching in Trees 8 Algorthm for Bnary Searchng n Trees In ths secton we present our algorthm for bnary searchng n trees. A crucal observaton employed by the algorthm s that ths problem can be effcently solved when the

More information

Demographic and Health Surveys Methodology

Demographic and Health Surveys Methodology samplng and household lstng manual Demographc and Health Surveys Methodology Ths document s part of the Demographc and Health Survey s DHS Toolkt of methodology for the MEASURE DHS Phase III project, mplemented

More information

QOS DISTRIBUTION MONITORING FOR PERFORMANCE MANAGEMENT IN MULTIMEDIA NETWORKS

QOS DISTRIBUTION MONITORING FOR PERFORMANCE MANAGEMENT IN MULTIMEDIA NETWORKS QOS DISTRIBUTION MONITORING FOR PERFORMANCE MANAGEMENT IN MULTIMEDIA NETWORKS Yumng Jang, Chen-Khong Tham, Ch-Chung Ko Department Electrcal Engneerng Natonal Unversty Sngapore 119260 Sngapore Emal: {engp7450,

More information

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features On-Lne Fault Detecton n Wnd Turbne Transmsson System usng Adaptve Flter and Robust Statstcal Features Ruoyu L Remote Dagnostcs Center SKF USA Inc. 3443 N. Sam Houston Pkwy., Houston TX 77086 Emal: ruoyu.l@skf.com

More information

J. Parallel Distrib. Comput. Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers

J. Parallel Distrib. Comput. Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers J. Parallel Dstrb. Comput. 71 (2011) 732 749 Contents lsts avalable at ScenceDrect J. Parallel Dstrb. Comput. ournal homepage: www.elsever.com/locate/pdc Envronment-conscous schedulng of HPC applcatons

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information