Towards Zero-Overhead Static and Adaptive Indexing in Hadoop

Stefan Richter, Jorge-Arnulfo Quiané-Ruiz, Stefan Schuh, Jens Dittrich

S. Richter, S. Schuh, J. Dittrich: Information Systems Group, Saarland University
J.-A. Quiané-Ruiz: Qatar Computing Research Institute, Qatar Foundation

Abstract

Hadoop MapReduce has evolved into an important industry standard for massively parallel data processing and has become widely adopted for a variety of use cases. Recent works have shown that indexes can improve the performance of selective MapReduce jobs dramatically. However, one major weakness of existing approaches is their high index creation cost. We present HAIL (Hadoop Aggressive Indexing Library), a novel indexing approach for HDFS and Hadoop MapReduce. HAIL creates different clustered indexes over terabytes of data with minimal, often invisible costs, and it dramatically improves runtimes of several classes of MapReduce jobs. HAIL features two different indexing pipelines, static indexing and adaptive indexing. HAIL static indexing efficiently indexes datasets while uploading them to HDFS. Thereby, HAIL leverages the default replication of Hadoop and enhances it with logical replication. This allows HAIL to create multiple clustered indexes for a dataset, e.g. one for each physical replica. Still, in terms of upload time, HAIL matches or even improves over the performance of standard HDFS. Additionally, HAIL adaptive indexing allows for automatic, incremental indexing at job runtime with minimal runtime overhead. For example, HAIL adaptive indexing can completely index a dataset as a byproduct of only four MapReduce jobs while incurring an overhead as low as 11% for the very first of those jobs only. In our experiments, we show that HAIL improves job runtimes by up to 68x over Hadoop. This article is an extended version of the VLDB 2012 paper "Only Aggressive Elephants are Fast Elephants" (PVLDB, 5(11), 2012).

1 Introduction

MapReduce has become the de facto standard for large scale data processing in many enterprises.
It is used for developing novel solutions on massive datasets such as web analytics, relational data analytics, machine learning, data mining, and real-time analytics [23]. In particular, log processing emerges as an important type of data analysis commonly done with MapReduce [5,36,18]. In fact, Facebook and Twitter use Hadoop MapReduce (the most popular MapReduce open source implementation) to analyze the huge amounts of web logs generated every day by their users [43,22,35]. Over the last years, a lot of research works have focused on improving the performance of Hadoop MapReduce [12,26,32,34].

When improving the performance of MapReduce, it is important to consider that it was initially developed for large aggregation tasks that scan through huge amounts of data. However, nowadays Hadoop is often also used for selective queries that aim to find only a few relevant records for further consideration¹. For selective queries, Hadoop still scans through the complete dataset. This resembles the search for a needle in a haystack.

For this reason, several researchers have particularly focused on supporting efficient index access in Hadoop [45,15,35,33]. Some of these works have improved the performance of selective MapReduce jobs by orders of magnitude. However, all these indexing approaches have three main weaknesses. First, they require a high upfront cost for index creation. This translates to long waiting times for users until they can actually start to run queries. Second, they can only support one physical sort order (and hence one clustered index) per dataset. This becomes a serious problem if the workload demands indexes for several attributes. Third, they require users to have a good knowledge of the workload in order to choose the indexes to create. This is not always possible, e.g. if the data is analyzed in an exploratory way or queries are submitted by customers.

¹ A simple example of such a use case would be distributed grep.

1.1 Motivation

Let us see through the eyes of a data analyst, say Bob, who wants to analyze a large web log. The web log contains different fields that may serve as filter conditions for Bob, like visitDate, adRevenue, sourceIP and so on. Assume Bob is interested in all sourceIPs with a visitDate from 2011. Thus, Bob writes a MapReduce program to filter out exactly those records and discard all others. Bob is using Hadoop, which will scan the entire input dataset from disk to filter out the qualifying records. This takes a while. After inspecting the result set, Bob detects a series of strange requests from one sourceIP. Therefore, he decides to modify his MapReduce job to show all requests from the entire input dataset having that sourceIP. Bob is using Hadoop. This takes a while. Eventually, Bob decides to modify his MapReduce job again to only return log records having a particular adRevenue. Yes, this again takes a while.

In summary, Bob uses a sequence of different filter conditions, each one triggering a new MapReduce job. He is not exactly sure what he is looking for. The whole endeavor feels like going shopping without a shopping list. This example illustrates an exploratory usage (and a major use-case) of Hadoop MapReduce [5,18,38]. But this use-case has one major problem: slow query runtimes. The time to execute a MapReduce job based on a scan may be very high: it is dominated by the I/O for reading all input data [39,33]. While waiting for his MapReduce job to complete, Bob has enough time to pick up a coffee (or two), and this happens every time Bob modifies the MapReduce job. This will likely kill his productivity and make his boss unhappy.

Now, assume the fortunate case that Bob remembers a sentence from one of his professors saying "full-table-scans are bad; indexes are good"². Thus, he reads all the recent VLDB papers (including [33,12,26,32]) and finds a paper that shows how to create a so-called trojan index [15].
A trojan index is an index that may be used with Hadoop MapReduce and yet does not modify the underlying Hadoop MapReduce and HDFS engines.

Zero-Overhead indexing. Bob finds the trojan index idea interesting and hence decides to create a trojan index on sourceIP before running his MapReduce jobs. However, using trojan indexes raises two other problems:

(1.) Expensive index creation. The time to create the trojan index on sourceIP (or any other attribute) is even much longer than running a scan-based MapReduce job. Thus, if Bob's MapReduce jobs use that index only a few times, the index creation costs will never be amortized. So, why would Bob create such an expensive index in the first place?

² The professor is aware that for some situations the opposite is true.

(2.) Which attribute to index? Even if Bob amortizes index creation costs, the trojan index on sourceIP will only help for that particular attribute. So, which attribute should Bob use to create the index?

Bob is wondering how to create several indexes at very low cost to solve those problems.

Per-Replica indexing. One day in autumn 2011, Bob reads about another idea [34] where some researchers looked at ways to improve vertical partitioning in Hadoop. The researchers in that work realized that HDFS keeps three (or more) physical copies of all data for fault-tolerance. Therefore, they decided to change HDFS to store each physical copy in a different data layout (row, column, PAX, or any other column grouping layout). As all data layout transformation is done per HDFS data block, the failover properties of HDFS and Hadoop MapReduce were not affected. At the same time, I/O times improved. Bob thinks that this looks very promising, because he could possibly exploit this concept to create different clustered indexes almost invisibly to the user. This is because he could create one clustered index per data block replica when uploading data to HDFS. This would already help him a lot in several query workloads. However, Bob quickly figures out that there are cases where this idea still has some annoying limitations.
Even if Bob could create one clustered index per data replica at low cost, he would still have to determine which attributes to index when uploading his data to HDFS. Afterwards, he could not easily revise his decision or introduce additional indexes without uploading the dataset again. Unfortunately, it sometimes happens that Bob and his colleagues navigate through datasets according to the properties and correlations of the data. In such cases, Bob and his colleagues typically: (1.) do not know the data access patterns in advance; (2.) have different interests and hence cannot agree upon common selection criteria at data upload time; (3.) even if they agree which attributes to index at data upload time, might end up filtering records according to values on different attributes. Therefore, using any traditional indexing technique [19, 1, 2,8,11,45,35,15,33] would be problematic, because these techniques cannot adapt well to unknown or changing query workloads.

Adaptive indexing. When searching for a solution to his problem with static indexing, Bob stumbles across a new approach called adaptive indexing [28], where the general idea is to create indexes as a side-effect of query processing. This is similar to the idea of soft indexes [37], where the system piggybacks the index creation for a given attribute on a single incoming query. However, in contrast to soft indexes, adaptive indexing aims at creating indexes incrementally (i.e., piggybacking on several incoming queries) in order to avoid high upfront index creation times. Thus, Bob is excited about the adaptive indexing idea since this could be the missing piece to solve his remaining concern. However, Bob quickly notices that he cannot simply apply existing adaptive indexing works [17,28,29,21,3,24] in MapReduce systems for several reasons:

(1.) Global index convergence. These techniques aim at converging to a global index for an entire attribute, which requires sorting the attribute globally. Therefore, these techniques perform many data movements across the entire dataset. Doing this in MapReduce would hurt fault-tolerance as well as the performance of MapReduce jobs. This is because the system would have to move data across data blocks in sync with all their three physical data block replicas. We do not plan to create global indexes, but focus on creating partial indexes that in total cover the whole dataset. A small back-of-the-envelope calculation shows that the possible gains of a global index are negligible in comparison to the overhead of the MapReduce framework. For instance, if a dataset is uniformly distributed over a cluster and occupies 160 HDFS blocks on each datanode (like the dataset in our experiments in Section 9) and we do not have a global index, then we need to perform 160 index accesses on each datanode. Since all datanodes can access their blocks in parallel to each other, we assume that the overhead is determined by the highest overhead per datanode. Overall, our approach requires at most 318 additional random reads in HDFS per datanode in this scenario, which in turn cost roughly 15ms each. In total, this amounts to 318 × 15ms = 4.77s overhead compared to a global index stored in HDFS. However, even empty MapReduce jobs, which neither read any data nor compute a single map function, run for more than 10s.

(2.) High I/O costs. Even if Bob applied existing adaptive indexing techniques inside data blocks, these techniques would end up in many costly I/O operations to move data on disk. This is because these techniques consider main-memory systems and thus do not factor in the I/O cost for reading/writing data from/to disk. Only one of these works [21] proposes an adaptive merging technique for disk-based systems. However, applying this technique inside an HDFS block would not make sense in MapReduce, since HDFS blocks are typically loaded entirely into main memory anyway when processing map tasks.
One may think about applying adaptive merging across HDFS blocks, but this would again hurt fault-tolerance and the performance of MapReduce jobs as described above.

(3.) Unclustered index. These works focus on creating unclustered indexes in the first place and hence are only beneficial for highly selective queries. One of these works [29] introduced lazy tuple reorganisation in order to converge to clustered indexes. However, this technique needs several thousand queries to converge, and its application in a disk-based system would again introduce a huge number of expensive I/O operations.

(4.) Centralized approach. Existing adaptive indexing approaches were mainly designed for single-node DBMSs. Therefore, applying these works in distributed parallel systems, like Hadoop MapReduce, would not fully exploit the existing parallelism to distribute the indexing effort across several computing nodes.

Despite all these open problems, Bob is very enthusiastic to combine the above interesting ideas on indexing into a new system to revolutionize the way his company can use Hadoop. And this is where the story begins.

1.2 Research Questions and Challenges

This article addresses the following research questions:

Zero-Overhead indexing. Current indexing approaches in Hadoop involve a significant upfront cost for index creation. How can we make indexing in Hadoop so effective that it is basically invisible for the user? How can we minimize the I/O costs for indexing or eventually reduce them to zero? How can we fully utilize the available CPU resources and parallelism of large clusters for indexing?

Per-Replica indexing. Hadoop uses data replication for failover. How can we exploit this replication to support different sort orders and indexes? Which changes to the HDFS upload pipeline need to be done to make this efficient? What happens to the involved checksum mechanism of HDFS? How can we teach the HDFS namenode to distinguish the different replicas and keep track of the different indexes?

Job execution. How can we change Hadoop MapReduce to utilize different sort orders and indexes at query time?
How can we change Hadoop MapReduce to schedule tasks to replicas having the appropriate index? How can we schedule map tasks to efficiently process indexed and non-indexed data blocks without affecting failover? How much do we need to change existing MapReduce jobs? How will Hadoop MapReduce change from the user's perspective?

Zero-Overhead Adaptive indexing. How can we adaptively and automatically create additional useful indexes online at minimal costs per job? How to index big data incrementally in a distributed, disk-based system like Hadoop as a byproduct of job execution? How to minimize the impact of indexing on individual job execution times? How to efficiently interleave data processing with indexing? How to distribute the indexing effort efficiently by considering data-locality and index placement across computing nodes? How to create several clustered indexes at query time? How to support a different number of replicas per data block?

1.3 Contributions

We propose HAIL (Hadoop Aggressive Indexing Library), a static and adaptive indexing approach for MapReduce systems. The main goal of HAIL is to minimize both (i) the index creation time when uploading data and (ii) the impact of concurrent index creation on job execution times. In summary, we make the following main contributions to tackle the questions and challenges mentioned above:

(1.) Zero-Overhead indexing. We show how to effectively piggy-back sorting and index creation on the existing HDFS upload pipeline. This way no additional MapReduce job is required to create those indexes and also no additional read of the data is required at all. In fact, the HAIL upload pipeline is so effective when compared to HDFS that the additional overhead for sorting and index creation is hardly noticeable in the overall process. Therefore, we offer a win-win situation over Hadoop MapReduce and even over Hadoop++ [15]. We give an overview of HAIL and its benefits in Section 2.

(2.) Per-Replica indexing. We show how to exploit the default replication of Hadoop to support different sort orders and indexes for each block replica (Section 3). Hence, for a default replication factor of three, up to three different sort orders and clustered indexes are available for processing MapReduce jobs. Thus, the likelihood of finding a suitable index increases and hence the runtime for a workload improves. Our approach benefits from the fact that Hadoop is only used for appends: there are no updates. Thus, once a block is full, it will never be changed again.

(3.) Job Execution. We show how to effectively change the Hadoop MapReduce pipeline to exploit existing indexes (Section 4). Our goal is to do this without changing the code of the MapReduce framework. Therefore, we introduce optional annotations for MapReduce jobs that allow users to enrich their queries with explicit specifications of their selections and projections. HAIL takes care of performing MapReduce jobs using normal data block replicas or pseudo data block replicas (or even both). In addition, we propose a new task scheduling, called HAIL Scheduling, to fully exploit statically and adaptively indexed data blocks (Section 7). The goal of HAIL Scheduling is twofold: (i) to reduce the scheduling overhead when executing a MapReduce job, and (ii) to balance the indexing effort across computing nodes to limit the impact of adaptive indexing.

(4.) Zero-Overhead Adaptive indexing. We show how to effectively piggyback adaptive index creation on the existing MapReduce job execution pipeline (Section 5).
The idea is to combine adaptive indexing and zero-overhead indexing to solve the problem of missing indexes for evolving or unpredictable workloads. In other words, when HAIL executes a MapReduce job with a filter condition on an unindexed attribute, HAIL creates that missing index for a certain fraction of the HDFS blocks in parallel. We additionally propose a set of adaptive indexing strategies that makes HAIL aware of the performance and the selectivity of MapReduce jobs (Section 6). We present lazy and eager adaptive indexing, two techniques that allow HAIL to quickly adapt to changes in users' workloads at a low indexing overhead. We then show how HAIL can decide which data blocks to index based on the selectivities of MapReduce jobs.

(5.) Exhaustive validation. We present an extensive experimental comparison of HAIL with Hadoop and Hadoop++ [15] (Section 9). We use seven different clusters including physical and virtual EC2 clusters of up to 100 nodes. A series of experiments shows the superiority of HAIL over both Hadoop and Hadoop++. Another series of scalability experiments with different datasets also demonstrates the superiority of using adaptive indexing in HAIL. In particular, our experimental results demonstrate that HAIL: (i) creates clustered indexes at upload time almost for free; (ii) quickly adapts to query workloads with a negligible indexing overhead; and (iii) has a small overhead over Hadoop only for the very first job when creating indexes adaptively: all the following jobs are faster in HAIL.
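
The incremental behaviour described in contribution (4.) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `run_job` and `jobs_until_fully_indexed`, the parameter `fraction`, and the value 0.25 are hypothetical and chosen for illustration; they are not HAIL's interface, and the lazy and eager strategies of Section 6 are not modelled.

```python
import math


def run_job(indexed, total_blocks, fraction):
    """Simulate one job: besides scanning, it indexes at most
    `fraction` of all blocks that do not have the index yet.

    `indexed` is the set of block ids already indexed (hypothetical model).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    budget = math.ceil(fraction * total_blocks)
    unindexed = [b for b in range(total_blocks) if b not in indexed]
    return indexed | set(unindexed[:budget])


def jobs_until_fully_indexed(total_blocks, fraction):
    """Count how many jobs run before every block carries the index."""
    indexed, jobs = set(), 0
    while len(indexed) < total_blocks:
        indexed = run_job(indexed, total_blocks, fraction)
        jobs += 1
    return jobs
```

With an assumed fraction of one quarter of the blocks per job, the simulation needs four jobs to cover the whole dataset; every later job on the same attribute would find all blocks indexed.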
Notice that this article presents an extended version of the initial HAIL system [16] with the following significant added value: we enrich HAIL with the adaptive indexing pipeline, which allows HAIL to adapt to changes in query workloads in an automatic, incremental, and dynamic way (all of contribution Zero-Overhead Adaptive indexing); we extend the HAIL task scheduling in order to balance the indexing effort at job execution time and exploit pseudo data blocks (half of contribution Job execution); we run a large number of new experiments to validate our adaptive indexing techniques as well as the extended HAIL task scheduling (one third of contribution Exhaustive validation).

2 Overview

In the following, we give an overview of HAIL by contrasting it with normal HDFS and Hadoop MapReduce. Thereby, we introduce the two indexing pipelines of HAIL. First, static indexing allows us to create several clustered indexes at upload time. Second, HAIL adaptive indexing creates additional indexes as a byproduct of actual job execution, which enables HAIL to adapt to unexpected workloads. For a more detailed contrast to related work see Section 8. For now, let's consider again our motivating example: How can Bob analyze his log file with Hadoop and HAIL?

2.1 Hadoop and HDFS

In HDFS and Hadoop MapReduce, Bob starts by uploading his log file to HDFS using the HDFS client. HDFS then partitions the file into logical HDFS blocks using a constant block size (the HDFS default is 64MB). Each HDFS block is then physically stored three times (assuming the default replication factor). Each physical copy of a block is called a replica. Each replica will sit on a different datanode. Therefore, HDFS can survive two datanode failures without losing a block. Note that HDFS keeps information on the different replicas for an HDFS block in a central namenode directory.

After uploading his log file to HDFS, Bob may run an actual MapReduce job. Bob invokes Hadoop MapReduce through a Hadoop MapReduce JobClient, which sends his MapReduce job to a central node termed JobTracker. The MapReduce job consists of several tasks. A task is executed on a subset of the input file, typically an HDFS block³. The JobTracker assigns each task to a different TaskTracker, which typically runs on the same machine as an HDFS datanode. Each datanode will then read its subset of the input file, i.e., a set of HDFS blocks, and feed that data into the MapReduce processing pipeline, which usually consists of a Map, Shuffle, and a Reduce Phase (see [13,15,14] for a detailed description). As soon as all results have been written to HDFS, the JobClient informs Bob that the results are available. Notice that the execution time of the MapReduce job is heavily influenced by the size of the input dataset, because Hadoop MapReduce reads the input dataset entirely in order to perform any incoming MapReduce job.

2.2 HAIL

In HAIL, Bob analyzes his log file as follows. He starts by uploading his log file to HAIL using the HAIL client. In contrast to the HDFS client, the HAIL client analyzes the input data for each HDFS block, converts each HDFS block directly to a binary columnar layout that resembles PAX [3], and sends it to three datanodes. Then, all datanodes sort the data contained in that HDFS block in parallel, each using a different sort order. The required sort orders can be manually specified by Bob in a configuration file or computed by a physical design algorithm. For each HDFS block, all sorting and index creation happens in main memory. This is feasible as the HDFS block size is typically between 64MB (default) and 1GB. This easily fits in the main memory of most machines. In addition, in HAIL, each datanode creates a different clustered index for each HDFS block replica and stores it with the sorted data. This process is called the HAIL static indexing pipeline.

After uploading his log file to HAIL, Bob runs his MapReduce jobs, which can now immediately exploit the indexes that were created by HAIL statically (i.e., at upload time). As before, Bob invokes Hadoop MapReduce through a JobClient, which sends his MapReduce jobs to the JobTracker.
However, his MapReduce jobs are slightly modified so that the system can decide to eventually use available indexes on the data block replicas. For example, assume that a data block has three replicas with clustered indexes on visitDate, adRevenue, and sourceIP. In case that Bob has a MapReduce job filtering on visitDate, HAIL uses the replicas having the clustered index on visitDate. If Bob is filtering on sourceIP, HAIL uses the replicas having the clustered index on sourceIP, and so on. To provide failover and load balancing, HAIL may fall back to standard Hadoop scanning for some of the blocks. However, even factoring this in, Bob's queries run much faster on average, if indexes on the right attributes exist.

³ Actually it is a split. The difference does not matter here. We will get back to this in Section 4.2.

In case that Bob submits jobs that filter on unindexed attributes (e.g., on duration), HAIL again falls back to a standard full scan by choosing any arbitrary replica, just like Hadoop. However, in contrast to Hadoop, HAIL can index HDFS blocks in parallel to job execution. If another job filters again on the duration field, the new job can already benefit from the previously indexed blocks. So, HAIL takes incoming jobs, which have a selection predicate on currently unindexed attributes, as hints for valuable additional clustered indexes. Consequently, the set of available indexes in HAIL evolves with changing workloads. We call this process the HAIL adaptive indexing pipeline.

2.3 HAIL Benefits

(1.) HAIL often improves both upload and query times. The upload is dramatically faster than Hadoop++ and often faster (or only slightly slower) than with standard Hadoop even though we (i) convert the input file into binary PAX, (ii) create a series of different sort orders, and (iii) create multiple clustered indexes. From the user side, this provides a win-win situation: there is no noticeable punishment for upload. For querying, users can only win: if our indexes cannot help, we will fall back to standard Hadoop scanning; if the indexes can help, query runtimes will improve. Why do we not have high costs at upload time? We basically exploit the unused CPU ticks that are not used by standard HDFS.
As the standard HDFS upload pipeline is I/O-bound, the effort for our sorting and index creation in the HAIL upload pipeline is hardly noticeable. In addition, since we parse data to binary while uploading, we often benefit from smaller datasets triggering less network and disk I/O.

(2.) Even if we did not create the right indexes at upload time, HAIL can create indexes adaptively at job execution time without incurring high overhead. Why don't we see a high overhead? We do not need to additionally load the block data to main memory, since we piggyback on the reading of the map tasks. Furthermore, HAIL creates indexes incrementally over several job executions using different adaptive indexing strategies.

(3.) We do not change the failover properties of Hadoop. Why is failover not affected? All data stays on the same logical HDFS block. We just change the physical representation of each replica of an HDFS block. Therefore, from each physical replica we may recover the logical HDFS block.

(4.) HAIL works with existing MapReduce jobs incurring only minimal changes to those jobs. Why does this work? We allow Bob to annotate his existing jobs with selections and projections. Those annotations are then considered by HAIL to pick the right index. Like that, for Bob the changes to his MapReduce jobs are minimal.
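
The replica choice described in Section 2.2 can be sketched in a few lines of Python. This is a simplified illustration, not HAIL's scheduler: the function `choose_replica` and the dictionary that maps a datanode to the attribute its replica is indexed on are hypothetical, and failover and load balancing are ignored.

```python
def choose_replica(replicas, filter_attribute):
    """Pick a replica of one block for a job filtering on `filter_attribute`.

    `replicas` maps a datanode name to the attribute its replica is
    clustered-indexed on (hypothetical structure for illustration).
    Returns a (datanode, access_path) pair.
    """
    for datanode, indexed_attribute in replicas.items():
        if indexed_attribute == filter_attribute:
            # A replica with a matching clustered index exists: use it.
            return datanode, "index scan"
    # No suitable index: fall back to a full scan on an arbitrary replica,
    # as standard Hadoop would do.
    return next(iter(replicas)), "full scan"


replicas = {"DN1": "visitDate", "DN2": "adRevenue", "DN3": "sourceIP"}
print(choose_replica(replicas, "sourceIP"))
print(choose_replica(replicas, "duration"))
```

A job filtering on sourceIP is routed to the replica sorted and indexed on sourceIP, while a job filtering on the unindexed attribute duration falls back to a full scan.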

[Figure 1: data flow between Bob, the HAIL client (CL), the datanodes DN1 to DN3, and the HDFS namenode with its block directory and HAIL replica directory; the numbered steps are referenced in the text.]
Fig. 1 The HAIL static indexing pipeline as part of uploading data to HDFS

3 HAIL Zero-Overhead Static Indexing

We create static indexes in HAIL while uploading data. One of the main challenges is to support different sort orders and clustered indexes per replica as well as to build those indexes efficiently without much impact on upload times. Figure 1 shows the data flow when Bob uploads a file to HAIL. Let's first explore the details of the static indexing pipeline.

3.1 Data Layout

In HDFS, for each block, the client contacts the namenode to obtain the list of datanodes that should store the block replicas. Then, the client sends the original block to the first datanode, which forwards this to the second datanode and so on. In the end, each datanode stores a byte-identical copy of the original block data.

In HAIL, the HAIL client preprocesses the file based on its content to consider end of lines (step 1 in Figure 1). We parse the contents into rows by searching for end-of-line symbols and never split a row between two blocks. This is in contrast to standard HDFS, which splits a file into HDFS blocks after a constant number of bytes. For each block the HAIL client parses each row according to the schema specified by the user⁴. If HAIL encounters a row that does not match the given schema (i.e., a bad record), it separates this record into a special part of the data block. HAIL then converts all HDFS blocks to a binary columnar layout that resembles PAX (step 2). This allows us to index and access individual attributes more efficiently. The HAIL client also collects metadata information from each HDFS block (such as the data schema) and creates a block header (Block Metadata) for each HDFS block (step 2).
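
The client-side preprocessing described above (row-aligned blocks, schema-based parsing, separation of bad records, columnar layout) can be sketched as follows. This is a toy model under stated assumptions: the function names, the `|` delimiter, and the representation of a schema as a list of (name, cast function) pairs are hypothetical, and plain Python lists stand in for HAIL's binary PAX-like layout.

```python
def split_into_blocks(lines, max_block_bytes):
    """Group whole rows into blocks; a row is never split between two blocks."""
    blocks, current, size = [], [], 0
    for line in lines:
        row_bytes = len(line.encode("utf-8")) + 1  # +1 for the end-of-line symbol
        if current and size + row_bytes > max_block_bytes:
            blocks.append(current)
            current, size = [], 0
        current.append(line)
        size += row_bytes
    if current:
        blocks.append(current)
    return blocks


def to_columnar(block, schema, delimiter="|"):
    """Parse each row by `schema` into one list per attribute.

    Rows that do not match the schema are kept apart as bad records.
    """
    columns = {name: [] for name, _ in schema}
    bad_records = []
    for row in block:
        fields = row.split(delimiter)
        try:
            if len(fields) != len(schema):
                raise ValueError("wrong number of fields")
            values = [cast(field) for (_, cast), field in zip(schema, fields)]
        except ValueError:
            bad_records.append(row)
            continue
        for (name, _), value in zip(schema, values):
            columns[name].append(value)
    return columns, bad_records


schema = [("sourceIP", str), ("visitDate", str), ("adRevenue", float)]
rows = ["1.2.3.4|2011-05-01|0.5", "5.6.7.8|2011-06-01|oops", "9.9.9.9|2012-01-01|1.5"]
columns, bad = to_columnar(rows, schema)
```

Storing each attribute contiguously is what later allows a datanode to sort and index a block on a single attribute without touching the others.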
We could naively piggy-back on this existing HDFS upload pipeline by first storing the original block data as done in Hadoop and then converting it to binary PAX layout in a second step. However, we would have to re-read and then re-write each block, which would trigger one extra write and read for each replica, e.g., for an input file of 100GB we would have to pay 600GB extra I/O on the cluster (three replicas, each read once and written once). This would lead to very long upload times. In contrast, HAIL does not have to pay any of that extra I/O. However, to achieve this dramatic improvement, we have to make non-trivial changes in the standard Hadoop upload pipeline.

⁴ Alternatively, HAIL can also suggest an appropriate schema to users through schema analysis.

3.2 Static Indexing in the Upload Pipeline

To understand the implementation of static indexing in the HAIL upload pipeline, we first have to analyze the normal HDFS upload pipeline in more detail.

In HDFS, while uploading a block, the data is further partitioned into chunks of constant size 512B. Chunks are collected into packets. A packet is a sequence of chunks plus a checksum for each of the chunks. In addition some metadata is kept. In total a packet has a size of up to 64KB. Immediately before sending the data over the network, each HDFS block is converted to a sequence of packets. On disk, HDFS keeps, for each replica, a separate file containing checksums for all of its chunks. Hence, for each replica two files are created on local disk: one file with the actual data and one file with its checksums. These checksums are reused by HDFS whenever data is sent over the network.

The HDFS client (CL) sends the first packet of the block to the first datanode (DN1) in the upload pipeline. DN1 splits the packet into two parts: the first contains the actual chunk data, the second contains the checksums for those chunks. Then DN1 flushes the chunk data to a file on local disk. The checksums are flushed to an extra file. In parallel DN1 forwards the packet to DN2, which splits and flushes the data like DN1 and in turn forwards the packet to DN3, which splits and flushes the data as well. Yet, only DN3 verifies the checksum for each chunk. If the recomputed checksums for each chunk of a packet match the received checksums, DN3 acknowledges the packet back to DN2, which acknowledges back to DN1. Finally, DN1 acknowledges back to CL. Each datanode also appends its ID to the ACK. Like that, only one of the datanodes (the last in the chain, here DN3 as the replication factor is three) has to verify the checksums. DN2 believes DN3, DN1 believes DN2, and CL believes DN1. If CL or any DNi receives ACKs in the wrong order, the upload is considered failed. The idea of sending multiple packets from CL is to hide the roundtrip latencies of the individual packets. Creating this chain of ACKs also has the benefit that CL only receives a single ACK for each packet and not three. Notice that HDFS provides this checksum mechanism on top of the existing TCP/IP checksum mechanism (which has weaker correctness guarantees than HDFS).

In HAIL, in order to reuse as much of the existing HDFS pipeline as possible and yet to make this efficient, we need to perform the following changes. As before, the HAIL client (CL) gets the list of datanodes to use for this block from the HDFS namenode (step 3 in Figure 1). But rather than sending the original input, CL creates the PAX block, cuts it into packets (step 4), and sends it to DN1 (step 5). Whenever a datanode DN1 to DN3 receives a packet, it flushes neither its data nor its checksums to disk. Still, DN1 and DN2 immediately forward the packet to the next datanode as before (step 8). DN3 will verify the checksum of the chunks for the received PAX block (step 9) and acknowledge the packet back to DN2 (step 10). This means the semantics of an ACK for a packet of a block are changed from "packet received, validated, and flushed" to "packet received and validated". We flush neither the chunks nor their checksums to disk, as we first have to sort the entire block according to the desired sort key.

On each datanode, we assemble the block from all packets in main memory (step 6). This is realistic in practice, since main memories tend to be >10GB for any modern server. Typically, the size of a block is between 64MB (default) and 1GB. This means that for the default size we could keep about 150 blocks in main memory at the same time.
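
The chunk and packet structure described at the beginning of this section can be sketched as follows. The 512B chunk size and the 64KB packet bound are taken from the text; the use of a 4-byte CRC32 per chunk via `zlib.crc32` and the omission of packet metadata are simplifying assumptions, and the function names are hypothetical.

```python
import zlib

CHUNK_SIZE = 512          # bytes per chunk, as stated in the text
PACKET_SIZE = 64 * 1024   # upper bound on the packet size, as stated in the text
CHECKSUM_SIZE = 4         # assumption: one 4-byte CRC32 per chunk


def make_packets(block: bytes):
    """Cut a block into packets, each a list of (chunk, checksum) pairs."""
    # Assumption: packet metadata is ignored when sizing a packet.
    chunks_per_packet = PACKET_SIZE // (CHUNK_SIZE + CHECKSUM_SIZE)
    chunks = [block[i:i + CHUNK_SIZE] for i in range(0, len(block), CHUNK_SIZE)]
    packets = []
    for i in range(0, len(chunks), chunks_per_packet):
        group = chunks[i:i + chunks_per_packet]
        packets.append([(chunk, zlib.crc32(chunk)) for chunk in group])
    return packets


def verify_packet(packet):
    """Recompute every chunk checksum and compare it with the received one,
    which is what the last datanode in the chain does before sending an ACK."""
    return all(zlib.crc32(chunk) == checksum for chunk, checksum in packet)
```

In the HAIL pipeline the same verification still happens on the last datanode, but the chunks are then kept in main memory instead of being flushed, so that the block can be sorted before new checksums are computed and written.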
In parallel to forwarding and reassembling packets, each datanode sorts the data, creates indexes, and forms a HAIL Block (step 7 in Figure 1; see Section 3.4). As part of this process, each datanode also adds Index Metadata information to each HAIL block in order to specify the index it created for this block. Each datanode (e.g., DN1) typically sorts the data inside a block in a different sort order. It is worth noting that having different sort orders across replicas does not impact fault-tolerance, as all data is reorganized inside the same block only, i.e., data is not reorganized across blocks. Hence, all replicas of the same HDFS block logically contain the same records, just in a different order, and therefore can still act as logical replacements for each other. Additionally, this property helps HAIL to preserve the load balancing capabilities of Hadoop. For example, when a datanode containing the replica with the matching sort order for a certain job is overloaded, HAIL might choose to read from a different replica on another datanode, just like normal Hadoop. To avoid overloading datanodes in the first place, HAIL employs a round robin strategy for assigning sort orders to physical replicas on top of the replica placement of HDFS. This means that while HDFS already cares about distributing HDFS block replicas across the cluster, HAIL cares about distributing the sort orders (and hence the indexes) across those replicas.

As soon as a datanode has completed sorting and creating its index, it will recompute checksums for each chunk of a block. Notice that checksums will differ on each replica, as different sort orders and indexes are used. Hence, each datanode has to compute its own checksums. Then, each datanode flushes the chunks and newly computed checksums to two separate files on local disk as before. For DN3, once all chunks and checksums have been flushed to disk, DN3 will acknowledge the last packet of the block back to DN2 (step 10). After that DN3 will inform the HDFS namenode about its new replica including its HAIL block size, the created indexes, and the sort order (step 11; see Section 3.3). Datanodes DN2 and DN1 append their ID to each ACK (step 12). Then they forward each ACK back in the chain (step 13).
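
The per-replica sorting described earlier in this section can be illustrated with a short sketch. It rests on assumptions: the rotation by block id is only one possible reading of the round robin strategy, the function names are hypothetical, rows are modelled as Python dictionaries, and a CRC32 over the textual representation stands in for the per-chunk checksums.

```python
import zlib


def assign_sort_orders(datanodes, sort_attributes, block_id):
    """Round robin (assumed variant): rotate the attribute list by the block id
    so that each datanode holds different sort orders for different blocks."""
    n = len(sort_attributes)
    return {dn: sort_attributes[(block_id + i) % n] for i, dn in enumerate(datanodes)}


def build_replica(rows, attribute):
    """Sort the rows of one block on `attribute` and checksum the result.

    Data is reorganized inside the block only, never across blocks.
    """
    ordered = sorted(rows, key=lambda row: row[attribute])
    checksum = zlib.crc32(repr(ordered).encode("utf-8"))
    return ordered, checksum


rows = [
    {"visitDate": "2011-03-01", "adRevenue": 2.0, "sourceIP": "10.0.0.1"},
    {"visitDate": "2011-01-01", "adRevenue": 3.0, "sourceIP": "10.0.0.3"},
    {"visitDate": "2011-02-01", "adRevenue": 1.0, "sourceIP": "10.0.0.2"},
]
orders = assign_sort_orders(["DN1", "DN2", "DN3"], ["visitDate", "adRevenue", "sourceIP"], 0)
replicas = {dn: build_replica(rows, attr) for dn, attr in orders.items()}
```

Each replica in `replicas` holds the same three records in a different physical order, which is why every datanode has to compute its own checksums and why any replica can still replace another one logically.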
DN2 and DN1 will forward the last ACK of the block only if all chunks and checksums have been flushed to their disks. After that DN2 and DN1 individually inform the HDFS namenode ⑭. The HAIL client also verifies that all ACKs arrive in order ⑮. Notice that it is important to change the HDFS namenode in order to keep track of the different sort orders. We discuss these changes in Section 3.3.

3.3 HDFS Namenode Extensions

In HDFS, the central namenode keeps a directory Dir_block of blocks, i.e., a mapping blockID → Set Of DataNodes. This directory is required by any operation retrieving blocks from HDFS. Hadoop MapReduce exploits Dir_block for scheduling. In Hadoop MapReduce whenever a split needs to be assigned to a worker in the map phase, the scheduler looks up Dir_block in the HDFS namenode to retrieve the list of datanodes having a replica of the contained HDFS block. Then, the Hadoop MapReduce scheduler will try to schedule map tasks on those datanodes if possible. Unfortunately, the HDFS namenode does not differentiate the replicas w.r.t. their physical layouts. HDFS was simply not designed for this. Thus, from the point of view of the namenode all replicas are byte-equivalent and have the same size.

In HAIL, we need to allow Hadoop MapReduce to change the scheduling process to schedule map tasks close to replicas having a suitable index; otherwise Hadoop MapReduce would pick indexes randomly. Hence, we have to enrich the HDFS namenode to keep additional information about the available indexes. We do this by keeping an additional directory Dir_rep mapping (blockID, datanode) → HAILBlockReplicaInfo. An instance of HAILBlockReplicaInfo contains detailed information about the types of available indexes for a replica, i.e., indexing key, index type, size, start offsets, etc. As before, Hadoop MapReduce looks up Dir_block to retrieve the list of datanodes having a replica for a given block. However, in addition, HAIL looks up the main memory Dir_rep to obtain the detailed HAILBlockReplicaInfo for each replica, i.e., one main memory lookup for each replica. HAILBlockReplicaInfo is then exploited by HAIL to change the scheduling strategy of Hadoop (we will discuss this in detail in Section 4).

3.4 An Index Structure for Zero-Overhead Indexing

In this section, we briefly discuss our choice of an appropriate index structure for indexing at minimal costs in HAIL and give some details on our concrete implementation.

Why Clustered Indexes? An interesting question is why we focus on clustered indexes. For indexing with minimal overhead, we require an index structure that is cheap to create in main memory, cheap to write to disk, and cheap to query from disk. We tried a number of indexes in the beginning of the project including coarse-granular indexes and unclustered indexes. After some experimentation we quickly discovered that sorting and index creation in main memory is so fast that techniques like partial or coarse-granular sorting do not pay off for HAIL. Whether you pay three or two seconds for sorting and indexing per block during upload is hardly noticeable in the overall upload process of HDFS. In addition, a major problem with unclustered indexes is that they are only competitive for very selective queries as they may trigger considerable random I/O for non-selective index traversals. In contrast, clustered indexes do not have that problem. Whatever the selectivity, we will read the clustered index and scan the qualifying blocks. Hence, even for very low selectivities the only overhead over a scan is the initial index node traversal, which is negligible. Moreover, as unclustered indexes are dense by definition, they require considerably more additional space on disk and require more write I/O than a sparse clustered index.
Thus, using unclustered indexes would severely affect upload times. Yet, an interesting direction for future work would be to extend HAIL to support additional indexes that might boost performance, such as bitmap indexes and inverted lists.

4 HAIL Job Execution

We now focus on general job execution in HAIL. First, we present from Bob's perspective how he can enhance MapReduce jobs to benefit from HAIL static indexing (Section 4.1). We will explain how Bob can write his MapReduce jobs (almost) as before and run them exactly as when using Hadoop MapReduce. After that we analyze from the system's perspective the standard Hadoop MapReduce pipeline and then compare how HAIL executes jobs (Section 4.2). We will see that HAIL requires only small changes in the Hadoop MapReduce framework, which makes HAIL easy to integrate into newer Hadoop versions (Section 4.3). Figure 2 shows the query pipeline when Bob runs a MapReduce job on HAIL. Finally, we briefly discuss the case of selections on unindexed attributes, i.e., when a job requests a static index that was not created, as motivation for HAIL adaptive indexing (Section 4.4).

4.1 Bob's Perspective

In Hadoop MapReduce, Bob writes a MapReduce job, which includes a job configuration class, a map function, and a reduce function. In HAIL, the MapReduce job remains the same (see ① and ② in Figure 2), but with three tiny changes:

(1) Bob specifies the HailInputFormat (which uses a HailRecordReader internally) in the main class of the MapReduce job. By doing this, Bob enables his MapReduce job to read HAIL Blocks (see Section 3.2).

(2) Bob annotates his map function to specify the selection predicate and the projected attributes required by his MapReduce job⁵. For example, assume that Bob wants to write a MapReduce job that performs the following SQL query (example from Introduction):

    SELECT sourceIP FROM UserVisits WHERE visitDate BETWEEN '1999-01-01' AND '2000-01-01'

To execute this query in HAIL, Bob adds to his map function a HailQuery annotation as follows:

    @HailQuery(filter="@3 between(1999-01-01, 2000-01-01)", projection={@1})
    void map(Text key, Text v) { ... }

Here, the @3 in the filter value and the @1 in the projection value denote the attribute position in the UserVisits records.
In this example the third attribute is visitDate and the first attribute is sourceIP. By annotating his map function as mentioned above, Bob indicates that he wants to receive in the map function only the projected attribute values of those tuples qualifying the specified selection predicate. In case Bob does not specify filter predicates, HAIL will perform a full scan like standard Hadoop. At query time, if the HailQuery annotation is set, HAIL checks (using the Index Metadata of a data block) whether an index exists on the filter attribute. Using such an index allows us to speed up the job execution. HAIL also uses the Block Metadata to determine the schema of a data block. This allows HAIL to read the attributes specified in the filter and projection parameters only.

(3) Bob uses a HailRecord object as input value in the map function. This allows Bob to directly read the projected attributes without splitting the record into attributes as he would do in standard Hadoop MapReduce. For example, using standard Hadoop MapReduce Bob would write the following map function to perform the above SQL query:

Map Function for Hadoop MapReduce (pseudo-code):

    void map(Text key, Text v) {
        String[] attr = v.toString().split(",");
        if (DateUtils.isBetween(attr[2], "1999-01-01", "2000-01-01"))
            output(attr[0], null);
    }

Using HAIL Bob writes the following map function:

Map Function for HAIL:

    void map(Text key, HailRecord v) {
        output(v.getInt(1), null);
    }

Notice that Bob now does not have to filter out the incoming records, because this is automatically handled by HAIL via the HailQuery annotation (as mentioned earlier). This annotation is illustrated in Figure 2.

⁵ Alternatively, HAIL allows Bob to specify the selection predicate and the projected attributes in the job configuration class.

Fig. 2 The HAIL query pipeline (Bob's perspective and the system's perspective).

4.2 System Perspective

In Hadoop MapReduce, when Bob submits a MapReduce job a JobClient instance is created. The main goal of the JobClient is to copy all the resources needed to run the MapReduce job (e.g. metadata and job class files). But also, the JobClient fetches all the block metadata (BlockLocation[]) of the input dataset. Then, the JobClient logically breaks the input into smaller pieces called input splits (split phase in Figure 2) as defined in the InputFormat. By default, the JobClient computes input splits such that each input split maps to a distinct HDFS block.
An input split defines the input of a map task while an HDFS block is a horizontal partition of a dataset stored in HDFS (see Section 3.1 for details on how HDFS stores datasets). For scheduling purposes, the JobClient retrieves for each input split all datanode locations having a replica of that HDFS block. This is done by calling getHosts() of each BlockLocation. For instance, in Figure 2, datanodes DN3, DN5, and DN7 are the split locations for split_42 since block_42 is stored on such datanodes.

After this split phase, the JobClient submits the job to the JobTracker with the set of input splits to process ③. Among other operations, the JobTracker creates a map task for each input split. Then, for each map task, the JobTracker decides on which computing node to schedule the map task, using the split locations ④. This decision is based on data-locality and availability [13]. After this, the JobTracker allocates the map task to the TaskTracker (which performs map and reduce tasks) running on that computing node ⑤.

Only then, the map task can start processing its input split. The map task uses a RecordReader UDF in order to read its input data block_i from the closest datanode ⑥. Interestingly, it is the local HDFS client running on the node where the map task is running that decides from which datanode a map task will read its input and not the Hadoop MapReduce scheduler. This is done when the RecordReader asks for the input stream pointing to block_i. It is worth noticing that the HDFS client chooses a datanode from the set of all datanodes storing a replica of block_42 (via the getHosts() method) rather than from the locations given by the input split. This means that a map task might eventually end up reading its input data from a remote node even though it is available locally. Once the input stream is opened, the RecordReader breaks block_42 into records and makes a call to the map function for each record. Assuming that the MapReduce job consists of a map phase only, the map task then writes its output back to HDFS ⑦. See [15,44,14] for more details on the MapReduce execution pipeline.
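The scheduling decision based on data-locality and availability can be sketched as follows. This is a deliberately simplified illustration and not the JobTracker's actual algorithm: the class name, the method name, and the policy (first free node that stores a replica, otherwise any free node) are assumptions made for the example.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of locality-aware task placement; not Hadoop code. */
final class LocalitySchedulingSketch {

    /**
     * Chooses a computing node for a map task.
     * Preference 1: a free node that stores a replica of the split's block.
     * Preference 2: any free node (the task will then read remotely).
     * Returns null if no node is free.
     */
    static String chooseNode(List<String> splitLocations, Set<String> freeNodes) {
        for (String host : splitLocations) {
            if (freeNodes.contains(host)) {
                return host; // data-local placement
            }
        }
        // No local node is free: fall back to the first free node, if any.
        return freeNodes.isEmpty() ? null : freeNodes.iterator().next();
    }
}
```

For the example in the text, a split whose locations are DN3, DN5, and DN7 would be placed on DN5 if DN5 is the only one of the three with a free slot.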
In HAIL, it is crucial to be non-intrusive to the standard Hadoop execution pipeline so that users run MapReduce jobs exactly as before. However, supporting per-replica indexes in an efficient way and without significant changes to the standard execution pipeline is challenging for several reasons. First, the JobClient cannot simply create input splits based only on the default block size as each HDFS block replica has a different size (because of indexes). Second, the JobTracker can no longer schedule map tasks based on data-locality and nodes availability only. The JobTracker now has to consider the existing indexes for each HDFS block. Third, the RecordReader has to perform either index access or full scan of HDFS blocks without any interaction with users, e.g. depending on the availability of suitable indexes. Fourth, the HDFS client cannot anymore open an input stream to a given HDFS block based on data-locality and nodes availability only: it has to consider index locality and availability as well. HAIL overcomes these issues by mainly providing two UDFs: the HailInputFormat and the HailRecordReader. Notice that by using UDFs we allow HAIL to be easy to integrate into newer versions of Hadoop MapReduce. We discuss these two UDFs in the following.

4.3 HailInputFormat and HailRecordReader

HAILInputFormat implements a different splitting strategy than standard InputFormats. This strategy allows HAIL to reduce the number of map waves per job, i.e., the maximum number of map tasks per map slot required to complete this job. Thereby, the total scheduling overhead of MapReduce jobs is drastically reduced. We discuss the details of the HAIL Splitting strategy in Section 7.

HAILRecordReader is responsible for retrieving the records that satisfy the selection predicate of MapReduce jobs (as illustrated in the MapReduce Pipeline of Figure 2). Those records are then passed to the map function. For example in Bob's query of Section 4.1, we need to find all records having visitDate between 1999-01-01 and 2000-01-01. To do so, for each data block required by the job, we first try to open an input stream to a block replica having the required index. For this, HAIL instructs the local HDFS Client to use the newly introduced getHostsWithIndex() method of each BlockLocation so as to choose the closest datanode with the desired index. Let us first focus on the case where a suitable, statically created index is available so that HAIL can open an input stream to an indexed replica.
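To make the index-aware replica lookup more concrete, the sketch below combines the two namenode directories from Section 3.3 with a getHostsWithIndex()-style query. It is an illustration under assumptions: DirRepSketch and ReplicaInfo are hypothetical names, ReplicaInfo stands in for HAILBlockReplicaInfo but keeps only the indexed attribute position, and the real method lives on BlockLocation rather than on a directory class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of Dir_block and Dir_rep lookups; not HAIL code. */
final class DirRepSketch {

    /** Simplified stand-in for HAILBlockReplicaInfo. */
    static final class ReplicaInfo {
        final int indexedAttribute; // attribute position, e.g. 3 for @3

        ReplicaInfo(int indexedAttribute) {
            this.indexedAttribute = indexedAttribute;
        }
    }

    // Dir_block: blockId -> datanodes holding a replica.
    private final Map<Long, List<String>> dirBlock = new HashMap<Long, List<String>>();
    // Dir_rep: (blockId, datanode) -> per-replica index information.
    private final Map<String, ReplicaInfo> dirRep = new HashMap<String, ReplicaInfo>();

    private static String key(long blockId, String datanode) {
        return blockId + "@" + datanode;
    }

    /** Called when a datanode reports a new replica and its index. */
    void register(long blockId, String datanode, int indexedAttribute) {
        List<String> hosts = dirBlock.get(blockId);
        if (hosts == null) {
            hosts = new ArrayList<String>();
            dirBlock.put(blockId, hosts);
        }
        hosts.add(datanode);
        dirRep.put(key(blockId, datanode), new ReplicaInfo(indexedAttribute));
    }

    /** Plain lookup: all datanodes storing a replica of the block. */
    List<String> getHosts(long blockId) {
        List<String> hosts = dirBlock.get(blockId);
        return hosts == null ? new ArrayList<String>() : new ArrayList<String>(hosts);
    }

    /** Index-aware lookup: one Dir_rep probe per replica of the block. */
    List<String> getHostsWithIndex(long blockId, int attribute) {
        List<String> result = new ArrayList<String>();
        for (String host : getHosts(blockId)) {
            if (dirRep.get(key(blockId, host)).indexedAttribute == attribute) {
                result.add(host);
            }
        }
        return result;
    }
}
```

An empty result from the index-aware lookup corresponds to the case discussed in Section 4.4, where no replica carries a suitable index and a full scan is needed.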
Once that input stream has been opened, we use the information about selection predicates and attribute projections from the HailQuery annotation or from the job configuration file. When performing an index-scan, we read the index entirely into main memory (typically a few KB) to perform an index lookup. This also implies reading the qualifying block parts from disk into main memory and post-filtering records (see Section 3.4). Then, we reconstruct the projected attributes of qualifying tuples from PAX to row layout. In case that no projection was specified by users, we then reconstruct all attributes. Finally, we make a call to the map function for each qualifying tuple. For bad records (see Section 3.1), HAIL passes them directly to the map function, which in turn has to deal with them (just like in standard Hadoop MapReduce). For this, HAIL passes a record to the map function with a flag to indicate a bad record or not.

4.4 Problem: Missing Static Indexes

Finally, let us now discuss the second case when Bob submits a job which filters on an unindexed attribute (e.g. on duration). Here, the HailRecordReader must completely scan the required attributes of unindexed blocks, apply the selection predicate and perform tuple reconstruction. Notice that, with static indexing, there is no way for HAIL to overcome the problem of missing indexes efficiently. This means that when the attributes used in the selection predicates of the workload change over time, the only way to adapt the set of available indexes is to upload the data again. However, this has the significant overhead of an additional upload, which goes against the principle of zero-overhead indexing. Thus, HAIL introduces an adaptive indexing technique that offers a much more elegant and efficient solution to this problem. We discuss this technique in the following Section.

5 HAIL Zero-Overhead Adaptive Indexing

We now discuss the adaptive indexing pipeline of HAIL. The core idea is to create missing but promising indexes as byproducts of full scans in the map phase of MapReduce jobs. Similar to the static indexing pipeline, our goal is again to come closer towards zero overhead indexing.
Therefore, we adopt two important principles from our static indexing pipeline. First, we piggyback again on a procedure that is naturally reading data from disk to main memory. This allows HAIL to completely save the data read cost for adaptive index creation. Second, as map tasks are usually I/O-bound, HAIL again exploits unused CPU time when computing clustered indexes in parallel to job execution.

In Section 5.1, we start with a general overview of the HAIL adaptive indexing pipeline. In Section 5.2, we focus on the internal components for building and storing clustered indexes incrementally. In Section 5.3, we present how HAIL accesses the indexes created at job runtime in a way that is transparent to the MapReduce job execution pipeline. Finally, in Section 6, we introduce three additional adaptive indexing techniques that make the indexing overhead over MapReduce jobs almost invisible to users.

5.1 HAIL Adaptive Indexing in the Execution Pipeline

For our motivating example, let's assume Bob continues to analyze his logs and notices some suspicious activities, e.g. many user visits with very short duration, indicating spam bot activities. Therefore, Bob suddenly needs different jobs for his analysis that select user visits with short durations. However, recall that unfortunately he did not create a static index on attribute duration at upload time which would help for these new jobs.

Fig. 3 HAIL adaptive indexing pipeline.

In general, as soon as Bob (or one of his colleagues) sends a new job (say job_d) with a selection predicate on an unindexed attribute (e.g. on attribute duration, which we will denote as d in the following), HAIL cannot benefit from index scans anymore. However, HAIL takes these jobs as hints on how to adaptively improve the repertoire of indexes for future jobs. HAIL piggybacks the creation of a clustered index over attribute duration on the execution of job_d. Without any loss of generality, we assume that job_d projects all attributes from its input dataset.

Figure 3 illustrates the general workflow of the HAIL adaptive indexing pipeline. The figure shows how HAIL processes map tasks of job_d when no suitable index is available (i.e., when performing a full scan) in more detail. As soon as HAIL schedules a map task to a specific TaskTracker⁶, e.g. TaskTracker 5, the HAILRecordReader of the map task first reads the metadata from the HAILInputSplit ①⁷. With this metadata, the HAILRecordReader checks whether a suitable index is available for its input data block (say block_42). As no index on attribute d is available, the HAILRecordReader simply opens an input stream to the local replica of block_42 stored on DataNode 5. Then, the HAILRecordReader: (i) loads all values of the attributes required by job_d from disk to main memory ②; (ii) reconstructs records (as our HDFS blocks are in columnar layout); and (iii) feeds the map function with each record ③. Here lies the beauty of HAIL: an HDFS block that is a potential candidate for indexing was completely transferred to main memory as part of the job execution process.
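Step (ii), reconstructing records from a columnar block, can be illustrated with a short sketch. The representation is an assumption made for the example: a block is modelled as one String array per attribute, and PaxReconstructionSketch and reconstruct are hypothetical names rather than part of HAIL.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of column-to-row reconstruction; not HAIL code. */
final class PaxReconstructionSketch {

    /**
     * columns[a][r] holds attribute a of record r (columnar layout).
     * Returns one row per record that contains only the projected
     * attributes, in the order given by the projection (0-based positions).
     */
    static List<String[]> reconstruct(String[][] columns, int[] projection) {
        List<String[]> rows = new ArrayList<String[]>();
        int recordCount = columns.length == 0 ? 0 : columns[0].length;
        for (int r = 0; r < recordCount; r++) {
            String[] row = new String[projection.length];
            for (int p = 0; p < projection.length; p++) {
                row[p] = columns[projection[p]][r];
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Only the projected columns are touched, which is why a columnar layout lets the reader skip attributes that a job does not need.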
In addition to feeding the entire block_42 to the map function, HAIL can create a clustered index on attribute d to speed up future jobs. For this, the HAILRecordReader passes block_42 to the AdaptiveIndexer as soon as the map function has finished processing this data block ④⁸. The AdaptiveIndexer, in turn, sorts the data in block_42 according to attribute d, aligns other attributes through reordering, and creates a sparse clustered index ⑤. Finally, the AdaptiveIndexer stores this index with a copy of block_42 (sorted on attribute d) as a pseudo data block replica ⑥. Additionally, the AdaptiveIndexer registers the newly created index for block_42 with the HDFS NameNode ⑦. In fact, the implementation of the adaptive indexing pipeline solves some interesting technical challenges. We discuss the pipeline in more detail in the remainder of this section.

5.2 AdaptiveIndexer

Adaptive indexing is an automatic process that is not explicitly requested by users and therefore should not unexpectedly impose significant performance penalties on users' jobs. Piggybacking adaptive indexing on map tasks allows us to completely save the read I/O-cost. However, the indexing effort is shifted to query time. As a result, any additional time involved in indexing will potentially add to the total runtime of MapReduce jobs. Therefore, the first concern of HAIL is: how to make adaptive index creation efficient?

To overcome this issue, the idea of HAIL is to run the mapping and indexing processes in parallel. However, interleaving map task execution with indexing bears the risk of race conditions between map tasks and the AdaptiveIndexer on the data block. In other words, the AdaptiveIndexer might potentially reorder data inside a data block, while the map task is still concurrently reading the data block. One might think about copying data blocks before indexing to deal with this issue. Nevertheless, this would entail the additional runtime and memory overhead of copying such memory chunks. For this reason, HAIL does not interleave the mapping and indexing processes on the same data block. Instead, HAIL interleaves the indexing of a given data block (e.g. block_42) with the mapping phase of the succeeding data block (e.g. block_43), i.e., HAIL keeps two HDFS blocks in memory at the same time. For this, HAIL uses a producer-consumer pattern: a map task acts as producer by offering a data block to the AdaptiveIndexer, via a bounded blocking queue, as soon as it finishes processing the data block; in turn, the AdaptiveIndexer is constantly consuming data blocks from this queue. As a result, HAIL can perfectly interleave map tasks with indexing, except for the first and last data block to process in each node. It is worth noting that the queue exposed by the AdaptiveIndexer is allowed to reject data blocks in case a certain limit of enqueued data blocks is exceeded. This prevents the AdaptiveIndexer from running out of memory because of overload. Still, future MapReduce jobs with a selection predicate on the same attribute (i.e., on attribute d) can in turn take care of indexing the rejected data blocks. Once the AdaptiveIndexer pulls a data block from its queue, it processes the data block using two

⁶ A Hadoop instance responsible for executing map and reduce tasks.
⁷ That was obtained from the HAILInputFormat via getSplits().
⁸ Notice that all map tasks (even from different MapReduce jobs) running on the same node interact with the same AdaptiveIndexer instance. Hence, the AdaptiveIndexer can end up indexing data blocks from different MapReduce jobs at the same time.
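The hand-over between map tasks and the AdaptiveIndexer through a bounded queue that may reject blocks can be sketched with the standard java.util.concurrent.ArrayBlockingQueue. This is a minimal sketch under assumptions: the class name and its methods are hypothetical, a data block is reduced to an array of attribute values, "indexing" is reduced to sorting that array, and the consumer is called directly instead of running on its own thread (where it would typically block on take()).

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch of the indexer's bounded queue; not HAIL code. */
final class AdaptiveIndexerQueueSketch {

    private final BlockingQueue<long[]> queue;

    AdaptiveIndexerQueueSketch(int maxPendingBlocks) {
        this.queue = new ArrayBlockingQueue<long[]>(maxPendingBlocks);
    }

    /**
     * Producer side (map task): offers a block it has finished processing.
     * offer() never blocks; it returns false when the queue is full, which
     * models the indexer rejecting the block to protect its memory budget.
     */
    boolean offerBlock(long[] block) {
        return queue.offer(block);
    }

    /**
     * Consumer side (indexer): takes the next pending block, if any, and
     * sorts it in place. Returns null when nothing is pending.
     */
    long[] indexNext() {
        long[] block = queue.poll();
        if (block == null) {
            return null;
        }
        Arrays.sort(block);
        return block;
    }
}
```

Because a rejected block is simply not indexed, the map task is never delayed by a slow indexer; a later job filtering on the same attribute gets another chance to offer that block.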


More information

SECTION 7-2 Law of Cosines

SECTION 7-2 Law of Cosines 516 7 Additionl Topis in Trigonometry h d sin s () tn h h d 50. Surveying. The lyout in the figure t right is used to determine n inessile height h when seline d in plne perpendiulr to h n e estlished

More information

1 Fractions from an advanced point of view

1 Fractions from an advanced point of view 1 Frtions from n vne point of view We re going to stuy frtions from the viewpoint of moern lger, or strt lger. Our gol is to evelop eeper unerstning of wht n men. One onsequene of our eeper unerstning

More information

European Convention on Products Liability in regard to Personal Injury and Death

European Convention on Products Liability in regard to Personal Injury and Death Europen Trety Series - No. 91 Europen Convention on Produts Liility in regrd to Personl Injury nd Deth Strsourg, 27.I.1977 The memer Sttes of the Counil of Europe, signtory hereto, Considering tht the

More information

Seeking Equilibrium: Demand and Supply

Seeking Equilibrium: Demand and Supply SECTION 1 Seeking Equilirium: Demnd nd Supply OBJECTIVES KEY TERMS TAKING NOTES In Setion 1, you will explore mrket equilirium nd see how it is rehed explin how demnd nd supply intert to determine equilirium

More information

PUBLIC-TRANSIT VEHICLE SCHEDULES USING A MINIMUM CREW-COST APPROACH

PUBLIC-TRANSIT VEHICLE SCHEDULES USING A MINIMUM CREW-COST APPROACH TOTAL LOGISTIC MANAGEMENT No. PP. Avishi CEDER PUBLIC-TRANSIT VEHICLE SCHEDULES USING A MINIMUM CREW-COST APPROACH Astrt: Commonly, puli trnsit genies, with view towrd effiieny, im t minimizing the numer

More information

National Firefighter Ability Tests And the National Firefighter Questionnaire

National Firefighter Ability Tests And the National Firefighter Questionnaire Ntionl Firefighter Aility Tests An the Ntionl Firefighter Questionnire PREPARATION AND PRACTICE BOOKLET Setion One: Introution There re three tests n questionnire tht mke up the NFA Tests session, these

More information
