Monitoring Streams: A New Class of Data Management Applications

Brown Computer Science Technical Report, TR-CS-02-04

Don Carney, Brown University, dpc@cs.brown.edu
Ugur Cetintemel, Brown University, ugur@cs.brown.edu
Mitch Cherniack, Brandeis University, mfc@cs.brandeis.edu
Christian Convey, Brown University, cjc@cs.brown.edu
Sangdon Lee, Brown University, sdlee@cs.brown.edu
Greg Seidman, Brown University, gss@cs.brown.edu
Michael Stonebraker, MIT, stonebraker@lcs.mit.edu
Nesime Tatbul, Brown University, tatbul@cs.brown.edu
Stan Zdonik, Brown University, sbz@cs.brown.edu

Abstract

This paper introduces monitoring applications, which we will show differ substantially from conventional business data processing. The fact that a software system must process and react to continual inputs from many sources (e.g., sensors) rather than from human operators requires one to rethink the fundamental architecture of a DBMS for this application area. In this paper, we present Aurora, a new DBMS that is currently under construction at Brandeis University, Brown University, and MIT. We describe the basic system architecture, a stream-oriented set of operators, optimization tactics, and support for real-time operation.

1 Introduction

Traditional DBMSs have been oriented toward business data processing, and consequently are designed to address the needs of these applications. First, they have assumed that the DBMS is a passive repository storing a large collection of data elements and that humans initiate queries and transactions on this repository. We call this a Human-Active, DBMS-Passive (HADP) model. Second, they have assumed that the current state of the data is the only thing that is important. Hence, current values of data elements are easy to obtain, while previous values can only be found torturously by decoding the DBMS log. The third assumption is that triggers and alerters are second-class citizens. These constructs have been added as an afterthought to current systems, and none have an implementation that scales to a large number of triggers. In addition, there is
typically little or no support for querying a trigger base. Fourth, DBMSs assume that data elements have precise values (e.g., employee salaries), and have little or no support for data elements that are imprecise or out-of-date. Lastly, DBMSs assume that applications require no real-time services.

There is a substantial class of applications where all five assumptions are problematic. Monitoring applications are applications that monitor continuous streams of data. This class of applications includes military applications that monitor readings from sensors worn by soldiers (e.g., blood pressure, heart rate, position), financial analysis applications that monitor streams of stock data reported from various stock exchanges, and tracking applications that monitor the locations of large numbers of objects for which they are responsible (e.g., audio-visual departments that must monitor the location of borrowed equipment). Because of the high volume of monitored data and the query requirements for these applications, monitoring applications would benefit from DBMS support. Existing DBMSs, however, are ill-suited for such applications since they target business applications.

First, monitoring applications get their data from external sources (e.g., sensors) rather than from humans issuing transactions. The role of the DBMS in this context is to alert humans when abnormal activity is detected. This is a DBMS-Active, Human-Passive (DAHP) model.

Second, monitoring applications require data management extending over the entire history of values reported in a stream, and not just over the most recently reported values. Consider a monitoring application that tracks the location of items of interest, such as overhead transparency projectors and laptop computers, using electronic property stickers attached to the objects. Ceiling-mounted sensors inside a building and the GPS system in the open air generate large volumes of location data. If a reserved overhead projector is not in its proper location, then one might want to know the geographic position of the missing projector. In this
case, the last value of the monitored object is required. However, an administrator might also want to know the duty cycle of the projector, thereby requiring access to the entire historical time series.

Third, most monitoring applications are trigger-oriented. If one is monitoring a chemical plant, then one wants to alert an operator if a sensor value gets too high or if another sensor value has recorded a value out of range more than twice in the last 24 hours. Every application could potentially monitor multiple streams of data, requesting alerts if complicated conditions are met. Thus, the scale of trigger processing required in this environment far exceeds that found in traditional DBMS applications.
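
To make the chemical plant trigger above concrete, here is a minimal sketch of a sliding-window alert. It is only an illustration, not Aurora's implementation: the class name `OutOfRangeTrigger`, its parameters, and the sample readings are all hypothetical, and time is expressed in arbitrary units (hours in the usage below).

```python
from collections import deque


class OutOfRangeTrigger:
    """Hypothetical windowed trigger: fires when more than `max_violations`
    out-of-range readings fall inside a sliding time window."""

    def __init__(self, low, high, window, max_violations=2):
        self.low = low
        self.high = high
        self.window = window
        self.max_violations = max_violations
        self._violations = deque()  # timestamps of out-of-range readings

    def observe(self, timestamp, value):
        # Forget violations that have slid out of the window.
        while self._violations and timestamp - self._violations[0] > self.window:
            self._violations.popleft()
        # Record this reading if it is out of range.
        if not (self.low <= value <= self.high):
            self._violations.append(timestamp)
        # Alert only when the count exceeds the allowed number.
        return len(self._violations) > self.max_violations


# Usage: timestamps in hours, window of 24 hours (illustrative data only).
trigger = OutOfRangeTrigger(low=0.0, high=100.0, window=24.0)
readings = [(0, 50), (1, 120), (5, 130), (9, 20), (20, 140), (50, 150)]
alerts = [t for t, v in readings if trigger.observe(t, v)]
print(alerts)
```

The trigger keeps only the timestamps of violations, so its state is bounded by the number of out-of-range readings in one window rather than by the length of the stream.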

Fourth, stream data is often lost, stale, or imprecise. An object being monitored may move out of range of a sensor system, thereby resulting in lost data. The most recent report on the location of the object becomes more and more inaccurate over time. Moreover, sensors that record bodily functions (such as heartbeat) are quite imprecise, and have margins of error that are significant in size.

Lastly, many monitoring applications have real-time requirements. Applications that monitor mobile sensors (e.g., military applications monitoring soldier locations) often have a low tolerance for stale data, making these applications effectively real time. The added stress on a DBMS that must serve real-time applications makes it imperative that the DBMS employ intelligent resource management (e.g., scheduling) and graceful degradation strategies (e.g., load shedding) during periods of high load.

Monitoring applications are very difficult to implement in traditional DBMSs. First, the basic computation model is wrong: DBMSs have a HADP model while monitoring applications often require a DAHP model. In addition, to store time-series information one has only two choices. First, one can encode the time series as current data in normal tables. In this case, assembling the historical time series is very expensive because the required data is spread over many tuples, thereby dramatically slowing performance. Alternatively, one can encode time-series information in binary large objects (BLOBs) to achieve physical locality, at the expense of making queries to individual values in the time series very difficult. The only system that we are aware of that tries to do something more intelligent with time-series data is the Informix Universal Server, which implemented a time-series data type and associated methods that speed retrieval of values in a time series [1].

If a monitoring application had a very large number of triggers or alerters, then current DBMSs would fail because they do not scale past a few triggers per table. The only alternative is to encode triggers in
some middleware application. Using this implementation, the system cannot reason about the triggers (e.g., optimization), nor can the triggers be queried because they are outside the DBMS. Moreover, performance is typically poor because middleware must poll for data values that triggers and alerters depend on.

Lastly, no DBMS that we are aware of has built-in facilities for imprecise or stale data. The same comment applies to real-time capabilities. Again, the user must build custom code into his application. For these reasons, monitoring applications are difficult to implement using traditional DBMS technology. To do better, all the basic mechanisms in current DBMSs must be rethought. In this paper, we describe a prototype system, Aurora, which is designed to better support monitoring applications. We use Aurora to illustrate design issues that would arise in any system of this kind.

Monitoring applications are applications for which streams of information, triggers, imprecise data, and real-time requirements are prevalent. We expect that there will be a large class of such applications. For example, we expect the class of monitoring applications for physical facilities (e.g., monitoring unusual events at nuclear power plants) to grow in response to growing needs for security. In addition, as GPS-style devices are attached to a broader and broader class of objects, monitoring applications will expand in scope. Currently such monitoring is expensive and is restricted to costly items like automobiles (e.g., Lojack technology). In the future, it will be available for most objects whose position is of interest.

Figure 1: Aurora system model (labels: input data streams; operator boxes; historical storage; continuous and ad hoc queries; output to applications)

In Section 2, we begin by describing the basic Aurora architecture and fundamental building blocks. In Section 3, we show why traditional query optimization fails in our environment, and present our alternate strategies for optimizing Aurora applications. Section 4 describes the run-time architecture and behavior of Aurora,
concentrating on storage organization, scheduling, introspection, and load shedding. In Section 5, we discuss the myriad of related work that has preceded our effort. Finally, we conclude in Section 6.

2 Aurora System Model

Aurora data is assumed to come from a variety of data sources such as computer programs that generate values at regular or irregular intervals or hardware sensors. We will use the term data source for either case. In addition, a data stream is the term we will use for the collection of data values that are presented by a data source. Each data source is assumed to have a unique source identifier and Aurora timestamps every incoming tuple to monitor the quality of service being provided.

The basic job of Aurora is to process incoming streams in the way defined by an application administrator. Aurora is fundamentally a data-flow system and uses the popular boxes and arrows paradigm found in most process flow and workflow systems. Hence, tuples flow through a loop-free, directed graph of processing operations (i.e., boxes). Ultimately, output streams are presented to applications, which must be programmed to deal with the asynchronous tuples in an output stream. Aurora can also maintain historical storage, primarily in order to support ad-hoc queries. Figure 1 illustrates the high-level system model.

Aurora Operators

Aurora contains built-in support for seven primitive operations for expressing its stream processing requirements. The filter box applies a predicate to each incoming tuple, passing only the ones that satisfy

the predicate. Merge combines two streams of data into a single stream. Resample is a powerful operation that predicts additional values that are not originally contained in a stream. This operation can be used to create new values in the stream by interpolating between existing values. In addition, Aurora provides a drop box that removes tuples from the stream, and is used primarily to shed load when Aurora is not providing reasonable service (see Section 4.5). Join is a windowed version of the standard relational join. Join pairs elements in separate streams whose distance (e.g., difference in time, relative position, etc.) is within a specified bound. This bound is considered the size of the window for the join. For example, the window might be 30 minutes if one wanted to pair all securities that had the same price within a half hour of each other. In environments where data can be stale or time imprecise, windowed operations are a necessity. The last box is called map and it supports the application of a (user-defined) function to every element of a data stream. Some functions just transform individual items in the stream to other items, while others, such as moving average, apply a function across a window of values in a stream. Hence, map has both a windowed and a non-windowed version. A precise definition for each of these operators is contained in Appendix 1.

There are no condition boxes in an Aurora network. Instead, an application would have to include two filter boxes providing both the condition and the converse of the condition. At some inconvenience to an application, this results in a simpler system structure. Additionally, there is no explicit split box; instead, the application administrator can connect the output of one box to the input of several others. This implements an implicit split operation. On the other hand, there is an explicit Aurora merge operation, whereby two streams can be put together. If, additionally, one tuple must be delayed for the arrival of a second one, then a resample box can be inserted in the Aurora network to accomplish this
effect.

Aurora Query Model

Aurora supports continual queries (real-time processing), views, and ad-hoc queries, all using substantially the same mechanisms. All three modes of operation use the same conceptual building blocks. Each mode processes information flows based on QoS specifications: each output in Aurora is associated with two-dimensional QoS graphs that specify the utility of the output in terms of several performance and quality related attributes (see Section 4.1). The diagram in Figure 2 illustrates the processing modes supported by Aurora.

Figure 2: Aurora query model (three paths: continuous query, view, and ad-hoc query; inputs S1 to S3; boxes O1 to O9; storage at connection points with persistence specs "Keep 2 hr" and "Keep 1 hr"; a QoS spec on each path)

The topmost path represents a continuous query. In isolation, data elements flow into boxes, are processed, and flow further downstream. In this scenario, there is no need to store any data elements once they are processed. Once an input has worked its way through all reachable paths, that data item is drained from the network. The QoS specification at the end of the path controls how resources are allocated to the processing elements along the path. One can also view an Aurora network (along with some of its applications) as a large collection of triggers. Each path from a sensor input to an output can be viewed as computing the condition part of a complex trigger. An output tuple is delivered to an application, which can take the appropriate action.

The dark circles on the input arcs to boxes O1 and O2 represent connection points. A connection point is an arc that will support dynamic modification to the network. New boxes can be added to or deleted from a connection point. When a new application connects to the network, it will often require access to the recent past. As such, a connection point has the potential for persistent storage (see Section 4.2). Persistent storage retains data items beyond their processing by a particular box. In other words, as items flow past a connection point, they are cached in
a persistent store for some period of time. They are not drained from the network by applications. Instead, a persistence specification indicates exactly how long the items are kept. In the figure, the left-most connection point is specified to be available for two hours. This indicates that the beginning of time for newly connected applications will be two hours in the past.

The middle path in Figure 2 represents a view. In this case, a path is defined with no connected application. It is allowed to have a QoS specification as an indication of the importance of the view. Applications can connect to the end of this path whenever there is a need. Before this happens, the system can propagate some, all, or none of the values stored at the connection point in order to reduce latency for applications that connect later. Moreover, it can store these partial results at any point along a view path. This is analogous to a materialized or partially materialized view. View materialization is under the control of the scheduler.

The bottom path represents an ad-hoc query. An ad-hoc query can be attached to a connection point at any time. The semantics of an ad-hoc query is that the system will process data items and deliver answers from the earliest time T (persistence spec) stored in the connection point until the query branch is explicitly disconnected. Thus, the semantics for an Aurora ad-hoc query is the same as a continuous query that starts executing at t_now − T and continues until explicit termination.

Aurora User Interface

The Aurora user interface cannot be covered in detail because of space limitations.
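
The ad-hoc query semantics described above (replay everything retained at the connection point, then continue as a continuous query until disconnected) can be sketched as follows. This is a toy model under stated assumptions, not Aurora's code: `ConnectionPoint`, `push`, `attach`, and `detach` are hypothetical names, history is held in an in-memory deque rather than persistent storage, and the persistence spec is a plain number in the same units as the timestamps.

```python
from collections import deque


class ConnectionPoint:
    """Hypothetical connection point that retains tuples for `persistence`
    time units and lets queries attach at any time."""

    def __init__(self, persistence):
        self.persistence = persistence
        self._history = deque()   # (timestamp, item) pairs, oldest first
        self._subscribers = []    # callbacks of currently attached queries

    def push(self, timestamp, item):
        self._history.append((timestamp, item))
        # Expire items older than the persistence spec.
        while self._history and timestamp - self._history[0][0] > self.persistence:
            self._history.popleft()
        # Deliver the new item to every attached query.
        for callback in self._subscribers:
            callback(timestamp, item)

    def attach(self, callback):
        # First replay the retained history (from roughly t_now - T) ...
        for timestamp, item in list(self._history):
            callback(timestamp, item)
        # ... then behave like a continuous query.
        self._subscribers.append(callback)

    def detach(self, callback):
        self._subscribers.remove(callback)


# Usage: keep two hours of history; timestamps are in hours.
cp = ConnectionPoint(persistence=2.0)
for t in [0.0, 1.0, 2.5, 3.0]:
    cp.push(t, f"reading@{t}")

seen = []

def on_tuple(timestamp, item):
    seen.append(timestamp)

cp.attach(on_tuple)        # replays 1.0, 2.5, 3.0 (0.0 has expired)
cp.push(3.5, "reading@3.5")
cp.detach(on_tuple)
cp.push(4.0, "reading@4.0")  # no longer delivered
print(seen)
```

A newly attached query therefore sees the same tuples it would have seen had it been running continuously since the start of the retained window.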

Here, we mention only a few salient features. To facilitate designing large networks, Aurora will support a hierarchical collection of groups of boxes. A designer can begin near the top of the hierarchy where only a few superboxes are visible on the screen. A zoom capability is provided to allow him to move into specific portions of the network, by replacing a group with its constituent boxes and groups. In this way, a browsing capability is provided for the Aurora diagram. Boxes and groups have a tag, an argument list, a description of the functionality and ultimately a manual page. By querying these attributes a user can teleport to specific places in an Aurora network. Additionally, a user can place bookmarks in a network to allow him to return to places of interest. These capabilities give an Aurora user a mechanism to query the Aurora diagram, thereby satisfying one of the design goals in the introduction, namely query facilities for the trigger base. The user interface also allows monitors for arcs in the network to facilitate debugging, as well as facilities for single stepping through a sequence of Aurora boxes. We plan a graphical performance monitor, as well as more sophisticated query capabilities.

3 Aurora Optimization

In traditional relational query optimization, one of the primary objectives is to minimize the number of iterations over large data sets. Stream-oriented operators that constitute the Aurora network, on the other hand, are designed to operate in a data flow mode in which data elements are processed as they appear on the input. Although the amount of computation required by an operator to process a new element is usually quite small, we expect to have a very large number of boxes. Furthermore, high data rates add another dimension to the problem. We now present our strategies to construct an optimized Aurora network based on static information such as estimated cost of the boxes and selectivity statistics. Combining all the boxes into a massive query and then applying conventional query optimization is not a workable approach for the
Aurora system; however, some conventional principles still apply. As in multiple-query optimization [3], one way to go about optimizing multiple triggers is to try to globally optimize them, which is shown to be NP-complete [4]. Instead, we focus on the following collection of alternative tactics to come up with locally optimal triggers which can then be compiled into a single network of boxes (also using common subexpression elimination techniques where possible).

Inserting Projections

It is unlikely that the application administrator will have inserted map operators to project out all unneeded attributes. Examination of an Aurora network allows us to insert/move such map operations to the earliest possible points in the network, thereby shrinking the size of the tuples that must be subsequently processed. Note that this kind of optimization requires that the system be provided with function/operator signatures that describe the attributes that are used and produced by the operators.

Table 1: Operator commutativity table

            filter   map   w map   merge   resample   join   drop
filter        +
map           ?       ?
w map         -       ?      ?
merge
resample      ?       ?      -       -        ?
join          ?       ?      -       -        -        ?
drop

Combining Boxes

As a next step, Aurora diagrams will be processed to combine boxes where possible. A pair-wise examination of the operators suggests that, in general, map and filter can be combined with almost all of the operators whereas windowed or binary operators cannot. It is desirable to combine two boxes into a single box when this leads to some cost reduction. As an example, a map operator that only projects out attributes can be combined easily with any adjacent operator, thereby saving the box execution overhead for a very cheap operator. In addition, two filtering operations can be combined into a single, more complex filter that can be more efficiently executed than the two boxes it replaces. Not only is the overhead of a second box activation avoided, but also standard relational optimization on one-table predicates can be applied in the larger box. In general, combining boxes at least saves the box execution overhead,
and reduces the total number of boxes, leading to a simpler diagram.

Reordering Boxes

Reordering the operations in a conventional relational DBMS to an equivalent but more efficient form is a common technique in query optimization. For example, filter operations can sometimes be pushed down the query tree through joins. In Aurora, we can apply the same technique when two operations commute. We present the commutativity of each pair of primitive boxes in Aurora in Table 1. We observe that a few operators always commute, while many commute under certain circumstances and a few never commute. The operations that conditionally commute usually depend on the size of the window. In Appendix 2, we examine the commutativity of our operations in detail.

To decide when to interchange two commutative operators, we make use of the following performance model. Each Aurora box, $b$, has a cost, $c(b)$, defined as the expected execution time for $b$ to process one input tuple. Additionally, each box has a selectivity, $s(b)$, which is the expected number of output tuples per input tuple. Consider two boxes, $b_i$ and $b_j$, with $b_j$ following $b_i$. In this case, for each input tuple for $b_i$, we can compute the amount of processing as $c(b_i) + c(b_j)\,s(b_i)$. Reversing the operators gives a like calculation. Hence, we can compute the condition used to decide whether the boxes should be switched as:

$$\frac{1 - s(b_j)}{c(b_j)} > \frac{1 - s(b_i)}{c(b_i)}$$

It is straightforward to generalize the above calculation to deal with cases that involve fan-in or fan-out situations. Moreover, it is easy to see that we can obtain an optimal ordering by sorting all the boxes according to their corresponding ratios in decreasing order. We use this result in a heuristic algorithm that iteratively reorders boxes (to the extent allowed by their commutativity properties) until no more reorderings are possible.

Ad-Hoc Query Optimization

Recall that the semantics of an ad-hoc query is that it must run on all the historical information saved at the connection point(s) to which it is connected. Subsequently, it becomes a normal portion of an Aurora network, until it is discarded. Aurora processes ad-hoc queries in two steps. First, it runs the portion of the query that uses the historical information in the connection point. The storage system is free to organize this storage in any way that it deems appropriate (e.g., a B-tree). When this finishes, it switches to normal operation on the queue data structures (see Section 4.2).

To optimize historical queries, Aurora begins at each connection point and examines the successor box(es). If the box is a filter, then Aurora examines the condition to see if it is compatible with the storage key associated with the connection point. If so, it switches the implementation of the filter box to perform an indexed lookup in the B-tree. Similarly, if the successor box is a join, then Aurora costs performing a merge-sort or indexed lookup, chooses the cheapest one, and changes the join implementation appropriately. Other boxes cannot effectively use the indexed structure, so only these two need be considered. Moreover, once the initial box performs its work on the historical tuples, the index structure is lost, and all subsequent boxes will work in the normal way. Hence, the initial boxes in an ad-hoc query can pull
information from the B-tree associated with the corresponding connection point(s). After that, the normal push processing associated with Aurora operation takes over. When the historical operation is finished, Aurora switches the implementation to the queued data structures, and continues processing in the conventional fashion.

This switching strategy works unless there is a windowed operation in the ad-hoc query. In this case, special care must be taken to obtain the correct answer (requiring a sort operation). The most straightforward mechanism is to replicate the size of the window from the queued structure into the B-tree. In this way, the overlap that is required for correct operation is provided in the B-tree. Of course, this replication must be noted in the B-tree structure, so that future ad-hoc queries will only be given the overlap window size that they need. Also, the size of the replicated window slowly goes to zero, since new tuples that would ordinarily be added to the B-tree are discarded instead.

Figure 3: Aurora run-time architecture (components: inputs, router, scheduler, storage manager with queues Q1 to Qn, buffer manager, persistent store, box processors, QoS monitor, load shedder, catalogs, outputs)

4 Run-Time Operation

The basic purpose of the Aurora run-time network is to process data flows through a potentially large workflow diagram. Figure 3 illustrates the basic Aurora architecture. Here, inputs from data sources and outputs from boxes are fed to the router, which forwards them either to external applications or to the storage manager to be placed on the proper queue. The storage manager is responsible for maintaining the box queues and managing the buffer. Conceptually, the scheduler picks a box for execution, ascertains what processing is required, and passes a pointer to the box description (together with a pointer to the box state) to the multi-threaded box processor. The box processor executes the appropriate operation and then forwards the output tuples to the router. The scheduler then ascertains the next processing step and the cycle repeats. The QoS evaluator continually monitors system performance and activates the load shedder when it detects an overload
situation and poor system performance. The load shedder then sheds load till the performance of the system reaches an acceptable level. The catalog in Figure 3 contains information regarding the network topology, inputs, outputs, QoS information, and relevant statistics (e.g., selectivity, average box processing costs), and is essentially used by all components. We now describe Aurora's primary run-time architecture in more detail, focusing primarily on the storage manager, scheduler, QoS monitor, and load shedder.

4.1 QoS Data Structures

Aurora attempts to maximize the perceived quality of service (QoS) for the outputs it produces. QoS, in general, is a multidimensional function of several attributes of an Aurora system. These include:

- Response times: output tuples should be produced in a timely fashion, as otherwise QoS will degrade as delays get longer;
- Tuple drops: if tuples are dropped to shed load, then the QoS of the affected outputs will deteriorate;
- Values produced: QoS clearly depends on whether important values are being produced or not.
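
As a rough illustration of how one such dimension might be captured, the sketch below models a utility curve as a piecewise-linear function of a single attribute (here, delay). This is an assumption made for illustration only: the helper `make_qos_graph`, the normalization of utility to the range 0 to 1, and the breakpoints are all hypothetical and are not taken from Aurora.

```python
from bisect import bisect_right


def make_qos_graph(points):
    """Build a hypothetical piecewise-linear utility curve from sorted
    (x, utility) breakpoints; values outside the range are clamped."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]

    def qos(x):
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, x)          # first breakpoint strictly right of x
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        # Linear interpolation between the two surrounding breakpoints.
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return qos


# Illustrative delay curve: full utility up to 2 time units of delay,
# then a linear decline to zero utility at 6 time units.
delay_qos = make_qos_graph([(0.0, 1.0), (2.0, 1.0), (6.0, 0.0)])
print(delay_qos(1.0), delay_qos(4.0), delay_qos(10.0))
```

The same helper could describe utility as a function of the percentage of tuples delivered or of the output value, simply by changing what the horizontal axis represents.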

Asking the application administrator to specify a multidimensional QoS function seems impractical. Instead, Aurora relies on a simpler tactic, which is much easier for humans to deal with: for each output stream, we expect the application administrator to give Aurora a two-dimensional QoS graph based on the percentage of tuples delivered (as illustrated in Figure 4a). In this case, the application administrator indicates that high QoS is achieved when tuple delivery is near 100% and that QoS degrades as tuples are dropped.

Optionally, the application administrator can give Aurora either one of two additional QoS graphs for all outputs in an Aurora system. The first, illustrated in Figure 4b, is a delay-based QoS graph. Here, the QoS of the output is maximized if delay is less than the threshold, δ, in the graph. Beyond δ, QoS degrades with additional tuple delay. The application administrator should present Aurora with the information in Figure 4b for each output, if he is capable of doing so.

Aurora also assumes that all quality of service graphs are normalized, so that we can compare QoS for different outputs quantitatively. Aurora further assumes that the value chosen for δ is feasible, i.e., that a properly sized Aurora network will operate with all outputs in the good zone to the left of δ. This will require the delay introduced by the total computational cost along the longest path from a data source to this output not to exceed δ. If the application administrator does not present Aurora with feasible QoS graphs, then the algorithms in the subsequent sections will not produce good results.

The second optional QoS graph for outputs is shown in Figure 4c. The possible values produced as outputs appear on the horizontal axis, and the QoS graph indicates the importance of each one. This value-based QoS graph captures the fact that some outputs are more important than others. For example, in a plant monitoring application, outputs near a critical region are much
more important than ones well away from it. Again, if the application administrator has value-based QoS information, then Aurora will use it to shed load more intelligently than would occur otherwise.

The last item of information required from the application administrator is H, the headroom for the system, defined as the percentage of the computing resources that can be used in steady state. The remainder is reserved for the expected ad-hoc queries, which are added dynamically.

Figure 4: QoS graph types: (a) drop-based (QoS vs. % tuples delivered), (b) delay-based (QoS vs. delay, with the good zone to the left of δ), (c) value-based (QoS vs. output value)

In summary, Aurora requires the application administrator to specify the headroom as well as drop-based QoS graphs for each output. In addition, the administrator can optionally give Aurora delay-based or value-based QoS graphs for all outputs.

4.2 Storage Management

The job of the Aurora Storage Manager (ASM) is to store all tuples required by an Aurora network. There are two kinds of requirements. First, ASM must manage storage for the tuples that are being passed through an Aurora network, and secondly, it must maintain extra tuple storage that may be required at connection points.

Queue Management

Each windowed operation requires a historical collection of tuples to be stored, equal to the size of the window. Moreover, if the network is currently saturated, then additional tuples may accumulate at various places in the network. As such, ASM must manage a collection of variable length queues of tuples. There is one queue at the output of each box, which is shared by all successor boxes. Each such successor box maintains two pointers into this queue. The head indicates the oldest tuple that this box has not processed. The tail, in contrast, indicates the oldest tuple that the box needs. The head and tail indicate the box's current window, which slides as new tuples are processed. ASM will keep track of these collections of pointers, and can normally discard tuples in a queue that are older than the oldest tail pointing into the queue. In summary, when a box produces a new tuple, it is added to the front of the queue. Eventually, all successor boxes process
this tuple and it falls out of all of their windows and can be discarded. Figure 5 illustrates this model by depicting a two-way branch scenario where two boxes, b1 and b2, share the same queue (w's refer to window sizes).

Figure 5: Queue organization (a shared queue ordered by time from the oldest to the youngest tuple; b1 with window w1 = 5 and b2 with window w2 = 9 each keep a head and a tail pointer; tuples older than the oldest tail can be discarded)

Normally, queues of this sort are stored as main memory data structures. However, ASM must be able to scale arbitrarily, and has chosen a different approach. Disk

storage is divided into fixed length blocks, of a tunable size, block_size. We expect a typical environment will use 128KB or larger blocks. Each queue is allocated one block, and queue management proceeds as above. As long as the queue does not overflow, the single block is used as a circular buffer. If an overflow occurs, ASM looks for a collection of blocks (contiguous if possible), and expands the queue dynamically to 2 × block_size. Circular management continues in this larger space. Of course, queue underflow can be treated in an analogous manner.

At start-up time, ASM is allocated a buffer pool for queue storage. It pages queue blocks into and out of main memory using a novel replacement policy. The scheduler and ASM share a tabular data structure that contains a row for each box in the network containing the current scheduling priority of the box and the percentage of its queue that is currently in main memory. The scheduler periodically adjusts the priority of each box, while the ASM does likewise for the main memory residency of the queue. This latter piece of information is used by the scheduler for guiding scheduling decisions (see Section 4.3). The data structure also contains a flag to indicate that a box is currently running. Figure 6 illustrates this interaction.

When space is needed for a disk block, ASM evicts the lowest priority main memory resident block. In addition, whenever ASM discovers a block for a queue that does not correspond to a running box, it will attempt to upgrade the block by evicting it in favor of a block for the queue corresponding to a higher priority box. In this way, ASM is continually trying to keep all the required blocks in main memory that correspond to the top priority queues. ASM is also aware of the size of each queue and whether it is contiguous on disk. Using this information, it can schedule multi-block reads and writes and garner added efficiency. Of course, as blocks move through the system and conditions change, the scheduler will adjust the priority of boxes, and ASM will react by adjusting the
buffer pool. Naturally, we must be careful to avoid the well-known hysteresis effect, whereby ASM and the scheduler start working at cross purposes and performance degrades sharply.

[Figure 6: Scheduler-storage manager interaction. The scheduler supplies QoS-based priority information to the storage manager, which supplies buffer-state information in return.]

Connection Point Management. As noted earlier, the Aurora application designer indicates a collection of connection points, to which collections of boxes can be subsequently connected. This satisfies the Aurora requirement to support ad-hoc queries. Associated with each connection point is a history requirement and an optional storage key. The history requirement indicates the amount of historical information that must be retained. Sometimes, the amount of retained history is less than the maximum window size of the successor boxes. In this case, no extra storage need be allocated. The usual case is that additional history is requested. In this case, ASM will organize the historical tuples in a B-tree organized on the storage key. If one is not specified, then a B-tree will be built on the timestamp field in the tuple. When tuples fall off the end of a queue that is associated with a connection point, ASM will gather up batches of such tuples and insert them into the corresponding B-tree. Periodically, it will make a pass through the B-tree and delete all the tuples that are older than the history requirement. Obviously, it is more efficient to process insertions and deletions in batches than one by one.

Since we expect B-tree blocks to be smaller than block_size, we anticipate splitting one or more of the buffer pool blocks into smaller pieces, and paging historical blocks into this space. The scheduler will simply add the boxes corresponding to ad-hoc queries to the data structure mentioned above, and give these new boxes a priority. ASM will react by prefetching index blocks, but not data blocks, for worthy indexed structures. In turn, it will retain index blocks as long as there are no higher priority buffer requirements. No attempt
will be made to retain data blocks in main memory.

4.3 QoS- and Buffer-Driven Scheduling

Scheduling algorithms developed for real-time and multimedia systems (e.g., [19, 20]) typically attempt to maximize some QoS-based utility metric by choosing to execute, at each scheduling instance, the task with the highest expected utility (see Section 5). As each tuple that enters Aurora conceptually represents a task, such an approach is simply not workable because of the sheer number of tasks. Furthermore, unlike the aforementioned systems where a task is also the unit of scheduling and processing, execution of a task in Aurora spans many such scheduling and processing steps (i.e., an input tuple typically needs to go through many boxes before contributing to the output) and may involve multiple accesses to secondary storage. Basing scheduling decisions solely on QoS requirements, thereby failing to address end-to-end tuple processing costs, might lead to drastic performance degradation, especially under resource constraints. To this end, Aurora not only considers the real-time QoS requirements but also makes an explicit attempt to reduce overall tuple execution costs.

Non-Linearities in Tuple Processing. In particular, Aurora observes and exploits two basic non-linearities that arise when processing tuples:

Intra-box non-linearity. The cost of tuple processing may decrease as the number of tuples that are available for processing at a given box increases. This reduction in unit tuple processing costs may arise for two reasons. First, the total number of box calls that need to be made to process a given number of tuples decreases, cutting down

low-level overheads such as calls to the box code and context switches. Second, a box, depending on its semantics, may optimize its execution better with a larger number of tuples available in its queue. For instance, a box can materialize intermediate results and reuse them in the case of windowed operations, or use merge-join instead of nested loops in the case of joins. Figure 7 illustrates this notion of intra-box non-linearity. In the figure, while box $b_1$ (e.g., a filter box) can only benefit from decreased box execution overheads and cannot exploit the availability of a larger number of tuples, box $b_2$ (e.g., join) performs much better with an increasing number of queued tuples.

Inter-box non-linearity. End-to-end tuple processing costs may drastically increase if buffer space is not sufficient and tuples need to be shuttled back and forth between memory and disk several times throughout their lifetime. One important goal of Aurora scheduling is, thus, to minimize tuple thrashing. Another form of inter-box non-linearity occurs when passing tuples between box queues. If the scheduler can decide in advance that, say, box $b_2$ is going to be scheduled right after box $b_1$ (whose outputs feed $b_2$), then the storage manager can be bypassed (assuming there is sufficient buffer space) and its overhead avoided while transferring $b_1$'s outputs to $b_2$'s queue.

Train Scheduling. Aurora exploits the benefits of non-linearity in both intra-box and inter-box tuple processing primarily through train scheduling, a set of scheduling heuristics whose primary goal is to have boxes queue as many tuples as possible without processing (thereby generating long tuple trains), to process complete trains at once (thereby exploiting intra-box non-linearity), and to pass them to the subsequent boxes without having to go to disk (thereby exploiting inter-box non-linearity). One important implication of train scheduling is that, unlike traditional blocking operators that wake up and process any new input tuples as they arrive, the Aurora scheduler tells each box when to execute and how many queued
tuples to process. This somewhat complicates the implementation and increases the load of the scheduler, but is necessary for creating and processing trains, which will significantly decrease overall execution costs.

Because of the extreme scale, the highly dynamic nature of the system, and the granularity of scheduling, searching for optimal scheduling solutions is clearly infeasible. Aurora therefore uses heuristics to strike a good balance between the real-time requirements and cost minimization. Rather than considering all boxes for scheduling, which is prohibitive for a large network, Aurora first restricts its attention to the top k highest priority outputs (and the boxes that correspond to those outputs) that have queues in memory. The priority of an output is an indication of its urgency, and can be computed, for instance, based on the average staleness of the unprocessed tuples that belong to it. If the relative value of k to the total number of outputs is small, this approach should significantly reduce the number of boxes that Aurora considers for scheduling. Note that the priority list is recomputed periodically, not at each scheduling point. The assumption is that the relative priorities will not change significantly between consecutive recomputations.

[Figure 7: Illustrating intra-box non-linearity. Unit tuple processing cost (vertical axis) versus number of queued tuples (horizontal axis) for boxes $b_1$ and $b_2$.]

Aurora then uses two rough measures to guide its scheduling. Specifically, Aurora will prioritize boxes based on their buf_tf values: $\mathit{buf}(b)/\mathit{tf}(b)$, where $\mathit{buf}(b)$ is the buffer utilization factor and $\mathit{tf}(b)$ is the train factor. The buffer utilization factor is a simple estimation of how much extra buffer space will be gained by executing the box and is trivially computed as $\mathit{train\_size}(b) \times (1 - s(b))$, where $\mathit{train\_size}(b)$ is the size of $b$'s queue that is in buffer and $s(b)$ is the selectivity of $b$. The train factor, $\mathit{tf}(b)$, refers to the intra-box non-linearity and is a measure of the expected potential decrease in the unit cost of tuple execution if $b$'s execution is deferred to create a longer train (e.g., in
Figure 7, it makes more sense to defer $b_2$ than to defer $b_1$).

For each selected output, Aurora will find the first downstream box whose queue is in memory (note that for a box to be schedulable, its queue must at least contain its window's worth of tuples). Going upstream, Aurora will then consider other boxes, until either it considers a box whose queue is not in memory or it runs out of boxes. At this point, there is going to be a sequence of boxes (i.e., a superbox) that can be scheduled one after another. For each such superbox, Aurora will compute a buf_tf value, and choose the superbox that has the highest such value. The final schedule will consist of superboxes (one for each chosen output) sorted by their buf_tf values.

In order to execute a box, Aurora contacts the storage manager and asks that the queue of the box be pinned to the buffer throughout the box's execution. It then passes the location of the input queue to the appropriate box processor code, specifies how many tuples the box should process, and assigns it to an available worker thread.

4.4 Introspection

Aurora employs static and run-time introspection techniques to predict and detect overload situations.

Static Analysis. The goal of static analysis is to determine if the hardware running the Aurora network is sized correctly. If insufficient computational resources are present to handle the steady state requirements of an Aurora network, then queue lengths will increase without bound and response times will become arbitrarily large. As described before, each box b in an Aurora network has an expected tuple processing cost, c(b), and a

selectivity, s(b). If we also know the expected rate of tuple production r(d) from each data source d, then we can use the following static analysis to ascertain if Aurora is sized correctly. From each data source, we begin by examining the immediate downstream boxes: if box $b_i$ is directly downstream from data source $d_i$, then, for the system to be stable, the throughput of $b_i$ should be at least as large as the input data rate; i.e., $1/c(b_i) \ge r(d_i)$. We can then calculate the output data rate from $b_i$ as: $\min(1/c(b_i),\, r(d_i)) \times s(b_i)$. Proceeding iteratively, we can compute the output data rate and computational requirements for each box in an Aurora network. We can then calculate the minimum aggregate computational resources required per unit time, min_cap, for stable steady-state operation. Clearly, the Aurora system with a capacity $C$ cannot handle the expected steady state load if $C$ is smaller than min_cap. Furthermore, the response times will assuredly suffer under the expected load of ad-hoc queries if $C \times H < \mathit{min\_cap}$. Clearly, this is an undesirable situation and can be corrected by redesigning applications to change their resource requirements, by supplying more resources to increase system capacity, or by load shedding.

Dynamic Analysis. Even if the system has sufficient resources to execute a given Aurora network under expected conditions, unpredictable, long-duration spikes in input rates may deteriorate performance to a level that renders the system useless. We now describe two run-time techniques to detect such cases.

Our first technique for detecting an overload relies on the use of delay-based QoS information, if available. Aurora timestamps all tuples from data sources as they arrive. Furthermore, all Aurora operators preserve the tuple timestamps as they produce output tuples (if an operator has multiple input tuples, then the earlier timestamp is preserved). When Aurora delivers an output tuple to an application, it checks the corresponding delay-based QoS graph (Figure 4b) for that output to ascertain that the delay is at an acceptable
level (i.e., the output is in the good zone).

If delay-based QoS information is not available to guide Aurora in detecting abnormal operation, then Aurora employs a somewhat cruder technique. Specifically, Aurora watches its internal dispatching queue for evidence of sustained growth of one or more of the logical queues (in front of some box(es)). If a buildup is observed for a period of time exceeding a threshold, then Aurora takes corrective action. Of course, longer queue lengths do not necessarily mean that the delivered QoS is bad or unacceptable (e.g., consider train scheduling). Unless delay-based QoS information is available, however, this is the most reliable piece of information used to trigger load shedding.

4.5 Load Shedding

When an overload is detected as a result of static or dynamic analysis, Aurora attempts to reduce the volume of Aurora tuple processing via load shedding. The naïve approach to load shedding involves dropping tuples at random points in the network in an entirely uncontrolled manner. This is similar to dropping overflow packets in packet-switching networks [28], and has two potential problems: (1) overall system utility might be degraded more than necessary; and (2) application semantics might be arbitrarily affected. In order to alleviate these problems, Aurora relies on QoS information to guide the load shedding process. We now describe two load shedding schemes that differ in the way they exploit QoS information.

Load Shedding by Dropping Tuples. The first approach addresses the former problem mentioned above: it attempts to minimize the degradation (or maximize the improvement) in the overall system QoS; i.e., the QoS values aggregated over all the outputs. This is accomplished by dropping tuples on network branches that terminate in more tolerant outputs. If load shedding is triggered as a result of static analysis, then we cannot expect to use delay-based or value-based QoS information (without assuming the availability of a priori knowledge of the tuple delays or frequency distribution of values). On the other hand, if load
shedding is triggered as a result of dynamic analysis, we can also use delay-based QoS graphs if available.

We use a greedy algorithm to perform load shedding. Let us initially describe the static load shedding algorithm driven by drop-based QoS graphs. We first identify the output with the smallest negative slope for the corresponding QoS graph. We move horizontally along this curve until there is another output whose QoS curve has a smaller negative slope at that point. This horizontal difference gives us an indication of the output tuples to drop (i.e., the selectivity of the drop box to be inserted) that would result in the minimum decrease in the overall QoS. We then move the corresponding drop box as far upstream as possible until we find a box that affects other outputs (i.e., a split point), and place the drop box at this point. Meanwhile, we can calculate the amount of recovered resources. If the system resources are still not sufficient, then we repeat the process. For the run-time case, the algorithm is similar except that we can use delay-based QoS graphs to identify the problematic outputs, i.e., the ones that are beyond their delay thresholds, and we repeat the load shedding process until the latency goals are met.

In general, there are two subtleties in dynamic load shedding. First, drop boxes inserted by the load shedder should be among the ones that are given higher priority by the scheduler. Otherwise, load shedding will be ineffective in reducing the load of the system. Therefore, the load shedder simply does not consider the inactive (i.e., low priority) outputs, which are indicated by the scheduler.
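The static greedy procedure driven by drop-based QoS graphs can be illustrated with a small sketch. It is not Aurora's implementation: it assumes each QoS curve is piecewise linear, that the resources recovered from an output are proportional to the fraction of its tuples dropped, and it ignores where the drop box is placed in the network. The names `Output`, `load`, `segments`, and `greedy_shed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Output:
    name: str
    # Assumption: resources consumed on behalf of this output at 100% delivery.
    load: float
    # Assumption: piecewise-linear drop-based QoS curve, walked from 100%
    # delivered toward 0%. Each entry is (width, slope): width is the fraction
    # of tuples the segment covers, slope is the utility lost per unit dropped.
    segments: list


def greedy_shed(outputs, deficit):
    """Return ({output name: fraction to drop}, resources still missing)."""
    position = {o.name: 0 for o in outputs}   # current segment per output
    plan = {o.name: 0.0 for o in outputs}
    remaining = deficit
    while remaining > 1e-12:
        candidates = [o for o in outputs if position[o.name] < len(o.segments)]
        if not candidates:
            break  # every output is already fully dropped
        # Greedy step: pick the output whose curve is currently flattest,
        # i.e. where dropping costs the least utility.
        best = min(candidates, key=lambda o: o.segments[position[o.name]][1])
        width, _slope = best.segments[position[best.name]]
        # Drop to the end of this segment, or less if that already suffices.
        take = min(width, remaining / best.load)
        plan[best.name] += take
        remaining -= take * best.load
        position[best.name] += 1
    return plan, max(remaining, 0.0)
```

Consuming one linear segment at a time corresponds to moving along a curve until some other output offers a flatter slope. With two hypothetical outputs, `Output("A", 10.0, [(0.2, 0.1), (0.8, 2.0)])` and `Output("B", 5.0, [(0.5, 0.5), (0.5, 3.0)])`, and a deficit of 3.0, the sketch drops 20% of A's tuples first and then 20% of B's.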

Secondly, the algorithm tries to move the drop boxes as close to the sources as possible to discard tuples before they redundantly consume any resources. On the other hand, if there is a box with a large existing queue, it makes sense to temporarily insert the drop box at that point rather than trying to move it upstream closer towards the data sources.

Presumably, the application is coded so that it can tolerate missing tuples from a data source caused by communication failures or other problems. Hence, load shedding simply artificially introduces additional missing tuples. Although the semantics of the application are somewhat different, the harm should not be too damaging.

Semantic Load Shedding by Filtering Tuples. The load shedding scheme described above effectively reduces the amount of Aurora processing by dropping randomly selected tuples at strategic points in the network. While this approach attempts to minimize the loss in overall system utility, it fails to control the impact of the dropped tuples on application semantics. Semantic load shedding addresses this limitation by using value-based QoS information, if available (recall that value-based QoS graphs specify the relative importance of various values for a given output). Specifically, semantic load shedding drops tuples in a more controlled way; i.e., it drops less important tuples, rather than random ones, using filters.

If value-based QoS information is available, then Aurora can watch each output and build up a histogram containing the frequency with which value ranges have been observed. In addition, Aurora can calculate the expected utility of a range of outputs by multiplying the QoS values with the corresponding frequency values for every interval and then summing these values. To shed load, Aurora identifies the output with the lowest utility interval; converts this interval to a filter predicate; and then, as before, attempts to propagate the corresponding filter box as far upstream as possible to a split point. This strategy, which we refer to as backward interval propagation, admittedly
has limited scope because it requires the application of the inverse function for each operator passed upstream (Aurora boxes do not necessarily have inverses). An alternative strategy, forward interval propagation, estimates a proper filter predicate and propagates it in the downstream direction to see what results at the output. By trial and error, Aurora can converge on a desired filter predicate. Note that a combination of these two strategies can also be utilized. First, Aurora can apply backward propagation until it reaches a box, say b, whose operator's inverse is difficult to compute. Aurora can then apply forward propagation between the insertion location of the filter box and b. This algorithm can be applied iteratively until sufficient load is shed.

5 Related Work

A special case of Aurora processing is as a continuous query system. A continuous query system like Niagara [6] is concerned with combining multiple data sources in a wide area setting, while we are initially focusing on the construction of a general stream processor that can process very large numbers of streams.

Indexing queries [2] is an important technique for enhancing the performance of large-scale filtering applications (e.g., publish/subscribe). In Aurora, this would correspond to a merge of some inputs followed by a fanout to a large number of filter boxes. Query indexing would be useful here, but it represents only one Aurora processing idiom.

As in Aurora, active databases [21, 22] are concerned with monitoring conditions. These conditions can be a result of any arbitrary update on the stored database state. In our setting, updates are append-only, thus requiring different processing strategies for detecting monitored conditions. Triggers evaluate conditions that are either true or false. Our framework is general enough to support queries over streams or the conversion of these queries into monitored conditions. There has also been extensive work on making active databases highly scalable (e.g., [10]). Similar to continuous query research, these efforts have focused on query indexing, while Aurora is
constructing a more general system.

Adaptive query processing techniques (e.g., [3, 14, 27]) address efficient query execution in unpredictable and dynamic environments by revising the query execution plan as the characteristics of incoming data change. Of particular relevance is the Eddies work [3]. Unlike traditional query processing where every tuple from a given data source gets processed in the same way, each tuple processed by an Eddy is dynamically routed to operator threads for partial processing, with the responsibility falling upon the tuple to carry with it its processing state. Recent work [18] extended Eddies to support the processing of queries over streams, mainly by permitting Eddies systems to process multiple queries simultaneously and for unbounded lengths of time. The Aurora architecture bears some similarity to that of Eddies in its division of a single query's processing into multiple threads of control (one per query operator). However, queries processed by Eddies are expected to be processed in their entirety; there is neither the notion of load shedding, nor quality of service.

Directly related work on stream data query processing architectures shares many of the goals and target application domains with Aurora. The Streams project [4] attempts to provide complete DBMS functionality along with support for continuous queries over streaming data. Aurora's main emphasis is on efficiently supporting sophisticated continuous queries over a large number of potentially very fast data streams, thus sacrificing some traditional database functionality such as transactions. The Fjords architecture [17] combines querying of push-based sensor sources with pull-based traditional sources by embedding the pull/push semantics into queues between query operators. It is fundamentally different from Aurora in that operator scheduling is governed by a combination of schedulers specific to query threads and operator-queue

interactions. Tribeca [26] is an extensible, stream-oriented data processor designed specifically for supporting network traffic analysis. While Tribeca incorporates many of the stream operators and compile-time optimizations Aurora supports, it does not address crucial run-time issues such as scheduling or load shedding. Furthermore, Tribeca does not have the concept of ad-hoc queries, and, thus, does not address relevant storage organization issues.

Tools for mining of stream-based data have received considerable attention lately. Some tools only allow one pass over stored stream data (e.g., [12]), whereas others (e.g., [13]) are interested in addressing data mining problems and, thus, require multiple passes.

Work in sequence databases [25] defined sequence definition and manipulation languages over discrete data sequences. The Chronicle data model [15] defined a restricted view definition and manipulation language over append-only sequences. Aurora's algebra extends relevant aspects of previous proposals by proposing a binary windowed operator (i.e., join), which is indispensable for continuous query execution over data streams.

Our work is also relevant to materialized views [9], which are essentially stored continuous queries that are reexecuted (or incrementally updated) as their base data are modified. However, Aurora's notion of continuous queries differs from materialized views primarily in that Aurora updates are append-only, thus making it much easier to incrementally materialize the view. Also, query results are streamed (rather than stored), and high stream data rates may require load shedding or other approximate query processing techniques that trade off efficiency for result accuracy.

Our work is likely to benefit from and contribute to the considerable research on temporal databases [20] and main-memory databases [7], which assume an HADP model, whereas Aurora proposes a DAHP model that builds streams as fundamental Aurora objects. We can also benefit from the literature on real-time databases [16, 20]. In a real-time database
system, transactions are assigned timing constraints and the system attempts to ensure a degree of confidence in meeting these timing requirements. The Aurora notion of QoS specification extends the soft and hard deadlines employed in real-time databases to general utility functions. Furthermore, real-time databases associate deadlines with individual transactions, whereas Aurora associates QoS curves with outputs from stream processing and, thus, has to support continuous timing requirements.

There has been extensive research on scheduling tasks in real-time and multimedia systems and databases [19, 20]. The proposed approaches are commonly deadline- (or QoS-) driven; i.e., at each scheduling point, the task that has the earliest deadline or one that is expected to provide the highest QoS (e.g., throughput) is identified and scheduled. In Aurora, such an approach is not only impractical because of the sheer number of potentially schedulable tasks (i.e., tuples), but is also inefficient because of the implicit assumption that all tasks are memory-resident and are scheduled and executed in their entirety. Note that Eddies scheduling [3, 18] addresses the latter issue by favoring memory-resident tuples for execution. To the best of our knowledge, however, our train scheduling approach is unique in its ability to reduce overall execution costs by exploiting the intra- and inter-box non-linearities described here.

The work of [27] takes a scheduling-based approach to query processing; however, they do not address continuous queries, are primarily concerned with data rates that are too slow (we also consider rates that are too high), and they only address query plans that are trees with single outputs.

The congestion control problem in data networks [28] is relevant to Aurora and its load shedding mechanism. Load shedding in networks typically involves dropping individual packets randomly, based on timestamps, or using (application-specified) priority bits. Despite conceptual similarities, there are also some fundamental differences between network load
shedding and Aurora load shedding. First, unlike network load shedding, which is inherently distributed, Aurora is aware of the entire system state and can potentially make more intelligent shedding decisions. Second, Aurora uses QoS information provided by the external applications to trigger and guide load shedding. Third, Aurora's semantic load shedding approach not only attempts to minimize the degradation in overall system utility, but also quantifies the imprecision due to dropped tuples.

Aurora load shedding is also related to approximate query answering (e.g., [11]) and data reduction and summary techniques [5, 8], where result accuracy is traded for efficiency. By throwing away data, Aurora bases its computations on sampled data, effectively producing approximate answers using data sampling. The unique aspect of our approach is that our sampling is driven by QoS specifications.

6 Conclusions

Monitoring applications are those where streams of information, triggers, real-time requirements, and imprecise data are prevalent. Traditional DBMSs are based on the HADP model, and thus cannot provide adequate support for such applications. In this paper, we have described the architecture of Aurora, a DAHP system oriented towards monitoring applications. We argued that providing efficient support for these demanding applications not only requires critically revisiting many existing aspects of database design and implementation, but also requires developing novel proactive data storage and processing concepts and techniques.

In this paper, we first presented the basic Aurora architecture, along with the primitive building blocks for workflow processing. We followed with several heuristics for optimizing a large Aurora network. We then focused on run-time data storage and processing issues, discussing in detail storage organization, real-time scheduling,

introspection, and load shedding, and proposed novel solutions in all these areas.

We have started to implement the proposed architecture and shortly expect to have an initial prototype, which we will use to verify the effectiveness and practicality of the Aurora model and related algorithms. In terms of future work, we identified two important research directions. First, we are extending our data and processing model to cope with missing and imprecise data values, which are common in applications involving sensor-generated data streams. Second, we are working on a distributed Aurora architecture that will enable operators to be pushed closer to the data sources, potentially yielding significantly improved scalability, energy use, and bandwidth efficiency.

References

[1] Informix White Paper. Time Series: The Next Step for Telecommunications Data Management.
[2] M. Altinel and M. J. Franklin. Efficient Filtering of XML Documents for Selective Dissemination of Information. In Proc. of the 26th VLDB Conf., 2000.
[3] R. Avnur and J. Hellerstein. Eddies: Continuously Adaptive Query Processing. In Proc. of the SIGMOD Conf., Dallas, TX, 2000.
[4] S. Babu and J. Widom. Continuous Queries over Data Streams. SIGMOD Record, 30(3):109-120, 2001.
[5] D. Barbara, W. DuMouchel, C. Faloutsos, P. J. Haas, J. M. Hellerstein, Y. E. Ioannidis, H. V. Jagadish, T. Johnson, R. T. Ng, V. Poosala, K. A. Ross, and K. C. Sevcik. The New Jersey Data Reduction Report. IEEE Data Engineering Bulletin, 20(4):3-45, 1997.
[6] J. Chen, D. J. DeWitt, F. Tian, and Y. Wang. NiagaraCQ: A Scalable Continuous Query System for Internet Databases. In Proc. of the SIGMOD Conf., Dallas, TX, 2000.
[7] H. Garcia-Molina and K. Salem. Main Memory Database Systems: An Overview. IEEE Transactions on Knowledge and Data Engineering (TKDE), 4(6), 1992.
[8] J. Gehrke, F. Korn, and D. Srivastava. On Computing Correlated Aggregates over Continual Data Streams. In Proc. of the SIGMOD Conf., Santa Barbara, CA, 2001.
[9] A. Gupta and I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. IEEE Data Engineering Bulletin,
18(2):3-18, 1995.
[10] E. N. Hanson, C. Carnes, L. Huang, M. Konyala, L. Noronha, S. Parthasarathy, J. B. Park, and A. Vernon. Scalable Trigger Processing. In Proc. of the 15th ICDE, Sydney, Australia, 1999.
[11] J. M. Hellerstein, P. J. Haas, and H. J. Wang. Online Aggregation. In Proc. of the SIGMOD Conf., 1997.
[12] M. R. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on Data Streams. Compaq Systems Research Center, Palo Alto, California, Technical Report, May 1998.
[13] C. Hidber. Online Association Rule Mining. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, Philadelphia, PA, 1999.
[14] Z. G. Ives, D. Florescu, M. Friedman, A. Levy, and D. S. Weld. An Adaptive Query Execution System for Data Integration. In Proc. of the SIGMOD Conf., Philadelphia, PA, 1999.
[15] H. V. Jagadish, I. S. Mumick, and A. Silberschatz. View Maintenance Issues for the Chronicle Data Model. In Proc. of the 14th PODS, 1995.
[16] B. Kao and H. Garcia-Molina. An Overview of Real-time Database Systems. In Real Time Computing, W. A. Halang and A. D. Stoyenko, Eds. Springer-Verlag, 1994.
[17] S. Madden and M. J. Franklin. Fjording the Stream: An Architecture for Queries over Streaming Sensor Data. In Proc. of the 18th ICDE, 2002.
[18] S. R. Madden, M. A. Shaw, J. M. Hellerstein, and V. Raman. Continuously Adaptive Continuous Queries Over Streams. In Proc. of the SIGMOD Conf., Wisconsin, USA, 2002.
[19] J. Nieh and M. S. Lam. The Design, Implementation and Evaluation of SMART: A Scheduler for Multimedia Applications. In Proc. 16th ACM Symposium on OS Principles, 1997.
[20] G. Ozsoyoglu and R. T. Snodgrass. Temporal and Real-Time Databases: A Survey. IEEE Transactions on Knowledge and Data Engineering (TKDE), 7(4):513-532, 1995.
[21] N. Paton and O. Diaz. Active Database Systems. ACM Computing Surveys, 31(1):63-103, 1999.
[22] U. Schreier, H. Pirahesh, R. Agrawal, and C. Mohan. Alert: An Architecture for Transforming a Passive DBMS into an Active DBMS. In Proc. of the 17th VLDB Conf., Barcelona, Spain, 1991.
[23] T. K. Sellis. Multiple-Query Optimization. ACM Transactions on Database Systems (TODS), 13(1):23-52,
1988.
[24] T. K. Sellis and S. Ghosh. On the Multiple-Query Optimization Problem. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2(2):262-266, 1990.
[25] P. Seshadri, M. Livny, and R. Ramakrishnan. The Design and Implementation of a Sequence Database System. In Proc. of the 22nd VLDB, India, 1996.
[26] M. Sullivan and A. Heybey. Tribeca: A System for Managing Large Databases of Network Traffic. In Proc. of the USENIX Annual Technical Conf., New Orleans, LA, 1998.
[27] T. Urhan and M. J. Franklin. Dynamic Pipeline Scheduling for Improving Interactive Query Performance. In Proc. of the VLDB Conf., 2001.
[28] C. Yang and A. V. S. Reddy. A Taxonomy for Congestion Control Algorithms in Packet Switching Networks. IEEE Network, 9(5):34-44,

Appendix 1 Aurora Query Operator Semantics

A stream is a potentially infinite set of tuples ordered by index values (such as timestamps or integer positions). More formally, a stream, $S$, is a set of (index value, tuple) pairs (stream elements):

$$S = \{(i_1, t_1), (i_2, t_2), \ldots, (i_n, t_n), \ldots\}$$

such that index values, $i_j$, belong to an index type (below), and all tuples, $t_j$, belong to the same schema.

An index type is a totally ordered type that is associated with a unit measure, which defines minimal increments in index values. For example, if the index type consists of timestamps (as in stock data), then the associated unit measure might be 15 min or 1 hour depending on the maximum frequency of stock data. Unit measures permit expression of distances between elements, and window sizes within queries. Index types include the following operations:

Ordering relations ($<$, $>$, $\le$, $\ge$): Because the index type is a total order, it must be the case for any elements of the index type, $i_1$ and $i_2$, that either $i_1 < i_2$, $i_2 < i_1$, or $i_1 = i_2$.

Position arithmetic ($+$, $-$, $\%$): For any index value, $i$, $i + k$ is the index value that is $k$ unit measures more than $i$. For example, if $i$ belongs to a timestamp-based index type with unit measure 1 hour, then 15:00 + 3 = 18:00. Subtraction ($-$) is defined similarly. $\%$ performs modular arithmetic on index values.

Distance ($\Delta$): $\Delta(i_1, i_2)$ is the integer distance (in unit measures) between index values $i_1$ and $i_2$.

In defining the formal semantics of Aurora's query operators, we adopt the following shorthand notation, applicable to any stream, $S = \{(i_1, t_1), (i_2, t_2), \ldots, (i_n, t_n), \ldots\}$:

1. $S[i_k] = \{(i_k, t_k) \mid (i_k, t_k) \in S\}$
2. $S[i_m \ldots i_n] = \{(i_k, t_k) \mid (i_k, t_k) \in S,\ i_m \le i_k \le i_n\}$
3. $\mathrm{index}(i_k, t_k) = i_k$, $\mathrm{value}(i_k, t_k) = t_k$

Aurora's query operations all accept one or more input streams and produce a single output stream:

1. Filter ($\sigma$): $\sigma_p(S)$ returns the stream elements in $S$ that satisfy the predicate, $p$:

$$\sigma_p(S) = \{x \mid x \in S,\ p(x)\}$$

2. Map ($\mu$): $\mu_f(S)$ applies the function $f$ to every stream element in $S$:

$$\mu_f(S) = \{f(x) \mid x \in S\}$$

3. Windowed-Map ($M$): Given a function, $f$, that maps a subset of consecutive stream elements (a window) to another stream element, $M_{f,w}(S)$ returns the stream resulting from applying $f$ to every set of elements in $S$ whose index values differ by at most $w$ unit measures:

$$M_{f,w}(S) = \{f(S') \mid \exists d\,(S' = S[d \ldots d + w - 1])\}$$

4. Merge ($+$): Merge performs the union of tuples from separate but compatible streams (i.e., those that have the same index types, units of measure, and tuple schemas):

$$S_1 + S_2 = S_1 \cup S_2$$

5. Resample ($\rho$): Resample predicts elements in one stream for every index value represented in another. The prediction is made based on a prediction function, $f$, and a window size, $w$, that indicates how much history from $S_1$ must be used for prediction. Thus, for any streams, $S_1$ and $S_2$, with compatible index types; window size, $w$; and function, $f$:

$$\rho_{w,f}(S_1, S_2) = \{(i, f(S')) \mid S_2[i] \neq \emptyset,\ S' = \{(i', x) \in S_1 \mid \Delta(i, i') \le w\}\}$$

6. Join ($\bowtie$): Join correlates pairs of streams with compatible indexes (i.e., the same types and unit measures). Elements from different streams are joined provided that the distance between them (as determined by their indexes) is within a given window size. Thus, for any streams $S_1$ and $S_2$; predicate, $p$; function, $f$; and window size, $w$:

$$S_1 \bowtie_{p,f,w} S_2 = \{f(x, y) \mid \exists i, j\,(\Delta(i, j) \le w,\ x \in S_1[i],\ y \in S_2[j]),\ p(x, y)\}$$

7. Drop ($\delta$): Drop is a special kind of filter that filters streams according to index values. Specifically, given a period of $k$ units, $\delta_k$ blocks all elements at every $k$th index value. More formally, given a positive integer, $k$:

$$\delta_k(S) = \{(i, t) \mid (i, t) \in S,\ i \mathbin{\%} k \neq 0\}$$

Appendix 2 Operator Commutativity Properties

This appendix treats conditional commutativity in more detail. Table 2 repeats the commutativity information presented in Table 1, with all the conditional cases numbered. What follows are the precise conditions for commutativity for each of the ten cases.

1. $\sigma_p(\mu_f(S)) \equiv \mu_f(\sigma_p(S))$, if the attributes referred to in $p$ are unchanged by $f$.

2. $\sigma_p(\rho(f, S_1, S_2)) \equiv \rho(f, \sigma_p(S_1), S_2)$, if
   i. $p$ is always true (i.e., $\mathrm{selectivity}(p) = 1$) on both $S_1$ and $\rho(f, S_1, S_2)$, or
   ii. $\mu_{\mathrm{index}}(S_2) \subseteq \mu_{\mathrm{index}}(\sigma_p(S_1))$.

3. $\sigma_p(\bowtie_{q,\mathrm{concat},w}(S_1, S_2)) \equiv \bowtie'_{q,\mathrm{concat},w}(S_1, S_2)$, where

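$$\bowtie'_{q,\mathrm{concat},w}(S_1, S_2) = \begin{cases}
\bowtie_{q,\mathrm{concat},w}(\sigma_p(S_1), S_2), & \text{if } p \text{ is defined on } S_1 \text{ and is not defined on } S_2 \\
\bowtie_{q,\mathrm{concat},w}(S_1, \sigma_p(S_2)), & \text{if } p \text{ is not defined on } S_1 \text{ and is defined on } S_2 \\
\bowtie_{q,\mathrm{concat},w}(\sigma_p(S_1), \sigma_p(S_2)), & \text{if } p \text{ is defined on both } S_1 \text{ and } S_2 \\
\bowtie_{q,\mathrm{concat},w}(\sigma_{p_1}(S_1), \sigma_{p_2}(S_2)), & \text{if } p \text{ is not defined on either } S_1 \text{ or } S_2, \text{ but } p = p_1 \wedge p_2, \\
 & \text{where } p_1 \text{ is defined on } S_1 \text{ and } p_2 \text{ is defined on } S_2 \\
\bowtie_{p \wedge q,\mathrm{concat},w}(S_1, S_2), & \text{otherwise}
\end{cases}$$

In the above formulation, concat, which stands for concatenation of two tuples, is given as the default combining function of the join operator. In fact, any function that keeps the attributes of the joined tuples that are also referred to by predicate $p$ could be used instead.

4. $\mu_f(\mu_g(S)) \equiv \mu_g(\mu_f(S))$, if $f \circ g = g \circ f$.

5. $\mu_f(M_{g,w}(S)) \equiv M_{g,w}(\mu_f(S))$, if for any index $i$,

$$f(g(S[i \ldots i + w - 1])) = g(f(S[i]), f(S[i + 1]), \ldots, f(S[i + w - 1]))$$

In other words, $f$ should distribute over $g$. Note that this formulation assumes that indices act as keys.

6. $\mu_g(\rho(f, S_1, S_2)) \equiv \rho(f, \mu_g(S_1), S_2)$, if
   i. $f$ introduces no new values (e.g., $f$ is a step-wise constant function), or
   ii. $g$ is the identity function, or
   iii. $\mu_{\mathrm{index}}(S_2) \subseteq \mu_{\mathrm{index}}(S_1)$.

7. $\mu_f(\bowtie_{p,\mathrm{concat},w}(S_1, S_2)) \equiv \bowtie'_{p,\mathrm{concat} \circ f,w}(S_1, S_2)$, where

$$\bowtie'_{p,\mathrm{concat} \circ f,w}(S_1, S_2) = \begin{cases}
\bowtie_{p,\mathrm{concat},w}(\mu_f(S_1), S_2), & \text{if } f \text{ is defined on } S_1 \text{ and is not defined on } S_2 \\
\bowtie_{p,\mathrm{concat},w}(S_1, \mu_f(S_2)), & \text{if } f \text{ is not defined on } S_1 \text{ and is defined on } S_2 \\
\bowtie_{p,\mathrm{concat},w}(\mu_f(S_1), \mu_f(S_2)), & \text{if } f \text{ is defined on both } S_1 \text{ and } S_2 \\
\bowtie_{p,\mathrm{concat},w}(\mu_{f_1}(S_1), \mu_{f_2}(S_2)), & \text{if } f \text{ is not defined on either } S_1 \text{ or } S_2, \text{ but } f = f_1 \circ f_2, \\
 & \text{where } f_1 \text{ is defined on } S_1 \text{ and } f_2 \text{ is defined on } S_2 \\
\bowtie_{p,\mathrm{concat} \circ f,w}(S_1, S_2), & \text{otherwise}
\end{cases}$$

and the attributes referred to in $p$ are unchanged by $f$.

8. $M_{f,w}(M_{g,w}(S)) \equiv M_{g,w}(M_{f,w}(S))$, if for any $i$,

$$f(g(S[i \ldots i + u - 1]), g(S[i + 1 \ldots i + u]), \ldots, g(S[i + w - 1 \ldots i + u + w - 2])) = g(f(S[i \ldots i + u - 1]), f(S[i + 1 \ldots i + u]), \ldots, f(S[i + w - 1 \ldots i + u + w - 2]))$$

9. $\rho(f, \rho(g, S_1, S_3), S_2) \equiv \rho(g, \rho(f, S_1, S_3), S_2)$, if $f = g$.

10. $\bowtie_{p,\mathrm{concat},w}(\bowtie_{q,\mathrm{concat},u}(S_1, S_2), S_3) \equiv \bowtie_{q,\mathrm{concat},u}(S_1, \bowtie_{p,\mathrm{concat},w}(S_2, S_3))$, if $p$ is defined on attributes of $S_2$ and $S_3$, and $q$ is defined on attributes of $S_1$ and $S_2$.

              filter   map   w-map   merge   resample   join   drop
  filter      yes
  map         1        4
  w-map       no       5     8
  merge       yes      yes   no      yes
  resample    2        6     no      no      9
  join        3        7     no      no      no         10
  drop        yes      yes   no      yes     no         no     yes

Table 2: Detailed operator commutativity table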
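As an informal illustration of the filter/map condition (case 1) and of two unconditional "yes" entries in the commutativity table, the following Python sketch models finite streams as lists of (index, tuple) pairs. It is not Aurora code: the function names, the sample data, and the choice of integer indexes with unit measure 1 are all assumptions made for the example.

```python
# Minimal sketch of four Appendix 1 operators on finite streams.
# Assumptions (not from the paper): a stream is a Python list of
# (index, tuple) pairs, indexes are integers, tuples are dicts.

def filter_(p, s):
    """sigma_p(S): keep the stream elements that satisfy predicate p."""
    return [x for x in s if p(x)]

def map_(f, s):
    """mu_f(S): apply f to every stream element."""
    return [f(x) for x in s]

def drop(k, s):
    """delta_k(S): block the elements at every k-th index value."""
    return [(i, t) for (i, t) in s if i % k != 0]

def merge(s1, s2):
    """S1 + S2: union of two compatible streams, ordered by index."""
    return sorted(s1 + s2, key=lambda x: x[0])

# Hypothetical stock-like stream.
prices = [(1, {"sym": "A", "px": 10}), (2, {"sym": "B", "px": 20}),
          (3, {"sym": "A", "px": 30}), (4, {"sym": "B", "px": 40})]
more = [(5, {"sym": "A", "px": 50}), (6, {"sym": "B", "px": 60})]

is_a = lambda x: x[1]["sym"] == "A"                            # reads "sym" only
double_px = lambda x: (x[0], {**x[1], "px": 2 * x[1]["px"]})   # changes "px" only
set_sym_a = lambda x: (x[0], {**x[1], "sym": "A"})             # overwrites "sym"

# Case 1: the predicate's attribute is untouched by the map, so they commute.
commutes = (filter_(is_a, map_(double_px, prices))
            == map_(double_px, filter_(is_a, prices)))

# The map rewrites the attribute the predicate reads, so the condition fails.
breaks = (filter_(is_a, map_(set_sym_a, prices))
          == map_(set_sym_a, filter_(is_a, prices)))

# Unconditional "yes" entries: drop/filter and merge/filter.
drop_filter = drop(2, filter_(is_a, prices)) == filter_(is_a, drop(2, prices))
merge_filter = (filter_(is_a, merge(prices, more))
                == merge(filter_(is_a, prices), filter_(is_a, more)))

print(commutes, breaks, drop_filter, merge_filter)
```

In this sketch `double_px` leaves `sym` unchanged, so filtering before or after the map yields the same stream, whereas `set_sym_a` makes every element pass the predicate after mapping and the two orders disagree. A check on one finite stream only illustrates the condition; it does not prove the equivalence.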