A Class of Linear Algorithms to Process Sets of Segments

Gonzalo Navarro    Ricardo Baeza-Yates
Department of Computer Science
University of Chile
Blanco Encalada 2120
Santiago - Chile
{gnavarro,rbaeza}@dcc.uchile.cl

Abstract

We address the problem of efficiently performing operations on sets of segments. While current solutions to operate segments focus on single operations (e.g. insertion or searching), we are interested in set-oriented operations (e.g. union, difference and others more specific to segments). In those cases, extending the current approaches leads to $O(n \log n)$ time complexity to manipulate sets of $n$ elements. We show that a wide class of operations can in fact be performed in $O(n)$ time, i.e. at a constant amortized cost per processed segment. We present the general framework and show a number of operations of that kind, depicting and analyzing the algorithms. Finally, we show some applications of this technique.

This work has been supported in part by FONDECYT grants 1940271 and 1950622.
1 Introduction

In many practical applications the problem of manipulating segments arises under different forms. Typical examples are computational geometry [14, 11, 6, 3], temporal databases [7, 2], database models with constraints [9, 8] and structured text search [13, 10].

Because of this situation, the problem of manipulating a set of segments has been extensively studied [14, 5]. All these approaches focus on "single" operations, in which a single segment is operated against a set of segments. Examples are: searching for a segment, inserting a new segment into the set, removing a segment from the set, etc. In [5], it is shown how to achieve $O(\log n)$ behavior for these operations.

However, for some applications, that type of operation is less important, while more "set-oriented" operations are needed. This is the case, for example, of set-oriented query languages for structured text or temporal databases. Examples of the operations that are commonly performed in these applications are retrieving segments including other segments (e.g. all chapters containing at least four figures, in structured text search), or all segments from a set shortly preceding a segment from another set (e.g. all musical scenes where a given actor appears that shortly precede a colorful scene, in a temporal database describing movies).

A trivial extension of the current approaches to deal with these requirements leads to $O(n \log n)$ solutions for $n$ elements. Our aim is to show that under quite general assumptions, much better solutions can be found.

We show that in many cases, a solution similar to list merging [1] can be applied to sets of segments, thus leading to $O(n)$ algorithms. Therefore, we extend this simple mechanism to deal with a more complex data structure, and analyze under which situations the idea works.

An important consideration is that we support nesting in the segments forming a set, since if no nesting is allowed, the segments are trivially linearly ordered, and it is easy to develop linear algorithms [4].

The main contribution of this paper is a simple technique to perform $O(n)$ set operations in $O(n)$ amortized time, while the cost of an individual operation is known to be $\Omega(\log n)$ in the worst case. An earlier but more detailed version of part of this work and its application to structured text retrieval can be found in [12, 13].

This paper is organized as follows. In Section 2 we explain our model of operation. In Section 3 we explain the general scheme of our solution. In Section 4 we show a number of problems for which we found linear algorithms. In Section 5 we show some complex operations for which we found more expensive solutions. In Section 6 we show some applications using these algorithms. Finally, in Section 7 we present our conclusions and future work directions.

2 Preliminaries

A segment is a pair $\langle x, y \rangle$, where $x$ and $y$ are real numbers and $x \le y$. We define $From\langle x, y \rangle = x$ and $To\langle x, y \rangle = y$. A segment $a = \langle x, y \rangle$ is said to contain another one $b = \langle x', y' \rangle$ (and we denote it as $a \supseteq b$ or $b \subseteq a$) iff $x \le x' \wedge y' \le y$. If a segment contains another one we say that they nest.

The segments are said to be disjoint iff $y < x' \vee y' < x$ (we say $a < b$ in the first case and $a > b$ in the other). If two segments do not nest and are not disjoint we say that they overlap. Finally, we use the equal sign between segments with the obvious meaning, and $a \subset b$ or $b \supset a$ to denote $a \subseteq b \wedge a \ne b$.
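The relations defined above can be made concrete with a few predicates. The following is only a minimal sketch: the `Segment` type and the function names (`contains`, `precedes`, `disjoint`, `nest`, `overlap`, `strictly_inside`) are illustrative choices and do not come from the paper.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """A segment <x, y> with x <= y (hypothetical representation)."""
    start: float  # From<x, y> = x
    end: float    # To<x, y> = y


def contains(a: Segment, b: Segment) -> bool:
    """a contains b: x <= x' and y' <= y."""
    return a.start <= b.start and b.end <= a.end


def precedes(a: Segment, b: Segment) -> bool:
    """a < b: a ends strictly before b starts."""
    return a.end < b.start


def disjoint(a: Segment, b: Segment) -> bool:
    """Either a < b or a > b."""
    return precedes(a, b) or precedes(b, a)


def nest(a: Segment, b: Segment) -> bool:
    """One of the two segments contains the other."""
    return contains(a, b) or contains(b, a)


def overlap(a: Segment, b: Segment) -> bool:
    """The segments neither nest nor are disjoint."""
    return not nest(a, b) and not disjoint(a, b)


def strictly_inside(a: Segment, b: Segment) -> bool:
    """a is contained in b and differs from it."""
    return contains(b, a) and a != b
```

For any two segments exactly one of three situations holds: they nest, they are disjoint, or they overlap. Note that under these definitions two segments that merely touch at an endpoint, such as <1, 5> and <5, 7>, count as overlapping.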
Our general problem is: given two sets of segments $A$ and $B$, obtain a new set $C$ by processing $A$ and $B$ in some way, in $O(|A| + |B|)$ time.

To achieve that goal, we impose two further restrictions, one on the sets and one on the processing operations:

- Each set must form a hierarchy, i.e. no overlaps are allowed between any two segments of a given set. This is because the only way we could obtain linear algorithms for sets with overlapping segments was by preventing nesting inside the set. In this case the segments can be trivially ordered by their first or last extreme, and the normal list merging algorithms work [4]. However, in some applications (e.g. structured text search) nesting can be more important than overlapping, so we are interested in providing nesting.

  Observe that we do not need the union of both operands to form a hierarchy, as long as the result does. That is, our algorithms also work if segments from different arguments overlap.

- We are interested in set-oriented operations, to which we impose the additional restriction of operating with proximal segments. Proximality means that the presence or absence of a given segment in the final result must be defined in terms of relatively close segments in the arguments. This is because we plan to traverse the arguments in synchronization to produce the results.

Our approach to the solution can be defined in general terms as follows:

- We use a tree data structure to arrange the segments of each set. Since there are no overlaps, we define the tree by the containment relations between segments.

- To obtain the solution set (which is also arranged in a tree), we are going to traverse both operand trees simultaneously, in a synchronized way, while we generate the solution tree at the same time. The idea is to generalize the list merging algorithms, to make them operate on trees and select elements under more complex criteria. To be able to generate the whole solution by traversing the operands just once, the assumption of proximal segments is necessary. All the algorithms consist of variations of this idea.

- We are not interested in how the sets are built in the first place, or in how they are finally used. Our scheme is not so efficient to build the trees by consecutive random insertions, but in the applications we are interested in, those problems are solved in an ad-hoc way (we show an example in Section 6).

3 A Solution Scheme

In what follows, we describe in more detail our data structure and algorithmic scheme.

3.1 Data Structure

As said, we arrange the set of segments in a tree. The criterion to define the tree is straightforward: a segment $a$ descends from another one $b$ in the tree iff $a \subset b$. Although for clarity we do not allow repetitions, the algorithms are easily modified to account for this.
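To make the containment criterion concrete, the sketch below arranges a set of non-overlapping segments into such a tree. The paper deliberately leaves construction out of scope, so this is only an assumed helper: the name `build_forest`, the `(segment, children)` node layout and the sort-based approach (which costs $O(n \log n)$) are all illustrative choices.

```python
def build_forest(segments):
    """Arrange non-overlapping (start, end) pairs into a forest.

    Each node is a pair (segment, children); children and top-level
    nodes are kept in left-to-right order. Raises ValueError if two
    segments overlap, since the set would not form a hierarchy.
    """
    forest = []       # top-level nodes
    open_nodes = []   # chain of nodes that may still contain later segments
    # Sorting by start, then by decreasing end, visits containers first.
    for seg in sorted(set(segments), key=lambda s: (s[0], -s[1])):
        while open_nodes:
            top = open_nodes[-1][0]
            if top[0] <= seg[0] and seg[1] <= top[1]:
                break                      # top contains seg
            if seg[0] <= top[1]:
                raise ValueError(f"{top} and {seg} overlap")
            open_nodes.pop()               # top lies entirely before seg
        node = (seg, [])
        (open_nodes[-1][1] if open_nodes else forest).append(node)
        open_nodes.append(node)
    return forest
```

For example, the set {<1,10>, <2,3>, <4,8>, <5,6>, <12,15>} yields two top-level nodes, <1,10> (with children <2,3> and <4,8>, the latter containing <5,6>) and <12,15>.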
A formal definition of our type Tree follows:

    $Segm = \{ \langle x, y \rangle \,/\, x, y \in R \wedge x \le y \}$
    $Subtree = Segm \times Tree$
    $Tree = Subtree^*$
    $\forall (s, ((s_1, t_1), \ldots, (s_k, t_k))) \in Subtree, \ \forall i \in 1..k, \ s \supset s_i$
    $\forall ((s_1, t_1), \ldots, (s_k, t_k)) \in Tree, \ \forall i \in 1..k-1, \ s_i < s_{i+1}$

where $R$ is the set of real numbers. As can be seen, the root of our tree does not have an associated segment. Our trees can in fact be seen as forests with order among their trees.

We define some functions to access this tree type:

- $node : Subtree \to Segm$ and $subtree : Subtree \to Tree$ are selectors, i.e. if $S = (s, ((s_1, t_1), \ldots, (s_k, t_k))) \in Subtree$, then $node(S) = s$ and $subtree(S) = ((s_1, t_1), \ldots, (s_k, t_k))$.

- $head : Subtree^+ \to Subtree$ returns the first element of the list, i.e. $head(\{l_1, \ldots, l_k\}) = l_1$.

- $tail : Subtree^+ \to Subtree^*$ eliminates the first element, i.e. $tail(\{l_1, \ldots, l_k\}) = \{l_2, \ldots, l_k\}$.

- $\varepsilon \in Tree$ denotes an empty tree.

Although we do not consider any particular representation, the data structure for our trees must allow efficient computation of these access functions.

3.2 Algorithmic Scheme

The general form of our algorithms consists of traversing both trees, normally in pre or postorder. At each step, the current nodes of both trees are compared, and depending on the results and the operation, we move in one or both trees, going to the next node or jumping directly to the next sibling (thus skipping the current subtree). Each particular case is a variation of this general idea. We show a number of examples in the next section.

We describe the algorithms by marking the nodes that must be selected or rejected from the final solution (almost all useful operations select elements from only one operand). This marking can be a simple boolean mark or it may have a more complex meaning. This provides a good abstraction, since we do not detail how we collect marked or delete unmarked nodes. Moreover, the marking algorithm can be used for two complementary operations, depending on whether we select or reject the marked nodes. Some algorithms can be solved without marking, though.

We describe now an important abstraction related to the way we traverse the trees. Unlike list traversal, which is trivial, we use two types of traversal operations here, one to "go right" and another to "go down" in the tree. Since several trees are traversed at the same time and it is wasteful to store pointers to parents, we keep explicit stacks to implement these two traversals.

- At the beginning of each algorithm, we initialize an empty stack for each argument. This stack holds pointers to trees, and is referred to as <argument>.stack. We use the normal operations Push, Pop, Empty and Top on the stacks.

- After initializing the stack, if the tree is not empty, we push the first top-level node of the tree into the stack.
- The algorithms access only the top of the stacks and terminate when a stack is empty. Argument tree names are uppercase letters. Their lowercase version denotes the top nodes of their corresponding stacks, e.g. $p = head(Top(P.stack))$. If $P.stack$ is empty, $p$ is assumed to be $\langle \infty, \infty \rangle$ (where $\infty = \infty$, $\infty > x \ \forall x \in R$, and $\infty + x = \infty \ \forall x \in R$).

- To go down in the tree, we test whether the current top of the stack has children or not. If it does, we push its first child into the stack. If it does not, we perform a "go right" operation. In fact, going down means "process first the children and then the siblings".

- To go right, we replace the current top of the stack by its next sibling. If there is no next sibling, we pop the current top and retry the operation with the parent. We eventually empty the stack in this way.

We use a pseudocode notation for our algorithms. We include a case-like instruction (a big left brace). Figure 1 shows the basic algorithms for tree traversal.

    Init(P)
        Empty(P.stack).
        If ($P \ne \varepsilon$) Push(P.stack, P).

    Down(P)
        If ($subtree(p) \ne \varepsilon$) Push(P.stack, subtree(p))
        else Right(P).

    Right(P)
        While ($\neg$Empty(P.stack) $\wedge$ tail(Top(P.stack)) $= \varepsilon$) Pop(P.stack).
        If ($\neg$Empty(P.stack)) Top(P.stack) $\leftarrow$ tail(Top(P.stack)).

Figure 1: Basic operations for tree traversal.

We are now in position to describe a number of example algorithms. Their description is very simple with these primitives.

We use the following numbers in the analysis: $n_X$ is the size of the set corresponding to operand $X$, $h_X$ is the height of its tree (in the worst case it can be $n_X$) and $d_X$ is the maximum degree of its tree (it can also be $n_X$ in a flat tree). We also use $n$, $d$ and $h$ as the maximum corresponding value between all operands (there are normally two operands).

Two observations about the analysis:

- It should be clear that either collecting marked or deleting unmarked nodes takes $O(n)$ time, where $n$ is the number of nodes of the tree.

- Although a particular tree traversal operation can take up to $O(h)$ time, we note that the whole traversal, even by using Down, is $O(n)$. This is because no edge of the tree is traversed more than twice. So, the amortized cost of tree traversals is always linear in the size of the tree.
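The traversal primitives of Figure 1 can be sketched as a small cursor class. This is an illustration under assumptions, not the authors' implementation: a tree is taken to be a Python list of `(segment, children)` pairs, a "pointer to a tree" is represented as a `(sibling_list, index)` pair, and the names `Cursor`, `current` and `preorder` are hypothetical.

```python
import math

INF = (math.inf, math.inf)  # stands for <inf, inf>, the exhausted operand


class Cursor:
    """Explicit-stack cursor over a forest of (segment, children) nodes."""

    def __init__(self, forest):
        # Init(P): start with an empty stack, push the tree if non-empty.
        self.stack = []
        if forest:
            self.stack.append((forest, 0))

    @property
    def current(self):
        # p = head(Top(P.stack)), or <inf, inf> when the stack is empty.
        if not self.stack:
            return INF
        siblings, i = self.stack[-1]
        return siblings[i][0]

    def right(self):
        # Right(P): pop levels with no next sibling, then advance one step.
        while self.stack and self.stack[-1][1] + 1 == len(self.stack[-1][0]):
            self.stack.pop()
        if self.stack:
            siblings, i = self.stack.pop()
            self.stack.append((siblings, i + 1))

    def down(self):
        # Down(P): enter the children if there are any, else go right.
        siblings, i = self.stack[-1]
        children = siblings[i][1]
        if children:
            self.stack.append((children, 0))
        else:
            self.right()


def preorder(forest):
    """List all segments in preorder using only down()."""
    cursor, visited = Cursor(forest), []
    while cursor.stack:
        visited.append(cursor.current)
        cursor.down()
    return visited
```

Calling `down()` repeatedly visits every node once in preorder, which matches the observation that a whole traversal is linear even though a single `right()` may pop several stack levels.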
4 Linear Operations

There are a number of interesting problems that admit a linear implementation, for example: set manipulation, segments including or included in others, segments after or before others, etc. In this section we explain in detail a couple of them and their analysis.

4.1 Set Difference and Intersection

Set difference and intersection are complementary versions of a single marking algorithm. The idea is that the trees are initially unmarked, and we traverse both trees in synchronization, marking all nodes of the first argument that are also in the second one. Thus, we later implement set difference by collecting unmarked nodes and set intersection by collecting marked nodes.

Figure 2 shows the algorithm to mark the tree. P and Q are the arguments.

    Init(P). Init(Q).
    While $\max(To(p), To(q)) < \infty$
        $p < q$       : Right(P).
        $p > q$       : Right(Q).
        $p = q$       : Mark p. Down(P).
        $p \subset q$ : Down(Q).
        else          : Down(P).

Figure 2: Marking algorithm for set difference or intersection.

This algorithm is linear, since a single traversal is done on each argument (that traversal has already been shown to be linear), and we do $O(1)$ work at each step. Therefore, we have $O(n_P + n_Q) = O(n)$ time. It is also $\Omega(n)$ in the worst case.

4.2 Segments Included in Others

Another interesting operation is In(P, Q), which selects elements from P that are included in a segment of Q.

The algorithm to solve In(P, Q) is presented in Figure 3. We traverse the top levels of P and Q in synchronism. When a node of P is included in a top-level node of Q, that subtree of P is marked. Otherwise we discard that P node and continue with its children.

To avoid too much marking overhead, we state that a mark in a node means that not only that node but also its whole subtree is considered marked. The collection algorithm must be modified to account for this.

This algorithm is $O(n_P + n_Q) = O(n)$, by the same arguments as before. In this case, we can refine the analysis as follows: each Down(P) happens because $p$ contains an element of Q or because it overlaps with an element of Q, thus each extreme of each segment of (the top level of) Q is compared, at most, with a complete path of P (of length $h_P$). That is because once we descend the first level in P, the relevant list from the top level of Q has only one element (the original $q$). At each level of this path, the operation can take us $d_P$ comparisons, thus the cost is $O(d_Q h_P d_P) = O(d^2 h)$.

That means, for example, that in an application with constant $d$ and balanced trees the operation takes $O(\log n)$ time (at least to do the marking).
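The two marking algorithms of this section (Figures 2 and 3) can be sketched together as follows. This is only an illustration under assumptions: trees are lists of `(segment, children)` pairs with segments as `(start, end)` tuples, marks are returned as Python collections instead of being stored in the nodes, and the names `Cursor`, `mark_common` and `mark_included` are hypothetical.

```python
import math

INF = (math.inf, math.inf)  # stands for <inf, inf>


class Cursor:
    """Explicit-stack cursor over a forest of (segment, children) nodes."""

    def __init__(self, forest):
        self.stack = [(forest, 0)] if forest else []

    @property
    def current(self):
        if not self.stack:
            return INF
        siblings, i = self.stack[-1]
        return siblings[i][0]

    def right(self):
        while self.stack and self.stack[-1][1] + 1 == len(self.stack[-1][0]):
            self.stack.pop()
        if self.stack:
            siblings, i = self.stack.pop()
            self.stack.append((siblings, i + 1))

    def down(self):
        siblings, i = self.stack[-1]
        if siblings[i][1]:
            self.stack.append((siblings[i][1], 0))
        else:
            self.right()


def before(a, b):
    """a < b: a ends strictly before b starts."""
    return a[1] < b[0]


def inside(a, b):
    """a is contained in b (possibly equal to it)."""
    return b[0] <= a[0] and a[1] <= b[1]


def mark_common(P, Q):
    """Marking of Figure 2: the segments of P that also occur in Q."""
    p, q, marked = Cursor(P), Cursor(Q), set()
    while p.stack and q.stack:          # i.e. max(To(p), To(q)) < inf
        a, b = p.current, q.current
        if before(a, b):
            p.right()
        elif before(b, a):
            q.right()
        elif a == b:
            marked.add(a)
            p.down()
        elif inside(a, b):              # p strictly inside q
            q.down()
        else:                           # p contains q, or they overlap
            p.down()
    return marked


def mark_included(P, Q):
    """Marking of Figure 3: roots of the P subtrees included in Q.

    A returned segment stands for its whole subtree in P.
    """
    p, q, marked = Cursor(P), Cursor(Q), []
    while p.stack and q.stack:
        a, b = p.current, q.current
        if before(a, b):
            p.right()
        elif before(b, a):
            q.right()
        elif inside(a, b):
            marked.append(a)
            p.right()
        else:
            p.down()
    return marked
```

With `mark_common`, the intersection is the marked set and the difference is every node of P outside it. `mark_included` never calls `down()` on Q, which reflects that only the top level of Q is traversed.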
    Init(P). Init(Q).
    While $\max(To(p), To(q)) < \infty$
        $p < q$         : Right(P).
        $p > q$         : Right(Q).
        $p \subseteq q$ : Mark p. Right(P).
        else            : Down(P).

Figure 3: Marking algorithm for segments included in others.

Combining both bounds, the complexity of this operator is $O(\min(n_P + n_Q, \, d_Q h_P d_P)) = O(\min(n, d^2 h))$.

5 Complex Operations

Although most operations can be implemented in linear time, there are more complex operations that seem not to have a linear implementation: segments included in others with positions, segments including $k$ segments, and complex versions of "after" and "before" are some examples. We explain in detail the first one.

5.1 Segments Included in Others with Positions

An operation that happens to be useful for applications is to select segments from P that are included in some segment $q$ of Q at a given position. The position is defined in terms of the other segments of P that are included under the same $q$ (e.g. the third section of each chapter). Since there may be segments included in $q$ that nest in P (e.g. sections inside sections), we define "position" by only considering the maximal segments of P included in $q$, for each $q$. If a segment of P overlaps with $q$, the segment and its descendants in P are not considered (see Figure 4). Other criteria produce slightly different algorithms.

Figure 4: Criterion to participate in inclusion with positions. Ellipses indicate selected segments. (Diagram labels: q, P.)

Another concern is which language we will use to denote the allowed positions of an included node. We can use, e.g. first, $k$-th, last, last$-k$, prime positions, etc. For this algorithm we use a generic predicate to avoid any restriction: $s$ is an expression denoting the allowed positions.

The algorithm requires simple boolean marking. We traverse both trees in synchronization. When we find a set of nodes of P included in one of Q, we mark the $s$-th nodes, and then we pass again over the included P-nodes, this time comparing them with the subtree of the Q-node. If, instead, the node of P includes one of Q, we follow the children of the P-node. Figure 5 shows the algorithm.

    Init(P). Init(Q).
    While $\max(To(p), To(q)) < \infty$
        $p < q$         : Right(P).
        $p > q$         : Right(Q).
        $p \subseteq q$ : $lp \leftarrow$ Top(P.stack). $pos \leftarrow 1$.
                          While $head(lp) \subseteq q$
                              If ($pos \in s$) Mark head(lp).
                              $lp \leftarrow tail(lp)$. $pos \leftarrow pos + 1$.
                          Down(Q).
        else            : Down(P).

Figure 5: Marking algorithm for segments included in others at a given position.

To analyze this algorithm, consider that we can traverse both P and Q completely, and that the final collection of nodes is $O(n_P)$. Observe that each element of Q is deleted from the problem by doing at most $O(d_P)$ work (when $p \subseteq q$), thus the algorithm is $O(n_P + n_Q d_P)$. But also observe that each element of P can be worked on by at most a complete path of Q, thus the algorithm is also $O(n_Q + n_P h_Q)$. Then, the algorithm has the best of both complexities, namely $O(\min(n_P + n_Q d_P, \, n_Q + n_P h_Q)) = O(n \min(d, h))$.

To see that it is also $\Omega(n^2)$, consider the following example: $In(\{\langle 1,1 \rangle, \langle 2,2 \rangle, \ldots, \langle n,n \rangle\}, \{\langle 1,n \rangle, \langle 2,n \rangle, \ldots, \langle n,n \rangle\}, s)$.

This non-linear complexity may be surprising, since the requirement of proximality seems to be met. But observe that in order to determine whether a P node is to be marked or not, we have to consider many nodes of Q, not just one, and there seems to be no better way to do that. If the language of positions ($s$) is restricted we can obtain better complexities, e.g. if we only allow ranges to be expressed, we can implement it in $O(n \min(h, k \log d))$ where $k$ is the number of ranges we have.

6 Some Applications

Our technique applies to a number of dissimilar applications. In this section we briefly discuss a temporal database and explain in detail a structured text search application. Our aim is not only to expose real situations where the problem arises, but also to show the practical performance of our algorithms.

6.1 Temporal Databases

Temporal databases [7, 2] can manipulate information with temporal validity. Their data is based on events that occur at a given point or interval in time. If we have a set-oriented query language we will be interested in questions such as "give me all the events (and their times) that satisfy some constraint" (an example is given in the introduction).
In fact, temporal databases are much more complex, since they also deal with uncertainty in time (e.g. A happened before B, without knowing when). We just want to show that if a database has to manage a number of quantitative facts (i.e. events that are known to have happened at known times), it may be interesting for its back-end to perform set-oriented operations on these facts. These algorithms would be a part of a more general inference engine, being an efficient way to select relevant facts to work on.

In most temporal databases, the time intervals can overlap and nest. As explained earlier, we cannot handle results (final or intermediate) having segment overlapping. Therefore, this is a restriction we impose on the applicability of our model to temporal databases.

This does not mean that there cannot be overlaps in the database. We can divide the knowledge into a set of sequential processes, and subsets of different processes can be combined into operations, although each answer and intermediate result is a subset of some sequential process. Those sequential processes can be hierarchically structured.

6.2 Structured Text Search

Modern textual databases are a good example of hierarchical structuring. We describe here a specific model to query structured text databases, called Proximal Nodes [13, 12]. The development of this model has motivated this work on efficient algorithms to evaluate queries, although some operations differ slightly from those exposed here.

In this model, a textual database is seen as a text (a sequence of symbols) plus a number of independent hierarchical structures built on the text. Each hierarchy is strict, although there may be overlaps between different hierarchies. The nodes of the hierarchies are the structural components of the database (e.g. chapters, pages, etc.). Examples of hierarchical views of the text are: chapter/section/paragraph and fascicle/page/line. In this case, it makes full sense for answers to be a subset of a given hierarchy.

The leaves of the query syntax trees are names of structural components and text pattern-matching expressions. Each structural component (e.g. chapters, figures, pages) is preindexed, so the set of all elements of a given type can be retrieved in a time proportional to the size of the retrieved set. Pattern-matching uses another index to return a list of segments of the text that matched the pattern. These lists are considered to be part of a special hierarchy. This is how the sets are built in the first place. All internal nodes of query syntax trees correspond to operations between segments of the operands, and they are further restricted to operate only on proximal segments.

It is shown that the query language obtained under these restrictions has good expressivity, while the implementation (using the algorithms we expose here) is very efficient. The model constitutes a good trade-off between the models which achieve high efficiency by strongly restricting the structuring possibilities and the query language of the database and those which have rich structuring and querying facilities but a high execution overhead.

Some examples of queries are:

- I want text paragraphs in italics which are before (but in the same page) a figure that says something about the earth.

- Give me all references to Knuth's books in chapters 2-4.

- I want a paragraph preceding another paragraph where the word "Computer" appears before (at 10 symbols or less) the word "Science". Both paragraphs must be in the same page.
- I want all sections with mathematical formulas that are not appendices.

We also implemented a lazy version of the algorithms, which only processes the nodes that are needed for the final result (thus saving a lot of work when processing the intermediate results). This mechanism can also be used to implement a navigational interface, in which the user sees the top level of the result tree and asks to expand only some nodes. In this case, we save the work of computing nodes that are not to be seen. Although the worst-case complexity of the lazy operations is worse than that of "full" operations, the real times are better, since fewer nodes are processed in practice.

A prototype has been implemented for this model, to test the average performance of the algorithms. We used a database of C programs and LaTeX documents, whose structuring is quite different. We tested each operation on different operands of sizes (i.e. number of nodes) ranging from 100 to 10000. We also tested a number of more complex queries, to compare the full and lazy versions on real queries.

Figure 6 shows a typical example of the times of an algorithm (i.e. a single operation). They correspond to a Sun SparcClassic of approximately SpecMark 26 and 16 Mb of RAM. From the tests we extract the following conclusions (they may differ for other applications):

Figure 6: Typical times for equal-sized operands. Observe that we use a logarithmic scale. (Axes: operand size, 100 to 10000; seconds, 0.01 to 1.0. Curves: full, lazy.)

- The full versions of the algorithms are all linear in practice, since the situations under which they are not are very unlikely to occur in structured text databases (e.g. a very deep tree).

- The full versions have very low variance, their times being highly predictable, proportional to the sum of the sizes of all intermediate results. The constant for our machine is approximately 50,000 nodes processed per second per operator.
- The lazy version is normally better than the full one in practice, despite the worse complexities, especially for complex queries. This is because fewer nodes are expanded. Lazy algorithms have very large variance, though.

7 Conclusions and Future Work

We analyzed the problem of manipulating a set of segments when we are interested in set-oriented operations. Classical solutions provide simple operations with $O(\log n)$ complexity, which leads to $O(n \log n)$ solutions for set-oriented operations.

However, we show that, under some general assumptions, set-oriented operations can be performed in linear time. These assumptions are: the operands and the result must not have overlapping segments inside, and the operations must work on proximal segments.

We developed a framework oriented to tree traversal that generalizes the idea of list traversal for merge-like operations, and applied the framework to solve in linear time a number of set-oriented example operations on segments. The aim was to show that the framework is flexible enough to be adapted to a number of apparently dissimilar problems. This technique works well for set-oriented operations because of the amortized cost, i.e. its performance for single operations is not good.

Finally, we presented some applications to which this technique could be applied. We also include average times of a real implementation of the algorithms.

There are a number of future work directions related to this work. The most important are:

- Find a technique that keeps these good results while improving the performance for single operations.

- Extend these ideas to allow overlaps into a set, since this will open a wealth of new possibilities to apply this technique.

- Extend the framework to account for more dimensions, e.g. manipulating hypercubes instead of just one-dimensional segments. This may open a number of opportunities of applications in computational geometry and in map processing (e.g. in geographic information systems).

- Study a disk-based implementation, trying to minimize seek times. Some work in this direction has already been done.

- Search for more operations that can be implemented in linear time by using this framework, especially by looking at applications that can benefit from this work.

- Study a parallel implementation, since our algorithms seem to be highly parallelizable.

Acknowledgments

We thank Claudia Medeiros and Nina Edelweiss for their help on temporal databases. We also thank the referees for their helpful comments.

References

[1] A. Aho, R. Sethi, and J. Ullman. Compilers: Principles, Techniques and Tools. Addison-Wesley, 1986.
[2] J. Allen, J. Hendler, and A. Tate, editors. Readings in Planning. Morgan Kaufmann, 1990.

[3] J. Bentley. Algorithms for Klee's rectangle problems. Dept. of Computer Science, Carnegie-Mellon Univ. Unpublished notes, 1977.

[4] C. Clarke, G. Cormack, and F. Burkowski. Schema-independent retrieval from heterogeneous structured text. In Procs. of the 4th Annual Symposium on Document Analysis and Information Retrieval, Apr. 1995.

[5] T. Cormen, C. Leiserson, and R. Rivest. Introduction to Algorithms. The MIT Press, 1990.

[6] H. Edelsbrunner. Dynamic data structures for orthogonal intersection. Technical Report F59, Tech. Univ. Graz, Institute für Informationsverarbeitung, 1980.

[7] A. T. et al. Temporal Databases: Theory, Design and Implementation. Benjamin Cummings, 1993.

[8] P. Kanellakis and D. Goldin. Constraint programming and database query languages. Technical Report CS-94-31, Dept. of Computer Science, Brown University, June 1994.

[9] P. Kanellakis, S. Ramaswamy, D. Vengroff, and J. Vitter. Indexing for data models with constraints and classes. Technical Report CS-93-21, Dept. of Computer Science, Brown University, May 1993.

[10] A. Loeffen. Text databases: A survey of text models and systems. ACM SIGMOD Conference. ACM SIGMOD Record, 23(1):97-106, Mar. 1994.

[11] E. McCreight. Priority search trees. Technical Report CSL-81-5, Xerox PARC, 1981.

[12] G. Navarro. A language for queries on structure and contents of textual databases. Master's thesis, Dept. of Computer Science, Univ. of Chile, Apr. 1995. ftp://sunsite.dcc.uchile.cl/pub/users/gnavarro/thesis95.ps.gz.

[13] G. Navarro and R. Baeza-Yates. A language for queries on structure and contents of textual databases. In Proc. ACM SIGIR'95, pages 93-101, 1995. ftp://sunsite.dcc.uchile.cl/pub/users/gnavarro/sigir95.ps.gz.

[14] F. Preparata and M. Shamos. Computational Geometry. Springer-Verlag, 2nd edition, 1988.