Aggregate Functions, Conservative Extension, and Linear Orders

Leonid Libkin    Limsoon Wong

Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104-6389, USA

1 Summary

Practical database query languages are usually equipped with some aggregate functions. For example, "find mean of column" can be expressed in SQL. However, the manner in which aggregate functions were introduced in these query languages leaves something to be desired. Breazu-Tannen, Buneman, and Wong [3] introduced a nested relational language NRC(=) based on monads [16, 24] and structural recursion [1, 2]. It was shown in Wong [27] that this language is equivalent to the nested relational algebras of Thomas and Fischer [22], Schek and Scholl [20], and Colby [4]. NRC(=) enjoys certain advantages over these languages: it is naturally embedded in functional languages, it is readily extensible, and it has a compact equational theory. Therefore, it is used in this report as a basis for investigating aggregate functions.

In section 2, the nested relational calculus NRC(=) is described. It is then endowed with rational numbers, rational arithmetic, and a summation operator. The augmented language, NRC(Q, +, ·, −, ÷, Σ, =), is able to express a variety of aggregate functions commonly found in real database query languages. The main results of this paper remain valid in a uniform way if any summation-like primitive, such as bounded product, is added to the language. This approach is more disciplined and general than those proposed by Klug [12], Ozsoyoglu, Ozsoyoglu, and Matos [18], and Klausner and Goodman [11].

In section 3, we prove that every function f : s → t expressible in NRC(Q, +, ·, −, ÷, Σ, =) can be computed without using any intermediate data whose depth of nesting of sets exceeds that of the input and output. This is known as the conservative extension property. Conservativity of nested relational query languages in the absence of aggregate functions was studied by Paredaens and Van Gucht [19] and Wong [26]. The former proved that it holds when input and output are flat relations. The latter generalized it to any input and output. Conservativity in the presence of aggregate functions was not previously studied.

In section 4, the conservative extension property is used to demonstrate the somewhat surprising fact that NRC(Q, +, ·, −, ÷, Σ, =) cannot express the usual linear ordering on rational numbers. As linear orders play a central role in fundamental data organization algorithms [14], this calls for special attention. We present a technique for lifting linear order at base types to linear order at all types. This technique yields linear orders that are expressible in NRC(Q, +, ·, −, ÷, Σ, =, ≤), which is the language obtained by augmenting NRC(Q, +, ·, −, ÷, Σ, =) with linear orders at base types. Linear order is known to increase expressive power in the context of database query languages [8, 23]. In our case, this is a major advantage. Queries such as "find maximum of column," "find mode of column" and "test parity of cardinality of a set" are expressible in NRC(Q, +, ·, −, ÷, Σ, =, ≤). More importantly, a function that assigns rank to elements of a set is now expressible.

This rank assignment function is used in section 5 to show that NRC(Q, +, ·, −, ÷, Σ, =, ≤) augmented with any combination of the transitive closure operator tc, the bounded fixpoint operator bfix, or the powerset operator powerset retains the conservative extension property. Hull and Su [7] showed that NRC(=, powerset) is not conservative over flat input and output. This failure of conservativity for NRC(=, powerset) was generalized to all input and output heights by Grumbach and Vianu [6]. In contrast, our result shows that conservativity can be repaired with very little extra. Suciu [21] showed that NRC(=, bfix) is conservative over flat relations. His result is remarkable in that it did not need any arithmetic nor order. Furthermore, it is also valid when bounded fixpoint is replaced by bounded partial fixpoint operator. Our result uses arithmetic but holds for bounded fixpoint operator over any input and output. In fact, our proof of conservative extension holds uniformly for NRC(Q, +, ·, −, ÷, Σ, =, ≤, ∏, ε, ⊙) where ∏, ε, and ⊙ are any triple of additional primitives which are in a relationship like that between Σ, 0, and +.

2 Nested relational calculus with summation

The monad calculus of Breazu-Tannen, Buneman, and Wong [3] is denoted NRC here. In this section, it is extended with rational numbers, simple arithmetics, and a summation operator. The extended language is able to express many aggregate functions commonly found in commercial relational database query languages such as SQL.
A type in NRC is either a complex object type or is a function type s → t where s and t are complex object types. The complex object types are given by the grammar:

    s, t ::= b | B | unit | s × t | {s}

Objects of type B are the two boolean values true and false. The unique object of type unit is denoted by (). Objects of type s × t are pairs whose first components are objects of type s and second components are objects of type t. Objects of type {s} are finite sets of objects of type s. We also include some uninterpreted base types b.

Expressions of NRC are constructed using the rules in the figure below. Note that [3] uses ext(λx^s.e1)(e2); but here we use the equivalent construct ⋃{e1 | x^s ∈ e2} instead. The language also contains some uninterpreted constants c of base type Type(c) and uninterpreted functions p of function type Type(p). The type superscripts are omitted in the rest of the paper because they can be inferred [17, 10]. Throughout this paper we assume the usual convention that variables are distinct and that expressions are well formed.

Lambda calculus and products

    x^s : s
    e : t  ⟹  λx^s.e : s → t
    e1 : s → t,  e2 : s  ⟹  e1 e2 : t
    () : unit
    e : s × t  ⟹  π1 e : s  and  π2 e : t
    e1 : s,  e2 : t  ⟹  (e1, e2) : s × t

Set monad

    {}^s : {s}
    e : s  ⟹  {e} : {s}
    e1 : {s},  e2 : {s}  ⟹  e1 ∪ e2 : {s}
    e1 : {t},  e2 : {s}  ⟹  ⋃{e1 | x^s ∈ e2} : {t}

Booleans

    true : B
    false : B
    e1 : B,  e2 : t,  e3 : t  ⟹  if e1 then e2 else e3 : t

The semantics of NRC was described in [3]. The lambda calculus, product, and boolean constructs are standard. We briefly repeat the meaning of the monad constructs here. {} is the empty set. {e} is the singleton set containing e. e1 ∪ e2 is the union of sets e1 and e2. The construct ⋃{e1 | x ∈ e2} denotes the set obtained by first applying the function λx.e1 to elements of the set e2 and then taking their big union. Hence ⋃{e1 | x ∈ e2} = f(o1) ∪ ... ∪ f(on), where f is the function λx.e1 and {o1, ..., on} is the set e2. The shorthand {o1, ..., on} is used to denote {o1} ∪ ... ∪ {on}. It must be stressed that the x ∈ e2 part in the construct ⋃{e1 | x ∈ e2} is not a membership test; it is the introduction of a new variable x whose scope is the subexpression e1.

As it stands, NRC can merely express queries that are purely structural. It was shown in [3] that endowing NRC with equality test =_s : s × s → B at all types s elevates NRC to a fully fledged nested relational language (which was shown by Wong [27] to be equivalent to classical nested relational algebras of Thomas and Fischer [22], Schek and Scholl [20], and Colby [4]). That is, operations such as nest, membership test, subset test, set intersection, set difference, etc. are expressible in NRC(=). (We write the additional primitive in brackets to distinguish various extensions of the language.) It should also be remarked that in [3], booleans are simulated by values of type {unit} with {()} for true and {} for false. However, over the class of functions of type s → {s1} × ... × {sn}, it does not matter which presentation of booleans is used; the resulting languages have the same expressive power.
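To make the meaning of the monad constructs concrete, here is a minimal Python sketch, not part of NRC itself. It assumes finite sets are modelled as frozensets and NRC functions as Python callables; the names big_union, singleton, member and intersect are hypothetical helpers chosen for this illustration. It shows how a membership test and set intersection can be built from big union and equality alone, with booleans simulated by {()} and {} as in [3].

```python
# Illustrative model only: finite sets are frozensets, functions are callables.
EMPTY = frozenset()


def singleton(x):
    """{e}: the singleton set containing x."""
    return frozenset([x])


def big_union(f, s):
    """The construct U{ f(x) | x in s }: apply f to each element, union the results."""
    result = EMPTY
    for x in s:
        result = result | f(x)
    return result


def member(x, s):
    """Membership test built from equality only.

    Booleans are simulated as in [3]: {()} stands for true, {} for false.
    """
    truth = big_union(lambda y: singleton(()) if x == y else EMPTY, s)
    return truth == singleton(())


def intersect(a, b):
    """Set intersection expressed with big_union and the membership test."""
    return big_union(lambda x: singleton(x) if member(x, b) else EMPTY, a)


common = intersect(frozenset({1, 2, 3}), frozenset({2, 3, 4}))
```

Note that the variable bound by big_union plays the role of x in ⋃{e1 | x ∈ e2}: it is a binder, not a test, and membership has to be programmed separately as above.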
Examples. ⋃{{x, 5·x} | x ∈ {1, 2, 3}} evaluates to the set {1, 2, 3, 5, 10, 15}. ⋃{⋃{{(x, y)} | x ∈ X} | y ∈ Y} forms the cartesian product of sets X and Y. ⋃{⋃{{(π1 x, y)} | y ∈ π2 x} | x ∈ X} is the unnesting of the set X. ⋃{{(π1 x, ⋃{if π1 x = π1 y then {π2 y} else {} | y ∈ X})} | x ∈ X} is the relational nesting of X.

Real database query languages frequently have to deal with queries such as "select average from column," "select maximum of column," "select count from column," etc. To handle this kind of queries, additional primitives must be added to NRC. In this paper, we add rational numbers (whose type is denoted by Q) and the following constructs:

    e1 : Q,  e2 : Q  ⟹  e1 + e2 : Q
    e1 : Q,  e2 : Q  ⟹  e1 · e2 : Q
    e1 : Q,  e2 : Q  ⟹  e1 − e2 : Q
    e1 : Q,  e2 : Q  ⟹  e1 ÷ e2 : Q
    e1 : Q,  e2 : {s}  ⟹  Σ{| e1 | x^s ∈ e2 |} : Q

where +, ·, −, and ÷ are respectively addition, multiplication, subtraction, and division of rational numbers. The summation construct Σ{| e1 | x^s ∈ e2 |} denotes the rational obtained by first applying the function λx.e1 to every item in the set e2 and then adding the results up. That is, Σ{| e1 | x ∈ X |} is f(o1) + ... + f(on) if f is the function denoted by λx.e1 and {o1, ..., on}, with o1, ..., on all distinct, is the set denoted by X. It should be emphasized that the {| e1 | x ∈ e2 |} part of the construct Σ{| e1 | x ∈ e2 |} is not an expression of the language; hence Σ{| 1 | x ∈ {5, 6} |} is 2 and not 1.

The extended language NRC(Q, +, ·, −, ÷, Σ, =) is capable of expressing many aggregate operations found in commercial databases. Here are some examples:

• "Count the number of records in R" is count(R) ≜ Σ{| 1 | x ∈ R |}.

• "Total the first column of R" is total(R) ≜ Σ{| π1 x | x ∈ R |}.

• "Average of the first column in R" is average(R) ≜ total(R) ÷ count(R).

• "Variance of the first column of R" is variance(R) ≜ (Σ{| sq(π1 x) | x ∈ R |} − (sq(Σ{| π1 x | x ∈ R |}) ÷ count(R))) ÷ count(R), where sq ≜ λy. y · y.

Aggregate functions were first introduced into flat relational algebra by Klug [12]. He introduced these functions by repeating them for every column of a relation. That is, aggregate1 is for column 1, aggregate2 is for column 2, and so on. Ozsoyoglu, Ozsoyoglu, and Matos [18] generalized this approach to nested relations. Our use of the summation construct is more general. On the other hand, Klausner and Goodman [11] had "stand-alone" aggregate functions such as mean : {Q} → Q. However, they had to rely on a notion of hiding to deal correctly with duplicates. Hiding is different from projection. Let R ≜ {(1, 2), (2, 3), (2, 4)}. Projecting out the second column of R gives us R′ ≜ {1, 2}. Hiding the second column of R gives us R″ ≜ {(1, [2]), (2, [3]), (2, [4])}, where the hidden components are shown between square brackets. Observe that the former "eliminates" duplicates as sets have no duplicate by definition. The latter "retains" the duplicated 2 by virtue of tagging them with different hidden components. Then mean(R″) produces the average of the first column of R, whereas mean(R′) does not compute the mean correctly. The use of hiding to retain duplicates is rather clumsy. Our use of the summation construct is simpler.

3 Conservative extension

Let us first define the concept of conservative extension. The set height ht(s) of a type s is defined by induction on the structure of type: ht(unit) = ht(b) = 0, ht(s × t) = ht(s → t) = max(ht(s), ht(t)), and ht({s}) = 1 + ht(s). Every expression of our language has a unique typing derivation. Hence the set height of expression e is defined as ht(e) = max{ht(s) | s occurs in the type derivation of e}. Let L_{i,o,h} denote the class of functions whose input has set height at most i, whose output has set height at most o, and which are definable in the language L using an expression whose set height is at most h ≥ max(i, o). L is said to have the conservative extension property with fixed constant k if L_{i,o,h} = L_{i,o,h+1} for all i, o, and h ≥ max(i, o, k). Note that if L has the conservative extension property with constant k, then for any additional primitive p : s → t, L(p) has it with constant at most max(ht(p), k) = max(ht(s → t), k).

In this section, we present a rewrite system for NRC(Q, +, ·, −, ÷, Σ, =) that is strongly normalizing. The normal forms induced by this rewriting are then used to prove that every definable function is definable using operators whose set height is at most the set height of the input/output of the function. The theorem implies that NRC(Q, +, ·, −, ÷, Σ, =) has the conservative extension property with fixed constant 0. Consequently, the class NRC(Q, +, ·, −, ÷, Σ, =)_{i,o,h} is independent of h. Hence using intermediate data structure of great height does not increase the horsepower of the language (though it frequently makes programs more elegant).

We proceed using the strategy developed by Wong [26]. First, observe that any equality test =_s : s × s → B can be implemented in terms of equality tests at base types =_b : b × b → B. Hence, in the rest of the report, we assume that =_s, where s is not a base type, is a syntactic sugar as implemented in the proposition below.

Proposition 3.1 Any equality test =_s : s × s → B can be implemented in terms of equality tests at base types =_b : b × b → B, using NRC(Q, +, ·, −, ÷, Σ, =) as the ambient language.

Proof. Proceed by induction on s.

• =_b is the given equality test at base type b.

• x =_{s×t} y ≜ if π1 x =_s π1 y then π2 x =_t π2 y else false
• X =_{{s}} Y ≜ if X ⊆_s Y then Y ⊆_s X else false, where

    X ⊆_s Y ≜ ((Σ{| if x ∈_s Y then 0 else 1 | x ∈ X |}) =_Q 0)
    x ∈_s Y ≜ (Σ{| if x =_s y then 1 else 0 | y ∈ Y |}) =_Q 1.  □

The next step toward proving the conservative extension property for NRC(Q, +, ·, −, ÷, Σ, =) is a rewrite system adapted from Wong [26]. Let e[e′/x] stand for the expression obtained by replacing all free occurrences of x in e by e′, provided the free variables in e′ are not captured during the substitution. Now, consider the rules below.

• (λx.e)(e′) ⟶ e[e′/x]
• πi(e1, e2) ⟶ ei
• πi(if e1 then e2 else e3) ⟶ if e1 then πi e2 else πi e3
• ⋃{e | x ∈ {}} ⟶ {}
• ⋃{{} | x ∈ e} ⟶ {}
• ⋃{e | x ∈ {e′}} ⟶ e[e′/x]
• ⋃{e | x ∈ e1 ∪ e2} ⟶ ⋃{e | x ∈ e1} ∪ ⋃{e | x ∈ e2}
• ⋃{e1 | x ∈ ⋃{e2 | y ∈ e3}} ⟶ ⋃{⋃{e1 | x ∈ e2} | y ∈ e3}
• ⋃{e | x ∈ if e1 then e2 else e3} ⟶ if e1 then ⋃{e | x ∈ e2} else ⋃{e | x ∈ e3}
• Σ{| e | x ∈ {} |} ⟶ 0
• Σ{| e | x ∈ {e′} |} ⟶ e[e′/x]
• Σ{| e | x ∈ e1 ∪ e2 |} ⟶ Σ{| e | x ∈ e1 |} + Σ{| if x ∈ e1 then 0 else e | x ∈ e2 |}
• Σ{| e | x ∈ if e1 then e2 else e3 |} ⟶ if e1 then Σ{| e | x ∈ e2 |} else Σ{| e | x ∈ e3 |}
• Σ{| e | x ∈ ⋃{e1 | y ∈ e2} |} ⟶ Σ{| Σ{| (e ÷ Σ{| Σ{| if x = v then 1 else 0 | v ∈ e1 |} | y ∈ e2 |}) | x ∈ e1 |} | y ∈ e2 |}

This system of rewrite rules preserves the meanings of expressions. The last rule deserves special attention. Consider the incorrect equation: Σ{| e | x ∈ ⋃{e1 | y ∈ e2} |} = Σ{| Σ{| e | x ∈ e1 |} | y ∈ e2 |}. Suppose e2 evaluates to a set of two distinct objects {o1, o2}. Suppose e1[o1/y] and e1[o2/y] both evaluate to {o3}. Suppose e[o3/x] evaluates to 1. Then the left-hand-side of the "equation" returns 1 but the right-hand-side yields 2. The division operation in the last rule is used to handle duplicates properly.
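The counterexample above can be replayed numerically. The following Python sketch is only an illustration under stated assumptions: sets are frozensets, rationals are fractions.Fraction values, and the names summation, big_union and multiplicity are hypothetical helpers, not constructs of the paper. It evaluates the left-hand side, the incorrect right-hand side, and the right-hand side of the last rewrite rule on the instance e2 = {o1, o2}, e1 = {o3}, e = 1.

```python
from fractions import Fraction


def big_union(f, s):
    """U{ f(y) | y in s } over frozensets."""
    result = frozenset()
    for y in s:
        result = result | f(y)
    return result


def summation(f, s):
    """Sigma{| f(x) | x in s |}: add f(x) over the distinct elements of s."""
    return sum((Fraction(f(x)) for x in s), Fraction(0))


# The instance from the text: e2 = {o1, o2}, e1 maps every y to {o3}, e is constantly 1.
e2 = frozenset({"o1", "o2"})
e1 = lambda y: frozenset({"o3"})
e = lambda x: 1

# Left-hand side: sum over the flattened set, where o3 occurs once.
lhs = summation(e, big_union(e1, e2))

# Incorrect right-hand side: o3 is counted once per y that produces it.
naive = summation(lambda y: summation(e, e1(y)), e2)


def multiplicity(x):
    """Number of pairs (y, v) with v in e1(y) and v = x, as in the last rule."""
    return summation(lambda y: summation(lambda v: 1 if x == v else 0, e1(y)), e2)


# Right-hand side of the last rule: each contribution is divided by its multiplicity.
corrected = summation(
    lambda y: summation(lambda x: Fraction(e(x)) / multiplicity(x), e1(y)), e2
)
```

Here lhs and corrected are both 1, while naive is 2, which is the discrepancy the division is there to remove.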
Proposition 3.2 (Soundness) If e1 ⟶ e2, then e1 = e2. That is, e1 ⟶ e2 implies e1 and e2 denote the same value.

Proof. Straightforward. □

A system of rewrite rules is said to be strongly normalizing if any sequence of applications of these rules is guaranteed to terminate.

Proposition 3.3 (Strong normalization) The above rewrite system is strongly normalizing.

Proof. While the last three rules seem to increase the "character count" of expressions, it should be remarked that Σ{| e | x ∈ e′ |} is always rewritten by these three rules to an expression that decreases in the e′ position. This is the key to the proof. The detail can be found in the appendix of Libkin and Wong [15]. □

Hence every expression can be rewritten to some normal form. These normal forms have the following property:

Theorem 3.4 (Conservative extension) Let e : s be an expression of NRC(Q, +, ·, −, ÷, Σ, =) in normal form. Then ht(e) ≤ max({ht(s)} ∪ {ht(t) | t is the type of a free variable occurring in e}). Therefore, NRC(Q, +, ·, −, ÷, Σ, =) has the conservative extension property with fixed constant 0.

Proof. By a fairly routine structural induction on e. □

Conservativity for NRC(=) was studied by Paredaens and Van Gucht [19] and by Wong [26]. The former proved that NRC(=)_{i,o,h} = NRC(=)_{i,o,h+1} for i = o = 1. The latter generalized it to all i and o. However, conservativity in the presence of aggregate functions was not studied. The above theorem implies that NRC(Q, +, ·, −, ÷, Σ, =)_{i,o,h} = NRC(Q, +, ·, −, ÷, Σ, =)_{i,o,h+1} for any i, o, h ≥ max(i, o). Hence we have generalized the results of [19] and [26] to the case where aggregate functions are present.

The theorem has practical significance. Some databases are designed to support nested sets up to a fixed depth of nesting. For example, Jaeschke and Schek [9] designed a statistical database whose relations are those having height at most 2. Another example is the commercially successful SQL which supports just flat relations. Both of these systems have a suitable collection of aggregate functions. "NRC(Q, +, ·, −, ÷, Σ, =) restricted to height 2 or 1" is a natural query language for such databases. But knowing that NRC(Q, +, ·, −, ÷, Σ, =) is conservative at all set heights, one can instead provide the user with the entire language NRC(Q, +, ·, −, ÷, Σ, =) as a more convenient query language for these databases, so long as queries have input/output height not exceeding 2 or 1.
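The set-height measure that these statements quantify over is easy to compute. The sketch below is an illustration only; it assumes a hypothetical encoding of types as nested Python tuples, ("unit",), ("base", name), ("prod", s, t), ("fun", s, t) and ("set", s), which is not notation from the paper.

```python
def ht(t):
    """Set height of a type, following the inductive definition of section 3."""
    kind = t[0]
    if kind in ("unit", "base"):
        return 0                         # ht(unit) = ht(b) = 0
    if kind in ("prod", "fun"):
        return max(ht(t[1]), ht(t[2]))   # ht(s x t) = ht(s -> t) = max(ht(s), ht(t))
    if kind == "set":
        return 1 + ht(t[1])              # ht({s}) = 1 + ht(s)
    raise ValueError(f"unknown type constructor: {kind!r}")


Q = ("base", "Q")
flat_relation = ("set", ("prod", Q, Q))              # a flat binary relation
nested_relation = ("set", ("prod", Q, ("set", Q)))   # one level of nesting
query_type = ("fun", nested_relation, flat_relation)
```

With this encoding, a flat relation has height 1 and a relation with one set-valued column has height 2, matching the "height 2 or 1" databases mentioned above; the function type query_type has height 2, the maximum of its input and output heights.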
4 Linear ordering on nested relations

The conservative extension property can be used to study many properties of languages (see Libkin and Wong [15] for some examples). In this section, we use it to demonstrate that NRC(Q, +, ·, −, ÷, Σ, =) is incapable of expressing the usual linear ordering ≤_Q : Q × Q → B on rational numbers. So we introduce linear order for base types. Then a technique for lifting linear order at base types to all types is presented.

Proposition 4.1 NRC(Q, +, ·, −, ÷, Σ, =) cannot express ≤_Q.

Proof. It is enough to show that the following function cannot be expressed: g(x) = 0 if x ≤ 1 and g(x) = 1 if x > 1. Observe that g : Q → Q has height 0. By the conservative extension property, it must be definable using an expression of height 0. However, we can prove the following claim:

Claim. Let g(x) : Q be an expression defined wholly in terms of +, −, ·, ÷, =_b, if-then-else, constants, and the variable x : Q. Then there are two polynomials p(x) and q(x) with rational coefficients such that g(x) coincides with p(x) ÷ q(x) almost everywhere. That is, g(x) ≠ p(x) ÷ q(x) for only finitely many x ∈ Q.

Now p(x) ÷ q(x) = 1 iff p(x) − q(x) = 0. Since p(x) − q(x) = 0 is a polynomial equation, it has finitely many roots unless p and q are identical. In the first case p(x) ÷ q(x) = 1 for only finitely many x, although g(x) = 1 for infinitely many x; in the second case p(x) ÷ q(x) = 1 wherever it is defined, although g(x) = 0 for infinitely many x. Hence g(x) cannot coincide with p(x) ÷ q(x) almost everywhere. Consequently, g is not expressible. □

Therefore, we propose to augment NRC(Q, +, ·, −, ÷, Σ, =) with a linear order ≤_b : b × b → B for each base type b. Many important data organization functions such as sorting algorithms and duplicate detection/elimination algorithms rely on linear orders. In the remainder of this section, we show how to lift linear order at base types to linear order at all types. First recall that the Hoare ordering ⊑♭ on the subsets of an ordered set is defined as X ⊑♭ Y iff for every x ∈ X there is y ∈ Y such that x ⊑ y. Then

Proposition 4.2 Let (D, ⊑) be a partially ordered set. Define an order ≲♭ on the finite subsets of D as follows: X ≲♭ Y iff either X ⊑♭ Y and Y ⋢♭ X, or X ⊑♭ Y and Y ⊑♭ X and X − Y ⊑♭ Y − X. Then ≲♭ is a partial order. Moreover, if ⊑ is a linear order, then so is ≲♭.

Proof. See Libkin and Wong [15]. □

Küspert, Saake, and Wegner [14] gave three linear orderings on collection types in their study of duplicate detection and elimination. The ordering defined above coincides with one of them and is in fact a particular case of an order well known in universal algebra and combinatorics [13, 25]. An important feature of our technique of lifting linear orders is that the resulting linear orders are readily seen to be computable by our very limited language. Hence in the rest of the report, we assume that ≤_s, where s is not a base type, is a syntactic sugar as implemented in the theorem below.
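The lifting of Proposition 4.2 can be tried out directly. The following Python sketch is an illustration outside NRC, assuming finite sets are frozensets and an order is a two-argument predicate; hoare_le and lift are hypothetical names. Applying lift twice gives an order on sets of sets, which mirrors how the construction extends to deeper nesting.

```python
from itertools import combinations


def hoare_le(le, X, Y):
    """Hoare ordering: every x in X is below some y in Y."""
    return all(any(le(x, y) for y in Y) for x in X)


def lift(le):
    """Lift an order on elements to the order of Proposition 4.2 on finite sets."""
    def lifted(X, Y):
        if not hoare_le(le, X, Y):
            return False
        if not hoare_le(le, Y, X):
            return True
        # X and Y are Hoare-equivalent: compare the two set differences.
        return hoare_le(le, X - Y, Y - X)
    return lifted


int_le = lambda a, b: a <= b
set_le = lift(int_le)            # order on finite sets of integers
set_of_sets_le = lift(set_le)    # order on finite sets of sets of integers

# All subsets of {1, 2, 3}, used to check that set_le behaves like a linear order.
subsets = [frozenset(c) for r in range(4) for c in combinations((1, 2, 3), r)]
is_total = all(set_le(X, Y) or set_le(Y, X) for X in subsets for Y in subsets)
is_antisymmetric = all(
    X == Y for X in subsets for Y in subsets if set_le(X, Y) and set_le(Y, X)
)
```

On this small universe both is_total and is_antisymmetric hold, as the proposition predicts for a linear base order.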
Theorem 4.3 (Linear order) NRC(Q, +, ·, −, ÷, Σ, =) augmented with linear order ≤_b : b × b → B at every base type b can express a linear order ≤_s : s × s → B at every type s.

Proof. Proceed by induction on s.

• ≤_b is the given linear order on base type b.

• x ≤_{s×t} y ≜ if π1 x ≤_s π1 y then (if π1 x =_s π1 y then π2 x ≤_t π2 y else true) else false

• X ≤_{{s}} Y ≜ if X ⊑♭_s Y then (if Y ⊑♭_s X then X ≲♭_s Y else true) else false, where

    X ⊑♭_s Y ≜ (Σ{| (if (Σ{| (if x ≤_s y then 1 else 0) | y ∈ Y |}) = 0 then 1 else 0) | x ∈ X |}) = 0
    X ≲♭_s Y ≜ (Σ{| if x ∈_s Y then 0 else (if (Σ{| if y ∈_s X then 0 else (if x ≤_s y then 1 else 0) | y ∈ Y |}) = 0 then 1 else 0) | x ∈ X |}) = 0.  □

Hence we denote the language endowed with linear order at base types by NRC(Q, +, ·, −, ÷, Σ, =, ≤). Several other queries commonly encountered in practical database environments, as well as some unusual ones, are now easily expressed:

• "Rows of R whose first column value is the maximum of the column" is maxrows(R) ≜ ⋃{if (Σ{| if π1(x) = π1(y) then 0 else if π1(y) ≤ π1(x) then 1 else 0 | x ∈ R |} = 0) then {y} else {} | y ∈ R}.

• "Rows of R whose first column value is the mode of the column" is moderows(R) ≜ maxrows(⋃{{(Σ{| if π1(y) = π1(x) then 1 else 0 | y ∈ R |}, x)} | x ∈ R}).

• "Parity of the cardinality of a set R" is odd(R) ≜ ⋃{if Σ{| if x ≤ y then 1 else 0 | y ∈ R |} = Σ{| if y ≤ x then 1 else 0 | y ∈ R |} then {()} else {} | x ∈ R} = {()}.

More significantly, the rank assignment function can be expressed. The rank assignment function leads to a few rather surprising results to be discussed shortly.

Proposition 4.4 A rank assignment sort_s : {s} → {s × Q} is the function such that sort{o1, ..., on} = {(o1, 1), ..., (on, n)} where o1 < ... < on. NRC(Q, +, ·, −, ÷, Σ, =, ≤) can define sort_s.

Proof. sort(R) ≜ ⋃{{(x, Σ{| if y ≤ x then 1 else 0 | y ∈ R |})} | x ∈ R}. □

5 More conservative extension results

The ability to compute a linear order and a rank assignment function at every type proves to be an asset. In this final section, we present a few more conservative extension results. First, let us consider the following primitives:
    tc_s : {s × s} → {s × s}
    f : {s} → {s},  g : {s}  ⟹  bfix_s(f, g) : {s}
    powerset_s : {s} → {{s}}

where tc(R) is the transitive closure of R; bfix(f, g) is the bounded fixpoint of f with respect to g, that is, it is the least fixpoint of the equation f(R) = g ∩ (R ∪ f(R)); and powerset(R) is the powerset of R.

Corollary 5.1 The following have the conservative extension property:

• NRC(Q, +, ·, −, ÷, Σ, =, ≤, tc) with fixed constant 1.
• NRC(Q, +, ·, −, ÷, Σ, =, ≤, bfix) with fixed constant 1.
• NRC(Q, +, ·, −, ÷, Σ, =, ≤, powerset) with fixed constant 2.

Proof. We provide the proof for the first one; the other two are straightforward adaptations of the same technique. First observe that NRC(Q, +, ·, −, ÷, Σ, =, ≤, tc_Q), where we restrict computation of transitive closure to binary relations of rational numbers, has the conservative extension property with constant 1. Therefore, it suffices for us to show that tc_s is expressible in it for any s. This can be achieved by exploiting the rank assignment function sort by defining

    tc(R) ≜ decode(tc_Q(encode(R, sort(dom(R)))), sort(dom(R))), where

    dom(R) ≜ ⋃{{π1 x} | x ∈ R} ∪ ⋃{{π2 x} | x ∈ R},

    encode(R, C) ≜ ⋃{⋃{⋃{if π1 x = π1 y then if π2 x = π1 z then {(π2 y, π2 z)} else {} else {} | z ∈ C} | y ∈ C} | x ∈ R}, and

    decode(R, C) ≜ ⋃{⋃{⋃{if π1 x = π2 y then if π2 x = π2 z then {(π1 y, π1 z)} else {} else {} | z ∈ C} | y ∈ C} | x ∈ R}.  □

Conservativity of NRC(=, powerset) was considered by Hull and Su [7] and Grumbach and Vianu [6]. The former showed that NRC(=, powerset)_{i,o,h} ≠ NRC(=, powerset)_{i,o,h+1} for any h and i = o = 1. This implies the failure of conservative extension for NRC(=, powerset) with respect to flat relations. The latter generalized this result to any i and o. The corollary above showed that the failure at higher heights can be repaired by augmenting NRC(=, powerset) with a summation operator.

More recently, Suciu [21] showed, using a technique related to that of Van den Bussche [5], that NRC(=, bfix)_{i,o,h} = NRC(=, bfix)_{i,o,h+1} for i = o = 1. This is remarkable because he did not need any arithmetic operation. The corollary above showed that the conservativity of bounded fixpoint can be extended to all input and output in the presence of arithmetics.
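The reduction in the proof of Corollary 5.1 can be sketched in Python. This is an illustration under stated assumptions rather than an NRC program: sets are frozensets, ranks are plain integers instead of rationals, the order is passed in as a predicate le, and tc_numbers is a hypothetical stand-in for the primitive tc_Q, implemented here by naive iteration.

```python
def sort_rank(R, le):
    """Rank assignment of Proposition 4.4: pair each x with the number of y <= x."""
    return frozenset((x, sum(1 for y in R if le(y, x))) for x in R)


def dom(R):
    """All values occurring in either column of the binary relation R."""
    return frozenset(a for a, _ in R) | frozenset(b for _, b in R)


def encode(R, C):
    """Replace each value in R by its rank, using the rank table C."""
    return frozenset(
        (ry, rz) for (a, b) in R for (y, ry) in C for (z, rz) in C if a == y and b == z
    )


def decode(R, C):
    """Replace each rank in R by the value it stands for, using the rank table C."""
    return frozenset(
        (y, z) for (a, b) in R for (y, ry) in C for (z, rz) in C if a == ry and b == rz
    )


def tc_numbers(R):
    """Stand-in for tc_Q: transitive closure of a relation on numbers."""
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not new:
            return frozenset(closure)
        closure |= new


def tc(R, le):
    """Transitive closure at an arbitrary ordered type, via closure on ranks only."""
    C = sort_rank(dom(R), le)
    return decode(tc_numbers(encode(R, C)), C)


edges = frozenset({("a", "b"), ("b", "c")})
closed = tc(edges, lambda x, y: x <= y)
```

The point of the design is that tc_numbers only ever sees pairs of numbers, so the height of the intermediate data does not depend on how deeply nested the elements of R are.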
Immerman [8] showed that first-order logic with least fixpoint and order is equivalent to PTIME. This may imply NRC(Q, +, ·, −, ÷, Σ, =, ≤, lfp)_{1,1,h} = NRC(Q, +, ·, −, ÷, Σ, =, ≤, lfp)_{1,1,h+1}. In which case, NRC(Q, +, ·, −, ÷, Σ, =, ≤, lfp) is conservative over flat relations. This should be contrasted with the corollary above. The languages in the corollary do not necessarily give us all PTIME queries over flat relations. Furthermore, conservativity holds for them over any input and output.

The technique used in our proof of conservative extension has an intrinsic uniformity. To illustrate this, let us introduce three partially interpreted primitives ε, ⊙, and ∏ to NRC(Q, +, ·, −, ÷, Σ, =, ≤),

    ε : b
    e1 : b,  e2 : b  ⟹  e1 ⊙ e2 : b
    e1 : b,  e2 : {s}  ⟹  ∏{| e1 | x^s ∈ e2 |} : b

where b is some fixed type, ⊙ : b × b → b is a commutative associative binary operation, ε : b is the identity for ⊙, and ∏{| e | x^s ∈ {o1, ..., on} |} = e[o1/x^s] ⊙ ... ⊙ e[on/x^s] for any set {o1, ..., on}, with o1, ..., on all distinct, of type {s}. As an example, take ⊙ to be · and b to be Q, then ε becomes 1 and ∏ becomes a sort of bounded product.

Proposition 5.2 For every i, o, and h ≥ max(i, o, ht(b)), NRC(B, Q, +, ·, −, ÷, Σ, =, ≤, ε, ∏, ⊙)_{i,o,h} = NRC(B, Q, +, ·, −, ÷, Σ, =, ≤, ε, ∏, ⊙)_{i,o,h+1}.

Proof. It suffices to append the rules below to the rewrite system of section 3. Note the use of the linear ordering. (If ⊙ is also idempotent, simpler rules can be used.)

• ∏{| e | x ∈ {} |} ⟶ ε
• ∏{| e | x ∈ {e′} |} ⟶ e[e′/x]
• ∏{| e | x ∈ e1 ∪ e2 |} ⟶ ∏{| e | x ∈ e1 |} ⊙ ∏{| if x ∈ e1 then ε else e | x ∈ e2 |}
• ∏{| e | x ∈ if e1 then e2 else e3 |} ⟶ if e1 then ∏{| e | x ∈ e2 |} else ∏{| e | x ∈ e3 |}
• ∏{| e | x ∈ ⋃{e1 | y ∈ e2} |} ⟶ ∏{| ∏{| if (Σ{| if x ∈ e1[w/y] then (if w = y then 0 else (if w ≤ y then 1 else 0)) else 0 | w ∈ e2 |}) = 0 then e else ε | x ∈ e1 |} | y ∈ e2 |}.  □

6 Conclusion and future work

The conservative extension property of nested relational calculi is studied in the presence of aggregate functions and linear orders. We showed that this property is retained by the nested relational calculus NRC(=) when very simple arithmetics and a summation operator are added to the language. We proved also that the presence of linear orders at base types leads to a more uniform and perhaps unexpected demonstration of the conservative extension property of several nested relational calculi. In particular, the well-known failure of conservativity of NRC(=, powerset) is shown to be repairable at higher heights when very simple arithmetics, bounded summation, and linear orders are available. These results have many consequences, including an interesting finite-cofiniteness property of the bag query language of Libkin and Wong [15]; we hope to present them in detail in a future report.

It is known that the presence of a linear order adds power to first-order query languages [8, 23]. Our nested set language has enough power to express a linear order at all types. It is a good framework for investigating the impact of linear orders on nested collections. Also, other kinds of linear orders on nested collections such as those in [14] should be studied.

We were able to demonstrate the conservative extension property for the nested set language with aggregate functions and additional primitives such as transitive closure, bounded fixpoint and powerset by reducing these primitives to the corresponding ones on rational numbers. What is the general property of these primitives that allowed this reduction?

The nested relational language with summation seems to be adequate for statistical databases. Does it have sufficient expressive power for querying databases for other advanced applications such as spatial databases, geographic databases, and genome databases?

Acknowledgements. Discussions with Peter Buneman, Val Breazu-Tannen, and especially Dan Suciu directly resulted in this paper. We thank them for their encouragement and insights. We are also grateful to Anthony Kosky and Paula Ta-Shma for their valuable comments. Support for Leonid Libkin is provided in part by National Science Foundation Grant IRI-90-04137 and an AT&T Doctoral Fellowship. Support for Limsoon Wong is provided in part by National Science Foundation Grant IRI-90-04137 and Army Research Office Grant DAAL03-89-C-0031-PRIME.

References

[1] V. Breazu-Tannen, P. Buneman, and S. Naqvi. Structural recursion as a query language. In Proceedings of 3rd International Workshop on Database Programming Languages, Naphlion, Greece, pages 9-19. Morgan Kaufmann, August 1991.

[2] V. Breazu-Tannen and R. Subrahmanyam. Logical and computational aspects of programming with Sets/Bags/Lists. In LNCS 510: Proceedings of 18th International Colloquium on Automata, Languages, and Programming, Madrid, Spain, July 1991, pages 60-75. Springer Verlag, 1991.

[3] Val Breazu-Tannen, Peter Buneman, and Limsoon Wong. Naturally embedded query languages. In LNCS 646: Proceedings of International Conference on Database Theory, Berlin, Germany, October, 1992, pages 140-154. Springer-Verlag, October 1992.
[4] Latha S. Colby. A recursive algebra for nested relations. Information Systems, 15(5):567-582, 1990.

[5] Jan Van den Bussche. Complex object manipulation through identifiers: An algebraic perspective. Technical Report 92-41, University of Antwerp, Department of Mathematics and Computer Science, Universiteitsplein 1, B-2610 Antwerp, Belgium, September 1992.

[6] Stephane Grumbach and Victor Vianu. Playing games with objects. In LNCS 470: 3rd International Conference on Database Theory, Paris, France, December 1990, pages 25-39. Springer-Verlag, 1990.

[7] Richard Hull and Jianwen Su. On the expressive power of database queries with intermediate types. Journal of Computer and System Sciences, 43:219-267, 1991.

[8] Neil Immerman. Relational queries computable in polynomial time. Information and Control, 68:86-104, 1986.

[9] G. Jaeschke and H. J. Schek. Remarks on the algebra of non first normal form relations. In Proceedings ACM Symposium on Principles of Database Systems, pages 124-138, Los Angeles, California, March 1982.

[10] L. A. Jategaonkar and J. C. Mitchell. ML with extended pattern matching and subtypes. In Proceedings of ACM Conference on LISP and Functional Programming, pages 198-211, Snowbird, Utah, July 1988.

[11] Aviel Klausner and Nathan Goodman. Multirelations: Semantics and languages. In Proceedings of 11th International Conference on Very Large Databases, Stockholm, August 1985, pages 251-258, Los Altos, CA, August 1985. Morgan Kaufmann.

[12] Anthony Klug. Equivalence of relational algebra and relational calculus query languages having aggregate functions. Journal of the ACM, 29(3):699-717, July 1982.

[13] J. B. Kruskal. The theory of well-quasi-ordering: A frequently discovered concept. Journal of Combinatorial Theory Series A, 13:297-305, 1972.

[14] K. Küspert, G. Saake, and L. Wegner. Duplicate detection and deletion in the extended NF2 data model. In LNCS 367: Foundation of Data Organization and Algorithms, pages 83-101. Springer-Verlag, June 1989.

[15] Leonid Libkin and Limsoon Wong. Query languages for bags. Technical Report MS-CIS-93-36/L&C 59, University of Pennsylvania, Philadelphia, PA 19104, March 1993.

[16] Eugenio Moggi. Notions of computation and monads. Information and Computation, 93:55-92, 1991.
[17] A. Ohori, P. Buneman, and V. Breazu-Tannen. Database programming in Machiavelli: A polymorphic language with static type inference. In Proceedings of ACM International Conference on Management of Data, pages 46-57, Portland, Oregon, June 1989.

[18] G. Ozsoyoglu, Z. M. Ozsoyoglu, and V. Matos. Extending relational algebra and relational calculus with set-valued attributes and aggregate functions. ACM Transactions on Database Systems, 12(4):566-592, December 1987.

[19] Jan Paredaens and Dirk Van Gucht. Converting nested relational algebra expressions into flat algebra expressions. ACM Transaction on Database Systems, 17(1):65-93, March 1992.

[20] H.-J. Schek and M. H. Scholl. The relational model with relation-valued attributes. Information Systems, 11(2):137-147, 1986.

[21] Dan Suciu. Fixpoints and bounded fixpoints for complex objects. Technical Report MS-CIS-93-32/L&C 58, University of Pennsylvania, Philadelphia, PA 19104, March 1993.

[22] S. J. Thomas and P. C. Fischer. Nested relational structures. In Advances in Computing Research: Theory of Databases, pages 269-307. JAI Press, 1986.

[23] M. Y. Vardi. The complexity of relational query languages. In Proceedings of 14th ACM Symposium on Theory of Computing, pages 137-146, 1982.

[24] Philip Wadler. Comprehending monads. In Proceedings of ACM Conference on Lisp and Functional Programming, Nice, June 1990.

[25] W. Wechler. Universal Algebra for Computer Scientists, volume 25 of EATCS Monograph on Theoretical Computer Science. Springer-Verlag, Berlin, 1992.

[26] Limsoon Wong. Normal forms and conservative properties for query languages over collection types. In Proceedings of 12th ACM Symposium on Principles of Database Systems, pages 26-36, Washington, D.C., May 1993.

[27] Limsoon Wong. Query languages over collection types. Manuscript available from Limsoon@Saul.CIS.UPenn.EDU, June 1993.