Aggregate Functions, Conservative Extension, and Linear Orders

Leonid Libkin and Limsoon Wong
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104-6389, USA

1 Summary

Practical database query languages are usually equipped with some aggregate functions. For example, "find mean of column" can be expressed in SQL. However, the manner in which aggregate functions were introduced in these query languages leaves something to be desired. Breazu-Tannen, Buneman, and Wong [3] introduced a nested relational language $\mathrm{NRC}(=)$ based on monads [16, 24] and structural recursion [1, 2]. It was shown in Wong [27] that this language is equivalent to the nested relational algebras of Thomas and Fischer [22], Schek and Scholl [20], and Colby [4]. $\mathrm{NRC}(=)$ enjoys certain advantages over these languages: it is naturally embedded in functional languages, it is readily extensible, and it has a compact equational theory. Therefore, it is used in this report as a basis for investigating aggregate functions.

In section 2, the nested relational calculus $\mathrm{NRC}(=)$ is described. It is then endowed with rational numbers, rational arithmetic, and a summation operator. The augmented language, $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$, is able to express a variety of aggregate functions commonly found in real database query languages. The main results of this paper remain valid in a uniform way if any summation-like primitive, such as bounded product, is added to the language. This approach is more disciplined and general than those proposed by Klug [12], Ozsoyoglu, Ozsoyoglu, and Matos [18], and Klausner and Goodman [11].

In section 3, we prove that every function $f : s \to t$ expressible in $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ can be computed without using any intermediate data whose depth of nesting of sets exceeds that of the input and output. This is known as the conservative extension property. Conservativity of nested relational query languages in the absence of aggregate functions was studied by Paredaens and Van Gucht [19] and Wong [26]. The former proved that it holds when input and output are flat relations. The latter generalized it to any input and output. Conservativity in the presence of aggregate functions was not previously studied.

In section 4, the conservative extension property is used to demonstrate the somewhat surprising fact that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ cannot express the usual linear ordering on rational numbers. As linear orders play a central role in fundamental data organization algorithms [14], this calls for special attention. We present a technique for lifting linear order at base types to linear order at all types. This technique yields linear orders that are expressible in $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$, which is the language obtained by augmenting $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ with linear orders at base types. Linear order is known to increase expressive power in the context of database query languages [8, 23]. In our case, this is a major advantage. Queries such as "find maximum of column," "find mode of column" and "test parity of cardinality of a set" are expressible in $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$. More importantly, a function that assigns rank to elements of a set is now expressible.

This rank assignment function is used in section 5 to show that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$ augmented with any combination of the transitive closure operator $tc$, the bounded fixpoint operator $bfix$, or the powerset operator $powerset$ retains the conservative extension property. Hull and Su [7] showed that $\mathrm{NRC}(=, powerset)$ is not conservative over flat input and output. This failure of conservativity for $\mathrm{NRC}(=, powerset)$ was generalized to all input and output heights by Grumbach and Vianu [6]. In contrast, our result shows that conservativity can be repaired with very little extra. Suciu [21] showed that $\mathrm{NRC}(=, bfix)$ is conservative over flat relations. His result is remarkable in that it did not need any arithmetic nor order. Furthermore, it is also valid when bounded fixpoint is replaced by bounded partial fixpoint operator. Our result uses arithmetic but holds for bounded fixpoint operator over any input and output. In fact, our proof of conservative extension holds uniformly for $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,\Pi,\iota,\odot)$ where $\Pi$, $\iota$, and $\odot$ are any triple of additional primitives which are in a relationship like that between $\Sigma$, $0$, and $+$.

2 Nested relational calculus with summation

The monad calculus of Breazu-Tannen, Buneman, and Wong [3] is denoted NRC here. In this section, it is extended with rational numbers, simple arithmetics, and a summation operator. The extended language is able to express many aggregate functions commonly found in commercial relational database query languages such as SQL.
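A type in NRC is either a complex object type or is a function type $s \to t$ where $s$ and $t$ are complex object types. The complex object types are given by the grammar:

$$s, t ::= b \mid \mathbb{B} \mid unit \mid s \times t \mid \{s\}$$

Objects of type $\mathbb{B}$ are the two boolean values true and false. The unique object of type $unit$ is denoted by $()$. Objects of type $s \times t$ are pairs whose first components are objects of type $s$ and second components are objects of type $t$. Objects of type $\{s\}$ are finite sets of objects of type $s$. We also include some uninterpreted base types $b$.

Expressions of NRC are constructed using the rules in the figure below. Note that [3] uses $ext(\lambda x^s.e_1)(e_2)$; but here we use the equivalent construct $\bigcup\{e_1 \mid x^s \in e_2\}$ instead. The language also contains some uninterpreted constants $c$ of base type $Type(c)$ and uninterpreted functions $p$ of function type $Type(p)$. The type superscripts are omitted in the rest of the paper because they can be inferred [17, 10]. Throughout this paper we assume the usual convention that variables are distinct and that expressions are well formed.

Lambda calculus and products

$$\frac{}{x^s : s} \qquad \frac{e : t}{\lambda x^s.e : s \to t} \qquad \frac{e_1 : s \to t \quad e_2 : s}{e_1\, e_2 : t}$$

$$\frac{}{() : unit} \qquad \frac{e : s \times t}{\pi_1 e : s \qquad \pi_2 e : t} \qquad \frac{e_1 : s \quad e_2 : t}{(e_1, e_2) : s \times t}$$

Set monad

$$\frac{}{\{\}^s : \{s\}} \qquad \frac{e : s}{\{e\} : \{s\}} \qquad \frac{e_1 : \{s\} \quad e_2 : \{s\}}{e_1 \cup e_2 : \{s\}} \qquad \frac{e_1 : \{t\} \quad e_2 : \{s\}}{\bigcup\{e_1 \mid x^s \in e_2\} : \{t\}}$$

Booleans

$$\frac{}{true : \mathbb{B}} \qquad \frac{}{false : \mathbb{B}} \qquad \frac{e_1 : \mathbb{B} \quad e_2 : t \quad e_3 : t}{\text{if } e_1 \text{ then } e_2 \text{ else } e_3 : t}$$

The semantics of NRC was described in [3]. The lambda calculus, product, and boolean constructs are standard. We briefly repeat the meaning of the monad constructs here. $\{\}$ is the empty set. $\{e\}$ is the singleton set containing $e$. $e_1 \cup e_2$ is the union of sets $e_1$ and $e_2$. The construct $\bigcup\{e_1 \mid x \in e_2\}$ denotes the set obtained by first applying the function $\lambda x.e_1$ to elements of the set $e_2$ and then taking their big union. Hence $\bigcup\{e_1 \mid x \in e_2\} = f(o_1) \cup \ldots \cup f(o_n)$, where $f$ is the function $\lambda x.e_1$ and $\{o_1, \ldots, o_n\}$ is the set $e_2$. The shorthand $\{o_1, \ldots, o_n\}$ is used to denote $\{o_1\} \cup \ldots \cup \{o_n\}$. It must be stressed that the $x \in e_2$ part in the construct $\bigcup\{e_1 \mid x \in e_2\}$ is not a membership test; it is the introduction of a new variable $x$ whose scope is the subexpression $e_1$.

As it stands, NRC can merely express queries that are purely structural. It was shown in [3] that endowing NRC with equality test $=^s : s \times s \to \mathbb{B}$ at all types $s$ elevates NRC to a fully fledged nested relational language (which was shown by Wong [27] to be equivalent to classical nested relational algebras of Thomas and Fischer [22], Schek and Scholl [20], and Colby [4]). That is, operations such as nest, membership test, subset test, set intersection, set difference, etc. are expressible in $\mathrm{NRC}(=)$. (We write the additional primitive in brackets to distinguish various extensions of the language.) It should also be remarked that in [3], booleans are simulated by values of type $\{unit\}$ with $\{()\}$ for true and $\{\}$ for false. However, over the class of functions of type $s \to \{s_1\} \times \ldots \times \{s_n\}$, it does not matter which presentation of booleans is used; the resulting languages have the same expressive power.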
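The set monad constructs described above can be sketched in Python, under the assumption that finite sets are modelled as frozensets. The helper names (`big_union`, `cartesian`, `member`) are hypothetical and chosen only for illustration; `member` shows one possible way a membership test could be derived from equality, with booleans simulated by `{()}` and `{}`.

```python
# Illustrative sketch only: finite sets are modelled as Python frozensets.
# All helper names are hypothetical and are not taken from the paper.

def empty():
    """{} : the empty set."""
    return frozenset()

def singleton(e):
    """{e} : the singleton set containing e."""
    return frozenset([e])

def union(e1, e2):
    """e1 U e2 : set union."""
    return e1 | e2

def big_union(f, e2):
    """U{ f(x) | x in e2 }: apply f to every element of e2, then union the results.

    The parameter of f plays the role of the bound variable x;
    "x in e2" introduces x, it is not a membership test.
    """
    result = frozenset()
    for x in e2:
        result = result | f(x)
    return result

def cartesian(X, Y):
    """U{ U{ {(x, y)} | x in X } | y in Y }: the cartesian product of X and Y."""
    return big_union(lambda y: big_union(lambda x: singleton((x, y)), X), Y)

def member(x, Y):
    """One possible membership test built from equality, with booleans
    simulated by {()} for true and {} for false."""
    hits = big_union(lambda y: singleton(()) if x == y else empty(), Y)
    return hits == singleton(())
```

For instance, `big_union(lambda x: frozenset([x, 5 * x]), frozenset([1, 2, 3]))` collects each element together with five times that element into a single flat set.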

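Examples. $\bigcup\{\{x, 5x\} \mid x \in \{1, 2, 3\}\}$ evaluates to the set $\{1, 2, 3, 5, 10, 15\}$. $\bigcup\{\bigcup\{\{(x, y)\} \mid x \in X\} \mid y \in Y\}$ forms the cartesian product of sets $X$ and $Y$. $\bigcup\{\bigcup\{\{(\pi_1 x, y)\} \mid y \in \pi_2 x\} \mid x \in X\}$ is the unnesting of the set $X$. $\bigcup\{\{(\pi_1 x, \bigcup\{\text{if } \pi_1 x = \pi_1 y \text{ then } \{\pi_2 y\} \text{ else } \{\} \mid y \in X\})\} \mid x \in X\}$ is the relational nesting of $X$.

Real database query languages frequently have to deal with queries such as "select average from column," "select maximum of column," "select count from column," etc. To handle this kind of queries, additional primitives must be added to NRC. In this paper, we add rational numbers (whose type is denoted by $\mathbb{Q}$) and the following constructs:

$$\frac{e_1 : \mathbb{Q} \quad e_2 : \mathbb{Q}}{e_1 + e_2 : \mathbb{Q}} \qquad \frac{e_1 : \mathbb{Q} \quad e_2 : \mathbb{Q}}{e_1 \cdot e_2 : \mathbb{Q}} \qquad \frac{e_1 : \mathbb{Q} \quad e_2 : \mathbb{Q}}{e_1 - e_2 : \mathbb{Q}} \qquad \frac{e_1 : \mathbb{Q} \quad e_2 : \mathbb{Q}}{e_1 \div e_2 : \mathbb{Q}} \qquad \frac{e_1 : \mathbb{Q} \quad e_2 : \{s\}}{\Sigma\{|e_1 \mid x^s \in e_2|\} : \mathbb{Q}}$$

where $+$, $\cdot$, $-$, and $\div$ are respectively addition, multiplication, subtraction, and division of rational numbers. The summation construct $\Sigma\{|e_1 \mid x^s \in e_2|\}$ denotes the rational obtained by first applying the function $\lambda x.e_1$ to every item in the set $e_2$ and then adding the results up. That is, $\Sigma\{|e_1 \mid x \in X|\}$ is $f(o_1) + \ldots + f(o_n)$ if $f$ is the function denoted by $\lambda x.e_1$ and $\{o_1, \ldots, o_n\}$, with $o_1$, ..., $o_n$ all distinct, is the set denoted by $X$. It should be emphasized that the $\{|e_1 \mid x \in e_2|\}$ part of the construct $\Sigma\{|e_1 \mid x \in e_2|\}$ is not an expression of the language; hence $\Sigma\{|1 \mid x \in \{5, 6\}|\}$ is 2 and not 1.

The extended language $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ is capable of expressing many aggregate operations found in commercial databases. Here are some examples:

"Count the number of records in $R$" is $count(R) \triangleq \Sigma\{|1 \mid x \in R|\}$.

"Total the first column of $R$" is $total(R) \triangleq \Sigma\{|\pi_1 x \mid x \in R|\}$.

"Average of the first column in $R$" is $average(R) \triangleq total(R) \div count(R)$.

"Variance of the first column of $R$" is $variance(R) \triangleq (\Sigma\{|sq(\pi_1 x) \mid x \in R|\} - (sq(\Sigma\{|\pi_1 x \mid x \in R|\}) \div count(R))) \div count(R)$, where $sq \triangleq \lambda y.\, y \cdot y$.

Aggregate functions were first introduced into flat relational algebra by Klug [12]. He introduced these functions by repeating them for every column of a relation. That is, $aggregate_1$ is for column 1, $aggregate_2$ is for column 2, and so on. Ozsoyoglu, Ozsoyoglu, and Matos [18] generalized this approach to nested relations. Our use of the summation construct is more general. On the other hand, Klausner and Goodman [11] had "stand-alone" aggregate functions such as $mean : \{\mathbb{Q}\} \to \mathbb{Q}$. However, they had to rely on a notion of hiding to deal correctly with duplicates. Hiding is different from projection. Let $R \triangleq \{(1, 2), (2, 3), (2, 4)\}$. Projecting out the second column of $R$ gives us $R' \triangleq \{1, 2\}$. Hiding the second column of $R$ gives us $R'' \triangleq \{(1, [2]), (2, [3]), (2, [4])\}$, where the hidden components are shown between square brackets. Observe that the former "eliminates" duplicates as sets have no duplicate by definition. The latter "retains" the duplicated 2 by virtue of tagging them with different hidden components. Then $mean(R'')$ produces the average of the first column of $R$, whereas $mean(R')$ does not compute the mean correctly. The use of hiding to retain duplicates is rather clumsy. Our use of the summation construct is simpler.

3 Conservative extension

Let us first define the concept of conservative extension. The set height $ht(s)$ of a type $s$ is defined by induction on the structure of type: $ht(unit) = ht(b) = 0$, $ht(s \times t) = ht(s \to t) = \max(ht(s), ht(t))$, and $ht(\{s\}) = 1 + ht(s)$. Every expression of our language has a unique typing derivation. Hence the set height of expression $e$ is defined as $ht(e) = \max\{ht(s) \mid s \text{ occurs in the type derivation of } e\}$. Let $L_{i,o,h}$ denote the class of functions whose input has set height at most $i$, whose output has set height at most $o$, and which are definable in the language $L$ using an expression whose set height is at most $h \geq \max(i, o)$. $L$ is said to have the conservative extension property with fixed constant $k$ if $L_{i,o,h} = L_{i,o,h+1}$ for all $i$, $o$, and $h \geq \max(i, o, k)$. Note that if $L$ has the conservative extension property with constant $k$, then for any additional primitive $p : s \to t$, $L(p)$ has it with constant at most $\max(ht(p), k) = \max(ht(s \to t), k)$.

In this section, we present a rewrite system for $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ that is strongly normalizing. The normal forms induced by this rewriting are then used to prove that every definable function is definable using operators whose set height is at most the set height of the input/output of the function. The theorem implies that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ has the conservative extension property with fixed constant 0. Consequently, the class $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)_{i,o,h}$ is independent of $h$. Hence using intermediate data structure of great height does not increase the horsepower of the language (though it frequently makes programs more elegant).

We proceed using the strategy developed by Wong [26]. First, observe that any equality test $=^s : s \times s \to \mathbb{B}$ can be implemented in terms of equality tests at base types $=^b : b \times b \to \mathbb{B}$. Hence, in the rest of the report, we assume that $=^s$, where $s$ is not a base type, is a syntactic sugar as implemented in the proposition below.

Proposition 3.1 Any equality test $=^s : s \times s \to \mathbb{B}$ can be implemented in terms of equality tests at base types $=^b : b \times b \to \mathbb{B}$, using $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ as the ambient language.

Proof. Proceed by induction on $s$.

$=^b$ is the given equality test at base type $b$.

$x =^{s \times t} y \triangleq \text{if } \pi_1 x =^s \pi_1 y \text{ then } \pi_2 x =^t \pi_2 y \text{ else } false$

$X =^{\{s\}} Y \triangleq \text{if } X \subseteq^s Y \text{ then } Y \subseteq^s X \text{ else } false$, where

$X \subseteq^s Y \triangleq (\Sigma\{|\text{if } x \in^s Y \text{ then } 0 \text{ else } 1 \mid x \in X|\}) =^{\mathbb{Q}} 0$

$x \in^s Y \triangleq (\Sigma\{|\text{if } x =^s y \text{ then } 1 \text{ else } 0 \mid y \in Y|\}) =^{\mathbb{Q}} 1$. □

The next step toward proving the conservative extension property for $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ is a rewrite system adapted from Wong [26]. Let $e[e'/x]$ stand for the expression obtained by replacing all free occurrences of $x$ in $e$ by $e'$, provided the free variables in $e'$ are not captured during the substitution. Now, consider the rules below.

$$(\lambda x.e)(e') \leadsto e[e'/x]$$
$$\pi_i(e_1, e_2) \leadsto e_i$$
$$\pi_i(\text{if } e_1 \text{ then } e_2 \text{ else } e_3) \leadsto \text{if } e_1 \text{ then } \pi_i e_2 \text{ else } \pi_i e_3$$
$$\bigcup\{e \mid x \in \{\}\} \leadsto \{\}$$
$$\bigcup\{\{\} \mid x \in e\} \leadsto \{\}$$
$$\bigcup\{e \mid x \in \{e'\}\} \leadsto e[e'/x]$$
$$\bigcup\{e \mid x \in e_1 \cup e_2\} \leadsto \bigcup\{e \mid x \in e_1\} \cup \bigcup\{e \mid x \in e_2\}$$
$$\bigcup\{e_1 \mid x \in \bigcup\{e_2 \mid y \in e_3\}\} \leadsto \bigcup\{\bigcup\{e_1 \mid x \in e_2\} \mid y \in e_3\}$$
$$\bigcup\{e \mid x \in \text{if } e_1 \text{ then } e_2 \text{ else } e_3\} \leadsto \text{if } e_1 \text{ then } \bigcup\{e \mid x \in e_2\} \text{ else } \bigcup\{e \mid x \in e_3\}$$
$$\Sigma\{|e \mid x \in \{\}|\} \leadsto 0$$
$$\Sigma\{|e \mid x \in \{e'\}|\} \leadsto e[e'/x]$$
$$\Sigma\{|e \mid x \in e_1 \cup e_2|\} \leadsto \Sigma\{|e \mid x \in e_1|\} + \Sigma\{|\text{if } x \in e_1 \text{ then } 0 \text{ else } e \mid x \in e_2|\}$$
$$\Sigma\{|e \mid x \in \text{if } e_1 \text{ then } e_2 \text{ else } e_3|\} \leadsto \text{if } e_1 \text{ then } \Sigma\{|e \mid x \in e_2|\} \text{ else } \Sigma\{|e \mid x \in e_3|\}$$
$$\Sigma\{|e \mid x \in \bigcup\{e_1 \mid y \in e_2\}|\} \leadsto \Sigma\{|\Sigma\{|(e \div \Sigma\{|\Sigma\{|\text{if } x = v \text{ then } 1 \text{ else } 0 \mid v \in e_1|\} \mid y \in e_2|\}) \mid x \in e_1|\} \mid y \in e_2|\}$$

This system of rewrite rules preserves the meanings of expressions. The last rule deserves special attention. Consider the incorrect equation: $\Sigma\{|e \mid x \in \bigcup\{e_1 \mid y \in e_2\}|\} = \Sigma\{|\Sigma\{|e \mid x \in e_1|\} \mid y \in e_2|\}$. Suppose $e_2$ evaluates to a set of two distinct objects $\{o_1, o_2\}$. Suppose $e_1[o_1/y]$ and $e_1[o_2/y]$ both evaluate to $\{o_3\}$. Suppose $e[o_3/x]$ evaluates to 1. Then the left-hand-side of the "equation" returns 1 but the right-hand-side yields 2. The division operation in the last rule is used to handle duplicates properly.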
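The summation construct, the aggregate examples, and the duplicate problem behind the last rewrite rule can be checked on small inputs. The following is a minimal sketch, assuming sets are Python frozensets and rationals are `fractions.Fraction` values; the function names and the concrete objects `"o1"`, `"o2"`, `"o3"` are hypothetical stand-ins for the objects named in the text.

```python
from fractions import Fraction

# Illustrative sketch only; names are hypothetical, sets are frozensets.

def big_union(f, s):
    """U{ f(x) | x in s }."""
    result = frozenset()
    for x in s:
        result = result | f(x)
    return result

def summation(f, s):
    """Sigma{| f(x) | x in s |}: apply f to each (distinct) element of s and add up."""
    acc = Fraction(0)
    for x in s:
        acc += f(x)
    return acc

def count(R):
    return summation(lambda x: 1, R)

def total(R):
    return summation(lambda x: x[0], R)

def average(R):
    return total(R) / count(R)

def variance(R):
    sq = lambda y: y * y
    return (summation(lambda x: sq(x[0]), R) - sq(total(R)) / count(R)) / count(R)

# The duplicate problem: e2 = {o1, o2}, e1 maps both objects to {o3}, e is constantly 1.
e2 = frozenset(["o1", "o2"])
e1 = lambda y: frozenset(["o3"])
e = lambda x: 1

lhs = summation(e, big_union(e1, e2))                      # sum over the flattened set
naive = summation(lambda y: summation(e, e1(y)), e2)       # the incorrect "equation"

def multiplicity(x):
    """How many times x is produced across all y in e2."""
    return summation(lambda y: summation(lambda v: 1 if x == v else 0, e1(y)), e2)

corrected = summation(
    lambda y: summation(lambda x: e(x) / multiplicity(x), e1(y)), e2
)
```

On the relation `R = {(1, 2), (2, 3), (2, 4)}` used in the discussion of hiding, `average(R)` works over rows rather than over the projected column, so the duplicated 2 in the first column is counted twice. In the duplicate example, dividing each contribution by its multiplicity brings the nested sum back in line with the sum over the flattened set.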

Proposition 3.2 (Soundness) If $e_1 \leadsto e_2$, then $e_1 = e_2$. That is, $e_1 \leadsto e_2$ implies $e_1$ and $e_2$ denote the same value.

Proof. Straightforward. □

A system of rewrite rules is said to be strongly normalizing if any sequence of applications of these rules is guaranteed to terminate.

Proposition 3.3 (Strong normalization) The above rewrite system is strongly normalizing.

Proof. While the last three rules seem to increase the "character count" of expressions, it should be remarked that $\Sigma\{|e \mid x \in e'|\}$ is always rewritten by these three rules to an expression that decreases in the $e'$ position. This is the key to the proof. The detail can be found in the appendix of Libkin and Wong [15]. □

Hence every expression can be rewritten to some normal form. These normal forms have the following property:

Theorem 3.4 (Conservative extension) Let $e : s$ be an expression of $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ in normal form. Then $ht(e) \leq \max(\{ht(s)\} \cup \{ht(t) \mid t \text{ is the type of a free variable occurring in } e\})$. Therefore, $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ has the conservative extension property with fixed constant 0.

Proof. By a fairly routine structural induction on $e$. □

Conservativity for $\mathrm{NRC}(=)$ was studied by Paredaens and Van Gucht [19] and by Wong [26]. The former proved that $\mathrm{NRC}(=)_{i,o,h} = \mathrm{NRC}(=)_{i,o,h+1}$ for $i = o = 1$. The latter generalized it to all $i$ and $o$. However conservativity in the presence of aggregate functions was not studied. The above theorem implies that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)_{i,o,h} = \mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)_{i,o,h+1}$ for any $i$, $o$, $h \geq \max(i, o)$. Hence we have generalized the results of [19] and [26] to the case where aggregate functions are present.

The theorem has practical significance. Some databases are designed to support nested sets up to a fixed depth of nesting. For example, Jaeschke and Schek [9] designed a statistical database whose relations are those having height at most 2. Another example is the commercially successful SQL which supports just flat relations. Both of these systems have a suitable collection of aggregate functions. "$\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ restricted to height 2 or 1" is a natural query language for such databases. But knowing that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ is conservative at all set heights, one can instead provide the user with the entire language $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ as a more convenient query language for these databases, so long as queries have input/output height not exceeding 2 or 1.
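The set height that these results refer to is a simple structural recursion on types. As a small sketch, assuming types are encoded as nested Python tuples (a hypothetical encoding chosen for illustration, with booleans and rationals treated as height-0 base types), it could be computed as follows.

```python
# Hypothetical type encoding, for illustration only:
#   ("base", name) | ("unit",) | ("bool",) | ("prod", s, t) | ("fun", s, t) | ("set", s)

def ht(ty):
    """Set height of a type: the maximal depth of nesting of set brackets."""
    tag = ty[0]
    if tag in ("base", "unit", "bool"):
        return 0                              # ht(unit) = ht(b) = 0
    if tag in ("prod", "fun"):
        return max(ht(ty[1]), ht(ty[2]))      # ht(s x t) = ht(s -> t) = max(ht(s), ht(t))
    if tag == "set":
        return 1 + ht(ty[1])                  # ht({s}) = 1 + ht(s)
    raise ValueError("unknown type constructor: %r" % (tag,))

Q = ("base", "Q")
flat_relation = ("set", ("prod", Q, Q))                   # {Q x Q}
nested_relation = ("set", ("prod", Q, ("set", Q)))        # {Q x {Q}}
powerset_type = ("fun", ("set", Q), ("set", ("set", Q)))  # {Q} -> {{Q}}
```

A flat relation has height 1 and a relation with one set-valued column has height 2, which matches the "height 2 or 1" databases mentioned above.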

4 Linear ordering on nested relations

The conservative extension property can be used to study many properties of languages (see Libkin and Wong [15] for some examples). In this section, we use it to demonstrate that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ is incapable of expressing the usual linear ordering $\leq_{\mathbb{Q}} : \mathbb{Q} \times \mathbb{Q} \to \mathbb{B}$ on rational numbers. So we introduce linear order for base types. Then a technique for lifting linear order at base types to all types is presented.

Proposition 4.1 $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ cannot express $\leq_{\mathbb{Q}}$.

Proof. It is enough to show that the following function cannot be expressed: $g(x) = 0$ if $x \leq 1$ and $g(x) = 1$ if $x > 1$. Observe that $g : \mathbb{Q} \to \mathbb{Q}$ has height 0. By the conservative extension property, it must be definable using an expression of height 0. However, we can prove the following claim:

Claim. Let $g(x) : \mathbb{Q}$ be an expression defined wholly in terms of $+$, $-$, $\cdot$, $\div$, $=_b$, if-then-else, constants, and the variable $x : \mathbb{Q}$. Then there are two polynomials $p(x)$ and $q(x)$ with rational coefficients such that $g(x)$ coincides with $p(x) \div q(x)$ almost everywhere. That is, $g(x) \neq p(x) \div q(x)$ for only finitely many $x \in \mathbb{Q}$.

Now $p(x) \div q(x) = 1$ iff $p(x) - q(x) = 0$. Since $p(x) - q(x) = 0$ is a polynomial equation, it has finitely many roots. Hence $g(x)$ cannot coincide with $p(x) \div q(x)$ almost everywhere. Consequently, $g$ is not expressible. □

Therefore, we propose to augment $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ with a linear order $\leq_b : b \times b \to \mathbb{B}$ for each base type $b$. Many important data organization functions such as sorting algorithms and duplicate detection/elimination algorithms rely on linear orders. In the remainder of this section, we show how to lift linear order at base types to linear order at all types. First recall that the Hoare ordering $\sqsubseteq^\flat$ on the subsets of an ordered set is defined as $X \sqsubseteq^\flat Y$ iff for every $x \in X$ there is $y \in Y$ such that $x \sqsubseteq y$. Then

Proposition 4.2 Let $(D, \sqsubseteq)$ be a partially ordered set. Define an order $\lesssim^\flat$ on the finite subsets of $D$ as follows: $X \lesssim^\flat Y$ iff either $X \sqsubseteq^\flat Y$ and $Y \not\sqsubseteq^\flat X$, or $X \sqsubseteq^\flat Y$ and $Y \sqsubseteq^\flat X$ and $X - Y \sqsubseteq^\flat Y - X$. Then $\lesssim^\flat$ is a partial order. Moreover, if $\sqsubseteq$ is a linear order, then so is $\lesssim^\flat$.

Proof. See Libkin and Wong [15]. □

Küspert, Saake, and Wegner [14] gave three linear orderings on collection types in their study of duplicate detection and elimination. The ordering defined above coincides with one of them and is in fact a particular case of an order well known in universal algebra and combinatorics [13, 25]. An important feature of our technique of lifting linear orders is that the resulting linear orders are readily seen to be computable by our very limited language. Hence in the rest of the report, we assume that $\leq_s$, where $s$ is not a base type, is a syntactic sugar as implemented in the theorem below.
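The lifted order of Proposition 4.2 can be tried out directly on small sets. The sketch below assumes finite sets are Python frozensets and that the base order is passed in as a two-argument predicate `leq`; the names `hoare` and `lifted` are hypothetical and not from the paper.

```python
from itertools import combinations

# Illustrative sketch only; `leq` is an assumed base order supplied by the caller.

def hoare(X, Y, leq):
    """Hoare ordering: every x in X is below some y in Y."""
    return all(any(leq(x, y) for y in Y) for x in X)

def lifted(X, Y, leq):
    """The lifted order on finite sets, following the two cases in Proposition 4.2."""
    if not hoare(X, Y, leq):
        return False
    if not hoare(Y, X, leq):
        return True
    # X and Y are Hoare-equivalent: compare the set differences.
    return hoare(X - Y, Y - X, leq)

def subsets(universe):
    """All subsets of a small universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

int_leq = lambda a, b: a <= b
all_sets = subsets([1, 2, 3])

# With a linear base order, any two finite sets should be comparable,
# and mutual comparability should force equality.
total_ok = all(lifted(X, Y, int_leq) or lifted(Y, X, int_leq)
               for X in all_sets for Y in all_sets)
antisym_ok = all(X == Y
                 for X in all_sets for Y in all_sets
                 if lifted(X, Y, int_leq) and lifted(Y, X, int_leq))
```

Because `lifted` only needs a predicate on elements, it can be applied again with itself as the base order to compare sets of sets, which is the sense in which the order is lifted to every type.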

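Theorem 4.3 (Linear order) $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=)$ augmented with linear order $\leq_b : b \times b \to \mathbb{B}$ at every base type $b$ can express a linear order $\leq_s : s \times s \to \mathbb{B}$ at every type $s$.

Proof. Proceed by induction on $s$.

$\leq_b$ is the given linear order on base type $b$.

$x \leq_{s \times t} y \triangleq \text{if } \pi_1 x \leq_s \pi_1 y \text{ then } (\text{if } \pi_1 x =_s \pi_1 y \text{ then } \pi_2 x \leq_t \pi_2 y \text{ else } true) \text{ else } false$

$X \leq_{\{s\}} Y \triangleq \text{if } X \sqsubseteq^\flat_s Y \text{ then } (\text{if } Y \sqsubseteq^\flat_s X \text{ then } X \lesssim^\flat_s Y \text{ else } true) \text{ else } false$, where

$X \sqsubseteq^\flat_s Y \triangleq (\Sigma\{|(\text{if } (\Sigma\{|(\text{if } x \leq_s y \text{ then } 1 \text{ else } 0) \mid y \in Y|\}) = 0 \text{ then } 1 \text{ else } 0) \mid x \in X|\}) = 0$

$X \lesssim^\flat_s Y \triangleq (\Sigma\{|\text{if } x \in_s Y \text{ then } 0 \text{ else } (\text{if } (\Sigma\{|\text{if } y \in_s X \text{ then } 0 \text{ else } (\text{if } x \leq_s y \text{ then } 1 \text{ else } 0) \mid y \in Y|\}) = 0 \text{ then } 1 \text{ else } 0) \mid x \in X|\}) = 0$. □

Hence we denote the language endowed with linear order at base types by $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$. Several other queries commonly encountered in practical database environments, as well as some unusual ones, are now easily expressed:

"Rows of $R$ whose first column value is the maximum of the column" is $maxrows(R) \triangleq \bigcup\{\text{if } (\Sigma\{|\text{if } \pi_1(x) = \pi_1(y) \text{ then } 0 \text{ else if } \pi_1(y) \leq \pi_1(x) \text{ then } 1 \text{ else } 0 \mid x \in R|\} = 0) \text{ then } \{y\} \text{ else } \{\} \mid y \in R\}$.

"Rows of $R$ whose first column value is the mode of the column" is $moderows(R) \triangleq maxrows(\bigcup\{\{(\Sigma\{|\text{if } f(y) = f(x) \text{ then } 1 \text{ else } 0 \mid y \in R|\}, x)\} \mid x \in R\})$.

"Parity of the cardinality of a set $R$" is $odd(R) \triangleq \bigcup\{\text{if } \Sigma\{|\text{if } x \leq y \text{ then } 1 \text{ else } 0 \mid y \in R|\} = \Sigma\{|\text{if } y \leq x \text{ then } 1 \text{ else } 0 \mid y \in R|\} \text{ then } \{()\} \text{ else } \{\} \mid x \in R\} = \{()\}$.

More significantly, the rank assignment function can be expressed. The rank assignment function leads to a few rather surprising results to be discussed shortly.

Proposition 4.4 A rank assignment $sort_s : \{s\} \to \{s \times \mathbb{Q}\}$ is the function such that $sort\{o_1, \ldots, o_n\} = \{(o_1, 1), \ldots, (o_n, n)\}$ where $o_1 < \ldots < o_n$. $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$ can define $sort_s$.

Proof. $sort(R) \triangleq \bigcup\{\{(x, \Sigma\{|\text{if } y \leq x \text{ then } 1 \text{ else } 0 \mid y \in R|\})\} \mid x \in R\}$. □

5 More conservative extension results

The ability to compute a linear order and a rank assignment function at every type proves to be an asset. In this final section, we present a few more conservative extension results. First, let us consider the following primitives:

$$\frac{}{tc_s : \{s \times s\} \to \{s \times s\}} \qquad \frac{g : \{s\} \quad f : \{s\} \to \{s\}}{bfix_s(f, g) : \{s\}} \qquad \frac{}{powerset_s : \{s\} \to \{\{s\}\}}$$

where $tc(R)$ is the transitive closure of $R$; $bfix(f, g)$ is the bounded fixpoint of $f$ with respect to $g$; that is, it is the least fixpoint of the equation $f(R) = g \cap (R \cup f(R))$; and $powerset(R)$ is the powerset of $R$.

Corollary 5.1 The following have the conservative extension property:

$\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,tc)$ with fixed constant 1.

$\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,bfix)$ with fixed constant 1.

$\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,powerset)$ with fixed constant 2.

Proof. We provide the proof for the first one; the other two are straightforward adaptations of the same technique. First observe that $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,tc_{\mathbb{Q}})$, where we restrict computation of transitive closure to binary relations of rational numbers, has the conservative extension property with constant 1. Therefore, it suffices for us to show that $tc_s$ is expressible in it for any $s$. This can be achieved by exploiting the rank assignment function $sort$ by defining

$tc(R) \triangleq decode(tc_{\mathbb{Q}}(encode(R, sort(dom(R)))), sort(dom(R)))$, where

$dom(R) \triangleq \bigcup\{\{\pi_1 x\} \mid x \in R\} \cup \bigcup\{\{\pi_2 x\} \mid x \in R\}$,

$encode(R, C) \triangleq \bigcup\{\bigcup\{\bigcup\{\text{if } \pi_1 x = \pi_1 y \text{ then if } \pi_2 x = \pi_1 z \text{ then } \{(\pi_2 y, \pi_2 z)\} \text{ else } \{\} \text{ else } \{\} \mid z \in C\} \mid y \in C\} \mid x \in R\}$, and

$decode(R, C) \triangleq \bigcup\{\bigcup\{\bigcup\{\text{if } \pi_1 x = \pi_2 y \text{ then if } \pi_2 x = \pi_2 z \text{ then } \{(\pi_1 y, \pi_1 z)\} \text{ else } \{\} \text{ else } \{\} \mid z \in C\} \mid y \in C\} \mid x \in R\}$. □

Conservativity of $\mathrm{NRC}(=, powerset)$ was considered by Hull and Su [7] and Grumbach and Vianu [6]. The former showed that $\mathrm{NRC}(=, powerset)_{i,o,h} \neq \mathrm{NRC}(=, powerset)_{i,o,h+1}$ for any $h$ and $i = o = 1$. This implies the failure of conservative extension for $\mathrm{NRC}(=, powerset)$ with respect to flat relations. The latter generalized this result to any $i$ and $o$. The corollary above showed that the failure at higher heights can be repaired by augmenting $\mathrm{NRC}(=, powerset)$ with a summation operator.

More recently, Suciu [21] showed, using a technique related to that of Van den Bussche [5], that $\mathrm{NRC}(=, bfix)_{i,o,h} = \mathrm{NRC}(=, bfix)_{i,o,h+1}$ for $i = o = 1$. This is remarkable because he did not need any arithmetic operation. The corollary above showed that the conservativity of bounded fixpoint can be extended to all input and output in the presence of arithmetics.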
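The reduction used in the proof of Corollary 5.1 (rank the domain, encode the relation as pairs of ranks, close it over numbers, decode) can be followed step by step on a small relation. This is a sketch under stated assumptions: sets are frozensets, `leq` is an assumed linear order on the elements, and `tc_q` is an ordinary iterative closure standing in for the primitive $tc_{\mathbb{Q}}$; none of these names come from the paper.

```python
# Illustrative sketch only; helper names are hypothetical.

def big_union(f, s):
    """U{ f(x) | x in s }."""
    result = frozenset()
    for x in s:
        result = result | f(x)
    return result

def summation(f, s):
    """Sigma{| f(x) | x in s |} over the distinct elements of s."""
    return sum(f(x) for x in s)

def sort_rank(R, leq):
    """Rank assignment: pair each x with the number of y in R with y <= x."""
    return big_union(
        lambda x: frozenset([(x, summation(lambda y: 1 if leq(y, x) else 0, R))]), R
    )

def dom(R):
    """All values occurring in either column of the binary relation R."""
    return big_union(lambda x: frozenset([x[0]]), R) | big_union(lambda x: frozenset([x[1]]), R)

def encode(R, C):
    """Replace each pair of objects in R by the pair of their ranks from C."""
    return frozenset((y[1], z[1]) for x in R for y in C for z in C
                     if x[0] == y[0] and x[1] == z[0])

def decode(R, C):
    """Replace each pair of ranks in R by the pair of objects they stand for."""
    return frozenset((y[0], z[0]) for x in R for y in C for z in C
                     if x[0] == y[1] and x[1] == z[1])

def tc_q(R):
    """Stand-in for transitive closure on a binary relation of numbers."""
    closure = set(R)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c} - closure
        if not new:
            return frozenset(closure)
        closure |= new

def tc(R, leq):
    """Transitive closure at an arbitrary element type, via ranks."""
    C = sort_rank(dom(R), leq)
    return decode(tc_q(encode(R, C)), C)
```

Only `sort_rank` depends on the element type, and only through `leq`; for nested elements a lifted order on sets could be supplied in its place, while the closure itself never leaves flat pairs of numbers.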

Immerman [8] showed that first-order logic with least fixpoint and order is equivalent to PTIME. This may imply $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,lfp)_{1,1,h} = \mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,lfp)_{1,1,h+1}$. In which case, $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,lfp)$ is conservative over flat relations. This should be contrasted with the corollary above. The languages in the corollary do not necessarily give us all PTIME queries over flat relations. Furthermore, conservativity holds for them over any input and output.

The technique used in our proof of conservative extension has an intrinsic uniformity. To illustrate this, let us introduce three partially interpreted primitives $\iota$, $\odot$, and $\Pi$ to $\mathrm{NRC}(\mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq)$,

$$\frac{}{\iota : b} \qquad \frac{e_1 : b \quad e_2 : b}{e_1 \odot e_2 : b} \qquad \frac{e_1 : b \quad e_2 : \{s\}}{\Pi\{|e_1 \mid x^s \in e_2|\} : b}$$

where $b$ is some fixed type, $\odot : b \times b \to b$ is a commutative associative binary operation, $\iota : b$ is the identity for $\odot$, and $\Pi\{|e \mid x^s \in \{o_1, \ldots, o_n\}|\} = e[o_1/x^s] \odot \ldots \odot e[o_n/x^s]$ for any set $\{o_1, \ldots, o_n\}$, with $o_1$, ..., $o_n$ all distinct, of type $\{s\}$. As an example, take $\odot$ to be $\cdot$ and $b$ to be $\mathbb{Q}$, then $\iota$ becomes 1 and $\Pi$ becomes a sort of bounded product.

Proposition 5.2 For every $i$, $o$, and $h \geq \max(i, o, ht(b))$, $\mathrm{NRC}(B, \mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,\Pi,\iota,\odot)_{i,o,h} = \mathrm{NRC}(B, \mathbb{Q},+,\cdot,-,\div,\Sigma,=,\leq,\Pi,\iota,\odot)_{i,o,h+1}$.

Proof. It suffices to append the rules below to the rewrite system of section 3. Note the use of the linear ordering. (If $\odot$ is also idempotent, simpler rules can be used.)

$$\Pi\{|e \mid x \in \{\}|\} \leadsto \iota$$
$$\Pi\{|e \mid x \in \{e'\}|\} \leadsto e[e'/x]$$
$$\Pi\{|e \mid x \in e_1 \cup e_2|\} \leadsto \Pi\{|e \mid x \in e_1|\} \odot \Pi\{|\text{if } x \in e_1 \text{ then } \iota \text{ else } e \mid x \in e_2|\}$$
$$\Pi\{|e \mid x \in \text{if } e_1 \text{ then } e_2 \text{ else } e_3|\} \leadsto \text{if } e_1 \text{ then } \Pi\{|e \mid x \in e_2|\} \text{ else } \Pi\{|e \mid x \in e_3|\}$$
$$\Pi\{|e \mid x \in \bigcup\{e_1 \mid y \in e_2\}|\} \leadsto \Pi\{|\Pi\{|\text{if } (\Sigma\{|\text{if } x \in e_1[w/y] \text{ then } (\text{if } w = y \text{ then } 0 \text{ else } (\text{if } w \leq y \text{ then } 1 \text{ else } 0)) \text{ else } 0 \mid w \in e_2|\}) = 0 \text{ then } e \text{ else } \iota \mid x \in e_1|\} \mid y \in e_2|\}$$ □

6 Conclusion and future work

The conservative extension property of nested relational calculi is studied in the presence of aggregate functions and linear orders. We showed that this property is retained by the nested relational calculus $\mathrm{NRC}(=)$ when very simple arithmetics and a summation operator are added to the language. We proved also that the presence of linear orders at base types leads to a more uniform and perhaps unexpected demonstration of the conservative extension property of several nested relational calculi. In particular, the well-known failure of conservativity of $\mathrm{NRC}(=, powerset)$ is shown to be repairable at higher heights when very simple arithmetics, bounded summation, and linear orders are available. These results have many consequences, including an interesting finite-cofiniteness property of the bag query language of Libkin and Wong [15]; we hope to present them in detail in a future report.

It is known that the presence of a linear order adds power to first-order query languages [8, 23]. Our nested set language has enough power to express a linear order at all types. It is a good framework for investigating the impact of linear orders on nested collections. Also, other kinds of linear orders on nested collections such as those in [14] should be studied.

We were able to demonstrate the conservative extension property for the nested set language with aggregate functions and additional primitives such as transitive closure, bounded fixpoint and powerset by reducing these primitives to the corresponding ones on rational numbers. What is the general property of these primitives that allowed this reduction?

The nested relational language with summation seems to be adequate for statistical databases. Does it have sufficient expressive power for querying databases for other advanced applications such as spatial databases, geographic databases, and genome databases?

Acknowledgements. Discussions with Peter Buneman, Val Breazu-Tannen, and especially Dan Suciu directly resulted in this paper. We thank them for their encouragement and insights. We are also grateful to Anthony Kosky and Paula Ta-Shma for their valuable comments. Support for Leonid Libkin is provided in part by National Science Foundation Grant IRI-90-04137 and an AT&T Doctoral Fellowship. Support for Limsoon Wong is provided in part by National Science Foundation Grant IRI-90-04137 and Army Research Office Grant DAAL03-89-C-0031-PRIME.

References

[1] V. Breazu-Tannen, P. Buneman, and S. Naqvi. Structural recursion as a query language. In Proceedings of 3rd International Workshop on Database Programming Languages, Naphlion, Greece, pages 9–19. Morgan Kaufmann, August 1991.

[2] V. Breazu-Tannen and R. Subrahmanyam. Logical and computational aspects of programming with Sets/Bags/Lists. In LNCS 510: Proceedings of 18th International Colloquium on Automata, Languages, and Programming, Madrid, Spain, July 1991, pages 60–75. Springer Verlag, 1991.

[3] Val Breazu-Tannen, Peter Buneman, and Limsoon Wong. Naturally embedded query languages. In LNCS 646: Proceedings of International Conference on Database Theory, Berlin, Germany, October, 1992, pages 140–154. Springer-Verlag, October 1992.

[4] Latha S. Colby. A recursive algebra for nested relations. Information Systems, 15(5):567–582, 1990.

[5] Jan Van den Bussche. Complex object manipulation through identifiers: An algebraic perspective. Technical Report 92-41, University of Antwerp, Department of Mathematics and Computer Science, Universiteitsplein 1, B-2610 Antwerp, Belgium, September 1992.

[6] Stephane Grumbach and Victor Vianu. Playing games with objects. In LNCS 470: 3rd International Conference on Database Theory, Paris, France, December 1990, pages 25–39. Springer-Verlag, 1990.

[7] Richard Hull and Jianwen Su. On the expressive power of database queries with intermediate types. Journal of Computer and System Sciences, 43:219–267, 1991.

[8] Neil Immerman. Relational queries computable in polynomial time. Information and Control, 68:86–104, 1986.

[9] G. Jaeschke and H. J. Schek. Remarks on the algebra of non first normal form relations. In Proceedings ACM Symposium on Principles of Database Systems, pages 124–138, Los Angeles, California, March 1982.

[10] L. A. Jategaonkar and J. C. Mitchell. ML with extended pattern matching and subtypes. In Proceedings of ACM Conference on LISP and Functional Programming, pages 198–211, Snowbird, Utah, July 1988.

[11] Aviel Klausner and Nathan Goodman. Multirelations: Semantics and languages. In Proceedings of 11th International Conference on Very Large Databases, Stockholm, August 1985, pages 251–258, Los Altos, CA, August 1985. Morgan Kaufmann.

[12] Anthony Klug. Equivalence of relational algebra and relational calculus query languages having aggregate functions. Journal of the ACM, 29(3):699–717, July 1982.

[13] J. B. Kruskal. The theory of well-quasi-ordering: A frequently discovered concept. Journal of Combinatorial Theory Series A, 13:297–305, 1972.

[14] K. Küspert, G. Saake, and L. Wegner. Duplicate detection and deletion in the extended NF2 data model. In LNCS 367: Foundation of Data Organization and Algorithms, pages 83–101. Springer-Verlag, June 1989.

[15] Leonid Libkin and Limsoon Wong. Query languages for bags. Technical Report MS-CIS-93-36/L&C 59, University of Pennsylvania, Philadelphia, PA 19104, March 1993.

[16] Eugenio Moggi. Notions of computation and monads. Information and Computation, 93:55–92, 1991.

[17] A. Ohori, P. Buneman, and V. Breazu-Tannen. Database programming in Machiavelli: A polymorphic language with static type inference. In Proceedings of ACM International Conference on Management of Data, pages 46–57, Portland, Oregon, June 1989.

[18] G. Ozsoyoglu, Z. M. Ozsoyoglu, and V. Matos. Extending relational algebra and relational calculus with set-valued attributes and aggregate functions. ACM Transactions on Database Systems, 12(4):566–592, December 1987.

[19] Jan Paredaens and Dirk Van Gucht. Converting nested relational algebra expressions into flat algebra expressions. ACM Transaction on Database Systems, 17(1):65–93, March 1992.

[20] H.-J. Schek and M. H. Scholl. The relational model with relation-valued attributes. Information Systems, 11(2):137–147, 1986.

[21] Dan Suciu. Fixpoints and bounded fixpoints for complex objects. Technical Report MS-CIS-93-32/L&C 58, University of Pennsylvania, Philadelphia, PA 19104, March 1993.

[22] S. J. Thomas and P. C. Fischer. Nested relational structures. In Advances in Computing Research: Theory of Databases, pages 269–307. JAI Press, 1986.

[23] M. Y. Vardi. The complexity of relational query languages. In Proceedings of 14th ACM Symposium on Theory of Computing, pages 137–146, 1982.

[24] Philip Wadler. Comprehending monads. In Proceedings of ACM Conference on Lisp and Functional Programming, Nice, June 1990.

[25] W. Wechler. Universal Algebra for Computer Scientists, volume 25 of EATCS Monograph on Theoretical Computer Science. Springer-Verlag, Berlin, 1992.

[26] Limsoon Wong. Normal forms and conservative properties for query languages over collection types. In Proceedings of 12th ACM Symposium on Principles of Database Systems, pages 26–36, Washington, D.C., May

[27] Limsoon Wong. Query languages over collection types. Manuscript available from Limsoon@Saul.CIS.UPenn.EDU, June 1993.