



Schemaless Representation of Semistructured Data and Schema Construction

Dong-Yal Seo¹, Dong-Ha Lee¹, Kang-Sik Moon¹, Jisook Chang¹, Jeon-Young Lee¹, and Chang-Yong Han²

¹ Dept. of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, Kyungbuk, 790-784, KOREA
² Data Warehouse Advanced Technology, Oracle Systems Korea, Ltd., Youngdeungpo-Gu, Seoul, 150-010, KOREA

Abstract. In the networked information world, we must deal with semistructured data, which carry only weak schema information. To manage such semistructured data efficiently, this paper introduces a data model for semistructured data and operations for schema construction. We transform semistructured data into structured data by introducing a schema construction methodology, in contrast to former studies, which depend fully on schemaless manipulation. For schema construction, we defined operations for building IS-A/IS-PART-OF relationships, collecting data objects to build a primitive class, and merging two data instances or classes.

1 Introduction

1.1 Motivation

In the early stages of data processing, such as inventory/account management systems, a large centralized database system was used as an information server. Through the database design (or real-world modeling) phase, the DBA (Database Administrator) defines a well-structured schema. Data acquisition and creation are performed after schema definition. The role of end-users is to manipulate data via the predefined schema. The schema is firm and end-users are not responsible for schema management.

As data sources and computing environments become distributed, an abundance of information is provided by individual users and updated very quickly. Each end-user creates and updates his/her own information, like a DBA. The WWW (World-Wide Web) is a typical example of such a domain. Every user creates his/her own documents and submits them to the WWW. How can we manage so many HTML documents and other web resources? We must always navigate through hyper-links or search by keywords because there is no absolute schema in the stored information. If we could define a schema on the set of web resources, not only would we have a better structure for the gathered information, but we could also use a structured query language. A schema provides a well-structured view of stored data.

A schema expresses data location, relationships among data objects, data categories, summarized concepts, and so on.

Data sets for which no absolute schema is fixed in advance, and whose structure may be irregular or incomplete, are known as semistructured data. Even data created as a well-structured, i.e., schema-based, set becomes semistructured when it is taken out of its original structure. For example, a single record from a relational table is semistructured if we have no idea about the overall structure of the table and its relationships with other tables.

1.2 Problems and Approaches

Figure 1 shows the approaches to information processing based on the structural nature of data sources. The right side of the figure presents conventional data processing. The information itself has a rigid structure and is represented with a well-structured model, such as the relational or object-oriented model. Users manipulate data with a schema, which mainly provides a structural abstraction of the stored data.

[Figure 1: left side, Semistructured Information Sources, Semistructured Representation Models, Semistructured Data Processing (Lightweight Information Systems); right side, Structured Information Sources, Structured Representation Models, Structured Data Processing (Conventional DBMSs); the figure also carries the label "Schema Construction".]

Fig. 1. Information Structures and Processings

The left side shows the processing of information with a poor schema, i.e., semistructured information [12] (or even unstructured information [3]). Semistructured information is represented with a lightweight model, which permits schemaless creation of data instances, and is stored in lightweight storage. Stored information is manipulated with a lightweight query language, which can be used with incomplete schema information.

The studies on lightweight approaches largely overlooked the importance of the database schema. Although schemaless manipulation is convenient for users who want to retrieve data without deep knowledge of the underlying structures, a schema is indispensable for embedded SQL, API calls, or stored procedures. A schema gives firmness and a conceptualized view.

In semistructured data processing, end-users represent data instances without complete knowledge of the predefined schema (or even without any schema information).

Data creation can therefore be performed before the schema definition phase. For semistructured data processing, we established the strategy "schemaless creation, schema-based manipulation", which involves the following goals:

1. Provide a representation model for schemaless data instances. The model should be expressive enough to describe semistructured data instances from heterogeneous data sources.
2. Devise a mechanism for schema construction which can be applied to a schemaless pool of data instances. After applying schema construction procedures, we will have a rigid schema and can manipulate the stored data with SQL.

The remainder of this paper is organized as follows. Section 2 presents related work and corresponding contributions. Sections 3 and 4 address, respectively, a data model for schemaless creation of data objects and operations for schema construction, the main contribution of this work. Finally, conclusions and directions for future work are discussed in Section 5.

2 Related Work

General problems of networked information processing are discussed in [4]. Three conceptual layers of networked information systems are introduced: information interface, information dispersion, and information gathering. Our work deals with the problems in the information gathering layer. More specifically, we are interested in the data mapping problem and we introduce schema construction operators for that problem. The importance of and motivation for schemaless data representation and manipulation are discussed in [1][3][11][12].

Schemaless data instances are usually described by their attributes and corresponding values. Attribute-value pairs were used for data representation in OEM (Object Exchange Model) [11], Labeled-Tree [3], and the Data Forest Model [1]. Although not developed as a semistructured data model, O2's complex value model [8] shows a good way of semistructured representation with attribute-value pairs. All the earlier models for semistructured data are similar to each other and their expressive powers are almost the same.

The TSIMMIS (The Stanford-IBM Manager of Multiple Information Sources) project [7] introduces OEM and other related work [12] on the integration of heterogeneous information sources. OEM provides sets and nested substructures as well as atomic values like integers and strings by using attribute-value pairs (in [11], the term "label-value pair" was used instead of "attribute-value pair"). The labeled-tree model, which has the same expressive power as OEM, represents semistructured data as trees, i.e., trees with a labeling of edges. The Data Forest Model supports a list type, which cannot be described in OEM or the labeled-tree model.

The former studies focus mainly on lightweight approaches, and the importance of the database schema is largely overlooked. The schema construction approach distinguishes our study from conventional methodologies in semistructured data processing.

The problems of structuring [13] and typing [9] semistructured data have been introduced recently.

The studies on subject-oriented programming [10] introduce a methodology for class composition which deals with behaviors, whereas our approach deals with data.

3 Schemaless Creation of Data Objects

Objects are usually distinguished by their types, where a type describes applicable operations and properties. Properties define either attributes of the object itself or relationships among objects. We mainly considered data objects from the viewpoint of properties.

3.1 Model Definition

Our data model describes schemaless data objects with a series of attribute-value pairs, called an AVPL (Attribute-Value Pairs List). An attribute-value pair is composed of two tuples, attribute and value. When we denote a set of AVPLs, a set of attributes, and a set of values as $D$, $A$, and $V$, respectively, an AVPL is defined as follows:

1. A single attribute-value pair is an AVPL:
   $(a \in A) \wedge (v \in V) \longrightarrow \{(a, v)\} \in D$
2. The union of two AVPLs is an AVPL:
   $D_1, D_2 \in D \longrightarrow D_1 \cup D_2 \in D$

An attribute is an ordered collection of one or more variables, where each variable is itself also an attribute. When $S$ denotes the set of strings, an attribute is defined as follows:

1. Singleton attribute:
   $s \in S \longrightarrow s \in A$
2. Composite attribute (attribute with multiple variables):
   $a_1, a_2, \ldots, a_n \in A \longrightarrow (a_1, a_2, \ldots, a_n) \in A$
   where $(a_1, a_2, \ldots, a_n)$ is an ordered sequence of attribute variables.

A value is an instance, or a set of instances, assignable to the corresponding attribute and its variables. The assignment of values to attributes is defined as follows:

1. Singleton attribute and value:
   $a \leftarrow v$
   where $a \in A$ and $v \in V$.

2. Composite attribute and value:
   $(a_1, a_2, \ldots, a_n) \leftarrow (v_1, v_2, \ldots, v_n)$
   where $(a_1, a_2, \ldots, a_n) \in A$, $(v_1, v_2, \ldots, v_n) \in V$, and each $v_i$ is assigned to $a_i$ ($1 \le i \le n$).

The domain of attributes includes primitive strings, references to values, sets (or lists) of values, and AVPL objects. Therefore, an AVPL object itself (or a set of AVPL objects) can also be a component of other AVPL objects, which allows nested structures.

When we denote a set (or type system) of values as $V$, the types of values are defined as follows:

1. Primitive character strings:
   $s \in S \longrightarrow s \in V$
   where $S$ denotes a set of strings.
2. References to any type of value:
   $v \in V \longrightarrow \&v \in V$
   where $\&v$ is the reference, i.e., identifier, of $v$.
3. Set of any types of values (unordered):
   $v_1, v_2, \ldots, v_n \in V \longrightarrow \{v_1, v_2, \ldots, v_n\} \in V$
4. List of any types of values (ordered):
   $v_1, v_2, \ldots, v_n \in V \longrightarrow \langle v_1, v_2, \ldots, v_n \rangle \in V$
5. AVPL objects:
   $d \in D \longrightarrow d \in V$
   where $D$ denotes a set of AVPL objects.
6. Null (empty value):
   Any attribute can be null valued, i.e., no value is assigned.
7. Identifier:
   A self-contained label, a string, which begins with `#'. The identifier is optional and is used by other AVPL objects as a reference.

3.2 Expressive Power

All the atomic values are strings in AVPL. Other kinds of atomic types like integer, float, and boolean are not provided. Those types can be easily derived from a data source, and users are freed from dealing with atomic types.

We can represent table-structured values as well as references (identifiers), sets, lists, and nested structures. The model provides table structures through a composite attribute (as table headers) and a corresponding list of composite values (as record tuples). Advanced semantics of the object-oriented model, like the is-a relationship, cannot be represented. Figure 2 is an example of an AVPL object in tabular representation.
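As an illustration of the value types listed in Section 3.1, the following Python sketch shows one possible in-memory encoding of an AVPL object, populated with the values of the example object of Figure 2. The names `AVPL`, `Ref`, `rows`, and the identifier `#seo` are hypothetical and are not part of the model definition; the mapping of sets, lists, and null to Python `set`, `list`, and `None` is likewise an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Ref:
    """Reference (&v) to another value, held as an identifier string."""
    target: str


@dataclass
class AVPL:
    """Attribute-value pairs list; `ident` is the optional '#'-prefixed label.

    Attributes are strings (singleton) or tuples of strings (composite).
    Values are strings, Ref, set, list, nested AVPL objects, or None (null).
    """
    pairs: dict = field(default_factory=dict)
    ident: Optional[str] = None

    def __post_init__(self):
        if self.ident is not None and not self.ident.startswith("#"):
            raise ValueError("identifier must begin with '#'")

    def attribute_set(self):
        return set(self.pairs)


def rows(obj, composite_attr):
    """Pair each variable of a composite attribute with its assigned value."""
    return [dict(zip(composite_attr, values)) for values in obj.pairs[composite_attr]]


# A composite attribute acts as the table header; the list holds record tuples.
education = AVPL({("Degree", "School", "Year"): [("BS", "POSTECH", "1992"),
                                                 ("MS", "POSTECH", "1994")]})
contact = AVPL({("Telephone", "Fax", "e-mail"): [("0562-279-5660", "0562-279-5699",
                                                  "dyseo@white.postech.ac.kr")]})
# All atomic values are strings; nested AVPL objects give the nested structure.
seo = AVPL({"Name": "Dong-Yal Seo", "Research": "Database",
            "Education": education, "Contact": contact}, ident="#seo")

# Hypothetical use of a reference: another object points at '#seo'.
member = AVPL({"Member": Ref("#seo")})
```

Calling `rows(education, ("Degree", "School", "Year"))` yields one dictionary per record tuple, which is the "table headers plus record tuples" reading of a composite attribute described in Section 3.2.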

Name       Dong-Yal Seo
Research   Database
Education  Degree     School     Year
           BS         POSTECH    1992
           MS         POSTECH    1994
Contact    Telephone        Fax              e-mail
           0562-279-5660    0562-279-5699    dyseo@white.postech.ac.kr

Fig. 2. Tabular Representation of an Example AVPL Object

4 Schema Construction

4.1 Schema and Objects

A schema, in an OODB, defines classes and their relationships, and the relationships among classes imply the relationships among objects. A schema defines both the structural and the behavioral part of a class. In this work, we mainly focused on the structural part.

To construct a schema from a pool of schemaless objects, we should build 1) a class from a set of instances, and 2) various relationships among those classes. First, we recall the possible relationships among classes in order to define operations for constructing classes and their relationships. There are two relationships between objects, is-a and is-part-of. The former is the basis of the inheritance hierarchy and the latter is the basis of the composition hierarchy.

A type is a collection of objects with the same structural and behavioral information in an object-oriented model. A type is implemented as a class, and the class defines a collection relationship. Not only are objects created as instances of a class, but a class can also be created as a collection of instances.

One object can be described in two or more ways. In this case, those descriptions must be merged into a single description because an object is unique in the object-oriented world. Two descriptions can thus have an equivalence relationship.

4.2 Schema Construction Operations

1. Creation of a class by Instance Collection
   For the AVPL objects $S_1, \ldots, S_m$ and their attribute-sets $A_1, \ldots, A_m$, a class of AVPL objects $U$ is constructed with ObjectCollect($S_1, \ldots, S_m$) if
   $U = \{S_1, S_2, \ldots, S_m\}$
   where the attribute-set $U_A$ of $U$ is $U_A = A_1 \cup A_2 \cup \cdots \cup A_m$. For all attributes $a \in U_A$, if there is any AVPL object $S_i$ ($1 \le i \le m$) where $a$ is not in $A_i$, the value of $a$ in $S_i$ is null.
2. Merging

   (a) Object merging
       For two AVPL objects $S$ and $T$, a new AVPL object $U$ is constructed with ObjectMerge($S$, $T$) if
       $U = \{\, w \mid (w \in S \cup T) \wedge \exists \hat{a}\, (\hat{a} \in S \wedge \hat{a} \in T) \,\}$
       where the attribute-set of $U$ is the same as that of $S \cup T$, $\hat{a}$ is a common (shared) attribute-value pair of $S$ and $T$, and $w$ is an attribute-value pair.
   (b) Class merging
       For two classes of AVPL objects $S$ and $T$, a new class of AVPL objects $U$ is constructed with ClassMerge($S$, $T$) if each $u \in U$ is constructed with ObjectMerge($s$, $t$) and $U$ is constructed with ObjectCollect($u_1, u_2, \ldots, u_m$), where $s \in S$, $t \in T$, and $u_i \in U$ for $1 \le i \le m$.
3. Composition (IS-PART-OF Relationship)
   (a) Object composition
       For two AVPL objects $S$, $T$, a new AVPL object $U$ is constructed with ObjectCompose($S$, $T$) if
       $U = (S - t) \cup \hat{T}$
       where $t \in S$ and $t \in T$. $\hat{T}$ is $T$ itself or a reference to $T$. The attribute-set of $U$ is the same as that of $S$.
   (b) Class composition
       For two classes of AVPL objects $S$ and $T$, a new class of AVPL objects $U$ is constructed with ClassCompose($S$, $T$) if each $u \in U$ is constructed with ObjectCompose($s$, $t$) and $U$ is constructed with ObjectCollect($u_1, u_2, \ldots, u_m$), where $s \in S$, $t \in T$, and $u_i \in U$ for $1 \le i \le m$.
4. Inclusion (IS-A Relationship)
   For two sets of AVPL objects $U$ and $V$, a new relationship, $U$ is a subset of $V$, can be constructed with ClassInclude($U$, $V$) if
   $U \subseteq V$
   where the attribute-sets $U_A$ and $V_A$ of $U$ and $V$, respectively, satisfy $V_A \subseteq U_A$.
5. Trivially, we can define additional operations, such as destruction, splitting, and exclusion, as the inverses of the operations defined above.

Figure 3 explains the operations ObjectMerge(), ObjectCompose(), and ObjectCollect(). Other operations like ClassMerge() or ClassCompose() can be implemented by using ObjectMerge() or ObjectCompose(), respectively, together with ObjectCollect(). ObjectCompose() in Figure 3 means composition by reference value.
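The three object-level operations can be sketched in Python as follows, using plain dictionaries (attribute to value) as stand-ins for AVPL objects and the objects o1 to o7 of Figure 3 as input. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `Ref` class, the identifier `#o4`, the rejection of conflicting values in a merge, and the reading of $t$ in ObjectCompose as "a value of $S$ that also occurs in $T$" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Stand-in for a reference (&v) to another object, by identifier."""
    target: str


def object_collect(*objects):
    """Build a class whose attribute set is the union of the objects' attribute sets.

    Attributes missing from an object are given the null value (None).
    """
    attributes = []
    for obj in objects:
        for attr in obj:
            if attr not in attributes:
                attributes.append(attr)
    return [{attr: obj.get(attr) for attr in attributes} for obj in objects]


def object_merge(s, t):
    """Union the pairs of s and t, provided they share at least one pair."""
    if not any(attr in t and t[attr] == value for attr, value in s.items()):
        raise ValueError("no common attribute-value pair")
    merged = dict(s)
    for attr, value in t.items():
        # Assumption: conflicting values for one attribute are rejected;
        # the definition does not say how such a conflict is resolved.
        if attr in merged and merged[attr] != value:
            raise ValueError(f"conflicting values for {attr!r}")
        merged[attr] = value
    return merged


def object_compose(s, t, t_ident):
    """Replace each value of s that also occurs in t by a reference to t."""
    t_values = [v for v in t.values() if v is not None]
    composed = {a: (Ref(t_ident) if v in t_values else v) for a, v in s.items()}
    if composed == s:
        raise ValueError("s and t share no value")
    return composed


# Objects from Figure 3 (the e-mail value is truncated in the figure).
o1 = {"Name": "Dong-Yal Seo", "Advisor": "J.Y. Lee", "Research": "Database"}
o2 = {"Name": "Dong-Yal Seo", "Telephone": "279-5660", "e-mail": "dyseo@white...."}
o4 = {"Name": "J.Y. Lee", "Position": "Associate Prof.", "Lab": "IIS"}
o6 = {"Name": "Chang-Yong Han", "Age": "28", "Address": "Pohang", "Home City": "Sungnam"}

o3 = object_merge(o1, o2)            # o1 and o2 share the pair (Name, Dong-Yal Seo)
o5 = object_compose(o3, o4, "#o4")   # Advisor becomes a reference to o4
o7 = object_collect(o1, o6)          # two rows over the union of attributes
```

Under these assumptions, class-level merging or composition amounts to applying `object_merge` or `object_compose` pairwise and passing the results to `object_collect`, which mirrors the remark that ClassMerge() and ClassCompose() can be built from the object-level operations.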

a) Object Merging

o1
  Name          Advisor    Research
  Dong-Yal Seo  J.Y. Lee   Database

o2
  Name          Telephone  e-mail
  Dong-Yal Seo  279-5660   dyseo@white....

o3 = Object_Merge(o1, o2)
  Name          Advisor    Research   Telephone  e-mail
  Dong-Yal Seo  J.Y. Lee   Database   279-5660   dyseo@white....

b) Object Composition

o4
  Name       Position         Lab   ....
  J.Y. Lee   Associate Prof.  IIS

o5 = Object_Compose(o3, o4)
  Name          Advisor          Research   Telephone  e-mail
  Dong-Yal Seo  (ref. to o4)     Database   279-5660   dyseo@white....

c) Object Collection to Build a Class

o6
  Name            Age  Address  Home City
  Chang-Yong Han  28   Pohang   Sungnam

o7 = Object_Collect(o1, o6)
  Name            Advisor    Research   Age  Address  Home City
  Dong-Yal Seo    J.Y. Lee   Database   -    -        -
  Chang-Yong Han  -          -          28   Pohang   Sungnam

Fig. 3. Example of Schema Construction Operations

4.3 Schema Consistency

When the user runs a schema construction procedure using the above-mentioned operations, schema evolution takes place in the pre-existing schema. In the database world, it is very important to keep the schema consistent. We introduce several effects of schema construction operations on the existing schema hierarchy and map the consequences of those effects onto the taxonomy of the schema-modification operations listed in [2]. In fact, the schema construction operations might heavily affect the static structure of classes.

For example, the merge operation can be considered an attribute-adding operation in the schema evolution taxonomy if the class to be merged has relationships with other classes. Thus, if the invariant properties of the inheritance hierarchy cannot be preserved, the operation that breaks the schema consistency rules should be rejected (refer to [6] for more about schema invariants).

We chose the schema evolution taxonomy of the ORION data model based on the comparisons in [6].

Table 1 shows the taxonomy of schema modifications in an object-oriented database and their corresponding schema construction operations. It means that we can map the consistency problems caused by our schema construction operations onto schema modification problems.

Because we did not consider the behavioral part of objects, the reader will not find any taxonomy for methods. Nor do we address the categories of default-value attributes or shared attributes defined in [2], since these functions are only meaningful in the classical object-oriented model, where class definition always precedes object instantiation.

Table 1. Schema Construction Operations and Evolutions

Schema Construction   Corresponding Schema Evolution
Merge                 Add attributes
Split                 Delete attributes and build a new class
Compose               Modify the domain's attributes
Decompose             Modify composite attributes into noncomposite attributes
Include               Make a class S the superclass of class C
Exclude               Remove a class S from the list of superclasses of class C
Collect               Create a new class

5 Conclusion and Future Work

We introduced a new model of database processing where objects are created before schema definition. We defined a type system for semistructured data instances, and the operations for the construction of a structural schema from a set of schemaless data instances.

In our data model, a schemaless data instance is created as a description which contains a list of user-defined attributes and their corresponding values. For schema construction, we defined operations for building IS-A and IS-PART-OF relationships, collecting objects to build a class, and merging two objects or classes to make a larger one. Operations can be applied at both the object level and the class level.

Our approach is suitable for applications where we collect and manage semistructured data instances, which are not created as instances of a predefined schema. A database system for collected HTML documents is a good application of our work. HTML documents have significantly less structure than the examples in this paper, and it is more difficult to extract the attribute-value pairs needed to construct the schema.

References

1. Abiteboul, S., Cluet, S., Milo, T.: Correspondence and Translation for Heterogeneous Data. Proceedings of the '97 ICDT, Delphi, Greece (1997) 352-363
2. Banerjee, J., Kim, W., Kim, H., Korth, H.: Semantics and Implementation of Schema Evolution in Object-Oriented Databases. Proceedings of the '87 ACM SIGMOD, San Francisco, CA (1987) 311-322
3. Buneman, P., Davidson, S., Hillebrand, G., Suciu, D.: A Query Language and Optimization Techniques for Unstructured Data. Proceedings of the '96 ACM SIGMOD, Montreal, Canada (1996) 505-516
4. Bowman, C., Danzig, P., Manber, U., Schwartz, M.: Scalable Internet Resource Discovery: Research Problems and Approaches. Communications of the ACM. 37(8) (1994) 98-107
5. Bowman, C., Danzig, P., Hardy, D., Manber, U., Schwartz, M.: The Harvest Information Discovery and Access System. Proceedings of the Second International World Wide Web Conference, Chicago, Illinois (1994) 763-771
6. Tsichritzis, D., ed.: Object Management. Centre Universitaire d'Informatique, University of Geneva (1990)
7. Chawathe, S., Garcia-Molina, H., Hammer, J., Ireland, K.: The TSIMMIS Project: Integration of Heterogeneous Information Sources. Proceedings of IPSJ Conference, Tokyo, Japan (1994)
8. Bancilhon, F., Delobel, C., Kanellakis, P., eds.: Building an Object-Oriented Database System: The Story of O2. Morgan Kaufmann, San Mateo, CA (1992)
9. Nestorov, S., Abiteboul, S., Motwani, R.: Inferring Structure in Semistructured Data. Proceedings of the Workshop for the Management of Semistructured Data (in conjunction with '97 ACM PODS/SIGMOD), Tucson, Arizona (1997) 42-48
10. Ossher, H., Kaplan, M., Harrison, W., Katz, A., Kruskal, V.: Subject-Oriented Composition Rules. Proceedings of the OOPSLA '95, Austin, Texas (1995) 235-250
11. Papakonstantinou, Y., Garcia-Molina, H., Widom, J.: Object Exchange Across Heterogeneous Information Sources. Proceedings of the 11th IEEE International Conference on Data Engineering, Taipei, Taiwan (1995) 251-260
12. Quass, D., Rajaraman, A., Ullman, J., Widom, J.: Querying Semistructured Heterogeneous Information. Proceedings of 4th International Conference on Deductive and Object-Oriented Databases, Singapore (1995) 319-344
13. Seo, D., Lee, D., Lee, K., Lee, J.: Discovery of Schema Information from a Forest of Selectively Labeled Ordered Trees. Proceedings of the Workshop for the Management of Semistructured Data (in conjunction with '97 ACM PODS/SIGMOD), Tucson, Arizona (1997) 54-59