XML Tree Structure and Methods of Interdependence


Reasoning About Data in XML Data Integration

Tadeusz Pankowski 1,2
1 Institute of Control and Information Engineering, Poznań University of Technology, Poland
2 Faculty of Mathematics and Computer Science, Adam Mickiewicz University, Poznań, Poland
tadeusz.pankowski@put.poznan.pl

Abstract

In this paper, we propose solutions to some problems that arise when data from different sources is to be integrated under a given target schema. We address the following problems: inferring missing data based on constraints imposed by the target schema, generating mappings from a source schema to a target schema based on key constraints and value dependencies, and merging data based on subsumptions between XML data, controlled by an ontology and by semantics defined by means of description logic.

1 Introduction

In data integration [2, 11] we identify the following issues concerning reasoning about data: (1) inferring data values which are not given explicitly in sources but can be deduced based on some constraints enforced by the target schema; (2) finding an executable mapping from a source schema into a target schema so that an instance of the target schema can be computed from a given set of source instances; (3) merging heterogeneous source data in such a way that the result is subsumed by all merged components (the result is at least as specific as any component) and is free of overlapping data.

In the process of data transformation some missing or incomplete data may be inferred. We achieve that by representing missing data by terms reflecting constraints imposed by the schema. In some cases such terms may be resolved and replaced by the actual data [15].

In the paper we propose a method for generating mappings between schemas based on key constraints and value dependencies defined by means of an XML schema. In the first step an automapping over a schema, i.e. a mapping from the schema onto itself, is generated. The automapping represents the schema. A composition of automappings over two schemas gives a mapping between these schemas. We propose a language, called XDMap, for mapping specification based on source-to-target dependencies and Skolem functions.

Data taken from different sources may differ not only in structure but also in names, concepts, precision, etc. In order to handle such data we have to use a domain ontology. However, semantic relationships provided by the ontology must be generalized to XML tree structures in order to reason about subsumptions or equivalences between XML data. We have done this using the semantics of description logic.

Section 2 illustrates the problem of inferring some missing data in data integration. In Section 3 an approach to creating executable schema mappings is proposed. We show how key constraints defined in XML Schema may be used to generate automappings and how mappings can be derived from automappings. In Section 4 we discuss subsumptions on XML data trees and their use for merging data. Section 5 concludes the paper.

2 Using constraints for inferring data in data integration

We will show how missing data may be inferred in data integration using constraints on the target schema. Suppose there are three schemas S1, S2, and S3 (Fig. 1), and that only S2 and S3 are associated with data, while S1 is a mediated (or target) schema that does not store any data. The meanings of the labels are: author (A), name (N) and university (U) of the author; paper (P), title (T), year (Y) of publication, and the conference (C) where the paper has been presented. Elements labeled with R and K are used to join authors with their papers. I2 and I3 are instances of S2 and S3, respectively.

[Figure 1: Schemas S1, S2, S3, and schema instances I2 and I3 (S1 does not have any stored instance)]

In such a scenario we face the problem of data integration (data exchange), i.e. computing target instances from source instances [3, 8, 12, 17, 20]. It is commonly agreed that mappings are needed to perform these functions effectively, where a mapping specifies a relationship between a set of source schemas and a target schema. In particular, an instance of S1 in Fig. 1 can be obtained by the transformations M21(I2) or M31(I3), or by merging, M21(I2) ∪ M31(I3) = (M21 ∪ M31)(I2, I3), where Mij denotes a mapping from Si into Sj.

We can use two kinds of constraints to define mappings, namely:
1. Value dependencies (on the target) to declare that a value of a path depends on a tuple of values of other paths;
2. Key constraints (on a source) to declare that a subtree is uniquely identified by a tuple of values of key paths.

Value dependencies can be used to infer missing data [3, 15, 20]. Suppose we want to transform the instance I2 to the target schema S1, i.e. an instance I11 = M21(I2) must be produced (Fig. 2(a)). The original instance provides no data about the publication year. We know, however, that the publication year (Y) uniquely depends on the title (T), which is denoted by the value dependency constraint Y = y(T), where y is the name of a function mapping titles into publication years. Hence, we assign the term y(t) as the text value of Y, where t is the title. This convention forces some elements of type Y to have the same values (Fig. 2(a)). Such value dependencies can be defined within a schema declared by means of an (extended) XML Schema (Fig. 3).

[Figure 2: Instances of schema S1 produced by mappings using value dependency constraints: (a) I11 = M21(I2); (b) I13 = M21(I2) ∪ M31(I3) = (M21 ∪ M31)(I2, I3)]

A term like y(t) may be resolved using other mappings. Suppose we want to merge the instance in Fig. 2(a) and I3. In this process terms denoting years will be replaced with actual values (Fig. 2(b)). Note that in this way we are able to infer the publication year of the paper written by a2. This information is given explicitly neither in I2 nor in I3.

    <xs:schema xmlns:xs="...">
      <xs:element name="1">
        <xs:complexType><xs:sequence>
          <xs:element ref="A"/></xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:element name="A">
        <xs:complexType><xs:sequence>
          <xs:element name="N" type="xs:string"/>
          <xs:element name="U" type="xs:string"/>
          <xs:element ref="P"/></xs:sequence>
        </xs:complexType>
        <xs:key name="key"><xs:selector xpath="."/>
          <xs:field xpath="N"/>
        </xs:key>
        <xs:valdep>
          <xs:target name="U"/><xs:function name="u"/>
          <xs:source xpath="N"/>
        </xs:valdep>
      </xs:element>
      <xs:element name="P">
        <xs:complexType><xs:sequence>
          <xs:element name="T" type="xs:string"/>
          <xs:element name="Y" type="xs:string"/>
        </xs:sequence>
        </xs:complexType>
        <xs:key name="key"><xs:selector xpath="."/>
          <xs:field xpath="T"/>
        </xs:key>
        <xs:valdep>
          <xs:target name="Y"/><xs:function name="y"/>
          <xs:source xpath="T"/>
        </xs:valdep>
      </xs:element>
    </xs:schema>

Figure 3: XML Schema of S1, extended with the <xs:valdep> declaration

Information provided by key constraints (the <xs:key> elements within the XML Schema, Fig. 3) is used to specify how many instances (nodes) of an element type must be in the computed target instance. For example, the element type /1/A in S1 is uniquely identified by the key path N. So, there are as many nodes of type /1/A as there are different values of /1/A/N. In S2, however, elements of type A are identified by N, but only in a context determined by the element type /2/P, which is identified by T. Thus, to identify /2/P/A we need a pair of values determined by the paths /2/P/T and /2/P/A/N.

3 XML schema mappings

3.1 Basic ideas of mappings

We will show how, from the declaration in Fig. 3, the automapping M11 over S1 can be generated (Fig. 4). The foreach clause defines variables. Lines (1) and (2) are obvious. Line (3) includes the value dependencies specified in the schema. Let $y = f($x_1) and $z = f($x_2) be two value dependencies, let Ω be a set of bindings for $x_1 and Ω′ be a set of bindings for $z and $x_2, and suppose there is no binding for $y, neither in Ω nor in Ω′ ($x_1 denotes a vector of variables). The value of $y is assigned according to the rules:
1. For a binding ω ∈ Ω, the term f(a), where a = ω($x_1), is assigned to $y.
2. If there is a binding ω′ ∈ Ω′ such that ω′($x_2) = a, then the value ω′($z) is assigned to $y (we say that the term f(a) has been resolved).
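The two assignment rules can be illustrated with a small sketch. It is not the paper's implementation: the `Term` class, the `value_for` helper, and the sample bindings are hypothetical names and values chosen for illustration, loosely following the authors, titles, and years mentioned around Figs. 1 and 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An unresolved value such as y(t1): a function name applied to arguments."""
    func: str
    args: tuple

    def __str__(self):
        return f"{self.func}({', '.join(self.args)})"


def value_for(func, args, resolved):
    """Rule 2 if some binding supplies the value, otherwise rule 1 (keep the term)."""
    return resolved.get((func, args), Term(func, args))


# Hypothetical bindings (title, author name, university) from a source without years.
bindings = [("t1", "a1", "u1"), ("t1", "a2", "u2"), ("t2", "a1", "u1")]

# Hypothetical title -> year facts supplied by a second source.
years = {("y", ("t1",)): "05", ("y", ("t2",)): "03", ("y", ("t3",)): "04"}

# Rule 1 only: every year is a term, and equal titles yield equal terms.
before = [(n, t, value_for("y", (t,), {})) for t, n, u in bindings]

# Rule 2: the same terms resolved against the second source.
after = [(n, t, value_for("y", (t,), years)) for t, n, u in bindings]

for row in before + after:
    print(*row)
```

Representing a term as an immutable value makes two occurrences of y(t1) compare equal, which is what forces elements depending on the same title to carry the same year, whether or not the term is later resolved.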
    M11 = (G11, Φ11, C11, E11) =
    (1)  foreach $y_1 in /1, $y_A in $y_1/A, $y_N in $y_A/N, $y_U in $y_A/U,
                 $y_P in $y_A/P, $y_T in $y_P/T, $y_Y in $y_P/Y
    (2)  where true
    (3)  when $y_U = u($y_N), $y_Y = y($y_T)
         exists
    (4)  F_{/1}() in F_{()}()/1
    (5)  F_{/1/A}($y_N) in F_{/1}()/A
    (6)  F_{/1/A/N}($y_N) in F_{/1/A}($y_N)/N with $y_N
    (7)  F_{/1/A/U}($y_N, $y_U) in F_{/1/A}($y_N)/U with $y_U
    (8)  F_{/1/A/P}($y_N, $y_T) in F_{/1/A}($y_N)/P
    (9)  F_{/1/A/P/T}($y_N, $y_T) in F_{/1/A/P}($y_N, $y_T)/T with $y_T
    (10) F_{/1/A/P/Y}($y_N, $y_T, $y_Y) in F_{/1/A/P}($y_N, $y_T)/Y with $y_Y

Figure 4: Automapping M11 over S1

Line (4) creates two new nodes, the root r and the node n of the outermost element of type /1, as results of the Skolem functions F_{()}() and F_{/1}(), respectively. The node n is a child of type 1 of r. Line (5) creates a new node n for any distinct value of $y_N; each such node has the type /1/A and is a child of type A of the node created by F_{/1}() in (4). In line (6), for any distinct value of $y_N a new node n of type /1/A/N is created. Each such node is

a child of type N of the node created by the invocation of F_{/1/A}($y_N) in (5) for the same value of $y_N. Because n is a leaf, it obtains the text value equal to the current value of $y_N. The remaining lines are analogous.

3.2 Capturing key constraints by automappings

In the specification of automappings, Skolem functions and their arguments play a crucial role. We assume that:
- for any path P in the schema there is exactly one Skolem function F_P(...),
- arguments of a Skolem function F_P(...) are determined by the key paths defined for the element of type P in the schema.

In S1 there is exactly one root and one outermost element, so the corresponding Skolem functions have empty lists of arguments. The element of type /1/A has the key path N. Each of its subelements inherits this key path and additionally has its local key paths. Local key paths for non-leaf elements are defined in the schema. The local key path for a leaf element is, by default, this leaf element itself. Thus, in S1 we have the following key paths: N for /1/A and for /1/A/N; (N, T) for /1/A/P and for /1/A/P/T; and (N, T, Y) for /1/A/P/Y. Values of these key paths are bound to variables and are used as arguments of Skolem functions.

In the definition of S3 (Fig. 5), the schema specifies the key and keyref relationships between the K child element of the P element (the referenced key) and the R child element of the A element (the foreign key). Additionally, the value dependency K = k(T, N) says that the path N must start at the element A referencing P via its foreign key defined in Keyref.

    <xs:element name="A">
      <xs:complexType>...</xs:complexType>...
      <xs:keyref name="keyref" refer="key">
        <xs:selector xpath="."/>
        <xs:field xpath="R"/>
      </xs:keyref>
    </xs:element>
    <xs:element name="P">
      <xs:complexType>...</xs:complexType>
      <xs:key name="key">
        <xs:selector xpath="."/>
        <xs:field xpath="K"/>
      </xs:key>
      <xs:valdep>
        <xs:target name="K"/><xs:function name="k"/>
        <xs:source xpath="T"/>
        <xs:source xpath="N" ref="keyref"/>
      </xs:valdep>...
    </xs:element>

Figure 5: Fragment of XML Schema for S3

Key references are captured as follows:
- in the exists clause any occurrence of a variable $x_f ranging over values of a foreign key is replaced with a variable $x_k ranging over values of the corresponding referenced key;
- the equality $x_f = $x_k is inserted into the where clause.

Using these rules, we obtain the following specification of the automapping over S3:

    M33 =
    foreach $z_D3 in /D3, $z_A in $z_D3/A, $z_N in $z_A/N, $z_R in $z_A/R,
            $z_P in $z_D3/P, $z_K in $z_P/K, $z_T in $z_P/T, $z_Y in $z_P/Y, $z_C in $z_P/C
    where $z_R = $z_K
    when $z_K = k($z_T, $z_N), $z_Y = y($z_T), $z_C = c($z_T)
    exists
      F_{/D3}() in F_{()}()/D3
      F_{/D3/A}($z_N) in F_{/D3}()/A
      F_{/D3/A/N}($z_N) in F_{/D3/A}($z_N)/N with $z_N
      F_{/D3/A/R}($z_N, $z_K) in F_{/D3/A}($z_N)/R with $z_K
      F_{/D3/P}($z_K) in F_{/D3}()/P
      F_{/D3/P/K}($z_K) in F_{/D3/P}($z_K)/K with $z_K
      F_{/D3/P/T}($z_K, $z_T) in F_{/D3/P}($z_K)/T with $z_T
      F_{/D3/P/Y}($z_K, $z_Y) in F_{/D3/P}($z_K)/Y with $z_Y
      F_{/D3/P/C}($z_K, $z_C) in F_{/D3/P}($z_K)/C with $z_C

3.3 Syntax and semantics for mappings

The foreach/where/when part of a mapping M determines a partially ordered set (Ω, ≤) of bindings of variables ($x, $y). For example, in the mapping M21 (Fig. 6), for two bindings over I2, ω1 = ($x_T ↦ t1, $x_N ↦ a1, $x_U ↦ u1, $y_Y ↦ y(t1)) and ω2 = ($x_T ↦ t1, $x_N ↦ a2, $x_U ↦ u2, $y_Y ↦ y(t1)), we have ω1 < ω2, because the tuple of leaf nodes providing values for ω1 precedes the tuple of leaf nodes providing values for ω2. Bindings from Ω are used in the exists part E to produce the resulting target instance. The ordering

imposed in Ω by a source instance should be preserved in the target instance. Note that if the foreach/where clause is defined over S2, while the when/exists clause concerns S1, then we deal with a mapping M21 from S2 into S1. Then, after an appropriate replacement of variables, we obtain:

    M21 =
    foreach $x_2 in /2, $x_P in $x_2/P, $x_T in $x_P/T, $x_A in $x_P/A,
            $x_N in $x_A/N, $x_U in $x_A/U
    where true
    when C11($y_N, $y_U, $y_T, $y_Y) [$y_N ↦ $x_N, $y_U ↦ $x_U, $y_T ↦ $x_T]
    exists E11($y_N, $y_U, $y_T, $y_Y) [$y_N ↦ $x_N, $y_U ↦ $x_U, $y_T ↦ $x_T]

Figure 6: Mapping M21 from S2 into S1

In M21 there is no replacement for $y_Y, thus its value must be set in some other way, e.g. as a null value [3]. We set it to the term y(t), where t is the current value of $x_T (see Fig. 2(a)). It is a form of Skolemization. Thus, a mapping specification in XDMap conforms to the general form of source-to-target generating dependencies [1, 9, 12, 13]:

    ∀$x (G($x) ∧ Φ($x) ⇒ ∃$y C($x, $y) ∧ E($x, $y)).

Definition 1. An executable schema mapping in XDMap (or mapping for short) between a source schema S and a target schema T is a sequence M ::= (M, ..., M) of mapping constraints between S and T, where:

    M := foreach G($x)
         where Φ($x)
         when C($x, $y)
         exists F_{P/l}($x, $y) in F_P($x′, $y′)/l [with $x″]

Here $x and $y denote vectors of variables, and:
- G is a list of variable definitions over a source schema: $x in Q or $x in $x′/Q;
- Φ is a conjunction of atomic conditions: $x = $x′ or $x ≠ $x′;
- C is a list of target constraints $x_i = f($x) or $y_j = f($x), where $x_i ∈ $x and $y_j ∈ $y;
- F_P($x, $y) is a Skolem term, where P is a rooted path in a target schema;
- ($x′, $y′) ⊆ ($x, $y) and $x″ ∈ ($x, $y).

Definition 2. Let M = (G, Φ, C, E)($x, $y) be a mapping, and let (Ω, ≤) be a partially ordered set of bindings of variables ($x, $y) determined by (G, Φ, C). A target instance I of a target schema T is then obtained as follows:
1. F_{()}() = r is the root of I.
2. F_P($x, $y)(ω) = n is a node of type P.
3. If F_{P/l}($x, $y)(ω) = n and F_P($x′, $y′)(ω) = n′, and ($x′, $y′) ⊆ ($x, $y), then n is a child of type l of the node n′.
4. Let F_{P/l}($x, $y)(ω1) = n1 and F_{P/l}($x, $y)(ω2) = n2, where ω1 ≤ ω2 and ($x′, $y′)(ω1) = ($x′, $y′)(ω2). Then n1 ≤ n2 in the document order in the set of children of type l of the node F_P($x′, $y′)(ω1).
5. If F_{P/l}($x, $y)(ω) = n is a leaf, then the text value of n is equal to ω($x″).

4 Subsumptions on XML data trees

Until now we have assumed that source documents are ontologically homogeneous. In real applications [16], however, we need domain ontologies to make use of relationships between the concepts used for data modeling. Relationships between concepts need to be generalized to cope with XML data trees. Then XML data taken from different sources can be merged into a document that is the greatest lower bound of the set of data being merged, i.e. is subsumed by the data. To discuss the problem more precisely, we will use a simple tree language L to express paths and tree patterns (at the schema level) as well as values and trees (at the instance level):

    T ::= P | P/(T, ..., T)          (tree patterns)
    P ::= l | l/P                    (paths)
    t ::= v | P:v | P/(t, ..., t)    (trees)
    v ::= s | (v, ..., v)            (values)

where l is a node label and s is a string value. Note that a tree pattern is a set of paths with a common prefix.
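The instance-level part of the grammar of L can be mirrored by a few data types. The sketch below is only one possible encoding, assumed for illustration: the class names `Labeled` and `Node` and the `show` printer are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# v ::= s | (v, ..., v): a string, or a tuple of values.
Value = Union[str, tuple]


@dataclass(frozen=True)
class Labeled:
    """P:v, a path whose last element carries the value v."""
    path: Tuple[str, ...]
    value: Value


@dataclass(frozen=True)
class Node:
    """P/(t, ..., t), a path with a tuple of subtrees below it."""
    path: Tuple[str, ...]
    children: tuple


def show(t) -> str:
    """Print a value or tree in the concrete syntax of L."""
    if isinstance(t, str):
        return t
    if isinstance(t, tuple):
        return "(" + ", ".join(show(v) for v in t) + ")"
    if isinstance(t, Labeled):
        return "/".join(t.path) + ": " + show(t.value)
    return "/".join(t.path) + "/(" + ", ".join(show(c) for c in t.children) + ")"


author = Node(("author",), (Labeled(("fname",), "John"), Labeled(("lname",), "Smith")))
print(show(author))
print(show(Labeled(("author",), ("John", "Smith"))))
```

Paths are kept as tuples of labels so that a common prefix of several paths can be compared directly, which matches the remark that a tree pattern is a set of paths with a common prefix.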

To define the semantics of L, we follow the approach used in description logic [4]. Let Δ be a non-empty set of individuals, and let child be a transitively closed binary relation over Δ. An interpretation of L is a function ·^I defined as follows:

    s^I ⊆ Δ,  l^I ⊆ Δ
    (v1, ..., vn)^I = v1^I ∩ ... ∩ vn^I
    (l/P)^I = (l^I ⋈_child P^I).2,
        where (X ⋈_child Y).2 = {y ∈ Y | ∃x (x ∈ X ∧ (x, y) ∈ child)}
    (T1, ..., Tn)^I = T1^I ∩ ... ∩ Tn^I
    (P/(T1, ..., Tn))^I = (P^I ⋈_child (T1, ..., Tn)^I).2
    (t1, ..., tn)^I = t1^I ∩ ... ∩ tn^I
    (P/(t1, ..., tn))^I = (P^I ⋈_child (t1, ..., tn)^I).2
    (P:v)^I = (P^I ⋈_child v^I).2

We say that an expression E1 is subsumed by an expression E2, or that E2 subsumes E1, written E1 ⊑ E2, if E1^I ⊆ E2^I. If both E1 ⊑ E2 and E2 ⊑ E1, then E1 is equivalent to E2, written E1 ≡ E2.

Theorem 1. The following rules hold:
R1. (v1, ..., vn) ⊑ (v1, ..., vi), (T1, ..., Tn) ⊑ (T1, ..., Ti), (t1, ..., tn) ⊑ (t1, ..., ti), for any 1 ≤ i ≤ n;
R2. P/t ⊑ t;
R3. P/P′:v ⊑ P:v;
R4. if P1/t1 and P2/t2 are valid trees, then P1 ⊑ P2 ∧ t1 ⊑ t2 ⇒ P1/t1 ⊑ P2/t2;
R5. P/(P1:v1, ..., Pn:vn) ⊑ P:(v1, ..., vn).

Proof. (R1) follows from the properties of set intersection. To prove (R2), note that (P/t)^I = (P^I ⋈_child t^I).2 ⊆ t^I. In the proof of (R3) we use the fact that the child relation is transitively closed, thus we have (P/P′:v)^I = ((P^I ⋈_child P′^I).2 ⋈_child v^I).2 ⊆ (P^I ⋈_child v^I).2. (R4) is a standard property of partial ordering relations. (R5) follows from the definition and from (R1) and (R3): (P/(P1:v1, ..., Pn:vn))^I = (P/P1:v1, ..., P/Pn:vn)^I ⊆ (P:v1, ..., P:vn)^I = (P:(v1, ..., vn))^I.

In data integration we try to merge different XML documents into one duplicate-free, well-constructed document.
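The role of the transitively closed child relation in rule R3 can be checked on a tiny interpretation. This is a hedged sketch over hypothetical individuals "a", "f", and "j" (an author node, an fname node, and a text node); the helper names `transitive_closure` and `below` are assumptions introduced here, not notation from the paper.

```python
def transitive_closure(rel):
    """Smallest transitively closed relation containing rel."""
    closure = set(rel)
    while True:
        extra = {(x, w) for (x, y) in closure for (z, w) in closure if y == z}
        if extra <= closure:
            return closure
        closure |= extra


def below(xs, ys, child):
    """Members of ys that are children of some member of xs under the given relation."""
    return {y for y in ys if any((x, y) in child for x in xs)}


# Interpretations of the labels author and fname and of the value John.
author, fname, john = {"a"}, {"f"}, {"j"}
edges = {("a", "f"), ("f", "j")}          # direct parent-child edges only
child = transitive_closure(edges)         # adds ("a", "j")

nested = below(below(author, fname, child), john, child)  # author/fname : John
flat = below(author, john, child)                         # author : John
print(nested <= flat)

# Without the closure the inclusion required by R3 no longer holds.
print(nested <= below(author, john, edges))
```

On this interpretation the nested expression denotes the set {"j"}, and it is contained in the denotation of the flat expression only because the closure relates "a" to "j" directly.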
In order to realize this we can use:
- definitions of source data schemas given by means of DTDs or XML Schemas, if they are available;
- domain ontologies, both for names and tags (at the schema level) and for values (at the instance level);
- any other resources which can be used to understand and classify data correctly, such as dictionaries, taxonomies, thesauri, user-provided match and mismatch information, as well as knowledge discovered in data, e.g. keys and statistical characteristics.

Using these resources and methods, we can classify XML tree fragments, such as values, paths, and tree patterns, into equivalence classes with respect to the synonymy relation. The value representing a class of semantically equivalent values resolves such issues as diversity of currencies, measures, and representation formats, and so overcomes difficulties in duplicate elimination and value comparison. For text values there is a problem with synonyms, different languages, jargon, and so on. To solve these problems, methods from information retrieval can be used [5, 7]. Next, subsumption on these classes can be defined, where v1 ⊑ v2 means that v1 is more desirable than v2, because v1 is more informative, more reliable (one database may be considered to be more reliable than others), or has higher precision. A correct definition of this relation is crucial because it is used to define the subsumption relation over complex expressions.

In order to define subsumption on tree patterns, we start by establishing it on individual labels. As for values, patterns with different syntax may have the same meaning, e.g. fname, first-name, and firstname belong to the same equivalence class. The path author/name and the tree pattern author/name/(fname, lname) will belong to different, but somehow related, equivalence classes. Again, identification of such patterns can be supported by ontologies, statistics, and machine-learning methods [10, 18].

[Figure 7: Source data trees t1 and t2, and their join t3. Fat arrows denote equivalent key paths. t1 = article/(title: title-1, author/(fname: John, lname: Smith)); t2 = paper/(title: title-1, author: John Smith, journal: journal-1); t3 = paper/(title: title-1, author/(fname: John, lname: Smith), journal: journal-1)]

For complex patterns, the subsumption relation can be inferred from atomic patterns by means of the rules proved in Theorem 1. The following inference rules follow from Theorem 1 and are of special importance for data merging during data integration:

    P1 ⊑ P2 ∧ v1 ⊑ v2 ⇒ P1:v1 ⊑ P2:v2,
    P/(P1, ..., Pn) ⊑ P′/(P′1, ..., P′m) ∧ (v1, ..., vn) ⊑ (v′1, ..., v′m)
        ⇒ P/(P1:v1, ..., Pn:vn) ⊑ P′/(P′1:v′1, ..., P′m:v′m)

It follows from Theorem 1 that it is sufficient to inspect subsumptions between trees and paths, rather than between trees and trees.

Example. For the data trees t1, t2, and t3 from Fig. 7, we have:
Patterns: article ≡ paper, author/fname ⊑ author, author/lname ⊑ author, author/(fname, lname) ⊑ author.
Values: John Smith ⊑ John, John Smith ⊑ Smith.
Trees: author/(fname: John, lname: Smith) ⊑ author: John Smith, t3 ⊑ t1, t3 ⊑ t2.

Note that if we restricted ourselves to paths only, we would not be able to construct the expected minimal result tree t3 from t1 and t2, because neither author/fname: John nor author/lname: Smith is subsumed by the path author: John Smith.

Trees t1 and t2 from Fig. 7 can be joined because there are two keys holding in t1 and t2, respectively, which are equivalent and have equivalent values, i.e. article/title: title-1 ≡ paper/title: title-1. Thus, these trees can be treated as describing the same entity from the semantic domain of interest. When trees describe different entities they are non-joinable.
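The join of t1 and t2 into t3 can be sketched as follows. This is a simplified illustration, not the paper's merging procedure: trees are encoded as (label, children) pairs, the `CANONICAL` table stands in for an ontology stating article ≡ paper, and `more_specific` replaces a general subsumption test with the assumed heuristic that a structured subtree is more specific than a flat value. All of these names are hypothetical.

```python
# Assumed label equivalence (article ≡ paper), standing in for a domain ontology.
CANONICAL = {"article": "paper"}


def canon(label):
    """Map a label to the representative of its equivalence class."""
    return CANONICAL.get(label, label)


def more_specific(a, b):
    """Prefer a structured subtree over a flat string value (assumed heuristic)."""
    if isinstance(b, dict) and not isinstance(a, dict):
        return b
    return a


def join(t1, t2, key="title"):
    """Join two trees whose roots are equivalent and whose key values agree."""
    (label1, kids1), (label2, kids2) = t1, t2
    if canon(label1) != canon(label2) or kids1.get(key) != kids2.get(key):
        return None  # non-joinable: the trees describe different entities
    merged = dict(kids1)
    for label, sub in kids2.items():
        merged[label] = more_specific(merged[label], sub) if label in merged else sub
    return (canon(label1), merged)


t1 = ("article", {"title": "title-1", "author": {"fname": "John", "lname": "Smith"}})
t2 = ("paper", {"title": "title-1", "author": "John Smith", "journal": "journal-1"})
t3 = join(t1, t2)
print(t3)
```

The key check plays the role of the equivalent key paths in Fig. 7: only when the titles agree are the two trees treated as descriptions of the same entity, and the join keeps, for each label, the more specific of the two subtrees.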
Non-joinable trees are merged in such a way that a new root label is created and all trees under consideration become the highest-level subtrees of the newly created root.

5 Conclusion

We discussed some reasoning methods useful in XML data integration systems. We motivated our research by a data exchange scenario in which data structured under source schemas is to be transformed into data structured under another schema (a target schema). In such data integration some missing or incomplete data can be inferred. The reasoning about missing data is based on data constraints imposed by the target schema. Integration of data needs mappings which describe the transformation from a source schema into a target schema. We propose a novel approach to XML schema mapping specification based on key constraints [6, 19]. First, automappings over schemas are generated, and next the automappings are combined to create mappings between the schemas represented by these automappings. The other kind of reasoning is based on ontologies and concerns the problem of finding the greatest lower bound of merged data. The assumption of the existence of some domain-oriented taxonomies and ontologies makes the problem more feasible than in the case of deep Web integration [10]. The method presented in the paper is a part of our research on XML data integration [16, 15], XML data transformation [14], and query reformulation.

References

[1] S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, Reading, Massachusetts.
[2] S. Abiteboul, L. Segoufin, and V. Vianu. Representing and Querying XML with Incomplete Information. In PODS Conference.
[3] M. Arenas and L. Libkin. XML Data Exchange: Consistency and Query Answering. In PODS, pages 13-24.
[4] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, and P. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation and Applications. Cambridge.
[5] R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval. Addison Wesley, New York.
[6] P. Buneman, S. B. Davidson, W. Fan, C. S. Hara, and W. C. Tan. Reasoning about keys for XML. Information Systems, 28(8).
[7] J. C. P. Carvalho and A. S. da Silva. Finding similar identities among objects from multiple web sources. In WIDM 2003. ACM.
[8] R. Fagin, P. G. Kolaitis, and L. Popa. Data exchange: getting to the core. ACM TODS, 30(1).
[9] R. Fagin, P. G. Kolaitis, L. Popa, and W. C. Tan. Composing schema mappings: Second-order dependencies to the rescue. In PODS, pages 83-94.
[10] B. He, K. C.-C. Chang, and J. Han. Discovering complex matchings across web query interfaces: a correlation mining approach. In KDD 2004. ACM.
[11] M. Lenzerini. Data integration: A theoretical perspective. In PODS.
[12] S. Melnik, P. A. Bernstein, A. Y. Halevy, and E. Rahm. Supporting executable mappings in model management. In SIGMOD Conference.
[13] A. Nash, P. A. Bernstein, and S. Melnik. Composition of mappings given by embedded dependencies. In PODS.
[14] T. Pankowski. A High-Level Language for Specifying XML Data Transformations. In ADBIS, Lecture Notes in Computer Science, 3255.
[15] T. Pankowski. Management of executable schema mappings for XML data exchange. In DATAX 2006, EDBT 2006 Workshops. Lecture Notes in Computer Science (to appear), pages 1-12.
[16] T. Pankowski and E. Hunt. Data merging in life science data integration systems. In Intelligent Information Systems, Advances in Soft Computing. Springer Verlag.
[17] L. Popa, Y. Velegrakis, R. J. Miller, M. A. Hernández, and R. Fagin. Translating web data. In VLDB.
[18] A. Theobald and G. Weikum. The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking. In EDBT, Lecture Notes in Computer Science, 2287.
[19] XML Schema Part 1: Structures, 2nd Edition.
[20] C. Yu and L. Popa. Constraint-based XML query rewriting for data integration. In SIGMOD Conference, 2004.


More information

XML with Incomplete Information

XML with Incomplete Information XML with Incomplete Information Pablo Barceló Leonid Libkin Antonella Poggi Cristina Sirangelo Abstract We study models of incomplete information for XML, their computational properties, and query answering.

More information

Caching XML Data on Mobile Web Clients

Caching XML Data on Mobile Web Clients Caching XML Data on Mobile Web Clients Stefan Böttcher, Adelhard Türling University of Paderborn, Faculty 5 (Computer Science, Electrical Engineering & Mathematics) Fürstenallee 11, D-33102 Paderborn,

More information

Constraint-Based XML Query Rewriting for Data Integration

Constraint-Based XML Query Rewriting for Data Integration Constraint-Based XML Query Rewriting for Data Integration Cong Yu Department of EECS, University of Michigan congy@eecs.umich.edu Lucian Popa IBM Almaden Research Center lucian@almaden.ibm.com ABSTRACT

More information

Extending Data Processing Capabilities of Relational Database Management Systems.

Extending Data Processing Capabilities of Relational Database Management Systems. Extending Data Processing Capabilities of Relational Database Management Systems. Igor Wojnicki University of Missouri St. Louis Department of Mathematics and Computer Science 8001 Natural Bridge Road

More information

Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?

Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)? Database Indexes How costly is this operation (naive solution)? course per weekday hour room TDA356 2 VR Monday 13:15 TDA356 2 VR Thursday 08:00 TDA356 4 HB1 Tuesday 08:00 TDA356 4 HB1 Friday 13:15 TIN090

More information

Network (Tree) Topology Inference Based on Prüfer Sequence

Network (Tree) Topology Inference Based on Prüfer Sequence Network (Tree) Topology Inference Based on Prüfer Sequence C. Vanniarajan and Kamala Krithivasan Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai 600036 vanniarajanc@hcl.in,

More information

Introduction to XML. Data Integration. Structure in Data Representation. Yanlei Diao UMass Amherst Nov 15, 2007

Introduction to XML. Data Integration. Structure in Data Representation. Yanlei Diao UMass Amherst Nov 15, 2007 Introduction to XML Yanlei Diao UMass Amherst Nov 15, 2007 Slides Courtesy of Ramakrishnan & Gehrke, Dan Suciu, Zack Ives and Gerome Miklau. 1 Structure in Data Representation Relational data is highly

More information

Data Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza

Data Integration. Maurizio Lenzerini. Universitá di Roma La Sapienza Data Integration Maurizio Lenzerini Universitá di Roma La Sapienza DASI 06: Phd School on Data and Service Integration Bertinoro, December 11 15, 2006 M. Lenzerini Data Integration DASI 06 1 / 213 Structure

More information

A single minimal complement for the c.e. degrees

A single minimal complement for the c.e. degrees A single minimal complement for the c.e. degrees Andrew Lewis Leeds University, April 2002 Abstract We show that there exists a single minimal (Turing) degree b < 0 s.t. for all c.e. degrees 0 < a < 0,

More information

virtual class local mappings semantically equivalent local classes ... Schema Integration

virtual class local mappings semantically equivalent local classes ... Schema Integration Data Integration Techniques based on Data Quality Aspects Michael Gertz Department of Computer Science University of California, Davis One Shields Avenue Davis, CA 95616, USA gertz@cs.ucdavis.edu Ingo

More information

Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

More information

Checking Access to Protected Members in the Java Virtual Machine

Checking Access to Protected Members in the Java Virtual Machine Checking Access to Protected Members in the Java Virtual Machine Alessandro Coglio Kestrel Institute 3260 Hillview Avenue, Palo Alto, CA 94304, USA Ph. +1-650-493-6871 Fax +1-650-424-1807 http://www.kestrel.edu/

More information

Fuzzy Duplicate Detection on XML Data

Fuzzy Duplicate Detection on XML Data Fuzzy Duplicate Detection on XML Data Melanie Weis Humboldt-Universität zu Berlin Unter den Linden 6, Berlin, Germany mweis@informatik.hu-berlin.de Abstract XML is popular for data exchange and data publishing

More information

A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS

A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS A MEDIATION LAYER FOR HETEROGENEOUS XML SCHEMAS Abdelsalam Almarimi 1, Jaroslav Pokorny 2 Abstract This paper describes an approach for mediation of heterogeneous XML schemas. Such an approach is proposed

More information

Managing large sound databases using Mpeg7

Managing large sound databases using Mpeg7 Max Jacob 1 1 Institut de Recherche et Coordination Acoustique/Musique (IRCAM), place Igor Stravinsky 1, 75003, Paris, France Correspondence should be addressed to Max Jacob (max.jacob@ircam.fr) ABSTRACT

More information

Translating WFS Query to SQL/XML Query

Translating WFS Query to SQL/XML Query Translating WFS Query to SQL/XML Query Vânia Vidal, Fernando Lemos, Fábio Feitosa Departamento de Computação Universidade Federal do Ceará (UFC) Fortaleza CE Brazil {vvidal, fernandocl, fabiofbf}@lia.ufc.br

More information

A Semantic Dissimilarity Measure for Concept Descriptions in Ontological Knowledge Bases

A Semantic Dissimilarity Measure for Concept Descriptions in Ontological Knowledge Bases A Semantic Dissimilarity Measure for Concept Descriptions in Ontological Knowledge Bases Claudia d Amato, Nicola Fanizzi, Floriana Esposito Dipartimento di Informatica, Università degli Studi di Bari Campus

More information

What to Ask to a Peer: Ontology-based Query Reformulation

What to Ask to a Peer: Ontology-based Query Reformulation What to Ask to a Peer: Ontology-based Query Reformulation Diego Calvanese Faculty of Computer Science Free University of Bolzano/Bozen Piazza Domenicani 3, I-39100 Bolzano, Italy calvanese@inf.unibz.it

More information

A Scalable Approach for Large-Scale. Schema Mediation. Khalid Saleem, Zohra Bellahsène LIRMM CNRS/Université Montpellier 2, France

A Scalable Approach for Large-Scale. Schema Mediation. Khalid Saleem, Zohra Bellahsène LIRMM CNRS/Université Montpellier 2, France A Scalable Approach for Large-Scale Schema Mediation Khalid Saleem, Zohra Bellahsène LIRMM CNRS/Université Montpellier 2, France Outline Introduction The matching problem Brief state of the art A hybrid

More information

Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms

Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Irina Astrova 1, Bela Stantic 2 1 Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn,

More information

XML Interoperability

XML Interoperability XML Interoperability Laks V. S. Lakshmanan Department of Computer Science University of British Columbia Vancouver, BC, Canada laks@cs.ubc.ca Fereidoon Sadri Department of Mathematical Sciences University

More information

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

More information

Logical and categorical methods in data transformation (TransLoCaTe)

Logical and categorical methods in data transformation (TransLoCaTe) Logical and categorical methods in data transformation (TransLoCaTe) 1 Introduction to the abbreviated project description This is an abbreviated project description of the TransLoCaTe project, with an

More information

Question Answering and the Nature of Intercomplete Databases

Question Answering and the Nature of Intercomplete Databases Certain Answers as Objects and Knowledge Leonid Libkin School of Informatics, University of Edinburgh Abstract The standard way of answering queries over incomplete databases is to compute certain answers,

More information

A New Marketing Channel Management Strategy Based on Frequent Subtree Mining

A New Marketing Channel Management Strategy Based on Frequent Subtree Mining A New Marketing Channel Management Strategy Based on Frequent Subtree Mining Daoping Wang Peng Gao School of Economics and Management University of Science and Technology Beijing ABSTRACT For most manufacturers,

More information

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.

Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. White Paper Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. Using LSI for Implementing Document Management Systems By Mike Harrison, Director,

More information

Grid Data Integration based on Schema-mapping

Grid Data Integration based on Schema-mapping Grid Data Integration based on Schema-mapping Carmela Comito and Domenico Talia DEIS, University of Calabria, Via P. Bucci 41 c, 87036 Rende, Italy {ccomito, talia}@deis.unical.it http://www.deis.unical.it/

More information

(LMCS, p. 317) V.1. First Order Logic. This is the most powerful, most expressive logic that we will examine.

(LMCS, p. 317) V.1. First Order Logic. This is the most powerful, most expressive logic that we will examine. (LMCS, p. 317) V.1 First Order Logic This is the most powerful, most expressive logic that we will examine. Our version of first-order logic will use the following symbols: variables connectives (,,,,

More information

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model

COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model COMP 378 Database Systems Notes for Chapter 7 of Database System Concepts Database Design and the Entity-Relationship Model The entity-relationship (E-R) model is a a data model in which information stored

More information

Binary Coded Web Access Pattern Tree in Education Domain

Binary Coded Web Access Pattern Tree in Education Domain Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: kc.gomathi@gmail.com M. Moorthi

More information

The Import & Export of Data from a Database

The Import & Export of Data from a Database The Import & Export of Data from a Database Introduction The aim of these notes is to investigate a conceptually simple model for importing and exporting data into and out of an object-relational database,

More information

Oracle Database 10g: Introduction to SQL

Oracle Database 10g: Introduction to SQL Oracle University Contact Us: 1.800.529.0165 Oracle Database 10g: Introduction to SQL Duration: 5 Days What you will learn This course offers students an introduction to Oracle Database 10g database technology.

More information

Universal. Event. Product. Computer. 1 warehouse.

Universal. Event. Product. Computer. 1 warehouse. Dynamic multi-dimensional models for text warehouses Maria Zamr Bleyberg, Karthik Ganesh Computing and Information Sciences Department Kansas State University, Manhattan, KS, 66506 Abstract In this paper,

More information

Secure Semantic Web Service Using SAML

Secure Semantic Web Service Using SAML Secure Semantic Web Service Using SAML JOO-YOUNG LEE and KI-YOUNG MOON Information Security Department Electronics and Telecommunications Research Institute 161 Gajeong-dong, Yuseong-gu, Daejeon KOREA

More information

Integrating and Exchanging XML Data using Ontologies

Integrating and Exchanging XML Data using Ontologies Integrating and Exchanging XML Data using Ontologies Huiyong Xiao and Isabel F. Cruz Department of Computer Science University of Illinois at Chicago {hxiao ifc}@cs.uic.edu Abstract. While providing a

More information

Query Reformulation over Ontology-based Peers (Extended Abstract)

Query Reformulation over Ontology-based Peers (Extended Abstract) Query Reformulation over Ontology-based Peers (Extended Abstract) Diego Calvanese 1, Giuseppe De Giacomo 2, Domenico Lembo 2, Maurizio Lenzerini 2, and Riccardo Rosati 2 1 Faculty of Computer Science,

More information

Oracle Database 12c: Introduction to SQL Ed 1.1

Oracle Database 12c: Introduction to SQL Ed 1.1 Oracle University Contact Us: 1.800.529.0165 Oracle Database 12c: Introduction to SQL Ed 1.1 Duration: 5 Days What you will learn This Oracle Database: Introduction to SQL training helps you write subqueries,

More information

Efficiently Identifying Inclusion Dependencies in RDBMS

Efficiently Identifying Inclusion Dependencies in RDBMS Efficiently Identifying Inclusion Dependencies in RDBMS Jana Bauckmann Department for Computer Science, Humboldt-Universität zu Berlin Rudower Chaussee 25, 12489 Berlin, Germany bauckmann@informatik.hu-berlin.de

More information

Physical Data Organization

Physical Data Organization Physical Data Organization Database design using logical model of the database - appropriate level for users to focus on - user independence from implementation details Performance - other major factor

More information

04 XML Schemas. Software Technology 2. MSc in Communication Sciences 2009-10 Program in Technologies for Human Communication Davide Eynard

04 XML Schemas. Software Technology 2. MSc in Communication Sciences 2009-10 Program in Technologies for Human Communication Davide Eynard MSc in Communication Sciences 2009-10 Program in Technologies for Human Communication Davide Eynard Software Technology 2 04 XML Schemas 2 XML: recap and evaluation During last lesson we saw the basics

More information

Completing Description Logic Knowledge Bases using Formal Concept Analysis

Completing Description Logic Knowledge Bases using Formal Concept Analysis Completing Description Logic Knowledge Bases using Formal Concept Analysis Franz Baader, 1 Bernhard Ganter, 1 Barış Sertkaya, 1 and Ulrike Sattler 2 1 TU Dresden, Germany and 2 The University of Manchester,

More information

Formal Engineering for Industrial Software Development

Formal Engineering for Industrial Software Development Shaoying Liu Formal Engineering for Industrial Software Development Using the SOFL Method With 90 Figures and 30 Tables Springer Contents Introduction 1 1.1 Software Life Cycle... 2 1.2 The Problem 4 1.3

More information

A first step towards modeling semistructured data in hybrid multimodal logic

A first step towards modeling semistructured data in hybrid multimodal logic A first step towards modeling semistructured data in hybrid multimodal logic Nicole Bidoit * Serenella Cerrito ** Virginie Thion * * LRI UMR CNRS 8623, Université Paris 11, Centre d Orsay. ** LaMI UMR

More information

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials

An Ontology Based Method to Solve Query Identifier Heterogeneity in Post- Genomic Clinical Trials ehealth Beyond the Horizon Get IT There S.K. Andersen et al. (Eds.) IOS Press, 2008 2008 Organizing Committee of MIE 2008. All rights reserved. 3 An Ontology Based Method to Solve Query Identifier Heterogeneity

More information

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification

Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Chapter 5 More SQL: Complex Queries, Triggers, Views, and Schema Modification Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 5 Outline More Complex SQL Retrieval Queries

More information

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets

Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent

More information

Relational Databases

Relational Databases Relational Databases Jan Chomicki University at Buffalo Jan Chomicki () Relational databases 1 / 18 Relational data model Domain domain: predefined set of atomic values: integers, strings,... every attribute

More information

1. Domain Name System

1. Domain Name System 1.1 Domain Name System (DNS) 1. Domain Name System To identify an entity, the Internet uses the IP address, which uniquely identifies the connection of a host to the Internet. However, people prefer to

More information

On Transiting Key in XML Data Transformation for Integration

On Transiting Key in XML Data Transformation for Integration On Transiting Key in XML Data Transformation for Integration Md. Sumon Shahriar and Jixue Liu Data and Web Engineering Lab School of Computer and Information Science University of South Australia, Adelaide,

More information

Group Preferences in Social Network Services

Group Preferences in Social Network Services Group Preferences in Social Network Services Florian Wenzel Institute for Computer Science University of Augsburg 86135 Augsburg, Germany wenzel@informatik.uni-augsburg.de Werner Kießling Institute for

More information

Data exchange. L. Libkin 1 Data Integration and Exchange

Data exchange. L. Libkin 1 Data Integration and Exchange Data exchange Source schema, target schema; need to transfer data between them. A typical scenario: Two organizations have their legacy databases, schemas cannot be changed. Data from one organization

More information

Relationship-Based Change Propagation: A Case Study

Relationship-Based Change Propagation: A Case Study Relationship-Based Change Propagation: A Case Study by Winnie Lai A thesis submitted in conformity with the requirements for the degree of Master of Computer Science Department of Computer Science University

More information

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy

Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy Regular Expressions with Nested Levels of Back Referencing Form a Hierarchy Kim S. Larsen Odense University Abstract For many years, regular expressions with back referencing have been used in a variety

More information

Lecture 17 : Equivalence and Order Relations DRAFT

Lecture 17 : Equivalence and Order Relations DRAFT CS/Math 240: Introduction to Discrete Mathematics 3/31/2011 Lecture 17 : Equivalence and Order Relations Instructor: Dieter van Melkebeek Scribe: Dalibor Zelený DRAFT Last lecture we introduced the notion

More information

Analysis of Algorithms I: Binary Search Trees

Analysis of Algorithms I: Binary Search Trees Analysis of Algorithms I: Binary Search Trees Xi Chen Columbia University Hash table: A data structure that maintains a subset of keys from a universe set U = {0, 1,..., p 1} and supports all three dictionary

More information

A Framework for Ontology-Based Knowledge Management System

A Framework for Ontology-Based Knowledge Management System A Framework for Ontology-Based Knowledge Management System Jiangning WU Institute of Systems Engineering, Dalian University of Technology, Dalian, 116024, China E-mail: jnwu@dlut.edu.cn Abstract Knowledge

More information

AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS

AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS AN AI PLANNING APPROACH FOR GENERATING BIG DATA WORKFLOWS Wesley Deneke 1, Wing-Ning Li 2, and Craig Thompson 2 1 Computer Science and Industrial Technology Department, Southeastern Louisiana University,

More information

Report on the Dagstuhl Seminar Data Quality on the Web

Report on the Dagstuhl Seminar Data Quality on the Web Report on the Dagstuhl Seminar Data Quality on the Web Michael Gertz M. Tamer Özsu Gunter Saake Kai-Uwe Sattler U of California at Davis, U.S.A. U of Waterloo, Canada U of Magdeburg, Germany TU Ilmenau,

More information

Peer Data Exchange. ACM Transactions on Database Systems, Vol. V, No. N, Month 20YY, Pages 1 0??.

Peer Data Exchange. ACM Transactions on Database Systems, Vol. V, No. N, Month 20YY, Pages 1 0??. Peer Data Exchange Ariel Fuxman 1 University of Toronto Phokion G. Kolaitis 2 IBM Almaden Research Center Renée J. Miller 1 University of Toronto and Wang-Chiew Tan 3 University of California, Santa Cruz

More information

The composition of Mappings in a Nautural Interface

The composition of Mappings in a Nautural Interface Composing Schema Mappings: Second-Order Dependencies to the Rescue Ronald Fagin IBM Almaden Research Center fagin@almaden.ibm.com Phokion G. Kolaitis UC Santa Cruz kolaitis@cs.ucsc.edu Wang-Chiew Tan UC

More information

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX

A Comparison of Database Query Languages: SQL, SPARQL, CQL, DMX ISSN: 2393-8528 Contents lists available at www.ijicse.in International Journal of Innovative Computer Science & Engineering Volume 3 Issue 2; March-April-2016; Page No. 09-13 A Comparison of Database

More information

Deferred node-copying scheme for XQuery processors

Deferred node-copying scheme for XQuery processors Deferred node-copying scheme for XQuery processors Jan Kurš and Jan Vraný Software Engineering Group, FIT ČVUT, Kolejn 550/2, 160 00, Prague, Czech Republic kurs.jan@post.cz, jan.vrany@fit.cvut.cz Abstract.

More information

Data Quality in Information Integration and Business Intelligence

Data Quality in Information Integration and Business Intelligence Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies

More information

! " # The Logic of Descriptions. Logics for Data and Knowledge Representation. Terminology. Overview. Three Basic Features. Some History on DLs

!  # The Logic of Descriptions. Logics for Data and Knowledge Representation. Terminology. Overview. Three Basic Features. Some History on DLs ,!0((,.+#$),%$(-&.& *,2(-$)%&2.'3&%!&, Logics for Data and Knowledge Representation Alessandro Agostini agostini@dit.unitn.it University of Trento Fausto Giunchiglia fausto@dit.unitn.it The Logic of Descriptions!$%&'()*$#)

More information

BUSINESS RULES AND GAP ANALYSIS

BUSINESS RULES AND GAP ANALYSIS Leading the Evolution WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Discovery and management of business rules avoids business disruptions WHITE PAPER BUSINESS RULES AND GAP ANALYSIS Business Situation More

More information

The process of database development. Logical model: relational DBMS. Relation

The process of database development. Logical model: relational DBMS. Relation The process of database development Reality (Universe of Discourse) Relational Databases and SQL Basic Concepts The 3rd normal form Structured Query Language (SQL) Conceptual model (e.g. Entity-Relationship

More information

Oracle 10g PL/SQL Training

Oracle 10g PL/SQL Training Oracle 10g PL/SQL Training Course Number: ORCL PS01 Length: 3 Day(s) Certification Exam This course will help you prepare for the following exams: 1Z0 042 1Z0 043 Course Overview PL/SQL is Oracle's Procedural

More information

A Multi-agent System for Knowledge Management based on the Implicit Culture Framework

A Multi-agent System for Knowledge Management based on the Implicit Culture Framework A Multi-agent System for Knowledge Management based on the Implicit Culture Framework Enrico Blanzieri Paolo Giorgini Fausto Giunchiglia Claudio Zanoni Department of Information and Communication Technology

More information

KNOWLEDGE FACTORING USING NORMALIZATION THEORY

KNOWLEDGE FACTORING USING NORMALIZATION THEORY KNOWLEDGE FACTORING USING NORMALIZATION THEORY J. VANTHIENEN M. SNOECK Katholieke Universiteit Leuven Department of Applied Economic Sciences Dekenstraat 2, 3000 Leuven (Belgium) tel. (+32) 16 28 58 09

More information

Lesson 8: Introduction to Databases E-R Data Modeling

Lesson 8: Introduction to Databases E-R Data Modeling Lesson 8: Introduction to Databases E-R Data Modeling Contents Introduction to Databases Abstraction, Schemas, and Views Data Models Database Management System (DBMS) Components Entity Relationship Data

More information

Mapping VRA Core 4.0 to the CIDOC/CRM ontology

Mapping VRA Core 4.0 to the CIDOC/CRM ontology 1 st Workshop on Digital Information Management March 30-31, 2011 Mapping VRA Core 4.0 to the CIDOC/CRM ontology Panorea Gaitanou, Manolis Gergatsoulis Database and Information Systems Group (DBIS) Laboratory

More information

Functional Dependencies and Normalization

Functional Dependencies and Normalization Functional Dependencies and Normalization 5DV119 Introduction to Database Management Umeå University Department of Computing Science Stephen J. Hegner hegner@cs.umu.se http://www.cs.umu.se/~hegner Functional

More information

Lightweight Data Integration using the WebComposition Data Grid Service

Lightweight Data Integration using the WebComposition Data Grid Service Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed

More information

A Knowledge-Based Approach to Ontologies Data Integration

A Knowledge-Based Approach to Ontologies Data Integration A Knowledge-Based Approach to Ontologies Data Integration Tech Report kmi-04-17 Maria Vargas-Vera and Enrico Motta A Knowledge-Based Approach to Ontologies Data Integration Maria Vargas-Vera and Enrico

More information