Efficient Storage and Temporal Query Evaluation of Hierarchical Data Archiving Systems


1 Efficient Storage and Temporal Query Evaluation of Hierarchical Data Archiving Systems
Hui (Wendy) Wang, Ruilin Liu, Stevens Institute of Technology, New Jersey, USA
Dimitri Theodoratos, Xiaoying Wu, New Jersey Institute of Technology, New Jersey, USA

2 Scientific Data in XML
Extensible Markup Language (XML) is a markup language.
XML is used for scientific data in two ways: to describe data structure and give simple processing instructions (e.g., XSIL and XDMF), and to provide a common data format that is an open, universal standard (e.g., CML, MathML, GML).
Example: the Swiss-Prot dataset.

3 Updates on Scientific Data in XML
Scientific databases are continuously updated, and all versions must be maintained in the archive.
Problem: as increasing volumes of these data accumulate, the archives can reach a critical mass.

4 An Example of Multiple XML Dataset Versions
[Figure: four consecutive instances of an extract of the Swiss-Prot dataset.]

5 Challenges
(1) How to store successive versions of XML databases in an archiving database in a cost-effective way.
(2) How to evaluate queries efficiently on the archiving database.

6 Our Contributions
A novel compact and updateable storage scheme for XML archiving databases.
A simple yet expressive query language for XML archiving databases.
Optimization of query evaluation over the compact storage.

7 Outline
Preliminaries; Compact storage; Temporal query language; Query optimization; Experiments; Conclusion

8 Compact Storage Scheme (I)
Merge versions: store multiple occurrences of the same node only once in the archiving database A.
Each node p in A is associated with a timestamp set, which contains the timestamps of the instances in which p occurs.
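
The merge step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each version is given as a mapping from node ID to parent ID (the ID-based tree model), and the function name `merge_versions` and the dictionary layout are hypothetical. The two versions are the running example as read from the slides.

```python
from collections import defaultdict

def merge_versions(versions):
    """Merge versions into one archive (illustrative layout, an assumption).

    versions: dict mapping timestamp -> {node_id: parent_id or None}.
    Returns (parent, ts): parent maps each archived node to its parent,
    ts maps each archived node to the set of timestamps it occurs in.
    """
    parent = {}
    ts = defaultdict(set)
    for t, tree in sorted(versions.items()):
        for node, par in tree.items():
            parent.setdefault(node, par)  # each node is stored only once
            ts[node].add(t)               # record that the node exists at t
    return parent, dict(ts)

# Versions 1 and 2 of the running example (node IDs as on the slides).
v1 = {0: None, 1: 0, 2: 1, 3: 1, 4: 1, 5: 3, 6: 5}
v2 = {0: None, 1: 0, 2: 1, 3: 1, 4: 1, 5: 3, 7: 1, 8: 7}
parent, ts = merge_versions({1: v1, 2: v2})
```

Nodes present in both versions end up with the timestamp set {1, 2}, while Descr 6 keeps {1} and Ref 7 and Author 8 keep {2}.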

9 An Example of Merging
[Figure: Version 1, Version 2, and the merged archive of the Swiss-Prot extract. In the merged archive, the nodes present in both versions (root 0, Entry 1, Species 2, Features 3, Descr 4, Domain 5) carry the timestamp set [1-2]; Descr 6 carries [1]; Ref 7 and Author 8 carry [2].]
Monotone property on the timestamps of the nodes on the same path: a node can only occur in versions in which its parent occurs, so its timestamp set is contained in that of its parent.

10 Compact Storage Scheme (II)
Timestamp set compaction: the timestamp label of a parent node preserves only the timestamps that do not appear on any of its children:
$L_p = ts(p) \setminus \bigcup_{c \in child(p)} ts(c)$
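
The compaction formula translates directly into set arithmetic. The sketch below is an illustration under stated assumptions: the archive is represented by a hypothetical `parent` map and a `ts` map of full timestamp sets, filled in here with the merged running example as read from the slides.

```python
def compact_labels(parent, ts):
    """Return L_p = ts(p) minus the union of ts(c) over the children c of p."""
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)
    labels = {}
    for node, stamps in ts.items():
        covered = set()
        for c in children.get(node, []):
            covered |= ts[c]              # timestamps already kept further down
        labels[node] = stamps - covered
    return labels

# Merged archive of versions 1 and 2 (structure assumed from the slides).
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 1, 5: 3, 6: 5, 7: 1, 8: 7}
ts = {0: {1, 2}, 1: {1, 2}, 2: {1, 2}, 3: {1, 2}, 4: {1, 2},
      5: {1, 2}, 6: {1}, 7: {2}, 8: {2}}
labels = compact_labels(parent, ts)
```

In this example Domain 5 keeps only timestamp 2 (timestamp 1 is kept by its child Descr 6), and root 0, Entry 1, Features 3, and Ref 7 end up with empty labels.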

11 An Example of Timestamp Set Compaction
[Figure: Version 1, Version 2, and the merged archive with its timestamp sets being compacted.]

12 An Example of Compact Storage
[Figure: Version 1, Version 2, and the compacted archive. After compaction only five nodes keep a non-empty timestamp label: Species 2 [1-2], Descr 4 [1-2], Domain 5 [2], Author 8 [2], and Descr 6 [1].]

13 Efficient Updates
Incremental timestamp label computation: when inserting a new instance $D_k$ at timestamp k,
(1) add the newly inserted trees $T_1, \ldots, T_m$ in $D_k$ to the archive;
(2) for the leaf nodes in $D_k$, update the timestamp labels of their corresponding archive nodes by adding timestamp k.
[Figure: Version 3 (root 0, Entry 1, Features 3, Domain 5, and the new node Descr 9) and the archiving dataset before the update.]

14 Efficient Updates
Incremental timestamp label computation: when inserting a new instance $D_k$ at timestamp k,
(1) add the newly inserted trees $T_1, \ldots, T_m$ in $D_k$ to the archive;
(2) update the archive nodes of the leaf nodes in $D_k$ by adding the timestamp k.
[Figure: Version 3 and the archiving dataset after the update. Domain 5 now carries [2-3] and the new node Descr 9 carries [3]; the other labels are unchanged.]
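
The two update steps can be sketched as follows. This is an illustrative reading of the slide, not the authors' code: the function name `insert_version`, the `parent`/`labels` dictionaries, and the shape of Version 3 (Descr 9 as a child of Entry 1) are assumptions based on the figures.

```python
def insert_version(parent, labels, new_tree, k):
    """Insert instance D_k (new_tree: node_id -> parent_id) at timestamp k.

    Only the leaf nodes of D_k have their timestamp label touched.
    Returns the list of nodes whose label was updated.
    """
    internal = {par for par in new_tree.values() if par is not None}
    updated = []
    for node, par in new_tree.items():
        if node not in parent:        # node of a newly inserted tree
            parent[node] = par
            labels[node] = set()
        if node not in internal:      # leaf of D_k: add timestamp k
            labels[node].add(k)
            updated.append(node)
    return updated

# Compacted archive of versions 1 and 2 (as read from the slides).
parent = {0: None, 1: 0, 2: 1, 3: 1, 4: 1, 5: 3, 6: 5, 7: 1, 8: 7}
labels = {0: set(), 1: set(), 2: {1, 2}, 3: set(), 4: {1, 2},
          5: {2}, 6: {1}, 7: set(), 8: {2}}
v3 = {0: None, 1: 0, 3: 1, 5: 3, 9: 1}   # Version 3, structure assumed
updated = insert_version(parent, labels, v3, 3)
```

Only two labels are touched here, those of Domain 5 and Descr 9. The labels of internal nodes of Version 3 need no change because timestamp 3 is already kept by one of their children.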

15 Comparison with Related Work
Compact storage of [1] (top-down approach, TD): remove the timestamps at a child node if they are the same as those of its parent node.
Our solution (bottom-up approach, BU) is compared with TD with respect to the number of timestamp labels updated when a new instance $D_k$ is inserted into the archiving database A.
TD: the number of nodes in $D_k$ whose corresponding nodes in A have timestamp labels, plus the number of new nodes in $D_k$.
BU: the number of leaf nodes in $D_k$.

16 An Example of TD vs. BU
[Figure: the archiving dataset before the update under the TD approach and under the BU approach.]

17 An Example of TD vs. BU
[Figure: Version 5 (root 0, Entry 1, Features 3, Domain 5, Descr 6) and the archiving dataset after the update under each approach. Under TD four timestamp labels change ([1-4] to [1-5], [2-4] to [2-5], [3-4] to [3-5], and [4] to [4-5]); under BU only the label of the leaf Descr 6 changes ([4] to [4-5]).]


19 Temporal Query Language
Query types: snapshot, history, trace.
Temporal constraints: includes(t), overlaps($t_a$, $t_b$), before(t)/after(t), contains($t_a$, $t_b$)/is_contained($t_a$, $t_b$), meets($t_a$, $t_b$).
Temporal queries: XML structural queries plus temporal constraints on query nodes.
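
The slides name the temporal constraints without spelling out their semantics. The sketch below shows one plausible reading, stated as an assumption: a node's timestamp set is a Python set of integers, intervals are closed, and each predicate tests that set. meets is left out because its meaning cannot be read off the slides.

```python
# Assumed semantics (not confirmed by the slides): ts is the full
# timestamp set of an archive node, and [ta, tb] is a closed interval.

def includes(ts, t):
    """The node exists at timestamp t."""
    return t in ts

def overlaps(ts, ta, tb):
    """The node exists at some timestamp in [ta, tb]."""
    return any(ta <= s <= tb for s in ts)

def contains(ts, ta, tb):
    """The node exists at every timestamp in [ta, tb]."""
    return all(s in ts for s in range(ta, tb + 1))

def is_contained(ts, ta, tb):
    """Every timestamp of the node lies in [ta, tb]."""
    return all(ta <= s <= tb for s in ts)

def before(ts, t):
    """Every timestamp of the node is earlier than t."""
    return all(s < t for s in ts)

def after(ts, t):
    """Every timestamp of the node is later than t."""
    return all(s > t for s in ts)

ts_example = {1, 2}
checks = (includes(ts_example, 2), overlaps(ts_example, 1, 3),
          contains(ts_example, 1, 2), contains(ts_example, 1, 4))
```

Under this reading a node with timestamps {1, 2} satisfies includes(2), overlaps(1, 3), and contains(1, 2), but not contains(1, 4).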

20 Evaluation of Temporal Constraints over Compressed Timestamps
[Figure: the archiving database with compact timestamp labels; a query with the temporal constraints includes(2) on Entry, overlaps(1,3) on Domain, and contains(1,2) on Descr*; and the answer, Descr 4 (100KDA protein).]

21 Temporal Evaluation Annotations
DC (Descendant Check), LC (Local Check), NC (No Check).
[Figure: the archive and the example query, with query nodes annotated DC or LC.]
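
Why a descendant check is recursive follows from the compaction formula: a node's full timestamp set is its own label plus the timestamp sets of its children. The sketch below illustrates this on the compacted running example; the function name `full_timestamps` and the `children`/`labels` dictionaries are hypothetical, and the archive shape is assumed from the figures.

```python
def full_timestamps(node, children, labels):
    """Descendant check: rebuild ts(p) from the label of p and of all
    of its descendants."""
    stamps = set(labels.get(node, ()))
    for c in children.get(node, ()):
        stamps |= full_timestamps(c, children, labels)
    return stamps

# Compacted archive of versions 1 and 2 (as read from the slides).
children = {0: [1], 1: [2, 3, 4, 7], 3: [5], 5: [6], 7: [8]}
labels = {0: set(), 1: set(), 2: {1, 2}, 3: set(), 4: {1, 2},
          5: {2}, 6: {1}, 7: set(), 8: {2}}

entry_ts = full_timestamps(1, children, labels)   # DC on Entry 1
local_hit = 2 in labels[1]                        # LC on Entry 1 alone
```

A local check on Entry 1 cannot confirm timestamp 2, because its label is empty, whereas the descendant check recovers {1, 2}. For an archive node without children the label equals the full timestamp set, so a local check is enough there.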

22 Cost Model
Cost model of temporal evaluation annotations, where $T_q$ is the number of nodes of the tree rooted at q in a query.
[Figure: the archive and the example query (Entry includes(2), Domain overlaps(1,3), Descr* contains(1,2)), with query nodes annotated NC(0), DC(2), and LC(1).]


24 Optimization Problem
DC is expensive (recursive check).
DC can be replaced by LC or NC in some cases.
Goal: replace as many DCs as possible with LCs or NCs.

25 An Example of Optimization
[Figure: query Q, with Entry includes(2) and Descr* contains(1,4) both annotated DC, and the database schema. After optimization, Entry is annotated NC and Descr* is annotated LC.]

26 Inference Rules
We use inference rules to find redundant temporal annotations.
An inference rule has the form $P_1, \ldots, P_k \vdash R$: if the premises $P_1, \ldots, P_k$ are true, then the conclusion R is also true.
Types of inference rules: without a database schema; in the presence of a database schema.

27 Inference Rules: No Database Schema
AD (ancestor-descendant) rule: $Q \models p//q \;\vdash\; q \subseteq_t p$
TR (transitivity) rule: $p \subseteq_t q$ and $q \subseteq_t r \;\vdash\; p \subseteq_t r$
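
The two schema-free rules can be sketched as a small closure computation. The query tree below is hypothetical and only meant to illustrate the rules; a pair (a, b) stands for the inferred relationship that the timestamps of a are contained in those of b.

```python
def ad_rule(query_parent):
    """AD rule on direct query edges: each node is a descendant of its
    query parent, so its timestamps are contained in the parent's."""
    return {(child, par) for child, par in query_parent.items()
            if par is not None}

def transitive_closure(relations):
    """TR rule, applied until no new pair can be derived."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Hypothetical query tree: node -> parent (None for the root).
query_parent = {"root": None, "Entry": "root",
                "Features": "Entry", "Descr": "Features"}
closure = transitive_closure(ad_rule(query_parent))
```

Starting from the three direct edges, the TR rule adds the pairs (Features, root), (Descr, Entry), and (Descr, root).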

28 Inference Rules: With Database Schema
SP (SinglePath) rule: $Q \models p//q$, SinglePath(p, q) $\vdash\; p =_t q$
DC (descendant) rule: $Q \models p//q$, $Q \models p//r$, SinglePath(p, r) $\vdash\; q \subseteq_t r$
DE (derived) rule: $Q \models p//q$, $Q \models p//r$, SinglePath(p, r), SinglePath(q, r) $\vdash\; r \subseteq_t q$

29 Temporal Constraint Graph
[Figure: a query over the nodes root, Entry, Features, Descr*, and Species with the temporal constraints includes(2), overlaps(1,3), contains(1,4), and contains(3,4), and its temporal constraint graph.]
An edge from p to q indicates an inferred $p \subseteq_t q$ relationship.

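30 Temporal Constraint Consumption
We check temporal constraint consumption on the temporal constraint graph. A consuming temporal constraint on p consumes the following temporal constraints on q:
If $p \subseteq_t q$, includes(t) on p consumes: includes(t); overlaps($t_1$, $t_2$) with $t \in [t_1, t_2]$.
If $p \subseteq_t q$, contains($t_1$, $t_2$) on p consumes: includes($t_3$) with $t_3 \in [t_1, t_2]$; contains($t_3$, $t_4$) with $t_3 \ge t_1$ and $t_4 \le t_2$; overlaps($t_3$, $t_4$) with $t_1 < t_3 < t_2$ or $t_1 < t_4 < t_2$.
If $q \subseteq_t p$, is_contained($t_1$, $t_2$) on p consumes: is_contained($t_3$, $t_4$) with $t_3 \le t_1$ and $t_4 \ge t_2$.
If $q \subseteq_t p$, before(t) on p consumes before($t_1$) with $t_1 \ge t$, and after(t) on p consumes after($t_1$) with $t_1 \le t$.
If $q =_t p$, meets($t_1$, $t_2$) on p consumes meets($t_1$, $t_2$).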
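
The consumption rows for an inferred relationship in which the timestamps of p are contained in those of q can be sketched as a lookup function. This is an illustration only: the tuple encoding of constraints and the function name `consumes` are assumptions, and the comparison operators in the contains-to-contains case are a reconstruction from interval containment rather than a verbatim copy of the slide.

```python
def consumes(consuming, consumed):
    """Return True if `consuming` (on p) makes `consumed` (on q) redundant,
    assuming it is already inferred that ts(p) is a subset of ts(q).

    Constraints are tuples such as ("includes", 2) or ("contains", 1, 4).
    """
    kind, *args = consuming
    ckind, *cargs = consumed
    if kind == "includes":
        (t,) = args
        if ckind == "includes":
            return cargs[0] == t
        if ckind == "overlaps":
            t1, t2 = cargs
            return t1 <= t <= t2
    if kind == "contains":
        t1, t2 = args
        if ckind == "includes":
            return t1 <= cargs[0] <= t2
        if ckind == "contains":            # reconstructed operators
            t3, t4 = cargs
            return t3 >= t1 and t4 <= t2
        if ckind == "overlaps":
            t3, t4 = cargs
            return t1 < t3 < t2 or t1 < t4 < t2
    return False                           # no rule applies

entry_consumed = consumes(("contains", 1, 4), ("includes", 2))
```

With contains(1, 4) on Descr and includes(2) on Entry, the call returns True, so the check on Entry can be dropped once Descr is known to satisfy its constraint.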

31 An Example of Query Optimization
Consuming temporal constraint on p (Descr): contains(1, 4). Consumed temporal constraint on q (Entry): includes(2), since Descr $\subseteq_t$ Entry.
[Figure: the query with Entry includes(2), Features overlaps(1,3), Descr* contains(1,4), and Species contains(3,4), all initially annotated DC; after optimization each DC is replaced by LC or NC.]


33 Experiment Setup
Hardware: Intel Core 2 CPU, 2.40 GHz processor, 4.00 GB of RAM.
Software: Windows 7. The algorithms were implemented in Java. JDOM parses the XML databases, the Wutka DTD parser parses the XML DTD, and the Oracle Berkeley DB XML engine evaluates the queries.

34 Datasets
Synthetic dataset: generated with the IBM XML generator on the DTD of the XMark benchmark.
Real dataset: the Treebank dataset.
Table columns: Dataset, Size, # of elements, Max. depth, Avg. depth. Sizes: Treebank 22.3 MB; XMark 14.6 MB.

35 Versions
For the XMark and Treebank datasets, we created 50 and 20 consecutive database instances, respectively.
Each instance was generated from the previous one by first deleting and then inserting (sub)trees.
The inserted and deleted trees account for 10% of the nodes of the database instance.

36 Experiment
Three storage approaches:
The naive approach (NA) keeps the timestamp sets uncompacted.
The top-down (TD) approach [1] eliminates the timestamps of child nodes that are identical to those of the parent.
Our bottom-up (BU) approach eliminates from parent nodes the timestamps that are repeated on their children.

37 Experiment: Archiving
[Figures: archiving time overhead; total number of timestamps (space overhead).]
Compaction ratio table (columns: Dataset, Top-down, Bottom-up; rows: XMark (shallow & fat), XMark (deep & thin)). Values in transcription order: 2.15%, 2.72%, 3.09%, 2.75%.

38 Experiment: Update Cost
Summary of archiving overhead:
Both TD and BU can reduce the number of timestamps in the archive.
The difference between TD and BU in the number of timestamps is not significant.
BU always has a much better update cost than TD.

39 Experiment: Query Optimization
[Figures: temporal constraint evaluation; optimization.]
Our optimization can bring significant performance improvement with negligible overhead.


41 Conclusion
We proposed an efficient XML archiving database system that consists of:
a novel compact and updateable storage scheme;
a simple yet expressive query language for XML archiving databases;
optimization of query evaluation over the compact storage.

42 Future Work
Consider additional temporal constraints.
Consider unrestricted database schemas that may contain cycles.
Design efficient optimization algorithms that can work in this broader framework.

43 Thank you!

44 References
[1] P. Buneman, S. Khanna, K. Tajima, and W.-C. Tan. Archiving scientific data. ACM Transactions on Database Systems.
[2] A. P. Chapman, H. Jagadish, and P. Ramanan. Efficient provenance storage. In SIGMOD.
[3] S. Chawathe and H. Garcia-Molina. Meaningful change detection in structured data. In SIGMOD.
[4] P. T. Jayant and J. R. Haritsa. XGrind: A query-friendly XML compressor. In ICDE.
[5] H. Liefke and D. Suciu. XMill: An efficient compressor for XML data. In SIGMOD.
[6] H. Müller, P. Buneman, and I. Koltsidas. XArch: Archiving scientific and reference data. In SIGMOD.
[7] F. Rizzolo and A. A. Vaisman. Temporal XML: Modeling, indexing, and query processing. The VLDB Journal, 17, August.
[8] F. Wang and C. Zaniolo. Temporal queries in XML document archives and web warehouses. In TIME-ICTL.
[9] F. Wang and C. Zaniolo. Temporal queries and version management in XML-based document archives. Data Knowl. Eng., 65, May.


46 XML Database
XML database: tree-structured, ID-based.

47 Archiving XML Database
XML instance: the XML database at a certain time point. Each instance is associated with a timestamp (version number).
Archiving database: multiple instances are merged into one database.

48 Updating XML Database
Update operations: insertion and deletion.
A sequence of update operations can be modeled as a set of deletions followed by a set of insertions.
[Figure: two instances of the Swiss-Prot extract, one without and one with the subtree Ref 7 / Author 8.]

49 A Piece of Related Work
P. Buneman et al. Archiving scientific data. TODS.
Solution: merge instances; remove timestamps if they are the same as those of the parent node (top-down).
Weakness: inefficient updates and inefficient queries.

50 Witness Graph
Definition: given a query Q, a witness graph for Q is a graph $W_Q$ such that (a) the nodes of $W_Q$ correspond to the nodes of Q, and (b) there is an edge from node p to node q in $W_Q$ iff p is a witness node of q in Q.
Example: construct the witness graph from the temporal constraint graph after temporal annotation consumption (next page).


More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART C: Novel Approaches in DW NoSQL and MapReduce Stonebraker on Data Warehouses Star and snowflake schemas are a good idea in the DW world C-Stores

More information

Change Manager 5.0 Installation Guide

Change Manager 5.0 Installation Guide Change Manager 5.0 Installation Guide Copyright 1994-2008 Embarcadero Technologies, Inc. Embarcadero Technologies, Inc. 100 California Street, 12th Floor San Francisco, CA 94111 U.S.A. All rights reserved.

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

GRAPH PATTERN MINING: A SURVEY OF ISSUES AND APPROACHES

GRAPH PATTERN MINING: A SURVEY OF ISSUES AND APPROACHES International Journal of Information Technology and Knowledge Management July-December 2012, Volume 5, No. 2, pp. 401-407 GRAPH PATTERN MINING: A SURVEY OF ISSUES AND APPROACHES B. Bhargavi 1 and K.P.

More information

ArcGIS for Server Performance and Scalability: Testing Methodologies. Andrew Sakowicz, asakowicz@esri.com Frank Pizzi, fpizzi@esri.

ArcGIS for Server Performance and Scalability: Testing Methodologies. Andrew Sakowicz, asakowicz@esri.com Frank Pizzi, fpizzi@esri. ArcGIS for Server Performance and Scalability: Testing Methodologies Andrew Sakowicz, asakowicz@esri.com Frank Pizzi, fpizzi@esri.com Introductions Target audience - GIS, DB, System administrators - Testers

More information

Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)?

Quiz! Database Indexes. Index. Quiz! Disc and main memory. Quiz! How costly is this operation (naive solution)? Database Indexes How costly is this operation (naive solution)? course per weekday hour room TDA356 2 VR Monday 13:15 TDA356 2 VR Thursday 08:00 TDA356 4 HB1 Tuesday 08:00 TDA356 4 HB1 Friday 13:15 TIN090

More information

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0

PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 PERFORMANCE ENHANCEMENTS IN TreeAge Pro 2014 R1.0 15 th January 2014 Al Chrosny Director, Software Engineering TreeAge Software, Inc. achrosny@treeage.com Andrew Munzer Director, Training and Customer

More information

Novel Data Extraction Language for Structured Log Analysis

Novel Data Extraction Language for Structured Log Analysis Novel Data Extraction Language for Structured Log Analysis P.W.D.C. Jayathilake 99X Technology, Sri Lanka. ABSTRACT This paper presents the implementation of a new log data extraction language. Theoretical

More information

CHAPTER 3 PROPOSED SCHEME

CHAPTER 3 PROPOSED SCHEME 79 CHAPTER 3 PROPOSED SCHEME In an interactive environment, there is a need to look at the information sharing amongst various information systems (For E.g. Banking, Military Services and Health care).

More information

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

More information

Image Compression through DCT and Huffman Coding Technique

Image Compression through DCT and Huffman Coding Technique International Journal of Current Engineering and Technology E-ISSN 2277 4106, P-ISSN 2347 5161 2015 INPRESSCO, All Rights Reserved Available at http://inpressco.com/category/ijcet Research Article Rahul

More information

Concurrency Control. Chapter 17. Comp 521 Files and Databases Fall 2010 1

Concurrency Control. Chapter 17. Comp 521 Files and Databases Fall 2010 1 Concurrency Control Chapter 17 Comp 521 Files and Databases Fall 2010 1 Conflict Serializable Schedules Recall conflicts (WR, RW, WW) were the cause of sequential inconsistency Two schedules are conflict

More information

Information Discovery on Electronic Medical Records

Information Discovery on Electronic Medical Records Information Discovery on Electronic Medical Records Vagelis Hristidis, Fernando Farfán, Redmond P. Burke, MD Anthony F. Rossi, MD Jeffrey A. White, FIU FIU Miami Children s Hospital Miami Children s Hospital

More information

The basic data mining algorithms introduced may be enhanced in a number of ways.

The basic data mining algorithms introduced may be enhanced in a number of ways. DATA MINING TECHNOLOGIES AND IMPLEMENTATIONS The basic data mining algorithms introduced may be enhanced in a number of ways. Data mining algorithms have traditionally assumed data is memory resident,

More information

General Purpose Database Summarization

General Purpose Database Summarization Table of Content General Purpose Database Summarization A web service architecture for on-line database summarization Régis Saint-Paul (speaker), Guillaume Raschia, Noureddine Mouaddib LINA - Polytech

More information

Efficient Structure Oriented Storage of XML Documents Using ORDBMS

Efficient Structure Oriented Storage of XML Documents Using ORDBMS Efficient Structure Oriented Storage of XML Documents Using ORDBMS Alexander Kuckelberg 1 and Ralph Krieger 2 1 Chair of Railway Studies and Transport Economics, RWTH Aachen Mies-van-der-Rohe-Str. 1, D-52056

More information

Energy Efficiency in Secure and Dynamic Cloud Storage

Energy Efficiency in Secure and Dynamic Cloud Storage Energy Efficiency in Secure and Dynamic Cloud Storage Adilet Kachkeev Ertem Esiner Alptekin Küpçü Öznur Özkasap Koç University Department of Computer Science and Engineering, İstanbul, Turkey {akachkeev,eesiner,akupcu,oozkasap}@ku.edu.tr

More information

Introduction to XML Applications

Introduction to XML Applications EMC White Paper Introduction to XML Applications Umair Nauman Abstract: This document provides an overview of XML Applications. This is not a comprehensive guide to XML Applications and is intended for

More information

Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation

Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation Unraveling the Duplicate-Elimination Problem in XML-to-SQL Query Translation Rajasekar Krishnamurthy University of Wisconsin sekar@cs.wisc.edu Raghav Kaushik Microsoft Corporation skaushi@microsoft.com

More information

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World COSC 304 Introduction to Systems Introduction Dr. Ramon Lawrence University of British Columbia Okanagan ramon.lawrence@ubc.ca What is a database? A database is a collection of logically related data for

More information

Firewall Design: Consistency, Completeness, Compactness

Firewall Design: Consistency, Completeness, Compactness Firewall Design: Consistency, Completeness, Compactness Alex X. Liu alex@cs.utexas.edu Department of Computer Sciences The University of Texas at Austin Austin, Texas 78712-1188, U.S.A. March, 2004 Co-author:

More information

Binary Image Scanning Algorithm for Cane Segmentation

Binary Image Scanning Algorithm for Cane Segmentation Binary Image Scanning Algorithm for Cane Segmentation Ricardo D. C. Marin Department of Computer Science University Of Canterbury Canterbury, Christchurch ricardo.castanedamarin@pg.canterbury.ac.nz Tom

More information

Process Mining by Measuring Process Block Similarity

Process Mining by Measuring Process Block Similarity Process Mining by Measuring Process Block Similarity Joonsoo Bae, James Caverlee 2, Ling Liu 2, Bill Rouse 2, Hua Yan 2 Dept of Industrial & Sys Eng, Chonbuk National Univ, South Korea jsbae@chonbukackr

More information

Fast Contextual Preference Scoring of Database Tuples

Fast Contextual Preference Scoring of Database Tuples Fast Contextual Preference Scoring of Database Tuples Kostas Stefanidis Department of Computer Science, University of Ioannina, Greece Joint work with Evaggelia Pitoura http://dmod.cs.uoi.gr 2 Motivation

More information

Supporting Database Provenance under Schema Evolution

Supporting Database Provenance under Schema Evolution Supporting Database Provenance under Schema Evolution Shi Gao and Carlo Zaniolo University of California, Los Angeles {gaoshi, zaniolo}@cs.ucla.edu Abstract. Database schema upgrades are common in modern

More information

Improving Query Performance Using Materialized XML Views: A Learning-Based Approach

Improving Query Performance Using Materialized XML Views: A Learning-Based Approach Improving Query Performance Using Materialized XML Views: A Learning-Based Approach Ashish Shah and Rada Chirkova Department of Computer Science North Carolina State University Campus Box 7535, Raleigh

More information

The Planets Preservation Planning workflow and the planning tool Plato

The Planets Preservation Planning workflow and the planning tool Plato The Planets Preservation Planning workflow and the planning tool Plato Hannes Kulovits Vienna University of Technology http://www.ifs.tuwien.ac.at/~kulovits Outline Preservation Planning Evaluation of

More information

Supporting Ontology-based Keyword Search over Medical Databases

Supporting Ontology-based Keyword Search over Medical Databases Supporting Ontology-based Keyword Search over Medical Databases Anastasios Kementsietsidis, Ph.D. Lipyeow Lim, Ph.D. Min Wang, Ph.D. IBM T.J. Watson Research Center, Skyline Drive, Hawthorne, NY, USA.

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

Data Management in RFID Applications

Data Management in RFID Applications Data Management in RFID Applications Dan Lin 1, Hicham G. Elmongui 1,, Elisa Bertino 1, and Beng Chin Ooi 2 1 Department of Computer Science, Purdue University, USA {lindan, elmongui, bertino}@cs.purdue.edu

More information

Dependency Free Distributed Database Caching for Web Applications and Web Services

Dependency Free Distributed Database Caching for Web Applications and Web Services Dependency Free Distributed Database Caching for Web Applications and Web Services Hemant Kumar Mehta School of Computer Science and IT, Devi Ahilya University Indore, India Priyesh Kanungo Patel College

More information

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server)

How To Test For Performance And Scalability On A Server With A Multi-Core Computer (For A Large Server) Scalability Results Select the right hardware configuration for your organization to optimize performance Table of Contents Introduction... 1 Scalability... 2 Definition... 2 CPU and Memory Usage... 2

More information

Three Effective Top-Down Clustering Algorithms for Location Database Systems

Three Effective Top-Down Clustering Algorithms for Location Database Systems Three Effective Top-Down Clustering Algorithms for Location Database Systems Kwang-Jo Lee and Sung-Bong Yang Department of Computer Science, Yonsei University, Seoul, Republic of Korea {kjlee5435, yang}@cs.yonsei.ac.kr

More information

A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1

A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 Yannis Stavrakas Vassilis Plachouras IMIS / RC ATHENA Athens, Greece {yannis, vplachouras}@imis.athena-innovation.gr Abstract.

More information

On Mining Group Patterns of Mobile Users

On Mining Group Patterns of Mobile Users On Mining Group Patterns of Mobile Users Yida Wang 1, Ee-Peng Lim 1, and San-Yih Hwang 2 1 Centre for Advanced Information Systems, School of Computer Engineering Nanyang Technological University, Singapore

More information

Constraint Preserving XML Storage in Relations

Constraint Preserving XML Storage in Relations Constraint Preserving XML Storage in Relations Yi Chen, Susan B. Davidson and Yifeng Zheng Ôغ Ó ÓÑÔÙØ Ö Ò ÁÒ ÓÖÑ Ø ÓÒ Ë Ò ÍÒ Ú Ö ØÝ Ó È ÒÒ ÝÐÚ Ò yicn@saul.cis.upenn.edu susan@cis.upenn.edu yifeng@saul.cis.upenn.edu

More information

An Efficient and Scalable Management of Ontology

An Efficient and Scalable Management of Ontology An Efficient and Scalable Management of Ontology Myung-Jae Park 1, Jihyun Lee 1, Chun-Hee Lee 1, Jiexi Lin 1, Olivier Serres 2, and Chin-Wan Chung 1 1 Korea Advanced Institute of Science and Technology,

More information

Persistent Binary Search Trees

Persistent Binary Search Trees Persistent Binary Search Trees Datastructures, UvA. May 30, 2008 0440949, Andreas van Cranenburgh Abstract A persistent binary tree allows access to all previous versions of the tree. This paper presents

More information

Mining Large Datasets: Case of Mining Graph Data in the Cloud

Mining Large Datasets: Case of Mining Graph Data in the Cloud Mining Large Datasets: Case of Mining Graph Data in the Cloud Sabeur Aridhi PhD in Computer Science with Laurent d Orazio, Mondher Maddouri and Engelbert Mephu Nguifo 16/05/2014 Sabeur Aridhi Mining Large

More information

Protein Protein Interactions (PPI) APID (Agile Protein Interaction DataAnalyzer)

Protein Protein Interactions (PPI) APID (Agile Protein Interaction DataAnalyzer) APID (Agile Protein Interaction DataAnalyzer) 23 APID (Agile Protein Interaction DataAnalyzer) Integrates and unifies 7 DBs: BIND, DIP, HPRD, IntAct, MINT, BioGRID. Includes 51,873 proteins 241,204 interactions

More information

Merkle Hash Tree based Techniques for Data Integrity of Outsourced Data

Merkle Hash Tree based Techniques for Data Integrity of Outsourced Data Merkle Hash Tree based Techniques for Data Integrity of Outsourced Data ABSTRACT Muhammad Saqib Niaz Dept. of Computer Science Otto von Guericke University Magdeburg, Germany saqib@iti.cs.uni-magdeburg.de

More information

Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees

Learning Outcomes. COMP202 Complexity of Algorithms. Binary Search Trees and Other Search Trees Learning Outcomes COMP202 Complexity of Algorithms Binary Search Trees and Other Search Trees [See relevant sections in chapters 2 and 3 in Goodrich and Tamassia.] At the conclusion of this set of lecture

More information

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices

Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Proc. of Int. Conf. on Advances in Computer Science, AETACS Efficient Iceberg Query Evaluation for Structured Data using Bitmap Indices Ms.Archana G.Narawade a, Mrs.Vaishali Kolhe b a PG student, D.Y.Patil

More information

XML Fragment Caching for Small Mobile Internet Devices

XML Fragment Caching for Small Mobile Internet Devices XML Fragment Caching for Small Mobile Internet Devices Stefan Böttcher, Adelhard Türling University of Paderborn Fachbereich 17 (Mathematik-Informatik) Fürstenallee 11, D-33102 Paderborn, Germany email

More information