Knowledge-Based Persistent Archives

SDSC TR
Knowledge-Based Persistent Archives
Reagan W. Moore
San Diego Supercomputer Center

Sponsored by NATIONAL ARCHIVES AND RECORDS ADMINISTRATION and ADVANCED RESEARCH PROJECTS AGENCY ITO
INTELLIGENT METACOMPUTING TESTBED
ARPA Order D570
Issued by ESC/ENS under contract F C-0020
January 18, 2001

San Diego Supercomputer Center TECHNICAL REPORT
Copyright 2001, The Regents of the University of California

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Advanced Research Projects Agency or the U.S. Government.

Knowledge-based Persistent Archives
Reagan W. Moore
San Diego Supercomputer Center
La Jolla, CA

Abstract

The preservation of digital information for long periods of time is becoming feasible through the integration of archival storage technology from supercomputer centers, information models from the digital library community, and preservation models from the archivist's community. The supercomputer centers provide the technology needed to store the immense amounts of digital data that are being created, while the digital library community provides the mechanisms to define the context needed to interpret the data. The coordination of these technologies with preservation and management policies defines the infrastructure for a collection-based persistent archive [1]. This report discusses the use of knowledge representations to augment collection-based persistent archives.

1. Introduction

Supercomputer centers, digital libraries, and archival storage communities have common persistent archival storage requirements. Each of these communities is building software infrastructure to organize and store large collections of data. An emerging common requirement is the ability to maintain data collections for long periods of time. The challenge is to maintain the ability to discover, access, and display digital objects that are stored within the archive, while the technology used to manage the archive evolves.

We originally implemented a collection-based persistent archive [1] in which a description of the collection is stored along with the data. The approach focused on the development of infrastructure-independent representations for the information content of the collection, interoperability mechanisms to support migration of the collection onto new software and hardware systems, and use of a standard tagging language to annotate the information content. The process used to ingest a collection, transform it into an infrastructure-independent form, and recreate the collection on new technology is shown schematically in Figure 1.
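As a rough illustration of what an infrastructure-independent tagged form might look like, the sketch below wraps the original bytes of a digital object together with its collection attributes in a single XML element, and then recovers both. The element names (`digital_object`, `attributes`, `content`), the base64 encoding, and the function names are assumptions made for this example; the report does not prescribe a particular schema.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_object(object_id, payload, attributes):
    """Wrap raw bytes and collection attributes in one tagged XML element.

    The element names used here are hypothetical, chosen for illustration.
    """
    root = ET.Element("digital_object", {"id": object_id})
    attrs = ET.SubElement(root, "attributes")
    for name, value in sorted(attributes.items()):
        ET.SubElement(attrs, "attribute", {"name": name}).text = value
    # The original bytes are not altered; base64 only makes them XML-safe.
    content = ET.SubElement(root, "content", {"encoding": "base64"})
    content.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unwrap_object(xml_text):
    """Recover the identifier, original bytes, and attributes from the XML."""
    root = ET.fromstring(xml_text)
    attributes = {a.get("name"): a.text for a in root.find("attributes")}
    payload = base64.b64decode(root.find("content").text)
    return root.get("id"), payload, attributes
```

Because the wrapped form is plain text with self-describing tags, it can in principle be re-read by whatever XML tooling exists on a future platform, which is the property the archive depends on.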

Figure 1. Persistent Collection Process

Two phases are emphasized: the archiving of the collection, and the retrieval or instantiation of the collection onto new technology. The diagram shows the multiple steps that are necessary to preserve digital objects through time. The steps form a cycle that can be used for migrating data collections onto new infrastructure as technology evolves. The technology changes can occur at the system level, where archive, file, compute, and database software evolves, or at the information model level, where formats, programming languages, and practices change.

The ultimate goal is to maintain not only the bits associated with the original data, but also the context that permits the data to be interpreted. We rely on the use of collections to define the context to associate with digital data. Each digital object is maintained as a tagged structure that includes the original bytes of data, as well as attributes that have been defined as relevant for the data collection. A collection-based persistent archive is therefore one in which the organization of the collection is archived simultaneously with the digital objects that comprise the collection.

A persistent collection requires the ability to dynamically recreate the collection on new technology. Scalable archival storage systems are used to ensure that sufficient resources are available for continual migration of digital objects to new media. The software systems that interpret the infrastructure-independent representation for the collections are based upon generic digital library systems, and are migrated explicitly to new platforms. In this system, the original representation of the digital objects and of the collections does not change. The maintenance of the persistent archive is then achieved through application of archivist policies that govern the rate of migration of the objects and the collection instantiation software.

2. Knowledge-based Archives

The preservation of the context to associate with digital objects is the dominant issue for knowledge-based persistent archives. The context is traditionally defined through specification of attributes that are associated with each digital object. The context is also defined through the implied relationships that exist between the attributes, and the preferred organization of the attributes in user interfaces for viewing the data collection.

Management of the collection context is made difficult by the rapid change of technology. Software systems used to manage collections are changing on a five- to ten-year time scale. Of greater concern is that the information tagging languages used to annotate digital objects are also changing. The persistent archiving of a collection must also handle the evolution of the information mark-up language.

We have characterized persistent archives in prior publications [1,2] as collection-based repositories. We now recognize the need to broaden the archive characterization to knowledge-based repositories. Not only the information content, but also the processing steps used to accession the collection must be preserved. Conceptually, one can view the accessioning process as the equivalent of the process needed to instantiate the collection on new technology. If the accessioning process can be captured in an infrastructure-independent representation, the same process can be used to manage the migration of the collection to new markup languages, archival data repositories, information repositories, and knowledge repositories.

The archival description of a collection then must include not only contextual information about the digital objects, but also knowledge about the relationships used to derive the contextual information. The architecture that is needed to implement a knowledge-based persistent archive is shown in Figure 2.

Figure 2. Knowledge-based Persistent Archive (columns: Ingest, Manage, Access; labels beneath the grid, as extracted: Process, Infrastructure, Process)

  Knowledge
    Ingest: Relationships between Concepts
    Manage: Knowledge Repository for Rules
    Access: Knowledge or Topic-Based Query
  Information
    Ingest: Attributes, Semantics
    Manage: Information Repository
    Access: Attribute-based Query
  Data
    Ingest: Fields, Containers, Folders
    Manage: Storage (Replicas, Persistent IDs)
    Access: Feature-based Query

The three columns represent the technologies needed to manage the ingestion process, manage the persistent archive, and manage the access environment. The three rows represent the infrastructure needed to manage knowledge, information, and data. Knowledge is represented as relationships between domain concepts. Information is represented as attributes about digital objects within the collection. The digital objects are images of the reality described by the domain concepts.

Ingestion corresponds to the steps of knowledge mining/tagging, information mining/tagging, and digital object organization/storage. Persistent archive management requires infrastructure to store the digital objects (archives), information repositories to hold the metadata (databases), and knowledge repositories to organize the relationships (logic systems). The access environment provides mechanisms to query the collection at the data level through feature extraction, at the information level through database queries, and at the knowledge level through domain concepts. Just as the data management infrastructure is intended to provide access without having to know data object names, the knowledge access infrastructure is intended to provide access without having to know the explicit metadata attribute names used to organize the collection database.
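The idea of knowledge-level access, in which a user queries by domain concept without knowing the attribute names of the collection schema, can be sketched with a small concept-to-attribute mapping. This is only a minimal illustration: the mapping contents, the attribute names, and the `query_by_concept` function are hypothetical, and a real knowledge repository would hold far richer relationships than a dictionary.

```python
# Hypothetical mapping from domain concepts to the attribute names that a
# particular collection schema happens to use for them.
CONCEPT_TO_ATTRIBUTES = {
    "author": ["creator", "sponsor_name"],
    "date": ["date_issued"],
}


def query_by_concept(records, concept, value, concept_map):
    """Return records where any attribute mapped to the concept equals value.

    The caller names a concept; the attribute names are looked up internally.
    """
    attribute_names = concept_map.get(concept, [])
    return [
        record
        for record in records
        if any(record.get(name) == value for name in attribute_names)
    ]


records = [
    {"id": 1, "creator": "Smith"},
    {"id": 2, "sponsor_name": "Smith"},
    {"id": 3, "creator": "Jones"},
]
matches = query_by_concept(records, "author", "Smith", CONCEPT_TO_ATTRIBUTES)
```

If the collection is later migrated to a schema with different attribute names, only the mapping needs to change; queries phrased in terms of concepts remain valid.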

The knowledge-based persistent archive requires software infrastructure to support interoperability between different implementations of ingestion, management, and access infrastructure components. This is shown in Figure 3.

Between Ingest platforms and Management repositories, standards are needed to define consistent tagging mechanisms for knowledge (XML Topic Map DTD[3] or XTM DTD), for information (XML DTD[4]), and for data organization (logical folders and physical containers). Between Management repositories and Access platforms, standard query languages are needed for knowledge-based access (Knowledge query language or rule manipulation language), attribute-based access (EMCAT SGL generator or MIX mediator[5]), and feature-based access (application of procedures within a computational grid). Between the knowledge and information environments, a standard representation is needed to map from concepts to attributes, such as topic maps or model-based access systems. Between information and data storage environments, a data handling system is needed to map from attributes to storage locations, such as the SDSC Storage Resource Broker.[6]

Figure 3. Persistent Archive Interfaces (columns: Ingest, Manage, Access, with the interface between adjacent columns and adjacent rows)

  Knowledge
    Ingest: Relationships Between Concepts
    Ingest/Manage interface: XTM DTD
    Manage: Knowledge Repository for Rules
    Manage/Access interface: Rules - KQL
    Access: Knowledge or Topic-Based Query
  Knowledge/Information interface: Topic Maps / Model-based Access
  Information
    Ingest: Attributes, Semantics
    Ingest/Manage interface: XML DTD
    Manage: Information Repository
    Manage/Access interface: EMCAT / MIX
    Access: Attribute-based Query
  Information/Data interface: Data Handling System - Storage Resource Broker
  Data
    Ingest: Fields, Containers, Folders
    Ingest/Manage interface: MCAT / HDF
    Manage: Storage (Replicas, Persistent IDs)
    Manage/Access interface: Grids
    Access: Feature-based Query
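The role of a data handling system, mapping from attributes to storage locations, can be illustrated with a toy catalog that associates each persistent identifier with its attributes and a list of replica locations. This is a sketch under stated assumptions and does not reproduce the Storage Resource Broker interface: the `Catalog` class, its method names, and the location strings are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One digital object: persistent ID, descriptive attributes, replicas."""
    persistent_id: str
    attributes: dict
    replicas: list = field(default_factory=list)


class Catalog:
    """Toy attribute-to-storage mapping (hypothetical, not a real API)."""

    def __init__(self):
        self._entries = {}

    def register(self, persistent_id, attributes, location):
        """Record a replica location; create the entry on first registration."""
        if persistent_id not in self._entries:
            self._entries[persistent_id] = CatalogEntry(
                persistent_id, dict(attributes)
            )
        self._entries[persistent_id].replicas.append(location)

    def locate(self, **wanted):
        """Return {persistent_id: replica locations} for matching attributes."""
        return {
            entry.persistent_id: list(entry.replicas)
            for entry in self._entries.values()
            if all(entry.attributes.get(k) == v for k, v in wanted.items())
        }


catalog = Catalog()
catalog.register("pid-1", {"collection": "demo", "format": "xml"}, "archive-a:/c1/obj1")
catalog.register("pid-1", {"collection": "demo", "format": "xml"}, "archive-b:/c7/obj1")
catalog.register("pid-2", {"collection": "demo", "format": "tiff"}, "archive-a:/c1/obj2")
found = catalog.locate(format="xml")
```

The caller never supplies a file name or storage system; replicas can be added or moved by updating the catalog without changing how objects are requested.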

Persistence is achieved through the infrastructure middleware (shown in Figure 3 as the blue grid) that links accession platforms, management repositories, and access platforms. The same middleware is needed to support grid environments (such as computation on distributed data collections) and digital library environments (such as curricula support in the National Science, Mathematics, Engineering, and Technology Education Digital Library - NSDL). This architecture has been proposed to both the Grid Forum and the NSDL, and may be the architecture that integrates knowledge management activities from these communities with the persistent archive community.

2.1 Archive Accessioning Process:

Of interest is the emerging need for knowledge management as well as information management and data management when ingesting collections. When we look at collections, we see multiple interfaces where knowledge is required to adequately describe relationships inherent within the collection. We have been looking at the preservation of relationships that are needed to describe:

- implied knowledge (interpretation of fields)
- structural knowledge (topology associated with digital line graphs)
- domain knowledge (relationships between domain concepts)
- procedural knowledge (workflow creation steps for digital objects)
- presentation knowledge (support for knowledge-based queries).

One way to accomplish the goal of knowledge-based access is to use the ISO Topic Maps standard to maintain mappings between domain concepts and the attribute names used in the collection schema.

Relationships are implicit between each of the nine infrastructure components defined in Figure 2. The relationships either define rules that can be applied to the collection, or quantify associations that can be made between collection elements. Examples are:

Relationships that quantify rules:
- Rules for defining collection attributes
- Rules for organizing attributes into a schema
- Rules for feature extraction
- Rules governing data set creation

Relationships that quantify associations:
- Organization of concepts into topic maps
- Ontology mapping between concept maps
- Mapping of concepts to collection attributes
- Mapping of concepts to feature extraction rules
- Mapping between attributes and data fields (semantics)
- Semantic mapping between collections
- Mapping between attributes and storage
- Mapping between attributes and features
- Clustering of data into containers

The relationships can be separated into four broad classes:

- Semantic/logical relationships. Relationships can be defined to map from the concepts used to describe the collection to the attribute tags used to annotate the collection. Semantic relationships can also be defined between the domain-specific concepts as knowledge bases or semantic maps.
- Procedural/temporal relationships. The transformations that are applied to the collection to create the archival form constitute a workflow that represents the ingestion process. The temporal order and explicit transformations can be represented as a set of states through which the collection is processed.
- Structural/spatial relationships. The internal organization of digital objects within the collection can be represented as a structural ordering of the tagged elements. The representation of the structure can be expressed using the same types of characterization as needed for spatially tagged data.
- Functional relationships. For scientific applications, analysis algorithms are needed to identify features that might be associated with a digital object. The expression of the relationship between the named feature and its presence within a digital object will require the ability to archive mathematical expressions.

In the ingestion process, a major challenge has been the need to differentiate between artifacts and implied knowledge. Essentially, the steps of refining the description of a collection by including more attributes must be integrated with the identification of anomalies. To make progress, we apply the concepts of occurrence tagging and closure to the archived collections.

Occurrence tagging is the explicit annotation of the location of each tagged attribute along with the associated value. This provides a representation that captures all of the information content, without imposing constraints on permissible attribute values. Closure is the analysis of the occurrences to identify both completeness and consistency. Completeness is evaluated by verifying that all attributes are populated, and that the information content is fully annotated. Consistency checks that all attribute values fall within defined ranges. Consistency can be checked by construction of inverse indexes that point to all occurrences of each attribute value.

It is necessary to iterate between knowledge extraction and attribute mining. We illustrate this through application of the ingestion process shown in Figure 4.
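Occurrence tagging and closure can be sketched in a few lines. In the example below, an occurrence is assumed to be a tuple of (record identifier, position, attribute, value); that layout, the function names, and the use of explicit sets of allowed values to stand in for "defined ranges" are assumptions made for illustration. The completeness check covers only the "all attributes are populated" part of the definition, not whether the content is fully annotated.

```python
from collections import defaultdict


def build_inverse_index(occurrences):
    """Map each (attribute, value) pair to every location where it occurs."""
    index = defaultdict(list)
    for record_id, position, attribute, value in occurrences:
        index[(attribute, value)].append((record_id, position))
    return dict(index)


def check_completeness(occurrences, required_attributes):
    """Return {record_id: missing attributes} for records lacking any."""
    required = set(required_attributes)
    seen = defaultdict(set)
    for record_id, _position, attribute, _value in occurrences:
        seen[record_id].add(attribute)
    return {
        record_id: sorted(required - attrs)
        for record_id, attrs in seen.items()
        if required - attrs
    }


def check_consistency(index, allowed_values):
    """Return inverse-index entries whose value is outside the allowed set."""
    return {
        key: locations
        for key, locations in index.items()
        if key[0] in allowed_values and key[1] not in allowed_values[key[0]]
    }
```

Because the inverse index points back to every location of a suspect value, an archivist can inspect each occurrence and decide whether it is an artifact to correct or implied knowledge that calls for a new attribute.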

- Define a representation of the concepts inherent within the collection.
- Build a concept map that identifies all of the possible attributes to associate with each concept.
- Tag the collection to identify attributes for each of the possible fields.
- Restructure the concept map to eliminate unused fields, specialize classes, rearrange class attributes, etc.
- Mine the collection to identify differences between bill versions, identify missing attributes, identify implicit attributes, and identify invalid data (such as duplicated pages).

Figure 4. Ingestion Process (diagram labels: Accession Template, Closure, Concept/Attribute, Attribute Inverse Indexing, Knowledge Generation, Information Generation, Attribute Selection, Attribute Tagging, Occurrence Tagging, View Management, Data Organization, Collection)

At one time, the hope was to be able to ingest a collection in a single pass. Based upon the above steps, at least three analyses are needed to mine knowledge, mine information, and organize data. Depending upon the number of iterations used to refine the concept space, additional passes through the data may be necessary. It is still debated whether it will be possible to differentiate in general between concept map refinement and error analysis. These steps will have to be done jointly for most collections.
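One refinement pass over a concept map, of the kind described in the restructuring step above, might look like the following sketch. It assumes the concept map is a dictionary from concept names to lists of candidate attribute names, which is a simplification chosen for this example; the function name and the sample data are hypothetical.

```python
def refine_concept_map(concept_map, tagged_attributes):
    """Drop attributes never seen during tagging; report unmapped ones.

    Returns (refined_map, unmapped), where unmapped lists attributes found
    in the collection that no concept claims. Those are candidates for
    implicit attributes, or for errors, and need human review.
    """
    used = set(tagged_attributes)
    refined = {}
    for concept, attributes in concept_map.items():
        kept = [name for name in attributes if name in used]
        if kept:  # a concept with no surviving attributes is removed
            refined[concept] = kept
    known = {name for attributes in concept_map.values() for name in attributes}
    unmapped = sorted(used - known)
    return refined, unmapped


concept_map = {
    "legislation": ["title", "sponsor", "fax_number"],
    "contact": ["fax_number"],
}
refined, unmapped = refine_concept_map(
    concept_map, ["title", "sponsor", "page_count"]
)
```

The `unmapped` list shows why refinement and error analysis are hard to separate: an unexpected attribute may signal a gap in the concept map or invalid data, and the code alone cannot tell which.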

Note that once the data has been wrapped into XML, all integrity checking, knowledge mining, derivation of a "consolidated version", etc., can be seen as (albeit very elaborate) queries against an XML collection. The interesting research issue is to find out how well XML query languages (including the UCSD/SDSC XMAS system) are able to express the analysis queries. Especially for integrity checking, logic-based XML query languages seem to be a good choice for an ingestion environment.

2.2 Archival Representation of Collections:

One of the results of the analysis of the collections provided by NARA was the realization that multiple views of a collection may need to be archived. Typical views include:

- Original form as submitted
- XML tagged form
- Occurrence representation (occurrence, attribute, value)
- Knowledge-based representation (recreation of the original form from the occurrence representation). This view can be thought of as the noise-free representation of the original collection based upon the knowledge and information content that was created during the accessioning process. This view can be designed to include white space and all anomalies if desired.
- Consolidated representation (elimination of all duplicated information)

By archiving descriptions of the processing steps needed to go between each of these views, one can guarantee that the same processing steps could be applied in the future to re-instantiate the collection on new technology, including new information and knowledge representations.

3. Relationships between NARA and other Agency projects:

There is a strong synergy between the development of persistent archive infrastructure for NARA, digital library development for NSF, and data grid development for DOE, NASA, and NLM. All of these research areas require the ability to manage knowledge, information, and data objects. What has become apparent is that even though the requirements driving the infrastructure development for each agency are different, a uniform architecture is emerging that meets all agency requirements. The architecture shown in Figure 3 provides:

- Validation mechanism for the common data management architecture
- Validation mechanism for the differentiation between knowledge, information, and data and the choice of representation standards
- Integration vehicle for tying together persistent archives with grid environments
- Integration vehicle for tying together grid environments with digital libraries
- Integration vehicle for tying together digital libraries with persistent archives

Multiple projects are building upon the architecture that is being developed in the NARA collaboration:

- NSF Digital Library Initiative, Phase 2
- NSF National SMET Education Digital Library
- NSF NPACI data grid for neuroscience brain image federation
- NASA Information Power Grid distributed data processing
- DOE ASCI Data Visualization Corridor remote data processing
- DOE Particle Physics Data Grid object replication
- NLM Digital Embryo Project data grid for image processing and storage
- NARA Persistent Archive

An iterative technology development cycle links all of the projects. An original DARPA project developed the data handling capabilities as part of the Distributed Object Computation Testbed. The NASA IPG integrated the data handling technology with computational grid technology (common security environments). The NSF NPACI project integrated information management with data handling to support digital libraries. The ASCI PPDG then applied the technology to support replica management across heterogeneous systems. And the NARA project applied the technology to manage migration of collections across evolving infrastructure technology.

Acknowledgements:

This research has been sponsored by the National Archives and Records Administration and Advanced Research Projects Agency/ITO, "Intelligent Metacomputing Testbed", ARPA Order No. D570, issued by ESC/ENS under Contract #F C-0020, and by the Data Intensive Computing thrust area of the National Science Foundation project ASC National Partnership for Advanced Computational Infrastructure. The research topics have been investigated by the following members of the Data Intensive Computing Environment Group at the San Diego Supercomputer Center: Richard Marciano, Bertram Ludaescher, Ilya Zaslavsky, Amarnath Gupta, and Chaitan Baru.

References:

[1] Moore, R., C. Baru, A. Rajasekar, B. Ludäscher, R. Marciano, M. Wan, W. Schroeder, and A. Gupta, "Collection-Based Persistent Digital Archives - Part 1," D-Lib Magazine, March 2000.
[2] Moore, R., C. Baru, A. Rajasekar, B. Ludäscher, R. Marciano, M. Wan, W. Schroeder, and A. Gupta, "Collection-Based Persistent Digital Archives - Part 2," D-Lib Magazine, April 2000.
[3] ISO/IEC FCD Topic Maps.
[4] Extensible Markup Language (XML).
[5] Baru, C., V. Chu, A. Gupta, B. Ludäscher, R. Marciano, Y. Papakonstantinou, and P. Velikhov, "XML-Based Information Mediation for Digital Libraries," ACM Conference on Digital Libraries, Berkeley, CA, Exhibition program.
[6] Baru, C., R. Moore, A. Rajasekar, and M. Wan, "The SDSC Storage Resource Broker," Proc. CASCON'98 Conference, Nov. 30-Dec. 3, 1998, Toronto, Canada.


More information

GIS Databases With focused on ArcSDE

GIS Databases With focused on ArcSDE Linköpings universitet / IDA / Div. for human-centered systems GIS Databases With focused on ArcSDE Imad Abugessaisa g-imaab@ida.liu.se 20071004 1 GIS and SDBMS Geographical data is spatial data whose

More information

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com)

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) CSP CHRONOS Compliance statement for ISO 14721:2003 (Open Archival Information System Reference Model) 2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) The international

More information

MULTI AGENT-BASED DISTRIBUTED DATA MINING

MULTI AGENT-BASED DISTRIBUTED DATA MINING MULTI AGENT-BASED DISTRIBUTED DATA MINING REECHA B. PRAJAPATI 1, SUMITRA MENARIA 2 Department of Computer Science and Engineering, Parul Institute of Technology, Gujarat Technology University Abstract:

More information

The Southern California Earthquake Center Information Technology Research Initiative

The Southern California Earthquake Center Information Technology Research Initiative The Southern California Earthquake Center Information Technology Research Initiative Toward a Collaboratory for System-Level Earthquake Science Tom Jordan USC Kim Olsen - UCSB 4th Meeting of the US-Japan

More information

Secure Semantic Web Service Using SAML

Secure Semantic Web Service Using SAML Secure Semantic Web Service Using SAML JOO-YOUNG LEE and KI-YOUNG MOON Information Security Department Electronics and Telecommunications Research Institute 161 Gajeong-dong, Yuseong-gu, Daejeon KOREA

More information

An Oracle White Paper October 2013. Oracle Data Integrator 12c New Features Overview

An Oracle White Paper October 2013. Oracle Data Integrator 12c New Features Overview An Oracle White Paper October 2013 Oracle Data Integrator 12c Disclaimer This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should

More information

The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets

The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets Ann Chervenak Ian Foster $+ Carl Kesselman Charles Salisbury $ Steven Tuecke $ Information

More information

Open DMIX - Data Integration and Exploration Services for Data Grids, Data Web and Knowledge Grid Applications

Open DMIX - Data Integration and Exploration Services for Data Grids, Data Web and Knowledge Grid Applications Open DMIX - Data Integration and Exploration Services for Data Grids, Data Web and Knowledge Grid Applications Robert L. Grossman, Yunhong Gu, Dave Hanley, Xinwei Hong and Gokulnath Rao Laboratory for

More information

GEOG 482/582 : GIS Data Management. Lesson 10: Enterprise GIS Data Management Strategies GEOG 482/582 / My Course / University of Washington

GEOG 482/582 : GIS Data Management. Lesson 10: Enterprise GIS Data Management Strategies GEOG 482/582 / My Course / University of Washington GEOG 482/582 : GIS Data Management Lesson 10: Enterprise GIS Data Management Strategies Overview Learning Objective Questions: 1. What are challenges for multi-user database environments? 2. What is Enterprise

More information

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD

DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure. Arcot (RAJA) Rajasekar DICE/SDSC/UCSD DataGrids 2.0 irods - A Second Generation Data Cyberinfrastructure Arcot (RAJA) Rajasekar DICE/SDSC/UCSD What is SRB? First Generation Data Grid middleware developed at the San Diego Supercomputer Center

More information

JOURNAL OF OBJECT TECHNOLOGY

JOURNAL OF OBJECT TECHNOLOGY JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2008 Vol. 7, No. 8, November-December 2008 What s Your Information Agenda? Mahesh H. Dodani,

More information

The Service Availability Forum Specification for High Availability Middleware

The Service Availability Forum Specification for High Availability Middleware The Availability Forum Specification for High Availability Middleware Timo Jokiaho, Fred Herrmann, Dave Penkler, Manfred Reitenspiess, Louise Moser Availability Forum Timo.Jokiaho@nokia.com, Frederic.Herrmann@sun.com,

More information

Information Services for Smart Grids

Information Services for Smart Grids Smart Grid and Renewable Energy, 2009, 8 12 Published Online September 2009 (http://www.scirp.org/journal/sgre/). ABSTRACT Interconnected and integrated electrical power systems, by their very dynamic

More information

Infosys GRADIENT. Enabling Enterprise Data Virtualization. Keywords. Grid, Enterprise Data Integration, EII Introduction

Infosys GRADIENT. Enabling Enterprise Data Virtualization. Keywords. Grid, Enterprise Data Integration, EII Introduction Infosys GRADIENT Enabling Enterprise Data Virtualization Keywords Grid, Enterprise Data Integration, EII Introduction A new generation of business applications is emerging to support customer service,

More information

Semantic Exploration of Archived Product Lifecycle Metadata under Schema and Instance Evolution

Semantic Exploration of Archived Product Lifecycle Metadata under Schema and Instance Evolution Semantic Exploration of Archived Lifecycle Metadata under Schema and Instance Evolution Jörg Brunsmann Faculty of Mathematics and Computer Science, University of Hagen, D-58097 Hagen, Germany joerg.brunsmann@fernuni-hagen.de

More information

Report on the Dagstuhl Seminar Data Quality on the Web

Report on the Dagstuhl Seminar Data Quality on the Web Report on the Dagstuhl Seminar Data Quality on the Web Michael Gertz M. Tamer Özsu Gunter Saake Kai-Uwe Sattler U of California at Davis, U.S.A. U of Waterloo, Canada U of Magdeburg, Germany TU Ilmenau,

More information

Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department

Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department Luc Declerck AUL, Technology Services Declan Fleming Director, Information Technology Department What is cyberinfrastructure? Outline Examples of cyberinfrastructure t Why is this relevant to Libraries?

More information

Metadata Hierarchy in Integrated Geoscientific Database for Regional Mineral Prospecting

Metadata Hierarchy in Integrated Geoscientific Database for Regional Mineral Prospecting Metadata Hierarchy in Integrated Geoscientific Database for Regional Mineral Prospecting MA Xiaogang WANG Xinqing WU Chonglong JU Feng ABSTRACT: One of the core developments in geomathematics in now days

More information

Theme 6: Enterprise Knowledge Management Using Knowledge Orchestration Agency

Theme 6: Enterprise Knowledge Management Using Knowledge Orchestration Agency Theme 6: Enterprise Knowledge Management Using Knowledge Orchestration Agency Abstract Distributed knowledge management, intelligent software agents and XML based knowledge representation are three research

More information

Service Oriented Architecture

Service Oriented Architecture Service Oriented Architecture Charlie Abela Department of Artificial Intelligence charlie.abela@um.edu.mt Last Lecture Web Ontology Language Problems? CSA 3210 Service Oriented Architecture 2 Lecture Outline

More information

DA-NRW: a distributed architecture for long-term preservation

DA-NRW: a distributed architecture for long-term preservation DA-NRW: a distributed architecture for long-term preservation Manfred Thaller manfred.thaller@uni-koeln.de, Sebastian Cuy sebastian.cuy@uni-koeln.de, Jens Peters jens.peters@uni-koeln.de, Daniel de Oliveira

More information

Transparency and Efficiency in Grid Computing for Big Data

Transparency and Efficiency in Grid Computing for Big Data Transparency and Efficiency in Grid Computing for Big Data Paul L. Bergstein Dept. of Computer and Information Science University of Massachusetts Dartmouth Dartmouth, MA pbergstein@umassd.edu Abstract

More information

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management

Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management Design and Implementation of a Semantic Web Solution for Real-time Reservoir Management Ram Soma 2, Amol Bakshi 1, Kanwal Gupta 3, Will Da Sie 2, Viktor Prasanna 1 1 University of Southern California,

More information

Distributed Database for Environmental Data Integration

Distributed Database for Environmental Data Integration Distributed Database for Environmental Data Integration A. Amato', V. Di Lecce2, and V. Piuri 3 II Engineering Faculty of Politecnico di Bari - Italy 2 DIASS, Politecnico di Bari, Italy 3Dept Information

More information

RUP Design. Purpose of Analysis & Design. Analysis & Design Workflow. Define Candidate Architecture. Create Initial Architecture Sketch

RUP Design. Purpose of Analysis & Design. Analysis & Design Workflow. Define Candidate Architecture. Create Initial Architecture Sketch RUP Design RUP Artifacts and Deliverables RUP Purpose of Analysis & Design To transform the requirements into a design of the system to-be. To evolve a robust architecture for the system. To adapt the

More information

Integrating XML Data Sources using RDF/S Schemas: The ICS-FORTH Semantic Web Integration Middleware (SWIM)

Integrating XML Data Sources using RDF/S Schemas: The ICS-FORTH Semantic Web Integration Middleware (SWIM) Integrating XML Data Sources using RDF/S Schemas: The ICS-FORTH Semantic Web Integration Middleware (SWIM) Extended Abstract Ioanna Koffina 1, Giorgos Serfiotis 1, Vassilis Christophides 1, Val Tannen

More information

In ediscovery and Litigation Support Repositories MPeterson, June 2009

In ediscovery and Litigation Support Repositories MPeterson, June 2009 XAM PRESENTATION (extensible TITLE Access GOES Method) HERE In ediscovery and Litigation Support Repositories MPeterson, June 2009 Contents XAM Introduction XAM Value Propositions XAM Use Cases Digital

More information

Talend Metadata Manager. Reduce Risk and Friction in your Information Supply Chain

Talend Metadata Manager. Reduce Risk and Friction in your Information Supply Chain Talend Metadata Manager Reduce Risk and Friction in your Information Supply Chain Talend Metadata Manager Talend Metadata Manager provides a comprehensive set of capabilities for all facets of metadata

More information

Business Intelligence: Recent Experiences in Canada

Business Intelligence: Recent Experiences in Canada Business Intelligence: Recent Experiences in Canada Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies 2 Business Intelligence

More information

CiteSeer x in the Cloud

CiteSeer x in the Cloud Published in the 2nd USENIX Workshop on Hot Topics in Cloud Computing 2010 CiteSeer x in the Cloud Pradeep B. Teregowda Pennsylvania State University C. Lee Giles Pennsylvania State University Bhuvan Urgaonkar

More information

Amit Sheth & Ajith Ranabahu, 2010. Presented by Mohammad Hossein Danesh

Amit Sheth & Ajith Ranabahu, 2010. Presented by Mohammad Hossein Danesh Amit Sheth & Ajith Ranabahu, 2010 Presented by Mohammad Hossein Danesh 1 Agenda Introduction to Cloud Computing Research Motivation Semantic Modeling Can Help Use of DSLs Solution Conclusion 2 3 Motivation

More information

Databases in Organizations

Databases in Organizations The following is an excerpt from a draft chapter of a new enterprise architecture text book that is currently under development entitled Enterprise Architecture: Principles and Practice by Brian Cameron

More information

A View Integration Approach to Dynamic Composition of Web Services

A View Integration Approach to Dynamic Composition of Web Services A View Integration Approach to Dynamic Composition of Web Services Snehal Thakkar, Craig A. Knoblock, and José Luis Ambite University of Southern California/ Information Sciences Institute 4676 Admiralty

More information

Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms

Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Reverse Engineering of Relational Databases to Ontologies: An Approach Based on an Analysis of HTML Forms Irina Astrova 1, Bela Stantic 2 1 Tallinn University of Technology, Ehitajate tee 5, 19086 Tallinn,

More information

Service Cloud for information retrieval from multiple origins

Service Cloud for information retrieval from multiple origins Service Cloud for information retrieval from multiple origins Authors: Marisa R. De Giusti, CICPBA (Comisión de Investigaciones Científicas de la provincia de Buenos Aires), PrEBi, National University

More information

Integrating Relational Database Schemas using a Standardized Dictionary

Integrating Relational Database Schemas using a Standardized Dictionary Integrating Relational Database Schemas using a Standardized Dictionary Ramon Lawrence Advanced Database Systems Laboratory University of Manitoba Winnipeg, Manitoba, Canada umlawren@cs.umanitoba.ca Ken

More information

Collaborative SRB Data Federations

Collaborative SRB Data Federations WHITE PAPER Collaborative SRB Data Federations A Unified View for Heterogeneous High-Performance Computing INTRODUCTION This paper describes Storage Resource Broker (SRB): its architecture and capabilities

More information

Long Term Knowledge Retention and Preservation

Long Term Knowledge Retention and Preservation Long Term Knowledge Retention and Preservation Aziz Bouras University of Lyon, DISP Laboratory France abdelaziz.bouras@univ-lyon2.fr Recent years: How should digital 3D data and multimedia information

More information

Filtering the Web to Feed Data Warehouses

Filtering the Web to Feed Data Warehouses Witold Abramowicz, Pawel Kalczynski and Krzysztof We^cel Filtering the Web to Feed Data Warehouses Springer Table of Contents CHAPTER 1 INTRODUCTION 1 1.1 Information Systems 1 1.2 Information Filtering

More information

Data Integration Hub for a Hybrid Paper Search

Data Integration Hub for a Hybrid Paper Search Data Integration Hub for a Hybrid Paper Search Jungkee Kim 1,2, Geoffrey Fox 2, and Seong-Joon Yoo 3 1 Department of Computer Science, Florida State University, Tallahassee FL 32306, U.S.A., jungkkim@cs.fsu.edu,

More information

Knowledge Management in Heterogeneous Data Warehouse Environments

Knowledge Management in Heterogeneous Data Warehouse Environments Management in Heterogeneous Data Warehouse Environments Larry Kerschberg Co-Director, E-Center for E-Business, Department of Information and Software Engineering, George Mason University, MSN 4A4, 4400

More information

Knowledge-based Expressive Technologies within Cloud Computing Environments

Knowledge-based Expressive Technologies within Cloud Computing Environments Knowledge-based Expressive Technologies within Cloud Computing Environments Sergey V. Kovalchuk, Pavel A. Smirnov, Konstantin V. Knyazkov, Alexander S. Zagarskikh, Alexander V. Boukhanovsky 1 Abstract.

More information

1 What Are Web Services?

1 What Are Web Services? Oracle Fusion Middleware Introducing Web Services 11g Release 1 (11.1.1) E14294-04 January 2011 This document provides an overview of Web services in Oracle Fusion Middleware 11g. Sections include: What

More information

Chapter 11 Mining Databases on the Web

Chapter 11 Mining Databases on the Web Chapter 11 Mining bases on the Web INTRODUCTION While Chapters 9 and 10 provided an overview of Web data mining, this chapter discusses aspects of mining the databases on the Web. Essentially, we use the

More information

1 What Are Web Services?

1 What Are Web Services? Oracle Fusion Middleware Introducing Web Services 11g Release 1 (11.1.1.6) E14294-06 November 2011 This document provides an overview of Web services in Oracle Fusion Middleware 11g. Sections include:

More information

Middleware support for the Internet of Things

Middleware support for the Internet of Things Middleware support for the Internet of Things Karl Aberer, Manfred Hauswirth, Ali Salehi School of Computer and Communication Sciences Ecole Polytechnique Fédérale de Lausanne (EPFL) CH-1015 Lausanne,

More information

Autonomy for SOHO Ground Operations

Autonomy for SOHO Ground Operations From: FLAIRS-01 Proceedings. Copyright 2001, AAAI (www.aaai.org). All rights reserved. Autonomy for SOHO Ground Operations Walt Truszkowski, NASA Goddard Space Flight Center (GSFC) Walt.Truszkowski@gsfc.nasa.gov

More information

WHITE PAPER DATA GOVERNANCE ENTERPRISE MODEL MANAGEMENT

WHITE PAPER DATA GOVERNANCE ENTERPRISE MODEL MANAGEMENT WHITE PAPER DATA GOVERNANCE ENTERPRISE MODEL MANAGEMENT CONTENTS 1. THE NEED FOR DATA GOVERNANCE... 2 2. DATA GOVERNANCE... 2 2.1. Definition... 2 2.2. Responsibilities... 3 3. ACTIVITIES... 6 4. THE

More information

Web Service Based Data Management for Grid Applications

Web Service Based Data Management for Grid Applications Web Service Based Data Management for Grid Applications T. Boehm Zuse-Institute Berlin (ZIB), Berlin, Germany Abstract Web Services play an important role in providing an interface between end user applications

More information

Enabling the Big Data Commons through indexing of data and their interactions

Enabling the Big Data Commons through indexing of data and their interactions biomedical and healthcare Data Discovery Index Ecosystem Enabling the Big Data Commons through indexing of and their interactions 2 nd BD2K all-hands meeting Bethesda 11/12/15 Aims 1. Help users find accessible

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

Oracle Data Miner (Extension of SQL Developer 4.0)

Oracle Data Miner (Extension of SQL Developer 4.0) An Oracle White Paper September 2013 Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into a workflow using the SQL Query node Denny Wong Oracle Data Mining

More information

ECS 165A: Introduction to Database Systems

ECS 165A: Introduction to Database Systems ECS 165A: Introduction to Database Systems Todd J. Green based on material and slides by Michael Gertz and Bertram Ludäscher Winter 2011 Dept. of Computer Science UC Davis ECS-165A WQ 11 1 1. Introduction

More information

Bringing Business Objects into ETL Technology

Bringing Business Objects into ETL Technology Bringing Business Objects into ETL Technology Jing Shan Ryan Wisnesky Phay Lau Eugene Kawamoto Huong Morris Sriram Srinivasn Hui Liao 1. Northeastern University, jshan@ccs.neu.edu 2. Stanford University,

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518

International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518 International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,

More information

Semantic Search in Portals using Ontologies

Semantic Search in Portals using Ontologies Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br

More information

BUSINESS VALUE OF SEMANTIC TECHNOLOGY

BUSINESS VALUE OF SEMANTIC TECHNOLOGY BUSINESS VALUE OF SEMANTIC TECHNOLOGY Preliminary Findings Industry Advisory Council Emerging Technology (ET) SIG Information Sharing & Collaboration Committee July 15, 2005 Mills Davis Managing Director

More information

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING Ramesh Babu Palepu 1, Dr K V Sambasiva Rao 2 Dept of IT, Amrita Sai Institute of Science & Technology 1 MVR College of Engineering 2 asistithod@gmail.com

More information

Implementing Ontology-based Information Sharing in Product Lifecycle Management

Implementing Ontology-based Information Sharing in Product Lifecycle Management Implementing Ontology-based Information Sharing in Product Lifecycle Management Dillon McKenzie-Veal, Nathan W. Hartman, and John Springer College of Technology, Purdue University, West Lafayette, Indiana

More information

IO Informatics The Sentient Suite

IO Informatics The Sentient Suite IO Informatics The Sentient Suite Our software, The Sentient Suite, allows a user to assemble, view, analyze and search very disparate information in a common environment. The disparate data can be numeric

More information

Workflow Requirements (Dec. 12, 2006)

Workflow Requirements (Dec. 12, 2006) 1 Functional Requirements Workflow Requirements (Dec. 12, 2006) 1.1 Designing Workflow Templates The workflow design system should provide means for designing (modeling) workflow templates in graphical

More information

OWL based XML Data Integration

OWL based XML Data Integration OWL based XML Data Integration Manjula Shenoy K Manipal University CSE MIT Manipal, India K.C.Shet, PhD. N.I.T.K. CSE, Suratkal Karnataka, India U. Dinesh Acharya, PhD. ManipalUniversity CSE MIT, Manipal,

More information

Lightweight Data Integration using the WebComposition Data Grid Service

Lightweight Data Integration using the WebComposition Data Grid Service Lightweight Data Integration using the WebComposition Data Grid Service Ralph Sommermeier 1, Andreas Heil 2, Martin Gaedke 1 1 Chemnitz University of Technology, Faculty of Computer Science, Distributed

More information

2 Associating Facts with Time

2 Associating Facts with Time TEMPORAL DATABASES Richard Thomas Snodgrass A temporal database (see Temporal Database) contains time-varying data. Time is an important aspect of all real-world phenomena. Events occur at specific points

More information

Data Quality in Information Integration and Business Intelligence

Data Quality in Information Integration and Business Intelligence Data Quality in Information Integration and Business Intelligence Leopoldo Bertossi Carleton University School of Computer Science Ottawa, Canada : Faculty Fellow of the IBM Center for Advanced Studies

More information

A Model-based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems

A Model-based Software Architecture for XML Data and Metadata Integration in Data Warehouse Systems Proceedings of the Postgraduate Annual Research Seminar 2005 68 A Model-based Software Architecture for XML and Metadata Integration in Warehouse Systems Abstract Wan Mohd Haffiz Mohd Nasir, Shamsul Sahibuddin

More information