Using Databases to Manage State Information for Globally Distributed Data

1 Storage Resource Broker
Using Databases to Manage State Information for Globally Distributed Data
Reagan W. Moore
San Diego Supercomputer Center
sdsc.edu/srb

2 Abstract
The management of globally distributed data is simplified through the use of data grids, which enable data sharing environments. Data grids provide:
- Interoperability mechanisms needed to interact with legacy storage systems and legacy applications
- Logical name spaces needed to identify files, resources, and users
- Consistent management of state information about each file within the distributed environment
- Access controls, descriptive metadata, and administrative metadata
These capabilities enable data virtualization, the ability to manage data independently of the chosen storage repositories. Examples of management of globally distributed data include data grid federation, distributed digital libraries, and distributed persistent archives.

3 Research Questions
- How do we build global data management systems that rely on database technology to support state information?
- Is the current state of the art sufficient, or do we need extensions to current database technology?

4 Global Data Management Precursor
Alternative Architecture Study for NASA Earth Observing Satellite EOSDIS
Mike Stonebraker, Jim Gray, Jeff Dozier, William Farrell, Reagan Moore
- Proposed aggressive use of database technology to manage 3 TBs data ingestion per day from multiple sites
- Replication of data
- 15 PB archive
- Discovery and manipulation of the collection
- Just-in-time acquisition of technology

5 Data Grid Evolution
- DARPA Massive Data Analysis Systems
- DARPA/USPTO Distributed Object Computation Testbed
- NSF National Partnership for Advanced Computational Infrastructure
- DOE Accelerated Strategic Computing Initiative data grid
- NARA persistent archive
- NASA Information Power Grid
- NLM Digital Embryo digital library
- DOE Particle Physics data grid
- NSF Grid Physics Network data grid
- NSF National Virtual Observatory data grid
- NSF National Science Digital Library persistent archive
- NSF Southern California Earthquake Center digital library
- NIH Biomedical Informatics Research Network data grid
- NSF Real-time Observatories, Applications, and Data management Network
- NSF ITR, Constraint based data systems
- LC Digital Preservation Lifecycle Management
- LC National Digital Information Infrastructure and Preservation Program

6 Terminology
- 1998: Data Grid, a data management system that organizes distributed data into collections
  - Data Grid - data virtualization
- 2000: Persistent Archive, a data management system that handles technology evolution
  - Persistent Archive - infrastructure independence

7 Data Grids: First Viewpoint
- Create a shared collection which manages state information independently from the storage systems
- Build a metadata catalog to store state information
- Separate data access mechanisms from storage access mechanisms
- Data virtualization
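
The idea of a metadata catalog that holds state information apart from the storage systems can be sketched with a small relational table. This is a minimal illustration only: the table layout, column names, and the `register` and `locate` helpers are assumptions made for the example and are not the MCAT schema or the SRB API.

```python
import sqlite3


def create_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog. The schema is hypothetical, not MCAT's."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE replica ("
        " logical_name TEXT NOT NULL,"   # name users see in the shared collection
        " resource TEXT NOT NULL,"       # logical name of the storage resource
        " physical_path TEXT NOT NULL,"  # name required by the local repository
        " checksum TEXT,"
        " PRIMARY KEY (logical_name, resource))"
    )
    return conn


def register(conn, logical_name, resource, physical_path, checksum=None):
    """Record where one copy of a logical file is stored."""
    conn.execute(
        "INSERT INTO replica VALUES (?, ?, ?, ?)",
        (logical_name, resource, physical_path, checksum),
    )
    conn.commit()


def locate(conn, logical_name):
    """Map a logical name to every (resource, physical_path) that holds it."""
    return conn.execute(
        "SELECT resource, physical_path FROM replica"
        " WHERE logical_name = ? ORDER BY resource",
        (logical_name,),
    ).fetchall()
```

Because the application only ever asks for the logical name, a file can be moved or replicated by updating catalog rows, without changing how it is accessed.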

8 Trust Virtualization
- Shared collection owns the data
- At each remote storage system, an account ID is created under which the data grid stores files
- User authenticates to the data grid
- Data grid checks access controls
- Data grid server authenticates to a remote data grid server
- Remote data grid server authenticates to the remote storage repository

9 Data Grids
- Largest single data grids
  - ROADnet real-time sensor network, links 90 object ring buffers supporting 24,000 sensors
  - BaBar high energy physics, distributes SLAC collections to Lyon, France and Rome, Italy
  - Biomedical Informatics Research Network, links data resources across 25 institutions within the US
- Largest data grid federations
  - KEK high-energy physics federation of 7 data grids between Japan, South Korea, China, Taiwan, Australia, Poland, US
  - WUN federation of 5 academic institutions (SDSC, NCSA, U Bergen, U Southampton, U Manchester)
  - NARA Research Prototype Persistent Archive (SDSC, U Maryland, NARA, GA Tech)

10 Data Grids: Second Viewpoint
- Support data management applications
- Automate all aspects of data discovery, access, management, analysis, preservation
- Security paramount
- Distributed data
- Provide distributed data support for
  - Data sharing - data grids
  - Data publication - digital libraries
  - Data preservation - persistent archives
  - Data collections - real-time sensor data

11 Generic Data Management
- Data grids provide capabilities needed by digital libraries and persistent archives
  - Infrastructure independence to manage a collection distributed across multiple storage systems
  - Descriptive metadata to describe authenticity context of each file
  - Administrative metadata to maintain integrity
    - Location, replicas, checksums, audit trails, ownership, access controls, versions, locks, pinning, aggregation
- Data grids are implemented as middleware
  - Runs as application under an account ID

12 Federated Server Architecture
[Diagram. Labels: Application; Read by logical name or attribute condition; SRB server; SRB agent; MCAT; resources R1 and R2; Peer-to-peer brokering; Server(s) spawning; Parallel data access; Data access]
Numbered functions in the diagram:
1. Logical-to-physical mapping
2. Identification of replicas
3. Access and audit control

13 Storage Resource Broker
[Layered architecture diagram, top to bottom]
- Application
- Access interfaces: C library, Java; Unix shell; Linux I/O; C++; NT browser, Kepler actors; DLL / Python, Perl, Windows; HTTP, DSpace, Fedora, OpenDAP, OAI, WSDL, (WSRF), GridFTP
- Federation management
- Consistency and metadata management / authorization, authentication, audit
- Logical name space; latency management; data transport; metadata transport
- Database abstraction
  - Databases: DB2, Oracle, Sybase, Postgres, MySQL, Informix
- Storage repository abstraction
  - Archives: Tape, SAM-QFS, DMF, HPSS, ADSM, UniTree, ADS
  - ORB
  - File systems: Unix, NT, Mac OS X
  - Databases: DB2, Oracle, Sybase, Postgres, MySQL, Informix

14 Storage Resource Broker Collections at SDSC (8/2/2005)
Columns: GBs of data stored, Number of files, Users with ACLs
Data Grid
- NSF/ITR - National Virtual Observatory: 53,862 9,536,751 1
- NSF - National Partnership for Advanced Computational Infrastructure: 36,149 7,539,180 3
- tic collections - Hayden planetarium: 8, ,352 2
- ne - public collections: 12,998 6,707,952
- NSF/NPACI - Biology and Environmental collections: 40,155 76,083
- NSF/NPACI - Joint Center for Structural Genomics: 15,731 1,577,260
- NSF - TeraGrid, ENZO Cosmology simulations: 176,730 2,125,945 3,2
- NIH - Biomedical Informatics Research Network: 10,561 7,596,888 3
Digital Library
- NSF/NPACI - Long Term Ecological Reserve: 256 9,033
- NSF/NPACI - Grid Portal: 2,620 53,
- Alliance for Cell Signaling microarray data: ,594
- NSF - National Science Digital Library SIO Explorer collection: 2,733 1,083,998
- NSF/ITR - Southern California Earthquake Center: 131,010 2,702,421
Persistent Archive
- PRC Persistent Archive Testbed (Kentucky, Ohio, Michigan, Minnesota): ,186
- SD Libraries archive: 4, ,050
- NARA - Research Prototype Persistent Archive: 1, ,434
- NSF - National Science Digital Library persistent archive: 3,600 27,034,150 1
TOTAL: 501 TB 68 million 5,3

15 Infrastructure Independence
- Data virtualization
- Management of name spaces independently of the storage repositories
  - Global name spaces
  - Persistent identifiers
  - Collection-based ownership of files
- Support for access operations independently of the storage repositories
  - Separation of access methods from storage protocols
  - Support for integrity and authenticity operations

16 Separation of Access Method from Storage Protocols
[Diagram, top to bottom: Access Method; Access Operations; Data Grid; Storage Operations; Storage Protocol; Storage System]
Map from the operations used by the access method to a standard set of operations used to interact with the storage system
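
The mapping onto a standard set of storage operations can be sketched as a driver interface. The class and function names here (`StorageDriver`, `MemoryDriver`, `UnixFileDriver`, `copy_object`) are hypothetical and chosen for illustration; they are not SRB's driver API, and only two operations are shown.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Standard operations the grid uses, whatever the repository is."""

    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> int: ...


class MemoryDriver(StorageDriver):
    """Stand-in for a repository that stores objects as blobs."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def read(self, path: str, offset: int, length: int) -> bytes:
        return self._blobs[path][offset:offset + length]

    def write(self, path: str, data: bytes) -> int:
        self._blobs[path] = bytes(data)
        return len(data)


class UnixFileDriver(StorageDriver):
    """Translates the same operations into local file system calls."""

    def read(self, path: str, offset: int, length: int) -> bytes:
        with open(path, "rb") as handle:
            handle.seek(offset)
            return handle.read(length)

    def write(self, path: str, data: bytes) -> int:
        with open(path, "wb") as handle:
            return handle.write(data)


def copy_object(src: StorageDriver, src_path: str,
                dst: StorageDriver, dst_path: str, length: int) -> int:
    """Copy between two repositories using only the standard operations."""
    return dst.write(dst_path, src.read(src_path, 0, length))
```

An access method written against `StorageDriver` does not change when a new repository type is added; only a new driver is written.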

17 Data Grid Operations
- File access
  - Open, close, read, write, seek, stat, synch, ...
  - Audit, versions, pinning, checksums, synchronize, ...
  - Parallel I/O and firewall interactions
  - Versions, backups, replicas
- Latency management
  - Bulk operations: register, load, unload, delete, ...
  - Remote procedures: HDFv5, data filtering, file parsing, replicate, aggregate
- Metadata management
  - SQL generation, schema extension, XML import and export, browsing, queries, ...
- GGF, Operations for Access, Management, and Transport at Remote Sites

18 Latency Management - Bulk Operations
- Bulk register
  - Create a logical name for a file
  - Load context (metadata)
- Bulk load
  - Create a copy of the file on a data grid storage repository
- Bulk unload
  - Provide containers to hold small files and pointers to each file location
- Bulk delete
  - Trash can
- Sticky bits for access control, ...
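
The container idea, one large object holding many small files plus a pointer to each file's location, can be sketched as follows. The byte layout and the function names `pack_container` and `read_from_container` are assumptions for illustration and do not describe the SRB container format.

```python
def pack_container(files: dict[str, bytes]) -> tuple[bytes, dict[str, tuple[int, int]]]:
    """Concatenate small files and record (offset, length) for each one."""
    index: dict[str, tuple[int, int]] = {}
    chunks: list[bytes] = []
    offset = 0
    for name in sorted(files):  # sorted so the layout is deterministic
        data = files[name]
        index[name] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    return b"".join(chunks), index


def read_from_container(container: bytes,
                        index: dict[str, tuple[int, int]],
                        name: str) -> bytes:
    """Fetch one small file using its pointer into the container."""
    offset, length = index[name]
    return container[offset:offset + length]
```

Moving the single container over a wide-area link costs one round trip instead of one per small file, which is the latency saving the slide refers to.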

19 Examples of Extensibility
- The 3 fundamental APIs are C library, shell commands, Java
  - Other access mechanisms are ported on top of these interfaces
- API evolution
  - Initial access through C library, Unix shell command
  - Added inQ Windows browser (C++ library)
  - Added mySRB Web browser (C library and shell commands)
  - Added Java (Jargon)
  - Added Perl/Python load libraries (shell command)
  - Added WSDL (Java)
  - Added OAI-PMH, OpenDAP, DSpace digital library (Java)
  - Added Kepler actors for dataflow access (Java)
  - Added GridFTP version 3.3 (C library)

20 Examples of Extensibility
- Storage Repository Driver evolution
  - Initially supported Unix file system
  - Added archival access - UniTree, HPSS
  - Added FTP/HTTP
  - Added database blob access
  - Added database table interface
  - Added Windows file system
  - Added project archives - dCache, Castor, ADS
  - Added Object Ring Buffer, Datascope
  - Added GridFTP version 3.3
- Database management evolution
  - Postgres
  - DB2
  - Oracle
  - Informix
  - Sybase

21 Logical Name Spaces
[Diagram: Data Access Methods (C library, Unix, Web Browser) connected directly to the Storage Repository]
- Storage Repository
  - Storage location
  - User name
  - File name
  - File context (creation date, ...)
  - Access constraints
Data access directly between application and storage repository using names required by the local repository

22 Logical Name Spaces
[Diagram: Data Access Methods (C library, Unix, Web Browser); Data Collection; Data Grid; Storage Repository]
- Storage Repository
  - Storage location
  - User name
  - File name
  - File context (creation date, ...)
  - Access constraints
- Data Grid
  - Logical resource name space
  - Logical user name space
  - Logical file name space
  - Logical context (metadata)
  - Control/consistency constraints
Data is organized as a shared collection

23 Federation Between Data Grids
[Diagram: Data Access Methods (Web Browser, DSpace, OAI-PMH) over two data grids]
- Data Collection A, Data Grid
  - Logical resource name space
  - Logical user name space
  - Logical file name space
  - Logical context (metadata)
  - Control/consistency constraints
- Data Collection B, Data Grid
  - Logical resource name space
  - Logical user name space
  - Logical file name space
  - Logical context (metadata)
  - Control/consistency constraints
Access controls and consistency constraints on cross registration of digital entities

24 Types of Federation
- Peer-to-peer grids
  - Data grids forward requests for access to public data
- Hierarchical grids
  - Master - slave
  - All files in slave data grid are replicated from the master data grid
- Central archive
  - Multiple independent data grids deposit replicas into a central archive
- Replication grids
  - Two independent data grids serve as back-up sites for each other

25 Data Management Systems
- Digital Libraries
  - DSpace services for ingestion and description of files - ported on top of SRB data grid
  - Fedora relationship management services, port on top of SRB data grid in test
  - Cheshire integration on top of SRB data grid
  - OAI-PMH interface to the SRB data grid
- Persistent archives
  - Manage authenticity, integrity, and infrastructure independence
  - Integrate preservation processes on top of SRB data grid

26 Chronopolis Preservation Facility
- Demonstrate preservation environment
  - Authenticity
  - Integrity
  - Management of technology evolution
  - Mitigation of risk of data loss
    - Replication of data
    - Federation of catalogs
  - Management of preservation metadata
- Scalability
  - 3 collections / year
  - Support 100 TBs per site
[Diagram: Federation of Three Independent Data Grids, each with its own MCAT. Site labels: NCAR, U Md, SDSC]
- Deep Archive at NARA, no user access but complete copy
- Replicated copy at U Md for improved access, load balancing and disaster recovery
- Active archive at SDSC, user access

27 Distributed Metadata Management
- Database specific replication
  - Oracle
- Master-slave catalogs across vendors
  - Synchronize an independent metadata catalog with state from primary catalog
  - SRB version 3.4
- Master-slave catalog federation between data grids
  - How to enforce data grid administration policies across independent data grids

28 Data Grids: Third Viewpoint
- Require ability to apply dynamic consistency constraints to state information
  - When federating data grids
  - When modifying views of collections
  - When managing data placement
  - When asserting global properties (creating consistent state across the collection)
    - Synchronization of replicas
    - Validation of checksums
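
Asserting a global property such as "every replica matches its recorded checksum" can be sketched in a few lines. The function names are hypothetical, and SHA-256 is an assumption made for the example; the slides do not say which checksum algorithm the data grid uses.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Digest used as the recorded state; SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()


def find_stale_replicas(recorded: str, replicas: dict[str, bytes]) -> list[str]:
    """Return the sites whose copy no longer matches the catalog checksum."""
    return sorted(
        site for site, data in replicas.items() if checksum(data) != recorded
    )
```

An empty result means the assertion holds for this file; any site that is returned would need to be resynchronized from a good copy before the property can be recorded as valid.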

29 State Information Context
- Need to be able to characterize the consistency constraints that are evaluated when updating state information
- Example - copying data
  - Replica (intent that copy be synchronized)
  - Version (intent that copy be labeled)
  - Backup (intent that copy represent time snapshot)
  - New file (intent that copy be independent of original)
- Could change intent of the copy, changing the required state information
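
The copy example can be made concrete: the same physical copy requires different state information depending on the intent. The field names and the `state_for_copy` helper are hypothetical, chosen only to illustrate the four intents listed on the slide.

```python
from enum import Enum


class CopyIntent(Enum):
    REPLICA = "replica"    # copy is kept synchronized with the original
    VERSION = "version"    # copy is labeled
    BACKUP = "backup"      # copy represents a time snapshot
    NEW_FILE = "new_file"  # copy is independent of the original


def state_for_copy(intent, source, target, label=None, snapshot_time=None):
    """Return the state information to record for a copy (illustrative fields)."""
    if intent is CopyIntent.REPLICA:
        return {"logical_name": source, "physical_copy": target,
                "keep_synchronized": True}
    if intent is CopyIntent.VERSION:
        if label is None:
            raise ValueError("a version needs a label")
        return {"logical_name": source, "physical_copy": target,
                "version_label": label}
    if intent is CopyIntent.BACKUP:
        if snapshot_time is None:
            raise ValueError("a backup needs a snapshot time")
        return {"logical_name": source, "physical_copy": target,
                "snapshot_time": snapshot_time}
    # A new file gets its own logical name and no link to the original.
    return {"logical_name": target, "physical_copy": target}
```

Changing the intent later, for example promoting a backup to a replica, means replacing one set of fields with another, which is why the constraints evaluated on update need to be characterized explicitly.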

30 Dynamic Constraints
- Need technology that manages
  - Reification of consistency rules into metadata
  - State information about the reification
    - Version of consistency rules that were evaluated
    - Time stamp for when the evaluation was done
    - Granularity within the collection for which the reification is valid
- Require two versions
  - Management of procedural execution of rules
  - Management of global consistency assertions
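
One way to picture the state information about a reification is a record that carries the rule version, the evaluation time stamp, and the granularity it covers. This is a sketch under stated assumptions: the `RuleEvaluation` record, its field names, and the `needs_reevaluation` policy are invented for illustration and are not part of SRB.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RuleEvaluation:
    rule_name: str
    rule_version: int       # version of the consistency rule that was evaluated
    scope: str              # granularity: the collection path the result covers
    evaluated_at: datetime  # time stamp for when the evaluation was done
    passed: bool


def needs_reevaluation(record: RuleEvaluation, current_rule_version: int,
                       max_age_seconds: float, now: datetime) -> bool:
    """A stored result is stale if the rule changed or the result is too old."""
    if record.rule_version != current_rule_version:
        return True
    return (now - record.evaluated_at).total_seconds() > max_age_seconds
```

With records like this in the catalog, a global consistency assertion can be answered by a query over stored evaluations, and only stale entries need the rule to be executed again.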

31 Projects
- Monash University / NSF National Science Digital Library / NARA Persistent Archive
  - Integration of Fedora and SRB data grid
- SDSC - NSF ITR on Constraint-based Knowledge Systems for Grids, Digital Libraries, and Persistent Archives
  - Embedding of dynamic constraint management within the SRB data grid

32 For More Information
Reagan W. Moore
San Diego Supercomputer Center


More information

EDG Project: Database Management Services

EDG Project: Database Management Services EDG Project: Database Management Services Leanne Guy for the EDG Data Management Work Package EDG::WP2 Leanne.Guy@cern.ch http://cern.ch/leanne 17 April 2002 DAI Workshop Presentation 1 Information in

More information

SQL Server Training Course Content

SQL Server Training Course Content SQL Server Training Course Content SQL Server Training Objectives Installing Microsoft SQL Server Upgrading to SQL Server Management Studio Monitoring the Database Server Database and Index Maintenance

More information

Division of IT Security Best Practices for Database Management Systems

Division of IT Security Best Practices for Database Management Systems Division of IT Security Best Practices for Database Management Systems 1. Protect Sensitive Data 1.1. Label objects containing or having dedicated access to sensitive data. 1.1.1. All new SCHEMA/DATABASES

More information

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago Globus Striped GridFTP Framework and Server Raj Kettimuthu, ANL and U. Chicago Outline Introduction Features Motivation Architecture Globus XIO Experimental Results 3 August 2005 The Ohio State University

More information

Building the Internet of Things Jim Green - CTO, Data & Analytics Business Group, Cisco Systems

Building the Internet of Things Jim Green - CTO, Data & Analytics Business Group, Cisco Systems Building the Internet of Things Jim Green - CTO, Data & Analytics Business Group, Cisco Systems Brian McCarson Sr. Principal Engineer & Sr. System Architect, Internet of Things Group, Intel Corp Mac Devine

More information

Introduction: Database management system

Introduction: Database management system Introduction Databases vs. files Basic concepts Brief history of databases Architectures & languages Introduction: Database management system User / Programmer Database System Application program Software

More information

Alliance Key Manager Solution Brief

Alliance Key Manager Solution Brief Alliance Key Manager Solution Brief KEY MANAGEMENT Enterprise Encryption Key Management On the road to protecting sensitive data assets, data encryption remains one of the most difficult goals. A major

More information

Symantec Enterprise Vault.cloud Overview

Symantec Enterprise Vault.cloud Overview Fact Sheet: Archiving and ediscovery Introduction The data explosion that has burdened corporations and governments across the globe for the past decade has become increasingly expensive and difficult

More information

Sisense. Product Highlights. www.sisense.com

Sisense. Product Highlights. www.sisense.com Sisense Product Highlights Introduction Sisense is a business intelligence solution that simplifies analytics for complex data by offering an end-to-end platform that lets users easily prepare and analyze

More information

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18

The Mantid Project. The challenges of delivering flexible HPC for novice end users. Nicholas Draper SOS18 The Mantid Project The challenges of delivering flexible HPC for novice end users Nicholas Draper SOS18 What Is Mantid A framework that supports high-performance computing and visualisation of scientific

More information

CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS

CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS CHAPTER 1: OPERATING SYSTEM FUNDAMENTALS What is an operating? A collection of software modules to assist programmers in enhancing efficiency, flexibility, and robustness An Extended Machine from the users

More information

Chapter 4 Cloud Computing Applications and Paradigms. Cloud Computing: Theory and Practice. 1

Chapter 4 Cloud Computing Applications and Paradigms. Cloud Computing: Theory and Practice. 1 Chapter 4 Cloud Computing Applications and Paradigms Chapter 4 1 Contents Challenges for cloud computing. Existing cloud applications and new opportunities. Architectural styles for cloud applications.

More information

Chronopolis: A Partnership. The Chronopolis: Digital Preservation Archive Development and Demonstration Program

Chronopolis: A Partnership. The Chronopolis: Digital Preservation Archive Development and Demonstration Program The Chronopolis: Digital Preservation Archive Development and Demonstration Program Robert H. McDonald Indiana University Ardys Kozbial UC San Diego Libraries David Minor San Diego Supercomputer Center

More information

CatDV Pro Workgroup Serve r

CatDV Pro Workgroup Serve r Architectural Overview CatDV Pro Workgroup Server Square Box Systems Ltd May 2003 The CatDV Pro client application is a standalone desktop application, providing video logging and media cataloging capability

More information

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela

Hadoop Distributed File System. T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Hadoop Distributed File System T-111.5550 Seminar On Multimedia 2009-11-11 Eero Kurkela Agenda Introduction Flesh and bones of HDFS Architecture Accessing data Data replication strategy Fault tolerance

More information

A Web Services Data Analysis Grid *

A Web Services Data Analysis Grid * A Web Services Data Analysis Grid * William A. Watson III, Ian Bird, Jie Chen, Bryan Hess, Andy Kowalski, Ying Chen Thomas Jefferson National Accelerator Facility 12000 Jefferson Av, Newport News, VA 23606,

More information

MySQL Administration and Management Essentials

MySQL Administration and Management Essentials MySQL Administration and Management Essentials Craig Sylvester MySQL Sales Consultant 1 Safe Harbor Statement The following is intended to outline our general product direction. It

More information

Diagram 1: Islands of storage across a digital broadcast workflow

Diagram 1: Islands of storage across a digital broadcast workflow XOR MEDIA CLOUD AQUA Big Data and Traditional Storage The era of big data imposes new challenges on the storage technology industry. As companies accumulate massive amounts of data from video, sound, database,

More information

Protegrity Data Security Platform

Protegrity Data Security Platform Protegrity Data Security Platform The Protegrity Data Security Platform design is based on a hub and spoke deployment architecture. The Enterprise Security Administrator (ESA) enables the authorized Security

More information

The software platform for storing, preserving and sharing very large data sets. www.active-circle.com

The software platform for storing, preserving and sharing very large data sets. www.active-circle.com The software platform for storing, preserving and sharing very large data sets www.active-circle.com The easiest solution for storing and archiving very large data sets! ACTIVE CIRCLE HIGHLIGHTS Software-based

More information

The Arctic Observing Network and its Data Management Challenges Florence Fetterer (NSIDC/CIRES/CU), James A. Moore (NCAR/EOL), and the CADIS team

The Arctic Observing Network and its Data Management Challenges Florence Fetterer (NSIDC/CIRES/CU), James A. Moore (NCAR/EOL), and the CADIS team The Arctic Observing Network and its Data Management Challenges Florence Fetterer (NSIDC/CIRES/CU), James A. Moore (NCAR/EOL), and the CADIS team Photo courtesy Andrew Mahoney NSF Vision What is AON? a

More information

Cloudbuz at Glance. How to take control of your File Transfers!

Cloudbuz at Glance. How to take control of your File Transfers! How to take control of your File Transfers! A MFT solution for ALL organisations! Cloudbuz is a MFT (Managed File Transfer) platform for organisations and businesses installed On-Premise or distributed

More information

Vodacom Managed Hosted Backups

Vodacom Managed Hosted Backups Vodacom Managed Hosted Backups Robust Data Protection for your Business Critical Data Enterprise class Backup and Recovery and Data Management on Diverse Platforms Vodacom s Managed Hosted Backup offers

More information

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com)

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) CSP CHRONOS Compliance statement for ISO 14721:2003 (Open Archival Information System Reference Model) 2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (www.ikeep.com, info@ikeep.com) The international

More information

The glite File Transfer Service

The glite File Transfer Service The glite File Transfer Service Peter Kunszt Paolo Badino Ricardo Brito da Rocha James Casey Ákos Frohner Gavin McCance CERN, IT Department 1211 Geneva 23, Switzerland Abstract Transferring data reliably

More information

TimePictra Release 10.0

TimePictra Release 10.0 DATA SHEET Release 100 Next Generation Synchronization System Key Features Web-based multi-tier software architecture Comprehensive FCAPS management functions Software options for advanced FCAPS features

More information

Digital Preservation. OAIS Reference Model

Digital Preservation. OAIS Reference Model Digital Preservation OAIS Reference Model Stephan Strodl, Andreas Rauber Institut für Softwaretechnik und Interaktive Systeme TU Wien http://www.ifs.tuwien.ac.at/dp Aim OAIS model Understanding the functionality

More information

Symantec NetBackup 7 Clients and Agents

Symantec NetBackup 7 Clients and Agents Complete protection for your information-driven enterprise Overview Symantec NetBackup provides a simple yet comprehensive selection of innovative clients and agents to optimize the performance and efficiency

More information

Agentless Cloud Backup and Recovery Software for the Enterprise

Agentless Cloud Backup and Recovery Software for the Enterprise Agentless Cloud Backup and Recovery Software for the Enterprise Armada Cloud 6165 Greenwich Drive San Diego, California 92122 United States T: 888-924-1777 E: sales@armadacloud.com W: www.armadacloud.com

More information

IBM Optim. The ROI of an Archiving Project. Michael Mittman Optim Products IBM Software Group. 2008 IBM Corporation

IBM Optim. The ROI of an Archiving Project. Michael Mittman Optim Products IBM Software Group. 2008 IBM Corporation IBM Optim The ROI of an Archiving Project Michael Mittman Optim Products IBM Software Group Disclaimers IBM customers are responsible for ensuring their own compliance with legal requirements. It is the

More information

Guardium Change Auditing System (CAS)

Guardium Change Auditing System (CAS) Guardium Change Auditing System (CAS) Highlights. Tracks all changes that can affect the security of database environments outside the scope of the database engine Complements Guardium's Database Activity

More information

Towards Heterogeneous Grid Database Replication. Kemian Dang

Towards Heterogeneous Grid Database Replication. Kemian Dang Towards Heterogeneous Grid Database Replication Kemian Dang Master of Science Computer Science School of Informatics University of Edinburgh 2008 Abstract Heterogeneous database replication in the Grid

More information

Karl Lum Partner, LabKey Software klum@labkey.com. Evolution of Connectivity in LabKey Server

Karl Lum Partner, LabKey Software klum@labkey.com. Evolution of Connectivity in LabKey Server Karl Lum Partner, LabKey Software klum@labkey.com Evolution of Connectivity in LabKey Server Connecting Data to LabKey Server Lowering the barrier to connect scientific data to LabKey Server Increased

More information

EII - ETL - EAI What, Why, and How!

EII - ETL - EAI What, Why, and How! IBM Software Group EII - ETL - EAI What, Why, and How! Tom Wu 巫 介 唐, wuct@tw.ibm.com Information Integrator Advocate Software Group IBM Taiwan 2005 IBM Corporation Agenda Data Integration Challenges and

More information

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications White Paper Table of Contents Overview...3 Replication Types Supported...3 Set-up &

More information

DiskPulse DISK CHANGE MONITOR

DiskPulse DISK CHANGE MONITOR DiskPulse DISK CHANGE MONITOR User Manual Version 7.9 Oct 2015 www.diskpulse.com info@flexense.com 1 1 DiskPulse Overview...3 2 DiskPulse Product Versions...5 3 Using Desktop Product Version...6 3.1 Product

More information

Ultimate Guide to Oracle Storage

Ultimate Guide to Oracle Storage Ultimate Guide to Oracle Storage Presented by George Trujillo George.Trujillo@trubix.com George Trujillo Twenty two years IT experience with 19 years Oracle experience. Advanced database solutions such

More information

Open Unified Data Protection and Business Continuity Framework

Open Unified Data Protection and Business Continuity Framework Open Unified Data Protection and Business Continuity Framework Presenter: Dr. Anupam Bhide Calsoft, Inc. Email: anupam@calsoftinc.com Phone: +91 (20) 4079 2900 Authors: Anupam Bhide (Calsoft) Parag Kulkarni

More information

Cloud Computing and Advanced Relationship Analytics

Cloud Computing and Advanced Relationship Analytics Cloud Computing and Advanced Relationship Analytics Using Objectivity/DB to Discover the Relationships in your Data By Brian Clark Vice President, Product Management Objectivity, Inc. 408 992 7136 brian.clark@objectivity.com

More information

ILM: Tiered Services & The Need For Classification

ILM: Tiered Services & The Need For Classification ILM: Tiered Services & The Need For Classification Edgar StPierre, EMC 2 SNW San Diego April 2007 SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA. Member companies

More information

Infosys GRADIENT. Enabling Enterprise Data Virtualization. Keywords. Grid, Enterprise Data Integration, EII Introduction

Infosys GRADIENT. Enabling Enterprise Data Virtualization. Keywords. Grid, Enterprise Data Integration, EII Introduction Infosys GRADIENT Enabling Enterprise Data Virtualization Keywords Grid, Enterprise Data Integration, EII Introduction A new generation of business applications is emerging to support customer service,

More information

Informatica ILM Archive and Application Retirement

Informatica ILM Archive and Application Retirement Informatica ILM Archive and Application Retirement Thierry AUDOT Technical Manager EMEA 26 th September 2012 1 Live Archiving What are key users pain points? My reports take forever to run! I need all

More information

Reference Architectures for Repositories and Preservation Archiving

Reference Architectures for Repositories and Preservation Archiving Reference Architectures for Repositories and Preservation Archiving Keith Rajecki Education Solutions Architect Sun Microsystems, Inc. 1 Agenda Challenges Solution Architectures > Open Storage/Open Archive

More information

Postgres Plus xdb Replication Server with Multi-Master User s Guide

Postgres Plus xdb Replication Server with Multi-Master User s Guide Postgres Plus xdb Replication Server with Multi-Master User s Guide Postgres Plus xdb Replication Server with Multi-Master build 57 August 22, 2012 , Version 5.0 by EnterpriseDB Corporation Copyright 2012

More information

Oracle Warehouse Builder 10g

Oracle Warehouse Builder 10g Oracle Warehouse Builder 10g Architectural White paper February 2004 Table of contents INTRODUCTION... 3 OVERVIEW... 4 THE DESIGN COMPONENT... 4 THE RUNTIME COMPONENT... 5 THE DESIGN ARCHITECTURE... 6

More information

Requirements Checklist for Choosing a Cloud Backup and Recovery Service Provider

Requirements Checklist for Choosing a Cloud Backup and Recovery Service Provider Whitepaper: Requirements Checklist for Choosing a Cloud Backup and Recovery Service Provider WHITEPAPER Requirements Checklist for Choosing a Cloud Backup and Recovery Service Provider Requirements Checklist

More information