Evaluating Metadata access



Similar documents
glibrary: Digital Asset Management System for the Grid

Remote Sensitive Image Stations and Grid Services

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

EDG Project: Database Management Services

Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Customer Bank Account Management System Technical Specification Document

Geodatabase Programming with SQL

Unifying the Global Data Space using DDS and SQL

Data Grids. Lidan Wang April 5, 2007

UltraQuest Cloud Server. White Paper Version 1.0

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam

Version Overview. Business value

Distributed Database Access in the LHC Computing Grid with CORAL

SOLUTION BRIEF. JUST THE FAQs: Moving Big Data with Bulk Load.

Executive summary. Table of Contents. Benefits of an integration platform. Technical paper Infor Cloverleaf Integration Suite

Using Object And Object-Oriented Technologies for XML-native Database Systems

Oracle Architecture, Concepts & Facilities

CA Repository for Distributed. Systems r2.3. Benefits. Overview. The CA Advantage

Using EMC Documentum with Adobe LiveCycle ES

Release Bulletin Sybase ETL Small Business Edition 4.2

DBX. SQL database extension for Splunk. Siegfried Puchbauer

Real-time Data Replication

Software Development around a Millisecond

SOFTWARE ENGINEERING PROGRAM

Data Integrator: Object Naming Conventions

Introduction to XML Applications

A Web services solution for Work Management Operations. Venu Kanaparthy Dr. Charles O Hara, Ph. D. Abstract

ODBC Client Driver Help Kepware, Inc.

Oracle Database 10g: Building GIS Applications Using the Oracle Spatial Network Data Model. An Oracle Technical White Paper May 2005

Institute of Computational Modeling SB RAS

DataDirect XQuery Technical Overview

ARDA Experiment Dashboard

New Features in Neuron ESB 2.6

Java EE 7: Back-End Server Application Development

Java SE 8 Programming

High Performance XML Data Retrieval

Querying Databases Using the DB Query and JDBC Query Nodes

FileMaker 13. ODBC and JDBC Guide

2009 ikeep Ltd, Morgenstrasse 129, CH-3018 Bern, Switzerland (

A Framework for Developing the Web-based Data Integration Tool for Web-Oriented Data Warehousing

Communiqué 4. Standardized Global Content Management. Designed for World s Leading Enterprises. Industry Leading Products & Platform

Bringing Together Data Integration and SOA

Informatica Data Replication FAQs

HALOGEN. Technical Design Specification. Version 2.0

A Generic Model for Querying Multiple Databases in a Distributed Environment Using JDBC and an Uniform Interface

ActiveVOS Server Architecture. March 2009

FileMaker 12. ODBC and JDBC Guide

An Oracle White Paper June Integration Technologies for Primavera Solutions

Developing a Web Server Platform with SAPI Support for AJAX RPC using JSON

AWS Schema Conversion Tool. User Guide Version 1.0

What is a database? COSC 304 Introduction to Database Systems. Database Introduction. Example Problem. Databases in the Real-World

Third-Party Software Support. Converting from SAS Table Server to a SQL Server Database

Run your own Oracle Database Benchmarks with Hammerora

Spring,2015. Apache Hive BY NATIA MAMAIASHVILI, LASHA AMASHUKELI & ALEKO CHAKHVASHVILI SUPERVAIZOR: PROF. NODAR MOMTSELIDZE

GIS Databases With focused on ArcSDE

2sms SMS API Overview

SequeLink Server for ODBC Socket

Architecting the Future of Big Data

Using SQL Developer. Copyright 2008, Oracle. All rights reserved.

JAVA r VOLUME II-ADVANCED FEATURES. e^i v it;

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Standardized Multimedia Retrieval in Distributed Heterogenous Database Systems. Dr. Mario Döller

MySQL and Hadoop Big Data Integration

Oracle Warehouse Builder 10g

Weaving Stored Procedures into Java at Zalando

AWS Schema Conversion Tool. User Guide Version 1.0

TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION

SAS MDM 4.2. User s Guide. SAS Documentation

mframe Software Development Platform KEY FEATURES

Sharding with postgres_fdw

Principles and Foundations of Web Services: An Holistic View (Technologies, Business Drivers, Models, Architectures and Standards)

tibbr Now, the Information Finds You.

Best Practices for Hadoop Data Analysis with Tableau

Windows Authentication on Microsoft SQL Server

PDQ-Wizard Prototype 1.0 Installation Guide

AN INTEGRATION APPROACH FOR THE STATISTICAL INFORMATION SYSTEM OF ISTAT USING SDMX STANDARDS

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package Data Federation Administration Tool Guide

Course Code NCS2013: SharePoint 2013 No-code Solutions for Office 365 and On-premises

1 Changes in this release

Oracle Data Integrator 11g New Features & OBIEE Integration. Presented by: Arun K. Chaturvedi Business Intelligence Consultant/Architect

Hive Interview Questions

Jet Data Manager 2012 User Guide

Technical. Overview. ~ a ~ irods version 4.x

Graph Database Proof of Concept Report

Transcription:

Evaluating Metadata access strategies with the GOME test suite André Gemünd Fraunhofer SCAI www.eu-egee.org EGEE-II INFSO-RI-031688 EGEE and glite are registered trademarks

Motivation Testing the test suite Sufficiency of specification and utility Investigate AMGA and GRelC as alternatives Until now we ve used OGSA-DAI in NA4 Used a java wrapper to access from Python and Perl glite integration EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 2

Introduction DEGREE Project Dissemination and Exploitation of GRids in Earth science Bridge Earth Science and Grid Community Identify barriers for broader acceptance Identify and assess key requirements Improve communication and collaboration EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 3

Introduction Test suites Specify typical workflows for earth science applications As white papers for testing ti Grid middleware Organised and grouped into categories (data management, etc.) Consisting of test cases with annotated tested requirements EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 4

Introduction GOME-Validation Test Suite High amount of datasets from two sources GOME satellite measurements LIDAR ground station ti measurements Correlate by metadata geo-coordinates & date of measurement Target components (as specified): Data management Database access Workflow control EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 5

Proceeding What we did Implement GOME-Validation as a representative workflow Transmission i and Grid registration ti of data files Extraction and archiving of Metadata Bidirectional correlation of files through Metadata Abstraction of Metadata backend EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 6

Software Design Proceeding EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 7

Results Problems / Characteristics Backend Compatibility Data schema and types Query language GIS features Indexing (IDs) Bulk Action support Hierarchical metadata Reuse of Data EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 8

Results Database Compatibility AMGA uses ODBC MySQL, Oracle, pgsql, etc. Extensions and custom Functions need to be added to the Query Parser (Bison Grammar) GRelC C API libraries Config file states choose between mysql and pgsql Needs pgsql as configuration backend EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 9

Results Database Compatibility OGSA-DAI Unique strength Uses JDBC, exist and custom drivers Write data providers for arbitrary data sources Databases and files already included Combine data from different sources Execute Transformations on data Deliver to Grid-FTP, Gridservice, Client, EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 10

Proceeding Data schema (OGSA-DAI & GRelC) Raw SQL tables Taken directly from Test suite specification 2 Tables One for LIDAR and one for GOME files Problem: 1 Lidar files hosts n datasets Different time / coordinates Save redundant or introduce relations? EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 11

Proceeding Data schema (AMGA) We had to devise a modified schema AMGA uses path structures Entity-specific ifi attributes t Leverage advantages Dynamic change Inheritance of attributes (hierarchy) EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 12

Results Using hierarchies in AMGA example /gometest/lidar/ano/hgl/30108/ /ano/ /hgl/ Identifies station and thus also coordinates Here: Andoya, Norway Author, here: Georg Hansen /30108/ Identifies file entity Files in this directory Real Datasets EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 13

Results Datatypes: Location of measurement PostGIS Polygon AMGA can use int, float, varchar, timestamp, text, or numeric But: unknown fieldtypes of database get returned as text OGSA-DAI & GRelC let you choose No datatype abstraction Function to determine containment? See query language EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 14

Results Datatypes No additional types offered by the services Desirable Relations containment, adjacency, Custom relations (ontology-like) like) isresultof isusedinexperiment Array types Not only abstraction but extension EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 15

Results Query language OGSA-DAI and GRelC use SQL Highly coupled to table schema Differences in SQL dialect (e.g. pgsql <-> Oracle) Support for SQL functions, Views, Extensions Syntax errors if extension is not enabled (e.g. PostGIS) GRelC add. supports XMLDB query language XPath XQuery AMGA defines own query language Makes for reusable queries / abstractions May ypossibly limit query ypower Add. Functions need source change EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 16

Results Bulk Actions AMGA additionally supports socket connection instead of document based (SOAP) Low latency Multiple queries without delay High transfer rates possible OGSA-DAI workflows Pipeline, Parallel grouping of activities Powerful but complicated EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 17

Results What we would like to have Integration of external data sources like OGSA-DAI For custom data sources like swiss-prot etc. Integration to glite Integration with file catalogue o Browsable in both directions Support for aliases and replicas o Assess best replica for current location VOMS-based Authorization & Authentication Extendible for GIS-features and the like APIs for Java, C++, Python & Perl EGEE-II INFSO-RI-031688 Evaluating Metadata access strategies with the GOME test suite 18