Secure Data Transfer and Replication Mechanisms in Grid Environments p. 1



Similar documents
Data Management in an International Data Grid Project. Timur Chabuk 04/09/2007

Hadoop and Map-Reduce. Swati Gore

In Memory Accelerator for MongoDB

SQL Server for Database Administrators Course Syllabus

Advanced Peer to Peer Discovery and Interaction Framework

Designing Database Solutions for Microsoft SQL Server 2012 MOC 20465

B.Com(Computers) II Year DATABASE MANAGEMENT SYSTEM UNIT- V

Hadoop Architecture. Part 1

Deploying a distributed data storage system on the UK National Grid Service using federated SRB

Implementing a Microsoft SQL Server 2005 Database

MS 20465: Designing Database Solutions for Microsoft SQL Server 2012

SiteCelerate white paper

Learning Management Redefined. Acadox Infrastructure & Architecture

Global Server Load Balancing

Online Transaction Processing in SQL Server 2008

HadoopRDF : A Scalable RDF Data Analysis System

6231A - Maintaining a Microsoft SQL Server 2008 Database

Security Design.

CitusDB Architecture for Real-Time Big Data

DFSgc. Distributed File System for Multipurpose Grid Applications and Cloud Computing

Diagram 1: Islands of storage across a digital broadcast workflow

CS2510 Computer Operating Systems

CS2510 Computer Operating Systems

SCALABILITY AND AVAILABILITY

Chapter 3. Database Environment - Objectives. Multi-user DBMS Architectures. Teleprocessing. File-Server

Introduction to Hadoop. New York Oracle User Group Vikas Sawhney

The Data Grid: Towards an Architecture for Distributed Management and Analysis of Large Scientific Datasets

An approach to grid scheduling by using Condor-G Matchmaking mechanism

Denodo Data Virtualization Security Architecture & Protocols

Designing Database Solutions for Microsoft SQL Server 2012

<Insert Picture Here> Operational Reporting for Oracle Applications with Oracle GoldenGate

Data Refinery with Big Data Aspects

Elastic Application Platform for Market Data Real-Time Analytics. for E-Commerce

SQL Server Administrator Introduction - 3 Days Objectives

Generic Log Analyzer Using Hadoop Mapreduce Framework

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

VMware vsphere Data Protection 6.1

Storage Technologies for Video Surveillance

Sector vs. Hadoop. A Brief Comparison Between the Two Systems

SQL Server Training Course Content

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

ETERNUS CS High End Unified Data Protection

Distribution transparency. Degree of transparency. Openness of distributed systems

White Paper. Optimizing the Performance Of MySQL Cluster

A STUDY ON HADOOP ARCHITECTURE FOR BIG DATA ANALYTICS

On- Prem MongoDB- as- a- Service Powered by the CumuLogic DBaaS Platform

MOC 20467B: Designing Business Intelligence Solutions with Microsoft SQL Server 2012

THE CCLRC DATA PORTAL

On the features and challenges of security and privacy in distributed internet of things. C. Anurag Varma CpE /24/2016

Data Grids. Lidan Wang April 5, 2007

Using Multipathing Technology to Achieve a High Availability Solution

New Constructions and Practical Applications for Private Stream Searching (Extended Abstract)

Request Routing, Load-Balancing and Fault- Tolerance Solution - MediaDNS

Overview of Luna High Availability and Load Balancing

Appendix A Core Concepts in SQL Server High Availability and Replication

Classic Grid Architecture

PROTOTYPE IMPLEMENTATION OF A DEMAND DRIVEN NETWORK MONITORING ARCHITECTURE

Cloud Computing Architecture

Deployment Topologies

Big data management with IBM General Parallel File System

Web Service Based Data Management for Grid Applications

Designing, Optimizing and Maintaining a Database Administrative Solution for Microsoft SQL Server 2008

Hadoop: A Framework for Data- Intensive Distributed Computing. CS561-Spring 2012 WPI, Mohamed Y. Eltabakh

Building a protocol validator for Business to Business Communications. Abstract

MICROSOFT SOFTWARE LICENSE TERMS MICROSOFT SQL SERVER 2005 (CAL VERSIONS)

BIG DATA TECHNOLOGY. Hadoop Ecosystem

Fig. 3. PostgreSQL subsystems

IMPLEMENTATION OF SOURCE DEDUPLICATION FOR CLOUD BACKUP SERVICES BY EXPLOITING APPLICATION AWARENESS

Microsoft SQL Database Administrator Certification

SQL Server 2008 Designing, Optimizing, and Maintaining a Database Session 1

Contents. Overview 1 SENTINET

IFS-8000 V2.0 INFORMATION FUSION SYSTEM

SavvyDox Publishing Augmenting SharePoint and Office 365 Document Content Management Systems

Oracle Database 11g for Data Warehousing and Business Intelligence. An Oracle White Paper July 2007

Concepts and Architecture of the Grid. Summary of Grid 2, Chapter 4

A distributed system is defined as

EMC Data Domain Boost for Oracle Recovery Manager (RMAN)

MS Design, Optimize and Maintain Database for Microsoft SQL Server 2008

Cloud Based Distributed Databases: The Future Ahead

Associate Professor, Department of CSE, Shri Vishnu Engineering College for Women, Andhra Pradesh, India 2

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

Chapter 7. Using Hadoop Cluster and MapReduce

Cyber Security. BDS PhantomWorks. Boeing Energy. Copyright 2011 Boeing. All rights reserved.

GIS Databases With focused on ArcSDE

Energy Efficient Storage - Multi- Tier Strategies For Retaining Data

Lehrstuhl für Informatik 4 Kommunikation und verteilte Systeme. Middleware. Chapter 8: Middleware

Oracle Identity Management Concepts and Architecture. An Oracle White Paper December 2003

CONDIS. IT Service Management and CMDB

COMMVAULT SIMPANA 10 SOFTWARE MULTI-TENANCY FEATURES FOR SERVICE PROVIDERS

Oracle Architecture, Concepts & Facilities

XtreemFS Extreme cloud file system?! Udo Seidel

Clustering and Queue Replication:

Hadoop s Entry into the Traditional Analytical DBMS Market. Daniel Abadi Yale University August 3 rd, 2010

CHAPTER 2 MODELLING FOR DISTRIBUTED NETWORK SYSTEMS: THE CLIENT- SERVER MODEL

Transcription:

Secure Data Transfer and Replication Mechanisms in Grid Environments Konrad Karczewski, Lukasz Kuczynski and Roman Wyrzykowski Institute of Computer and Information Sciences, Czestochowa University of Technology, Poland email: [xeno,neron,roman]@icis.pcz.pl www: http://icis.pcz.pl Secure Data Transfer and Replication Mechanisms in Grid Environments p. 1

Overview In this presentation we would like to: present tasks of the modern Data Management System propose new solutions for data safety and security show a concept of modular system architecture designed with security in mind Secure Data Transfer and Replication Mechanisms in Grid Environments p. 2

Data Management Data Management service: maintains discovers stores validates transfers instantiates the data for the users applications transparently Secure Data Transfer and Replication Mechanisms in Grid Environments p. 3

Features of Data Management System transparent access reliability security and safety of transferred and stored data access control possibility of transparent data compression access optimization Secure Data Transfer and Replication Mechanisms in Grid Environments p. 4

Transparent Data Access Data Transfer System should be implemented as a layer between client application and Data Management System. Such an approach allows to: hide real data access mechanisms add new solutions without need for rewriting end user application implementation of intelligent data access functionality Secure Data Transfer and Replication Mechanisms in Grid Environments p. 5

Data access optimization (1) Modern Data Management System should provide data access optimization mechanisms. The main task of this subsystem is choosing the most suitable data location to use taking into account multiple factors, such as: available resources in the storage elements network properties: bandwidth topology current throughput user s access permissions Secure Data Transfer and Replication Mechanisms in Grid Environments p. 6

Data access optimization (2) Additionally to improve efficiency and minimize data access time, data partitioning mechanism (splitting into smaller parts) could be used. In such case Data Broker would: decide on partitioning of the file search optimal locations for the parts every part would be replicated in several locations to minimize chance of losing the data in case of a failure Such a solution, besides improving the system fault tolerance, would as well improve data access time by copying several parts in parallel from different locations. Secure Data Transfer and Replication Mechanisms in Grid Environments p. 7

Even the best Data Management System won t be able to function properly when the data itself are unavailable Secure Data Transfer and Replication Mechanisms in Grid Environments p. 8

Storage Element or network failure could result in such an effect Secure Data Transfer and Replication Mechanisms in Grid Environments p. 9

Reliability (1) Reliability is one of the most important aspects of the Data Management System. The following basic functionalities are required: improving fault tolerance of the system by providing Data Replication mechanism automation and control by the Data Broker assures maximum transparency Secure Data Transfer and Replication Mechanisms in Grid Environments p. 10

Reliability (2) Increase of reliability level could be accomplished by elimination of the single point of failure. The basic way of achieving this is implementation of a distributed Data Broker Service. This implies: automated control takeover in case of failure metadata replication and synchronization distributed metadata server heartbeat mechanism Secure Data Transfer and Replication Mechanisms in Grid Environments p. 11

Security and Safety of Data (1) To provide required level of security, the Data Management System must include: user authentication and authorization (e.g. GSI based) data encryption possibility permissions delegation (single sign on) Secure Data Transfer and Replication Mechanisms in Grid Environments p. 12

Security and Safety of Data(2) To improve data safety the following mechanisms could be implemented: Access Control Lists embedded in the metadata Data partitioning (only a part of data is available on each storage element) dataset name transformation (e.g. md5) Secure Data Transfer and Replication Mechanisms in Grid Environments p. 13

Access Control Lists Access Control Lists allow to: manage access to resources constrain users rights access rights to data access rights to metadata manage visibility of data Secure Data Transfer and Replication Mechanisms in Grid Environments p. 14

System Architecture Secure Data Transfer and Replication Mechanisms in Grid Environments p. 15

Data Partitioning Data Partitioning enables: increased data security: no Storage Element holds complete information repartitioning information available only to Data Broker increased data safety parts stored on different Storage Elements decreased chance to lose entire data increased performance equal load distribution amongst Storage Elements Secure Data Transfer and Replication Mechanisms in Grid Environments p. 16

Data Partitioning Mechanism (1) Storing the data client authenticates and authorizes in the GSI subsystem new dataset is registered in the Data Broker Data Broker: decides on partitioning of the dataset searches for optimal locations encrypts and splits data digests dataset name stores parts on the Storage Elements updates Metadata Server information Secure Data Transfer and Replication Mechanisms in Grid Environments p. 17

Data Partitioning Mechanism (2) Data retrieval client authenticates and authorizes in the GSI subsystem requests data retrieval using dataset name Data Broker: digests dataset name and checks users rights in the Metadata Server retrieves repartitioning information chooses optimal Storage Elements for data retrieval retrieves parts of the dataset and reunites the data decrypts dataset and tranfers it to the requested location Secure Data Transfer and Replication Mechanisms in Grid Environments p. 18

Distributed Broker Architecture Secure Data Transfer and Replication Mechanisms in Grid Environments p. 19

Broker Operation Schema Normal operation: client queries MDS for the Data Broker MDS returns current Primary Data Broker address client communicates with the Data Broker Broker / network failure: Synchronization Subsystem detects failure Secondary Data Broker takes over Primary Broker functions the new Primary Data Broker updates MDS information Secure Data Transfer and Replication Mechanisms in Grid Environments p. 20

Summary The proposed solution will allow to: use common storage elements for sensitive data hide presence of a dataset on the storage element to unauthorised users eliminate single point of failure It will help to: secure data from unauthorized access balance the load of storage elements optimize access time Secure Data Transfer and Replication Mechanisms in Grid Environments p. 21