National Data Storage data replication in the network



National Data Storage - data replication in the network
Maciej Brzeźniak, Michał Jankowski, Norbert Meyer, PSNC, Supercomputing Dept.
1st Technical Meeting in Munich, December 5-6th, 2011
Project funded by NCBiR for 2011-2013 under the KMD2 project (no. NR02-0025-10/2011)
Full Polish name of the project: System bezpiecznego przechowywania i współdzielenia danych oraz składowania kopii zapasowych i archiwalnych w Krajowym Magazynie Danych (a system for secure storage and sharing of data and for keeping backup and archival copies in the National Data Storage)
Project partners: 10 Polish universities and supercomputing centres

National Data Storage - agenda
- NDS overview
- NDS architecture: design assumptions, overall architecture
- Data replication in NDS: replication modes, replication protocol usage, user profiles vs. data replication settings, rule-based replication?
- NDS vs. the external world vs. EUDAT
- NDS future: NDS2 - secure data storage and exchange

NDS - design assumptions
Overall assumptions:
- Avoid single points of failure - distributed data & metadata replication
- Standard access protocols (and tools) must remain usable
- Abstraction of system internals: logical namespace visible to the user; separate namespaces for different user groups
- Robust implementation (C/C++) within 2 years
- Tape systems (HSMs) on the back-end (for cost-efficiency)
Main applications:
- Archival and backup data storage
- Effective storage and access of large files; multiple small files not welcome

NDS project status
- National Data Storage (R&D project, 2007-2009): system architecture & concept, software stack (RPMs for CentOS/RHEL)
- Current NDS deployment: Backup and Archive Services for Science - BADSS (Service Platform for e-Science); capacity: 12.5 PB of tapes in 5 sites; performance: 2 PB of disks in 5 sites
- National Data Storage 2 (R&D project, 2011-2013): secure storage and data sharing (user-side encryption + integrity control)

NDS highlights
- Automated, TRANSPARENT data replication: users do not see the details (unless they want to - then they can); they talk to a remote virtual filesystem
- Abstract data access interfaces: filesystem view of the data (remote virtual filesystem); NDS is implemented as user-level code (FUSE library)
- User access via standard methods: SFTP, WebDAV, GridFTP
- Storage access: NFS / GridFTP-NFS (each SN exposes at least NFS)
- Metadata replication: automated, transparent; PostgreSQL Slony-I + semi-synchronous replication; DR, not full HA (no recovery automation)

NDS architecture (1): overall picture
[Diagram: user-facing access methods servers (SSH, HTTPS, WebDAV...) sit on a VFS for data and metadata, backed by the NDS system logic, the metadata DB, the users DB and the accounting & limits DB; replica access methods servers (NFS, GridFTP) connect the logic to the storage back-ends - an HSM system with data migration (exposed over NFS) and a NAS appliance - with replication between the storage sites.]

NDS architecture (2): data replication & presentation
[Diagram: the same components as in (1), highlighting the replication path between the storage sites.]

NDS architecture (3): data replication & presentation - Data Daemon
- Implements the core NDS system logic (together with the MC): I/O serving, filesystem presentation, data operations with replication, metadata-related operations
- Emulates a virtual file system; supports most POSIX functions: open, close, read, write, opendir, readdir, getattr, setattr, rename, link, unlink...
- Based on FUSE (Filesystem in USErspace)
- Additionally: enforces security policies (access control), optimizes replica access and creation, implements limits and accounting

NDS architecture (4): async vs. sync replication from the VFS perspective

Writing to the system (async mode):
- VFS: OPEN (new file, O_RDWR|O_CREAT) - register a new logical file in the MC (lock for writing), create one physical replica, register the replica in the MC
- VFS: WRITE... - write to the local replica, asynchronously (QUICK - local, single-replica write); update metadata (size, last access, etc.)
- VFS: CLOSE (on an opened file) - flush buffers and close the replica, asynchronously (QUICK); update metadata, incl. releasing write locks; return to the user
- Asynchronous action - make replicas: enqueue replication tasks to the replication daemon and update metadata; the replication daemon creates the replicas in the background (typically via 3rd-party transfer: SN1 -> SN2)

Writing to the system (sync mode):
- VFS: OPEN (new file, O_RDWR|O_CREAT) - register a new logical file in the MC (lock for writing), create multiple physical replicas, register the replicas in the MC
- VFS: WRITE... - write to all replicas, synchronously (TAKES TIME - remote, multiple sites to write); update metadata (size, last access, etc.)
- VFS: CLOSE (on an opened file) - flush buffers, also to remote replicas (TAKES TIME); update metadata, incl. releasing write locks; return to the user - all replicas are already done
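The two write paths above can be sketched as a toy model. This is pure Python for illustration only, not the actual NDS C/C++ code; all class and node names (DataDaemon, SN1, ...) are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    node: str
    data: bytes = b""

@dataclass
class LogicalFile:
    name: str
    replicas: list = field(default_factory=list)
    size: int = 0
    write_locked: bool = False

class ReplicationQueue:
    """Stands in for the background replication daemon's task queue."""
    def __init__(self):
        self.tasks = []
    def enqueue(self, f, target_nodes):
        self.tasks.append((f, target_nodes))
    def run(self):
        # 3rd-party copy SN1 -> SN2..SNn; in real NDS this runs in the background
        for f, targets in self.tasks:
            src = f.replicas[0]
            for node in targets:
                f.replicas.append(Replica(node, src.data))
        self.tasks.clear()

class DataDaemon:
    def __init__(self, local_node, remote_nodes, mode="async"):
        self.local, self.remotes, self.mode = local_node, remote_nodes, mode
        self.queue = ReplicationQueue()
        self.catalog = {}                       # meta-catalog (MC) stand-in

    def open(self, name):                       # VFS: OPEN (O_RDWR|O_CREAT)
        f = LogicalFile(name, write_locked=True)
        f.replicas.append(Replica(self.local))  # async: one local replica
        if self.mode == "sync":                 # sync: all replicas up front
            f.replicas += [Replica(n) for n in self.remotes]
        self.catalog[name] = f
        return f

    def write(self, f, buf):                    # VFS: WRITE
        targets = f.replicas[:1] if self.mode == "async" else f.replicas
        for r in targets:                       # sync mode pays remote latency here
            r.data += buf
        f.size += len(buf)                      # update metadata

    def close(self, f):                         # VFS: CLOSE
        f.write_locked = False                  # release write lock in the MC
        if self.mode == "async":                # replicas made after close returns
            self.queue.enqueue(f, self.remotes)
```

The key difference is visible in the model: in async mode `close()` returns with a single local replica and the queue does the rest later; in sync mode every `write()` touches all replicas before returning.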

NDS architecture (5): replica access methods (AN-SN) (1)
[Diagram: the access node's replica access clients (NFS, GridFTP) reach LOCAL storage over the LAN (low latency, high bandwidth, e.g. 10 GbE) and REMOTE storage over the WAN (high latency, high bandwidth, e.g. 1 GbE); each storage node exposes replica access methods over both NFS and GridFTP, and GridFTP or NFS is used as needed.]

NDS architecture (6): replica access methods (AN-SN) (2)
NFS and GridFTP are used where they fit best: static protocol selection (currently), dynamic protocol selection, e.g. based on file size (planned).

NFS:
- Stateless, IOPS-friendly; low overhead on IOPS operations: small-file access, metadata-related operations
- Low performance (MB/s) over long distances; no parallelism (to/from a single file); NFS 4.1 (pNFS) is on the horizon but still not there
- Usage in NDS: metadata-related operations, accessing replicas on local SNs, access to small files on remote SNs (future)
- Stable, standardised

GridFTP:
- Stateful, can exploit the available bandwidth; high overhead on IOPS operations: even small-file access and metadata ops require a session
- High performance (MB/s) despite distance; parallelism (up to 256 streams); 64+ streams can sustain a 1 GbE link (1000 km long)
- Usage in NDS: 3rd-party replication (async mode), transferring replicas to/from remote SNs
- Stability issues, even if standard in Grids
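The planned dynamic protocol selection could look like the following sketch. The function name and the size threshold are assumptions for illustration; the slide only states that selection based on file size is planned:

```python
# Hypothetical policy for the planned dynamic protocol selection.
# The 64 MiB cut-off is an assumed value, not taken from the NDS design.
SMALL_FILE_LIMIT = 64 * 1024 * 1024

def pick_protocol(file_size: int, replica_is_local: bool) -> str:
    """NFS for local or metadata-heavy/small access, GridFTP for bulk WAN."""
    if replica_is_local:
        return "NFS"        # low per-operation overhead on the LAN
    if file_size < SMALL_FILE_LIMIT:
        return "NFS"        # GridFTP session setup would dominate small files
    return "GridFTP"        # parallel streams exploit the WAN bandwidth
```

This mirrors the table above: NFS wherever per-operation overhead matters, GridFTP once long-distance throughput dominates.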

NDS architecture (7): GridFTP 3rd-party replication (SN-SN) (3)
3rd-party transmission (SN -> SN) is used in the async replication mode.
[Diagram: the replication daemon (a GridFTP client) on the access node opens GridFTP control connections over the WAN (high latency, high bandwidth, e.g. 1 GbE) to the GridFTP replica access methods on the LOCAL and REMOTE storage nodes; the data then flows directly between the storage nodes.]

NDS architecture in PLATON: replica access - GridFTP-NFS access to SNs
[Diagram: the access node's replica access clients (GridFTP, NFS) talk GridFTP to virtual storage nodes; each virtual storage node runs a GridFTP replica access method on top of an NFS client, which mounts the actual back-end over NFS: an HSM system (a filesystem with data migration) or a NAS appliance.]

NDS architecture (8): replication-related settings (1)
Replication-related parameters of a profile:
- Replication mode: asynchronous (default) or synchronous
- Number of replicas: typically 2 (max. 3); can be set to any value
- Allowed storage sites and nodes: default replica locations and additional replica locations; replicas are typically created in the default locations, and the additional locations are used in case of failure of the default ones
- Storage media type: disk vs. tape (HSM); using a combination of the allowed storage sites & nodes plus knowledge of the deployment infrastructure, we can determine the media type
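The parameters above can be modelled as a small profile record with a placement rule. This is an illustrative sketch (class, field and site names are hypothetical), not the actual NDS configuration schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ReplicationProfile:
    """Replication-related part of an NDS user profile (toy model)."""
    mode: str = "async"                    # "async" (default) or "sync"
    num_replicas: int = 2                  # typically 2, max. 3 in practice
    default_locations: List[str] = field(default_factory=list)
    additional_locations: List[str] = field(default_factory=list)  # used on failure

    def placement(self, failed: set = frozenset()) -> List[str]:
        """Pick replica locations: defaults first, spares replace failed ones."""
        ok = [n for n in self.default_locations if n not in failed]
        spares = [n for n in self.additional_locations if n not in failed]
        return (ok + spares)[: self.num_replicas]
```

The placement rule encodes the slide's policy: replicas go to the default locations, and an additional location is only pulled in when a default one has failed.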

NDS architecture (9): replication-related settings (2)
- Replication is configured per-profile, NOT per data object (e.g. directory / file); policies are static - they cannot be changed dynamically
- Users can use one or many profiles; users are assigned to profiles using the DNs of their certificates => multiple certificates have to be used in order to access different profiles
Example profiles:
- Fast, HA & FT space for backups: SYNC replication, 3 replicas (1 local + 2 distant), on disks only
- FT space for archives: ASYNC replication, 2 replicas on distant nodes, both copies on tapes
- Safe storage space for collaboration: ASYNC replication, 2 replicas: local on disk + 1 remote in an HSM

NDS features vs. EUDAT (1)
- Automated, TRANSPARENT data replication: a safe, transparent replication service
- Abstract data interfaces above and below NDS: SFTP, WebDAV, GridFTP for users; possible to interface with NDS from other systems, e.g. 3rd-party transfer to/from NDS
- Data available through the VFS layer on ANs: possible to add new access methods; some work needed to extend the authentication mechanisms
- Storage access: NFS / GridFTP; any kind of storage can be used as the back-end... as long as it provides an NFS service
[Diagram: NDS access methods servers (SSH, HTTPS, WebDAV...), the NDS VFS and the NDS logic on top; GridFTP front-ends to storage below, backed by an HSM system (NFS), a NAS appliance (NFS), or any other storage exposing NFS - possibly in other data centres.]

NDS features vs. EUDAT (2) - persistent IDs?
The user always sees the same logical structure, regardless of:
- the replication process: the physical location is transparent, and replication does not affect the logical namespace
- which access node the user uses: the logical structure of the VFS is the same everywhere
- which access method the user uses: the logical structure of the VFS is presented similarly through different access methods
- failures: as long as at least one replica is OK
The path to a file or directory is constant => is this a PID-like feature?

NDS features vs. EUDAT (3)
User-level metadata:
- The user can assign free-form text files to data objects; these can include metadata
- This is done through the Web GUI or a procfs-like mechanism
- Metadata search is possible but not yet implemented (on the roadmap)
- Can the above somehow be re-used in EUDAT?
Extendability:
- Functionality can be easily extended, as the architecture & interfaces are open (PostgreSQL, NFS/GridFTP...)
- A micro-services-like approach is possible, but requires effort on the NDS consortium side; for instance, some basic interfaces to metadata could be defined (e.g. for searching data meeting some criteria)
- Example: we are currently designing and developing a mechanism for periodic data integrity checking (data scrubbing)
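The data-scrubbing mechanism mentioned above is still under design, but its core idea can be sketched as follows: periodically re-read each replica and compare its digest against the one recorded in the meta-catalog. Function and node names are illustrative:

```python
import hashlib

def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def scrub(replicas: dict, expected_digest: str) -> list:
    """Return the storage nodes whose replica fails the digest check.

    `replicas` maps node name -> replica contents; in the real system a
    scrubbing daemon would stream each replica via NFS/GridFTP and compare
    the digest against the value stored in the meta-catalog.
    """
    return [node for node, data in replicas.items()
            if sha1(data) != expected_digest]
```

A failed node reported by `scrub()` would then trigger re-replication from a healthy copy, following the profile's placement policy.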

NDS2 - features
Secure data storage:
- Data encrypted on the user side; the symmetric per-file keys are stored in the system, protected by the user's asymmetric key
- Integrity control on the user side: MD5/SHA-1 digests stored in the system, encrypted
Secure data exchange:
- 2-level access control: ACLs on the virtual filesystem level
- User-side encryption and key exchange make the sharing safe (e.g. if we don't trust the provider)
Secure data publication:
- 2 kinds of storage space: private (for internal users) and public (sandboxed)
- Multiple web servers (load balancing, HA, data synchronisation) to serve data effectively
Specialised user-side tools needed:
- Java GUI for managing file sharing, ACLs, publication and versioning
- Virtual encrypted (!) filesystem for end users, both for Linux and Windows
Status: R&D project (2011-2013); prototype expected in Q2 2013
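The NDS2 key and digest handling described above follows a standard hybrid-encryption shape, which the sketch below models. Important caveat: `toy_cipher` is an XOR stand-in used only to show the data flow; a real client would use a proper symmetric cipher (e.g. AES) for the file data and the user's asymmetric (e.g. RSA) key for wrapping. All function names are hypothetical:

```python
import hashlib, os

def toy_cipher(data: bytes, key: bytes) -> bytes:
    """Symmetric XOR keystream stand-in. NOT cryptographically secure;
    it only illustrates where encryption happens in the NDS2 scheme."""
    stream = (key * (len(data) // len(key) + 1))[: len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def client_upload(plaintext: bytes, user_key: bytes) -> dict:
    """Everything sensitive is transformed on the user side before upload."""
    file_key = os.urandom(32)                  # per-file symmetric key
    return {
        # data encrypted on the user side
        "ciphertext": toy_cipher(plaintext, file_key),
        # symmetric key stored in the system, protected by the user's key
        "wrapped_key": toy_cipher(file_key, user_key),
        # integrity digest stored in the system, encrypted
        "digest": toy_cipher(hashlib.sha1(plaintext).digest(), user_key),
    }

def client_download(stored: dict, user_key: bytes) -> bytes:
    file_key = toy_cipher(stored["wrapped_key"], user_key)
    plaintext = toy_cipher(stored["ciphertext"], file_key)
    digest = toy_cipher(stored["digest"], user_key)
    if hashlib.sha1(plaintext).digest() != digest:   # user-side integrity control
        raise ValueError("integrity check failed")
    return plaintext
```

The point of the layout is that the storage provider only ever sees ciphertext, a wrapped key, and an encrypted digest, which is why sharing remains safe even with an untrusted provider.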

Backup slides

NDS1: summary
Data storage & replication:
- Implemented at the VFS level: portability and security; robust and lightweight
- Data replication: automatic, transparent to users; sync and async modes; NFS, GridFTP or GridFTP 3rd-party transfers used to access/make replicas
Metadata handling & replication:
- Handles both filesystem-level and user-level metadata
- Logically centralized, but DR solutions are in place for quick recovery
Logical filesystem structure persistency: physical-location-agnostic access
Pluggable: open, standard interfaces to the external world (both user- and storage-side); we can provide custom interfaces to metadata if needed

NDS architecture (10): meta-catalog (1)
Functionality:
- System-level metadata storage and handling: filesystem structure, data replicas
- User-level metadata storage
Implementation:
- C++ library used by the Data Daemon
- PostgreSQL database with Slony-I replication at the back-end
Separation of namespaces:
- No sharing among user groups assumed; security by isolation
- Scalability: multiple instances of the MC for multiple user groups / institutions

NDS architecture (11): meta-catalog (2)
Metadata redundancy - the problem: reliability and performance?
(1) PostgreSQL database with Slony-I replication:
- Each meta-catalog is replicated asynchronously in master-slaves mode (Slony-I)
- In case of failure of the master MC, a slave MC is manually selected as master (DR, not full HA; human intervention needed)
(2) Semi-synchronous data replication:
- All operations on metadata are synchronously logged to distributed logs
- In case of failure of the master MC, the logged operations missing on the new master are repeated there (human intervention needed)
Comment: reliability is similar to synchronous DBMS replication, but the mechanism is lighter!
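The combination of lazy Slony-I replication with a synchronous operation log can be illustrated with the toy model below (class names are hypothetical; the real system uses PostgreSQL/Slony-I and distributed logs):

```python
class MetaCatalog:
    """Master or slave meta-catalog instance (toy model)."""
    def __init__(self):
        self.meta = {}
    def apply(self, op):
        key, value = op
        self.meta[key] = value

class SemiSyncReplicator:
    """Slony-I-style async DB replication plus a synchronous operation log.

    Every metadata operation is appended to the log before it is
    acknowledged; on master failure, the ops the promoted slave has not yet
    received via the lazy channel are replayed from the log.
    """
    def __init__(self, master, slave):
        self.master, self.slave = master, slave
        self.log = []            # stands in for the distributed log
        self.shipped = 0         # ops already delivered by async replication

    def execute(self, op):
        self.log.append(op)      # synchronous: logged before the ack
        self.master.apply(op)

    def async_ship(self, n):
        for op in self.log[self.shipped:self.shipped + n]:
            self.slave.apply(op) # lazy Slony-I-style propagation
        self.shipped += n

    def failover(self):
        """Promote the slave; replay the logged ops it is still missing."""
        for op in self.log[self.shipped:]:
            self.slave.apply(op)
        self.shipped = len(self.log)
        self.master, self.slave = self.slave, self.master
        return self.master
```

This shows why the scheme is lighter than fully synchronous DBMS replication: only the compact log write is on the critical path, while the bulky database replication stays asynchronous, yet no acknowledged operation is lost on failover.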

PLATON's B/A service access: SFTP
SFTP: a well-known, secure data upload/download method (WinSCP example shown)

PLATON's B/A service access: WebDAV
Web-browser-based WebDAV access (read-only)

PLATON's B/A service access: WebDAV
The Windows built-in WebDAV (Web Folders) client supports mapping the NDS filesystem as a network drive, and drag & drop

PLATON's B/A service access: NDS web application

PLATON's B/A service access: NDS web application - filesystem navigation

PLATON's B/A service access: NDS web application - metadata view

PLATON's B/A service access: NDS MDFS filesystem for metadata access