Samba's Cloudy Future. Jeremy Allison Samba Team. jra@samba.org



Similar documents
Network File System (NFS) Pradipta De

Why is it a better NFS server for Enterprise NAS?

SAMBA AND SMB3: ARE WE THERE YET? Ira Cooper Principal Software Engineer Red Hat Samba Team

Integration with Active Directory. Jeremy Allison Samba Team

National Data Storage data replication in the network

SerNet. Clustered Samba. Nürnberg April 29, Volker Lendecke SerNet Samba Team. Network Service in a Service Network

Sep 23, OSBCONF 2014 Cloud backup with Bareos

We mean.network File System

Open Source, Scale-out clustered NAS using nfs-ganesha and GlusterFS

Samba in the Enterprise : Samba 3.0 and beyond

Bacula Success Stories: Einsatz in vielfältigen Umgebungen

Business Continuity with the. Concerto 7000 All Flash Array. Layers of Protection for Here, Near and Anywhere Data Availability

Course 10978A Introduction to Azure for Developers

day 1 2 Windows Azure Platform Overview... 2 Windows Azure Compute... 3 Windows Azure Storage... 3 day 2 5

Distributed File System Choices: Red Hat Storage, GFS2 & pnfs

Egnyte Local Cloud Architecture. White Paper

RackWare Solutions Disaster Recovery

Storage Made Easy Enterprise File Share and Sync (EFSS) Cloud Control Gateway Architecture

Running SMB3.x over RDMA on Linux platforms

SMB in the Cloud David Disseldorp

MS 10978A Introduction to Azure for Developers

10978A: Introduction to Azure for Developers

A COMPARISON BETWEEN THE SAMBA3 AND LIKEWISE LWIOD FILE SERVERS

Lecture 11. RFS A Network File System for Mobile Devices and the Cloud

Questions to ask about a cloud service. enter

Developing Microsoft Azure Solutions 20532B; 5 Days, Instructor-led

Gladinet Cloud Access Solution Simple, Secure Access to Online Storage

Implementing Microsoft Azure Infrastructure Solutions

XtreemFS a Distributed File System for Grids and Clouds Mikael Högqvist, Björn Kolbeck Zuse Institute Berlin XtreemFS Mikael Högqvist/Björn Kolbeck 1

New Storage System Solutions

09'Linux Plumbers Conference

Cloud Sync White Paper. Based on DSM 6.0

Red Hat Storage Server Administration Deep Dive

Optimizing Ext4 for Low Memory Environments

Cloud Gateway. Agenda. Cloud concepts Gateway concepts My work. Monica Stebbins

Samba 4.2. Cebit 2015 Hannover

Distributed File Systems

Using Samba to play nice with Windows. Bill Moran Potential Technologies

Transporter from Connected Data Date: February 2015 Author: Kerry Dolan, Lab Analyst and Vinny Choinski, Sr. Lab Analyst

Course 20533: Implementing Microsoft Azure Infrastructure Solutions

Kopano product strategy & roadmap

Implementing the Hadoop Distributed File System Protocol on OneFS Jeff Hughes EMC Isilon

ViewBox: Integrating Local File System with Cloud Storage Service

ovirt and Gluster Hyperconvergence

RED HAT STORAGE SERVER TECHNICAL OVERVIEW

Creating Synergies on the Cloudbox. Mohan Sundar Samsung Electronics

Linux Powered Storage:

Diagram 1: Islands of storage across a digital broadcast workflow

Introduction to Azure for Developers

Implementing Microsoft Azure Infrastructure Solutions 20533B; 5 Days, Instructor-led

Lecture 18: Reliable Storage

XtreemFS Extreme cloud file system?! Udo Seidel

Gladinet Cloud Backup V3.0 User Guide

Course 20533B: Implementing Microsoft Azure Infrastructure Solutions

StorReduce Technical White Paper Cloud-based Data Deduplication

CS3210: Crash consistency. Taesoo Kim

Course 20532B: Developing Microsoft Azure Solutions

A block based storage model for remote online backups in a trust no one environment

Bootstrap guide for the File Station

Solaris For The Modern Data Center. Taking Advantage of Solaris 11 Features

Microsoft Introduction to Azure for Developers

API Architecture. for the Data Interoperability at OSU initiative

This talk is mostly about Data Center Replication, but along the way we'll have to talk about why you'd want transactionality arnd the Low-Level API.

SOP Common service PC File Server

Cloud Services for Backup Exec. Planning and Deployment Guide

THE LINK OFFLINE DATA ARCHITECTURE

Hitachi Content Platform (HCP)

The Google File System

HiDrive Intelligent online storage for private and business users.

the client omits the BranchCache identifier from the request message.

Cloud File System. Cloud computing advantages:

Lecture 5: GFS & HDFS! Claudia Hauff (Web Information Systems)! ti2736b-ewi@tudelft.nl

ArcGIS 10.3 Server on Amazon Web Services

The Hybrid Cloud Advantage White Paper

Data Replication in Privileged Credential Vaults

Briefcase for Android. Release Notes

Zarafa S/MIME Webaccess Plugin User Manual. Client side configuration and usage.

DISTRIBUTED SYSTEMS [COMP9243] Lecture 9a: Cloud Computing WHAT IS CLOUD COMPUTING? 2

POWER ALL GLOBAL FILE SYSTEM (PGFS)

RADOS: A Scalable, Reliable Storage Service for Petabyte- scale Storage Clusters

Implementing Microsoft Azure Infrastructure Solutions

Vodafone Total Managed Mobility

Nasuni Filer Administration Guide

File Systems for Clusters from a Protocol Perspective

This module provides an overview of service and cloud technologies using the Microsoft.NET Framework and the Windows Azure cloud.

Scaling Web Applications in a Cloud Environment. Emil Ong Caucho Technology 8621

Implementing Microsoft Azure Infrastructure Solutions

Whitepaper : Cloud Based Backup for Mobile Users and Remote Sites

Transcription:

Samba's Cloudy Future Jeremy Allison Samba Team jra@samba.org

Isn't cloud storage the future? Yes, but not usable for many existing apps.

Cloud Storage is a blob store Blob stores don't map very well onto the open/read/write/close random access semantics of most applications. Apps are changing to cope with no random access semantics of cloud stores, but this will take time.

Doesn't FUSE solve this? Mostly, for Linux and MacOS X applications. No FUSE on Windows clients. So can't Windows clients access via Samba running on top of a FUSE filesystem? No, because FUSE system calls can block for an awfully long time.. In theory Samba can cope with this by making all syscalls async. Might we become a FUSE host?

Enter flexible Samba The VFS can save us! Stackable, modular, easily modifiable user-space code. Already contains code to cache all reads/writes onto local files. We just :-) need a way to ensure we sync to the remote cloud on metadata update File/directory creation/deletion,. What should we do with the extra metadata (ACLs etc.)?

Possible Architecture To client (SMB2) smbd Cloud VFS Regular VFS Metadata sync protocol Local disk cache (shared amongst smbds) Cloud gateway daemon Local file access Into Disk Cache Pluggable cloud provider backend To cloud (https)

What level of coherence can we provide? NOT full Windows semantics. Latency going to the the cloud would make this unusable. Only clients going to the same on-premises Samba- Cloud gateway provide Windows semantics. Coherence on close/unlink/mkdir/create only. If a client writes then closes, they are guaranteed to have that version uploaded to cloud storage. Open files not synced. Good-enough semantics for many apps.

Under the covers Cloud services daemon reads/writes into private namespace hidden from smbd, but on the same physical device. On sync use atomic file operations (rename, unlink) to move to/from private namespace into smbd-exported namespace. If we have a underlying btrfs copy-on-write filesystem, use file cloning on close to create instant copies that can be synchronized out to cloud storage. On non COW filesystems, physical copies of files needed.

Operations Open/Create/Mkdir Create operations must sync with cloud daemon. Requires synchronous call to cloud backend. On create success local file/directory created. For files marked TMP, ignore cloud operations? For open existing, open operation must trigger read from cloud. At least fetching file metadata (size etc.) must be synchronous. File data access operations (read/write) can then be forced to go async until the data is synchronized from the cloud. FILE_ATTRIBUTE_OFFLINE can help here.

Operations Read/Write/Truncate/Unlink/Rename Once the on-disk file is known to be in a valid state, reads, writes and truncates can proceed as normal. No cloud communication needed. Unlink and rename need synchronous operation to the cloud, but should be a reasonably quick operation (just round-trip latency).

Operations - Close Reference count open files on last close then expensive synchronous operations need to happen. If file modified then make a sync call to the cloud sync daemon to make a copy of this file. Use COW if available. Once copy is complete, smbd can reply to the close request. Cloud sync daemon then pushes changed file up to the cloud. Should we add a delay time here? Only guarantee coherence after 5 minutes?

The cloud sync daemon Using existing Samba technologies talloc, asynchronous tevent, with backing threadpool. All smbd's will need to talk to it. Use tdb transactions for crash recovery. Create plugable back-ends to allow choice of cloud storage providers. Ignore key management/credentials issues. Just expect credentials file to have been magically placed locally by an administrator. Many possible tunables number of simultaneous connections, bandwidth limits etc.

Error recovery Shock, horror. Cloud file operations can fail, right in the middle of uploading via https. Do we allow client access to continue whilst there is a cloud outage? If so, how do we queue operations that occur during cloud outage? How much queuing do we allow in order to replay operations later before stopping client access? I don't currently have good answers to these problems.

Privacy: "We have secured ourselves from the NSA, except for the parts that we either don't know about or can't talk about. - Bruce Schneier Data stored in the cloud unencrypted, or remotely encrypted with keys held by the cloud storage provider can be compromised. Local encryption before uploading to the cloud is the only way to reduce this risk. Getting this right is hard. Two options I can see: vfs_encfs Samba VFS module that emulates encfs fuse module. Copy to encfs filesystem before uploading to cloud.

Create backends that target the Big Three Cloud Store Vendors

Existing vendors There are many existing cloud-gateway companies, some of which are already using Samba for this purpose. Are we trying to compete with them? No In doing this I'm trying to raise the bar on what are the basic services provided with Samba. Proprietary cloud-gateways provide more complete service guarantees. Are you reinventing the wheel? Yes. Loughborough University in the UK has this code already. Unfortunately they're not releasing it back to the community (so far, I've asked).

Questions and Comments? Email: jra@samba.org Slides available at: ftp://samba.org/pub/samba/slides/sambaxp-2015-cloudyfuture.odp