Lustre* HSM in the Cloud. Robert Read, Intel HPDD





Overview
- Lustre in the Cloud
- HSM for Cloud
- Importing from Amazon Simple Storage Service* (S3)
- General archive with S3
- Crazy snapshot idea

Lustre in the Cloud
- Lustre is not intended to be used for long-term storage in AWS (though HA support makes this possible)
- S3 is more cost-effective than Lustre (in AWS) for long-term storage
- Data needs to be managed outside the filesystem
- Data outlives filesystems
- Datasets can be shared

HSM for Cloud Data
- Could HSM help with cloud data management?
- Supports the on-demand use case
- Existing tools don't support S3
- Other challenges?
- Much of the rest of this is speculative and what-iffy
- Not a Roadmap

Using FIDs in the Archive - Not a Good Idea
- Works fine for an initial import into an empty filesystem (no FID clashes)
- But importing into a live filesystem requires remapping FIDs
- No rename in S3
- Don't want to force a database
- Don't use FIDs when the archive will outlast the filesystem

How about a GUID?
- GUID can be stored in the file's extended attributes
- Database lookup not required
- Could support multiple archives with different identifiers
- Disadvantage: the mapping is lost when the file is deleted
- Include xattr in changelog entry?
- Retain unlinked, archived inodes?

Extended Attributes can be Anything
- How about a URL?

    hsm-url=s3://my-bucket/my/big/file.log
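As a rough illustration of why a URL is a convenient archive identifier, the sketch below splits the `hsm-url` value from the slide into the bucket and key a copytool would need. The function names and the xattr name `user.hsm_url` are assumptions made for this example; the slide only shows the `hsm-url=...` form, and the reader function is Linux-only.

```python
import os
from urllib.parse import urlparse


def parse_s3_url(url):
    """Split an s3://bucket/key URL into (bucket, key)."""
    parts = urlparse(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parts.netloc, parts.path.lstrip("/")


def read_hsm_url(path, attr="user.hsm_url"):
    """Read the archive URL from a file's xattrs (Linux only).

    The attribute name is hypothetical, chosen for this sketch.
    """
    return os.getxattr(path, attr).decode("utf-8")


# The URL below is the one shown on the slide.
bucket, key = parse_s3_url("s3://my-bucket/my/big/file.log")
print(bucket, key)
```

No database is involved: everything needed to locate the object travels with the file itself.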

Automated Import from S3
- Import data from an S3 bucket on demand
- Provide POSIX access to a large dataset stored on S3
- Allow processing to begin before data has been imported
- Retrieve data from S3 as it is needed
- Data remains in Lustre* until released

New HSM Tools with S3 support
S3 Import Tool
- Scan keys in S3 and create stub files in Lustre
- Save the object URL as an extended attribute
- Create other metadata based on defaults (uid, gid)
S3 Copy Tool
- Copy file data from S3 links
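The key-scanning step of the import tool can be sketched as a pure path-mapping function. This is not the actual tool: `plan_stubs` is a hypothetical helper, the key list is passed in rather than fetched from S3, and creating the released stub files and setting their xattrs is left out. The mapping itself follows the bucket and destination used in the sample import.

```python
import posixpath


def plan_stubs(base_url, dest, keys):
    """Map S3 keys under base_url to (stub_path, object_url) pairs."""
    if not base_url.startswith("s3://"):
        raise ValueError(f"not an S3 URL: {base_url!r}")
    bucket, _, prefix = base_url[len("s3://"):].partition("/")
    prefix = prefix.strip("/")
    plan = []
    for key in keys:
        if prefix and not key.startswith(prefix + "/"):
            continue  # key lies outside the mirrored prefix
        rel = key[len(prefix):].lstrip("/")
        if not rel or rel.endswith("/"):
            continue  # skip directory placeholder keys
        plan.append((posixpath.join(dest, rel), f"s3://{bucket}/{key}"))
    return plan


keys = ["tools/emacs-21.4/vpath.sed", "tools/emacs-21.4/", "docs/readme.txt"]
for stub, url in plan_stubs("s3://my-bucket/tools", "/mnt/lustre/tools", keys):
    print(stub, "<-", url)
```

Each resulting pair is what the import tool would need in order to create one stub file and record its object URL.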

Sample Import to Lustre from S3 Bucket

    [root@test1]# lhsm mirror --url s3://my-bucket/tools --dest /mnt/lustre/tools --uid 500 --gid 500
    [root@test1]# tree /mnt/lustre
    /mnt/lustre
      tools
        emacs-21.4
          aclocal.m4
          AUTHORS
          BUGS
          ChangeLog
          config.bat
          config.guess
          config.sub
          ...
    [root@test1]# lhsm status -r /mnt/lustre/tools
    100 released /mnt/lustre/tools/emacs-21.4/vpath.sed
    100 released /mnt/lustre/tools/emacs-21.4/aclocal.m4
    100 released /mnt/lustre/tools/emacs-21.4/install-sh
    100 released /mnt/lustre/tools/emacs-21.4/configure
    100 released /mnt/lustre/tools/emacs-21.4/config.sub

Sample Import to Lustre (part 2)

    [root@test1]# lhsm status -r -l /mnt/lustre/tools
    100 released (s3://my-bucket/tools/emacs-21.4/vpath.sed) /mnt/lustre/tools/emacs-21.4/vpath.sed
    100 released (s3://my-bucket/tools/emacs-21.4/aclocal.m4) /mnt/lustre/tools/emacs-21.4/aclocal.m4
    100 released (s3://my-bucket/tools/emacs-21.4/install-sh) /mnt/lustre/tools/emacs-21.4/install-sh
    100 released (s3://my-bucket/tools/emacs-21.4/configure) /mnt/lustre/tools/emacs-21.4/configure
    100 released (s3://my-bucket/tools/emacs-21.4/config.sub) /mnt/lustre/tools/emacs-21.4/config.sub
    ...
    [root@test1]# lhsm restore /mnt/lustre/tools/emacs-21.4/vpath.sed
    restore: /mnt/lustre/tools/emacs-21.4/vpath.sed
    [root@test1]# lhsm status /mnt/lustre/tools/emacs-21.4/vpath.sed
    100 archived /mnt/lustre/tools/emacs-21.4/vpath.sed
    [root@test1]# file /mnt/lustre/tools/emacs-21.4/aclocal.m4
    /mnt/lustre/tools/emacs-21.4/aclocal.m4: ASCII English text
    [root@test1]# lhsm status /mnt/lustre/tools/emacs-21.4/vpath.sed
    100 archived /mnt/lustre/tools/emacs-21.4/vpath.sed

General Purpose S3 Archive
Archive
- Generate GUID and save in xattrs
- Store data in s3://archive-bucket/objects/guid
- Supports hardlinks and is transparent to renames
Restore
- Retrieve GUID from xattrs
- Fetch data by GUID
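A minimal sketch of the archive and restore steps above, assuming a plain dict stands in for an inode's extended attributes and that no data is actually transferred. The function names are hypothetical; the attribute name `user.hsm_guid` and the `s3://archive-bucket/objects/` prefix are taken from the slides.

```python
import uuid

ARCHIVE_PREFIX = "s3://archive-bucket/objects"  # prefix shown on the slide


def archive(xattrs, attr="user.hsm_guid"):
    """Assign a fresh GUID to the inode and return its object URL."""
    guid = str(uuid.uuid4())
    xattrs[attr] = guid
    return f"{ARCHIVE_PREFIX}/{guid}"


def restore_url(xattrs, attr="user.hsm_guid"):
    """Rebuild the object URL from the GUID stored on the inode."""
    return f"{ARCHIVE_PREFIX}/{xattrs[attr]}"


# One inode (a dict of xattrs) reachable under one name.
inode = {}
namespace = {"work/important.txt": inode}
url = archive(inode)

# A rename changes only the namespace; the GUID stays on the inode.
namespace["work/renamed.txt"] = namespace.pop("work/important.txt")
# A hardlink is a second name for the same inode.
namespace["work/link.txt"] = inode

print(restore_url(namespace["work/renamed.txt"]) == url)
print(restore_url(namespace["work/link.txt"]) == url)
```

Because the object key depends only on the GUID and not on the path, neither the rename nor the hardlink requires any change in S3, which matters given that S3 has no rename.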

Namespace Archive
- Store metadata in S3
- JSON format
- Include special files
- Store entire directory in one key
  - Single fetch for complete directory
  - Updates are expensive
- One key per directory entry
  - Fetch per key
  - Updates easy

    {
      "name": "work/important.txt",
      "xattr": {
        "user.hsm_guid": "e819d8fe-e969-40a5-af8a-0956da5c2e8c"
      },
      "stat": {
        "ino": 180144002274692600,
        "mode": 33188,
        "gid": 0,
        "uid": 0,
        "nlink": 1,
        "mtime": 1427217574,
        "atime": 1427217756,
        "size": 38,
        "ctime": 1427217756,
        "rdev": 0,
        "dev": 743766374
      },
      "type": "reg"
    }
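A record with the same shape as the JSON on the slide could be produced along the following lines. This is a sketch under assumptions: `entry_record` and `entry_type` are hypothetical names, the xattrs are passed in rather than read from the file, and only the `"reg"` type label comes from the slide (the other labels are invented for illustration). It assumes a Unix-like system where `st_rdev` is available.

```python
import json
import os
import stat
import tempfile

# Field names as they appear in the "stat" object on the slide.
STAT_FIELDS = ("ino", "mode", "gid", "uid", "nlink", "mtime",
               "atime", "size", "ctime", "rdev", "dev")


def entry_type(mode):
    """Return a type label; only "reg" appears in the source."""
    if stat.S_ISREG(mode):
        return "reg"
    if stat.S_ISDIR(mode):
        return "dir"  # hypothetical label
    if stat.S_ISLNK(mode):
        return "lnk"  # hypothetical label
    return "other"    # hypothetical label


def entry_record(path, name, xattr):
    """Build one namespace record for the file at path."""
    st = os.lstat(path)
    fields = {f: int(getattr(st, "st_" + f)) for f in STAT_FIELDS}
    return {"name": name, "xattr": dict(xattr), "stat": fields,
            "type": entry_type(st.st_mode)}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "important.txt")
    with open(path, "wb") as f:
        f.write(b"x" * 38)  # same size as the file on the slide
    record = entry_record(
        path, "work/important.txt",
        {"user.hsm_guid": "e819d8fe-e969-40a5-af8a-0956da5c2e8c"})

print(json.dumps(record, indent=2))
```

For reference, the `mode` value 33188 on the slide is octal 0100644, a regular file with permissions rw-r--r--. With one key per directory entry, each such record would be stored and updated independently; with one key per directory, a list of these records would be serialized together.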

HSM Snapshots
- Create a stub file in .hsmsnap/ referencing the GUID in the archive
- Generate a new GUID every time the file is archived
- Multiple "snapshots" available directly to the user
- Could be created by the copytool or based on policy

HSM Snapshot Demo
Create a snapshot of a file when archived

    [root@test1]# lhsm archive -a 2 important.txt
    [root@test1]# lhsm status -l important.txt
    2 archived (66b2d482-7d1d-462a-b76e-bdd2c50891f8) important.txt
    [root@test1]# tree .hsmsnap/
    .hsmsnap/
      important.txt^2015-03-24T10:19:34-07:00
    [root@test1]# lhsm status -l .hsmsnap/
    2 released (66b2d482-7d1d-462a-b76e-bdd2c50891f8) .hsmsnap/important.txt^2015-03-24T15:12:41-07:00

HSM Snapshot Demo (part 2)
Create a new snapshot when the file is archived again

    [root@test1 work]# lhsm status -l important.txt
    2 dirty (66b2d482-7d1d-462a-b76e-bdd2c50891f8) important.txt
    [root@test1 work]# lhsm archive -a 2 important.txt
    archive: important.txt
    [root@test1 work]# lhsm status -l important.txt
    2 archived (a326deae-2024-4ce1-9999-817a78fd51be) important.txt
    [root@test1 work]# lhsm status -l .hsmsnap/
    2 released (66b2d482-7d1d-462a-b76e-bdd2c50891f8) .hsmsnap/important.txt^2015-03-24T15:12:41-07:00
    2 released (a326deae-2024-4ce1-9999-817a78fd51be) .hsmsnap/important.txt^2015-03-24T15:14:18-07:00
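The stub names in the demo appear to follow a `name^timestamp` pattern inside a `.hsmsnap/` directory next to the file. The sketch below reproduces that naming only; `snapshot_stub_path` is a hypothetical helper, and placing `.hsmsnap/` in the file's own directory is an assumption read off the demo output rather than a documented rule.

```python
import posixpath
from datetime import datetime, timedelta, timezone


def snapshot_stub_path(path, archived_at, snapdir=".hsmsnap"):
    """Return the stub path for a snapshot of path taken at archived_at."""
    directory, name = posixpath.split(path)
    # ISO 8601 timestamp with UTC offset, truncated to whole seconds.
    stamp = archived_at.replace(microsecond=0).isoformat()
    return posixpath.join(directory, snapdir, f"{name}^{stamp}")


# Timestamp taken from the demo output (UTC-07:00).
when = datetime(2015, 3, 24, 15, 12, 41,
                tzinfo=timezone(timedelta(hours=-7)))
print(snapshot_stub_path("important.txt", when))
```

Since each archive operation generates a new GUID, every stub created this way would point at a distinct object, and the timestamp in the name keeps successive snapshots of the same file from colliding.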

Summary
- Lustre is being used in the cloud
- Need to manage datasets that outlive the filesystem
- Using extended attributes adds flexibility without a database
- Importing existing S3 data is straightforward
- General archive to S3 needs a metadata storage decision
- Creating approximate snapshots on HSM (CASH)

Legal Information
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at http://www.intel.com/content/www/us/en/software/intel-solutions-for-lustre-software.html. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. Test and System Configurations: Demonstrations used a single virtual machine with CentOS 6.6 and a recent build of the Lustre master branch. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © 2015 Intel Corporation.
