Secure Cloud Storage and Computing Using Reconfigurable Hardware


Victor Costan, Brandon Cho, Srini Devadas

Motivation
- Computing is more cost-efficient in public clouds, but what about security?

Cloud Applications and Security Models
- Individual user backs up public data
  - Upload a file to Amazon S3 and anyone can download it
  - User is only concerned with integrity and reliability of storage
- User backs up private data (e.g., photographs)
  - User can encrypt data prior to storing it, for privacy
  - User is concerned with integrity and reliability of storage
- User wants to back up and share photographs on Flickr
  - User needs to trust the integrity of the application (e.g., Wordpress) that is used to share photographs
- User wants to run a private application on private data (e.g., access a private database)
  - User has to trust the cloud provider to maintain privacy and integrity

What Public Clouds Cannot Do (Yet)
- Guarantee integrity and privacy of computation
- Integrity and privacy can be guaranteed if cloud servers have trusted modules (e.g., TPMs, TEMs)
  - Performance loss is a significant concern
- Encrypted computation can be performed in theory using fully homomorphic encryption techniques (Gentry, 2008)
  - These schemes are not yet practical
- For these reasons, private clouds are used in database applications and other applications where privacy is crucial
- Can we secure public clouds?

Trusted Computing Bases
1. Trust the cloud provider's entire server
   - The status quo: Amazon S3 and EBS
   - Cheap, but no security guarantees
2. Trust a TPM (Trusted Platform Module) attached to the server
   - Very good security boundary: one well-studied chip
   - Low performance, low throughput
3. The best of both worlds
   - Don't trust weak components: server OS, system buses, RAM
   - Do trust: TPM-like chip, plus high-performance chip (FPGA / ASIC)
   - Security boundary is still good
   - Good performance and throughput

System Design

Design: System Architecture
[Figure: system architecture diagram. Labels: FPGA / ASIC (Trusted), Secure NVRAM Chip, Client, System Bus, Internet, CPU, Disk, RAM, Network Card]

Attack Vectors for Trusted Storage Application
- Hard disk tampering
  - Try to inject invalid data (easy)
  - Replay attacks (harder)
- Bugs from other applications running on the server
- OS compromise
- Physical tampering
  - Active system bus tapping (e.g., Xbox)
  - RAM glitching (e.g., PlayStation 3)
  - Hard disk modification or roll-back to a previous state

Integrity Verification
[Figure: Client/TCB writes to and reads from an Untrusted Disk through an INTEGRITY VERIFICATION step]
- Check if a value read from the untrusted disk is the most recent value stored at that address by the client

MAC-based Integrity Verification?
[Figure: Client/TCB and Untrusted Disk, address 0x45, keyed MAC with a VERIFY step on reads. Labels: 124, MAC(0x45, 124); 120, MAC(0x45, 120); IGNORE]
- A Message Authentication Code (MAC) is often used to authenticate a network message
- Store MAC(address, value) on writes, and check the MAC on reads
- Does NOT work: replay attacks
- Need to securely remember the untrusted disk state
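The replay problem with MAC(address, value) can be illustrated with a small sketch. Everything here is an assumption made for illustration (the key, the `UntrustedDisk` class, the byte encoding of address and value, and the use of HMAC-SHA1); the slides only state that a keyed MAC over the address and value is stored on writes and checked on reads.

```python
import hashlib
import hmac

KEY = b"client-secret-key"  # hypothetical key known only to the client/TCB


def mac(address: int, value: int) -> bytes:
    """Keyed MAC over (address, value), in the spirit of MAC(0x45, 124)."""
    msg = address.to_bytes(8, "big") + value.to_bytes(8, "big")
    return hmac.new(KEY, msg, hashlib.sha1).digest()


class UntrustedDisk:
    """Toy disk storing (value, tag) pairs; the attacker controls its contents."""

    def __init__(self):
        self.cells = {}

    def write(self, address, value, tag):
        self.cells[address] = (value, tag)

    def read(self, address):
        return self.cells[address]


def verified_read(disk, address):
    """Accept a value only if its stored MAC matches (address, value)."""
    value, tag = disk.read(address)
    if not hmac.compare_digest(tag, mac(address, value)):
        raise ValueError("MAC mismatch")
    return value


disk = UntrustedDisk()
disk.write(0x45, 120, mac(0x45, 120))
stale = disk.read(0x45)                # attacker records the old pair
disk.write(0x45, 124, mac(0x45, 124))  # client overwrites with a newer value
disk.cells[0x45] = stale               # attacker rolls the cell back
replayed = verified_read(disk, 0x45)   # verification passes on stale data
```

The stale pair carries a valid MAC, so the check cannot tell it apart from the most recent write. An injected value without a matching MAC is still rejected, which is why invalid-data injection is the easier attack to stop.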

Design: Trusted Storage on Untrusted Disks
- 160-bit hash in trusted memory authenticates a 1TB disk (20 levels)
- Disk divided into 1MB blocks: B1, B2, B3, B4
- Leaves hash their blocks: h1 = h(B1), h2 = h(B2), h3 = h(B3), h4 = h(B4)
- Nodes hash their children: h5 = h(h1 h2), h6 = h(h3 h4)
- Root hash: h7 = h(h5 h6)
- Root hash matches iff all blocks match

Design: Hash Tree Cache
- Server stores the entire hash tree in RAM
- FPGA has a cache that stores a subset of nodes
- Server tells the FPGA which nodes to store
  - Cache management commands
[Figure: hash tree with nodes 1 to 7]

Node  Hash  Verified
1     fabe  Y
2     e6fc  Y
4     53a8  Y
5     b2ce  Y
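The hash tree over disk blocks (leaves hash their blocks, nodes hash their children) can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-1 is used because the slides mention a 160-bit hash and a SHA-1 engine, child hashes are combined by plain concatenation, the block count is a power of two, and binary units are assumed for the 1TB / 1MB depth calculation. The function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    """160-bit hash (SHA-1 assumed here)."""
    return hashlib.sha1(data).digest()


def root_hash(blocks):
    """Root of a binary hash tree; assumes len(blocks) is a power of two."""
    level = [h(b) for b in blocks]  # leaves hash their blocks
    while len(level) > 1:
        # each parent hashes the concatenation of its two children
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"B1", b"B2", b"B3", b"B4"]
root = root_hash(blocks)

# Changing any single block changes the root.
tampered_root = root_hash([b"B1", b"B2", b"XX", b"B4"])

# Tree depth for a 1TB disk split into 1MB blocks (binary units assumed).
leaves = 2**40 // 2**20
levels = leaves.bit_length() - 1
```

Only `root` needs to live in trusted memory; the leaf and interior hashes can sit on untrusted storage, since any modification shows up as a root mismatch.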

Design: Hash Tree Cache - Efficiency
- Checking leaf 33 requires 10 node loads for a cold cache in this example
- Remember the root is always loaded in the cache
[Figure: hash tree showing nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 32, 33]
5/25/10

Design: Hash Tree Cache - Efficiency
- Checking leaf 38 requires only 4 node loads, because 9 is already in the cache and verified
- Server can predict client requests and manage the cache for high performance
[Figure: hash tree showing nodes 1, 2, 3, 4, 5, 8, 9, 16, 17, 18, 19, 32, 33, 38, 39]
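The node-load counts in the two efficiency examples can be reproduced with a short sketch. It assumes heap-style numbering (node i has children 2i and 2i+1, root is node 1), which is consistent with the node numbers in the figures, and it assumes that verifying a node requires loading both that node and its sibling until a verified ancestor is reached. The function name `nodes_to_load` is hypothetical.

```python
def nodes_to_load(leaf, verified):
    """Nodes that must be loaded into the cache to verify `leaf`.

    `verified` is the set of node numbers already cached and verified.
    The root (node 1) must be in it, as it is always loaded.
    """
    if 1 not in verified:
        raise ValueError("root must always be in the cache")
    loads = []
    node = leaf
    while node not in verified:
        # both children are needed to recompute the parent's hash
        for n in (node, node ^ 1):
            if n not in verified:
                loads.append(n)
        node //= 2  # move up to the parent
    return sorted(loads)


# Cold cache: only the root is present.
cold = nodes_to_load(33, {1})

# Warm cache: everything loaded while checking leaf 33 is now verified.
warm = nodes_to_load(38, {1} | set(cold))
```

With the cold cache the walk goes all the way to the root; with the warm cache it stops at node 9, which is why predicting client requests and pre-loading nodes pays off.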

Design: Maintaining FPGA State
- FPGA
  - 32nm, no NVRAM
  - Physically Unclonable Function (PUF) or battery-backed encryption key
  - E-fuses: hash of the public key for the certificate of the trusted memory chip
- Trusted Memory
  - Low performance
  - Smart card-family chip
  - Encryption engine, manufacturer certificate
  - NVRAM holding the FPGA's root hash

Implementation Decisions

Design: System Architecture Revisited
[Figure: system architecture diagram. Labels: FPGA / ASIC (Trusted), Secure NVRAM Chip, Client, System Bus, Internet, CPU, Disk, RAM, Network Card]

Implementation: Storage
- Prototype uses a desktop-class 7,200 RPM HDD with 1TB
- Normal servers would use 10,000 RPM disks
- Hash tree block size: 1MB

Model           Throughput  Latency   GB / $
7,200 RPM HDD   70 MB/s     12 ms     10
10,000 RPM HDD  100 MB/s    8 ms      1.5
15,000 RPM HDD  130 MB/s    6 ms      1
SSD             250 MB/s    0.065 ms  0.4

Implementation: SHA-1 Hash Engine
- High-throughput 4-stage pipelined SHA-1 implementation
- 6 SHA-1 engines, 4 simultaneous hashes / engine
- Hash tree logic (with cache) uses 70% of the silicon, SHA-1 uses 30%

FPGA Model     Throughput  Latency  FPGA Cost
Virtex-5 FPGA  20.4 GB/s   600 ns   $50
Virtex-6 FPGA  21.6 GB/s   550 ns   $75

Implementation: Hash Tree Cache
- 188 bits per cache entry, 43690 entries / MB
- 1TB disk, 1MB nodes: path length is 20 nodes
- Prototype: 1MB cache on FPGA, avg. 3 node loads / block
- Production: 8MB cache (like Core i7), avg. 1 load / block

Cache size, strategy  Hit rate  Loads / op  Verifies / write
32kB, LRU             50%       10          20
512kB, LRU            75%       5           20
1MB, LRU              85%       3           20
8MB, LRU              95%       1           20

Implementation: FPGA CPU Bus
- Prototype uses Gigabit Ethernet at 80% capacity
- Production servers should use 16-lane PCI-Express

Model            Throughput
PCI Express x16  4 GB/s
SATA II          384 MB/s
PCI Express x1   250 MB/s
Ethernet         100 MB/s
USB 2.0          60 MB/s

Implementation: Trusted Memory Chip
- Irrelevant for performance, used for booting the FPGA
- Smart card technology
- Prototype: JavaCard 2.2.1, 32kB EEPROM, 2kB RAM, 100ms / op
- Estimated requirements: 4kB ROM, 4kB EEPROM
- Production: any $1 secure chip with a processor and NVRAM
- Secure NVRAM to Server Bus
  - Prototype: USB
  - Production: USB, LPC
  - Irrelevant for system performance, only used at boot

Implementation: Prototype System
[Figure: prototype system diagram. Labels: Virtex 5 XC5VLX110T; JCOP21 36k; MacBookPro6,2; Core i7 620; 4GB RAM; Gigabit Ethernet; Ethernet; USB 1.0; SATA II; HyperTransport; PCI-E x1; Cat 5 cable; Core i7 920; 1TB 7,200 RPM; PC1066 2GB; Generic Gigabit Ethernet]

Implementation: Performance Overview
[Figure: prototype system diagram annotated with throughput and latency. Labels: 5.2GB/s 96us; 100MB/s 200us; 100MB/s 200us; 384MB/s; 250MB/s; 100MB/s 8ms; 100MB/s 200us]

Implementation: Overhead Analysis for the Prototype
- Client-server bandwidth overhead: 0.002%
  - Operation: 1 HMAC (20 bytes) per 1MB = 0.002%
  - Handshake: extra secret exchange piggybacks on SSL: 5%
- Latency overhead (1 client): 4%
  - Without security: 8.2ms / request
  - With security: 8.5ms / request
  - Latency overhead = the latency of a very fast Internet hop
- No throughput overhead (N clients)
  - With or without security: 100MB/s
  - Need 40 HDDs to saturate PCI-E x16, 52 HDDs to saturate FPGA

Ongoing Work

Other Applications
- FPGA can be used to load user-specified circuits and perform arbitrary computation with security guarantees
- Applications: encrypted image search, financial calculations
- Potential applications in highly regulated industries, e.g., medical record keeping and processing, secure financial services

Acknowledgement: Work was funded by Quanta Corporation
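As a cross-check, the figures in the overhead analysis for the prototype can be re-derived from the numbers quoted on the slides. This is only a sketch of the arithmetic: decimal units are assumed (1MB = 10^6 bytes, 1 GB/s = 1000 MB/s), and the 5.2GB/s value is assumed to be the FPGA throughput from the performance overview.

```python
MB = 10**6  # decimal megabyte (assumption; the slides do not specify)

# Bandwidth: one 20-byte HMAC per 1MB block.
hmac_bytes = 20
block_bytes = 1 * MB
bandwidth_overhead_pct = 100 * hmac_bytes / block_bytes

# Latency for a single client, in ms per request.
latency_plain_ms = 8.2
latency_secure_ms = 8.5
latency_overhead_pct = 100 * (latency_secure_ms - latency_plain_ms) / latency_plain_ms

# Number of 100 MB/s disks needed to saturate each component.
hdd_mb_per_s = 100
pcie_x16_mb_per_s = 4000  # 4 GB/s
fpga_mb_per_s = 5200      # 5.2 GB/s (assumed to be the FPGA figure)
hdds_for_pcie = pcie_x16_mb_per_s // hdd_mb_per_s
hdds_for_fpga = fpga_mb_per_s // hdd_mb_per_s
```

Under these assumptions the bandwidth overhead comes out at 0.002%, the latency overhead rounds to 4%, and the disk counts match the 40 and 52 HDDs quoted for PCI-E x16 and the FPGA.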