Data storage and high-speed streaming

Similar documents
Architecting High-Speed Data Streaming Systems. Sujit Basu

File System & Device Drive. Overview of Mass Storage Structure. Moving head Disk Mechanism. HDD Pictures 11/13/2014. CS341: Operating System

CS 6290 I/O and Storage. Milos Prvulovic

Price/performance Modern Memory Hierarchy

Logical Operations. Control Unit. Contents. Arithmetic Operations. Objectives. The Central Processing Unit: Arithmetic / Logic Unit.

Chapter Introduction. Storage and Other I/O Topics. p. 570( 頁 585) Fig I/O devices can be characterized by. I/O bus connections

RAID. RAID 0 No redundancy ( AID?) Just stripe data over multiple disks But it does improve performance. Chapter 6 Storage and Other I/O Topics 29

DIABLO TECHNOLOGIES MEMORY CHANNEL STORAGE AND VMWARE VIRTUAL SAN : VDI ACCELERATION

1 Storage Devices Summary

Distribution One Server Requirements

Sistemas Operativos: Input/Output Disks

Scaling from Datacenter to Client

HP Smart Array Controllers and basic RAID performance factors

Chapter 10: Mass-Storage Systems

Random Access Memory (RAM) Types of RAM. RAM Random Access Memory Jamie Tees SDRAM. Micro-DIMM SO-DIMM

OBJECTIVE ANALYSIS WHITE PAPER MATCH FLASH. TO THE PROCESSOR Why Multithreading Requires Parallelized Flash ATCHING

CSCA0102 IT & Business Applications. Foundation in Business Information Technology School of Engineering & Computing Sciences FTMS College Global

HP Z Turbo Drive PCIe SSD

Chapter 6. Inside the System Unit. What You Will Learn... Computers Are Your Future. What You Will Learn... Describing Hardware Performance

Intel RAID SSD Cache Controller RCS25ZB040

Embedded Operating Systems in a Point of Sale Environment. White Paper

Using NI CompactDAQ Controllers

Communicating with devices

Input / Ouput devices. I/O Chapter 8. Goals & Constraints. Measures of Performance. Anatomy of a Disk Drive. Introduction - 8.1

Slide Set 8. for ENCM 369 Winter 2015 Lecture Section 01. Steve Norman, PhD, PEng

Modular Instrumentation Technology Overview

Computers. Hardware. The Central Processing Unit (CPU) CMPT 125: Lecture 1: Understanding the Computer

LSI MegaRAID CacheCade Performance Evaluation in a Web Server Environment

Accelerating Server Storage Performance on Lenovo ThinkServer

What s New in Mike Bailey LabVIEW Technical Evangelist. uk.ni.com

Storage Technologies for Video Surveillance

User s Manual. Home CR-H BAY RAID Storage Enclosure

Computer Architecture

GENERAL INFORMATION COPYRIGHT... 3 NOTICES... 3 XD5 PRECAUTIONS... 3 INTRODUCTION... 4 FEATURES... 4 SYSTEM REQUIREMENT... 4

The Bus (PCI and PCI-Express)

Introduction to the NI Real-Time Hypervisor

Introduction to I/O and Disk Management

Input/output (I/O) I/O devices. Performance aspects. CS/COE1541: Intro. to Computer Architecture. Input/output subsystem.

Disks and RAID. Profs. Bracy and Van Renesse. based on slides by Prof. Sirer

Chapter 12: Mass-Storage Systems

PIONEER RESEARCH & DEVELOPMENT GROUP

EMX-2500 DATA SHEET FEATURES GIGABIT ETHERNET REMOTE CONTROLLER FOR PXI EXPRESS MAINFRAMES SYSTEM LEVEL FUNCTIONALITY

Hardware: Input, Processing, and Output Devices. A PC in Every Home. Assembling a Computer System

PCIe Over Cable Provides Greater Performance for Less Cost for High Performance Computing (HPC) Clusters. from One Stop Systems (OSS)

SSD Old System vs HDD New

I3: Maximizing Packet Capture Performance. Andrew Brown

What is RAID--BASICS? Mylex RAID Primer. A simple guide to understanding RAID

RAID Storage System of Standalone NVR

SSDs and RAID: What s the right strategy. Paul Goodwin VP Product Development Avant Technology

Semiconductor Device Technology for Implementing System Solutions: Memory Modules

Chapter 4 System Unit Components. Discovering Computers Your Interactive Guide to the Digital World

IncidentMonitor Server Specification Datasheet

Indexing on Solid State Drives based on Flash Memory

Optimizing SQL Server Storage Performance with the PowerEdge R720

1. Memory technology & Hierarchy

Lecture 36: Chapter 6

RAID Implementation for StorSimple Storage Management Appliance

Summer Student Project Report

COS 318: Operating Systems. Storage Devices. Kai Li Computer Science Department Princeton University. (

An Overview of Flash Storage for Databases

CSCA0201 FUNDAMENTALS OF COMPUTING. Chapter 5 Storage Devices

Full Drive Encryption with Samsung Solid State Drives

High-Performance SSD-Based RAID Storage. Madhukar Gunjan Chakhaiyar Product Test Architect

Lecture 9: Memory and Storage Technologies

How to Choose the Right Data Storage Format for Your Measurement System

TECHNOLOGY BRIEF. Compaq RAID on a Chip Technology EXECUTIVE SUMMARY CONTENTS

How to recover a failed Storage Spaces

Introduction Disks RAID Tertiary storage. Mass Storage. CMSC 412, University of Maryland. Guest lecturer: David Hovemeyer.

A Close Look at PCI Express SSDs. Shirish Jamthe Director of System Engineering Virident Systems, Inc. August 2011

Solid State Drive Architecture

RAID Levels and Components Explained Page 1 of 23

Brainlab Node TM Technical Specifications

Evaluation Report: Accelerating SQL Server Database Performance with the Lenovo Storage S3200 SAN Array

SLIDE 1 Previous Next Exit

With respect to the way of data access we can classify memories as:

Comparison of NAND Flash Technologies Used in Solid- State Storage

N /150/151/160 RAID Controller. N MegaRAID CacheCade. Feature Overview

PowerVault MD1200/MD1220 Storage Solution Guide for Applications

How To Scale Myroster With Flash Memory From Hgst On A Flash Flash Flash Memory On A Slave Server

Chapter 11 I/O Management and Disk Scheduling

Data Storage - I: Memory Hierarchies & Disks

Memoright SSDs: The End of Hard Drives?

3.4 Planning for PCI Express

WHITE PAPER FUJITSU PRIMERGY SERVER BASICS OF DISK I/O PERFORMANCE

Performance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper

5-BAY RAID STATION. Manual

Discovering Computers Living in a Digital World

3.5 Dual Bay USB 3.0 RAID HDD Enclosure

760 Veterans Circle, Warminster, PA Technical Proposal. Submitted by: ACT/Technico 760 Veterans Circle Warminster, PA

Computer Systems Structure Main Memory Organization

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

The Central Processing Unit:

Disk Storage & Dependability

COMPUTER HARDWARE. Input- Output and Communication Memory Systems

Chapter 9: Peripheral Devices: Magnetic Disks

Evaluation Report: HP Blade Server and HP MSA 16GFC Storage Evaluation

An Introduction to RAID. Giovanni Stracquadanio

HP SN1000E 16 Gb Fibre Channel HBA Evaluation

Transcription:

FYS3240 PC-based instrumentation and microcontrollers Data storage and high-speed streaming Spring 2011 Lecture #8 Bekkeng, 9.5.2011

Data streaming Data written to or read from a hard drive at a sustained rate is often referred to as streaming Trends in data storage Ever-increasing amounts of data Record everything and play it back later Hard drives: faster, bigger, and cheaper RAID hardware PCI Express PCI Express provides higher, dedicated bandwidth

Overview Hard drive performance and alternatives File types RAID DAQ software design for high-speed acquisition and storage

Streaming Data with the PCI Express Bus A PCI Express device receives dedicated bandwidth (250 MB/s or more). Data is transferred from onboard device memory (typically less than 512 MB), across a dedicated PCI Express link, across the I/O bus, and into system memory (RAM; 3 GB or more possible). It can then be transferred from system memory, across the I/O bus, onto hard drives (TB s of data). The CPU/DMA-controller is responsible for managing this process. Peer-to-peer data streaming is also possible between two PCI Express devices.

PXI: Streaming to/from Hard Disk Drives

RAM Random Access Memory SRAM Static RAM: Each bit stored in a flip-flop DRAM Dynamic RAM: Each bit stored in a capacitor (transistor). Has to be refreshed (e.g. each 15 ms) EDO DRAM Extended Data Out DRAM. Data available while next bit is being set up Dual-Ported DRAM (VRAM Video RAM). Two locations can be accessed at the same time SDRAM Synchronous DRAM. Latched to the memory bus clock read/write on one single clock cycle Rambus (RDRAM) - Intel had to use RDRAM 1996-2002. Much more expensive than SDRAM DDR SDRAM Double Data Rate SDRAM (2.5 V). Data transferred on both rising and falling edge of clock. DDR2 SDRAM (1.8 V) Same, but lower power consumption and higher clock frequency DDR3 SDRAM (1.5 V) Came in 2007. Even less power and faster

Stream to/from disk rates Peak, not sustained rate! Important parameters for streaming Seek times Buffer size Rotational speed (RPM) Sustained streaming rates is most affected by the rotational speed

Sector HDD Performance Track HDD s Internal Data Rate (IDR) = density x RPM x disk diameter Outer HDD track is faster, inner track is slower Example above: 62 MB/s at outer track, 36 MB/s at inner track Windows OS allocates file space from outer track and inward

SSD (Solid-State drive) A data storage device that uses solid-state (Flash) memory to store data. SSDs are distinguished from traditional hard disk drives (HDDs), which are electromechanical devices containing spinning disks and movable read/write heads. SSDs, in contrast, use microchips which retain data in nonvolatile memory chips and contain no moving parts. SSDs use the same interface as HDDs, thus easily replacing them in most applications

SSDs Performance versus Capacity More robust used in industry/server applications

Pros SSDs pros and cons Robustness (Less susceptible to vibrations and shock) Increased write/read speeds (low access time and latency) Not a drop in write speed when the memory fills up (as for HDDs) Low power consumption (reduced heat generation) Low boot-up time (for OS) and quicker application-launches Cons High cost Low capacity (in # GB) Great quality variations have been experienced Reduced write speed experienced over time (for some suppliers)

SSD example: OCZ RevoDrive PCI express card with flash memory and RAID-controller (RAID-0) Available in 50 GB to 480 GB capacities PCI-Express interface (x4) For use as primary boot drive or data storage Read: Up to 540 MB/s Write: Up to 480 MB/s Sustained Write: Up to 400 MB/s

Selecting hard drives for DAQ systems A standard HDD is a 8/5 drive Designed for power on for 8 hours, 5 days a week Select Enterprise/Extended operations/es version HDDs They are 24/7 drives, meaning that they are designed for continuous operation and high sustained throughput Almost same price as standard versions (except server disks!) 2.5 HDDS designed for laptops are more rugged than 3.5 HDDs designed for use in desktop PCs

Factors that affect streaming performance Beyond overall application architecture, stream-to-disk or stream-from-disk rates can be affected by some of the following factors Running background programs such as virus scan Recommended to disable the scheduled scans and updates for the entire duration of data streaming How the hard drive is formatted to group data Location of the file on the hard drive(s) Locate the OS on a separate HDD (to free the fastest outer tracks for data storage)

Determining Storage Format When determining the appropriate storage format for the data, consider the following: What will you do with your data once you have acquire them? Will you write and read data with the same application? How much data will you acquire? At what rate will you acquire data? Will you need to exchange data with another program? Will you need to search your data files? ASCII Binary TDMS Config File Spreadsheet AVI XML

ASCII Files Pros Human-readable Easily portable to other applications such as Microsoft Excel Can easily add text information (first line) for each data column Cons Large file size Slow read and write

Binary Files Two different architectures for handling memory storage: Pros Compact file size Fast streaming Cons Not human-readable Less easily exchangeable

XML files Pros Can store complex data structures Can be displayed in a Web browser or in a text editor Cons Larger disk footprint Not for data streaming

Database (MS Access etc.) Pros Searchable Cons Time-intensive programming Not suitable for high-speed streaming

TDM and TDMS A file format from National Instruments TDMS = Technical Data Management Streaming Three levels of hierarchy TDMS Optimized for high-speed streaming Binary-based header Binary bulk data Single file for both header information and bulk data Supported on real-time software platforms TDM XML-based header file Separate binary bulk data file ni.com/tdm

TDMS File Format: Optimized for Data Storage and Search

TDMS Write Data and Set Properties

File format comparison

The data challenge

Applications Requiring high-speed Data Streaming (examples) High speed data acquisition Radar RF recording and playback High resolution video Digital video recording and playback

Key system components for highspeed streaming Hardware Platform with High-Throughput and Low- Latency PXI Express High-Speed Data Storage Hard Disk Drives (HDDs) Solid-State Drives (SSDs) Software for Streaming to Disk at High Rates Streaming Front-End Instrumentation

Bandwidth versus Latency

PXI Express Hardware Platform In a system composed of a x4 PXIe chassis, x4 PXIe controller, and x1 PXIe module, the module will limit the maximum data streaming rate of that particular slot to 250 Mbytes/s. Processing is required of the controller to manage the transfer of data from device memory to controller memory to hard drives for acquisition or from hard drives to controller memory to device memory for generation.

RAID RAID = Redundant Array of Independent Drives. RAID is a general term for mass storage schemes that split or replicate data across multiple hard drives. The rate at which data can be written to or read from the hard drives limits the data streaming rates. One way to increase this rate is to use a RAID configuration Can be the internal RAID of the remote controller (workstation/pc) or an in-chassis PXI RAID Can be a RAID connected externally to the PXIe chassis 12 disk or more RAIDs are easily available

RAID-0 Striping without redundancy Improved speed over streaming to a single hard drive Unimproved system reliability Transparently supported by Windows OS The fastest configuration!

RAID-1 Mirrored (redundancy) 100% data redundancy Each piece of data is written to two (or more) hard drives No write speed increase over single disk Highest overhead of all raid configurations Often used for the operating system (OS) disks

RAID-5 Distributed parity (single parity) Parity data distributed on all disks Can only tolerate one drive failure (array continues to operate with one failed drive) Single-parity RAID levels are as vulnerable to data loss as a RAID- 0 array until the failed drive is replaced and its data rebuilt More write overhead than RAID-1 because of the additional parity data that has to be created and written to the disk array Poor performance with small files Gives less space for measurement data (due to parity) Reduces the amount of storage space available by the size of one hard drive (due to parity information) Minimum number of disks is 3

RAID-6 Double distributed parity Extends RAID 5 by adding an additional parity block Provides fault tolerance from two drive failures (array continues to operate with up to two failed drives) This becomes increasingly important as large-capacity drives lengthen the time needed to recover from the failure of a single drive Double parity gives time to rebuild the array without the data being at risk if a single additional drive fails before the rebuild is complete. Very slow write because of the overhead associated with parity calculations

RAID-10 (1+0) Striping and mirroring Both increased speed and redundancy compared to single drive Can sustain multiple drive failures Configuration requires twice the number of hard drives

Kilobyte kb vs KB (from Wikipedia) The kilobyte (symbol KB) [1] is a multiple of the unit byte for digital information. Historically and in common usage, a kilobyte refers to 1024 (2 10 ) bytes in most fields of computer science and information technology. [2][3][4] However, hard disk drives measure capacity in strict accordance with the rule of the International System of Units unit prefixes (prefixes like kilo- and mega-) that are based on decimal math. [5][6] In such mass storage contexts, megabyte equals precisely 1,000,000 bytes and gigabyte is precisely 1,000 times greater than a megabyte. This association suggests that kilobyte can also denote 1,000 bytes in the context of mass storage. Thus, kilobyte and its symbol KB can be ambiguous, their exact value (1,024 or 1,000) dependent upon context. In December 1998, an international standards organization attempted to address these dual definitions of the conventional prefixes by proposing unique binary prefixes and prefix symbols to denote multiples of 1024, such as kibibyte (KiB), which exclusively denotes 2 10 or 1024 bytes. [7] Had this proposal been widely and consistently adopted, it would have liberated the standard unit prefixes to unambiguously refer only to their strict decimal definitions wherein kilobyte would be understood to represent only 1000 bytes. However, in the over-12 years that have since elapsed, the proposal has seen little adoption by the computer industry. [8][9][10][11]

kb vs KB k = 1000 (decimal) K = 1024 (binary) GB binary = 1024 3 byte TB binary = 1024 4 byte (GB binary ~= GB desimal *0.93) (TB binary ~= GB desimal *0.90)

Streaming to external NI-RAID

Stream to/from disk rates

RAID configuration Use the RAID management software to configure you RAID system Do not create any single array larger than 2 TB when using Windows XP or Vista (32-bit OS) as they recognize partitions of only 2 TB or less

RAID configuration II In order to speed up the write of multiple (simultaneous) files NI recommends to write these files to separate RAID volumes The RAID must be reconfigured with a suitable number of RAID volumes of a suitable size If for example three equally fast high-speed digital video streams are to be stored on a 12 disk RAID, the RAID could be configured with 3 separate RAID volumes of 4 drives each This should allow for writes (and reads) on the outer edge of each of the 3 volume s respective disks.

Memory Storage Chain for a large DAQ system Master DAQ computer DAQ Clients, if this is a server RAID-0 (HDD) RAID-5 (HDD) SSD RAM RAID-5 (HDD) Tape drives HDDs MB/s needed? Total amount of data (MB) needed? Offline

PXI RAID hardware available from NI

PXI RAID hardware available from NI

MXI bus It couples two physically separate buses with either a copper or fiber-optic, e.g PCIe to PXIe PXIe to PXIe E.g. control a PXIe chassis with a PCIe based PC (remote controller) Use e.g. a x4 MXI Express cable and interface cards in each PC/PXI system PCIe PXIe MXI PXIe PXIe MXI

Illustration of driver problems Problem: NI HDD-8264 is only achieving ~510 MB/s of sustained throughput in in Windows 7, not 600 MB/s like with Windows XP (Performance decrease) Solution: NI HDD-8264 uses an LSI RAID controller, and LSI has not released an official Windows 7 driver for this RAID card. LSI recommends the driver that is integrated in the Windows 7 kernel. However, this driver has decreased performance compared to the LSI developed, Windows XP driver. Benchmarking tests between Windows XP and Windows 7 performed by National Instruments show an approximate 15% performance decrease when using Windows 7.

Another solution - Direct-to-Disk controller When streaming to or from controller memory or hard drives, the PCI controller (in the case of PCI), I/O bus, controller memory, and processor are shared data paths, which divides down data streaming bandwidth. Additionally, the operating system, drivers, and application software managing data flow introduce latencies.

Direct-to-Disk controller With a PCI Express direct-to-disk controller module, data is streamed directly from the device memory onboard a DAQcard, across the PCI or PCI Express bus, to the direct-to-disk controller for acquisition This gives a minimum use of the PCI-buss and the CPU http://www.conduant.com/ Streamstor Amazon Express Controller (using PCI Express)

32-Bit vs. 64-Bit OS for DAQ applications A 32-bit processor can reference 2 32 bytes, or 4 GB of memory A 64-bit processor are theoretically capable of referencing 2 64 locations in memory. However, all 64-bit versions of Microsoft operating systems currently impose a 16 TB limit on address space Increased performance of 64 bit OS and application programs for DAQ applications (provided that 64-bit drivers are available) can be seen in: High-channel count, high-speed data analysis Vision acquisition and applications with large amounts of data moving directly into memory at rapid rates LabVIEW 2009 and newer versions are now available for 64-bit operating systems (e.g. Windows 7 x64).

LabVIEW: Bad Programming Structure for High-speed acquisition and storage

Solution: Use Multithreading!

Producer-Consumer loops

LabVIEW handles the queue as a block of allocated PC memory. This memory block is utilized as a temporary storage FIFO for data passing between two loops LabVIEW handles all the memory access to ensure that read-write race conditions do not occur.

Increase DAQ I/O Performance Sector Use the option to disable buffering (unbuffered I/O) in Win API Optimizes streaming applications Note that you must read from or write to the file in integer multiples of the disk sector size (usually 512 bytes) when using this option Track

Data Types file size

Examples of Streaming Front-End Instrumentation NI X Series Multifunction Data Acquisition Streaming over PCIe and PXIe busses PCIe x4 dual port NIC