Solving Agencies' Big Data Challenges: PED for On-the-Fly Decisions

White Paper
Carina Veksler, NetApp
March 2012
WP-7158

ABSTRACT
With the growing volumes of rich sensor data and imagery used today to derive meaningful intelligence, government agencies need to address the challenges posed by these big datasets. NetApp provides a scalable, unified single pool of storage to better handle your processing and analysis of data to drive actionable intelligence in the most demanding environments on earth.

TABLE OF CONTENTS
1 Processing, Exploitation, and Dissemination (PED)
  1.1 PED Requirements
2 PED Impact on Storage
  2.1 Data Growth Trends
3 NetApp Storage for PED
  3.1 Solutions Approach
  3.2 Unique Differentiation
4 Summary

LIST OF FIGURES
Figure 1) Information requirements
Figure 2) Linear scaling of the E5460

1 Processing, Exploitation, and Dissemination (PED)

Accurate intelligence data forms the foundation for sound decision making across government agencies. Increasingly, this data comes in raw form from multiple large data sensors, providing the necessary information for both manual and automated analysis. Imagery and other sensor data is useless, however, unless the right quality and quantity of information can be extracted from that data to effectively manage, exploit, analyze, interpret, and disseminate it for faster and more effective action.

By processing and exploitation, we mean converting the immense volume of data collected into a form that can be used by analysts. This is done through decryption, language translation, and data reduction. Beyond this, dissemination refers to quickly routing relevant, accurate, mission-critical information to the right people at the right time.

However, much of this sensor data has already started to stress today's existing storage architectures. Government agencies face big data challenges related to ingesting immense amounts of data, and they have begun asking the following questions:

- Are there opportunities for me to take better advantage of my data?
- How can I make smarter, more meaningful decisions to support my organization's mission?
- What are the insights that could enable mission success?
- Do I have the ability to identify the hot spots that likely will fail before they fail?

NetApp can help you answer these questions and meet these challenges.

"We're going to find ourselves in the not too distant future swimming in sensors and drowning in data. The answer isn't throwing more manpower at it because in DoD, we don't have it; we are going to have to use technology and smarter systems."
Lt. Gen. David A. Deptula, First Deputy Chief of Staff, ISR

1.1 PED Requirements

As government agencies deploy new, more sophisticated information-gathering systems, users are confronted with a range of collection, integration, and management issues. These systems must support:

- Large-bandwidth ingest. Sensor inputs vary in workload and require extremely high bandwidth for the large sequential writes generally associated with streaming data from a variety of sensors: motion imagery, video, radar, and satellite imagery.
- Long-term archival. A dense form factor for the storage platform is mandatory to support the growing volumes of data.
- Analysis. Broad support is required for many operating systems and export formats to accommodate both processing and exploitation tools across multiple agencies.
- Distribution. The ability to support a variety of transport mechanisms and end points is needed for clients to handle dissemination over geographically distributed PED cells effectively.

2 PED Impact on Storage

Modern warfare has changed in many ways. One of the most revolutionary and powerful developments in the field of battle has been the use of unmanned aerial vehicles (UAVs) and satellites to boost intelligence, surveillance, and reconnaissance (ISR) capabilities. These resources allow analysts, operators, and decision makers to monitor much larger areas and provide greater situational awareness with far less risk to service personnel.

In 2009, drone aircraft flying over Iraq and Afghanistan returned roughly 24 years' worth of video footage for processing. Updated models deployed in 2010 produced 10 times as many data streams as their predecessors, and those in 2011 will triple that workload. [1]

Fast transfer and storage of rich video, motion imagery, and other large sensor data form the basis for PED workflows. Sensor data and intelligence must be available for analysis and interpretation as quickly as possible to enable teams to make split-second decisions. The less time spent ingesting images, the more time is available for logistics decisions, threat detection, and intelligence gathering. High-frame-rate video streams from multiple simultaneous sources are becoming more common, making the requirements for video transfer and storage even more daunting.

Figure 1) ISR architecture requirements.

2.1 Data Growth Trends

The growth of big data generated by large data sensors and ISR systems is putting enormous pressure on existing compute infrastructures, especially the storage platform. These larger datasets contain a wealth of useful information that, if analyzed in a timely fashion, can provide valuable intelligence for mission success. But without the necessary analytical tools, these valuable sources of data become useless.

[1] The Data Deluge, http://www.economist.com/node/15579717 (Feb. 25, 2010).

Infrastructure Breaking Points

Big data is breaking today's storage infrastructure along three major axes:

1. Complexity. Data is just text and numbers; big data is about finding the information hidden in huge volumes of data. Once found, information must be rapidly linked from a wide variety of sources, leading to high-fidelity decision support that spans multiple sources and data types, each improving decision confidence. Using normal algorithms for search, storage, and categorization is becoming increasingly complex and inefficient.

2. Speed. How fast is the data coming in? How fast can it be processed? Is there relevant information buried in the data? Full-motion video (FMV) and wide-area motion imagery (WAMI) for surveillance have very high ingestion rates and require automated information extraction to improve the time to information. Time to information is critical if agencies want to derive maximum value from this data. Taking weeks or months to run an analysis is no longer a viable option because the results will not be timely enough to detect patterns that may affect the success of the mission.

3. Volume. All collected data must be stored in a place that is secure and always available. With such high volumes of data, IT teams now have to make decisions about what is too much data. This abundance of data can cause the infrastructure to quickly break on the axis of volume. Once the information is found, it becomes much easier to identify what needs to be kept and for how long.
Best Practice

Effective PED environments require a multisensor datastore that delivers high bandwidth to support:

- Large sequential writes during ingest
- Frequent random reads during processing and exploitation
- NFS-based access for easy integration
- Efficient local and remote access during dissemination
- Independent capacity and performance scaling
- Extreme density to support the increasing data volumes and retention times

3 NetApp Storage for PED

The volume of streamed video now generated monthly by government agencies is equal to what was generated annually as recently as 2007. This growth rate requires a scalable storage solution that allows multiple agencies to share satellite feeds for collaboration, analysis, and dissemination in almost real time. Currently, intelligence operations frequently create multiple local copies from a central master when time and bandwidth management are of the essence in making timely decisions. High-speed local and remote file sharing and scalable file system performance are critical to support rapid information extraction from the data generated by wide-area motion imagery (WAMI), full-motion video (FMV), radar, and satellite imagery.

The NetApp Full-Motion Video solution provides quick reference to critical information, enabling agencies to drive operational efficiencies and reduce time and energy. NetApp storage solutions help government agencies take advantage of the tens of thousands of hours of live video captured each year. With better insight, that data can be turned into quality information to ultimately help users make better decisions within necessary time frames.

3.1 Solutions Approach

NetApp delivers preconfigured, pretested solutions that are designed to capture multiple high-speed feeds, such as video and satellite. Faster data exploitation gives agencies the information they need to make better-informed decisions. Based on the E-Series platform, the NetApp storage solution is optimized for capturing and examining rich video for improved decision making. Our solutions enhance situational awareness and command decision-making processes across both strategic and tactical agencies.

Optimized for Performance

NetApp solutions are designed to handle the extreme bandwidth requirements for large sequential writes generated by streaming data from a variety of sensors, including motion imagery, video, radar, and satellite imagery. The NetApp E5424 delivers both high bandwidth and high IOPS with leading price performance. The E5424 saves money by consuming 50% less power, using up to 24 2.5" SAS drives in a 2U form factor. A fully loaded rack delivers performance of up to 35GB/sec sustained disk read throughput, 30GB/sec sustained disk write throughput, and 350,000 sustained IOPS.

Maximum Density for Longer Retention

NetApp storage solutions are designed to handle the growing volume of data generated by large data sensors, streamlining the footprint for long-term archiving. The NetApp E5460 delivers optimized storage density for maximum capacity with excellent performance, supporting up to 60 drives in each 4U enclosure. The E5460 supports high-capacity near-line SAS disk options that are superior to SATA drives. SAS disks are becoming the drive technology of choice for high capacity and lower cost per MB/sec, and they are an excellent choice for throughput-intensive applications. Each 4U enclosure holds 60 disk drives in 5 drawers and delivers roughly 4.4GB/sec of read throughput and 2.9GB/sec of write throughput.
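To put the throughput figures above in context, the following minimal Python sketch estimates how long a sustained write of a given dataset would take. The 30GB/sec and 2.9GB/sec rates are the figures quoted in this section; the function name, the 100TB dataset size, and the use of decimal units (1TB = 1000GB) are assumptions made only for illustration, and real ingest times would depend on the workload and file system.

```python
def seconds_to_write(dataset_tb: float, throughput_gb_per_sec: float) -> float:
    """Return the seconds needed to write dataset_tb at a sustained rate.

    Uses decimal units (1 TB = 1000 GB), an assumption of this sketch.
    """
    if throughput_gb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return dataset_tb * 1000.0 / throughput_gb_per_sec


# Sustained write figures quoted in the text above.
RACK_WRITE_GB_PER_SEC = 30.0       # fully loaded E5424 rack
ENCLOSURE_WRITE_GB_PER_SEC = 2.9   # E5460 write figure

if __name__ == "__main__":
    dataset_tb = 100.0  # hypothetical dataset size, not taken from the paper
    for label, rate in (("E5424 rack", RACK_WRITE_GB_PER_SEC),
                        ("E5460", ENCLOSURE_WRITE_GB_PER_SEC)):
        hours = seconds_to_write(dataset_tb, rate) / 3600.0
        print(f"{label}: {hours:.2f} hours to write {dataset_tb:.0f} TB")
```

The calculation is a simple size-over-rate division, so it gives a lower bound under ideal sustained conditions rather than a prediction for any particular deployment.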
3.2 Unique Differentiation

NetApp delivers high-performance storage systems that meet the demanding performance and capacity requirements of PED environments without sacrificing simplicity and efficiency. Designed to meet wide-ranging requirements, these systems offer balanced performance that is equally adept at supporting high-performance file systems and bandwidth-intensive streaming applications.

The Full-Motion Video Solution for Processing, Exploitation, and Dissemination provides the extreme bandwidth required to efficiently ingest the tens of thousands of hours of sensor data collected every year. This allows you to ingest and stream higher-resolution video and provide enhanced performance. By using a shared file system that allows multiple nodes to access large datasets in parallel, you can dramatically improve the time it takes to input and output large or streaming files so you can better focus on analyzing and understanding the data to deliver meaningful intelligence on the fly.

The NetApp solution allows you to keep all sensor data in a single repository, using fewer storage arrays and a single namespace for huge libraries, even libraries of 1PB or more. NetApp also helps to improve the efficiency of the overall ISR workflow of active data with policy-based automatic archiving, which allows storage tiers to be collapsed while maintaining the bandwidth required to support ISR workflows. NetApp allows you to deploy active archives using cost-effective online storage for fast access to historical content or seamless integration with industry-leading archive management software.

Linear Scaling

To accommodate projected data growth, NetApp has designed the NetApp Full-Motion Video solution to scale linearly in both capacity and bandwidth. Our solution also allows you to expand in capacity to support expected data growth independent of performance. This enables the storage system to expand not only to the size required but also to the functionality required by the environment (bandwidth versus density).
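As a rough illustration of what linear scaling means here, the sketch below multiplies the per-enclosure E5460 figures quoted in this paper (60 drives and 2.9GB/sec of write throughput per 4U enclosure) by the number of enclosures. The function and constant names are hypothetical, and the strictly linear model is an assumption that simply mirrors the paper's claim; it is not a measurement.

```python
# Per-enclosure E5460 figures quoted in this paper.
DRIVES_PER_ENCLOSURE = 60             # drives in each 4U enclosure
WRITE_GB_PER_SEC_PER_ENCLOSURE = 2.9  # sustained write throughput


def linear_scaling_row(enclosures: int) -> dict:
    """Return max drive count and aggregate write bandwidth for N enclosures.

    Assumes bandwidth grows strictly linearly with enclosure count.
    """
    if enclosures < 1:
        raise ValueError("need at least one enclosure")
    return {
        "enclosures": enclosures,
        "max_drives": enclosures * DRIVES_PER_ENCLOSURE,
        # Round to one decimal to hide floating-point noise.
        "write_gb_per_sec": round(enclosures * WRITE_GB_PER_SEC_PER_ENCLOSURE, 1),
    }


if __name__ == "__main__":
    for n in range(1, 7):
        row = linear_scaling_row(n)
        print(f"{n} x E5460: up to {row['max_drives']} drives, "
              f"{row['write_gb_per_sec']} GB/s writes")
```

For one through six enclosures, this reproduces the upper drive counts and write bandwidths listed in Figure 2.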

The modular design of the E5400 storage arrays simplifies scaling and increases flexibility. The multiple drive shelf options enable custom configurations that can be tailored for any environment. You can mix drive types in a single enclosure so you can address different requirements with the same system. By combining elements from each solution, you can create a storage deployment tailored to your specific big-bandwidth requirements that will grow with your needs and protect your investment.

Figure 2) Linear scaling of the E5460.

Configuration   Drives (n)   Capacity (TB)   Bandwidth when scaling systems (GB/s, writes)*
1 x E5460       30-60        60-180          2.9
2 x E5460       90-120       180-360         5.8
3 x E5460       150-180      300-540         8.7
4 x E5460       210-240      420-720         11.6
5 x E5460       270-300      540-900         14.5
6 x E5460       330-360      660-1080        17.4

* With StorNext File System and representative workload

4 Summary

Government agencies need an effective storage strategy to manage and maintain the growing volume of data generated by motion imagery, video, radar, and satellites. The ability to use technology in a smarter way is now a necessity. In addition, it is critical to have access to tools that automate processes and make it easy to manage, retain, and retrieve time-sensitive information.

NetApp solutions deliver optimized storage for capturing and examining sensor feeds for better decision making. With the solution's ultradense form factor, data can be stored for longer periods of time, helping to improve decision support.

- Bandwidth. The solution handles the rigors of heavy computational workloads and bandwidth-sensitive streaming environments.
- Reliability. Advanced thermal and power features provide fast and confident deployment with preconfigured, pretested options.
- Availability. Standard redundant components provide the utmost availability of critical data.
- Linear scalability. The solutions can accommodate growing data streams and access requirements.

Effective storage strategies must address the network bandwidth issues presented by the size of the files captured by motion imagery, radar, satellites, and UAVs. Agencies that rely on PED solutions now need to evaluate whether their current strategy can adequately meet both current needs and projected future workloads.
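One way to begin such an evaluation is a back-of-the-envelope sizing check. The Python sketch below takes a required capacity and a required write bandwidth and reports how many E5460 enclosures would be needed and which requirement is the limiting one. It uses the per-enclosure upper bounds from Figure 2 (180TB and 2.9GB/s writes); the function name and the sample requirements are hypothetical, and extending the figure's linear trend beyond six enclosures is an assumption, not something this paper states.

```python
import math

# Upper bounds for a single E5460, taken from Figure 2.
MAX_TB_PER_ENCLOSURE = 180            # capacity, TB
WRITE_GB_PER_SEC_PER_ENCLOSURE = 2.9  # write bandwidth, GB/s


def enclosures_needed(capacity_tb: float, write_gb_per_sec: float):
    """Return (enclosure_count, limiting_requirement) for a workload.

    Capacity and bandwidth are sized independently; the larger count wins.
    """
    for_capacity = math.ceil(capacity_tb / MAX_TB_PER_ENCLOSURE)
    for_bandwidth = math.ceil(write_gb_per_sec / WRITE_GB_PER_SEC_PER_ENCLOSURE)
    count = max(for_capacity, for_bandwidth, 1)
    limiting = "capacity" if for_capacity >= for_bandwidth else "bandwidth"
    return count, limiting


if __name__ == "__main__":
    # Hypothetical requirements, chosen only to show both outcomes.
    print(enclosures_needed(500, 4.0))   # density-driven sizing
    print(enclosures_needed(200, 10.0))  # bandwidth-driven sizing
```

Sizing the two requirements separately reflects the bandwidth-versus-density trade-off described under Linear Scaling: a deployment can be driven by retention needs, by ingest rate, or by both.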

NetApp provides no representations or warranties regarding the accuracy, reliability or serviceability of any information or recommendations provided in this publication, or with respect to any results that may be obtained by the use of the information or observance of any recommendations provided herein. The information in this document is distributed AS IS, and the use of this information or the implementation of any recommendations or techniques herein is a customer's responsibility and depends on the customer's ability to evaluate and integrate them into the customer's operational environment. This document and the information contained herein may be used solely in connection with the NetApp products discussed in this document.

© 2012 NetApp, Inc. All rights reserved. No portions of this document may be reproduced without prior written consent of NetApp, Inc. Specifications are subject to change without notice. NetApp, the NetApp logo, and Go further, faster are trademarks or registered trademarks of NetApp, Inc. in the United States and/or other countries. All other brands or products are trademarks or registered trademarks of their respective holders and should be treated as such. WP-7158-0312