Hortonworks Data Platform Reference Architecture



Similar documents
MapR Enterprise Edition & Enterprise Database Edition

Dell Reference Configuration for Hortonworks Data Platform

Apache Hadoop Cluster Configuration Guide

The Greenplum Analytics Workbench

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Hadoop: Embracing future hardware

Platfora Big Data Analytics

Building & Optimizing Enterprise-class Hadoop with Open Architectures Prem Jain NetApp

docs.hortonworks.com

RED HAT ENTERPRISE LINUX 7

IRON Big Data Appliance Platform for Hadoop

HDFS Federation. Sanjay Radia Founder and Hortonworks. Page 1

Hadoop on the Gordon Data Intensive Cluster

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

HP SN1000E 16 Gb Fibre Channel HBA Evaluation

Successfully Deploying Alternative Storage Architectures for Hadoop Gus Horn Iyer Venkatesan NetApp

How To Run Apa Hadoop 1.0 On Vsphere Tmt On A Hyperconverged Network On A Virtualized Cluster On A Vspplace Tmter (Vmware) Vspheon Tm (

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales

Hadoop on OpenStack Cloud. Dmitry Mescheryakov Software

HP Reference Architecture for Hortonworks Data Platform on HP ProLiant SL4540 Gen8 Server

Accelerate Big Data Analysis with Intel Technologies

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

LSI MegaRAID CacheCade Performance Evaluation in a Web Server Environment

Upcoming Announcements

SUN ORACLE EXADATA STORAGE SERVER

Fujitsu PRIMERGY Servers Portfolio

HUAWEI TECHNOLOGIES CO., LTD. HUAWEI FusionServer X6800 Data Center Server

Optimizing SQL Server Storage Performance with the PowerEdge R720

High Performance Server SAN using Micron M500DC SSDs and Sanbolic Software

Maximizing Hadoop Performance and Storage Capacity with AltraHD TM

SUN HARDWARE FROM ORACLE: PRICING FOR EDUCATION

Sun Constellation System: The Open Petascale Computing Architecture

CORRIGENDUM TO TENDER FOR HIGH PERFORMANCE SERVER

INDIAN INSTITUTE OF TECHNOLOGY KANPUR Department of Mechanical Engineering

NetApp Solutions for Hadoop Reference Architecture

Redundancy in enterprise storage networks using dual-domain SAS configurations

Adobe Deploys Hadoop as a Service on VMware vsphere

Using Hadoop to Expand Data Warehousing

HADOOP ON ORACLE ZFS STORAGE A TECHNICAL OVERVIEW

Cisco UCS B-Series M2 Blade Servers

How To Build A Cisco Ukcsob420 M3 Blade Server

CURSO: ADMINISTRADOR PARA APACHE HADOOP

Maximum performance, minimal risk for data warehousing

Deploying Hadoop with Manager

Intel RAID SSD Cache Controller RCS25ZB040

The Hardware Dilemma. Stephanie Best, SGI Director Big Data Marketing Ray Morcos, SGI Big Data Engineering

1. Specifiers may alternately wish to include this specification in the following sections:

Architecting for the next generation of Big Data Hortonworks HDP 2.0 on Red Hat Enterprise Linux 6 with OpenJDK 7

Dell Reference Configuration for DataStax Enterprise powered by Apache Cassandra

Lenovo ThinkServer and Cloudera Solution for Apache Hadoop

HP Proliant BL460c G7

VMware Virtual SAN Backup Using VMware vsphere Data Protection Advanced SEPTEMBER 2014

MaxDeploy Ready. Hyper- Converged Virtualization Solution. With SanDisk Fusion iomemory products

Increasing Hadoop Performance with SanDisk Solid State Drives (SSDs)

Data Sheet FUJITSU Server PRIMERGY CX272 S1 Dual socket server node for PRIMERGY CX420 cluster server

Virtual Compute Appliance Frequently Asked Questions

Deploying Cloudera CDH (Cloudera Distribution Including Apache Hadoop) with Emulex OneConnect OCe14000 Network Adapters

Power Efficiency Comparison: Cisco UCS 5108 Blade Server Chassis and Dell PowerEdge M1000e Blade Enclosure

Cisco UCS B440 M2 High-Performance Blade Server

Backup Exec 3600 Appliances Partner Overview. Backup Exec 3600 Appliance Partner Overview

HP Moonshot System. Table of contents. A new style of IT accelerating innovation at scale. Technical white paper

Extending Hadoop beyond MapReduce

Hadoop Distributed File System. Dhruba Borthakur Apache Hadoop Project Management Committee

Hadoop in the Enterprise

IronPOD Piston OpenStack Cloud System Commodity Cloud IaaS Platforms for Enterprises & Service

HP recommended configuration for Microsoft Exchange Server 2010: ProLiant DL370 G6 supporting GB mailboxes

Hadoop Size does Hadoop Summit 2013

Hyperscale Use Cases for Scaling Out with Flash. David Olszewski

HadoopTM Analytics DDN

Server Consolidation for SAP Business Solutions on Lenovo X6 Systems with X ARCHITECTURE technology and Intel Xeon E7 v2 Processors

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Agenda. HPC Software Stack. HPC Post-Processing Visualization. Case Study National Scientific Center. European HPC Benchmark Center Montpellier PSSC

version 7.0 Planning Guide

AirWave 7.7. Server Sizing Guide

Low-cost

Built for Business. Ready for the Future.

High-Performance Computing Clusters

Microsoft Analytics Platform System. Solution Brief

The Convergence of Software Defined Storage and Physical Appliances Hybrid Cloud Storage

ioscale: The Holy Grail for Hyperscale

Cost Efficient VDI. XenDesktop 7 on Commodity Hardware

NOTICE ADDENDUM NO. TWO (2) JULY 8, 2011 CITY OF RIVIERA BEACH BID NO SERVER VIRTULIZATION/SAN PROJECT

Enabling High performance Big Data platform with RDMA

How To Run A Hosted Physical Server On A Server At Redcentric

præsentation oktober 2011

Oracle Big Data SQL Technical Update

Dell In-Memory Appliance for Cloudera Enterprise

HP Reference Architecture for Cloudera Enterprise on ProLiant DL Servers

Transcription:

Hortonworks Data Platform Reference Architecture A PSSC Labs Reference Architecture Guide December 2014

Introduction PSSC Labs continues to bring innovative compute server and cluster platforms to market. Focusing on specific applications where performance and reliability are critical, highlights PSSC Labs strengths. The Apace Hadoop data framework requires substantial compute and storage capabilities coupled tightly together. Hortonworks Data Platform adds layers of unification, security and management. PSSC Labs is the first manufacturer to design a server platform specifically for Hadoop. Introducing the World s highest density, lowest power consuming, Enterprise Ready Big Data server platform designed specifically for Hadoop workloads. The Big Data D12000 offers the absolute highest possible compute and storage density combined with high performance Data IO throughput. PSSC Labs already delivers the Big Data D12000 for small POC to large production clusters spanning hundreds of nodes. Key Features: Reduce Data Center Footprint By 50% Reduce Power Consumption By 40% Nearly 50% Greater Data IO Rates Patent Pending Tool Free Maintenance Technical Specifications: Up to 12 x 3.5 SATAIII or 14 x SSDs in 1U Rack Space o 72 TB using twelve x 6 TB Hard Drives Supports UP to 2 x Intel Xeon E5 Series Processors Supports Up to 256 GB ECC Enterprise Memory 10 GigE, 40 GigE and Infiniband Network Support Red Hat, CentOS, Ubuntu, MS Windows Compatible We welcome PSSC Labs to the Hortonworks Technology Partner Program. Their unique server platforms are already being deployed in Apache Hadoop environments and we look forward to deepening our relationship to help joint customers achieve even greater performance from their big data deployments. John Kreisa Hortonworks, Vice President

Big Data D12000 Sample Configurations Every organization or use case requires different computing needs. The Big Data D12000 offers the greatest flexibility possible. Below are three different proposed architectures for different workloads: high density storage, high computational requirements and a balanced configuration. Big Data D12000 High Density 48 TB Total Storage 12 Xeon E5 Processor Cores 64 GB ECC Memory 2 x 10GigE Network Ports 2 x GigE Network Ports Power Draw Estimate 215 Watt Idle / 275 Watt Full Load Big Data D12000 High Compute 24 TB Total Storage 20 Xeon E5 Processor Cores 256 GB ECC Memory 2 x 10GigE Network Ports 2 x GigE Network Ports Power Draw Estimate 265 Watt Idle / 345 Watt Full Load Big Data D12000 Balanced 36 TB Total Storage 16 Xeon E5 Processor Cores 128 GB ECC Memory 2 x 10GigE Network Ports 2 x GigE Network Ports Power Draw Estimate 230 Watt Idle / 315 Watt Full Load A Sample of Organizations Currently Using PSSC Labs Big Data D 12000 Servers

HDP Rax: Hortonworks Data Platform Validated Turn Key Cluster Solutions PSSC Labs offers a complete, turn-key cluster that is ready to run on delivery. PSSC Labs understands everything that is necessary for a successful deployment. All necessary hardware including servers, network equipment, power and infrastructure are included. PSSC Labs Cluster Engineers preconfigure network, storage, operating system and BIOS settings to the end user s specifications. Hortonworks Data Platform is installed at PSSC Labs factory. The final step is the running of sample data sets to ensure proper functionality and performance. Below is an overview of each different server platform PSSC Labs offers for Hortonworks Data Platform turn-key deployments. Depending on the complexity of the environment, some software resources can be installed on different server platforms. Big Data D12000 DATA NODE Tech Specs Key Features Software Resource o 1U High Density Form Factor o 2 x Intel Xeon E5 Processors o 12 x SATAIII or SAS Hard Drives or 14 x SSDs o 12 TB to 72 TB Storage Capacity o 32 GB to 256 GB ECC Memory o 2 x GigE Network Adapters o Optional 10 GigE, 40 GigE, Infiniband Support o Dedicated IPMI / ikvm o Enterprise Platform o Redundant Power Supply o Improved Data IO Throughput o 40% Reduction in Power Consumption o 2 x the Density of Standard Server o Flexible Configuration Options o 3 Year Warranty Included (24 x 7 x 365 NBD Available) o DataNode Daemon o Ganglia Monitor o Region Server o Node Manager o Supervisor SURESTORE U2000 MANAGEMENT NODE (NAME NODE & SECONDARY NAME NODE) Tech Specs Key Features Software Resources o 2 x Intel Xeon E5 Processors o 1 TB to 24 TB SATA III, SAS, SSD Hard Drives o 32 GB to 256 GB ECC Memory o 2 x GigE Network Adapters o Optional 10GigE, 40GigE, Infiniband Support o Dedicated IPMI / ikvm o Enterprise Platform o Redundant Power Supply o Redundant Storage o Raid Levels 0,1,5,6,10,50 o Flexible Configuration Options o 3 Year Warranty Included (24 x 7 x 365 NBD Available) o App Timeline Server o DRPC Server o Ganglia Monitor o HDFS Client o NameNode / Secondary NN o Oozie Server / Client o Yarn Client o Zookeeper Server / Client o HDFS Client o MySQL Server SURESTORE U1000 EDGE NODE Tech Specs Key Features Resources o 2 x Intel Xeon E5 Processors o 1 TB to 12 TB SATAIII, SAS, SSD o Enterprise Platform o Redundant Power Supply o Hive Server / Client o Tez Client Hard Drives o Redundant Storage o Nimbus o 32 GB to 256 GB ECC Memory o Raid Levels 0,1,5,6,10,50 o Nagios Server o 2 x GigE Network Adapters o Flexible Configuration Options o Optional 10GigE, 40GigE, Infiniband Support o 3 Year Warranty Included (24 x 7 x 365 NBD Available) o Dedicated IPMI / ikvm

HDP Rax Turn-Key Cluster Sample Configurations HDP Rax 150 250 TB Total Storage 2 Name Node 5 Data Nodes 60 Xeon E5 Processor Cores 320 GB ECC Memory 10GigE Network Backplane HDP Installation Service HDP Validation Service HDP Rax 500 500 TB Total Storage 2 Name Nodes 1 Edge Node 10 Data Nodes 160 Xeon E5 Processor Cores 1280 GB ECC Memory 10GigE Network Backplane HDP Installation Service HDP Validation Service Rack & Roll Service HDP Rax 1500 1500 TB Total Storage 2 Name Nodes 1 Edge Node 30 Data Nodes 480 Xeon E5 Processor Cores 3840 GB ECC Memory 10GigE Network Backplane HDP Installation Service HDP Validation Service Rack & Roll Service We believe strongly in our ability to deliver the highest performance, highest reliability server platforms to Hortonworks end users. Our experience delivering clusters ranging from several hundred TBs to several dozen PBs ensures a successful Hortonworks Data Platform deployment. Larry Lesser PSSC Labs, CTO

Total Cost of Ownership Comparison PSSC Labs goal is to offer solutions with the absolute lowest total cost of ownership. The below chart compares different server manufacturer s solution for a 1 Petabyte (raw) Hadoop environment. PSSC Labs HDP Rax 1000 requires 50% less rack space and consumes 40% less power Required Data Center Footprint Power Consumption Estimate* Required Power Circuits Pre-installation and Validation of Hortonworks Data Platform at Factory Onsite Physical Installation Cluster Management Training Dedicated Remote Monitoring Capabilities Hardware Warranty PSSC Labs HDP Rax 1000 for Dell Configuration for HP Configuration for 1PB Lenovo Configuration for 1PB Total Storage Space 1PB Total Storage Total Storage 1PB Total Storage Single x 42U Rack Two x 42U Rack Two x 42U Rack Two x 42U Rack 4300 Watts Total @ Idle 5500 Watts Total @ Load Single x 30 Amp / 208V / Single Phase Yes. Hortonworks Data Platform preinstalled and tested. Yes. Cluster arrives preracked, cabled and labeled. Yes. Yes. IPMI 2.0 Network Standard 3 Year NBD Service Available. 5800 Watts Total @ Idle 8500 Watts Total @ Load Two x 30 Amp / 208V / Single Phase 6000 Watts Total @ Idle 8800 Watts Total @ Load Two x 30 Amp / 208V / Single Phase and fees required. 5700 Watts Total @ Idle 8700 Watts Total @ Load Two x 30 Amp / 208V / Single Phase and fees required. and fees required. Yes. Yes. Yes. 3 Year NBD Service Available. 3 Year NBD Service Available. 3 Year NBD Service Available. *Dell, HP and Lenovo power estimates based on manufacturers website power draw estimates.