Cloud Computing. Big Data. High Performance Computing



Similar documents
Big Data for Big Science. Bernard Doering Business Development, EMEA Big Data Software

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Vendor Update Intel 49 th IDC HPC User Forum. Mike Lafferty HPC Marketing Intel Americas Corp.

Big Data. Value, use cases and architectures. Petar Torre Lead Architect Service Provider Group. Dubrovnik, Croatia, South East Europe May, 2013

Big Data One size doesn t fit all. Dr. Jean-Laurent Philippe, PhD Directeur Avant-Vente, Intel EMEA ICAR 2013

Near-Real-Time Big Data: Hadoop 效 能 最 佳 化 調 校 分 析 美 商 英 特 爾 亞 太 科 技 有 限 公 司 台 灣 分 公 司 鄭 智 成

Introducing the First Datacenter Atom SOC

Hur hanterar vi utmaningar inom området - Big Data. Jan Östling Enterprise Technologies Intel Corporation, NER

Unlocking the Intelligence in. Big Data. Ron Kasabian General Manager Big Data Solutions Intel Corporation

Real-Time Big Data Analytics for the Enterprise

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera

新 一 代 軟 體 定 義 的 網 路 架 構 Software Defined Networking (SDN) and Network Function Virtualization (NFV)

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Business opportunities from IOT and Big Data. Joachim Aertebjerg Director Enterprise Solution Sales Intel EMEA

Fast, Low-Overhead Encryption for Apache Hadoop*

Big Data Management and Security

The Foundation for Better Business Intelligence

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

The Future of Data Management

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

Big Data Performance Growth on the Rise

Infomatics. Big-Data and Hadoop Developer Training with Oracle WDP

Extended Attributes and Transparent Encryption in Apache Hadoop

Peers Techno log ies Pv t. L td. HADOOP

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Performance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Hadoop Ecosystem B Y R A H I M A.

HDP Hadoop From concept to deployment.

Certified Big Data and Apache Hadoop Developer VS-1221

COURSE CONTENT Big Data and Hadoop Training

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Next-Gen Big Data Analytics using the Spark stack

Upcoming Announcements

IBM Big Data Platform

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Introduction to Big data. Why Big data? Case Studies. Introduction to Hadoop. Understanding Features of Hadoop. Hadoop Architecture.

Implement Hadoop jobs to extract business value from large and varied data sets

Introducing EEMBC Cloud and Big Data Server Benchmarks

Enabling High performance Big Data platform with RDMA

Modernizing Your Data Warehouse for Hadoop

Cloud-based Analytics and Map Reduce

#TalendSandbox for Big Data

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data and Industrial Internet

The Future of Data Management with Hadoop and the Enterprise Data Hub

WHAT S NEW IN SAS 9.4

Programming Hadoop 5-day, instructor-led BD-106. MapReduce Overview. Hadoop Overview

Extending the Enterprise Data Warehouse with Hadoop Robert Lancaster. Nov 7, 2012

Accelerating and Simplifying Apache

Quick Reference Selling Guide for Intel Lustre Solutions Overview

Security in the Cloud

Deploying Hadoop with Manager

Open Source for Cloud Infrastructure

Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd s 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83 84

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!

High Performance Computing and Big Data: The coming wave.

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software

The Open Cloud Near-Term Infrastructure Trends in Cloud Computing

Workshop on Hadoop with Big Data

How Cisco IT Built Big Data Platform to Transform Data Management

ITG Software Engineering

Life With Big Data and the Internet of Things

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

GAIN BETTER INSIGHT FROM BIG DATA USING JBOSS DATA VIRTUALIZATION

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Qsoft Inc

Bringing Big Data to People

Information Architecture

BIG DATA - HADOOP PROFESSIONAL amron

IBM BigInsights for Apache Hadoop

Intel Xeon Processor E Product Family

Apache Hadoop: Past, Present, and Future

Virtualizing Apache Hadoop. June, 2012

Oracle Big Data SQL Technical Update

Cray XC30 Hadoop Platform Jonathan (Bill) Sparks Howard Pritchard Martha Dumler

Dell* In-Memory Appliance for Cloudera* Enterprise

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

HADOOP ADMINISTATION AND DEVELOPMENT TRAINING CURRICULUM

Intel Cloud Builder Guide: Cloud Design and Deployment on Intel Platforms

BIG DATA TRENDS AND TECHNOLOGIES

Complete Java Classes Hadoop Syllabus Contact No:

BIG DATA What it is and how to use?

HDP Enabling the Modern Data Architecture

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack

Constructing a Data Lake: Hadoop and Oracle Database United!

WHITE PAPER USING CLOUDERA TO IMPROVE DATA PROCESSING

Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies

Intel Cloud Builders Guide: Cloud Design and Deployment on Intel Platforms

Hadoop on Windows Azure: Hive vs. JavaScript for Processing Big Data

Big Data Analytics. Copyright 2011 EMC Corporation. All rights reserved.

Introduction to Big Data Training

Dominik Wagenknecht Accenture

Apache Hadoop: The Big Data Refinery

Transcription:

Cloud Computing Big Data High Performance Computing Intel Corporation copy right 2013

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase. Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported. SPEC, SPECint, SPECfp, SPECrate. SPECpower, SPECjAppServer, SPECjbb, SPECjvm, SPECWeb, SPECompM, SPECompL, SPEC MPI, SPECjEnterprise* are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Council. See http://www.tpc.org for more information. Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. Please check with your application vendor. Hyper-Threading Technology requires a computer system with a processor supporting HT Technology and an HT Technology-enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and software you use. For more information including details on which processors support HT Technology, see here Intel Turbo Boost Technology requires a Platform with a processor with Intel Turbo Boost Technology capability. Intel Turbo Boost Technology performance varies depending on hardware, software and overall system configuration. Check with your platform manufacturer on whether your system delivers Intel Turbo Boost Technology. For more information, see http://www.intel.com/technology/turboboost No computer system can provide absolute security under all conditions. Intel Trusted Execution Technology (Intel TXT) requires a computer system with Intel Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security. In addition, Intel TXT requires that the original equipment manufacturer provides TPM functionality, which requires a TPM-supported BIOS. TPM functionality must be initialized and may not be available in all countries. Intel AES-NI requires a computer system with an AES-NI enabled processor, as well as non-intel software to execute the instructions in the correct sequence. AES-NI is available on Intel Core i5-600 Desktop Processor Series, Intel Core i7-600 Mobile Processor Series, and Intel Core i5-500 Mobile Processor Series. For availability, consult your reseller or system manufacturer. For more information, see http://software.intel.com/en-us/articles/intel-advanced-encryption-standard-instructions-aes-ni/ Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor series, not across different processor sequences. See http://www.intel.com/products/processor_number for details. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. All dates and products specified are for planning purposes only and are subject to change without notice Copyrirht 2011 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. All dates and products specified are for planning purposes only and are subject to change without notice

Parviz Peiravi Principal Architect Enterprise Computing Enterprise Solution Sales Group Big Data & Cloud Strategy, Architecture and Design Intel Corporation Parviz Peiravi is a Principal Architect with Intel Corporation responsible for Enterprise Infrastructure solutions and design. He is primarily responsible for designing and driving development of Big Data, Service Oriented Architecture, Cloud computing architectures in support of Intel s focus areas within Enterprise Computing. Parviz has designed large scale Clusters using Oracle, Microsoft SQL Server, and IBM DB2, Data Grid and Cloud infrastructure using Grid, SOA and virtualization technologies. He has numerous certifications in Enterprise Architecture Framework, SOA, ITIL, XML\Web Services, VMware, Xen, Hadoop and Database design. His current focus is researching the application of Big Data, Virtualization, and SOA within Cloud Computing Infrastructure Framework. He is member of Cloud Computing Group, Cloud Security Alliance (CSA), DMTF, OGF, and IEEE. Parviz has been with Intel 16+ years and holds a degree in Computer and Electrical Engineering and a recipient of Intel Achievement Award (IAA). His e-mail address is parviz.peiravi@intel.com Recent Publication: The Business Value of Virtual Service Oriented Grids by Enrique Castro-leon, Jackson He, Mark Chang and Parviz Peiravi, Intel Press (2008), ISBN 978-1934053102. He is also editor in chief of Journey to Cloud emagazine published by Intel Corporation. www.intel.com

40 Zettabytes of data will be generated WW in 2020 1 Clients Cloud Richer user experiences Intelligent Systems Richer data to analyze 2.8 Zettabytes of data will be generated WW in 2012 1 (1) IDC Digital Universe 2020, (2) IDC Richer data from devices

Intel Powers the World's Fastest Supercomputer The National Supercomputing Center in Guangzhou supercomputer is called the Milky Way-2 system It is powered by a staggering 80,000 Intel processors: 32,000 12-core future generation Intel Xeon E5-2600 v2 processors and 48,000 Intel Xeon Phi coprocessors, for a total of 3,120,000 computing cores.

Large volume of Data, Not new to HPC High Velocity and Varity of Data, it is a game changer! Velocity; speed of generating and ingesting, processing, analyzing and delivery analytic services Variety; Multi-data source, Multi-Structured

Transforms Big Data from batch to real-time orientation Moves from static to dynamic, streaming operations requires a shift from simple to intelligent storage architecture to reap full Big Data benefits Move compute to data or data to compute as appropriate Balanced compute, storage and fabric performance is essential

Data Access User Authentication BI & Analytic as a Service Streaming Analytics BI Integrated Analytics with Hadoop Support PaaS Application Development Environment HPC as a Service Data as a Services Data ingestion, Integration and Processing Services Enterprise Infrastructure Services Public Cloud Private Cloud Human Genome & Drug Discovery Diagnostic Images Surveillance Medical and Medical Device Devices Streaming Data Data Sources Log Files Text, Video and Audio Social Media Medical Records Build on Secure and Open Cloud Architecture from ground up GIS

HPC Enabling exascale computing on massive data sets Cloud Helping enterprises build open interoperable clouds Open Source Contributing code and fostering ecosystem

File-based Encryption in HDFS Up to 20x faster decryption with AES-NI* MapReduce Role-based access control for Hadoop services Up to 8.5X faster Hive queries using HBase coprocessor Optimized for SSD with Cache Acceleration Software Adaptive replication in HDFS and HBase Instrument Aggregation Engine Report Engine Integrated text search with Lucene HiTune Controller Simplified deployment & comprehensive monitoring Deployment of HBase across multiple datacenters Automated configuration with Intel Active Tuner Detailed profiling of Hadoop jobs Simplified design of HBase schemas (+ in 2.4) REST APIs for deployment and management (+ in 2.4)

Flume 1.1.0 Log Data Collector Zookeeper 3.4.5 Coordination Sqoop 1.4.1 RDB Data Collector Big Data Intel s Distribution of Apache Hadoop Intel Hadoop Manager 3.x Deployment, Configuration, Monitoring, Alerting and Security Oozie 3.3.0 Data flow PL/SQL (compiler, planner, driver) HiveQL 0.10 Interactive Query Pig 0.11 Data manipulation Mahout 0.7 Data mining GraphLab Data mining framework R connectors statistics HCatalog 0.5 Metadata YARN (MRv2) Distributed Processing Framework Coprocessors Query execution engine HBase 0.96.x Real-time Distributed Big Table HDFS 2.0 Hadoop Distributed File System 13

Shared, parallel Lustre storage accelerates Big Data workloads Intel Manager for Lustre* Simplified, configuration monitoring and management Browser and CLI interfaces lowers complexity and costs Highly extensible storage plug-in design and REST API Lustre File System Proven sustained performance at massive scale Flexible, scale-up and scale-out solutions Intel leads the development and delivery of new features and releases Intel Support Services for Lustre Unrivaled Lustre support expertise Professional services offerings ensures success

Hadoop uses pluggable extensions to work with different file systems Lustre is POSIX compliant: Use Hadoop s built-in LocalFileSystem class Uses native file system support in Java Extend and override default file system behavior Optimizes the performance of the shuffle phase org.apache.hadoop.fs FileSystem RawLocalFileSystem LustreFileSystem

Advocacy Storage Compute Integration Development

Choice High Availability Secure Energy Efficient Responsive Compute Network Storage Hardware Datacenter Software Intel Xeon Product Family E3-E5-E7 Intel Atom Intel Xeon Phi TM Intel Ethernet Controllers Intel Ethernet Adapters Intel Ethernet Switch Silicon Intel True Scale Fabric Intelligent Storage 1 Scale-out Storage 1 Scale-up Storage 1 Intel SSD 710 series, DC S3700 (SATA) Intel Distribution for Apache Hadoop* Intel and Lustre* Intel SSD 910 series (PCIe) Intel s Foundational Technologies Offer Advanced Solutions for Big Data Analytics Xeon-based storage systems are available in a wide range of configuration options from the industry s leading storage vendors

Big Data http://www.intel.com/content/www/us/en/bigdata/big-data-analytics-turning-big-data-intointelligence.html Hadoop https://hadoop.intel.com/ Lustre* http://www.whamcloud.com/

Question & Respond! Parviz.peiravi@intel.com