Evolution from Big Data to Smart Data


Information is Exploding
Every minute of every day:
- 120 hours of video uploaded to YouTube
- 50,000 apps downloaded
- 204 million e-mails
Source: Intel Corporation, 2015

The Data is Changing
                   Performance Optimized   Capacity Optimized
Data Type          Structured              Unstructured
Record Size        Kilobytes or less       Megabytes to Terabytes
Data Updates       Frequent                Rare/never
Access Frequency   Heavy                   Light
Metadata           Fixed                   Variable
Scale Required     Up to Terabytes         Exabytes
"Unstructured data accounts for 70-80% of storage capacity growth." Ashish Nadkarni, IDC
Source: IDC, 2015. Copyright 2014 IDC.

The Industry Responds
1. Scale Out
2. Software Defined
3. Smart Data

Scale-Out Economics
Start Small, Scale Large
- Start from a single node (TBs) but have the ability to scale to multiple independent nodes (PBs)
- RAIN architecture
Granular Resource Scaling
- Add CPUs and storage independently as needed
- Take advantage of decreasing storage costs and increased storage densities

Software Defined Storage
HyperStore Smart Storage Platform
- Workloads: Modern Apps, Analytics, Deep Archive
- Interfaces: Object, HDFS, File
- Sites: New York, London, Tokyo
- 100% S3
- Always On
- Smart Protect
- Multi Datacenter
- Smart Policies
- Enterprise Grade

The Era of Smart Data Storage
Data storage = problem
- Passive
- Delayed analytics
- Static data
Smart data storage = solution
- Active
- Timely insight
- Meaning
- Actionable business value
Diagram labels: HyperStore, Analytics, Data, Information, Object Store

Smart Data: Analytics in Place
Data sources (Internet of Things, Big Data)
- Consumer activity (events, GPS, WiFi)
- Device tracking and logs
- Social media
- Result of analysis
Benefits: fast, efficient, cost efficient
- Faster time-to-decision
- Better business decisions
- Analyze more
Analytics on Cloudian HyperStore
- Event processing platform allows for efficient bulk data analysis in place
- No redundant storage of data
- HyperStore scales out with your data, adding nodes for I/O
- Takes advantage of multi-core CPUs, which makes sense for MapReduce
- Can feed smarter data to subsequent analytic systems

Cloudian & Hortonworks
Processing engines
- Batch: MapReduce
- Script: Pig
- SQL: Hive/Tez, HCatalog
- NoSQL: HBase, Accumulo
- Stream: Storm
- Search: Solr
- Others: In-Memory Analytics, ISV engines
YARN: Data Operating System
Storage (nodes 1 to N)
- HDFS
- S3 Native File System (URI scheme: s3n)
Deployment: Linux, Windows, On-Premise, Cloud
Operations
- HDFS shell commands
- File I/O operations
- Mass upload
- ETL with Pig
- Standard MapReduce
- Analysis with Hive

Analytics and Hadoop
Availability
- Peer-to-peer storage system
Locality
- Data center locality
- Can enforce constraints on the location of Hadoop data and maintain locality of reference for Hadoop
- Hadoop can be run on storage nodes
Efficiency
- Erasure coding for efficient bulk data storage
Scale
- Cluster on demand as needed, with dynamic rebalance
- Multi-part uploading to improve large object uploads
Rich metadata
Example: Pig can load filtered data directly from Cloudian HyperStore without passing through HDFS
    A = LOAD 's3n://bucket' USING CloudianStorage();
    B = FILTER A BY (time >= '2015/02/16') AND (time <= '2015/02/20');
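The efficiency claim for erasure coding can be made concrete with a little arithmetic. The sketch below is not taken from the slides: it assumes a generic scheme with k data fragments and m parity fragments, and compares it with keeping full copies. The 4+2 layout, the 3-copy baseline, and all function names are illustrative assumptions, not Cloudian settings.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return float(copies)


def erasure_coding_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per byte of user data for a k + m erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def raw_capacity_tb(user_data_tb: float, overhead: float) -> float:
    """Raw capacity needed to hold user_data_tb at the given overhead."""
    return user_data_tb * overhead


if __name__ == "__main__":
    user_tb = 100.0  # assumed amount of user data, for illustration only
    replicated = raw_capacity_tb(user_tb, replication_overhead(3))
    erasure_coded = raw_capacity_tb(user_tb, erasure_coding_overhead(4, 2))
    print(f"3 copies:     {replicated} TB raw")
    print(f"4+2 erasure:  {erasure_coded} TB raw")
```

Under these assumptions both layouts survive the loss of any two fragments or copies, while the erasure-coded layout needs half the raw capacity.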

Use Cases: Hadoop for Internet of Things
Clickstream data
- Analysis of what people click on in individual web pages, and in what order. Clickstream analysis can reveal how users research products and also how they complete their online purchases.
Sentiment data
- Unstructured data on opinions, emotions, and attitudes from sources like social media posts, blogs, online product reviews and customer support interactions. Organizations use sentiment analysis to understand how the public feels about something and track how those opinions change over time.
Server log data
- Large enterprises build, manage and protect their own proprietary, distributed information networks. Server logs are the computer-generated records that report data on the operations of those networks. When there is a problem, it's one of the first places the IT team looks for a diagnosis.
Sensor data
- From refrigerators and coffee makers to energy-measuring smart meters, sensor data is everywhere. It is created by the machinery that runs assembly lines and the cell towers that route our phone calls. It is net new data that is increasing exponentially in the information age.
Industries: Internet Marketing, Online Commerce, Retail, Media & Entertainment, IT Organizations, Customer Support, Manufacturing, Industrial
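As a rough illustration of the clickstream use case, the sketch below groups click events by session and orders them by time to recover the path each visitor took. The event layout (session id, timestamp in seconds, page) and the function name are assumptions made for this example; the slides do not specify a data format.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (session_id, timestamp_seconds, page)
Click = Tuple[str, int, str]


def click_paths(clicks: Iterable[Click]) -> Dict[str, List[str]]:
    """Return the time-ordered list of pages visited in each session."""
    by_session: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for session_id, timestamp, page in clicks:
        by_session[session_id].append((timestamp, page))
    # Sorting the (timestamp, page) pairs orders each session by time.
    return {
        session_id: [page for _, page in sorted(events)]
        for session_id, events in by_session.items()
    }


if __name__ == "__main__":
    sample = [
        ("s1", 30, "/cart"),
        ("s1", 10, "/home"),
        ("s2", 5, "/home"),
        ("s1", 20, "/product/42"),
        ("s2", 15, "/search"),
    ]
    for session_id, path in click_paths(sample).items():
        print(session_id, " -> ".join(path))
```

At cluster scale the same grouping and ordering would typically be expressed as a MapReduce, Pig, or Hive job rather than in-memory Python.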

Smart Support
Customer
- HyperStore appliances
- Telemetrics data
Cloudian
- Smart Support HyperStore appliances (S3n://bucket/)
- Hadoop cluster
- Smart Support analytics
- Cloudian Support

Cloudian HyperStore Platform

Multi-tenancy & QoS
Tenants: Tenant A, Tenant B, Tenant C
QoS limits
- Storage Bytes
- Storage Objects
- Requests per Min
- Inbound Bytes/Min
- Outbound Bytes/Min
Storage policies
- Data Placement
- Data Access
- Access Control
- Tiering
HyperStore Software Defined Storage
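The five QoS dimensions listed on this slide can be pictured as per-tenant limits checked before a write is accepted. The sketch below is a minimal illustration under that assumption. It is not Cloudian's API, and every class, field, and function name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QosLimits:
    """Hypothetical per-tenant limits, one field per QoS dimension on the slide."""
    storage_bytes: int
    storage_objects: int
    requests_per_min: int
    inbound_bytes_per_min: int
    outbound_bytes_per_min: int


@dataclass
class TenantUsage:
    """Hypothetical running counters for one tenant in the current minute."""
    storage_bytes: int = 0
    storage_objects: int = 0
    requests_this_min: int = 0
    inbound_bytes_this_min: int = 0
    outbound_bytes_this_min: int = 0


def allow_put(limits: QosLimits, usage: TenantUsage, object_size: int) -> bool:
    """Return True if storing one more object keeps the tenant within limits."""
    return (
        usage.storage_bytes + object_size <= limits.storage_bytes
        and usage.storage_objects + 1 <= limits.storage_objects
        and usage.requests_this_min + 1 <= limits.requests_per_min
        and usage.inbound_bytes_this_min + object_size <= limits.inbound_bytes_per_min
    )


if __name__ == "__main__":
    limits = QosLimits(1000, 10, 5, 500, 500)
    usage = TenantUsage(storage_bytes=900, storage_objects=3, requests_this_min=1)
    print(allow_put(limits, usage, 100))
    print(allow_put(limits, usage, 101))
```

Keeping limits and usage in separate objects lets the same limits be shared across tenants of one service tier while each tenant keeps its own counters.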

Your Choice of Deployment
Pre-configured software-defined storage arrays
- HSA Series
  - 1U and 2U models
  - Scales from 24TB to multiple PBs
  - Dedicated, all-in-one, on-premises storage
- FL3000 Series
  - Density-optimized appliance with PB-scale architecture
  - Seamless scalability on demand
  - 8 storage nodes in 8U
  - Hot plug everything
OR
Stand-alone software
Features
- Efficient data protection with compression, replication and erasure coding
- On-premises S3 with full support for all S3 ecosystem apps
- Dynamic data tiering
- Hadoop-ready
- Geo-replication
- Multi-tenant QoS controls
- Self-healing and auto-rebalancing

Cloudian HyperStore Smart Data Storage
Smart Storage Operations
- Smart Protect
- Proactive Repair
- Smart Tiering
Smart Storage Platform
- Smart Scale
- Software Defined
- Forever Storage Platform
Smart Storage Analytics
- Real Time Analytics
- Search & Discovery
- Smart Support

1c per GB per Month
- Webscale simplicity & economics
- Hybrid cloud
- Open architecture
- Enterprise ready
Visit Us: Booth 415