Object-based Storage in Big Data and Analytics. Ashish Nadkarni Research Director Storage IDC


2 Object-based Storage in Big Data and Analytics Ashish Nadkarni Research Director Storage IDC

3 IDC's definition of Big Data and Analytics (BDA)
- A mix of data, talent, technology, processes, and services that allows for effective management of potentially large volumes of multistructured and/or high-velocity data
- Enables a range of business intelligence and analytic applications to support tactical, operational, and strategic decision-making processes across the organization, delivering greater business value
Diagram layers: Applications; Analytics & Discovery Tools; Data Organization & Management Tools; Infrastructure

4 Size of the BDA Market
2013* Big Data: Software $2.5 B; Hardware $4.6 B; Services $3.8 B; Total WW $10.9 B; CAGR 32%
2013* Business Analytics: Software $38 B; Hardware $19 B; Services $47 B; Total WW $104 B; CAGR 10%

5 What does Big Data comprise?
Terms shown: Hadoop Ecosystem; NoSQL DBMSs; Analytic Applications; Rich Media Analytics; Advanced & Predictive Analytics; Value Added Content Providers; Content Analytics; Risk Management; Decision Management; Internet of Things; Visual Discovery; Skills and Org Structure; Data Warehousing; Big Data & Analytics Infrastructure
Source: Google Trends, August 22, 2013

6 Types of BD/A applications deployed
- The BD/A landscape is still dominated by commercial suppliers
- Open-source/community-based platforms are playing an increasingly dominant role
Q. What type of data analytics applications do you run on your data analytics infrastructure?

7 Big Data Infrastructure Workflow: An IDC View
Stages: Data IO Profiles; Supporting Infrastructure; Analytics Software; Value derivation loop
Data IO profiles (joined on the slide by and/or connectors):
- Data collected is in one or more formats and/or from one or more locations
- Data sampled is more than 100 TB at rest
- Data captured is via ultra-high-speed streaming
- Data generated is growing at >60% per year
Callout: Object Storage!
Supporting infrastructure: deployed on dynamically adaptable infrastructure
Analytics software: analyzed via batch, parallel and/or distributed processing framework
Value derivation loop: business value is continuously extracted and applied
Attributes: Value, Volume, Velocity, Variety; On-premise or cloud-based, On-demand, Resilient; Optimized, Intelligence, Self-tuned; Enduring, Time-sensitive, High-Impact, Business-Tangibility
Layers: Data; Infrastructure; Applications; Business Impact
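To make the growth criterion concrete, here is a minimal sketch of how a data set compounds at 60% per year. Pairing the 100 TB and 60% figures is an assumption made purely for illustration (the slide lists them as separate and/or criteria, and both are lower bounds), and the function name is hypothetical.

```python
def projected_capacity_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after compounding annual_growth (0.60 means 60%) for the given years."""
    return start_tb * (1.0 + annual_growth) ** years


# Assumed inputs: the 100 TB and 60% thresholds quoted on the slide.
projection = [round(projected_capacity_tb(100.0, 0.60, year), 1) for year in range(4)]
```

Under these assumptions the data set passes 400 TB within three years, which is the kind of growth the workflow treats as a sizing concern.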

8 Dynamic Infrastructure for Big Data
Diagram labels: Variable Computing; Compustorage; IT Efficiency; Capacity on demand; Mobile Analytics; Agile Business; Apps integration; SAP/Hana

9 What is object storage?
- Object platforms use APIs and REST-based object access methods for storing, retrieving, and looking up data
- They use a flat Account-Container-Object approach for data organization and placement
- They use a referable and programmable metadata repository that is tied to a global, geo-dispersed namespace
- Metadata includes creation, access, and modification dates (among others), permissions, security, application or file type, and other attributes
- Primary access is via an API; an object can be of arbitrary size (up to the maximum the object system allows) and carries variable-sized metadata (depending on the object system/service implementation)
- Many object platforms make use of NoSQL databases for data storage
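The flat Account-Container-Object layout and the programmable metadata described above can be sketched with a toy in-memory store. This is an illustration only, not any vendor's API: the class names, method names, and sample keys are all hypothetical, and a real platform would expose these operations over REST.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory store keyed by (account, container, object name)."""

    def __init__(self):
        # One flat dictionary: no directory tree, just three-part keys.
        self._objects = {}

    def put(self, account, container, name, data, **user_metadata):
        # System metadata is recorded alongside caller-supplied attributes.
        metadata = {
            "created": datetime.now(timezone.utc).isoformat(),
            "size": len(data),
            **user_metadata,
        }
        self._objects[(account, container, name)] = StoredObject(data, metadata)

    def get(self, account, container, name):
        return self._objects[(account, container, name)]

    def find(self, **criteria):
        """Return the keys of objects whose metadata matches every criterion."""
        return sorted(
            key
            for key, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        )


store = FlatObjectStore()
store.put("acme", "logs", "2013/08/22.gz", b"raw log bytes", content_type="application/gzip")
store.put("acme", "media", "intro.mp4", b"video bytes", content_type="video/mp4")
matches = store.find(content_type="video/mp4")
```

Note that the slash in "2013/08/22.gz" is just part of the object name; the namespace stays flat, and lookups go through metadata rather than a path hierarchy.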

10 The trifecta of object-storage platform suppliers
Supplier types: open-source or community-based; software ISVs (deliver appliances); traditional hardware suppliers
- OpenStack, Basho Riak CS, and Ceph are examples of open-source or community-based platforms
- Software ISVs include Scality, Cleversafe, and Amplidata
- Traditional suppliers include EMC, NetApp, and DDN

11 Most object storage platforms are built using shared nothing architectures
[Diagram: applications access a global namespace over CIFS, NFS, HTTP, or block protocols; the namespace spans autonomous nodes, each with its own CPU/memory and local persistent storage, at Site 1, Site 2, ... Site n]

12 Object platforms with unified data access
- Some of them also leverage a unified access model for flexible data ingestion capabilities
- Ceph is an example of a shared nothing architecture with unified (file/block/object) access
[Diagram: App (Direct) via LIBRADOS; App (REST) via RADOSGW (object access); Host or VM via RBD (block access); FS Client via CEPHFS (file access); all on top of the Ceph Object Store (RADOS), a distributed object-based storage layer made up of individual Ceph nodes, each with a local file system and CPU + memory]

13 Common terms used in conjunction with object-platforms
- Shared nothing architectures
- Data sharding
- Geo-dispersed namespaces
- Multi-master or master-less configurations
- Erasure coding (distributed vs. local)
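Two of the terms above, data sharding and erasure coding, can be illustrated with a short sketch. It assumes the simplest possible schemes: hash-modulo placement for sharding and a single XOR parity fragment for erasure coding, which survives the loss of any one fragment. Production object platforms typically use stronger codes such as Reed-Solomon and more elaborate placement; all function names here are hypothetical.

```python
import hashlib


def shard_for(key: str, node_count: int) -> int:
    """Pick a node for an object key by hashing it (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # zero-pad so fragments align
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = fragments[0]
    for frag in fragments[1:]:
        parity = xor_bytes(parity, frag)
    return fragments + [parity]


def recover(fragments: list, missing: int) -> bytes:
    """Rebuild one lost fragment by XOR-ing all surviving fragments."""
    survivors = [f for i, f in enumerate(fragments) if i != missing]
    rebuilt = survivors[0]
    for frag in survivors[1:]:
        rebuilt = xor_bytes(rebuilt, frag)
    return rebuilt


pieces = encode(b"object storage", 4)  # 4 data fragments + 1 parity fragment
node = shard_for("acme/media/intro.mp4", 8)
```

In the "distributed" variant the fragments would be placed on different nodes or sites, while "local" erasure coding keeps them within one node or site; the encoding step itself is the same in this sketch.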

14 Why object-storage in Big Data?
- BD/A environments deal with large unstructured data sets, analyzed using applications with distributed workloads (e.g. Map/Reduce)
- Many of these workloads run on commodity hardware nodes that are clustered together
- It is expensive to use traditional storage platforms for distributed analytics
- With most object storage systems/services you can specify anywhere from a few KB to GB of user-defined metadata
- This bodes well for workloads that need to analyze data of all types together

15 The growing influence of object-storage in Big Data
- IDC recently conducted a survey on storage patterns in Big Data
- More than 95% of respondents said that they had either deployed or had plans to deploy object storage for their BD/A infrastructure
- Of this group, 55% plan to deploy it exclusively as the storage platform for BD/A infrastructure
Q. Does your organization deploy or plan to deploy file- and object-based storage in your data analytics infrastructure?

16 Where data lives before it is processed
- 50% of respondents said that they plan to store preprocessed data in an on-premises or cloud-based object-platform
- The geo-distributed and latency-resilient access mechanisms of object storage platforms make them suitable for distributed analytics applications
Q. On what storage medium is data stored before it is processed?

17 Where data lives after it is processed
- More than 47% of respondents said that they plan to store post-processed data in an on-premises or cloud-based object-platform
- The ability to use off-the-shelf or commodity components to build object-platforms makes them economical platforms for post-processed data
Q. On what storage medium is data stored after it is processed?

18 The use of object-platforms as an archive tier
- Object platforms are excellent for archival of data: economical and compliant
- They can quickly adapt for in-place analytics
- Several object-platforms are getting certified in regulatory and compliance environments
Q. What kinds of data retention policies are in place in your data analytics infrastructure for the resulting data set after the raw data is processed/analyzed?

19 Growing popularity of object-platforms: platforms used in place of HDFS in Hadoop
- In a recent study conducted by IDC, respondents were asked: "Which independent file systems or database platforms do you use or plan to use with your Hadoop infrastructure as a replacement for or to augment HDFS (Hadoop data store)?"
- Amongst distributed file systems, object platforms made it to the list
Platforms listed: Ceph; OpenStack Swift; Cleversafe dsNet; Scality RING; Amazon S3

20 Object platforms and adjacent non-storage workloads (Compustorage)
Q. What were your reasons for selecting an alternative data store to augment or replace HDFS?
- Hadoop Map/Reduce is a common workload that can be run directly on object-platforms
- Many object-platform suppliers provide HDFS connectors or replacements
- These are preferred because of native deficiencies of HDFS (resiliency and performance in particular)

21 Essential Guidance
- The platform has to be open
- The platform is no good unless it enables solutions
- The platform itself cannot be a bottleneck (i.e. it is all about scale)
- It's all about geo-dispersed analytics
- Consider the workloads
- Don't forget the combination of structured and unstructured data
- Don't forget the objects

22 Thank You!
Ashish IDC
Visit us at IDC.com and follow us on

RED HAT STORAGE PORTFOLIO OVERVIEW

RED HAT STORAGE PORTFOLIO OVERVIEW RED HAT STORAGE PORTFOLIO OVERVIEW Andrew Hatfield Practice Lead Cloud Storage and Big Data MILCIS November 2015 THE RED HAT STORAGE MISSION To offer a unified, open software-defined storage portfolio

More information

I D C V E N D O R S P O T L I G H T

I D C V E N D O R S P O T L I G H T I D C V E N D O R S P O T L I G H T S t o r a g e T rends: Delive r i n g M a n a g e m ent and I n t e l l i g e n c e T h r o u g h C l o u d S e r vi c es April 2013 Adapted from IDC's Worldwide Storage

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

DDN updates object storage platform as it aims to break out of HPC niche

DDN updates object storage platform as it aims to break out of HPC niche DDN updates object storage platform as it aims to break out of HPC niche Analyst: Simon Robinson 18 Oct, 2013 DataDirect Networks has refreshed its Web Object Scaler (WOS), the company's platform for efficiently

More information

Data management challenges in todays Healthcare and Life Sciences ecosystems

Data management challenges in todays Healthcare and Life Sciences ecosystems Data management challenges in todays Healthcare and Life Sciences ecosystems Jose L. Alvarez Principal Engineer, WW Director Life Sciences jose.alvarez@seagate.com Evolution of Data Sets in Healthcare

More information

IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment

IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment IDC MarketScape IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment Ashish Nadkarni Amita Potnis THIS IDC MARKETSCAPE EXCERPT FEATURES: EMC IDC MARKETSCAPE FIGURE FIGURE 1 IDC

More information

Scale-Out File Systems on Object-Based Storage Platforms

Scale-Out File Systems on Object-Based Storage Platforms TECHNOLOGY ASSESSMENT Scale-Out File Systems on Object-Based Storage Platforms Ashish Nadkarni IN THIS EXCERPT The content for this excerpt was taken directly from Scale-Out File Systems on Object-Based

More information

Product Spotlight. A Look at the Future of Storage. Featuring SUSE Enterprise Storage. Where IT perceptions are reality

Product Spotlight. A Look at the Future of Storage. Featuring SUSE Enterprise Storage. Where IT perceptions are reality Where IT perceptions are reality Product Spotlight A Look at the Future of Storage Featuring SUSE Enterprise Storage Document # SPOTLIGHT2013001 v5, January 2015 Copyright 2015 IT Brand Pulse. All rights

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Introducing ScienceCloud

Introducing ScienceCloud Zentrale Informatik Introducing ScienceCloud Sergio Maffioletti IS/Cloud S3IT: Service and Support for Science IT Zurich, 10.03.2015 What are we going to talk about today? 1. Why are we building ScienceCloud?

More information

I D C T E C H N O L O G Y S P O T L I G H T. T i m e t o S c ale Out, Not Scale Up

I D C T E C H N O L O G Y S P O T L I G H T. T i m e t o S c ale Out, Not Scale Up I D C T E C H N O L O G Y S P O T L I G H T M a naging the Explosion of Enterprise Data: T i m e t o S c ale Out, Not Scale Up July 2014 Adapted from Scale-Out Meets Virtualization by Ashish Nadkarni,

More information

CloudStack and Big Data. Sebastien Goasguen @sebgoa May 22nd 2013 LinuxTag, Berlin

CloudStack and Big Data. Sebastien Goasguen @sebgoa May 22nd 2013 LinuxTag, Berlin CloudStack and Big Data Sebastien Goasguen @sebgoa May 22nd 2013 LinuxTag, Berlin Google trends Start of Clouds Cloud computing trending down, while Big Data is booming. Virtualization BigData on the Trigger

More information

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR

Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR 1 Agenda Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback 2 A World of Connected Devices Need a new data management architecture for Internet of Things 21% the % of

More information

IBM Enhances Portfolio by Acquiring Object-Based Storage Supplier Cleversafe

IBM Enhances Portfolio by Acquiring Object-Based Storage Supplier Cleversafe Flash IBM Enhances Portfolio by Acquiring Object-Based Storage Supplier Cleversafe Amita Potnis Richard L. Villars Laura DuBois Ashish Nadkarni IN THIS FLASH This IDC Flash discusses IBM's announcement

More information

Implementing Multi-Tenanted Storage for Service Providers with Cloudian HyperStore. The Challenge SOLUTION GUIDE

Implementing Multi-Tenanted Storage for Service Providers with Cloudian HyperStore. The Challenge SOLUTION GUIDE Implementing Multi-Tenanted Storage for Service Providers with Cloudian HyperStore COST EFFECTIVE SCALABLE STORAGE PLATFORM FOR CLOUD STORAGE SERVICES SOLUTION GUIDE The Challenge Service providers (MSPs/ISPs/ASPs)

More information

Enterprise Data Lake Platforms: Deep Storage for Big Data and Analytics

Enterprise Data Lake Platforms: Deep Storage for Big Data and Analytics Insight Enterprise Data Lake Platforms: Deep Storage for Big Data and Analytics Ashish Nadkarni Laura DuBois IDC OPINION In the past 18 months or so, the term data lakes has surfaced as yet another phrase

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

SUSE Enterprise Storage Highly Scalable Software Defined Storage. Gábor Nyers Sales Engineer @SUSE gnyers@suse.com

SUSE Enterprise Storage Highly Scalable Software Defined Storage. Gábor Nyers Sales Engineer @SUSE gnyers@suse.com SUSE Enterprise Storage Highly Scalable Software Defined Storage Gábor Nyers Sales Engineer @SUSE gnyers@suse.com Setting the Stage Enterprise Data Capacity Utilization 1-3% 15-20% 20-25% Tier 0 Ultra

More information

I D C T E C H N O L O G Y S P O T L I G H T

I D C T E C H N O L O G Y S P O T L I G H T I D C T E C H N O L O G Y S P O T L I G H T S o f tw a re - D e fined Storage: T h e N e x t - Generation Data Platform for the S o f tw a re - D e fined Datacenter July 2014 Adapted from IDC's Worldwide

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Distributed File Systems An Overview. Nürnberg, 30.04.2014 Dr. Christian Boehme, GWDG

Distributed File Systems An Overview. Nürnberg, 30.04.2014 Dr. Christian Boehme, GWDG Distributed File Systems An Overview Nürnberg, 30.04.2014 Dr. Christian Boehme, GWDG Introduction A distributed file system allows shared, file based access without sharing disks History starts in 1960s

More information

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data

CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with

More information

Evolution from Big Data to Smart Data

Evolution from Big Data to Smart Data Evolution from Big Data to Smart Data Information is Exploding 120 HOURS VIDEO UPLOADED TO YOUTUBE 50,000 APPS DOWNLOADED 204 MILLION E-MAILS EVERY MINUTE EVERY DAY Intel Corporation 2015 The Data is Changing

More information

Overview Copy Cop ri y g ri h g t h 2014 t EM E C Corp M orat C Corp i orat o i n. n. A l A ll l rig ri h g t h s t res s e res rved rv.

Overview Copy Cop ri y g ri h g t h 2014 t EM E C Corp M orat C Corp i orat o i n. n. A l A ll l rig ri h g t h s t res s e res rved rv. Overview 1 What the Heck is an Object anyway? File Object G:\folder1\Dog.jpg File Size ACL Date LAN Access SMB/CIFS/NFS Object ID + Metadata All File info plus: M Lady Beagle Likes Kids LAN/WAN/Mobile

More information

Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com

Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com WHITE PAPER Trends in Enterprise Hadoop Deployments Sponsored by: Red Hat Ashish Nadkarni October 2013 Laura DuBois IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200

More information

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES

HYPER-CONVERGED INFRASTRUCTURE STRATEGIES 1 HYPER-CONVERGED INFRASTRUCTURE STRATEGIES MYTH BUSTING & THE FUTURE OF WEB SCALE IT 2 ROADMAP INFORMATION DISCLAIMER EMC makes no representation and undertakes no obligations with regard to product planning

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

IBM Spectrum Protect in the Cloud

IBM Spectrum Protect in the Cloud IBM Spectrum Protect in the Cloud. Disclaimer IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM s sole discretion. Information regarding

More information

How To Understand And Understand Cyber Group

How To Understand And Understand Cyber Group Buyer Case Study Cyber Group Deploys EMC ViPR for Next-Generation SaaS Application Infrastructure Laura DuBois IDC OPINION 3rd Platform computing has given rise to massive scale datacenters architected

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Sep 23, 2014. OSBCONF 2014 Cloud backup with Bareos

Sep 23, 2014. OSBCONF 2014 Cloud backup with Bareos Sep 23, 2014 OSBCONF 2014 Cloud backup with Bareos OSBCONF 23/09/2014 Content: Who am I Quick overview of Cloud solutions Bareos and Backup/Restore using Cloud Storage Bareos and Backup/Restore of Cloud

More information

SUSE Storage. FUT7537 Software Defined Storage Introduction and Roadmap: Getting your tentacles around data growth. Larry Morris

SUSE Storage. FUT7537 Software Defined Storage Introduction and Roadmap: Getting your tentacles around data growth. Larry Morris SUSE FUT7537 Software Defined Introduction and Roadmap: Getting your tentacles around data growth Larry Morris Sr. Product Manager lmorris@suse.com AGENDA Enterprise Market SUSE Product SUSE Solutions

More information

THE FUTURE OF STORAGE IS SOFTWARE DEFINED. Jasper Geraerts Business Manager Storage Benelux/Red Hat

THE FUTURE OF STORAGE IS SOFTWARE DEFINED. Jasper Geraerts Business Manager Storage Benelux/Red Hat THE FUTURE OF STORAGE IS SOFTWARE DEFINED Jasper Geraerts Business Manager Storage Benelux/Red Hat THE FUTURE OF STORAGE Traditional Storage Complex proprietary silos Open, Software-Defined Storage Standardized,

More information

Growth of Unstructured Data & Object Storage. Marcel Laforce Sr. Director, Object Storage

Growth of Unstructured Data & Object Storage. Marcel Laforce Sr. Director, Object Storage Growth of Unstructured Data & Object Storage Marcel Laforce Sr. Director, Object Storage Agenda Unstructured Data Growth Contrasting approaches: Objects, Files & Blocks The Emerging Object Storage Market

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

Hadoop Architecture. Part 1

Hadoop Architecture. Part 1 Hadoop Architecture Part 1 Node, Rack and Cluster: A node is simply a computer, typically non-enterprise, commodity hardware for nodes that contain data. Consider we have Node 1.Then we can add more nodes,

More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

Building Out Your Cloud-Ready Solutions. Clark D. Richey, Jr., Principal Technologist, DoD

Building Out Your Cloud-Ready Solutions. Clark D. Richey, Jr., Principal Technologist, DoD Building Out Your Cloud-Ready Solutions Clark D. Richey, Jr., Principal Technologist, DoD Slide 1 Agenda Define the problem Explore important aspects of Cloud deployments Wrap up and questions Slide 2

More information

IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment

IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment IDC MarketScape IDC MarketScape Excerpt: Worldwide Object-Based Storage 2013 Vendor Assessment Ashish Nadkarni Amita Potnis THIS IDC MARKETSCAPE EXCERPT FEATURES: AMPLIDATA IDC MARKETSCAPE FIGURE FIGURE

More information

Building Storage as a Service with OpenStack. Greg Elkinbard Senior Technical Director

Building Storage as a Service with OpenStack. Greg Elkinbard Senior Technical Director Building Storage as a Service with OpenStack Greg Elkinbard Senior Technical Director MIRANTIS 2012 PAGE 1 About the Presenter Greg Elkinbard Senior Technical Director at Mirantis Builds on demand IaaS

More information

Balboa Park Online Collaborative Deploys the Exablox OneBlox Solution: Achieves Cost and Management Savings

Balboa Park Online Collaborative Deploys the Exablox OneBlox Solution: Achieves Cost and Management Savings BUYER CASE STUDY Balboa Park Online Collaborative Deploys the Exablox OneBlox Solution: Achieves Cost and Management Savings Laura DuBois Ashish Nadkarni IDC OPINION Global Headquarters: 5 Speen Street

More information

Solid State Storage in the Evolution of the Data Center

Solid State Storage in the Evolution of the Data Center Solid State Storage in the Evolution of the Data Center Trends and Opportunities Bruce Moxon CTO, Systems and Solutions stec Presented at the Lazard Capital Markets Solid State Storage Day New York, June

More information

Introduction to Cloud : Cloud and Cloud Storage. Lecture 2. Dr. Dalit Naor IBM Haifa Research Storage Systems. Dalit Naor, IBM Haifa Research

Introduction to Cloud : Cloud and Cloud Storage. Lecture 2. Dr. Dalit Naor IBM Haifa Research Storage Systems. Dalit Naor, IBM Haifa Research Introduction to Cloud : Cloud and Cloud Storage Lecture 2 Dr. Dalit Naor IBM Haifa Research Storage Systems 1 Advanced Topics in Storage Systems for Big Data - Spring 2014, Tel-Aviv University http://www.eng.tau.ac.il/semcom

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

VMware Enriches vcloud Air Services with Object Storage

VMware Enriches vcloud Air Services with Object Storage VENDOR PROFILE VMware Enriches vcloud Air Services with Object Storage Amita Potnis IDC OPINION In today's mobile world, managing and maintaining data has become a top concern for IT departments. Businesses

More information

D e c e n t r a lized Scale - Out Ar c h i t e c t u r e s

D e c e n t r a lized Scale - Out Ar c h i t e c t u r e s I D C T E C H N O L O G Y S P O T L I G H T Object-Based Storage: The Need for D e c e n t r a lized Scale - Out Ar c h i t e c t u r e s February 2015 Adapted from IDC MarketScape: Worldwide Object-Based

More information

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage

Clodoaldo Barrera Chief Technical Strategist IBM System Storage. Making a successful transition to Software Defined Storage Clodoaldo Barrera Chief Technical Strategist IBM System Storage Making a successful transition to Software Defined Storage Open Server Summit Santa Clara Nov 2014 Data at the core of everything Data is

More information

<Insert Picture Here> Big Data

<Insert Picture Here> Big Data Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

HGST Object Storage for a New Generation of IT

HGST Object Storage for a New Generation of IT Enterprise Strategy Group Getting to the bigger truth. SOLUTION SHOWCASE HGST Object Storage for a New Generation of IT Date: October 2015 Author: Scott Sinclair, Storage Analyst Abstract: Under increased

More information

IBM ELASTIC STORAGE SEAN LEE

IBM ELASTIC STORAGE SEAN LEE IBM ELASTIC STORAGE SEAN LEE Solution Architect Platform Computing Division IBM Greater China Group Agenda Challenges in Data Management What is IBM Elastic Storage Key Features Elastic Storage Server

More information

Object Storage: Out of the Shadows and into the Spotlight

Object Storage: Out of the Shadows and into the Spotlight Technology Insight Paper Object Storage: Out of the Shadows and into the Spotlight By John Webster December 12, 2012 Enabling you to make the best technology decisions Object Storage: Out of the Shadows

More information

NextGen Infrastructure for Big DATA Analytics.

NextGen Infrastructure for Big DATA Analytics. NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conven4onal database systems. The data is too big, moves too fast, or doesn t fit the structures

More information

BIG DATA-AS-A-SERVICE

BIG DATA-AS-A-SERVICE White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

EMC IRODS RESOURCE DRIVERS

EMC IRODS RESOURCE DRIVERS EMC IRODS RESOURCE DRIVERS PATRICK COMBES: PRINCIPAL SOLUTION ARCHITECT, LIFE SCIENCES 1 QUICK AGENDA Intro to Isilon (~2 hours) Isilon resource driver Intro to ECS (~1.5 hours) ECS Resource driver Possibilities

More information

SUSE Enterprise Storage Highly Scalable Software Defined Storage. Māris Smilga

SUSE Enterprise Storage Highly Scalable Software Defined Storage. Māris Smilga SUSE Enterprise Storage Highly Scalable Software Defined Storage āris Smilga Storage Today Traditional Storage Arrays of disks with RAID for redundancy SANs based on Fibre Channel connectivity Total System

More information

Building Storage-as-a-Service Businesses

Building Storage-as-a-Service Businesses White Paper Service Providers Greatest New Growth Opportunity: Building Storage-as-a-Service Businesses According to 451 Research, Storage as a Service represents a large and rapidly growing market with

More information

Traditional BI vs. Business Data Lake A comparison

Traditional BI vs. Business Data Lake A comparison Traditional BI vs. Business Data Lake A comparison The need for new thinking around data storage and analysis Traditional Business Intelligence (BI) systems provide various levels and kinds of analyses

More information

StorReduce Technical White Paper Cloud-based Data Deduplication

StorReduce Technical White Paper Cloud-based Data Deduplication StorReduce Technical White Paper Cloud-based Data Deduplication See also at storreduce.com/docs StorReduce Quick Start Guide StorReduce FAQ StorReduce Solution Brief, and StorReduce Blog at storreduce.com/blog

More information

Building low cost disk storage with Ceph and OpenStack Swift

Building low cost disk storage with Ceph and OpenStack Swift Background photo from: http://edelomahony.com/2011/07/25/loving-money-doesnt-bring-you-more/ Building low cost disk storage with Ceph and OpenStack Swift Paweł Woszuk, Maciej Brzeźniak TERENA TF-Storage

More information

Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC,

Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC, Session 0202: Big Data in action with SAP HANA and Hadoop Platforms Prasad Illapani Product Management & Strategy (SAP HANA & Big Data) SAP Labs LLC, Bellevue, WA Legal disclaimer The information in this

More information

Hadoop on OpenStack Cloud. Dmitry Mescheryakov Software Engineer, @MirantisIT

Hadoop on OpenStack Cloud. Dmitry Mescheryakov Software Engineer, @MirantisIT Hadoop on OpenStack Cloud Dmitry Mescheryakov Software Engineer, @MirantisIT Agenda OpenStack Sahara Demo Hadoop Performance on Cloud Conclusion OpenStack Open source cloud computing platform 17,209 commits

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

IBM Storage Technical Strategy and Trends

9.3.2016. Dr. Robert Haas, CTO Storage Europe, IBM, rha@zurich.ibm.com. 2016 International Business Machines Corporation. Cognitive Computing: Technologies that will

21st Century Storage: What's New and What's Changing

Randy Kerns, Senior Strategist, Evaluator Group. Overview: New technologies in storage - Continued evolution - Each has great economic value - Differing

EMC BACKUP MEETS BIG DATA

Strategies To Protect Greenplum, Isilon And Teradata Systems. Agenda: Big Data: Overview, Backup and Recovery; EMC Big Data Backup Strategy; EMC Backup and Recovery Solutions for

WHITEPAPER. Network-Attached Storage in the Public Cloud. Introduction. Red Hat Storage for Amazon Web Services

Introduction: Cloud computing represents a major transformation in the way enterprises deliver a wide array

DreamObjects. Cloud Object Storage Powered by Ceph. Monday, November 5, 12

This slide is all about me, me, me. Ross Turk Community Manager, Ceph VP Community, Inktank ross@inktank.com @rossturk inktank.com ceph.com DreamHost

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

Big Data Is Driving Key Business Initiatives: Increase profitability, innovation, customer satisfaction,

Testing 3Vs (Volume, Variety and Velocity) of Big Data

A lot happens in the Digital World in 60 seconds. What is Big Data: Big Data refers to data sets whose size is beyond the ability of commonly used

Big Data Explained. An introduction to Big Data Science.

Presentation Agenda: What is Big Data; Why learn Big Data; Who is it for; How to start learning Big Data; When to learn it; Objective and Benefits of

Hadoop IST 734 SS CHUNG

Introduction. What is Big Data? Bulk amount, unstructured. Lots of applications need to handle huge amounts of data (on the order of 500+ TB per day). If a regular machine needs to

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Wayne W. Eckerson, Director of Research, TechTarget; Founder, BI Leadership Forum. Business Analytics

POSIX and Object Distributed Storage Systems

Performance Comparison Studies With Real-Life Scenarios in an Experimental Data Taking Context Leveraging OpenStack Swift & Ceph, by Michael Poat, Dr. Jerome

White Paper. EMC Isilon: A Scalable Storage Platform for Big Data. April 2014

By Nik Rouda, Senior Analyst and Terri McClure, Senior Analyst. This ESG White Paper was commissioned by EMC Isilon and is distributed

Virtualizing Apache Hadoop. June, 2012

Table of Contents: EXECUTIVE SUMMARY; INTRODUCTION; VIRTUALIZING APACHE HADOOP; INTRODUCTION TO VSPHERE; USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP; MYTHS ABOUT RUNNING

Big Data Analytics on Object Storage -- Hadoop over Ceph* Object Storage with SSD Cache

David Cohen (david.e.cohen@intel.com), Yuan Zhou (yuan.zhou@intel.com), Jun Sun (jun.sun@intel.com), Weiting Chen (weiting.chen@intel.com)

Executive Summary. Introduction. Defining Big Data. The Importance of Big Data. Building a Big Data Platform.

Infrastructure Requirements. Solution Spectrum. Oracle's Big Data

Big Data on AWS. Services Overview. Bernie Nallamotu Principal Solutions Architect

So what is it? When your data sets become so large that you have to start innovating around how to collect, store, organize, analyze

Big Data, Cloud Computing, Spatial Databases Steven Hagan Vice President Server Technologies

Big Data: Global Digital Data Growth. Growing leaps and bounds by 40+% Year over Year! 2009 = .8 Zettabytes = .08

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

Sneha D. Borkar 1, Prof. Chaitali S. Surtakar 2. Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com. Assistant Professor, Information

Non-Stop Hadoop Paul Scott-Murphy VP Field Technical Service, APJ. Cloudera World Japan November 2014

WANdisco Background. WANdisco: Wide Area Network Distributed Computing. Enterprise ready, high availability

Search and Real-Time Analytics on Big Data

Sewook Wee, Ryan Tabora, Jason Rutherglen. Accenture & Think Big Analytics. Strata New York, October 2012. Big Data: data becomes your core asset. It realizes its

Market Landscape Report

Object Storage. By Scott Sinclair, Analyst. June 2015. Contents: What Is Object Storage? Limitations of RAID and the Need for Object Storage...

Protecting Big Data Data Protection Solutions for the Business Data Lake

White Paper. Abstract: Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With

Nexenta looks to expand TAM with scale-out object storage play, IPO on horizon

Analyst: Simon Robinson. 5 Feb, 2015. As the idea of the 'software defined datacenter' begins to take hold in the industry at

Caringo Swarm 7: beyond the limits of traditional storage. A new private cloud foundation for storage needs at scale

Prepared for: Caringo. May 2014. TABLE OF CONTENTS: 1 EXECUTIVE SUMMARY

WHITE PAPER. Software Defined Storage Hydrates the Cloud

Table of Contents: Overview; NexentaStor (Block & File Storage); Software Defined Data Centers (SDDC); OpenStack; CloudStack

Big Data Technologies Compared June 2014

Agenda: What is Big Data; Big Data Technology Comparison; Summary; Other Big Data Technologies; Questions. What is Big Data by Example: The SKA Telescope is a new development

TECHNICAL WHITE PAPER: ELASTIC CLOUD STORAGE SOFTWARE ARCHITECTURE

Deploy a modern hyperscale storage platform on commodity infrastructure. ABSTRACT: This document provides a detailed overview of the EMC

Lecture 32 Big Data. 1. Big Data problem 2. Why the excitement about big data 3. What is MapReduce 4. What is Hadoop 5. Get started with Hadoop

Big Data Problems: Data explosion. Data from users on social

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components

Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components of Hadoop. We will see what types of nodes can exist in a Hadoop

Next-Generation Cloud Analytics with Amazon Redshift

What's inside: Introduction. Why Amazon Redshift is Great for Analytics. Cloud Data Warehousing Strategies for Relational Databases. Analyzing Fast, Transactional

The Evolution of High-Performance Computing Storage Architectures in Commercial Environments

WHITE PAPER. Prepared by: Eric Slack, Senior Analyst. May 2014. The Evolution of HPC Storage Architectures

SCALABLE FILE SHARING AND DATA MANAGEMENT FOR INTERNET OF THINGS

Sean Lee, Solution Architect, SDI, IBM Systems. Agenda: Converging Technology Forces; New Generation Applications; Data Management Challenges

Information Architecture

The Bloor Group. WHITE PAPER: Actian and The Big Data Information Architecture. The Actian Big Data Information Architecture. Originally founded in 2005 to

WHITE PAPER. www.fusionstorm.com. Get Ready for Big Data:

WHITE PAPER: Easing the Way to the Cloud. WHITE PAPER: Get Ready for Big Data: How Scale-Out NAS Delivers the Scalability, Performance, Resilience and Manageability that Big Data Environments Demand

WOS. High Performance Object Storage

Datasheet. The Big Data explosion brings both challenges and opportunities to businesses across all industry verticals. Providers of online services are building infrastructures