HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics
- Cory Wilkinson
- 10 years ago
SOLUTION OVERVIEW

ESSENTIALS

EMC ISILON
- Use the industry's first and only scale-out NAS solution with native Hadoop support
- Reduce costs and accelerate results with in-place data analytics
- Increase efficiency with over 80 percent storage utilization and data deduplication
- Support multiple Hadoop versions and instances simultaneously
- Gain added operational flexibility with multiprotocol support
- Achieve fast, easy scalability to over 20 PB in a single Isilon cluster

CLOUDERA ENTERPRISE
- Unified: one integrated system, bringing diverse users and application workloads to one pool of data on a common infrastructure, with no data movement required
- Secure: perimeter security, authentication, granular authorization, and data protection
- Governed: enterprise-grade data auditing, data lineage, and data discovery
- Managed: native high availability, fault tolerance, and self-healing storage; automated backup and disaster recovery; and advanced system and data management
- Open: Apache-licensed open source to ensure that your data and applications remain yours, and an open platform that connects with all of your existing investments in technology and skills

THE BIG DATA OPPORTUNITY
The rapid growth of data represents a significant challenge for many enterprises across a wide range of industries today. Many organizations are realizing that Big Data is a valuable asset that can be leveraged to uncover new opportunities to accelerate their businesses and gain a competitive advantage. Central to realizing this opportunity is Hadoop, an innovative Big Data analytics engine designed specifically to analyze large-scale datasets.

CHALLENGES WITH TRADITIONAL HADOOP DEPLOYMENTS
While most companies start out using Hadoop to capture and store data for future analysis in one consolidated pool, many already own network-attached storage (NAS)- or storage area network (SAN)-based storage systems that exist in silos across the organization. How do you use these existing storage systems without a separate capital investment and added management resources? How do organizations work around:
- Inefficient storage with poor utilization and difficult management due to silos
- Manual ingest of large datasets into Hadoop, which is time and resource consuming
- Difficulty accessing or sharing data and analytics results across the organization, as most systems do not support Hadoop Distributed File System (HDFS)

COMBINED ISILON AND CLOUDERA APPROACH
EMC Isilon together with Cloudera provides a complete, tested, and popular solution encompassing the Apache Hadoop distribution and related projects. The combination of Isilon Big Data shared storage and the Cloudera Enterprise data management platform helps organizations speed up time to insights, enforce consistent security, enable multiprotocol access, and eliminate storage silos with a powerful yet simple, efficient, scalable, and complete solution.

Isilon is the first and only scale-out NAS platform with native HDFS support in addition to traditional Server Message Block (SMB), network file system (NFS), HTTP, and FTP. This enables organizations to deploy one shared storage system that works across traditional, new, and emerging workloads. In addition to delivering the core elements of Hadoop, Cloudera Enterprise focuses on making the Big Data management platform secure, managed, governed, and open. Cloudera Enterprise is a thoroughly tested, documented, and supported solution that takes the guesswork out of building the Hadoop deployment. Together Isilon and Cloudera provide a comprehensive scalable storage and distributed computing solution to meet most enterprise Big Data analytics needs.

EMC ISILON SHARED STORAGE
Isilon combines a powerful yet simple, highly efficient, and massively scalable storage platform with integrated support for Hadoop analytics. Isilon native support for HDFS allows you to quickly implement an in-place data analytics solution and avoid unnecessary capital expenditures, increased operational costs, and time-consuming replication of your Big Data to a separate infrastructure. Simply connect your analytics compute resources to your Isilon storage system, and you're ready to begin your analytics projects immediately.

GAIN FASTER TIME TO INSIGHTS
Isilon's in-place data analytics approach allows you to eliminate the time and resources required to replicate your Big Data set into a separate Hadoop infrastructure.
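As a rough illustration of that replication cost, the following Python sketch estimates how long a bulk copy takes over a network link, using the 100 TB dataset and 10 Gb line that this section cites. The function name `copy_time_hours`, the decimal unit convention (1 TB = 10^12 bytes, 1 Gb = 10^9 bits), and the 90 percent link efficiency are assumptions made for illustration; they are not figures from EMC or Cloudera.

```python
def copy_time_hours(dataset_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to copy dataset_tb terabytes over a link_gbps link.

    efficiency is the assumed fraction of the nominal line rate that is
    actually usable for payload (1.0 = ideal, no protocol overhead).
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    total_bits = dataset_tb * 1e12 * 8            # decimal terabytes to bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / usable_bits_per_second / 3600


ideal = copy_time_hours(100, 10)                  # full line rate
with_overhead = copy_time_hours(100, 10, 0.9)     # assumed 90% usable throughput

print(f"Ideal line rate:       {ideal:.1f} hours")          # 22.2 hours
print(f"With assumed overhead: {with_overhead:.1f} hours")  # 24.7 hours
```

Even at the full nominal line rate the copy takes roughly 22 hours, and any realistic protocol or contention overhead pushes it past a full day, before any analysis can start.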
For example, it can take over 24 hours to copy 100 TB of data over a 10 Gb/s network link. Instead, with Isilon, you can initiate data analytics projects immediately and get to results quickly.

ENJOY INCREASED FLEXIBILITY
Isilon supports multiple instances and multiple versions of Apache Hadoop distributions simultaneously. This allows you to leverage the tools you need for each of your unstructured data analytics projects. In addition to native HDFS 1.0 and HDFS 2.0 support, Isilon solutions include integrated support for a wide range of industry-standard protocols, including NFS, SMB, HTTP, FTP, SWIFT, and REST-based object access.

GAIN MASSIVE SCALABILITY
With Isilon, you can have massive room for growth for your unstructured data assets and related analytics projects. Isilon scales from 18 TB to over 20 PB of capacity in a single Isilon cluster. The EMC Isilon OneFS operating system allows a storage system to grow symmetrically or independently as more space or processing power is required. This provides a true grow-as-you-go approach and the ability to scale out as business needs dictate. With Isilon, you can scale capacity and performance.

PROTECT YOUR BIG DATA ASSETS
Isilon provides unsurpassed levels of data protection and availability to meet a wide range of enterprise needs. OneFS enables all nodes in the Isilon storage cluster to become NameNodes, thus improving the resiliency of your Hadoop environment. Isilon also offers end-to-end data protection options for fast and efficient data backup and recovery. You can schedule snapshots as frequently as needed to meet your specific recovery point objectives. For reliable disaster recovery protection, Isilon provides fast data replication, along with push-button failover and failback simplicity, to further increase the availability of your data assets.

SECURE YOUR BIG DATA ASSETS
To help you meet regulatory compliance and corporate governance requirements, Isilon offers robust security options, including file system auditing and write once, read many (WORM) data protection to prevent accidental or malicious alteration or deletion. With Isilon, you can also provide secure role separation between storage administration and file system access, as well as authentication zones, to create secure, isolated storage pools for specific departments within your organization.

IMPROVE STORAGE UTILIZATION
With Isilon, you can consolidate your storage infrastructure, including file, semistructured, and unstructured data assets. With a storage utilization rate of over 80 percent, which can be further improved by up to 35 percent with EMC Isilon SmartDedupe data deduplication that eliminates redundant data, you need less storage capacity and physical space to house your dataset, reducing both initial capital outlay and ongoing operating costs.

CLOUDERA ENTERPRISE
Cloudera Enterprise helps you become information driven by leveraging the best of the open source community with the enterprise capabilities you need to succeed with Apache Hadoop in your organization.
Designed specifically for mission-critical environments, Cloudera Enterprise includes CDH, the world's most popular open source Hadoop-based platform, as well as advanced system management and data management tools, plus dedicated support and community advocacy from our world-class team of Hadoop developers and experts. Cloudera is your partner on the path to Big Data.

RETHINK DATA MANAGEMENT
Cloudera Enterprise is designed to function as an enterprise data hub. It is:
- One massively scalable platform to store any amount or type of data, in its original form, for as long as desired or required
- Integrated with your existing infrastructure and tools
- Flexible to run a variety of enterprise workloads including batch processing, interactive SQL, enterprise search, and advanced analytics
- Robust, with the security, governance, data protection, and management that enterprises require

With Cloudera Enterprise, today's leading organizations can put their data at the center of their operations to increase business visibility and reduce costs while successfully managing risk and compliance requirements.

ONLINE NOSQL: HBASE
HBase is a distributed key-value store that helps you build real-time applications on massive tables with billions of rows and millions of columns with fast random access. When you deploy HBase as part of Cloudera Enterprise Flex Edition or Data Hub Edition as part of an enterprise data hub, you can rely on our market-leading technical support for HBase, as well as actively influence the future of the project.

ANALYTIC SQL: CLOUDERA IMPALA
Cloudera Impala is the industry's leading massively parallel processing (MPP) SQL query engine that runs natively in Apache Hadoop. With Impala, you can enable analysts and data scientists to directly interact with any data stored in Hadoop using their existing business intelligence (BI) tools and skills through an industry-standard SQL interface. You can also offload self-service business intelligence to Hadoop, relieving the burden on existing analytical databases and reducing your BI backlog.

SEARCH: CLOUDERA SEARCH
Cloudera Search lets your users query and browse data in Hadoop just as they would search Google or their favorite ecommerce site. Powered by Apache Hadoop and Apache Solr, the enterprise standard for open source search, Cloudera Search brings scale and reliability for a new generation of integrated, multiworkload search. Through its unique integrations with Cloudera Enterprise, Cloudera Search gains the same fault tolerance, scale, visibility, security, and flexibility provided to other enterprise data hub workloads.
IN-MEMORY MACHINE LEARNING AND STREAM PROCESSING: APACHE SPARK
Apache Spark (incubating) is an open source, parallel data processing framework that complements Apache Hadoop to make it easy to develop fast, unified Big Data applications combining batch, streaming, and interactive analytics on all your data.

DATA MANAGEMENT: CLOUDERA NAVIGATOR
Cloudera Navigator is the only native end-to-end governance solution for Apache Hadoop-based systems. Through a single user interface, it provides visibility for administrators, data managers, data scientists, and analysts to secure, govern, and explore the large amounts of diverse data that land in Hadoop. Cloudera Navigator is part of Cloudera Enterprise's comprehensive data security and governance offering and is key to meeting compliance and regulatory requirements.

SUMMARY
EMC Isilon together with Cloudera provides a Big Data storage and analytics solution that is powerful yet simple and highly efficient, with a massively scalable shared storage platform and an industry-leading Hadoop analytics distribution. With Isilon, you can streamline your storage infrastructure by consolidating large-scale file and unstructured data assets, eliminating silos of storage and reducing costs. Isilon solutions also allow you to reduce time to insights with in-place analytics. At the same time, you gain the flexibility to support multiple instances of Apache Hadoop distributions from different vendors simultaneously.

With Cloudera Enterprise, you get the core elements of Hadoop along with additional components such as a user interface, plus necessary enterprise capabilities such as security and integration with a broad range of hardware and software solutions. All the integration work is done for you, and the entire solution is thoroughly tested and fully documented. By taking the guesswork out of building out your Hadoop deployment, Cloudera Enterprise gives you a streamlined path to success in solving real business problems.
TAKE THE NEXT STEP
Contact your EMC sales representative or authorized reseller to learn more about how EMC Isilon Big Data storage and analytics solutions can benefit your organization. Also see our solutions in the EMC Store at and Cloudera solutions at

CONTACT US
To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller or visit us at

EMC2, EMC, the EMC logo, Isilon, and OneFS are registered trademarks or trademarks of EMC Corporation in the United States and other countries. All other trademarks used herein are the property of their respective owners. Copyright 2014 EMC Corporation. All rights reserved. Published in the USA. 10/14 Solution Overview H13533

EMC believes the information in this document is accurate as of its publication date. The information is subject to change without notice.
EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise
EMC ISILON OneFS OPERATING SYSTEM Powering scale-out storage for the new world of Big Data in the enterprise ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable with
EMC ISILON ONEFS OPERATING SYSTEM
EMC ISILON ONEFS OPERATING SYSTEM Powering scale-out storage for the Big Data and Object workloads of today and tomorrow ESSENTIALS Easy-to-use, single volume, single file system architecture Highly scalable
EMC ISILON SCALE-OUT STORAGE PRODUCT FAMILY
SCALE-OUT STORAGE PRODUCT FAMILY Storage made simple ESSENTIALS Simple storage designed for ease of use Massive scalability with easy, grow-as-you-go flexibility World s fastest-performing NAS Unmatched
Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service
Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera
White. Paper. EMC Isilon: A Scalable Storage Platform for Big Data. April 2014
White Paper EMC Isilon: A Scalable Storage Platform for Big Data By Nik Rouda, Senior Analyst and Terri McClure, Senior Analyst April 2014 This ESG White Paper was commissioned by EMC Isilon and is distributed
How To Manage A Single Volume Of Data On A Single Disk (Isilon)
1 ISILON SCALE-OUT NAS OVERVIEW AND FUTURE DIRECTIONS PHIL BULLINGER, SVP, EMC ISILON 2 ROADMAP INFORMATION DISCLAIMER EMC makes no representation and undertakes no obligations with regard to product planning
Protecting Big Data Data Protection Solutions for the Business Data Lake
White Paper Protecting Big Data Data Protection Solutions for the Business Data Lake Abstract Big Data use cases are maturing and customers are using Big Data to improve top and bottom line revenues. With
EMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst
White Paper EMC s Enterprise Hadoop Solution Isilon Scale-out NAS and Greenplum HD By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst February 2012 This ESG White Paper was commissioned
Datenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE
ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics
EMC ISILON SCALE-OUT STORAGE PRODUCT FAMILY
SCALE-OUT STORAGE PRODUCT FAMILY Unstructured data storage made simple ESSENTIALS Simple storage management designed for ease of use Massive scalability of capacity and performance Unmatched efficiency
More Data in Less Time
More Data in Less Time Leveraging Cloudera CDH as an Operational Data Store Daniel Tydecks, Systems Engineering DACH & CE Goals of an Operational Data Store Load Data Sources Traditional Architecture Operational
THE EMC ISILON STORY. Big Data In The Enterprise. Copyright 2012 EMC Corporation. All rights reserved.
THE EMC ISILON STORY Big Data In The Enterprise 2012 1 Big Data In The Enterprise Isilon Overview Isilon Technology Summary 2 What is Big Data? 3 The Big Data Challenge File Shares 90 and Archives 80 Bioinformatics
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
Deploying an Operational Data Store Designed for Big Data
Deploying an Operational Data Store Designed for Big Data A fast, secure, and scalable data staging environment with no data volume or variety constraints Sponsored by: Version: 102 Table of Contents Introduction
The Enterprise Data Hub and The Modern Information Architecture
The Enterprise Data Hub and The Modern Information Architecture Dr. Amr Awadallah CTO & Co-Founder, Cloudera Twitter: @awadallah 1 2013 Cloudera, Inc. All rights reserved. Cloudera Overview The Leader
EMC IRODS RESOURCE DRIVERS
EMC IRODS RESOURCE DRIVERS PATRICK COMBES: PRINCIPAL SOLUTION ARCHITECT, LIFE SCIENCES 1 QUICK AGENDA Intro to Isilon (~2 hours) Isilon resource driver Intro to ECS (~1.5 hours) ECS Resource driver Possibilities
Interactive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
Integrated Grid Solutions. and Greenplum
EMC Perspective Integrated Grid Solutions from SAS, EMC Isilon and Greenplum Introduction Intensifying competitive pressure and vast growth in the capabilities of analytic computing platforms are driving
EMC SOLUTION FOR AGILE AND ROBUST ANALYTICS ON HADOOP DATA LAKE WITH PIVOTAL HDB
EMC SOLUTION FOR AGILE AND ROBUST ANALYTICS ON HADOOP DATA LAKE WITH PIVOTAL HDB ABSTRACT As companies increasingly adopt data lakes as a platform for storing data from a variety of sources, the need for
Dell In-Memory Appliance for Cloudera Enterprise
Dell In-Memory Appliance for Cloudera Enterprise Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert [email protected]/
Driving Growth in Insurance With a Big Data Architecture
Driving Growth in Insurance With a Big Data Architecture The SAS and Cloudera Advantage Version: 103 Table of Contents Overview 3 Current Data Challenges for Insurers 3 Unlocking the Power of Big Data
The BIG Data Era has. your storage! Bratislava, Slovakia, 21st March 2013
The BIG Data Era has arrived Re-invent your storage! Bratislava, Slovakia, 21st March 2013 Luka Topic Regional Manager East Europe EMC Isilon Storage Division [email protected] 1 What is Big Data? 2 EXABYTES
How To Use Hp Vertica Ondemand
Data sheet HP Vertica OnDemand Enterprise-class Big Data analytics in the cloud Enterprise-class Big Data analytics for any size organization Vertica OnDemand Organizations today are experiencing a greater
EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS
EMC PERSPECTIVE: THE POWER OF WINDOWS SERVER 2012 AND EMC INFRASTRUCTURE FOR MICROSOFT PRIVATE CLOUD ENVIRONMENTS EXECUTIVE SUMMARY It s no secret that organizations continue to produce overwhelming amounts
CDH AND BUSINESS CONTINUITY:
WHITE PAPER CDH AND BUSINESS CONTINUITY: An overview of the availability, data protection and disaster recovery features in Hadoop Abstract Using the sophisticated built-in capabilities of CDH for tunable
Hadoop Trends and Practical Use Cases. April 2014
Hadoop Trends and Practical Use Cases John Howey Cloudera [email protected] Kevin Lewis Cloudera [email protected] April 2014 1 Agenda Hadoop Overview Latest Trends in Hadoop Enterprise Ready Beyond
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and
Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014
Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/
WHITE PAPER. Hadoop and HDFS: Storage for Next Generation Data Management. Version: Q414-102
Storage for Next Generation Data Management Version: Q414-102 Table of Content Storage for the Modern Enterprise 3 The Challenges of Big Data 5 Data at the Center of the Enterprise 6 The Internals of HDFS
THE EMC ISILON SCALE-OUT DATA LAKE
THE EMC ISILON SCALE-OUT DATA LAKE Key capabilities ABSTRACT This white paper provides an introduction to the EMC Isilon scale-out data lake as the key enabler to store, manage, and protect unstructured
Implementation of Hadoop Distributed File System Protocol on OneFS Tanuj Khurana EMC Isilon Storage Division
Implementation of Hadoop Distributed File System Protocol on OneFS Tanuj Khurana EMC Isilon Storage Division Outline HDFS Overview OneFS Overview HDFS protocol on OneFS HDFS protocol server implementation
WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution
WHITEPAPER A Technical Perspective on the Talena Data Availability Management Solution BIG DATA TECHNOLOGY LANDSCAPE Over the past decade, the emergence of social media, mobile, and cloud technologies
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
Introduction to NetApp Infinite Volume
Technical Report Introduction to NetApp Infinite Volume Sandra Moulton, Reena Gupta, NetApp April 2013 TR-4037 Summary This document provides an overview of NetApp Infinite Volume, a new innovation in
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
Dell* In-Memory Appliance for Cloudera* Enterprise
Built with Intel Dell* In-Memory Appliance for Cloudera* Enterprise Find out what faster big data analytics can do for your business The need for speed in all things related to big data is an enormous
Cloudera Enterprise Data Hub in Telecom:
Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer
Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014
Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ Cloudera World Japan November 2014 WANdisco Background WANdisco: Wide Area Network Distributed Computing Enterprise ready, high availability
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments
Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and
TRANSFORM YOUR BUSINESS: BIG DATA AND ANALYTICS WITH VCE AND EMC
TRANSFORM YOUR BUSINESS: BIG DATA AND ANALYTICS WITH VCE AND EMC Vision Big data and analytic initiatives within enterprises have been rapidly maturing from experimental efforts to production-ready deployments.
EMC SOLUTION FOR SPLUNK
EMC SOLUTION FOR SPLUNK Splunk validation using all-flash EMC XtremIO and EMC Isilon scale-out NAS ABSTRACT This white paper provides details on the validation of functionality and performance of Splunk
ORACLE COHERENCE 12CR2
ORACLE COHERENCE 12CR2 KEY FEATURES AND BENEFITS ORACLE COHERENCE IS THE #1 IN-MEMORY DATA GRID. KEY FEATURES Fault-tolerant in-memory distributed data caching and processing Persistence for fast recovery
BIG DATA-AS-A-SERVICE
White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers
EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS
EXPLORATION TECHNOLOGY REQUIRES A RADICAL CHANGE IN DATA ANALYSIS EMC Isilon solutions for oil and gas EMC PERSPECTIVE TABLE OF CONTENTS INTRODUCTION: THE HUNT FOR MORE RESOURCES... 3 KEEPING PACE WITH
Big + Fast + Safe + Simple = Lowest Technical Risk
Big + Fast + Safe + Simple = Lowest Technical Risk The Synergy of Greenplum and Isilon Architecture in HP Environments Steffen Thuemmel (Isilon) Andreas Scherbaum (Greenplum) 1 Our problem 2 What is Big
Microsoft Big Data Solutions. Anar Taghiyev P-TSP E-mail: [email protected];
Microsoft Big Data Solutions Anar Taghiyev P-TSP E-mail: [email protected]; Why/What is Big Data and Why Microsoft? Options of storage and big data processing in Microsoft Azure. Real Impact of Big
Big Data and Apache Hadoop Adoption:
Expert Reference Series of White Papers Big Data and Apache Hadoop Adoption: Key Challenges and Rewards 1-800-COURSES www.globalknowledge.com Big Data and Apache Hadoop Adoption: Key Challenges and Rewards
Isilon OneFS. Version 7.2.1. OneFS Migration Tools Guide
Isilon OneFS Version 7.2.1 OneFS Migration Tools Guide Copyright 2015 EMC Corporation. All rights reserved. Published in USA. Published July, 2015 EMC believes the information in this publication is accurate
Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum
Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All
Hadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
The Future of Data Management with Hadoop and the Enterprise Data Hub
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees
IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems
IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems Proactively address regulatory compliance requirements and protect sensitive data in real time Highlights Monitor and audit data activity
HDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL
CONVERGE APPLICATIONS, ANALYTICS, AND DATA WITH VCE AND PIVOTAL Vision In today s volatile economy, an organization s ability to exploit IT to speed time-to-results, control cost and risk, and drive differentiation
Oracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
DATA LAKE FOUNDATION 2.0 JEUDI 19 NOVEMBRE 2015. Denis FRAVAL-OLIVIER : ISD Presales Manager
DATA LAKE FOUNDATION 2.0 JEUDI 19 NOVEMBRE 2015 Denis FRAVAL-OLIVIER : ISD Presales Manager EMC Isilon Unifying Workloads in one place Module 4: Horizontal and Vertical Markets ISILON FOR ALL TYPES OF
Big Data Technology ดร.ช ชาต หฤไชยะศ กด. Choochart Haruechaiyasak, Ph.D.
Big Data Technology ดร.ช ชาต หฤไชยะศ กด Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology
EMC ISILON HD-SERIES. Specifications. EMC Isilon HD400 ARCHITECTURE
EMC ISILON HD-SERIES The rapid growth of unstructured data combined with increasingly stringent compliance requirements is resulting in a growing need for efficient data archiving solutions that can store
EMC ISILON X-SERIES. Specifications. EMC Isilon X200. EMC Isilon X210. EMC Isilon X410 ARCHITECTURE
EMC ISILON X-SERIES EMC Isilon X200 EMC Isilon X210 The EMC Isilon X-Series, powered by the OneFS operating system, uses a highly versatile yet simple scale-out storage architecture to speed access to
Isilon OneFS. Version 7.2. OneFS Migration Tools Guide
Isilon OneFS Version 7.2 OneFS Migration Tools Guide Copyright 2014 EMC Corporation. All rights reserved. Published in USA. Published November, 2014 EMC believes the information in this publication is
Luncheon Webinar Series May 13, 2013
Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration
EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.
EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics
Agenda. Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback #EMCVIPR
1 Agenda Big Data & Hadoop ViPR HDFS Pivotal Big Data Suite & ViPR HDFS ViON Customer Feedback 2 A World of Connected Devices Need a new data management architecture for Internet of Things 21% the % of
NetApp Big Content Solutions: Agile Infrastructure for Big Data
White Paper NetApp Big Content Solutions: Agile Infrastructure for Big Data Ingo Fuchs, NetApp April 2012 WP-7161 Executive Summary Enterprises are entering a new era of scale, in which the amount of data
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers
EMC VPLEX FAMILY Continuous Availability and data Mobility Within and Across Data Centers DELIVERING CONTINUOUS AVAILABILITY AND DATA MOBILITY FOR MISSION CRITICAL APPLICATIONS Storage infrastructure is
Simplified Management With Hitachi Command Suite. By Hitachi Data Systems
Simplified Management With Hitachi Command Suite By Hitachi Data Systems April 2015 Contents Executive Summary... 2 Introduction... 3 Hitachi Command Suite v8: Key Highlights... 4 Global Storage Virtualization
Apache Hadoop in the Enterprise. Dr. Amr Awadallah, CTO/Founder @awadallah, [email protected]
Apache Hadoop in the Enterprise Dr. Amr Awadallah, CTO/Founder @awadallah, [email protected] Cloudera The Leader in Big Data Management Powered by Apache Hadoop The Leading Open Source Distribution of Apache
EMC ISILON NL-SERIES. Specifications. EMC Isilon NL400. EMC Isilon NL410 ARCHITECTURE
EMC ISILON NL-SERIES The challenge of cost-effectively storing and managing data is an ever-growing concern. You have to weigh the cost of storing certain aging data sets against the need for quick access.
Understanding Enterprise NAS
Anjan Dave, Principal Storage Engineer LSI Corporation Author: Anjan Dave, Principal Storage Engineer, LSI Corporation SNIA Legal Notice The material contained in this tutorial is copyrighted by the SNIA
Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All-Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
Big Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
MULTITENANCY AND THE ENTERPRISE DATA HUB:
MULTITENANCY AND THE ENTERPRISE DATA HUB: Version: Q414-105 Table of Content Introduction 3 Business Objectives for Multitenant Environments 3 Standard Isolation Models of an EDH 4 Elements of a Multitenant
EMC VPLEX FAMILY. Continuous Availability and Data Mobility Within and Across Data Centers
EMC VPLEX FAMILY Continuous Availability and Data Mobility Within and Across Data Centers DELIVERING CONTINUOUS AVAILABILITY AND DATA MOBILITY FOR MISSION CRITICAL APPLICATIONS Storage infrastructure is
EMC DATA DOMAIN OPERATING SYSTEM
EMC DATA DOMAIN OPERATING SYSTEM Powering EMC Protection Storage ESSENTIALS High-Speed, Scalable Deduplication Up to 58.7 TB/hr performance Reduces requirements for backup storage by 10 to 30x and archive
Virtualizing Apache Hadoop. June, 2012
June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE™... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING
EMC DATA DOMAIN OPERATING SYSTEM
ESSENTIALS HIGH-SPEED, SCALABLE DEDUPLICATION Up to 58.7 TB/hr performance Reduces protection storage requirements by 10 to 30x CPU-centric scalability DATA INVULNERABILITY ARCHITECTURE Inline write/read
Chukwa, Hadoop subproject, 37, 131 Cloud enabled big data, 4 Codd's 12 rules, 1 Column-oriented databases, 18, 52 Compression pattern, 83–84
Index A Amazon Web Services (AWS), 50, 58 Analytics engine, 21–22 Apache Kafka, 38, 131 Apache S4, 38, 131 Apache Sqoop, 37, 131 Appliance pattern, 104–105 Application architecture, big data analytics
Hadoop in the Hybrid Cloud
Presented by Hortonworks and Microsoft Introduction An increasing number of enterprises are either currently using or are planning to use cloud deployment models to expand their IT infrastructure. Big
Data Governance in the Hadoop Data Lake. Michael Lang May 2015
Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales
NextGen Infrastructure for Big DATA Analytics.
NextGen Infrastructure for Big DATA Analytics. So What is Big Data? Data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the structures
Databricks. A Primer
Databricks A Primer Who is Databricks? Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team who created Apache Spark, a powerful
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools.
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
Proact whitepaper on Big Data
Proact whitepaper on Big Data Summary Big Data is not a definite term. Even if it sounds like just another buzz word, it manifests some interesting opportunities for organisations with the skill, resources
Simple. Extensible. Open.
White Paper Simple. Extensible. Open. Unleash the Value of Data with EMC ViPR Global Data Services Abstract The following paper opens with the evolution of enterprise storage infrastructure in the era
WHITE PAPER USING CLOUDERA TO IMPROVE DATA PROCESSING
WHITE PAPER USING CLOUDERA TO IMPROVE DATA PROCESSING Using Cloudera to Improve Data Processing CLOUDERA WHITE PAPER 2 Table of Contents What is Data Processing? 3 Challenges 4 Flexibility and Data Quality
SYMANTEC NETBACKUP APPLIANCE FAMILY OVERVIEW BROCHURE. When you can do it simply, you can do it all.
SYMANTEC NETBACKUP APPLIANCE FAMILY OVERVIEW BROCHURE When you can do it simply, you can do it all. SYMANTEC NETBACKUP APPLIANCES Symantec understands the shifting needs of the data center and offers NetBackup
WHITE PAPER. www.fusionstorm.com. Get Ready for Big Data:
WHITE PAPER: Easing the Way to the Cloud: 1 WHITE PAPER Get Ready for Big Data: How Scale-Out NAS Delivers the Scalability, Performance, Resilience and Manageability that Big Data Environments Demand 2
BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS
BUILDING A SCALABLE BIG DATA INFRASTRUCTURE FOR DYNAMIC WORKFLOWS ESSENTIALS Executive Summary Big Data is placing new demands on IT infrastructures. The challenge is how to meet growing performance demands
Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook
Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future
