Multitenancy and the Enterprise Data Hub. James Kinley (@jrkinley), IP EXPO EUROPE, Big Data Evolution Summit

Transcription

1 Multitenancy and the Enterprise Data Hub
James Kinley (@jrkinley)
IP EXPO EUROPE, Big Data Evolution Summit

2 About me
James Kinley (@jrkinley)
Principal Solutions Architect, EMEA
Hadooper since 2010
Clouderan since 2012
Cyber Security background
github.com/jrkinley
jameskinley.tumblr.com
Cloudera, Inc. All rights reserved.

3 Introduction: EDH Objectives
Sharing Data (better insight)
Sharing Compute (better utilisation and performance)
Consolidated Operations (reduced cost and complexity)

4 Introduction: Multitenancy Objectives
Multitenancy in Hadoop refers to a set of features that enable multiple groups from within the same organisation to share the common set of resources in a cluster without negatively impacting service-levels, violating security constraints, or even revealing the existence of each other, all via policy rather than physical separation.

5 Multitenant Cluster Architecture
Three Critical Facets
Security & Governance
Resource Isolation & Management
Chargeback & Showback

6 Multitenant Cluster Architecture
Security & Governance

7 Security & Governance
Authentication: proves users are who they say they are [Kerberos, Identity Management (LDAP)]
Authorisation: determines what users can see and do [HDFS Permissions, RBAC (Apache Sentry), Encryption]
Auditing: determines who did what, and when [Cloudera Navigator]

8 Security & Governance
HDFS Information Architecture (IA)
  drwxr-x--- tadmin tgroup /users/<tenantid>
  drwxr-x--- tadmin tgroup /users/<tenantid>/landing
  drwxr-x--- tadmin tgroup /users/<tenantid>/data
  drwxrwx--x hive hive /users/<tenantid>/data/warehouse
  drwxrwx--x hive hive /users/<tenantid>/data/warehouse/<db>/<table>/<partition>
  drwxrwx--- tadmin tgroup /users/<tenantid>/processing
  drwx <tuser> tgroup /users/<tenantid>/processing/<jobid>
  drwx <tuser> tgroup /users/<tenantid>/processing/<jobid>/input
  drwx <tuser> tgroup /users/<tenantid>/processing/<jobid>/output
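The layout above could be provisioned per tenant with a small script. The following is only a sketch: the function names are hypothetical, the default owner and group names are taken from the slide, and the per-job and per-table directories are left out because their names (and, for the job directories, their full permission strings) are not given. It prints standard `hdfs dfs -mkdir`, `-chown` and `-chmod` commands rather than running them.

```python
from typing import List, NamedTuple


class DirSpec(NamedTuple):
    mode: str   # permission string as shown on the slide, e.g. "drwxr-x---"
    owner: str
    group: str
    path: str


def tenant_layout(tenant_id: str, admin: str = "tadmin", group: str = "tgroup") -> List[DirSpec]:
    """Return the fixed part of the tenant directory tree from the slide."""
    root = f"/users/{tenant_id}"
    return [
        DirSpec("drwxr-x---", admin, group, root),
        DirSpec("drwxr-x---", admin, group, f"{root}/landing"),
        DirSpec("drwxr-x---", admin, group, f"{root}/data"),
        DirSpec("drwxrwx--x", "hive", "hive", f"{root}/data/warehouse"),
        DirSpec("drwxrwx---", admin, group, f"{root}/processing"),
    ]


def to_octal(mode: str) -> str:
    """Convert a 10-character listing mode such as 'drwxr-x---' to '750'."""
    bits = mode[1:]  # drop the leading file-type character
    digits = []
    for i in range(0, 9, 3):
        r, w, x = bits[i:i + 3]
        digits.append(str((4 if r == "r" else 0) + (2 if w == "w" else 0) + (1 if x == "x" else 0)))
    return "".join(digits)


def provisioning_commands(tenant_id: str) -> List[str]:
    """Build (but do not execute) the shell commands for one tenant."""
    commands = []
    for spec in tenant_layout(tenant_id):
        commands.append(f"hdfs dfs -mkdir -p {spec.path}")
        commands.append(f"hdfs dfs -chown {spec.owner}:{spec.group} {spec.path}")
        commands.append(f"hdfs dfs -chmod {to_octal(spec.mode)} {spec.path}")
    return commands


if __name__ == "__main__":
    for line in provisioning_commands("tenant1"):
        print(line)
```

Keeping the layout as data makes it easy to review the intended permissions before anything is applied to the cluster.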

9 Security & Governance
Authentication: Kerberos & LDAP
(HDFS directory layout repeated from slide 8)

10 Security & Governance
Authorisation: HDFS Permissions
(HDFS directory layout repeated from slide 8)

11 Security & Governance
Authorisation: HDFS Extended ACLs
(HDFS directory layout repeated from slide 8)
Give the tenant's ingest user write permission over the landing directory:
  hdfs dfs -setfacl -m user:tingest:-w- /users/<tenantid>/landing
Give hive and impala users read & write permission over the landing directory:
  hdfs dfs -setfacl -m group:hive:rw- /users/<tenantid>/landing
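When ACL entries like the two on this slide are generated by a provisioning script, it can help to validate the entry before calling `hdfs dfs -setfacl -m`. The sketch below is an illustration only: `setfacl_command` and the regular expression are hypothetical helpers, and they cover just the `user:` and `group:` entry forms used on the slide.

```python
import re
from typing import List

# Matches entries such as "user:tingest:-w-" or "group:hive:rw-".
ACL_ENTRY = re.compile(r"^(user|group):([A-Za-z0-9_.-]*):([r-][w-][x-])$")


def setfacl_command(entry: str, path: str) -> List[str]:
    """Return the argument list for `hdfs dfs -setfacl -m <entry> <path>`."""
    if not ACL_ENTRY.match(entry):
        raise ValueError(f"malformed ACL entry: {entry!r}")
    return ["hdfs", "dfs", "-setfacl", "-m", entry, path]


if __name__ == "__main__":
    landing = "/users/tenant1/landing"
    print(" ".join(setfacl_command("user:tingest:-w-", landing)))
    print(" ".join(setfacl_command("group:hive:rw-", landing)))
```

The result can be checked afterwards with `hdfs dfs -getfacl <path>`. Two points are worth verifying against your Hadoop version: ACL support may need to be switched on with `dfs.namenode.acls.enabled`, and write access on a directory is usually only useful together with execute (traversal), so an entry such as `-wx` may be what is needed in practice.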

12 Security & Governance
Authorisation: Apache Sentry (RBAC)
Users can see only the data and metadata for which they have the privilege
File or Service (GRANT/REVOKE) based policy providers
Role-based privilege model: [user] > [groups] > [roles] > object > privilege
  object = [server, database, table, URI]
  privilege = [select, insert, all]
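The role-based chain on this slide (user, then groups, then roles, then privileges on objects) can be pictured with a toy model. This is not Sentry's implementation or API: the class, the function and the sample data are hypothetical, and the sketch ignores the object hierarchy (for example, a server-level privilege implying access to its databases).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Privilege:
    object_type: str   # "server", "database", "table" or "URI"
    object_name: str
    action: str        # "select", "insert" or "all"


def is_allowed(user: str, object_type: str, object_name: str, action: str,
               user_groups: Dict[str, List[str]],
               group_roles: Dict[str, List[str]],
               role_privileges: Dict[str, List[Privilege]]) -> bool:
    """Walk user -> groups -> roles -> privileges and look for a match."""
    for group in user_groups.get(user, []):
        for role in group_roles.get(group, []):
            for priv in role_privileges.get(role, []):
                if (priv.object_type == object_type
                        and priv.object_name == object_name
                        and priv.action in (action, "all")):
                    return True
    return False


# Hypothetical sample data for one tenant.
USER_GROUPS = {"alice": ["tgroup"], "bob": ["other"]}
GROUP_ROLES = {"tgroup": ["tenant_analyst"]}
ROLE_PRIVILEGES = {"tenant_analyst": [Privilege("table", "sales.orders", "select")]}

if __name__ == "__main__":
    print(is_allowed("alice", "table", "sales.orders", "select",
                     USER_GROUPS, GROUP_ROLES, ROLE_PRIVILEGES))
    print(is_allowed("alice", "table", "sales.orders", "insert",
                     USER_GROUPS, GROUP_ROLES, ROLE_PRIVILEGES))
```

Note that privileges attach to roles and roles attach to groups; users never receive privileges directly, which is what keeps tenant policy manageable as membership changes.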

13 Security & Governance
Authorisation: Apache Sentry (RBAC)
(HDFS directory layout repeated from slide 8)

14 Security & Governance
Authorisation: Encryption
Network encryption (HDFS and MR)
At-rest encryption for HDFS
Navigator Encrypt & KeyTrustee (Gazzang)
Project Rhino (Cloudera + Intel)
HDFS-level encryption
Encryption Zones
Hardware-accelerated

15 Security & Governance
Authorisation: HDFS Encryption Zone
(HDFS directory layout repeated from slide 8)
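The transcript does not show which directory the slide marked as the encryption zone. As a sketch only, assuming the tenant's `data` directory is the zone and using a hypothetical key-naming scheme, the usual command sequence (`hadoop key create`, then `hdfs crypto -createZone`) could be assembled like this:

```python
from typing import List, Optional


def encryption_zone_commands(tenant_id: str, key_name: Optional[str] = None) -> List[List[str]]:
    """Build the commands that create a key and an encryption zone for a tenant.

    Assumption: the zone is /users/<tenantid>/data; the slide does not say.
    """
    key = key_name or f"{tenant_id}-key"   # hypothetical naming convention
    zone = f"/users/{tenant_id}/data"
    return [
        ["hadoop", "key", "create", key],
        ["hdfs", "dfs", "-mkdir", "-p", zone],
        ["hdfs", "crypto", "-createZone", "-keyName", key, "-path", zone],
    ]


if __name__ == "__main__":
    for argv in encryption_zone_commands("tenant1"):
        print(" ".join(argv))
```

An encryption zone is created on an empty directory, so in a real rollout this step would come before any data lands in the tenant's tree.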

16 Security & Governance
Governance: HDFS Disk Quota Management
Restrict tenants' use of storage
Prevents misuse of the shared filesystem
HDFS supports two quota mechanisms:
  Disk space quotas
  Name quotas

17 Security & Governance
Governance: HDFS Disk Quota Management
(HDFS directory layout repeated from slide 8)
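Both quota types are set with `hdfs dfsadmin`. One detail that is easy to miss is that the space quota is charged against replicated bytes, so a tenant allowed 1 TiB of logical data at replication factor 3 needs a 3 TiB space quota. A minimal sketch, with a hypothetical helper name and the replication factor as an assumed default:

```python
from typing import List, Optional


def quota_commands(path: str, logical_bytes: int, replication: int = 3,
                   max_names: Optional[int] = None) -> List[str]:
    """Build dfsadmin commands for a space quota and an optional name quota."""
    raw_bytes = logical_bytes * replication  # quota counts every replica
    commands = [f"hdfs dfsadmin -setSpaceQuota {raw_bytes} {path}"]
    if max_names is not None:
        # Name quota: limit on the number of files and directories under path.
        commands.append(f"hdfs dfsadmin -setQuota {max_names} {path}")
    return commands


if __name__ == "__main__":
    one_tib = 1024 ** 4
    for line in quota_commands("/users/tenant1", one_tib, max_names=1_000_000):
        print(line)
```

Current usage against both quotas can be inspected with `hdfs dfs -count -q <path>`.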

18 Multitenant Cluster Architecture
Resource Isolation & Management

19 Resource Isolation & Management
Dividing up finite cluster resource
Service Level Isolation
  Static Service Pools
Admission Control
  Throttling concurrent apps and queries
Dynamic Prioritisation
  Dynamic Resource Pools
  ACLs
  SLOs

20 Resource Isolation & Management
Classifier
User to pool placement rules
Based on user, group, or specified tag
  MR: mapreduce.job.queuename
  Impala: REQUEST_POOL
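Placement rules are typically evaluated in order until one yields a pool. The sketch below illustrates that idea only; the rule order, the function name and the pool names are assumptions, not the scheduler's actual configuration.

```python
from typing import Iterable, Optional, Set


def place_in_pool(user: str, groups: Iterable[str], requested: Optional[str],
                  declared: Set[str], default: str = "default") -> str:
    """Pick a pool for a job or query by trying simple rules in order."""
    # Rule 1: a pool named explicitly by the submitter (the "specified tag"),
    # honoured only if that pool has been declared.
    if requested and requested in declared:
        return requested
    # Rule 2: a declared pool named after the user.
    if user in declared:
        return user
    # Rule 3: a declared pool named after one of the user's groups.
    for group in groups:
        if group in declared:
            return group
    # Rule 4: fall back to the default pool.
    return default


if __name__ == "__main__":
    pools = {"etl", "tgroup", "default"}
    print(place_in_pool("alice", ["tgroup"], "etl", pools))
    print(place_in_pool("alice", ["tgroup"], None, pools))
    print(place_in_pool("bob", [], "adhoc", pools))
```

The explicit request corresponds to the two settings named on the slide: a MapReduce job can pass `-Dmapreduce.job.queuename=<pool>`, and an Impala session can issue `SET REQUEST_POOL=<pool>;`.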

21 Resource Isolation & Management
Queues
Admission Control (queue policy)
  Max concurrency (YARN / Impala)
  Max memory (Impala)
  Max queue size (Impala)
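The three limits on this slide interact in a simple way: a request runs if concurrency and memory allow, waits if the queue has room, and is rejected otherwise. The class below is a toy model of that decision under those assumptions; it is not Impala's or YARN's implementation, and all names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Tuple


@dataclass
class Pool:
    max_running: int      # max concurrency
    max_queued: int       # max queue size
    max_memory_mb: int    # max aggregate memory for running queries
    running_mb: List[int] = field(default_factory=list)
    queued: Deque[Tuple[str, int]] = field(default_factory=deque)

    def submit(self, query_id: str, mem_mb: int) -> str:
        """Return 'admitted', 'queued' or 'rejected' for a new request."""
        fits_concurrency = len(self.running_mb) < self.max_running
        fits_memory = sum(self.running_mb) + mem_mb <= self.max_memory_mb
        if fits_concurrency and fits_memory:
            self.running_mb.append(mem_mb)
            return "admitted"
        if len(self.queued) < self.max_queued:
            self.queued.append((query_id, mem_mb))
            return "queued"
        return "rejected"


if __name__ == "__main__":
    pool = Pool(max_running=2, max_queued=1, max_memory_mb=1000)
    for qid, mem in [("q1", 400), ("q2", 400), ("q3", 100), ("q4", 100)]:
        print(qid, pool.submit(qid, mem))
```

A bounded queue is the part that protects other tenants: once it is full, excess work fails fast instead of piling up behind the pool.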

22 Resource Isolation & Management
Dynamic Resource Pools
% of cluster resource
  Virtual cores min/max (YARN)
  Memory min/max (YARN)
Scheduling policy (DRF, FAIR, FIFO)
Recommendations:
  disabling undeclared pools
  enabling the default pool
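Of the scheduling policies listed, DRF (Dominant Resource Fairness) is the least self-explanatory. Its core idea is that each pool is measured by whichever resource it uses the largest fraction of, and the pool with the smallest such "dominant share" is served next. A minimal sketch of that calculation, with hypothetical pool names and cluster sizes and without weights or min/max limits:

```python
from typing import Dict, Tuple


def dominant_share(used_vcores: int, used_mem_mb: int,
                   total_vcores: int, total_mem_mb: int) -> float:
    """Largest fraction of any single cluster resource that a pool holds."""
    return max(used_vcores / total_vcores, used_mem_mb / total_mem_mb)


def next_pool(usage: Dict[str, Tuple[int, int]],
              total_vcores: int, total_mem_mb: int) -> str:
    """Return the pool with the smallest dominant share."""
    return min(
        usage,
        key=lambda name: dominant_share(usage[name][0], usage[name][1],
                                        total_vcores, total_mem_mb),
    )


if __name__ == "__main__":
    # (vcores in use, memory in use in MB) per pool; values are illustrative.
    usage = {"etl": (40, 100_000), "adhoc": (10, 200_000)}
    print(next_pool(usage, total_vcores=100, total_mem_mb=400_000))
```

In this example `etl` is CPU-dominant (40% of vcores) and `adhoc` is memory-dominant (50% of memory), so `etl` would be offered resources first.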

23 Resource Isolation & Management

24 Multitenant Cluster Architecture
Chargeback & Showback

25 Chargeback and Showback
Meter cluster usage (CM)
Input to chargeback model
Illustrate compliance
Facilitate capacity planning and budgeting
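The slide does not describe a specific chargeback model. One simple possibility, shown here purely as an assumption, is to split a period's cluster cost across tenants in proportion to their metered vcore-hours; the function name and the figures are hypothetical.

```python
from typing import Dict


def allocate_cost(vcore_hours: Dict[str, float], cluster_cost: float) -> Dict[str, float]:
    """Split cluster_cost across tenants in proportion to metered vcore-hours."""
    total = sum(vcore_hours.values())
    if total == 0:
        # Nothing was metered, so nothing is charged.
        return {tenant: 0.0 for tenant in vcore_hours}
    return {tenant: cluster_cost * hours / total
            for tenant, hours in vcore_hours.items()}


if __name__ == "__main__":
    metered = {"tenant_a": 300.0, "tenant_b": 100.0}
    for tenant, cost in allocate_cost(metered, cluster_cost=1000.0).items():
        print(f"{tenant}: {cost:.2f}")
```

The same report serves showback when the figures are published without being billed; a fuller model would likely meter memory and storage as well.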

26 27

Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014

Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014 1 Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014 2 Outline Introduction Hadoop security primer Authentication Authorization Data Protection

More information

MULTITENANCY AND THE ENTERPRISE DATA HUB:

MULTITENANCY AND THE ENTERPRISE DATA HUB: MULTITENANCY AND THE ENTERPRISE DATA HUB: Version: Q414-105 Table of Content Introduction 3 Business Objectives for Multitenant Environments 3 Standard Isolation Models of an EDH 4 Elements of a Multitenant

More information

Apache Sentry. Prasad Mujumdar [email protected] [email protected]

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar [email protected] [email protected] Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera Accelerating Enterprise Big Data Success Tim Stevens, VP of Business and Corporate Development Cloudera 1 Big Opportunity: Extract value from data Revenue Growth x = 50 Billion 35 ZB Cost Savings Margin

More information

Olivier Renault Solu/on Engineer Hortonworks. Hadoop Security

Olivier Renault Solu/on Engineer Hortonworks. Hadoop Security Olivier Renault Solu/on Engineer Hortonworks Hadoop Security Agenda Why security Kerberos HDFS ACL security Network security - KNOX Hive - doas = False - ATZ-NG YARN ACL p67-91 Capacity scheduler ACL Killing

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect Fighting Cyber Fraud with Hadoop Niel Dunnage Senior Solutions Architect 1 Summary Big Data is an increasingly powerful enterprise asset and this talk will explore the relationship between big data and

More information

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Version: 103 Table of Contents Introduction 3 Importance of Security 3 Growing Pains 3 Security Requirements

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Important Notice. (c) 2010-2015 Cloudera, Inc. All rights reserved.

Important Notice. (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera Security Important Notice (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans contained in this document

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 1 Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 2 RainStor: a SQL Database on Hadoop SCALE (MPP, Shared everything) LOAD

More information

Data Security For Government Agencies

Data Security For Government Agencies Data Security For Government Agencies Version: Q115-101 Table of Contents Abstract Agencies are transforming data management with unified systems that combine distributed storage and computation at limitless

More information

Big Data Management and Security

Big Data Management and Security Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

More information

Securing Hadoop in an Enterprise Context

Securing Hadoop in an Enterprise Context Securing Hadoop in an Enterprise Context Hellmar Becker, Senior IT Specialist Apache: Big Data conference Budapest, September 29, 2015 Who am I? 2 Securing Hadoop in an Enterprise Context 1. The Challenge

More information

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Version: 102 Table of Contents Introduction 3 Importance of Security 3 Growing Pains 3 Security Requirements

More information

Cloudera Navigator Installation and User Guide

Cloudera Navigator Installation and User Guide Cloudera Navigator Installation and User Guide Important Notice (c) 2010-2013 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

... ... PEPPERDATA OVERVIEW AND DIFFERENTIATORS ... ... ... ... ...

... ... PEPPERDATA OVERVIEW AND DIFFERENTIATORS ... ... ... ... ... ..................................... WHITEPAPER PEPPERDATA OVERVIEW AND DIFFERENTIATORS INTRODUCTION Prospective customers will often pose the question, How is Pepperdata different from tools like Ganglia,

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Data Security in Hadoop

Data Security in Hadoop Data Security in Hadoop Eric Mizell Director, Solution Engineering Page 1 What is Data Security? Data Security for Hadoop allows you to administer a singular policy for authentication of users, authorize

More information

SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM

SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM WHITE PAPER SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM Realizing Data Security for the Enterprise with Cloudera Securing Your Enterprise Hadoop Ecosystem CLOUDERA WHITE PAPER 2 Table of Contents Introduction

More information

Professional Hadoop Solutions

Professional Hadoop Solutions Brochure More information from http://www.researchandmarkets.com/reports/2542488/ Professional Hadoop Solutions Description: The go-to guidebook for deploying Big Data solutions with Hadoop Today's enterprise

More information

Big Data SQL and Query Franchising

Big Data SQL and Query Franchising Big Data SQL and Query Franchising An Architecture for Query Beyond Hadoop Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor

More information

Practical Hadoop. Security. Bhushan Lakhe

Practical Hadoop. Security. Bhushan Lakhe Practical Hadoop Security Bhushan Lakhe Contents J About the Author About the Technical Reviewer Acknowledgments Introduction xiii xv xvii xix Part I: Introducing Hadoop and Its Security 1 Chapter 1: Understanding

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

Hadoop: Embracing future hardware

Hadoop: Embracing future hardware Hadoop: Embracing future hardware Suresh Srinivas @suresh_m_s Page 1 About Me Architect & Founder at Hortonworks Long time Apache Hadoop committer and PMC member Designed and developed many key Hadoop

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

Big Data Operations Guide for Cloudera Manager v5.x Hadoop

Big Data Operations Guide for Cloudera Manager v5.x Hadoop Big Data Operations Guide for Cloudera Manager v5.x Hadoop Logging into the Enterprise Cloudera Manager 1. On the server where you have installed 'Cloudera Manager', make sure that the server is running,

More information

HDFS Federation. Sanjay Radia Founder and Architect @ Hortonworks. Page 1

HDFS Federation. Sanjay Radia Founder and Architect @ Hortonworks. Page 1 HDFS Federation Sanjay Radia Founder and Architect @ Hortonworks Page 1 About Me Apache Hadoop Committer and Member of Hadoop PMC Architect of core-hadoop @ Yahoo - Focusing on HDFS, MapReduce scheduler,

More information

More Data in Less Time

More Data in Less Time More Data in Less Time Leveraging Cloudera CDH as an Operational Data Store Daniel Tydecks, Systems Engineering DACH & CE Goals of an Operational Data Store Load Data Sources Traditional Architecture Operational

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

Integrate Master Data with Big Data using Oracle Table Access for Hadoop

Integrate Master Data with Big Data using Oracle Table Access for Hadoop Integrate Master Data with Big Data using Oracle Table Access for Hadoop Kuassi Mensah Oracle Corporation Redwood Shores, CA, USA Keywords: Hadoop, BigData, Hive SQL, Spark SQL, HCatalog, StorageHandler

More information

White paper. The Big Data Security Gap: Protecting the Hadoop Cluster

White paper. The Big Data Security Gap: Protecting the Hadoop Cluster The Big Data Security Gap: Protecting the Hadoop Cluster Introduction While the open source framework has enabled the footprint of Hadoop to logically expand, enterprise organizations face deployment and

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

HDFS 2015: Past, Present, and Future

HDFS 2015: Past, Present, and Future Apache: Big Data Europe 2015 HDFS 2015: Past, Present, and Future 9/30/2015 NTT DATA Corporation Akira Ajisaka Copyright 2015 NTT DATA Corporation Self introduction Akira Ajisaka (NTT DATA) Apache Hadoop

More information

WHITE PAPER. Hadoop and HDFS: Storage for Next Generation Data Management. Version: Q414-102

WHITE PAPER. Hadoop and HDFS: Storage for Next Generation Data Management. Version: Q414-102 Storage for Next Generation Data Management Version: Q414-102 Table of Content Storage for the Modern Enterprise 3 The Challenges of Big Data 5 Data at the Center of the Enterprise 6 The Internals of HDFS

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

Hadoop Elephant in Active Directory Forest. Marek Gawiński, Arkadiusz Osiński Allegro Group

Hadoop Elephant in Active Directory Forest. Marek Gawiński, Arkadiusz Osiński Allegro Group Hadoop Elephant in Active Directory Forest Marek Gawiński, Arkadiusz Osiński Allegro Group Agenda Goals and motivations Technology stack Architecture evolution Automation integrating new servers Making

More information

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures

IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures IaaS Cloud Architectures: Virtualized Data Centers to Federated Cloud Infrastructures Dr. Sanjay P. Ahuja, Ph.D. 2010-14 FIS Distinguished Professor of Computer Science School of Computing, UNF Introduction

More information

Mirjam van Olst. Best Practices & Considerations for Designing Your SharePoint Logical Architecture

Mirjam van Olst. Best Practices & Considerations for Designing Your SharePoint Logical Architecture Mirjam van Olst Best Practices & Considerations for Designing Your SharePoint Logical Architecture About me http://sharepointchick.com @mirjamvanolst [email protected] Agenda Introduction Logical Architecture

More information

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected]

Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam sastry.vedantam@oracle.com Using MySQL for Big Data Advantage Integrate for Insight Sastry Vedantam [email protected] Agenda The rise of Big Data & Hadoop MySQL in the Big Data Lifecycle MySQL Solutions for Big Data Q&A

More information

Like what you hear? Tweet it using: #Sec360

Like what you hear? Tweet it using: #Sec360 Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY About Robert: School: UW Madison, U St. Thomas Programming: 15 years, C, C++, Java

More information

Introduction to Apache YARN Schedulers & Queues

Introduction to Apache YARN Schedulers & Queues Introduction to Apache YARN Schedulers & Queues In a nutshell, YARN was designed to address the many limitations (performance/scalability) embedded into Hadoop version 1 (MapReduce & HDFS). Some of the

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera

More information

Datameer Big Data Governance

Datameer Big Data Governance TECHNICAL BRIEF Datameer Big Data Governance Bringing open-architected and forward-compatible governance controls to Hadoop analytics As big data moves toward greater mainstream adoption, its compliance

More information

Big Data Technology Core Hadoop: HDFS-YARN Internals

Big Data Technology Core Hadoop: HDFS-YARN Internals Big Data Technology Core Hadoop: HDFS-YARN Internals Eshcar Hillel Yahoo! Ronny Lempel Outbrain *Based on slides by Edward Bortnikov & Ronny Lempel Roadmap Previous class Map-Reduce Motivation This class

More information

RapidMiner OrangePaper Big Data Security on Hadoop

RapidMiner OrangePaper Big Data Security on Hadoop by Tobias Malbrecht and Zoltan Prekopcsak February 2015 RapidMiner OrangePaper As an increasing number of enterprises move towards production deployments of Hadoop, security continues to be an important

More information

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities

More information

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS

PEPPERDATA IN MULTI-TENANT ENVIRONMENTS ..................................... PEPPERDATA IN MULTI-TENANT ENVIRONMENTS technical whitepaper June 2015 SUMMARY OF WHAT S WRITTEN IN THIS DOCUMENT If you are short on time and don t want to read the

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC [email protected] Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

Virtualizing Apache Hadoop. June, 2012

Virtualizing Apache Hadoop. June, 2012 June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING

More information

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation

More information

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015 Data Governance in the Hadoop Data Lake Kiran Kamreddy May 2015 One Data Lake: Many Definitions A centralized repository of raw data into which many data-producing streams flow and from which downstream

More information

Enterprise-grade Hadoop: The Building Blocks

Enterprise-grade Hadoop: The Building Blocks Enterprise-grade Hadoop: The Building Blocks An Ovum white paper for MapR Publication Date: 24 Sep 2014 Author name Summary Catalyst Hadoop was initially developed for trusted environments that did not

More information

Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney

Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney Big Data Analytics(Hadoop) Prepared By : Manoj Kumar Joshi & Vikas Sawhney General Agenda Understanding Big Data and Big Data Analytics Getting familiar with Hadoop Technology Hadoop release and upgrades

More information

How To Use Cloudera Manager Backup And Disaster Recovery (Brd) On A Microsoft Hadoop 5.5.5 (Clouderma) On An Ubuntu 5.2.5 Or 5.3.5

How To Use Cloudera Manager Backup And Disaster Recovery (Brd) On A Microsoft Hadoop 5.5.5 (Clouderma) On An Ubuntu 5.2.5 Or 5.3.5 Cloudera Manager Backup and Disaster Recovery Important Notice (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or

More information

EMC ViPR Controller. Version 2.4. User Interface Virtual Data Center Configuration Guide 302-002-416 REV 01 DRAFT

EMC ViPR Controller. Version 2.4. User Interface Virtual Data Center Configuration Guide 302-002-416 REV 01 DRAFT EMC ViPR Controller Version 2.4 User Interface Virtual Data Center Configuration Guide 302-002-416 REV 01 DRAFT Copyright 2014-2015 EMC Corporation. All rights reserved. Published in USA. Published November,

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle

Oracle Big Data SQL. Architectural Deep Dive. Dan McClary, Ph.D. Big Data Product Management Oracle Oracle Big Data SQL Architectural Deep Dive Dan McClary, Ph.D. Big Data Product Management Oracle Copyright 2014, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is

More information

SharePoint 2010 Performance and Capacity Planning Best Practices

SharePoint 2010 Performance and Capacity Planning Best Practices Information Technology Solutions SharePoint 2010 Performance and Capacity Planning Best Practices Eric Shupps SharePoint Server MVP About Information Me Technology Solutions SharePoint Server MVP President,

More information

Deploying an Operational Data Store Designed for Big Data

Deploying an Operational Data Store Designed for Big Data Deploying an Operational Data Store Designed for Big Data A fast, secure, and scalable data staging environment with no data volume or variety constraints Sponsored by: Version: 102 Table of Contents Introduction

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Architecture Guidelines Application Security

Architecture Guidelines Application Security Executive Summary These guidelines describe best practice for application security for 2 or 3 tier web-based applications. It covers the use of common security mechanisms including Authentication, Authorisation

More information

Dell In-Memory Appliance for Cloudera Enterprise

Hadoop Overview, Customer Evolution and Dell In-Memory Product Details Author: Armando Acosta Hadoop Product Manager/Subject Matter Expert [email protected]/

Big Data Security. Kevvie Fowler. kpmg.ca

About myself Kevvie Fowler, CISSP, GCFA Partner, Advisory Services KPMG Canada Industry contributions Big data security definitions Definitions Big data Datasets

Take An Internal Look at Hadoop. Hairong Kuang Grid Team, Yahoo! Inc hairong@yahoo-inc.com

What's Hadoop Framework for running applications on large clusters of commodity hardware Scale: petabytes of data

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

HDFS Under the Hood. Sanjay Radia. Sradia@yahoo-inc.com Grid Computing, Hadoop Yahoo Inc.

Outline Overview of Hadoop, an open source project Design of HDFS Ongoing work Hadoop Hadoop provides a framework

Cloudera Enterprise Data Hub in Telecom:

Three Customer Case Studies Version: 103 Table of Contents Introduction Cloudera Enterprise Data Hub for Telcos Cloudera Enterprise Data Hub in Telecom: Customer

SharePoint 2013 Logical Architecture

This document is provided "as-is". Information and views expressed in this document, including URL and other Internet Web site references, may change without notice.

Optimized for the Industrial Internet: GE's Industrial Data Lake Platform

Agenda The Opportunity The Solution The Challenges The Results Solutions for Industrial Internet, deep domain expertise GESoftware.com

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions

Interactive data analytics drive insights

Big data Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement für Hadoop (Holistic Data Management for Hadoop) Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

www.biobankcloud.com Jim Dowling KTH Royal Institute of Technology, Stockholm SICS Swedish ICT CSHL Meeting on Biological Data Science, 2014

Definition of a Biobank The Biobank concept is defined (by Swedish

Cloudera Backup and Disaster Recovery

Important Notice (c) 2010-2013 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans

How To Manage Big Data In A Microsoft Cloud (Hadoop)

Oracle Database 12c and the Future of Data Warehousing in the Era of Big Data George Lumpkin Data Warehousing Neil Mendelson Big Data & Advanced Analytics Vice Presidents Server Technologies September 29,

Extended Attributes and Transparent Encryption in Apache Hadoop

Uma Maheswara Rao G Yi Liu (刘轶) Who we are? Uma Maheswara Rao G - [email protected] - Software Engineer at Intel - PMC/committer, Apache

Reference Architecture and Best Practices for Virtualizing Hadoop Workloads Justin Murray VMware

Agenda The Hadoop Journey Why Virtualize Hadoop? Elasticity and Scalability Performance Tests Storage Reference

Simplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli, VP Product, In-Memory Compute Summit, June 30, 2015

Streaming Analytics Scale-up Database Data And Compute Grid

Hadoop Trends and Practical Use Cases. April 2014

John Howey Cloudera [email protected] Kevin Lewis Cloudera [email protected] April 2014 Agenda Hadoop Overview Latest Trends in Hadoop Enterprise Ready Beyond

HAWQ Architecture. Alexey Grishchenko

Who I am Enterprise Architect @ Pivotal 7 years in data processing 5 years of experience with MPP 4 years with Hadoop Using HAWQ since the first internal Beta Responsible

Performance Comparison of SQL based Big Data Analytics with Lustre and HDFS file systems

Rekha Singhal and Gabriele Pacciucci * Other names and brands may be claimed as the property of others. Lustre File

Cloudera Backup and Disaster Recovery

Important Note: Cloudera Manager 4 and CDH 4 have reached End of Maintenance (EOM) on August 9, 2015. Cloudera will not support or provide patches for any of the Cloudera

docs.hortonworks.com

Ambari Views Guide Copyright 2012-2015 Hortonworks, Inc. All rights reserved. The, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing

Implementation of Hadoop Distributed File System Protocol on OneFS Tanuj Khurana EMC Isilon Storage Division

Outline HDFS Overview OneFS Overview HDFS protocol on OneFS HDFS protocol server implementation

Dell* In-Memory Appliance for Cloudera* Enterprise

Built with Intel Find out what faster big data analytics can do for your business The need for speed in all things related to big data is an enormous

7 Deadly Hadoop Misconfigurations. Kathleen Ting February 2013

Who Am I? Kathleen Ting Apache Sqoop Committer, PMC Member Customer Operations Engineering Mgr, Cloudera @kate_ting, [email protected]

Apache Hadoop. Alexandru Costan

Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open

Cloudera Manager Monitoring and Diagnostics Guide

Important Notice (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names

How to Hadoop Without the Worry: Protecting Big Data at Scale

SESSION ID: CDS-W06 Davi Ottenheimer Senior Director of Trust EMC Corporation @daviottenheimer Big Data Trust. Redefined Transparency Relevance