Data Security in Hadoop




Data Security in Hadoop
Eric Mizell, Director, Solution Engineering
Page 1

What is Data Security?

Data Security for Hadoop allows you to administer a single policy to authenticate users, authorize data access, protect data at rest and in motion, and record interactions with that data. It allows you to coordinate enforcement of this policy across the entire Hadoop stack.

Page 2

What is Security?

5 areas of security focus:
- Administration: central management & consistent security
- Authentication: authenticate users and systems
- Authorization: provision access to data
- Audit: maintain a record of data access
- Data Protection: protect data at rest and in motion

Page 3

Why Security?

5 areas of security focus: Administration, Authentication, Authorization, Audit, Data Protection

Security needs are changing:
- YARN unlocks the data lake
- Multi-tenant: multiple applications for data access
- Changing and complex compliance environment
- ETL of non-sensitive data can yield sensitive data

Timeline:
- Fall 2013: largely siloed deployments with single-workload clusters
- Summer 2014: 65% of clusters host multiple workloads

Page 4

HDP delivers comprehensive security for Hadoop

- Comprehensive security: meet all security requirements across authentication, authorization, audit & data protection
- Central administration: provide one location for administering security policies and for viewing and managing audit across the platform
- Consistent integration: integrate with other security and identity management systems, for compliance with IT policies

[Figure: HDP 2.1 (Hortonworks Data Platform) stack. Governance & integration (data workflow, lifecycle & governance): Falcon, Sqoop, Flume, NFS, WebHDFS. Data access (batch, interactive & real-time) on YARN, the data operating system: Script (Pig, Tez), SQL (Hive/HCatalog, Tez), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), In-Memory (Spark), Others (ISV engines). Data management: HDFS (Hadoop Distributed File System). Security: Authentication, Authorization, Accounting, Data Protection, covering Storage: HDFS; Resources: YARN; Access: Hive, ...; Pipeline: Falcon; Cluster: Knox. Operations: Ambari (provision, manage & monitor), Zookeeper, Oozie (scheduling).]

Page 6

Start of 2014: Hadoop Data Security Capabilities

Requirements: authenticate, authorize, provide auditability of data access, and protect data at rest and in motion.

State of each security area:
- Administration (central management & consistent security): did not exist
- Authentication (authenticate users and systems): Kerberos & Apache Knox
- Authorization (provision access to data): fragmented across components
  - Hive: ATZ-NG
  - HDFS: ACLs
  - Sentry (interactive SQL & Search)
- Audit (maintain a record of data access): fragmented across components
- Data Protection (protect data at rest and in motion): 3rd-party add-ons, integration with Apache Falcon & OS capabilities

Start of 2014: State of Hadoop Security
- Largely disjoint patchwork
- A few projects address spot needs
- No coverage across all workloads
- No central administration and enforcement

Page 7

May 2014: Hortonworks Acquires XA Secure

Requirements: authenticate, authorize, provide auditability of data access, and protect data at rest and in motion.

Security areas extended (each marked "+" on the slide): Administration, Authentication, Authorization, Audit, Data Protection

Hortonworks acquired XA Secure:
- Accelerates delivery against the enterprise requirements for central security administration and enforcement across all Hadoop workloads, from batch to interactive SQL and real-time
- Founded in 2013, XA Secure provides an enterprise-ready, cross-platform security layer built from the ground up for Hadoop, providing centralized capabilities around data security, authorization, auditing and overall governance

Page 8

Current State of Security in HDP

HDP provides central administration and coordinated enforcement of security policy across the entire Hadoop ecosystem of projects.

State of each security area:
- Administration (central management & consistent security): central administration
- Authentication (authenticate users and systems): Kerberos & Apache Knox
- Authorization (provision access to data): for HDFS, Hive, HBase, etc.
- Audit (maintain a record of data access): compliance controls
- Data Protection (protect data at rest and in motion): 3rd-party add-ons, integration with Apache Falcon & OS capabilities

With additional XA Secure features, HDP is a leader in Hadoop security.

New features:
- Centralized security policy enforcement
- Granular access control across HDFS, Hive, and HBase
- Universal audit tracking
- Compliance conformance controls

Page 9

Central Security Administration

HDP Advanced Security:
- Delivers a single pane of glass for the security administrator
- Centralizes administration of security policy
- Ensures consistent coverage across the entire Hadoop stack

Apache Argus
All delivered in open source under the governance of the ASF in Fall 2014.

Page 10

Authentication

For more than 20 years, Kerberos has been the de facto standard for strong authentication; no other option exists.

What does Kerberos do?
- Establishes identity for clients, hosts and services
- Prevents impersonation; passwords are never sent over the wire
- Integrates with enterprise identity management tools such as LDAP & Active Directory
- Enables more granular auditing of data access and job execution

The design and implementation of Kerberos security in native Apache Hadoop was delivered by Hortonworker Owen O'Malley in 2010.

Page 11

Perimeter Security with Apache Knox

Incubated and led by Hortonworks, Apache Knox provides a simple and open framework for Hadoop perimeter security.

Single, simple point of access for a cluster:
- Single Hadoop access point
- REST API hierarchy
- Consolidated API calls
- Multi-cluster support

Central controls ensure consistency across one or more clusters:
- Eliminates SSH edge node
- Central API management
- Central audit control
- Simple service-level authorization

Integrated with existing systems to simplify identity maintenance:
- SSO integration: SiteMinder and OAM*
- LDAP & AD integration

Page 12
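To illustrate the "single Hadoop access point" and "REST API hierarchy" ideas, here is a minimal Python sketch that builds a WebHDFS URL routed through a Knox gateway rather than addressed to a cluster node directly. It assumes Knox's default port (8443) and default gateway path ("gateway"); the host name knox.example.com, the topology name "default" and the helper knox_webhdfs_url are hypothetical names chosen for the example, not part of the slides.

```python
from urllib.parse import quote, urlencode


def knox_webhdfs_url(gateway_host, topology, hdfs_path, op="LISTSTATUS",
                     port=8443, gateway_path="gateway"):
    """Build a WebHDFS URL that goes through a Knox gateway.

    Assumptions: port 8443 and the "gateway" path are Knox defaults;
    the topology (cluster) name is deployment-specific.
    """
    path = quote(hdfs_path.lstrip("/"))   # keep "/" separators, escape the rest
    query = urlencode({"op": op})         # WebHDFS operation, e.g. LISTSTATUS
    return (f"https://{gateway_host}:{port}/{gateway_path}/"
            f"{topology}/webhdfs/v1/{path}?{query}")


# Hypothetical host and topology, for illustration only.
print(knox_webhdfs_url("knox.example.com", "default", "/tmp"))
```

Every client uses the same gateway host and URL hierarchy; only the topology segment changes when a gateway fronts more than one cluster.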

Authorization and Audit

Authorization: fine-grained access control
- HDFS: folder, file
- Hive: database, table, column
- HBase: table, column family, column
- Flexibility in defining policies

Audit: extensive user access auditing in HDFS, Hive and HBase
- IP address
- Resource type / resource
- Timestamp
- Access granted or denied

Control access into system

Page 13
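The combination of fine-grained policies and per-access audit records can be sketched in a few lines of Python. This is an illustrative model only, not the XA Secure or HDP implementation: the Policy and Authorizer classes, the default-deny rule and the prefix-matching scheme are assumptions made for the example. Resources are written as tuples so that one policy on (database, table) also covers every column beneath it.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: users may perform actions on a resource subtree."""
    resource_type: str   # "hdfs", "hive" or "hbase"
    resource: tuple      # e.g. ("sales", "orders") = Hive database, table
    users: frozenset
    actions: frozenset


@dataclass
class Authorizer:
    policies: list
    audit_log: list = field(default_factory=list)

    def check(self, user, ip, resource_type, resource, action):
        # Default deny: access is granted only if some policy matches.
        allowed = any(
            p.resource_type == resource_type
            and resource[:len(p.resource)] == p.resource  # policy covers subtree
            and user in p.users
            and action in p.actions
            for p in self.policies
        )
        # Record the audit fields listed on the slide for every decision.
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "ip": ip,
            "user": user,
            "resource_type": resource_type,
            "resource": "/".join(resource),
            "action": action,
            "result": "granted" if allowed else "denied",
        })
        return allowed


authz = Authorizer([
    Policy("hive", ("sales", "orders"), frozenset({"alice"}), frozenset({"select"})),
])
print(authz.check("alice", "10.0.0.5", "hive", ("sales", "orders", "amount"), "select"))
print(authz.check("bob", "10.0.0.9", "hive", ("sales", "orders", "amount"), "select"))
print([entry["result"] for entry in authz.audit_log])
```

Denied requests are logged as well as granted ones, which is what lets the audit trail answer "who tried to read this column?" and not only "who did".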

Data Protection

HDP allows you to apply data protection policy at three different layers across the Hadoop stack.

Layer         What?                              How?
Storage       Encrypt data while it is at rest   Partners, OS-level encryption, custom code
Transmission  Encrypt data as it moves           Supported in HDP 2.1
Upon access   Apply restrictions when accessed   Partner, custom code

Page 14
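For the "upon access" layer, the slide names custom code as one option. The sketch below shows one way such custom code might look: masking sensitive columns when a row is read by a user who lacks a privileged role. The function names, the "pii_reader" role and the keep-last-four masking rule are all hypothetical choices for illustration, not HDP features.

```python
def mask_value(value, visible=4, fill="*"):
    """Mask all but the last `visible` characters of a value."""
    text = str(value)
    if len(text) <= visible:
        return fill * len(text)           # too short to reveal anything
    return fill * (len(text) - visible) + text[-visible:]


def read_row(row, sensitive_columns, user_roles, privileged_role="pii_reader"):
    """Return the row as the caller may see it (hypothetical role model)."""
    if privileged_role in user_roles:
        return dict(row)                  # privileged users see clear values
    return {
        column: mask_value(value) if column in sensitive_columns else value
        for column, value in row.items()
    }


row = {"customer": "Ann", "card_number": "4111111111111111"}
print(read_row(row, {"card_number"}, user_roles={"analyst"}))
print(read_row(row, {"card_number"}, user_roles={"pii_reader"}))
```

The stored data is unchanged; the restriction is applied only at read time, which is what distinguishes this layer from encryption at rest or in transit.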

Thank You!
Eric Mizell, Director, Solution Engineering
Page 15