Hadoop Elephant in Active Directory Forest. Marek Gawiński, Arkadiusz Osiński Allegro Group




Agenda
- Goals and motivations
- Technology stack
- Architecture evolution
- Automating the integration of new servers
- Making AD users and groups visible to Linux
- Making the architecture resilient to AD service unavailability
- Auto-deployment of client software on desktops

Allegro Hadoop cluster in numbers
- 4 terabytes of RAM
- 2 petabytes of disk space
- 47 datanodes
- 79 projects
- 612 users

Goals and motivations
- Secured cluster
- Central authentication and authorisation
- Compliance for real and project users and groups
- Cluster resources available from the desktop
- Integrating new servers automatically
- Making the whole architecture resilient to AD failures and timeouts
- Auto-deployment and autoconfiguration of Hadoop client software on users' desktops

Technology stack
- Cloudera CDH5
- MIT Kerberos
- Microsoft Active Directory
- FreeIPA
- sssd
- puppet
- msktutil
- Hadoop desktop client

History - FreeIPA + FreeIPA Kerberos
[Diagram. Components: Client, FreeIPA (users, local groups management), Kerberos KDC, secured Hadoop cluster. Arrow labels: user/pass, check user/pass, check groups, Kerberos service ticket, internal Hadoop creds.]

History - FreeIPA + own Kerberos
[Diagram. Components: Client, FreeIPA (users, local groups management), Kerberos KDC, Kerberos KDC MIT, secured Hadoop cluster. Arrow labels: user/pass, check user/pass, check groups, Kerberos service ticket, internal Hadoop creds.]

History - FreeIPA + own Kerberos + AD
[Diagram. Components: Client, FreeIPA (users, local groups management), Kerberos KDC MIT, AD User&Groups, AD Kerberos, secured Hadoop cluster. Arrow labels: user/pass, check user/pass, check groups, Kerberos service ticket, internal Hadoop creds.]

Final - own Kerberos + AD
[Diagram. Components: Client, Kerberos KDC MIT, AD User&Groups, AD Kerberos, secured Hadoop cluster. Arrow labels: user/pass, check user/pass, check groups, Kerberos service ticket, internal Hadoop creds.]

Integrating new Linux servers automatically with AD
[Diagram. Components: msktutil, AD User&Groups, AD Kerberos, Kerberos keytab. Arrow labels: create user, create principal.]

Integrating new Linux servers automatically with AD

Puppet defined type (excerpt):

    define get_ad_keytab ( $path = '',...) {
      ...
      $realm = 'SOME_REALM'
      $pass = hiera('hadoop_prod/ad/krb_manager_pass')
      $principal = "${title}/${host}@${realm}"
      # Get a ticket as the AD manager account, create the principal, then drop the ticket.
      $command = "echo ${pass} | kinit _hadoop_manager@${realm}; \
        /usr/local/bin/add_ad_princ.sh ${title} ${host} ${path}; kdestroy"
      ...

msktutil invocation (excerpt):

    msktutil -c -s $PRINCIPAL --upn $PRINCIPAL -k $KEYTAB \
      --computer-name $COMPUTER_NAME \
      --server $SERVER_KRB \
      --realm $REALM \
      -b $USER_LDAP_ROOT \
      --dont-expire-password \
      --description "\"$DESCRIPTION\"" \
      --user-creds-only
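The msktutil call above can also be assembled programmatically. The following is a minimal Python sketch, not the authors' script: the function name and every example value (principal, keytab path, computer name, KDC host, LDAP base, description) are hypothetical. Only the flags that appear on the slide are used.

```python
import shlex


def build_msktutil_command(principal, keytab, computer_name, server,
                           realm, ldap_base, description):
    """Return an argv list that mirrors the msktutil flags shown on the slide."""
    return [
        "msktutil", "-c",
        "-s", principal,            # service principal to add
        "--upn", principal,         # user principal name set to the same value
        "-k", keytab,               # keytab file to write
        "--computer-name", computer_name,
        "--server", server,
        "--realm", realm,
        "-b", ldap_base,            # LDAP subtree for the new account
        "--dont-expire-password",
        "--description", description,
        "--user-creds-only",
    ]


# Hypothetical values, for illustration only.
example_cmd = build_msktutil_command(
    principal="host/nn1.local",
    keytab="/etc/krb5.keytab",
    computer_name="nn1-host",
    server="dc1.ad.realm",
    realm="AD.REALM",
    ldap_base="OU=Hadoop",
    description="Hadoop namenode host principal",
)
print(shlex.join(example_cmd))
```

Building the command as a list avoids shell quoting problems; the list can be handed directly to `subprocess.run` once a Kerberos ticket for the manager account is in place.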

Integrating new Linux servers automatically with AD

    root@nn1:~# klist -ket
    Keytab name: FILE:/etc/krb5.keytab
    KVNO Timestamp           Principal
    ---- ------------------- ------------------------------------------------------
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (aes256-cts-hmac-sha1-96)
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (aes128-cts-hmac-sha1-96)
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (des3-cbc-sha1)
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (arcfour-hmac)
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (camellia128-cts-cmac)
       1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (camellia256-cts-cmac)
       4 08/17/2015 13:30:23 91c76848bc458b62e67$@AD.REALM (arcfour-hmac)
       4 08/17/2015 13:30:23 91c76848bc458b62e67$@AD.REALM (aes128-cts-hmac-sha1-96)
       4 08/17/2015 13:30:23 91c76848bc458b62e67$@AD.REALM (aes256-cts-hmac-sha1-96)
       4 08/17/2015 13:30:23 host/nn1.local@ad.realm (arcfour-hmac)
       4 08/17/2015 13:30:23 host/nn1.local@ad.realm (aes128-cts-hmac-sha1-96)
       4 08/17/2015 13:30:23 host/nn1.local@ad.realm (aes256-cts-hmac-sha1-96)
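A keytab listing like the one above can be checked automatically, for example to confirm that the AD host principal carries the expected encryption types after provisioning. This is a small Python sketch that parses `klist -ket` text; the regular expression and function name are assumptions for illustration, and the sample is a shortened copy of the slide's output.

```python
import re
from collections import defaultdict

# One keytab entry: KVNO, date, time, principal, "(enctype)".
KLIST_LINE = re.compile(
    r"^\s*(?P<kvno>\d+)\s+(?P<date>\S+)\s+(?P<time>\S+)\s+"
    r"(?P<principal>\S+)\s+\((?P<enctype>[^)]+)\)\s*$"
)


def enctypes_by_principal(klist_output):
    """Map each principal in `klist -ket` output to its set of enctypes."""
    result = defaultdict(set)
    for line in klist_output.splitlines():
        match = KLIST_LINE.match(line)
        if match:  # header and separator lines do not match and are skipped
            result[match.group("principal")].add(match.group("enctype"))
    return dict(result)


SAMPLE = """\
Keytab name: FILE:/etc/krb5.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------
   1 08/17/2015 13:26:45 host/nn1.local@ipa.realm (aes256-cts-hmac-sha1-96)
   4 08/17/2015 13:30:23 host/nn1.local@ad.realm (arcfour-hmac)
   4 08/17/2015 13:30:23 host/nn1.local@ad.realm (aes256-cts-hmac-sha1-96)
"""

for principal, enctypes in sorted(enctypes_by_principal(SAMPLE).items()):
    print(principal, sorted(enctypes))
```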

Integrating new Linux servers automatically with AD Separated Subtree in AD structure

System Security Services Daemon
- Identity and authentication
- Multiple providers (FreeIPA, LDAP, AD)
- High availability for backends
- Provides PAM and NSS modules
- Caching
- Versions > 1.11.x: stable support for AD forest authentication

System Security Services Daemon

/etc/sssd/sssd.conf

    [domain/ad.realm]
    id_provider = ad
    ad_server = h1, h2, h3
    ad_backup_server = hb1, hb2, hb3
    auth_provider = ad
    chpass_provider = ad
    access_provider = ad
    enumerate = False
    krb5_realm = AD.REALM
    ldap_schema = ad
    ldap_id_mapping = True
    cache_credentials = True
    ldap_access_order = expire
    ldap_account_expire_policy = ad
    ldap_force_upper_case_realm = true
    fallback_homedir = /home/ad.realm/%u
    default_shell = /bin/false
    ldap_referrals = false

AD schema with no modifications

    root@nn1:~# id _hc_tech_prod | tr "," "\n"
    uid=1827653611(_hc_tech_prod) gid=1827600513(domain users) groups=1827600513(domain users)
    1827652945(_gr_hc_users_common)
    1827647474(_gr_hc_hadoop_prod)
    1827652940(_gr_hc_project1_prod)
    1827652919(_gr_hc_project2_prod)
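With `ldap_id_mapping = True`, SSSD derives POSIX IDs from Windows SIDs instead of reading them from AD attributes: each domain gets a slice of the ID space, and the object's RID is added to the start of that slice. The Python sketch below splits the IDs from the `id` output into slice and RID. It assumes the SSSD defaults `ldap_idmap_range_min = 200000` and `ldap_idmap_range_size = 200000`, which the slides do not state, and it covers only the simple case of a RID that fits in the domain's first slice.

```python
RANGE_MIN = 200_000    # assumed default of ldap_idmap_range_min
RANGE_SIZE = 200_000   # assumed default of ldap_idmap_range_size


def split_mapped_id(posix_id, range_min=RANGE_MIN, range_size=RANGE_SIZE):
    """Split an algorithmically mapped ID into (slice index, slice base, RID)."""
    slice_index, rid = divmod(posix_id - range_min, range_size)
    slice_base = range_min + slice_index * range_size
    return slice_index, slice_base, rid


# IDs taken from the `id _hc_tech_prod` output on the slide.
for name, posix_id in [
    ("_hc_tech_prod", 1827653611),
    ("domain users", 1827600513),
    ("_gr_hc_hadoop_prod", 1827647474),
]:
    slice_index, slice_base, rid = split_mapped_id(posix_id)
    print(f"{name}: slice={slice_index} base={slice_base} rid={rid}")
```

Under these assumptions all three IDs fall in the same slice, and the `domain users` group maps to RID 513, the well-known RID of Domain Users, which suggests the default ranges are a plausible reading of this setup.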

Making the whole architecture resilient to failures
- Active: closest DC
- Fallback servers in remote DC
- Local filesystem nss cache

/etc/sssd/sssd.conf

    [nss]
    memcache_timeout = 3600
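The combination of active servers, fallback servers, and a local cache can be illustrated with a short Python sketch. It shows the general idea only and is not how SSSD is implemented: the class, the `fetch` callable, and the demo data are all hypothetical, and the 3600-second lifetime simply echoes the `memcache_timeout` value above.

```python
import time


class FailoverLookup:
    """Try primary servers, then backups, and cache successful answers."""

    def __init__(self, primary, backup, fetch, ttl=3600, clock=time.monotonic):
        self._servers = list(primary) + list(backup)
        self._fetch = fetch    # callable(server, key); raises ConnectionError
        self._ttl = ttl        # cache lifetime in seconds
        self._clock = clock
        self._cache = {}       # key -> (stored_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # fresh cache hit, no network traffic
        for server in self._servers:
            try:
                value = self._fetch(server, key)
            except ConnectionError:
                continue               # this server is down, try the next one
            self._cache[key] = (now, value)
            return value
        if entry is not None:
            return entry[1]            # every server failed: serve the stale entry
        raise LookupError(f"no server could resolve {key!r}")


# Hypothetical demo: the primary h1 is unreachable, the backup hb1 answers.
def demo_fetch(server, key):
    if server == "h1":
        raise ConnectionError(server)
    return f"{key} resolved by {server}"


lookup = FailoverLookup(primary=["h1"], backup=["hb1"], fetch=demo_fetch)
print(lookup.get("_hc_tech_prod"))
```

Serving a stale entry when every server is unreachable is a deliberate choice in this sketch: for a Hadoop cluster, a slightly outdated group list is usually preferable to failed lookups during an AD outage.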

Auto-deployment and autoconfiguration on desktops
- Install script for the Hadoop client on desktops
- Refreshes configs from the current prod environment
- Support for HDFS/YARN/Hive/Spark

    [marek.gawinski:~/allehadoop] $ sh env.sh
    Password for marek.gawinski@ad.realm: **************
    [marek.gawinski:~/allehadoop] $ klist
    Ticket cache: FILE:/tmp/krb5cc_1511317717
    Default principal: marek.gawinski@ad.realm

    Valid starting     Expires            Service principal
    09/04/15 23:31:35  09/05/15 09:31:35  krbtgt/ad.realm@ad.realm
            renew until 09/11/15 23:31:33
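The timestamps in the `klist` output give the ticket lifetime and the renewal window directly. The Python sketch below computes both from the values on the slide and adds a hypothetical helper that a desktop client script could use to decide when to re-run `kinit`. It assumes the dates are in MM/DD/YY format, which is locale dependent.

```python
from datetime import datetime, timedelta

KLIST_FORMAT = "%m/%d/%y %H:%M:%S"  # assumed klist date format


def parse_klist_time(text):
    """Parse a klist timestamp such as '09/04/15 23:31:35'."""
    return datetime.strptime(text, KLIST_FORMAT)


def needs_new_ticket(now, expires, margin=timedelta(minutes=30)):
    """Hypothetical policy: refresh once less than `margin` of lifetime is left."""
    return expires - now < margin


valid_from = parse_klist_time("09/04/15 23:31:35")
expires = parse_klist_time("09/05/15 09:31:35")
renew_until = parse_klist_time("09/11/15 23:31:33")

print("ticket lifetime:", expires - valid_from)
print("renewable for:  ", renew_until - valid_from)
print("refresh at 09:15?", needs_new_ticket(parse_klist_time("09/05/15 09:15:00"), expires))
```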

Auto-deployment and autoconfiguration on desktops

    [marek.gawinski:~/allehadoop] $ hdfs dfs -ls
    Found 8 items
    drwxr-xr-x   - marek.gawinski hadoop    0 2015-08-06 02:00 .Trash
    drwxr-xr-x   - marek.gawinski hadoop    0 2015-07-28 21:01 .hiveJars
    drwxr-xr-x   - marek.gawinski hadoop    0 2015-07-09 10:43 .sparkStaging
    drwx------   - marek.gawinski hadoop    0 2015-05-22 02:35 .staging
    drwxr-xr-x   - marek.gawinski hadoop    0 2015-08-31 13:11 oozie1
    -rw-r--r--   3 marek.gawinski hadoop   43 2015-05-26 15:26 ozzietest1.hql
    -rw-r--r--   3 marek.gawinski hadoop   13 2015-08-31 12:30 pwd.txt
    drwxr-xr-x   - marek.gawinski hadoop    0 2015-04-16 16:21 tables

    [marek.gawinski:~/allehadoop] $ hive
    hive (default)> show databases;
    OK
    database_name
    tpch_benchmarks
    ...
    xwing_poc
    Time taken: 0.816 seconds, Fetched: 72 row(s)
    hive (default)> set hive.execution.engine = tez;
    hive (default)> select count(*) from table1;


Benefits
- One standard for access control to all company resources
- Every new employee can use Hadoop automatically, with no additional effort
- One password for all systems

Thank you! Questions?