Data Security as a Business Enabler Not a Ball & Chain. Big Data Everywhere May 12, 2015





Les McMonagle
Protegrity - Director, Data Security Solutions

Les has over twenty years of experience in information security. He has held the position of Chief Information Security Officer (CISO) for a credit card company and ILC bank, founded a computer training and IT outsourcing company in Europe, and helped several security technology firms develop their initial product strategy. Les founded and managed Teradata's Information Security, Data Privacy and Regulatory Compliance Center of Excellence and is currently Director of Data Security Solutions for Protegrity. Les holds a BS in MIS, CISSP, CISA, ITIL and other relevant industry certifications.

Les McMonagle (CISSP, CISA, ITIL)
Mobile: (617) 501-7144
Email: Les.McMonagle@Protegrity.com

The Problem...
The cost of cybercrime is staggering:
- The cost to the global economy is in excess of $400 billion per year.
- Businesses that are victims of cybercrime need an average of 18 days to resolve the problem and suffer average costs of over $400K.
- The tangible and intangible costs associated with some of the recent high-profile cases exceed $400M.
Traditional network security, firewalls, IDS, SIEM, AV and monitoring solutions do not offer the comprehensive security needed to protect the target data against current, new and evolving threats.

Typical Phases of an Attack
[Figure not reproduced in the transcription.]
Source: http://eval.symantec.com/mktginfo/enterprise/white_papers/b-anatomy_of_a_data_breach_wp_20049424-1.en-us.pdf

Factors to Consider
- Bad guys search for the easy targets
  - Large repositories of valuable, unprotected data
  - Systems with weaker controls and/or more access paths
  - Financial data or Personally Identifiable Information (PII)
- Blurring of network boundaries
  - Where does your company network end and another begin?
  - BYOD
  - Cloud
  - IoT (Internet of Things)
- Insider threats remain the biggest threat
- Advanced Persistent Threats (APTs)
  - Coordinated, comprehensive attack strategies

Types of Sensitive Data Potentially Stored in Hadoop
SSN, Credit Card PAN, Bank Account Numbers, PIN, Pending Patents, Health History, DOB,
Production Planning, Prescriptions, Employee Personnel Records, Best Practices, Trade Secrets,
Customer Lists, Health Records, Sales Forecasts, Payroll Data, Accounts Receivable, Order History,
Accounts Payable, Customer Contact Information, R&D, Home Addresses, Income Data, Salary Data,
Location Data, Passwords, Project Plans

What to do about it?
[Diagram labels: Process, Policy, Sponsors]
- Engage Information Security (CISO & InfoSec)
- Work with Legal and Compliance
- Establish a good Data Governance program
- Apply consistent protection throughout the data flow
- Limit access on a need-to-know basis
- Protect the actual data itself (regardless of where it is)
- De-identify data without losing analytics value

Engage InfoSec, Legal, Compliance, Privacy
- Engage Information Security rather than avoid them
  - CISOs and InfoSec ultimately have the same goals
  - Will help fund and implement effective data protection
- Legal, Privacy and Compliance
  - Identify/interpret regulatory and compliance requirements
  - Help protect the business by identifying risks to consider
  - Incorporate generally accepted Privacy Principles*

Data Governance Program
- Establish a good data governance program
  - Identified Data Owners
  - Identified Data Stewards
  - Identified Data Custodians
  - RACI roles and responsibilities
- Data Governance subject areas
  - Data Ownership
  - Data Quality
  - Data Integration
  - Metadata Management
  - Master Data Management
  - Data Architecture
  - Data Security & Privacy

Protect sensitive data consistently wherever it goes
- At Rest
- In Transit
- In Use
Ideally with a single, centralized enterprise solution

What Data to Tokenize or Encrypt?
Important Data Security Architecture Questions

Important questions to ask...
- What policy and regulatory compliance requirements apply?
- What risks must be mitigated?
- How/why are protected columns accessed/used?
- What other mitigating controls are available?
- What is the appropriate balance between business and data privacy/security?
- When is Tokenization or Encryption most appropriate?
- Utilization and access control limitations of Hadoop / Hive
- Alternative protection options to consider
  - Full Disk Encryption (FDE)

To Encrypt or Tokenize... This is the Question
[Figure: chart positioning data elements between Tokenization and Encryption. Only the labels survive in the transcription.]
Axis labels:
- Field size relative to width of lookup table (Large to Small)
- Structured (More to Less)
- Logic in portions of the data element (More to Less)
- Percent of access requiring clear text (Less to More)
- Increasing data sensitivity
Data elements shown: SSN, CC-PAN, Healthcare Records, PIN, CID, CV2, Password, X-Ray, Cat Scan, HIV-Pos* Diagnosis, Patient ID #, Bank Acct No., report, Customer ID #, DOB
* With Initialization Vector (IV)

Potential Additional Controls to Consider
- Tokenization or Encryption farther upstream in the data flow
- Do not load unnecessary regulated data to Hadoop
- Access Hadoop Hive tables through Teradata (QueryGrid)
- HDFS file-level access control
- Accumulo cell-level access control (row/column intersection)
- Knox Gateway (authentication for multiple Hadoop clusters)
- Coarse-grained HDFS file encryption
- XASecure (now HDP Advanced Security)
- Ambari (Hadoop cluster management)
- Kerberos (authentication), all or nothing
Piecemeal independent security tools for Hadoop

Reduce Your Exposure and Risk
[Diagram: three user populations, with the labels SSN Token, SSN Last 4 Digits and SSN Full]
- Population of users who have access to SSN today
- Population of users who can perform their job function with only the last 4 digits of the SSN
- Population of users who need access to the full SSN to perform their job function

Vaultless Tokenization is a form of data protection that converts sensitive data into fake data. The real data can be retrieved only by authorized users. It is often a more usable form of protection than encryption.

Improve Security Posture Without Impacting Analytics Value
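To make the idea concrete, here is a minimal Python sketch of tokenizing an SSN while leaving the last 4 digits in the clear. It is an illustration only, not Protegrity's algorithm: the function names, the per-position substitution tables and the way they are derived from a secret are all assumptions, and `random.Random` is not a cryptographically secure source. The point it shows is that a token keeps the length and format of the original, can be reversed by whoever holds the tables, and needs no per-value vault.

```python
import random

DIGITS = "0123456789"

def build_tables(secret, positions):
    """Derive one digit-substitution table per position (hypothetical helper).

    Toy derivation: random.Random is deterministic for a given seed but is
    NOT cryptographically secure. A real product would use vetted methods.
    """
    rng = random.Random(secret)
    tables = []
    for _ in range(positions):
        shuffled = list(DIGITS)
        rng.shuffle(shuffled)
        tables.append(dict(zip(DIGITS, shuffled)))
    return tables

def tokenize_ssn(ssn, tables, keep_last=4):
    """Substitute every digit except the last `keep_last`; keep separators."""
    total_digits = sum(ch.isdigit() for ch in ssn)
    out = []
    digit_index = 0
    for ch in ssn:
        if ch.isdigit():
            if digit_index < total_digits - keep_last:
                out.append(tables[digit_index][ch])  # protected position
            else:
                out.append(ch)                       # last digits stay clear
            digit_index += 1
        else:
            out.append(ch)                           # keep "-" formatting
    return "".join(out)

def detokenize_ssn(token, tables, keep_last=4):
    """Reverse tokenize_ssn by applying the inverted tables."""
    inverse = [{v: k for k, v in table.items()} for table in tables]
    return tokenize_ssn(token, inverse, keep_last)

# Usage: a 9-digit SSN with 4 clear digits needs 5 substitution tables.
tables = build_tables("demo-secret", positions=5)
token = tokenize_ssn("123-45-6789", tables)
original = detokenize_ssn(token, tables)
```

Users who only need the last 4 digits can work with the token directly, while only a caller holding the tables can recover the full value.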

What to look for in a good Enterprise Solution
Critical Core Requirements:
- Single solution across all core platforms
- Scalable, centralized enterprise-class solution
- Segregation of duties between DBA and Security Admin
- Good encryption key or token lookup table management
- Data layer solution
- Tamper-proof audit trail
- Transparent (as far as possible) to authorized users
- High Availability (HA)
- Optional in-database vs. ex-database encryption/tokenization

Other "nice to have" Features...
- Flexible protection options (Encrypt, Tokenize, DTP/FPE, Masking)
- Broadest possible support for a range of data types
- Built-in DR, Dual Active, key and system recovery capability
- Minimal performance impact to applications/end users
- Optimized operations to minimize CPU utilization
- Proven implementation methodology
- PCI-DSS compliant solution (meeting all relevant requirements)
- Deep partnership with Teradata and other database providers
- Minimal impact on system upgrades
- Maintain consistent referential integrity and indexing capability
- Low Total Cost of Ownership (TCO)

What to look for in a good solution for Hadoop
- Coarse Grained and Fine Grained Protection Capability
  - HDFS File Encryption, Multi-Tenant File Encryption, HDFS FP (HDFS Codec)
  - Column/Field Level Fine Grained Protection
- Multi-Tenant Row Level Protection
  - Allow authorized users access to specific rows only
  - Unprotect columns for authorized users only
- Heterogeneous Protection Capabilities
  - Protect upstream sources of data and downstream targets of data
- Vaultless Tokenization, often less intrusive than encryption
  - Reversible protection, where masking is not
- Deployed on the (Data) Nodes
  - Leverage the MPP architecture of Hadoop
  - Avoid appliance-based solutions that can slow down Hadoop
- Tokenization capability for Hive access to HDFS Files/Tables
  - Hive does not support the VarByte data type (Encryption = Binary Ciphertext)

Granularity of Protecting Sensitive Data

Coarse Grained Protection (File/Volume)
- Methods: File or Volume encryption
  - OS File System Encryption
  - HDFS Encryption
- All or nothing approach
- Secures data at rest and in transit
- Does NOT secure file contents in use

Fine Grained Protection (Data/Field)
- Methods:
  - Vaultless Tokenization
  - Masking
  - Encryption (Strong, Format Preserving)
- Operates at the individual field level
- Data is protected in use and wherever it goes
- Business logic can be retained

Data Security Platform
[Architecture diagram: the Enterprise Security Administrator distributes Policy to, and collects an Audit Log from, each protection point. Protection points shown: RDBMS, Applications, EDW, Big Data, IBM Mainframe Protector, Netezza, File Servers, File and Cloud Gateway Servers, Protection Servers.]
Protegrity Confidential

Protegrity's Big Data Protector for Hadoop
[Diagram: a Hadoop Cluster of Hadoop Nodes, each with the stack Hive, Pig, Other / MapReduce / YARN / HBase / HDFS / OS File System, exchanging Policy and Audit with the Enterprise Security Administrator.]
- Protegrity Big Data Protector for Hadoop delivers protection at every node and is delivered with our own cluster management capability.
- All nodes are managed by the Enterprise Security Administrator, which delivers policy and accepts audit logs.
- The Protegrity Data Security Policy contains information about how data is de-identified and who is authorized to have access to that data. Policy is enforced at different levels of protection in Hadoop.
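As a rough illustration of what such a policy carries, the Python sketch below models a policy as a mapping from column name to a protection method and a set of authorized roles, and records every access decision in an audit list. The class, the role names, the `read_row` helper and the `unprotect` callback are hypothetical names chosen for this example; they are not part of any Protegrity API.

```python
from dataclasses import dataclass, field

@dataclass
class ColumnRule:
    """How one column is de-identified and who may see it in the clear."""
    method: str                                   # e.g. "tokenize", "mask"
    authorized_roles: set = field(default_factory=set)

# Hypothetical policy: column names and roles are illustrative only.
POLICY = {
    "ssn": ColumnRule("tokenize", {"fraud_analyst"}),
    "dob": ColumnRule("mask"),                    # masking: nobody can reverse
}

def read_row(row, role, policy, unprotect, audit_log):
    """Return the row as `role` is allowed to see it.

    Protected columns are unprotected only for authorized roles; everyone
    else receives the stored (protected) value. Each decision is audited.
    """
    visible = {}
    for column, value in row.items():
        rule = policy.get(column)
        if rule is None:
            visible[column] = value               # column is not protected
            continue
        allowed = role in rule.authorized_roles
        audit_log.append((role, column, allowed))
        visible[column] = unprotect(column, value) if allowed else value
    return visible

# Usage with a stand-in unprotect function backed by a small lookup.
clear_values = {("ssn", "TOK-0001"): "123-45-6789"}
stored_row = {"name": "A. Customer", "ssn": "TOK-0001", "dob": "xx/xx/1967"}
audit = []
analyst_view = read_row(stored_row, "fraud_analyst", POLICY,
                        lambda c, v: clear_values[(c, v)], audit)
marketing_view = read_row(stored_row, "marketing", POLICY,
                          lambda c, v: clear_values[(c, v)], audit)
```

Keeping the rules in one structure is what lets a central administrator push the same policy to every node and collect one consistent audit trail.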

Rich Security Layer over the Hadoop Ecosystem
- Pig / Hive
  - UDF support for Pig
  - UDF support for Hive
  - Hive tokenization
- MapReduce
  - Java API support for MapReduce
- YARN
- HBase
  - Coprocessor support via UDFs
- Cassandra
  - UDT
- HDFS
  - Encryption through the HDFS Codec
  - HDFS commands extended for security functions
  - HDFS interface for Java programs
  - De-identify before ingestion into HDFS
- OS File System
  - File system encryption: folder/file or volume

Coarse Grained Protection: File / Volume Encryption
[Diagram: stack of Pig / Hive, MapReduce, YARN, HBase, HDFS, File System]
- Pig / Hive: all fields are in the clear
- MapReduce: all fields are in the clear
- HDFS: a file with identifiable data elements is stored with the entire file encrypted
- File System: the volume encryption option will encrypt the entire volume rather than the files themselves

Coarse Grained with HDFS Staging Area
[Diagram labels: Ingest into HDFS, HDFS Staging Area, MapReduce Jobs. Stack: Pig / Hive, MapReduce, YARN, HBase, HDFS, File System.]

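Coarse Grained Multi-Tenant Protection
[Diagram: tenants T1, T2 and T3 ingest into HDFS. Each tenant has its own folder protected with its own key (T1 folder with Key 1, T2 folder with Key 2, T3 folder with Key 3), alongside a clear folder. Stack: Pig / Hive, MapReduce, YARN, HBase, HDFS, File System.]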
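The folder-to-key layout in this slide can be sketched as a simple lookup. The Python below is an assumption-laden illustration: the `/data/...` paths, the `TENANT_KEYS` mapping and the `key_for_path` helper are invented for the example, and the key values are identifiers only, with no actual encryption performed.

```python
from pathlib import PurePosixPath

# Hypothetical mapping of tenant folders to key identifiers (not real keys).
TENANT_KEYS = {
    "/data/T1": "Key 1",
    "/data/T2": "Key 2",
    "/data/T3": "Key 3",
}

def key_for_path(path, tenant_keys=TENANT_KEYS):
    """Return the key identifier protecting `path`, or None for clear data.

    Matching is done on whole path components, so /data/T10 is not
    mistaken for a file under /data/T1.
    """
    parents = PurePosixPath(path).parents
    for folder, key_id in tenant_keys.items():
        if PurePosixPath(folder) in parents:
            return key_id
    return None  # e.g. files under the clear folder

# Usage
t1_key = key_for_path("/data/T1/orders/2015-05.csv")
clear_key = key_for_path("/data/clear/reference.csv")
```

Because each tenant's files resolve to a different key, one tenant's key never unlocks another tenant's folder, while the protection remains all-or-nothing within each file.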

Fine Grained Protection

Production Systems

Encryption
- Reversible
- Policy control (authorized / unauthorized access)
- Lacks integration transparency
- Not searchable or sortable
- Complex key management
- Example: !@#$%a^.,mhu7///&*B()_+!@

Vaultless Tokenization / Pseudonymization
- Reversible, with policy control (authorized / unauthorized access), or not reversible
- No complex key management
- In either case, integrates transparently
- Searchable and sortable
- Business Intelligence example: 0389 3778 3652 0038

Non-Production Systems

Masking
- Not reversible
- No policy; everyone can access the data
- Integrates transparently
- No complex key management
- Example: Date of Birth 2/15/1967 masked as xx/xx/1967
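The masking example from this slide is simple enough to write out. The Python sketch below, using a hypothetical `mask_dob` helper, keeps the year and replaces month and day with "xx". Unlike tokenization there is no key or table involved, so nothing can turn the masked value back into the original.

```python
import re

def mask_dob(dob):
    """Mask month and day of a M/D/YYYY date, keeping only the year."""
    match = re.fullmatch(r"(\d{1,2})/(\d{1,2})/(\d{4})", dob)
    if match is None:
        raise ValueError("expected a date in M/D/YYYY form: %r" % dob)
    return "xx/xx/" + match.group(3)

# Usage: the example from the slide.
masked = mask_dob("2/15/1967")   # "xx/xx/1967"
```

The year survives, so age-band analytics still work on non-production copies, but two different birthdays in the same year become indistinguishable.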

Enterprise-wide Protection
[Architecture diagram. Only the labels survive in the transcription:]
- Source Systems (Internal / External): Input File Source, Input File Source, FPG, ETL
- Edge Node: File Protector, Java Program, Application Protector, Sqoop
- Ecosystem Components: Pig, Hive, MapReduce, YARN, HBase, HDFS, OS FS, running across the nodes
- Consumption: BI Systems
- Target Systems (Internal / External): Downstream Systems, Database Server, Database, Database Protector
- ESA: Policy Deployment, Audit Collection
If the Edge Node is a Hadoop Node, Hadoop resources can be used.

Traditional IT Environment: Protegrity Protection
[Diagram: Typical Enterprise Today. Labels: Internet, Inside the Firewall, Apps, EDW, DBs, Files, Hadoop.]

Today's IT Environment: Protegrity Protection
[Diagram: Typical Enterprise Today. Labels: Internet, Inside the Firewall, Apps, Cloud Protector Gateway, DBs, Files, File Protector Gateway, EDW, ESA, HG, Hadoop.]

In Summary
- Establish good Data Governance
- Protect the actual data itself
- Maintain referential integrity
- De-identify data while maintaining analytics capability
- Apply consistent protection throughout the data flow
- Engage Information Security, Legal and Compliance
Build security in rather than bolt it on later

Sign Up for a Free ½ Day Risk Assessment Workshop

Thank You
Q & A

Les McMonagle (CISSP, CISA, ITIL)
Mobile: (617) 501-7144
Email: Les.McMonagle@Protegrity.com