Data Security as a Business Enabler Not a Ball & Chain. Big Data Everywhere May 21, 2015





Les McMonagle, Protegrity, Director of Data Security Solutions
Les has over twenty years of experience in information security. He has held the position of Chief Information Security Officer (CISO) for a credit card company and ILC bank, founded a computer training and IT outsourcing company in Europe, and helped several security technology firms develop their initial product strategy. Les founded and managed Teradata's Information Security, Data Privacy and Regulatory Compliance Center of Excellence and is currently Director of Data Security Solutions for Protegrity. Les holds a BS in MIS, CISSP, CISA, ITIL and other relevant industry certifications.
Les McMonagle (CISSP, CISA, ITIL) Mobile: (617) 501-7144 Email: Les.McMonagle@Protegrity.com

The Problem...
The cost of cybercrime is staggering:
• The annual cost to the global economy exceeds $400 billion.
• Businesses that are victims of cybercrime need an average of 18 days to resolve the problem and suffer average costs of over $400K.
• The tangible and intangible costs associated with some of the recent high-profile cases exceed $400M.
Traditional network security, firewalls, IDS, SIEM, AV and monitoring solutions do not offer the comprehensive security needed to protect the target data against current, new and evolving threats.

Typical Phases of an Attack
Source: http://eval.symantec.com/mktginfo/enterprise/white_papers/b-anatomy_of_a_data_breach_wp_20049424-1.en-us.pdf

Factors to Consider
• Bad guys search for the easy targets
  - Large repositories of valuable, unprotected data
  - Systems with weaker controls and/or more access paths
  - Financial Data or Personally Identifiable Information (PII)
• Blurring of Network Boundaries
  - Where does your company network end and another begin?
  - BYOD
  - Cloud
  - IoT (Internet of Things)
• Insider threats remain the biggest threat
• Advanced Persistent Threats (APTs)
  - Coordinated, comprehensive attack strategies

Types of Sensitive Data Potentially Stored in Hadoop
SSN, DOB, PIN, Credit Card PAN, Best Practices, Bank Account Numbers, Customer Lists, Pending Patents, Health History, Production Planning, Prescriptions, Employee Personnel Records, Trade Secrets, Health Records, Accounts Receivable, Payroll Data, Order History, Accounts Payable, Sales Forecasts, Customer Contact Information, R&D, Home Addresses, Income Data, Salary Data, Location Data, Passwords, Project Plans

What to do about it
• Engage Information Security
• Work with Legal and Compliance
• Establish Good Data Governance Program
• Adhere to generally accepted privacy principles *
• Apply consistent protection throughout the data flow
• Limit access on a Need-to-Know basis
• Protect the actual data itself (regardless of where it is)
• De-Identify data without losing analytics value
* See reference slide(s) at end of presentation

Engage InfoSec, Legal, Compliance, Privacy
• Engage Information Security rather than avoid them
• CISOs and InfoSec ultimately have the same goals
• Will help fund and implement effective data protection
• Legal, Privacy and Compliance
  - Identify/interpret regulatory and compliance requirements
  - Help protect the business by identifying risks to consider
  - Incorporate generally accepted Privacy Principles *
* See reference slide(s) at end of presentation

Data Governance Program
• Establish good data governance program
  - Identified Data Owners
  - Identified Data Stewards
  - Identified Data Custodians
  - RACI Roles and Responsibilities
• Data Governance subject areas
  - Data Ownership
  - Data Quality
  - Data Integration
  - Metadata Management
  - Master Data Management
  - Data Architecture
  - Data Security & Privacy

Protect sensitive data consistently wherever it goes
• At Rest
• In Transit
• In Use
Ideally with a single, centralized enterprise solution

What Data to Tokenize or Encrypt?
• Important questions to ask...
  - What policy and regulatory compliance requirements apply?
  - What risks must be mitigated?
  - How/Why are protected columns accessed/used?
  - What other mitigating controls are available?
  - Appropriate balance between business and data privacy/security?
  - When is Tokenization or Encryption most appropriate?
• Utilization and access control limitations of Hadoop / Hive
• Alternative protection options to consider
  - Full Disk Encryption (FDE)
Important Data Security Architecture Questions

To Encrypt or Tokenize... This is the Question
[Figure: chart placing data elements between Tokenization and Encryption along the following scales]
• Field size relative to width of lookup table (Large to Small)
• Structured (More to Less)
• Logic in portions of the data element (More to Less)
• Percent of access requiring clear text (Less to More)
• Increasing data sensitivity
Data elements shown: SSN, CC-PAN, Healthcare Records, PIN, CID, CV2, Password, X-Ray, Cat Scan, HIV-Pos*, Diagnosis, Patient ID #, Bank Acct No., report, Customer ID #, DOB
* With Initialization Vector (IV)

Potential Additional Controls to Consider
• Tokenization or Encryption farther upstream in Data Flow
• Do not load unnecessary regulated data to Hadoop
• Access Hadoop Hive Tables through Teradata (QueryGrid)
• HDFS file-level access control
• Accumulo cell level access control (Row/Column intersection)
• Knox Gateway (authentication for multiple Hadoop clusters)
• Coarse grained HDFS File Encryption
• XASecure (now HDP Advanced Security)
• Ambari (Hadoop Cluster Management)
• Kerberos (Authentication), all or nothing
Piecemeal independent security tools for Hadoop

Reduce your Exposure and Risk
Vaultless Tokenization is a form of data protection that converts sensitive data into fake data. The real data can be retrieved only by authorized users. Often a more usable form of protection than encryption.
• Token SSN: population of users who have access to SSN today
• SSN Last 4 Digits: population of users who can perform their job function with only the last 4 digits of the SSN
• Full SSN: population of users who need access to the full SSN to perform their job function
Improve Security Posture Without Impacting Analytics Value
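The tiered access described on this slide can be sketched in a few lines of Python. This is only a toy illustration, not Protegrity's algorithm: the function names (`tokenize_ssn`, `detokenize_ssn`), the `authorized` flag, and the per-position digit substitution tables derived from a secret are all assumptions made for the example. The tables stand in for the static lookup tables of a vaultless scheme, and a per-digit substitution like this is far too weak for real use.

```python
import hashlib
import hmac
import random


def _digit_tables(secret: bytes, positions: int) -> list:
    """Derive one static digit permutation per position from the secret (toy scheme)."""
    tables = []
    for pos in range(positions):
        digest = hmac.new(secret, f"pos-{pos}".encode(), hashlib.sha256).digest()
        digits = list("0123456789")
        # random.Random is NOT cryptographically secure; used only for illustration.
        random.Random(int.from_bytes(digest, "big")).shuffle(digits)
        tables.append(digits)
    return tables


def _split(ssn: str, keep_last: int):
    """Return (digits to protect, digits left in the clear)."""
    digits = [c for c in ssn if c.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a 9-digit SSN")
    cut = len(digits) - keep_last
    return digits[:cut], digits[cut:]


def _format(digits: list) -> str:
    s = "".join(digits)
    return f"{s[:3]}-{s[3:5]}-{s[5:]}"


def tokenize_ssn(ssn: str, secret: bytes, keep_last: int = 4) -> str:
    """Replace the leading digits with substitutes; keep the last digits usable."""
    head, tail = _split(ssn, keep_last)
    tables = _digit_tables(secret, len(head))
    return _format([tables[i][int(d)] for i, d in enumerate(head)] + tail)


def detokenize_ssn(token: str, secret: bytes, authorized: bool, keep_last: int = 4) -> str:
    """Recover the real SSN, but only for callers the policy authorizes."""
    if not authorized:
        raise PermissionError("caller is not authorized to see the full SSN")
    head, tail = _split(token, keep_last)
    tables = _digit_tables(secret, len(head))
    return _format([str(tables[i].index(d)) for i, d in enumerate(head)] + tail)
```

Most users would see only the token, users who need the last four digits can read them straight from it, and only the small authorized group would ever call `detokenize_ssn`.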

What to look for in a good Enterprise Solution
Critical core requirements:
• A single solution that works across all core platforms
• Scalable, centralized enterprise class solution
• Segregation of duties between DBA and Security Admin
• Good Encryption Key or Token Lookup Table management
• Data layer solution
• Tamper-proof audit trail
• Transparent (as possible) to authorized end-users
• High Availability (HA)
• Optional in-database versus ex-database encryption/tokenization

Other "nice to have" features
• Flexible protection options (Encrypt, Tokenize, DTP/FPE, Masking)
• Broadest possible support for a range of data types
• Built in DR, Dual Active, Key and system recovery capability
• Minimal performance impact to applications/end users
• Optimized operations to minimize CPU utilization
• Proven Implementation methodology
• PCI-DSS compliant solution (meeting all relevant requirements)
• Deep partnership with Teradata and other database providers
• Minimal impact on system upgrades
• Maintain consistent referential integrity and indexing capability
• Low Total Cost of Ownership (TCO)

What to look for in a good solution for Hadoop
• Coarse Grained and Fine Grained Protection Capability
  - HDFS File Encryption, Multi-Tenant File Encryption, HDFS FP (HDFS Codec)
  - Column/Field Level Fine Grained Protection
• Multi-Tenant Row Level Protection
  - Allow authorized users access to specific rows only
  - Unprotect columns for authorized users only
• Heterogeneous Protection Capabilities
  - Protect upstream sources of data and downstream targets of data
  - Vaultless Tokenization, often less intrusive than encryption, reversible protection
  - Reversible where masking is not
  - Deployed on the (Data) Nodes
  - Leverage MPP architecture of Hadoop
  - Avoid appliance-based solutions that can slow down Hadoop
• Tokenization capability for Hive access to HDFS Files/Tables
  - Hive does not support VarByte data type (Encryption = Binary Ciphertext)
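The row-level and column-level points above can be illustrated with a small Python sketch. Everything here is hypothetical: the `AccessPolicy` class, the `tenant` column, and the `unprotect` callback are assumptions for illustration and do not represent any product's API. The idea is only that rows outside a user's tenants are dropped, and protected columns are returned in the clear only for users the policy names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, List


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical per-user policy."""
    tenants: FrozenSet[str]        # rows the user may see
    clear_columns: FrozenSet[str]  # protected columns the user may see unprotected


def read_rows(
    rows: Iterable[Dict[str, str]],
    policy: AccessPolicy,
    protected_columns: FrozenSet[str],
    unprotect: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Apply row-level filtering, then column-level unprotection."""
    result = []
    for row in rows:
        if row["tenant"] not in policy.tenants:
            continue  # row-level protection: skip rows of other tenants
        visible = {}
        for column, value in row.items():
            if column in protected_columns and column in policy.clear_columns:
                visible[column] = unprotect(value)  # authorized: return clear text
            else:
                visible[column] = value  # protected columns stay tokenized
        result.append(visible)
    return result


# Sample data: stored values are already tokenized; the dict stands in for
# whatever detokenization service a real deployment would call.
ROWS = [
    {"tenant": "T1", "name": "Ann", "ssn": "TOK-1"},
    {"tenant": "T2", "name": "Bob", "ssn": "TOK-2"},
]
CLEAR = {"TOK-1": "111-11-1111", "TOK-2": "222-22-2222"}
PROTECTED = frozenset({"ssn"})

analyst = AccessPolicy(tenants=frozenset({"T1"}), clear_columns=frozenset())
auditor = AccessPolicy(tenants=frozenset({"T1", "T2"}), clear_columns=frozenset({"ssn"}))
```

With these sample policies, `read_rows(ROWS, analyst, PROTECTED, CLEAR.__getitem__)` yields only the T1 row with the SSN still tokenized, while the auditor sees both rows with clear SSNs.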

Hadoop security controls are playing catch-up
Traditional RDBMS (layers, outermost first):
• Firewalls, IDS/IPS
• Authentication (Kerberos)
• Authorization: RBAC, RLS, CLS
• Audit
• RDBMS: Encrypt, Tokenize
Hadoop (Fewer Layers):
• Firewalls, IDS/IPS
• Authentication (Kerberos)
• Future? (Accumulo, Knox)
• Hive, HDFS: Tokenize Only
Heavier reliance on Tokenization with Hadoop

Granularity of Protecting Sensitive Data
Coarse Grained Protection (File/Volume)
• Methods: File or Volume encryption (OS File System Encryption, HDFS Encryption)
• All or nothing approach
• Does NOT secure file contents in use
• Secures data at rest and in transit
Fine Grained Protection (Data/Field)
• Methods: Vaultless Tokenization, Masking, Encryption (Strong, Format Preserving)
• Operates at the individual field level
• Data is protected in use and wherever it goes
• Business logic can be retained

Data Security Platform
[Figure: the Enterprise Security Administrator distributes Policy and collects Audit Logs from each protection point: RDBMS, Applications, EDW, Big Data, IBM Mainframe Protector, Netezza, File Servers, File and Cloud Gateway Servers, Protection Servers]

Protegrity's Big Data Protector for Hadoop
[Figure: Hadoop Cluster, with each Hadoop Node running Hive, Pig, Other, MapReduce, YARN, HBase, HDFS and the OS File System, and exchanging Policy and Audit]
• Protegrity Big Data Protector for Hadoop delivers protection at every node and is delivered with our own cluster management capability.
• All nodes are managed by the Enterprise Security Administrator, which delivers policy and accepts audit logs.
• Protegrity Data Security Policy contains information about how data is de-identified and who is authorized to have access to that data.
• Policy is enforced at different levels of protection in Hadoop.

Rich Security Layer over the Hadoop Ecosystem
• Pig / Hive: UDF Support for Pig, UDF Support for Hive, Hive Tokenization
• MapReduce: Java API Support for MapReduce
• HBase: Coprocessor support via UDFs
• Cassandra: UDT
• HDFS: Encryption through the HDFS Codec, HDFS Commands Extended for Security Functions, HDFS Interface for Java Programs, De-identify before Ingestion into HDFS
• OS File System: Encryption; Folder/File or Volume

Coarse Grained Protection: File / Volume Encryption
[Figure: Hadoop stack (Pig / Hive, MapReduce, YARN, HBase, HDFS, File System)]
• Pig / Hive: all fields are in the clear
• MapReduce: all fields are in the clear
• HDFS: file with identifiable data elements; the entire file is encrypted
• File System: the volume encryption option will encrypt the entire volume versus the files themselves.

Coarse Grained with HDFS Staging Area
[Figure: data is ingested into an HDFS Staging Area and processed by MapReduce jobs; stack shown: Pig / Hive, MapReduce, YARN, HBase, HDFS, File System]

Coarse Grained Multi-Tenant Protection
[Figure: tenants T1, T2 and T3 ingest into HDFS; stack shown: Pig / Hive, MapReduce, YARN, HBase, HDFS, File System]
• T1 folder: Key 1
• T2 folder: Key 2
• T3 folder: Key 3
• clear folder
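The one-key-per-tenant-folder layout on this slide amounts to a lookup from file path to key. A minimal Python sketch of that lookup follows; the folder paths, key identifiers and the `key_id_for` function are hypothetical names chosen for the example, and the actual encryption step is deliberately left out.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping of tenant folders to encryption key identifiers.
# Anything outside these folders (for example a "clear" folder) has no key.
FOLDER_KEYS = {
    "/data/t1": "key-1",
    "/data/t2": "key-2",
    "/data/t3": "key-3",
}


def key_id_for(path: str) -> Optional[str]:
    """Return the key ID protecting `path`, or None if it is stored in the clear."""
    target = PurePosixPath(path)
    for folder, key_id in FOLDER_KEYS.items():
        root = PurePosixPath(folder)
        # Compare whole path components so /data/t10 does not match /data/t1.
        if target == root or root in target.parents:
            return key_id
    return None
```

Matching on path components rather than string prefixes avoids accidentally granting one tenant's key to a similarly named folder.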

Fine Grained Protection
Production Systems
Encryption
• Reversible
• Policy Control (Authorized / Unauthorized Access)
• Lacks Integration Transparency
• Not searchable or sortable
• Complex Key Management
• Example: !@#$%a^.,mhu7///&*B()_+!@
Vaultless Tokenization / Pseudonymization
• Reversible, with Policy Control (Authorized / Unauthorized Access), or Not Reversible
• No Complex Key Management
• In either case: Integrates Transparently; Searchable and sortable
• Business Intelligence: 0389 3778 3652 0038
Non-Production Systems
Masking
• Not reversible
• No Policy, Everyone Can Access the Data
• Integrates Transparently
• No Complex Key Management
• Example: Date of Birth 2/15/1967 masked as xx/xx/1967
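The masking example on this slide (2/15/1967 masked as xx/xx/1967) can be reproduced with a short Python sketch. The function name `mask_dob` and the assumption that dates arrive as M/D/YYYY strings are illustrative choices, not part of the original material.

```python
import re

# Assumes dates are written as M/D/YYYY or MM/DD/YYYY.
_DOB = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")


def mask_dob(dob: str) -> str:
    """Mask month and day but keep the year. This is one-way: the input cannot be recovered."""
    match = _DOB.fullmatch(dob.strip())
    if match is None:
        raise ValueError(f"unrecognized date format: {dob!r}")
    return f"xx/xx/{match.group(3)}"


print(mask_dob("2/15/1967"))  # xx/xx/1967
```

Because the month and day are discarded rather than transformed, no key or policy is needed, which matches the slide's point that masking suits non-production systems where reversal is never required.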

Enterprise-wide Protection
[Figure: data flow from Source Systems (Internal / External) through Hadoop to Target Systems (Internal / External) and Consumption BI Systems]
• Source side: Input File Sources, FPG, ETL, Database Server
• Edge Node: Sqoop, File Protector, Java Program, Application Protector
• Hadoop ecosystem components on each Node: Pig, Hive, MapReduce, YARN, HBase, HDFS, OS FS
• Downstream Systems: Database, Database Protector
• ESA: Policy Deployment, Audit Collection
If Edge Node is a Hadoop Node, Hadoop resources can be used

Traditional IT Environment: Protegrity Protection
[Figure: Typical Enterprise Today. Internet; Inside the Firewall: Apps, EDW, DBs, Files, Hadoop, Arch]

Today's IT Environment: Protegrity Protection
[Figure: Typical Enterprise Today. Internet: Apps, Cloud Protector Gateway, File Protector Gateway, Files; Inside the Firewall: Apps, EDW, DBs, Files, Hadoop, Arch, ESA, HG]

Summarize what to do
• Establish Good Data Governance
• Protect the actual data itself
• Maintain referential integrity
• De-Identify data while maintaining analytics capability
• Apply consistent protection throughout the data flow
• Engage Information Security, Legal and Compliance
Build security in rather than bolt it on later

Sign Up for a Free, Half-Day Risk Assessment Workshop
Protegrity is proud to offer free, half-day risk assessment workshops designed to help companies evaluate their security posture. This is a no-obligation offer. These workshops are a unique, low-cost opportunity to gain valuable insight into where you stand from a risk management perspective relative to your peers. For more information or to schedule a free half-day workshop, please email: info@protegrity.com

The End... Q & A

Convergence of Data Privacy Regulations
• Government and industry groups are regularly releasing new data privacy laws, requirements, recommendations
• Each leverages the best of previous privacy laws and discards what has proven not to work
• New regulations and standards are converging on a standard set of data privacy principles
• The International Security, Trust and Privacy Alliance (ISTPA) has published a comparison of leading privacy

Privacy Principles (1/2)
• Accountability requires that the entity define, document, communicate, and assign accountability for its privacy policies and procedures and be accountable for PII under its control.
• Notice requires that the entity provide notice about its privacy policies and procedures and identify the purpose for which personal information is collected, used, retained, and disclosed.
• Choice and Consent requires that the entity describe the choices available to the individual and obtain implicit or explicit consent with respect to the collection, use, and disclosure of personal information.
• Collection Limitation requires that the entity collect personal information only for the purposes identified in the notice.
• Use Limitation requires that the entity limit the use of personal information to the purpose identified in the notice and for which the individual has provided implicit or explicit consent.
Comparable lists from: International Security, Trust and Privacy Alliance (ISTPA), Association of Insurance Compliance Professionals (AICP)

Privacy Principles (2/2)
• Access requires that the entity provide individuals with access to their personal information for review and update.
• Disclosure requires that the entity disclose personal information to third parties only for the purposes identified in the notice and only with the implicit or explicit consent of the individual.
• Security requires that the entity protect personal information against unauthorized access or alteration (both physical & logical).
• Data Quality requires that the entity maintain accurate, complete, and relevant personal information for the purposes identified in the notice.
• Enforcement requires that the entity monitor compliance with its privacy policies and procedures and have procedures to address privacy-related inquiries and disputes.
These must be captured in business/technical requirements

Plethora of Global Privacy Regulations
Legislation and Regulations
• European Union: 95/46/EC Directive on Data Privacy
• Germany: Federal Data Protection Act
• Sweden: Personal Data Act
• United Kingdom: Data Protection Act
• Australia: Privacy Act
• Japan: Personal Information Protection Act
• United States: SOX, GLBA, HIPAA, COPPA, SB 1386