Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera


Version: 103

Table of Contents

Introduction
Importance of Security
Growing Pains
Security Requirements of the Enterprise
Security Capabilities and Hadoop
Guard the Perimeter: Authentication
Access Controls: Authorization
Gain Visibility: Data Governance for Hadoop
Data Protection
Compliance-Ready
Conclusion
About Cloudera

Abstract

Enterprises are transforming data management with unified systems that combine distributed storage and computation at limitless scale, for any amount of data and any type, and with powerful, flexible analytics, from batch processing to interactive SQL to full-text search. Yet to realize their full potential, these enterprise data hubs require authentication, authorization, audit, and data protection controls.

Cloudera makes the enterprise data hub a reality for information-driven business by delivering applications and frameworks for deploying, managing, and integrating the necessary data security controls demanded by today's regulatory environments and compliance policies. Cloudera's enterprise data hub provides comprehensive security and governance across the four pillars of security, including Cloudera Manager for centralized, automated authentication management; Apache Sentry for unified role-based authorization; Cloudera Navigator for comprehensive audit, lineage, and metadata management; and Navigator Encrypt and Navigator Key Trustee for transparent encryption and enterprise-grade key management.

Introduction

Information-driven enterprises recognize that the next generation of data management is a single, central system that stores all data in any form or amount and provides business users with a wide array of tools and applications for working with this data. This evolution in data management is the enterprise data hub, and Apache Hadoop is at its core. Yet as organizations increasingly store data in this system, a growing portion is business-sensitive and subject to regulations and governance controls. This information requires that Hadoop provide strong capabilities and measures to ensure security and compliance.
This paper aims to explore and summarize the common security approaches, tools, and practices found within Hadoop at large, and how Cloudera is uniquely positioned to provide, strengthen, and manage the data security required for an enterprise data hub.

Importance of Security

Data security is a top business and IT priority for most enterprises. Many organizations operate within a business environment awash in compliance regulations that dictate controls and protection of data. For example, retail merchants and banking service firms must adhere to the Payment Card Industry Data Security Standard (PCI DSS) to protect their clients and their transactions, services, and account information. Healthcare providers and insurers, as well as biotechnology and pharmaceutical firms involved in research, are subject to the Health Insurance Portability and Accountability Act (HIPAA) compliance standards for patient information. And US federal organizations and agencies must comply with the various information security mandates of the Federal Information Security Management Act of 2002 (FISMA).

Organizations also have business mandates and objectives that establish internal information security standards to guard their data. For instance, firms may endeavor to maintain strict client privacy as a market differentiator and business asset. Others may view data security as a wall protecting their hard-earned intellectual property or politically sensitive information. These internal policies can also be seen as insurance against negative public relations and press that may come as a result of a breach or accidental exposure.

Growing Pains

Historically, early Hadoop developers did not prioritize data security, as the use of Hadoop during this time was limited to small, internal audiences and designed for workloads and data sets that were not deemed overly sensitive and subject to regulation.
The resulting early controls were designed to address user error (to protect against accidental deletion, for example) and not to protect against misuse. For stronger protection, Hadoop relied on the surrounding security capabilities inherent to existing data management and networking infrastructure.

As Hadoop matured, the various projects within the platform, such as HDFS, MapReduce, Oozie, and Pig, addressed their own particular security needs. As is often the case with distributed, open source efforts, these projects established different methods for configuring the same types of security controls.

Early Hadoop developers did not prioritize data security, as Hadoop's initial focus was on workloads and data sets not considered sensitive and subject to regulation. Security controls addressed user error, e.g. accidental deletion, and not malicious use.

The data management environment and needs are vastly different today. The flexible, scalable, and cost-efficient storage capabilities of Hadoop now offer businesses the opportunity to acquire a greater range of data for analysis and the ability to retain this data for longer periods of time. This concentration of data assets makes Hadoop a natural target for those with malicious intent, but with the advances in data security and security management in Hadoop today, enterprises have robust and comprehensive capabilities to prevent intrusions and abuse.

The maturity of Hadoop today also means many data paths into and out of this information source. The breadth of ways in which a business can work with data in Hadoop, from interactive SQL and full-text search to advanced machine-learning analysis, offers organizations unprecedented technical capabilities and insights for their multiple business audiences and constituencies. Enterprise security standards demand examination of each pathway, and this poses challenges for IT teams: some channels and methods work with concepts such as fields, tables, and views, while others address the same data only as files and directories.

While all of these capabilities open up this data to more users with various frameworks to access and analyze the data, these users may also have varying degrees of permissions that must be enforced. Additionally, unified data storage means sensitive data is also being stored and must be properly protected, while still maintaining the flexibility for organizations to explore data more fully and realize the true value of their data.
The powerful and far-reaching capabilities inherent in Hadoop also pose substantial challenges to organizations. While Hadoop stands as the core of the platform for big data, employing Hadoop as the foundation of an enterprise data hub requires that the data and processes of this platform meet the security requirements of the enterprise. Without such assurances, organizations simply cannot realize the full potential of and capitalize on a production-ready enterprise data hub for data management.

[Figure: enterprise data hub architecture. Process, analyze, serve: batch (Spark, Hive, Pig, MapReduce), stream (Spark), SQL (Impala), search (Solr), SDK (partners). Unified services: resource management (YARN), security (Sentry, RecordService). Store: filesystem (HDFS), relational (Kudu), NoSQL (HBase). Integrate: batch (Sqoop), real-time (Kafka, Flume). Operations: Cloudera Manager, Cloudera Director. Data management: Cloudera Navigator, Encrypt and KeyTrustee, Optimizer.]

Employing Hadoop as the core of an enterprise data hub requires that it meet the security requirements of the enterprise. Without such assurances, organizations cannot realize the full potential of an enterprise data hub for data management.

Security Requirements of the Enterprise

How then can organizations provide these assurances for enterprise-grade data security? When evaluating security for Hadoop, it's important to look beyond just installing a firewall and evaluate the platform based on the four pillars of security: perimeter, access, visibility, and data protection.

Due to the sheer scale of Hadoop in terms of data, users, and access frameworks, it's critical that security controls are both managed centrally and automated across multiple systems and data sets, in order to provide consistency of configuration and application according to IT policies and to ease and streamline operations. This requirement is particularly critical for distributed systems where deployment and management events are common throughout the lifetime of the services and underlying infrastructure.

Lastly, enterprises require that systems and data integrate with existing IT tools such as Kerberos; directories, like Microsoft's Active Directory or the Lightweight Directory Access Protocol (LDAP); and logging and governance infrastructures, like security information and event management (SIEM) tools. Without the ability to leverage and integrate with existing security tools, a new system can be difficult to implement, and managing multiple tools becomes resource-intensive and repetitive.

Security Capabilities and Hadoop

Hadoop has matured dramatically in the past few years, and this maturity means it is being used in production to power mission-critical applications. That is why Cloudera has been focused on adding the necessary security and governance to both run Hadoop in production and realize the true value of data. Cloudera is the leader in comprehensive security and governance for Hadoop.
Across the four pillars of security, Cloudera's solution preserves the flexibility of Hadoop while providing the compliance-ready security required for the enterprise.

The remainder of this paper focuses on how Hadoop now provides enterprise-grade security and governance and how IT teams can support deployments in environments that face frequent regulatory compliance audits and governance mandates. To do so, it is helpful to explore in greater detail the technical underpinnings, design patterns, and practices manifested in Hadoop for the four common factors of enterprise security: Perimeter, Access, Visibility, and Data Protection.

Guard the Perimeter: Authentication

Authentication mitigates the risk of unauthorized usage of services, and the purpose of authentication in Hadoop, as in other systems, is simply to prove that a user or service is who they claim to be.

Typically, enterprises manage identities, profiles, and credentials through a single distributed system, such as a Lightweight Directory Access Protocol (LDAP) directory. LDAP authentication consists of straightforward username/password services backed by a variety of storage systems, ranging from file to database.

Another common enterprise-grade authentication system is Kerberos. Kerberos provides strong security benefits, including capabilities that render intercepted authentication packets unusable by an attacker. Kerberos virtually eliminates the threat of impersonation present in earlier versions of Hadoop and never sends a user's credentials in the clear over the network.

Both authentication schemes have been widely used for more than 20 years, and many enterprise applications and frameworks leverage them at their foundation. For example, Microsoft's Active Directory (AD) is an LDAP directory that also provides Kerberos for additional authentication security. Virtually all of the components of the Hadoop ecosystem are converging to use Kerberos authentication with the option to manage and store credentials in LDAP or AD.

Kerberos is a centralized network authentication system providing secure, single sign-on and data privacy for users and applications. Developed as a research project at the Massachusetts Institute of Technology (MIT) in the 1980s to tackle mutual authentication and encrypted communication across unsecured networks, Kerberos has become a critical part of any enterprise security plan. The Kerberos protocol offers multiple encryption options that satisfy virtually all current security standards, ranging from single DES and DIGEST-MD5 to RC4-HMAC and the latest NIST encryption standard, the Advanced Encryption Standard (AES). Learn more about Kerberos at

Patterns and Practices

Hadoop, given its distributed nature, has many touchpoints, services, and operational processes that require robust authentication in order for a cluster to operate securely.
Hadoop provides critical facilities for accomplishing this goal.

Perimeter Authentication

The first common approach concerns secured access to the Hadoop cluster itself: access that is external to or on the perimeter of Hadoop. As mentioned above, there are two principal forms of external authentication commonly available within Hadoop: AD/LDAP and Kerberos.

AD/LDAP and Kerberos are authentication staples within the enterprise. With AD/LDAP, you can manage users, groups, and services within the environment. Users provide username and password authentication when they log in, and groups control what services users have access to. For example, if there's a group called HR and there's a payroll system, users in group HR have access to payroll and users not in group HR don't.

Best practices suggest that client-to-Hadoop services use Kerberos for authentication, and, as mentioned earlier, virtually all ecosystem clients now support Kerberos. When Hadoop security is configured to utilize Kerberos, client applications authenticate to the edge of the cluster via Kerberos RPC, and web consoles utilize HTTP SPNEGO Kerberos authentication. Despite the different methods used by clients to handle Kerberos tickets, all leverage a common core Kerberos and associated central directory of users and groups, thus maintaining continuity of identification across all services of the cluster and beyond.

By leveraging existing standards for authentication, Cloudera not only integrates with AD and Kerberos, but automates much of the tedious setup process through Cloudera Manager. Active Directory has a built-in Kerberos server, and Cloudera Manager integrates directly with it, eliminating the need to set up a separate Kerberos server. This allows users to continue to authenticate directly against AD, allows Hadoop services to be defined directly in AD as part of the overall service management layer, and controls user access to Hadoop services via AD groups.
In addition, Cloudera Manager automates the process of Kerberizing the cluster. Through a wizard, authorized users configure Kerberos for individual servers (with recommended defaults) and trigger an automated workflow to secure their cluster. Cloudera Manager can also be used to tune the configurations between the Kerberos server for the cluster and the primary Kerberos service, as well as monitor services for them all. Finally, Cloudera Manager manages and deploys Kerberos client configs and also covers Hadoop SSL-related configs.

To further aid IT teams with access integration, Cloudera updated Hue and Cloudera Manager to offer additional Single Sign-On (SSO) options by employing Security Assertion Markup Language (SAML) for authentication.

"The Security Assertion Markup Language (SAML) is an XML-based framework for communicating user authentication, entitlement, and attribute information. By defining standardized mechanisms for the communication of security and identity information between business partners, SAML makes federated identity, and the cross-domain transactions that it enables, a reality." (Organization for the Advancement of Structured Information Standards, OASIS) Learn more about SAML at

Cluster and RPC Authentication

Virtually all intra-Hadoop services mutually authenticate each other using Kerberos RPC. These internal checks prevent rogue services from introducing themselves into cluster activities via impersonation and subsequently capturing potentially critical data as a member of a distributed job or service.

Distributed User Accounts

Many Hadoop projects operate on data sets within HDFS, and this activity requires that a given user's account exist on all processing and storage nodes (NameNode, DataNode, JobTracker/Application Master, and TaskTracker/NodeManager) within a cluster in order to validate access rights and provide resource segmentation. The host operating system provides the mechanics to ensure user account propagation - for example, Pluggable Authentication Modules (PAM) or System Security Services Daemon (SSSD) on Linux hosts - allowing user accounts to be centrally managed via LDAP or AD, which facilitates the required process isolation.
(See Authorization: Process Isolation.)

Access Controls: Authorization

Authorization is concerned with who or what has access or control over a given resource or service. Hadoop merges together the capabilities of multiple, varied, and previously separate IT systems, and then adds a wide variety of open source and commercial software tools into the mix. These different methods of accessing data deal with it at multiple levels of granularity. The challenge for Hadoop administrators is to provide a single set of authorization policies that can operate across all of this infrastructure.

For example, common data flows in both Hadoop and traditional data management environments require different authorization controls at each stage of processing. First, a commercial ETL tool may gather data from a variety of sources and formats, ingesting that data into a common repository. During this stage, an ETL administrator has full access to view all elements of input data in order to verify formats and configure data parsing. Then the processed data is exposed to business intelligence (BI) tools through a SQL interface to the common repository. End users of the BI tools have varying levels of access to the data according to their assigned roles. Finally, some users query this same data through a commercial GUI-based tool which runs MapReduce jobs. Apache Sentry enables policies to be created that are enforced across all of these access methods.

Patterns and Practices

Authorization for data access in Hadoop typically manifests in three forms: POSIX-style permissions on files and directories, Access Control Lists (ACL) for management of services and resources, and Role-Based Access Control (RBAC) for certain services with advanced access controls to data. Authorization across different components of Hadoop can increasingly be performed within a single tool - Apache Sentry.
For example, data can now be shared between Hive and other jobs such as MapReduce, Apache Spark, and Apache Pig that access the same data directly through HDFS, while only a single set of permissions is defined for that data.
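The first of the three forms above, POSIX-style permissions, can be illustrated with a minimal sketch. It models the general owner/group/other check only; the `Entry` class, the `is_allowed` helper, and the user and group names are hypothetical and are not part of any Hadoop API.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for a file's owner, group, and mode bits."""
    owner: str
    group: str
    mode: int  # e.g. 0o640 means rw- for owner, r-- for group, --- for others


def is_allowed(entry: Entry, user: str, groups: set, want: str) -> bool:
    """Return True if `user` may perform `want` ('r', 'w' or 'x') on `entry`.

    Exactly one permission class applies: owner first, then group, then other.
    """
    owner_bit, group_bit, other_bit = {
        "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
        "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
        "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
    }[want]
    if user == entry.owner:
        return bool(entry.mode & owner_bit)
    if entry.group in groups:
        return bool(entry.mode & group_bit)
    return bool(entry.mode & other_bit)


payroll = Entry(owner="etl", group="hr", mode=0o640)
print(is_allowed(payroll, "etl", {"etl"}, "w"))    # owner may write
print(is_allowed(payroll, "alice", {"hr"}, "r"))   # group member may read
print(is_allowed(payroll, "alice", {"hr"}, "w"))   # group member may not write
print(is_allowed(payroll, "bob", {"sales"}, "r"))  # everyone else has no access
```

Because the check only sees files and directories, it cannot express table-, column-, or row-level rules, which is the gap that role-based controls are meant to fill.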

Process Isolation

Hadoop, as a shared services environment, relies on the isolation of computing processes from one another within the cluster, thus providing strict assurance that each process and user has explicit authorization to a given set of resources. This is particularly important within MapReduce, as the Tasks of a given Job can execute Unix processes (i.e. MR Streaming), individual Java VMs, and even arbitrary code on a host server. In frameworks like MapReduce, the Task code is executed on the host server using the Job owner's UID, thus providing reliable process isolation and resource segmentation at the OS level of the Hadoop cluster.

Role-Based Access Control (RBAC) Across Hadoop

Central management of access policies is key, given the number of access paths and users, so that users can access the data they need to do their jobs. Cloudera 5.x delivers this unified authorization with Apache Sentry (incubating). Sentry is an open source project that provides unified authorization via fine-grained role-based access controls (RBAC) for Impala, Apache Hive, Apache Spark, MapReduce, Apache Pig, and Search. Apache Sentry has quickly emerged as the open standard for authorization in Hadoop, with a broad set of contributions for sustained engineering quality, multi-vendor support for portability, and third-party integrations for compatibility, all resulting in wide industry adoption across even the most regulated verticals.

Figure 3

Figure 3 provides an overview of how Sentry authorization works. Here we have a Sentry role (Fraud Analyst Role), and this role has permissions: for this example, read access to all transaction data, though permissions can range from access to an entire database down to a specific set of columns and a particular set of rows. As defined with the perimeter security, there's also a group in AD called Fraud Analysts that is connected to the Sentry Fraud Analyst Role.
Sam Smith, a member of this AD group, inherits this Sentry role and corresponding permissions.

Figure 4
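The user-to-group-to-role chain in this example can be sketched in a few lines. This is an illustrative model only, not Sentry's API or policy format: the `Privilege` class, the three lookup tables, the `can` helper, and every name other than the Fraud Analyst example from the text are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Privilege:
    action: str        # e.g. "SELECT"
    database: str
    table: str = "*"   # "*" stands for every table in the database


# Role -> privileges. The role mirrors the Fraud Analyst example in the text.
ROLE_PRIVILEGES = {
    "fraud_analyst_role": {Privilege("SELECT", "transactions")},
}
# Directory group -> roles.
GROUP_ROLES = {"Fraud Analysts": {"fraud_analyst_role"}}
# User -> directory groups. A real deployment would look this up in AD/LDAP.
USER_GROUPS = {"sam.smith": {"Fraud Analysts"}, "pat.jones": {"Marketing"}}


def can(user: str, action: str, database: str, table: str) -> bool:
    """Walk user -> groups -> roles -> privileges and look for a match."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            for priv in ROLE_PRIVILEGES.get(role, set()):
                if (priv.action == action
                        and priv.database == database
                        and priv.table in ("*", table)):
                    return True
    return False


print(can("sam.smith", "SELECT", "transactions", "card_payments"))  # via group
print(can("sam.smith", "INSERT", "transactions", "card_payments"))  # not granted
print(can("pat.jones", "SELECT", "transactions", "card_payments"))  # wrong group
```

Privileges are never attached to a user directly, so moving a user between directory groups is enough to change what that user can read.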

Sentry can be configured to use AD to determine a user's group assignment, so any changes to group assignment in AD are automatically picked up by Sentry, resulting in updated Sentry role assignments. This eliminates manual mirroring between the two and allows organizations to continue to leverage existing AD tools and skill sets.

For managing Sentry policies, Cloudera offers a UI in Hue. Users can select a TABLE or DATABASE, define roles and permissions, or create group associations - all within a single visual interface.

Figure 5

HDFS Permissions

Many services within the Hadoop ecosystem, from client applications like the CLI shell to tools written to use the Hadoop API, directly access data stored within HDFS. HDFS uses POSIX-style permissions and HDFS ACLs for directories and files; each directory and file can be assigned multiple users and groups. Each assignment has a basic set of permissions available; file permissions are simply read, write, and execute, and directories have an additional permission to determine access to child directories.

HDFS permissions and ACLs are sufficient when file-level granularity is good enough and when data doesn't need to be shared with SQL components like Hive and Impala. Sentry integration with HDFS enables customers to easily share data between Hive, Impala, and all the other Hadoop components that interact with HDFS (MapReduce, Spark, Pig, Sqoop, and so on) while ensuring that user access permissions only need to be set once, and that they are uniformly enforced. For example, users that have SELECT access to a Hive table will automatically have read access to all of the table's underlying files.

Access Control Lists

Hadoop also maintains general access controls for the services themselves in addition to the data within each service and in HDFS. Service access control lists (ACL) range from NameNode access to client-to-DataNode communication.
In the context of MapReduce and YARN, user and group identifiers form the basis for determining permission for job submission or modification.

Apache HBase also uses ACLs for data-level authorization. HBase ACLs authorize various operations (READ, WRITE, CREATE, ADMIN) by column, column family, column family qualifier, and down to cell level. HBase ACLs can be granted to and revoked from both users and groups.
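A scoped ACL of this kind can be modeled as a lookup from the broadest scope to the narrowest. The sketch below is a simplified illustration, not HBase's implementation: it assumes that a grant at a broader scope also covers everything beneath it, it stops at the qualifier level and omits cell-level ACLs, and the `GRANTS` table, the `allowed` helper, and the "@" prefix used here to mark group principals are choices made for the example.

```python
# Grants keyed by (principal, table, family, qualifier).
# None means "everything at this level". All names are hypothetical.
GRANTS = {
    ("alice", "customers", None, None): {"READ"},
    ("@analysts", "customers", "profile", None): {"READ"},
    ("bob", "customers", "profile", "email"): {"READ", "WRITE"},
}


def allowed(user, groups, action, table, family, qualifier):
    """Return True if any grant covering the requested column permits `action`.

    A grant is checked at table, column-family, and qualifier scope, for the
    user and for each of the user's groups.
    """
    principals = [user] + ["@" + g for g in groups]
    scopes = [
        (table, None, None),
        (table, family, None),
        (table, family, qualifier),
    ]
    return any(
        action in GRANTS.get((principal, *scope), set())
        for principal in principals
        for scope in scopes
    )


# Table-level grant covers every column.
print(allowed("alice", [], "READ", "customers", "orders", "total"))
# Group grant on one column family.
print(allowed("carol", ["analysts"], "READ", "customers", "profile", "name"))
# The same group has no grant on another family.
print(allowed("carol", ["analysts"], "READ", "customers", "orders", "total"))
# Qualifier-level grant does not extend to sibling columns.
print(allowed("bob", [], "WRITE", "customers", "profile", "name"))
```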

Gain Visibility: Data Governance for Hadoop

For the data in the cluster, it is critical to understand where the data is coming from and how it's being used. This is the classic question of audit. The goal of auditing is to capture a complete and immutable record of all activity within a system. Auditing plays a central role in three key activities within the enterprise.

First, auditing is part of a system's security regime and can explain what happened, when, and to whom or what in case of a breach or other malicious intent. For example, if a rogue administrator deletes a user's data set, auditing provides the details of this action, and the correct data may be retrieved from backup.

The second activity is compliance, and auditing participates in satisfying the core requirements of regulations associated with sensitive or personally identifiable data (PII), such as the Health Insurance Portability and Accountability Act (HIPAA) or the Payment Card Industry (PCI) Data Security Standard. Auditing provides the touchpoints necessary to construct the trail of who, how, when, and how often data is produced, viewed, and manipulated.

Lastly, auditing provides the historical data and context for data forensics. Audit information leads to the understanding of how various populations use different data sets and can help establish the access patterns of these data sets. This examination, such as trend analysis, is broader in scope than compliance and can assist content and system owners in their data optimization and ROI efforts.

The challenge for auditing is the reliable, timely, and tamper-proof capture of all activity, including administrative actions. Until recently, the native Hadoop ecosystem has relied primarily on log files. Log files are unacceptable for most audit use cases in the enterprise, as real-time monitoring is impossible and log mechanics can be unreliable - a system crash before or during a write commit can compromise integrity and lose data.
Other alternatives have included application-specific databases and other single-purpose data stores, and yet these approaches fail to capture all activity across the entire cluster.

Cloudera Navigator is the only native end-to-end governance solution for Apache Hadoop. It provides administrators, analysts, and data scientists alike with unified visibility into the vast, diverse data being stored in Hadoop.

Patterns and Practices

Cloudera Navigator is a core part of Cloudera's platform and is critical for meeting compliance requirements. Key features include:

Comprehensive Auditing

Cloudera Navigator maintains a full audit history for HDFS, Impala, Hive, HBase, and Sentry, all in a single place. It tracks every access attempt, right down to the user ID, IP address, and full query text. This provides visibility into who has been accessing what data, the ability to see point-in-time permissions and how they have changed (leveraging Sentry), and a way to review and verify HDFS permissions. Cloudera Navigator also provides out-of-the-box integration with leading enterprise metadata, lineage, and SIEM applications.
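To make the idea of a unified audit trail concrete, the sketch below models audit events from several services in one structure and answers two typical questions against it. The record layout, field names, and sample events are assumptions for illustration; they are not Cloudera Navigator's schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One access attempt, successful or not (hypothetical record layout)."""
    timestamp: datetime
    user: str
    ip_address: str
    service: str
    operation: str
    resource: str
    allowed: bool
    query_text: str = ""


EVENTS = [
    AuditEvent(datetime(2016, 1, 5, 9, 0, tzinfo=timezone.utc), "sam.smith",
               "10.0.0.12", "impala", "QUERY", "transactions.card_payments",
               True, "SELECT * FROM card_payments LIMIT 10"),
    AuditEvent(datetime(2016, 1, 5, 9, 5, tzinfo=timezone.utc), "pat.jones",
               "10.0.0.40", "hive", "QUERY", "transactions.card_payments",
               False, "SELECT card_number FROM card_payments"),
    AuditEvent(datetime(2016, 1, 5, 9, 7, tzinfo=timezone.utc), "etl",
               "10.0.0.7", "hdfs", "DELETE", "/data/staging/part-0001", True),
]


def who_attempted(events, resource):
    """Users who tried to touch `resource`, whether or not they succeeded."""
    return sorted({e.user for e in events if e.resource == resource})


def denied(events):
    """Access attempts that were rejected, in the order they were recorded."""
    return [e for e in events if not e.allowed]


print(who_attempted(EVENTS, "transactions.card_payments"))
for event in denied(EVENTS):
    print(event.timestamp.isoformat(), event.user, event.ip_address, event.query_text)
```

Keeping events from every service in one shape is what allows a single query to span HDFS, Hive, and Impala activity.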

Figure 6

Metadata Management

For metadata and discovery, Cloudera Navigator features complete metadata storage. First, it consolidates the technical metadata for all data inside Hadoop into a single, searchable interface and allows for automatic tagging of data based on the external sources entering the cluster. For example, if there is an external ETL process, data can be automatically tagged as such when it enters Hadoop. Second, it supports user-based tagging to augment files, tables, and individual columns with custom business context, tags, and key/value pairs. Combined, this allows data to be easily discovered, classified, and located to support not only governance and compliance, but also user discovery within Hadoop.

Cloudera Navigator also includes metadata policy management that can trigger actions (such as the auto-classification of metadata) for specific datasets based on arrival or scheduled intervals. This allows users to easily set, monitor, and enforce data management policies, while also integrating with common third-party tools.

Lineage

Cloudera Navigator provides automatic collection and easy visualization of upstream and downstream data lineage to verify reliability. For each data source, it shows, down to the column level within that data source, what the precise upstream data sources were, the transforms performed to produce it, and the impact that data has on downstream artifacts.

Figure 7
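Upstream and downstream lineage can be thought of as reachability in a directed graph whose nodes are columns and whose edges read "feeds". The sketch below illustrates that idea only; the column names, the `build` and `reachable` helpers, and the graph representation are hypothetical and say nothing about how Cloudera Navigator stores lineage.

```python
from collections import defaultdict, deque

# Each pair reads "source column feeds target column". Names are hypothetical.
EDGES = [
    ("raw.orders.amount", "staging.orders.amount_usd"),
    ("raw.fx_rates.rate", "staging.orders.amount_usd"),
    ("staging.orders.amount_usd", "reports.revenue.total"),
]


def build(edges):
    """Return (downstream, upstream) adjacency maps for the edge list."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream


def reachable(graph, start):
    """Breadth-first search: every node reachable from `start`, excluding it."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


DOWNSTREAM, UPSTREAM = build(EDGES)

# Where did this report column come from?
print(sorted(reachable(UPSTREAM, "reports.revenue.total")))
# What is affected if the exchange-rate feed is wrong?
print(sorted(reachable(DOWNSTREAM, "raw.fx_rates.rate")))
```

The same traversal answers both questions; only the direction of the edges changes.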

Data Protection

The goal of data protection is to ensure that only authorized users can view, use, and contribute to a data set. These security controls not only add another layer of protection against potential threats by end-users, but also thwart attacks from administrators and malicious actors on the network or in the data center. The means to achieving this goal consists of two elements: protecting data when it is persisted to disk or other storage mediums, commonly called data-at-rest, and protecting data while it moves from one process or system to another, or data in transit.

Several common compliance regulations call for data protection in addition to the other security controls discussed in this paper. Cloudera provides security for data in transit through TLS and other mechanisms, centrally deployed through Cloudera Manager. Transparent data-at-rest protection is provided through the combination of HDFS encryption, Navigator Encrypt, and Navigator Key Trustee.

Patterns and Practices

Hadoop, as a central data hub for many kinds of data within an organization, naturally has different needs for data protection depending on the audience and processes while using a given data set. Only Cloudera provides enterprise-grade encryption and key management. CDH provides transparent encryption, ensuring that all sensitive data is encrypted before being stored on disk. This capability, together with enterprise-grade encryption key management with Navigator Key Trustee, delivers the necessary protection to meet regulatory compliance for most enterprises.

HDFS Encryption together with Navigator Encrypt (available with Cloudera Enterprise) provides transparent encryption for Hadoop, for both data and metadata. These solutions automatically encrypt data while the cluster continues to run as usual, with a very low performance impact.
The encryption is massively scalable, running in parallel across all the data nodes - as the cluster grows, encryption grows with it.

Additionally, Cloudera's transparent encryption is optimized for the Intel chipset for high performance. Intel chipsets include AES-NI co-processors, which provide special capabilities that make encryption workloads run extremely fast. Not all AES-NI is equal, and Cloudera is able to take advantage of the latest Intel advances for even faster performance. Additionally, HDFS Encryption and Navigator Encrypt feature separation of duties, preventing even IT administrators and root users from accessing data that they are not authorized to see.

Figure 8 (components shown: Manager, Navigator, Impala, Hive, Sentry, HDFS, HBase, log/config/spill files, metadata store, HDFS Encryption, Navigator Encrypt, Navigator Key Trustee, HSM)

Occasionally, if an organization's security mandates go beyond compliance, it may choose to implement controls in addition to encryption, such as static data masking and tokenization. Cloudera's partners provide these two options. The chart below compares these different approaches, showing how transparent encryption is used as the baseline protection for compliance, while the other options may be added for particular use cases.

Tokenization
  Data types: credit card number, SSN
  Usage: PCI DSS scope reduction
  Implementation: intercept data flows at applications; make code changes to call tokenization APIs

Static Data Masking
  Data types: credit card number, SSN, name, address
  Usage: generate test data; prepare data sets being sent to third parties
  Implementation: identify location of all fields to be masked; data sits in the clear until the masking batch job runs

Transparent Encryption
  Data types: data and metadata; all data types; all data formats (structured, unstructured, discovered, undiscovered)
  Usage: ensure no sensitive data ever exists on disk; meet compliance requirements
  Implementation: no integration needed; enable through Cloudera Manager

Key Management

Encryption requires encryption keys. Key generation, storage, and access all need to be carefully managed to avoid a security breach or data loss. For example, keys must never be stored with the data, so that encrypted data remains useless if stolen.

Cloudera Navigator Key Trustee provides enterprise-grade key management for securing encryption keys and other Hadoop security artifacts. Navigator Key Trustee is a software-based key management system that provides key management for Navigator Encrypt. In order for Hadoop to meet compliance regulations, the encryption keys need to be stored on a separate, dedicated machine, out of the cluster and away from the data, which is what Navigator Key Trustee provides.
It also integrates with leading Hardware Security Modules (HSMs), so Navigator Key Trustee can use the HSMs as the root of trust for all cryptography within the environment. This also means all key management policies built around the HSM will continue to be executed within the HSM. While the primary use case for Navigator Key Trustee is to store encryption keys, it is essentially a centralized, virtual safe deposit box that can store any sensitive security-related artifact within the Hadoop architecture.
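The difference between tokenization and static data masking, compared in the chart above, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any Cloudera partner product works: `TokenVault` and `mask_card_number` are hypothetical names, and the vault is an in-memory dictionary standing in for a hardened, separately hosted token store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault; a real one would be a separate, secured service."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Replace a sensitive value with a random token that reveals nothing about it."""
        if value in self._value_to_token:
            return self._value_to_token[value]  # same input, same token
        token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Reversible only for callers with access to the vault."""
        return self._token_to_value[token]


def mask_card_number(card_number: str, visible: int = 4) -> str:
    """Static masking: irreversibly hide all but the last `visible` digits."""
    digits = [c for c in card_number if c.isdigit()]
    hidden = max(len(digits) - visible, 0)
    return "X" * hidden + "".join(digits[hidden:])


vault = TokenVault()
card = "4111 1111 1111 1111"
token = vault.tokenize(card)
print(token)                   # random token, safe to store in the cluster
print(vault.detokenize(token)) # original value, recoverable via the vault
print(mask_card_number(card))  # masked copy, original not recoverable
```

The sketch reflects the trade-off in the chart: tokenization needs application code to call the vault, while masking is a one-way batch transformation suited to test data or data sent to third parties.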

Compliance-Ready

For many industries, storing sensitive data also means complying with regulatory requirements, such as PCI for storing credit card data and HIPAA for storing patient health records. These regulations mean not only that the system storing this data must achieve compliance, but also that any system that integrates with it must as well. Cloudera is the first and only Hadoop platform to achieve PCI compliance. Leveraging the four pillars of security, Cloudera's platform is well-equipped to pass technology reviews for most common compliance regulations, and Cloudera has experts on staff to help users through the rest of the process.

Conclusion

Next-generation data management, in the form of the enterprise data hub, requires authentication, authorization, governance, and data protection controls in order to establish a place to store and operate on all data within the enterprise, from batch processing to interactive SQL to advanced analytics. Hadoop is at its foundation, yet the core elements of Hadoop are only part of a complete enterprise solution. As more data, and a greater variety of data, moves to the enterprise data hub, it becomes more likely that this data will be subject to compliance and security mandates. Comprehensive and integrated security within Hadoop, then, is a keystone to establishing the new center of data management. Cloudera's enterprise data hub has built-in security that's comprehensive, transparent, and compliance-ready. Across the four pillars, Cloudera offers a set of security and governance capabilities that's unmatched within the Hadoop environment and ecosystem.

About Cloudera

Cloudera delivers the modern platform for data management and analytics. The world's leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest, and most secure data platform built on Apache Hadoop.
Our customers can efficiently capture, store, process, and analyze vast amounts of data, empowering them to use advanced analytics to drive business decisions quickly, flexibly, and at lower cost than has been possible before. To ensure our customers are successful, we offer comprehensive support, training, and professional services. Learn more at cloudera.com.

Cloudera, Inc., Page Mill Road, Palo Alto, CA 94304, USA

© 2015 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera

Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Securing Your Enterprise Hadoop Ecosystem Comprehensive Security for the Enterprise with Cloudera Version: 102 Table of Contents Introduction 3 Importance of Security 3 Growing Pains 3 Security Requirements

More information

SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM

SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM WHITE PAPER SECURING YOUR ENTERPRISE HADOOP ECOSYSTEM Realizing Data Security for the Enterprise with Cloudera Securing Your Enterprise Hadoop Ecosystem CLOUDERA WHITE PAPER 2 Table of Contents Introduction

More information

Data Security For Government Agencies

Data Security For Government Agencies Data Security For Government Agencies Version: Q115-101 Table of Contents Abstract Agencies are transforming data management with unified systems that combine distributed storage and computation at limitless

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

INDUSTRY BRIEF DATA CONSOLIDATION AND MULTI-TENANCY IN FINANCIAL SERVICES

INDUSTRY BRIEF DATA CONSOLIDATION AND MULTI-TENANCY IN FINANCIAL SERVICES INDUSTRY BRIEF DATA CONSOLIDATION AND MULTI-TENANCY IN FINANCIAL SERVICES Data Consolidation and Multi-Tenancy in Financial Services CLOUDERA INDUSTRY BRIEF 2 Table of Contents Introduction 3 Security

More information

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com

Apache Sentry. Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Apache Sentry Prasad Mujumdar prasadm@apache.org prasadm@cloudera.com Agenda Various aspects of data security Apache Sentry for authorization Key concepts of Apache Sentry Sentry features Sentry architecture

More information

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect Fighting Cyber Fraud with Hadoop Niel Dunnage Senior Solutions Architect 1 Summary Big Data is an increasingly powerful enterprise asset and this talk will explore the relationship between big data and

More information

Big Data Management and Security

Big Data Management and Security Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

More information

Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014

Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014 1 Secure Your Hadoop Cluster With Apache Sentry (Incubating) Xuefu Zhang Software Engineer, Cloudera April 07, 2014 2 Outline Introduction Hadoop security primer Authentication Authorization Data Protection

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

MULTITENANCY AND THE ENTERPRISE DATA HUB:

MULTITENANCY AND THE ENTERPRISE DATA HUB: MULTITENANCY AND THE ENTERPRISE DATA HUB: Version: Q414-105 Table of Content Introduction 3 Business Objectives for Multitenant Environments 3 Standard Isolation Models of an EDH 4 Elements of a Multitenant

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

More Data in Less Time

More Data in Less Time More Data in Less Time Leveraging Cloudera CDH as an Operational Data Store Daniel Tydecks, Systems Engineering DACH & CE Goals of an Operational Data Store Load Data Sources Traditional Architecture Operational

More information

Ensure PCI DSS compliance for your Hadoop environment. A Hortonworks White Paper October 2015

Ensure PCI DSS compliance for your Hadoop environment. A Hortonworks White Paper October 2015 Ensure PCI DSS compliance for your Hadoop environment A Hortonworks White Paper October 2015 2 Contents Overview Why PCI matters to your business Building support for PCI compliance into your Hadoop environment

More information

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems Proactively address regulatory compliance requirements and protect sensitive data in real time Highlights Monitor and audit data activity

More information

Securing your Big Data Environment

Securing your Big Data Environment Securing your Big Data Environment Ajit Gaddam @ajitgaddam Securing Your Big Data Environment Black Hat USA 2015 Page # 1 @VISA Chief Security Architect Before senior tech roles at diff tech & FI companies

More information

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments

Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Cloudera Enterprise Reference Architecture for Google Cloud Platform Deployments Important Notice 2010-2016 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, Impala, and

More information

IBM Software InfoSphere Guardium. Planning a data security and auditing deployment for Hadoop

IBM Software InfoSphere Guardium. Planning a data security and auditing deployment for Hadoop Planning a data security and auditing deployment for Hadoop 2 1 2 3 4 5 6 Introduction Architecture Plan Implement Operationalize Conclusion Key requirements for detecting data breaches and addressing

More information

Deploying an Operational Data Store Designed for Big Data

Deploying an Operational Data Store Designed for Big Data Deploying an Operational Data Store Designed for Big Data A fast, secure, and scalable data staging environment with no data volume or variety constraints Sponsored by: Version: 102 Table of Contents Introduction

More information

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

How to avoid building a data swamp

How to avoid building a data swamp How to avoid building a data swamp Case studies in Hadoop data management and governance Mark Donsky, Product Management, Cloudera Naren Korenu, Engineering, Cloudera 1 Abstract DELETE How can you make

More information

White paper. The Big Data Security Gap: Protecting the Hadoop Cluster

White paper. The Big Data Security Gap: Protecting the Hadoop Cluster The Big Data Security Gap: Protecting the Hadoop Cluster Introduction While the open source framework has enabled the footprint of Hadoop to logically expand, enterprise organizations face deployment and

More information

Driving Growth in Insurance With a Big Data Architecture

Driving Growth in Insurance With a Big Data Architecture Driving Growth in Insurance With a Big Data Architecture The SAS and Cloudera Advantage Version: 103 Table of Contents Overview 3 Current Data Challenges for Insurers 3 Unlocking the Power of Big Data

More information

Important Notice. (c) 2010-2015 Cloudera, Inc. All rights reserved.

Important Notice. (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera Security Important Notice (c) 2010-2015 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service names or slogans contained in this document

More information

Complying with Payment Card Industry (PCI-DSS) Requirements with DataStax and Vormetric

Complying with Payment Card Industry (PCI-DSS) Requirements with DataStax and Vormetric Complying with Payment Card Industry (PCI-DSS) Requirements with DataStax and Vormetric Table of Contents Table of Contents... 2 Overview... 3 PIN Transaction Security Requirements... 3 Payment Application

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

Fast, Low-Overhead Encryption for Apache Hadoop*

Fast, Low-Overhead Encryption for Apache Hadoop* Fast, Low-Overhead Encryption for Apache Hadoop* Solution Brief Intel Xeon Processors Intel Advanced Encryption Standard New Instructions (Intel AES-NI) The Intel Distribution for Apache Hadoop* software

More information

Cloudera Enterprise Data Hub in Telecom:

Cloudera Enterprise Data Hub in Telecom: Cloudera Enterprise Data Hub in Telecom: Three Customer Case Studies Version: 103 Table of Contents Introduction 3 Cloudera Enterprise Data Hub for Telcos 4 Cloudera Enterprise Data Hub in Telecom: Customer

More information

Datameer Big Data Governance

Datameer Big Data Governance TECHNICAL BRIEF Datameer Big Data Governance Bringing open-architected and forward-compatible governance controls to Hadoop analytics As big data moves toward greater mainstream adoption, its compliance

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

Encryption and Anonymization in Hadoop

Encryption and Anonymization in Hadoop Encryption and Anonymization in Hadoop Current and Future needs Sept-28-2015 Page 1 ApacheCon, Budapest Agenda Need for data protection Encryption and Anonymization Current State of Encryption in Hadoop

More information

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera

SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP. Eva Andreasson Cloudera SOLVING REAL AND BIG (DATA) PROBLEMS USING HADOOP Eva Andreasson Cloudera Most FAQ: Super-Quick Overview! The Apache Hadoop Ecosystem a Zoo! Oozie ZooKeeper Hue Impala Solr Hive Pig Mahout HBase MapReduce

More information

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY

INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK BIG DATA HOLDS BIG PROMISE FOR SECURITY NEHA S. PAWAR, PROF. S. P. AKARTE Computer

More information

The Future of Data Management with Hadoop and the Enterprise Data Hub

The Future of Data Management with Hadoop and the Enterprise Data Hub The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah Cofounder & CTO, Cloudera, Inc. Twitter: @awadallah 1 2 Cloudera Snapshot Founded 2008, by former employees of Employees

More information

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data

Introduction to Hadoop HDFS and Ecosystems. Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Introduction to Hadoop HDFS and Ecosystems ANSHUL MITTAL Slides credits: Cloudera Academic Partners Program & Prof. De Liu, MSBA 6330 Harvesting Big Data Topics The goal of this presentation is to give

More information

Data Security in Hadoop

Data Security in Hadoop Data Security in Hadoop Eric Mizell Director, Solution Engineering Page 1 What is Data Security? Data Security for Hadoop allows you to administer a singular policy for authentication of users, authorize

More information

How to Hadoop Without the Worry: Protecting Big Data at Scale

How to Hadoop Without the Worry: Protecting Big Data at Scale How to Hadoop Without the Worry: Protecting Big Data at Scale SESSION ID: CDS-W06 Davi Ottenheimer Senior Director of Trust EMC Corporation @daviottenheimer Big Data Trust. Redefined Transparency Relevance

More information

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013

Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software. SC13, November, 2013 Intel HPC Distribution for Apache Hadoop* Software including Intel Enterprise Edition for Lustre* Software SC13, November, 2013 Agenda Abstract Opportunity: HPC Adoption of Big Data Analytics on Apache

More information

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera

Accelerating Enterprise Big Data Success. Tim Stevens, VP of Business and Corporate Development Cloudera Accelerating Enterprise Big Data Success Tim Stevens, VP of Business and Corporate Development Cloudera 1 Big Opportunity: Extract value from data Revenue Growth x = 50 Billion 35 ZB Cost Savings Margin

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Understanding Enterprise Cloud Governance

Understanding Enterprise Cloud Governance Understanding Enterprise Cloud Governance Maintaining control while delivering the agility of cloud computing Most large enterprises have a hybrid or multi-cloud environment comprised of a combination

More information

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities

More information

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015

Data Governance in the Hadoop Data Lake. Kiran Kamreddy May 2015 Data Governance in the Hadoop Data Lake Kiran Kamreddy May 2015 One Data Lake: Many Definitions A centralized repository of raw data into which many data-producing streams flow and from which downstream

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

Upcoming Announcements

Upcoming Announcements Enterprise Hadoop Enterprise Hadoop Jeff Markham Technical Director, APAC jmarkham@hortonworks.com Page 1 Upcoming Announcements April 2 Hortonworks Platform 2.1 A continued focus on innovation within

More information

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

Data Governance in the Hadoop Data Lake. Michael Lang May 2015 Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

How To Achieve Pca Compliance With Redhat Enterprise Linux

How To Achieve Pca Compliance With Redhat Enterprise Linux Achieving PCI Compliance with Red Hat Enterprise Linux June 2009 CONTENTS EXECUTIVE SUMMARY...2 OVERVIEW OF PCI...3 1.1. What is PCI DSS?... 3 1.2. Who is impacted by PCI?... 3 1.3. Requirements for achieving

More information

CDH AND BUSINESS CONTINUITY:

CDH AND BUSINESS CONTINUITY: WHITE PAPER CDH AND BUSINESS CONTINUITY: An overview of the availability, data protection and disaster recovery features in Hadoop Abstract Using the sophisticated built-in capabilities of CDH for tunable

More information

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect

Fighting Cyber Fraud with Hadoop. Niel Dunnage Senior Solutions Architect Fighting Cyber Fraud with Hadoop Niel Dunnage Senior Solutions Architect 1 Summary Big Data is an increasingly powerful enterprise asset with many potential user cases in this case we ll explore the relationship

More information

Big Data Security. Kevvie Fowler. kpmg.ca

Big Data Security. Kevvie Fowler. kpmg.ca Big Data Security Kevvie Fowler kpmg.ca About myself Kevvie Fowler, CISSP, GCFA Partner, Advisory Services KPMG Canada Industry contributions Big data security definitions Definitions Big data Datasets

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Compliance for the Road Ahead

Compliance for the Road Ahead THE DATA PROTECTION COMPANY CENTRAL CONTROL A NTROL RBAC UNIVERSAL DATA PROTECTION POLICY ENTERPRISE KEY DIAGRAM MANAGEMENT SECURE KEY STORAGE ENCRYPTION SERVICES LOGGING AUDITING Compliance for the Road

More information

SafeNet DataSecure vs. Native Oracle Encryption

SafeNet DataSecure vs. Native Oracle Encryption SafeNet vs. Native Encryption Executive Summary Given the vital records databases hold, these systems often represent one of the most critical areas of exposure for an enterprise. Consequently, as enterprises

More information

Cloudera Manager Training: Hands-On Exercises

Cloudera Manager Training: Hands-On Exercises 201408 Cloudera Manager Training: Hands-On Exercises General Notes... 2 In- Class Preparation: Accessing Your Cluster... 3 Self- Study Preparation: Creating Your Cluster... 4 Hands- On Exercise: Working

More information

APPLICATION COMPLIANCE AUDIT & ENFORCEMENT

APPLICATION COMPLIANCE AUDIT & ENFORCEMENT TELERAN SOLUTION BRIEF Building Better Intelligence APPLICATION COMPLIANCE AUDIT & ENFORCEMENT For Exadata and Oracle 11g Data Warehouse Environments BUILDING BETTER INTELLIGENCE WITH BI/DW COMPLIANCE

More information

Are You Big Data Ready?

Are You Big Data Ready? ACS 2015 Annual Canberra Conference Are You Big Data Ready? Vladimir Videnovic Business Solutions Director Oracle Big Data and Analytics Introduction Introduction What is Big Data? If you can't explain

More information

Cloudera in the Public Cloud

Cloudera in the Public Cloud Cloudera in the Public Cloud Deployment Options for the Enterprise Data Hub Version: Q414-102 Table of Contents Executive Summary 3 The Case for Public Cloud 5 Public Cloud vs On-Premise 6 Public Cloud

More information

ProtectV. Securing Sensitive Data in Virtual and Cloud Environments. Executive Summary

ProtectV. Securing Sensitive Data in Virtual and Cloud Environments. Executive Summary VISIBILITY DATA GOVERNANCE SYSTEM OS PARTITION UNIFIED MANAGEMENT CENTRAL AUDIT POINT ACCESS MONITORING ENCRYPTION STORAGE VOLUME POLICY ENFORCEMENT ProtectV SECURITY SNAPSHOT (backup) DATA PROTECTION

More information

Key Steps to Meeting PCI DSS 2.0 Requirements Using Sensitive Data Discovery and Masking

Key Steps to Meeting PCI DSS 2.0 Requirements Using Sensitive Data Discovery and Masking Key Steps to Meeting PCI DSS 2.0 Requirements Using Sensitive Data Discovery and Masking SUMMARY The Payment Card Industry Data Security Standard (PCI DSS) defines 12 high-level security requirements directed

More information

IBM Software Top tips for securing big data environments

IBM Software Top tips for securing big data environments IBM Software Top tips for securing big data environments Why big data doesn t have to mean big security challenges 2 Top Comprehensive tips for securing data big protection data environments for physical,

More information

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 1 Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 2 RainStor: a SQL Database on Hadoop SCALE (MPP, Shared everything) LOAD

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

Qlik Sense Enabling the New Enterprise

Qlik Sense Enabling the New Enterprise Technical Brief Qlik Sense Enabling the New Enterprise Generations of Business Intelligence The evolution of the BI market can be described as a series of disruptions. Each change occurred when a technology

More information

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data Fundamentals Ed 1 NEW Oracle University Contact Us: +90 212 329 6779 Oracle Big Data Fundamentals Ed 1 NEW Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big

More information

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook

Hadoop Ecosystem Overview. CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Hadoop Ecosystem Overview CMSC 491 Hadoop-Based Distributed Computing Spring 2015 Adam Shook Agenda Introduce Hadoop projects to prepare you for your group work Intimate detail will be provided in future

More information

PCI COMPLIANCE ON AWS: HOW TREND MICRO CAN HELP

PCI COMPLIANCE ON AWS: HOW TREND MICRO CAN HELP solution brief PCI COMPLIANCE ON AWS: HOW TREND MICRO CAN HELP AWS AND PCI DSS COMPLIANCE To ensure an end-to-end secure computing environment, Amazon Web Services (AWS) employs a shared security responsibility

More information

Cloudera Backup and Disaster Recovery

Cloudera Backup and Disaster Recovery Cloudera Backup and Disaster Recovery Important Note: Cloudera Manager 4 and CDH 4 have reached End of Maintenance (EOM) on August 9, 2015. Cloudera will not support or provide patches for any of the Cloudera

More information

Seven Things To Consider When Evaluating Privileged Account Security Solutions

Seven Things To Consider When Evaluating Privileged Account Security Solutions Seven Things To Consider When Evaluating Privileged Account Security Solutions Contents Introduction 1 Seven questions to ask every privileged account security provider 4 1. Is the solution really secure?

More information

Complying with PCI Data Security

Complying with PCI Data Security Complying with PCI Data Security Solution BRIEF Retailers, financial institutions, data processors, and any other vendors that manage credit card holder data today must adhere to strict policies for ensuring

More information

Enterprise-grade Hadoop: The Building Blocks

Enterprise-grade Hadoop: The Building Blocks Enterprise-grade Hadoop: The Building Blocks An Ovum white paper for MapR Publication Date: 24 Sep 2014 Author name Summary Catalyst Hadoop was initially developed for trusted environments that did not

More information

Data Collection and Analysis: Get End-to-End Security with Cisco Connected Analytics for Network Deployment

Data Collection and Analysis: Get End-to-End Security with Cisco Connected Analytics for Network Deployment White Paper Data Collection and Analysis: Get End-to-End Security with Cisco Connected Analytics for Network Deployment Cisco Connected Analytics for Network Deployment (CAND) is Cisco hosted, subscription-based

More information

Hadoop Trends and Practical Use Cases. April 2014

Hadoop Trends and Practical Use Cases. April 2014 Hadoop Trends and Practical Use Cases John Howey Cloudera jhowey@cloudera.com Kevin Lewis Cloudera klewis@cloudera.com April 2014 1 Agenda Hadoop Overview Latest Trends in Hadoop Enterprise Ready Beyond

More information

A Websense Research Brief Prevent Data Loss and Comply with Payment Card Industry Data Security Standards

A Websense Research Brief Prevent Data Loss and Comply with Payment Card Industry Data Security Standards A Websense Research Brief Prevent Loss and Comply with Payment Card Industry Security Standards Prevent Loss and Comply with Payment Card Industry Security Standards Standards for Credit Card Security

More information

Windows Least Privilege Management and Beyond

Windows Least Privilege Management and Beyond CENTRIFY WHITE PAPER Windows Least Privilege Management and Beyond Abstract Devising an enterprise-wide privilege access scheme for Windows systems is complex (for example, each Window system object has

More information

Certified Big Data and Apache Hadoop Developer VS-1221

Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification

More information

Like what you hear? Tweet it using: #Sec360

Like what you hear? Tweet it using: #Sec360 Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY Like what you hear? Tweet it using: #Sec360 HADOOP SECURITY About Robert: School: UW Madison, U St. Thomas Programming: 15 years, C, C++, Java

More information

Identity and Access Management Integration with PowerBroker. Providing Complete Visibility and Auditing of Identities

Identity and Access Management Integration with PowerBroker. Providing Complete Visibility and Auditing of Identities Identity and Access Management Integration with PowerBroker Providing Complete Visibility and Auditing of Identities Table of Contents Executive Summary... 3 Identity and Access Management... 4 BeyondTrust

More information

OpenHRE Security Architecture. (DRAFT v0.5)

OpenHRE Security Architecture. (DRAFT v0.5) OpenHRE Security Architecture (DRAFT v0.5) Table of Contents Introduction -----------------------------------------------------------------------------------------------------------------------2 Assumptions----------------------------------------------------------------------------------------------------------------------2

More information

IBM InfoSphere BigInsights Enterprise Edition

IBM InfoSphere BigInsights Enterprise Edition IBM InfoSphere BigInsights Enterprise Edition Efficiently manage and mine big data for valuable insights Highlights Advanced analytics for structured, semi-structured and unstructured data Professional-grade

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

Securing Hadoop. Sudheesh Narayanan. Chapter No.1 "Hadoop Security Overview"

In this package, you will find: A Biography of the author of the book A preview chapter from the book, Chapter NO.1 "Hadoop Security

Securing Data in the Virtual Data Center and Cloud: Requirements for Effective Encryption

THE DATA PROTECTION COMPANY whitepaper Executive Summary Long an important security measure, encryption has

Oracle Enterprise Single Sign-on Technical Guide An Oracle White Paper June 2009

EXECUTIVE OVERVIEW Enterprises these days generally have Microsoft Windows desktop users accessing diverse enterprise applications

Apache HBase. Crazy dances on the elephant back

Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage

WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution

BIG DATA TECHNOLOGY LANDSCAPE Over the past decade, the emergence of social media, mobile, and cloud technologies

Alliance Key Manager Solution Brief

KEY MANAGEMENT Enterprise Encryption Key Management On the road to protecting sensitive data assets, data encryption remains one of the most difficult goals. A major

Oracle Database Security Overview

Tammy Bednar Sr. Principal Product Manager tammy.bednar@oracle.com Data Security Challenges What to secure? Sensitive Data: Confidential, PII, regulatory

IBM Data Security Services for endpoint data protection endpoint data loss prevention solution

Automating policy enforcement to prevent endpoint data loss Highlights Facilitate policy-based expertise and

AtScale Intelligence Platform

PUT THE POWER OF HADOOP IN THE HANDS OF BUSINESS USERS. Connect your BI tools directly to Hadoop without compromising scale, performance, or control. TURN HADOOP INTO A HIGH-PERFORMANCE

EXECUTIVE VIEW. CA Privileged Identity Manager. KuppingerCole Report

by Alexei Balaganski, March 2015. CA Privileged Identity Manager is a comprehensive Privileged Identity Management solution for physical and virtual environments with a very broad range of supported

Communicating with the Elephant in the Data Center

Who am I? Instructor Consultant Opensource Advocate http://www.laubersoltions.com sml@laubersolutions.com Twitter: @laubersm Freenode: laubersm Outline

Securing and protecting the organization's most sensitive data

A comprehensive solution using IBM InfoSphere Guardium Data Activity Monitoring and InfoSphere Guardium Data Encryption to provide layered

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

White paper. Four Best Practices for Secure Web Access

What can be done to protect web access? The Web has created a wealth of new opportunities enabling organizations to reduce costs, increase efficiency

Vulnerability Management

Buyer's Guide 01 Introduction 02 Key Components 03 Other Considerations About Rapid7 01 INTRODUCTION Exploiting weaknesses in browsers, operating systems and other

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management

"I am certainly not an advocate for frequent and untried changes laws and institutions must

Debunking The Myths of Column-level Encryption

Vormetric, Inc. 888.267.3732 408.433.6000 sales@vormetric.com www.vormetric.com Column-level Encryption Overview Enterprises have a variety of options

The Payment Card Industry (PCI) Data Security Standards (DSS) v1.2 Requirements:

Compliance Brief Using Server Isolation and Encryption as a Regulatory Compliance Solution and IT Best Practice Introduction