RapidMiner OrangePaper Big Data Security on Hadoop


by Tobias Malbrecht and Zoltan Prekopcsak, February 2015

As an increasing number of enterprises move towards production deployments of Hadoop, security remains an important topic and an integral implementation initiative, often coinciding with the initial deployment of the analytics platforms that run on it. Modern analytics platforms must therefore comply with security standards early on. In this OrangePaper we show how RapidMiner Radoop complies with current and future security implementation standards by providing authentication and authorization and by integrating additional levels such as data encryption support.

Challenge

These days, we see widespread adoption of Hadoop. Hadoop has grown beyond a series of open source projects for programmers, and organizations have matured in their understanding of Big Data technologies and in their expectations of the benefits of Hadoop. Acknowledging the value that can be generated cost-effectively by applying analytics to Big Data in Hadoop, many organizations have successfully passed the proof-of-concept stage and moved on to setting up production clusters. With that, new aspects of deploying Hadoop come into focus. Among these aspects, data security is the one we see coming up most often. Though requirements differ depending on the type of organization and the level of regulation typical of its industry sector, most organizations actively consider and implement security as an integral part of a production Hadoop environment.

The challenge is to deploy solutions that bring analytics to Hadoop while integrating seamlessly with data security policies and platforms, making security transparent and easy to apply for users so that they can build modern analytics without friction.

Analysis

Leading Hadoop vendors share a common understanding of the measures needed to implement Hadoop security. All Hadoop distribution providers promote a 4-layer security model for Hadoop. They sometimes use different names for the security layers, but the underlying concepts are typically similar.

Data Security Implementation Model

Perimeter Security: Authentication
Data Access Security: Authorization
Accountability: Auditing and Data Lineage
Data Protection: Encryption

Perimeter Security: The first level is responsible for authenticating a user, i.e. ensuring that a user is who he or she claims to be. This is usually solved with MIT Kerberos, a well-known system and de facto standard for implementing authentication. Kerberos integrates with LDAP or Active Directory to obtain user information. The Hadoop vendors offer some tooling to manage Kerberos. As an alternative, Hortonworks also promotes Apache Knox as a way of ensuring perimeter authentication.

Data Access: The second level is responsible for authorizing access to data, i.e. granting users access only to the data, services and resources that they are specifically entitled to use. Some Hadoop services such as HDFS already have file permissions and other features to ensure proper authorization, but users sometimes need more fine-grained authorization capabilities (e.g. on a column level or even on the data cell level). Cloudera promotes Apache Sentry for this, while Hortonworks has acquired a company called XA Secure to deliver data access security.

Accountability: The goal of this security level is to foster accountability by allowing administrators to monitor and audit data access. Additional measures include data lineage, which shows where data comes from and how different data sets depend on each other. To support this level of security, Cloudera offers a dedicated product called Navigator, while Hortonworks is again building on XA Secure technology.

Data Protection: The fourth and last aspect of security is also a large field, covering data-at-rest encryption, on-the-wire encryption, data masking, and more. Hadoop vendors usually have some features for this, but they currently rely mostly on partners to provide full-blown solutions.

As of today, many enterprise production deployments of Hadoop already include implementations of perimeter security, with a few also securing data access through authorization. As deployments become more mature, adoption of the security levels will increase: perimeter security and data access security will become standard, and integration with them a necessity for analytics tools. The increasing adoption of Cloud infrastructures will also drive the implementation of data protection, whereas the audit level will be particularly relevant for strongly regulated businesses such as financial services.

Analytics tools integrating with Hadoop, in particular those pushing computation down into Hadoop clusters, need to deal with security levels once they are implemented in Hadoop. At the forefront of in-Hadoop analytics, RapidMiner Radoop brings ease of use and visual analytics workflow development into Hadoop. Continuing to anticipate market needs, RapidMiner Radoop now integrates with Hadoop security implementations to deliver analytics in Hadoop seamlessly and without friction, even on secured Hadoop clusters.

RapidMiner Radoop pushes visually designed analytics workflows down into Hadoop environments for processing, integrating with core Hadoop technologies such as HDFS, MapReduce/YARN and Hive, among others, to execute parts of the workflows. Kerberized Hadoop clusters require authentication via Kerberos when connecting to and accessing these services. As of version 2.2, RapidMiner Radoop integrates with Kerberos authentication. When accessing a Hadoop cluster and any of the services listed above, RapidMiner Radoop requests a ticket from Kerberos and, if authenticated, uses that ticket to gain access to the services. To confirm user information, Kerberos itself typically integrates with an LDAP (Lightweight Directory Access Protocol) or Active Directory server.
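The ticket-based access described above can be illustrated with a toy simulation. This is only a sketch of the message flow (authenticate, obtain a ticket-granting ticket, obtain a service ticket, call the service). It performs no real cryptography and is not how RapidMiner Radoop or Kerberos is implemented. All class names, the principal, and the password are hypothetical, and the in-memory dictionary merely stands in for an LDAP or Active Directory lookup.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    principal: str     # who the ticket was issued to
    service: str       # what the ticket is valid for
    expires_at: float  # absolute expiry time, seconds since the epoch
    token: str         # opaque stand-in for the encrypted ticket body


class ToyKdc:
    """Hypothetical stand-in for a Kerberos server; no real cryptography."""

    TGS = "krbtgt"  # pseudo-service name used for ticket-granting tickets

    def __init__(self, directory, lifetime_s=600):
        self._directory = directory  # principal -> secret (stand-in for LDAP/AD)
        self._lifetime_s = lifetime_s
        self._issued = set()

    def _issue(self, principal, service):
        ticket = Ticket(principal, service,
                        time.time() + self._lifetime_s, secrets.token_hex(8))
        self._issued.add(ticket)
        return ticket

    def authenticate(self, principal, secret):
        # Steps 1-2: request authentication, receive a ticket-granting ticket.
        if self._directory.get(principal) != secret:
            raise PermissionError("authentication failed")
        return self._issue(principal, self.TGS)

    def grant_service_ticket(self, tgt, service):
        # Steps 3-4: present the TGT, receive a ticket for one service.
        if not self.is_valid(tgt, self.TGS):
            raise PermissionError("invalid or expired ticket-granting ticket")
        return self._issue(tgt.principal, service)

    def is_valid(self, ticket, service):
        return (ticket in self._issued
                and ticket.service == service
                and ticket.expires_at > time.time())


class ToyService:
    """Hypothetical Hadoop service (e.g. Hive) that only accepts valid tickets."""

    def __init__(self, name, kdc):
        self.name = name
        self._kdc = kdc

    def handle(self, ticket, request):
        # Step 5: access the service with the service ticket.
        # Simplification: a real service verifies tickets with its own key
        # instead of asking the Kerberos server.
        if not self._kdc.is_valid(ticket, self.name):
            raise PermissionError("service ticket rejected")
        return f"{self.name} ran {request!r} for {ticket.principal}"


if __name__ == "__main__":
    kdc = ToyKdc({"alice@EXAMPLE.COM": "s3cret"})
    hive = ToyService("hive", kdc)
    tgt = kdc.authenticate("alice@EXAMPLE.COM", "s3cret")
    service_ticket = kdc.grant_service_ticket(tgt, "hive")
    print(hive.handle(service_ticket, "SHOW TABLES"))
```

The point of the two-stage design is that the user's secret is presented only once; every later service request carries a short-lived ticket scoped to a single service.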

Kerberos Authentication (figure: exchange between RapidMiner Radoop and the Kerberos Authentication Server)

1. Request Authentication
2. Grant Ticket-Granting Ticket
3. Request Service Ticket
4. Grant Service Session Ticket
5. Access Hadoop Service (e.g. Hive)

Beyond authentication, RapidMiner Radoop now also supports data access authorization employing Apache Sentry. In several distributions, Apache Sentry is used to control access, e.g. to tables in Hive.

As with any other configuration requirement, configuring Kerberos authentication support in RapidMiner Radoop is easy. RapidMiner Radoop hides administration and configuration complexity and exposes only the necessary settings to the user. Effectively, the configuration and administration effort required from IT for RapidMiner Radoop as an in-Hadoop analytics solution is reduced to a minimum.

With perimeter security and data access security supported for most Hadoop clusters (given the broad adoption of Kerberos and Sentry), RapidMiner Radoop already delivers security for a large portion of the production clusters deployed within organizations. In upcoming platform releases, RapidMiner Radoop will be broadened to support, early on, those evolving security measures that have the potential to be adopted as security standards within enterprises. With that, RapidMiner Radoop is future-proof, delivering easy-to-use in-Hadoop analytics on any Hadoop cluster regardless of the security implementations involved.
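The table-level access control mentioned in this section can be illustrated with a small sketch. It assumes a role-based model in which privileges are granted to roles and roles are assigned to groups, which is the general pattern authorization tools of this kind follow. The class, the method names, and the table and group names are hypothetical and are not the Apache Sentry API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPolicy:
    """Hypothetical role-based policy store; not the Apache Sentry API."""

    group_roles: dict = field(default_factory=dict)      # group -> set of roles
    role_privileges: dict = field(default_factory=dict)  # role -> set of (action, table)

    def add_group_role(self, group, role):
        self.group_roles.setdefault(group, set()).add(role)

    def grant(self, role, action, table):
        self.role_privileges.setdefault(role, set()).add((action, table))

    def is_allowed(self, groups, action, table):
        # Access is granted if any role of any of the user's groups holds
        # the requested privilege (or ALL) on the table.
        for group in groups:
            for role in self.group_roles.get(group, ()):
                privileges = self.role_privileges.get(role, ())
                if (action, table) in privileges or ("ALL", table) in privileges:
                    return True
        return False


if __name__ == "__main__":
    policy = ToyPolicy()
    policy.add_group_role("analysts", "sales_reader")
    policy.grant("sales_reader", "SELECT", "sales.orders")
    print(policy.is_allowed(["analysts"], "SELECT", "sales.orders"))  # allowed
    print(policy.is_allowed(["analysts"], "INSERT", "sales.orders"))  # denied
```

An analytics tool that pushes queries down to Hive under such a policy would see a query fail for any table the authenticated user's groups cannot read, which is why the tool has to run its jobs under the user's own identity.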

Conclusion

With the increased adoption of security implementations for Hadoop, organizations add perimeter security through authentication, implement data access authorization, set up auditing measures and encrypt data for better protection. RapidMiner Radoop complies with the currently implemented security levels and seamlessly integrates analytics with secured Hadoop clusters. Furthermore, RapidMiner Radoop makes security configuration very easy, providing hassle-free connectivity and smooth deployment of RapidMiner Radoop as an analytics platform for Hadoop.

In particular, RapidMiner Radoop integrates with Kerberos authentication and with data access authorization using Apache Sentry. Other security implementations, providing data access authorization for all distributions and allowing encrypted data to be read, are planned for integration, as we expect the importance of security for Hadoop to strengthen further and security implementations to gain more traction in the market.

With that, RapidMiner Radoop leads not only in the way it does analytics on Big Data, offering the visual design of analytical workflows and pushdown computation of these workflows. It also leads in how it integrates with heterogeneous Hadoop infrastructures and security implementations, by anticipating the trends in implementing security for Hadoop and complying with the standards of tomorrow, today.

All content © 2014 RapidMiner

RapidMiner provides software, solutions, and services in the field of advanced analytics, including predictive analytics, data mining, and text mining.

Tobias Malbrecht is Director of Product Management and Product Marketing at RapidMiner. Before that, Tobias headed the consulting services unit of RapidMiner and also served as a consultant and product engineer. Tobias holds master's degrees in computer science, economics, and business administration from the Technical University of Dortmund, Germany.

Zoltan Prekopcsak is the V.P. of Big Data at RapidMiner and has experience in data-driven projects in industries including telecommunications, financial services, e-commerce, and neuroscience. Previously, he was co-founder/CEO of Radoop before its acquisition by RapidMiner and a data scientist at Secret Sauce Partners, Inc., and he has been a lecturer at Budapest University of Technology and Economics.


More information

Federation At Fermilab. Al Lilianstrom National Laboratories Information Technology Summit May 2015

Federation At Fermilab. Al Lilianstrom National Laboratories Information Technology Summit May 2015 Federation At Fermilab Al Lilianstrom National Laboratories Information Technology Summit May 2015 About Fermilab Since 1967, Fermilab has worked to answer fundamental questions and enhance our understanding

More information

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015

Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 Hortonworks and ODP: Realizing the Future of Big Data, Now Manila, May 13, 2015 We Do Hadoop Fall 2014 Page 1 HDP delivers a comprehensive data management platform GOVERNANCE Hortonworks Data Platform

More information

Radoop: Analyzing Big Data with RapidMiner and Hadoop

Radoop: Analyzing Big Data with RapidMiner and Hadoop Radoop: Analyzing Big Data with RapidMiner and Hadoop Zoltán Prekopcsák, Gábor Makrai, Tamás Henk, Csaba Gáspár-Papanek Budapest University of Technology and Economics, Hungary Abstract Working with large

More information

Olivier Renault Solu/on Engineer Hortonworks. Hadoop Security

Olivier Renault Solu/on Engineer Hortonworks. Hadoop Security Olivier Renault Solu/on Engineer Hortonworks Hadoop Security Agenda Why security Kerberos HDFS ACL security Network security - KNOX Hive - doas = False - ATZ-NG YARN ACL p67-91 Capacity scheduler ACL Killing

More information

Addressing Open Source Big Data, Hadoop, and MapReduce limitations

Addressing Open Source Big Data, Hadoop, and MapReduce limitations Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing hadoop distributions Going enterprise with Hadoop 2 How Big are Data?

More information

owncloud Architecture Overview

owncloud Architecture Overview owncloud Architecture Overview Time to get control back Employees are using cloud-based services to share sensitive company data with vendors, customers, partners and each other. They are syncing data

More information

Control-M for Hadoop. Technical Bulletin. www.bmc.com

Control-M for Hadoop. Technical Bulletin. www.bmc.com Technical Bulletin Control-M for Hadoop Version 8.0.00 September 30, 2014 Tracking number: PACBD.8.0.00.004 BMC Software is announcing that Control-M for Hadoop now supports the following: Secured Hadoop

More information

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi

Getting Started with Hadoop. Raanan Dagan Paul Tibaldi Getting Started with Hadoop Raanan Dagan Paul Tibaldi What is Apache Hadoop? Hadoop is a platform for data storage and processing that is Scalable Fault tolerant Open source CORE HADOOP COMPONENTS Hadoop

More information

Open Directory. Apple s standards-based directory and network authentication services architecture. Features

Open Directory. Apple s standards-based directory and network authentication services architecture. Features Open Directory Apple s standards-based directory and network authentication services architecture. Features Scalable LDAP directory server OpenLDAP for providing standards-based access to centralized data

More information

Bringing the Power of SAS to Hadoop. White Paper

Bringing the Power of SAS to Hadoop. White Paper White Paper Bringing the Power of SAS to Hadoop Combine SAS World-Class Analytic Strength with Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities Contents Introduction... 1 What

More information

Red Hat Enterprise IPA Identity & Access Management for Linux and Unix Environments. Dragos Manac 01.10.2008

Red Hat Enterprise IPA Identity & Access Management for Linux and Unix Environments. Dragos Manac 01.10.2008 Red Hat Enterprise IPA Identity & Access Management for Linux and Unix Environments Dragos Manac 01.10.2008 Agenda The Need for Identity & Access Management Enterprise IPA Overview Pricing Questions to

More information

How to avoid building a data swamp

How to avoid building a data swamp How to avoid building a data swamp Case studies in Hadoop data management and governance Mark Donsky, Product Management, Cloudera Naren Korenu, Engineering, Cloudera 1 Abstract DELETE How can you make

More information

Kerberos. Public domain image of Heracles and Cerberus. From an Attic bilingual amphora, 530 520 BC. From Italy (?).

Kerberos. Public domain image of Heracles and Cerberus. From an Attic bilingual amphora, 530 520 BC. From Italy (?). Kerberos Public domain image of Heracles and Cerberus. From an Attic bilingual amphora, 530 520 BC. From Italy (?). 1 Kerberos Kerberos is an authentication protocol and a software suite implementing this

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Please give me your feedback

Please give me your feedback Please give me your feedback Session BB4089 Speaker Claude Lorenson, Ph. D and Wendy Harms Use the mobile app to complete a session survey 1. Access My schedule 2. Click on this session 3. Go to Rate &

More information

Datameer Cloud. End-to-End Big Data Analytics in the Cloud

Datameer Cloud. End-to-End Big Data Analytics in the Cloud Cloud End-to-End Big Data Analytics in the Cloud Datameer Cloud unites the economics of the cloud with big data analytics to deliver extremely fast time to insight. With Datameer Cloud, empowered line

More information

Hadoop & Spark Using Amazon EMR

Hadoop & Spark Using Amazon EMR Hadoop & Spark Using Amazon EMR Michael Hanisch, AWS Solutions Architecture 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Agenda Why did we build Amazon EMR? What is Amazon EMR?

More information

Cloud security architecture

Cloud security architecture ericsson White paper Uen 284 23-3244 January 2015 Cloud security architecture from process to deployment The Trust Engine concept and logical cloud security architecture presented in this paper provide

More information

Integrate Big Data into Business Processes and Enterprise Systems. solution white paper

Integrate Big Data into Business Processes and Enterprise Systems. solution white paper Integrate Big Data into Business Processes and Enterprise Systems solution white paper THOUGHT LEADERSHIP FROM BMC TO HELP YOU: Understand what Big Data means Effectively implement your company s Big Data

More information

Mitra Innovation Leverages WSO2's Open Source Middleware to Build BIM Exchange Platform

Mitra Innovation Leverages WSO2's Open Source Middleware to Build BIM Exchange Platform Mitra Innovation Leverages WSO2's Open Source Middleware to Build BIM Exchange Platform May 2015 Contents 1. Introduction... 3 2. What is BIM... 3 2.1. History of BIM... 3 2.2. Why Implement BIM... 4 2.3.

More information

QUICK FACTS. Delivering a Unified Data Architecture for Sony Computer Entertainment America TEKSYSTEMS GLOBAL SERVICES CUSTOMER SUCCESS STORIES

QUICK FACTS. Delivering a Unified Data Architecture for Sony Computer Entertainment America TEKSYSTEMS GLOBAL SERVICES CUSTOMER SUCCESS STORIES [ Consumer goods, Data Services ] TEKSYSTEMS GLOBAL SERVICES CUSTOMER SUCCESS STORIES QUICK FACTS Objectives Develop a unified data architecture for capturing Sony Computer Entertainment America s (SCEA)

More information

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson

The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson The Future of Big Data SAS Automotive Roundtable Los Angeles, CA 5 March 2015 Mike Olson Chief Strategy Officer, Cofounder @mikeolson 1 A New Platform for Pervasive Analytics Multiple big data opportunities

More information

Modern Data Architecture for Predictive Analytics

Modern Data Architecture for Predictive Analytics Modern Data Architecture for Predictive Analytics David Smith VP Marketing and Community - Revolution Analytics John Kreisa VP Strategic Marketing- Hortonworks Hortonworks Inc. 2013 Page 1 Your Presenters

More information

Comprehensive Analytics on the Hortonworks Data Platform

Comprehensive Analytics on the Hortonworks Data Platform Comprehensive Analytics on the Hortonworks Data Platform We do Hadoop. Page 1 Page 2 Back to 2005 Page 3 Vertical Scaling Page 4 Vertical Scaling Page 5 Vertical Scaling Page 6 Horizontal Scaling Page

More information

DISCOVERING AND SECURING SENSITIVE DATA IN HADOOP DATA STORES

DISCOVERING AND SECURING SENSITIVE DATA IN HADOOP DATA STORES DATAGUISE WHITE PAPER SECURING HADOOP: DISCOVERING AND SECURING SENSITIVE DATA IN HADOOP DATA STORES OVERVIEW: The rapid expansion of corporate data being transferred or collected and stored in Hadoop

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

Microsoft Analytics Platform System. Solution Brief

Microsoft Analytics Platform System. Solution Brief Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal

More information

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service

Cloudera Enterprise Data Hub. GCloud Service Definition Lot 3: Software as a Service Cloudera Enterprise Data Hub GCloud Service Definition Lot 3: Software as a Service December 2014 1 SERVICE OVERVIEW & SOLUTION... 4 1.1 Service Overview... 4 1.2 Introduction to Cloudera... 5 1.3 Cloudera

More information

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop

Hadoop Data Hubs and BI. Supporting the migration from siloed reporting and BI to centralized services with Hadoop Hadoop Data Hubs and BI Supporting the migration from siloed reporting and BI to centralized services with Hadoop John Allen October 2014 Introduction John Allen; computer scientist Background in data

More information

How To Turn Big Data Into An Insight

How To Turn Big Data Into An Insight mwd a d v i s o r s Turning Big Data into Big Insights Helena Schwenk A special report prepared for Actuate May 2013 This report is the fourth in a series and focuses principally on explaining what s needed

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information