Oracle Big Data Appliance X5-2

Similar documents
Oracle Big Data Appliance X5-2

Oracle Big Data Appliance X5-2

ORACLE BIG DATA APPLIANCE X4-2

ORACLE BIG DATA APPLIANCE X3-2

SUN ORACLE DATABASE MACHINE

ORACLE EXADATA DATABASE MACHINE X2-8

SUN ORACLE EXADATA STORAGE SERVER

SUN ORACLE DATABASE MACHINE

An Oracle White Paper June High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

ORACLE DATA SHEET RELATED PRODUCTS AND SERVICES RELATED PRODUCTS

VIRTUAL COMPUTE APPLIANCE

MACHINE X2-22 ORACLE EXADATA DATABASE ORACLE DATA SHEET FEATURES AND FACTS FEATURES

ORACLE PRIVATE CLOUD APPLIANCE

MACHINE X2-8 ORACLE EXADATA DATABASE ORACLE DATA SHEET FEATURES AND FACTS FEATURES

ORACLE EXADATA STORAGE SERVER X2-2

Oracle Big Data Management System

ORACLE EXADATA STORAGE SERVER X4-2

MACHINE X3-8 ORACLE DATA SHEET ORACLE EXADATA DATABASE FEATURES AND FACTS FEATURES

The Five Most Common Big Data Integration Mistakes To Avoid O R A C L E W H I T E P A P E R A P R I L

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

Oracle Database Appliance X5-2

ORACLE EXALYTICS IN-MEMORY MACHINE X3-4

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

STORAGETEK SL150 MODULAR TAPE LIBRARY

Big Data and Natural Language: Extracting Insight From Text

ORACLE SUPERCLUSTER T5-8

STORAGETEK SL150 MODULAR TAPE LIBRARY

Oracle Big Data Spatial and Graph

Oracle Database Cloud Exadata Service. Transforming The Cloud for Mission-critical Business Operations

Running Oracle s PeopleSoft Human Capital Management on Oracle SuperCluster T5-8 O R A C L E W H I T E P A P E R L A S T U P D A T E D J U N E

Oracle Big Data Fundamentals Ed 1 NEW

Oracle Big Data SQL Technical Update

Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>

ORACLE INFRASTRUCTURE AS A SERVICE PRIVATE CLOUD WITH CAPACITY ON DEMAND

ORACLE DATABASE APPLIANCE X3-2

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

Safe Harbor Statement

Constructing a Data Lake: Hadoop and Oracle Database United!

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

STORAGETEK SL150 MODULAR TAPE LIBRARY

ORACLE EXADATA DATABASE MACHINE X4-2

ORACLE EXADATA DATABASE MACHINE X3-2

Virtual Compute Appliance Frequently Asked Questions

SUN BLADE 6000 CHASSIS

BACKUP AND RECOVERY FOR ORACLE ENGINEERED SYSTEMS WITH ORACLE'S SUN ZFS BACKUP APPLIANCE

An Oracle White Paper October Oracle Database Appliance

An Oracle White Paper May Exadata Smart Flash Cache and the Oracle Exadata Database Machine

SUN HARDWARE FROM ORACLE: PRICING FOR EDUCATION

ORACLE EXADATA STORAGE SERVER X3-2

ORACLE OPS CENTER: PROVISIONING AND PATCH AUTOMATION PACK

Simplify IT and Reduce TCO: Oracle s End-to-End, Integrated Infrastructure for SAP Data Centers

An Oracle White Paper September Oracle: Big Data for the Enterprise

An Oracle White Paper June Oracle: Big Data for the Enterprise

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Oracle Internet of Things Cloud Service

Executive Summary... 2 Introduction Defining Big Data The Importance of Big Data... 4 Building a Big Data Platform...

Oracle Database Cloud Exadata Service

An Oracle White Paper February Oracle Data Integrator 12c Architecture Overview

Oracle Big Data Discovery The Visual Face of Hadoop

ORACLE FINANCIAL SERVICES ANALYTICAL APPLICATIONS INFRASTRUCTURE

IBM Netezza High Capacity Appliance

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT

SUN SPARC ENTERPRISE M3000 SERVER

STORAGETEK VIRTUAL STORAGE MANAGER SYSTEM

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ORACLE EXALYTICS IN-MEMORY MACHINE X4-4

SUN FIRE X4170 SERVER

Oracle Database Backup Service. Secure Backup in the Oracle Cloud

Oracle Financial Management Analytics

IBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Oracle Fusion Accounting Hub Reporting Cloud Service

Oracle FS1-2 Flash Storage System

An Oracle White Paper October Oracle Data Integrator 12c New Features Overview

An Oracle White Paper October Oracle: Big Data for the Enterprise

Oracle Data Integrator 12c (ODI12c) - Powering Big Data and Real-Time Business Analytics. An Oracle White Paper October 2013

PeopleSoft Compensation

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

ORACLE CRM ON DEMAND RELEASE 30

ORACLE DATA SHEET KEY FEATURES AND BENEFITS ORACLE WEBLOGIC SERVER STANDARD EDITION

ORACLE OPS CENTER: VIRTUALIZATION MANAGEMENT PACK

STORAGETEK 2540 ARRAY

SUN SPARC ENTERPRISE M4000 SERVER

Introducing Oracle Exalytics In-Memory Machine

ORACLE CLOUD MANAGEMENT PACK FOR ORACLE DATABASE

Oracle Database 12c Plug In. Switch On. Get SMART.

ORACLE EXADATA DATABASE MACHINE X5-2

Oracle Data Integrator 12c New Features Overview Advancing Big Data Integration O R A C L E W H I T E P A P E R M A R C H

An Oracle White Paper November Backup and Recovery with Oracle s Sun ZFS Storage Appliances and Oracle Recovery Manager

Sun SPARC Enterprise M5000 Server Specifications Processor

Migrating Non-Oracle Databases and their Applications to Oracle Database 12c O R A C L E W H I T E P A P E R D E C E M B E R

ORACLE COHERENCE 12CR2

APPLICATION MANAGEMENT SUITE FOR SIEBEL APPLICATIONS

ORACLE VM MANAGEMENT PACK

Oracle SDN Performance Acceleration with Software-Defined Networking

Top Ten Reasons for Deploying Oracle Virtual Networking in Your Data Center

StorageTek SL8500 Modular Library System

Advanced Matching and IHE Profiles

ORACLE FUSION ACCOUNTING HUB

Transcription:

Oracle Big Data Appliance X5-2 Oracle Big Data Appliance is a flexible, high-performance, secure platform for running diverse workloads on Hadoop and NoSQL systems. With Oracle Big Data SQL, Oracle Big Data Appliance extends Oracle s industry-leading implementation of SQL to Hadoop and NoSQL systems. By combining the newest technologies from the Hadoop ecosystem and powerful Oracle SQL capabilities together on a single pre-configured platform, Oracle Big Data Appliance is uniquely able to support rapid development of new Big Data applications and tight integration with existing relational data. Oracle Big Data Appliance X5-2 Oracle Big Data Appliance is an open, multi-purpose engineered system for Hadoop and NoSQL processing. It runs a diverse set of workloads from Hadoop-only workloads (MapReduce 2, Spark, Hive etc.) to interactive, all-encompassing interactive SQL queries using Oracle Big Data SQL. KEY FEATURES Massively scalable, open infrastructure to store, analyze and manage big data Flexible configuration and elastic hardware choices for optimizing both floor space and growth path for Hadoop and Oracle NoSQL Database Oracle Big Data SQL delivers unprecedented integration of Big Data and Oracle Database Industry-leading security, performance and the most comprehensive big data tool set on the market all bundled in an easy to deploy appliance Big Data Connectors delivers load rates of up to 15TB per hour between Big Data Appliance and Oracle Exadata Cloudera s comprehensive software suite including Cloudera Distribution Big Data Appliance provides an open environment for innovation while maintaining tight integration and enterprise-level support. Organizations can deploy external software to support new functionality such as graph analytics, natural language processing and fraud detection. Support for non-oracle components is delivered by their respective support channels and not by Oracle. Oracle Big Data SQL Oracle Big Data SQL is an innovation from Oracle only available on Oracle Big Data Appliance. It is a new architecture for SQL on Hadoop, seamlessly integrating data in Hadoop and NoSQL with data in Oracle Database. Using Oracle Big Data SQL, organizations can: Combine data from Oracle Database, Hadoop and NoSQL in a single SQL query Query and analyze data in Hadoop and NoSQL Integrate big data analysis into existing applications and architectures Extend security and access policies from Oracle Database to data in Hadoop and NoSQL Maximize query performance on all data using Smart Scan

including Apache Hadoop and Apache Spark Oracle Enterprise Manager combined with Cloudera Manager simplifies management of the entire Big Data Appliance Advanced analytics with Oracle R directly interacting with data stored in HDFS Handle low-latency workloads with the pre-installed and configured Oracle NoSQL Database InfiniBand connectivity between nodes and across racks as well as to Oracle Exadata and other Oracle Engineered Systems KEY BENEFITS Optimized, Complete and Secure Big Data Solution Oracle Big Data SQL provides the most complete SQL solution for Big Data when integrated with Oracle Exadata Most comprehensive big data tool set integrated in a single appliance Risk-free installation and rapid time to value Simplified operations, updates and patch management though a single command utility of the entire stack (OS, Java, Oracle NoSQL Database and the Cloudera stack) Single Management Console integrating Big Data Appliance hardware and software monitoring through Oracle Enterprise Manager Single-vendor support for your entire big data solution covering both hardware and software Oracle Big Data SQL radically simplifies integrating and operating in the big data domain through two powerful features: newly expanded External Tables and Smart Scan functionality on Hadoop. Using new external table types, data in Hadooop and NoSQL is exposed to Oracle Database users. These tables, once defined, automatically discover Hive metadata including data location and data parsing requirements (i.e. SerDes and StorageHandlers). This enables SQL queries to access the data in its existing format leveraging native parsing constructs. Oracle s unique Smart Scan capability brings the proven storage processing innovations of Oracle Exadata to Oracle Big Data Appliance. The biggest performance penalties in data processing are typically the result of excess data movement. Instead of sending all scanned data to the compute resources, Smart Scan on Hadoop radically minimizes data movement to the compute nodes by applying the following techniques at the storage level: Data-local scans o Hadoop data is read using native operators local to the node Column projection o Only relevant columns are returned from the source Predicate evaluation o Only relevant rows are returned from the source Complex function evaluation o SQL operators on JSON and XML types applied at the source o Model scoring and analytical operators evaluated at the source Smart Scan coexists with other Hadoop services and does not require any changes to Hadoop itself, thus staying in line with the open environment Oracle Big Data Appliance provides. Oracle Big Data Spatial and Graph Oracle Big Data Spatial and Graph provides advanced spatial analytic capabilities and a graph database on both Oracle Big Data Appliance and other supported Apache Hadoop and NoSQL database platforms. The property graph component gives users a scalable graph database with industryleading in-memory analytics. It includes 35 pre-built graph analytics enabling users to easily discover relationships, communities, influencers, and other graph patterns. The graph database is hosted on either Apache HBase or Oracle NoSQL Database and supports popular scripting languages like Python, Groovy, the open source Tinkerpop stack as well as a Java API. The spatial analytics and services include a data enrichment service to harmonize data based on locations and place names and a wide range of 2D, 3D and raster algorithms to analyze location relationships among persons and assets in for example social media or log data. It can apply city, state, and country categorization and process and visualize geospatial map data and satellite imagery. 2 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

RELATED PRODUCTS Oracle Exadata Oracle Big Data SQL Oracle Big Data Spatial and Graph Oracle Big Data Connectors Oracle Big Data Discovery Oracle NoSQL Database Oracle Exalytics Oracle Business Intelligence Enterprise Edition Oracle GoldenGate for Big Data Oracle Data Integrator Oracle Stream Explorer Oracle Enterprise Manager RELATED SERVICES Advanced Customer Services Product Support Services Consulting Services Oracle University Training Oracle Big Data Connectors In addition to providing Oracle Big Data SQL and the full Cloudera software platform, Big Data Appliance utilizes Oracle Big Data Connectors to simplify data integration and analytics. Big Data Connectors provide high-speed access to data in Hadoop from Oracle Exadata and Oracle Database with data transfer rates on the order of 15 TB/hour. Big Data Connectors also enable integrated, highly scalable analytics providing native access to Hadoop data and parallel processing using Oracle R Distribution. Finally, Oracle XQuery for Hadoop facilitates standard XQuery operations to process and transform documents in various formats (JSON, XML, Avro and others), executing in parallel across the Hadoop cluster. Lower TCO and Faster Time to Value Big Data Appliance provides unique pricing to offer both a lower initial deployment cost as well as a dramatically reduced three to four-year TCO when compared to a DIY Hadoop system. Big Data Appliance bundles the hardware (servers, high-speed networking, power distribution units and peripherals), OS support and subscription costs for the Cloudera software into a single price for the life of the system. A single hardware support license covers the hardware as well as the integrated software. Organizations do not want to spend valuable intellectual capital assembling and tuning an optimized Hadoop/NoSQL infrastructure, especially when these resources can be applied to delivering high value business solutions. Big Data Appliance delivers a preconfigured, highly tuned environment for Cloudera Enterprise and Oracle NoSQL Database. This optimized environment enables companies to focus their resources on developing compelling business applications lowering the risk for the solution. Additionally, the pre-tuned environment avoids extensive ramp-up time for new applications due to performance and production issues. Comprehensive Security Securing data is critical to Big Data solutions in the enterprise; Big Data Appliance provides strong authentication, authorization and auditing of data in Hadoop out of the box. Strong authentication is provided using Kerberos. This ensures that all users are who they claim to be and that rogue services are not added to the system. Big Data Appliance leverages Apache Sentry (an open-source project of which Oracle developers are founding members) to authorize SQL access via tools like Hive and Impala. By delivering and developing Sentry, Oracle delivers Big Data Appliance with the highest data security levels currently available for Hadoop. Both network encryption and encryption of data-at-rest are included with Oracle Big Data Appliance and supported by Oracle. Big Data Appliance supports the latest innovations in encryption of data-at-rest by supporting native HDFS encryption with a key management facility. This implementation enables the tightest security on all data in HDFS. Network encryption prevents network sniffing from capturing protected data and is enabled on Big Data Appliance through a simple check box. To ensure security and data access compliance, Big Data Appliance integrates with Oracle Audit Vault and Database Firewall. An Oracle Audit Vault agent is pre-installed 3 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

on Big Data Appliance to track and audit data access on the Hadoop system. By leveraging Oracle Audit Vault and Database Firewall, all auditing across the organization is consolidated into a single audit repository ensuring a comprehensive view across all data. In addition to securing the Hadoop system, Oracle Big Data SQL enables organizations to leverage Oracle Database security capabilities when querying Hadoop and NoSQL data. A secure Big Data Appliance combined with Oracle Big Data SQL delivers the most comprehensive security of any big data system. Simplified Operations Oracle Enterprise Manager provides a single entry point for managing the entire system both hardware and software providing continuity across other Oracle products in the organization. To provide deep management capabilities for Hadoop, Enterprise Manager enables a context-aware integration with Cloudera Manager. Big Data Appliance simplifies day-to-day operations by providing a simple onecommand installation, update, patch and expansion utility Mammoth which enables rapid deployment updates (typically quarterly) to the frequently evolving Hadoop stack without incurring significant downtime. Mammoth also enables Oracle-tested, seamless upgrades between Hadoop versions and automated service management to ensure the best balance between Hadoop Master Nodes and Data Nodes. Big Data Appliance is supported by Oracle, giving organizations a single point of support for their hardware, all integrated software (including all Cloudera software) and any additional Oracle software installed. Elastic Configurations Big Data Appliance is designed to expand as your data and requirements grow. Initial big data implementations may start with Big Data Appliance Starter Rack. This six server rack comes fully equipped with a complete set of switches and power distribution units (PDU) required for a full rack. The Starter Rack and switching gear enables the appliance to be easily and efficiently expanded in single node hardware increments to larger configurations using Oracle Big Data Appliance X5-2 High Capacity (HC) Node plus InfiniBand Infrastructure. Figure 1. Modular Hardware Building Blocks In addition to expanding the system within a rack, multiple racks can be connected using 4 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

the integrated InfiniBand fabric to form larger configurations; up to 18 racks can be connected in a non-blocking manner by connecting InfiniBand cables without the need for any external switches. Larger non-blocking configurations are supported with additional external InfiniBand switches, while larger blocking network configurations can be supported without additional switches. The use of InfiniBand dramatically reduces the cost of large configurations by reducing the need for the top of rack switching fabric. Big Data Appliance is multitenant; it can be configured as a single cluster or as a set of clusters. This provides the flexibility customers need when deploying development, test and production clusters. Software Details Big Data Appliance X5-2 Included Software Operating System: Oracle Linux 6 Integrated Software: Cloudera Enterprise 5 Data Hub Edition with support for: Other: Cloudera s Distribution including Apache Hadoop (CDH) Cloudera Impala Cloudera Search Apache HBase and Apache Accumulo Apache Spark Apache Kafka Cloudera Manager with support for: Cloudera Navigator Cloudera Back-up and Disaster Recovery (BDR) Oracle Java JDK 7 or 8 MySQL Database Standard Edition* Oracle Big Data Appliance Enterprise Manager Plug-In Oracle R Distribution Oracle NoSQL Database Community Edition (CE)** * Restricted Use License ** Oracle NoSQL Database CE Support is not included in the Support for Big Data Appliance. Instead a separate support subscription for Oracle NoSQL DB CE is required. Big Data Appliance X5-2 Optional Software Oracle Big Data SQL Oracle Big Data Connectors: Oracle SQL Connector for Hadoop Oracle Loader for Hadoop Oracle XQuery for Hadoop Oracle R Advanced Analytics for Hadoop Oracle Data Integrator 5 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

Oracle Audit Vault and Database Firewall for Hadoop Auditing Oracle Data Integrator Oracle GoldenGate Oracle NoSQL Database Enterprise Edition Oracle Big Data Spatial and Graph Oracle Big Data Discovery Hardware Details and Specifications BIG DATA APPLIANCE X5-2 HARDWARE Full Rack Starter Rack HC Node plus InfiniBand Infrastructure 18 x Compute / Storage Nodes 6 x Compute / Storage Nodes 1 x Compute / Storage Nodes Per Node: 2 x 18-Core 2.3 GHz Intel Xeon E5-2699 v3 8 x 16GB DDR4 Memory (expandable to maximum of 768GB per node) 12 x 8TB 7,200 RPM High Capacity SAS Disks 2 x QDR 40Gb/sec InfiniBand Ports 4 x 10 Gb Ethernet Ports 1 x ILOM Ethernet Port 2 x 32 Port QDR InfiniBand Leaf Switch 32 x InfiniBand ports 8 x 10Gb Ethernet ports 1 x 36 Port QDR InfiniBand Spine Switch 36 x InfiniBand Ports Additional Hardware Components included: Ethernet Administration Switch 2 x Redundant Power Distributions Units (PDUs) 42U rack packaging Leverages the leaf switches from the Starter Rack Leverages the spine switch from the Starter Rack Leverages the administration switch, PDUs and base rack from the Starter Rack Spares Kit Included: 2 x 8 TB High Capacity SAS disk InfiniBand cables Leverages the spares kit from the Starter Rack BIG DATA APPLIANCE X5-2 COMPONENT ENVIRONMENTAL SPECIFICATIONS X5-2 High Capacity Node plus InfiniBand Infrastructure Physical Dimensions Height Width 3.5 in. (87.6 mm) 17.5 in. (445.0 mm) 6 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

Depth Weight Power Cooling Airflow 2 29.0 in. (737.0 mm) 33.1 kg (73.0 lbs) Maximum: 0.7 kw Typical 1 : 0.5 kw Maximum: 2,481 BTU/hour Typical 1 : 1,736 BTU/hour Maximum: 115 CFM Typical 1 : 80 CFM Operating temperature/humidity: 5 ºC to 32 ºC (41 ºF to 89.6 ºF), 10% to 90% relative humidity, noncondensing Altitude Operating: Up to 3,048 m, max. ambient temperature is de-rated by 1 C per 300m above 900m 1 Typical power usage varies by application workload 2 Airflow must be front to back BIG DATA APPLIANCE X5-2 ENVIRONMENTAL SPECIFICATIONS Physical Dimensions Full Rack Height Width Depth 42U, 78.66-1998 mm 23.62-600mm 47.24-1200 mm Weight Rack Shipping Starter Rack 415kg 915lbs 546kg 1203lbs Full Rack 836kg 1843lbs 979kg 2158lbs Power Starter Rack Full Rack Maximum: 5.2KW Typical 1 : 3.6KW Maximum: 13.0KW Typical 1 : 9.1KW Cooling Starter Rack Full Rack Maximum:17,521 BTU/hour Typical 1 : 12,265 BTU/hour Maximum: 44,136 BTU/hour Typical 1 : 30,895 BTU/hour Airflow 2 Starter Rack Full Rack Maximum: 811 CFM Typical 1 : 568 CFM Maximum: 2043 CFM 7 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

Typical 1 : 1430 CFM Big Data Appliance X5-2 Further Environmental Specifications Operating temperature/humidity: 5 ºC to 32 ºC (41 ºF to 89.6 ºF), 10% to 90% relative humidity, noncondensing Altitude Operating: Up to 3,048 m, max. ambient temperature is de-rated by 1 C per 300m above 900m Big Data Appliance X5-2 Regulations and Certifications Regulations 3 Safety: UL/CSA 60950-1, EN 60950-1, IEC 60950-1 CB Scheme with all country differences RFI/EMI: EN55022, EN61000-3-11, EN61000-3-12 Immunity: EN 55024 Emissions and Immunity: EN300 386 Certifications 3 North America (NRTL), European Union (EU), International CB Scheme, BSMI (Taiwan), C-Tick (Australia), CCC (PRC), MSIP (Korea), CU EAC (Customs Union), VCCI (Japan) European Union Directives 3 2006/95/EC Low Voltage Directive, 2004/108/EC EMC Directive, 2011/65/EU RoHS Directive, 2012/19/EU WEEE Directive 1 Typical power usage varies by application workload 2 Airflow must be front to back 3 All standards and certifications referenced are to the latest official version at the time the data sheet was written. Other country regulations/certifications may apply. In some cases, as applicable, regulatory and certification compliance were obtained at the component level. BIG DATA APPLIANCE X5-2 EXPANSIONS In-Rack Expansion: Field upgrade leveraging up to 12 HC Nodes plus InfiniBand Infrastructure per Starter Rack. Additional hardware included with each HC Node plus InfiniBand Infrastructure: 1 x Node with direct attached storage as shown earlier InfiniBand and Ethernet cables to connect the components Expansion supports multiple generations of hardware. Multi-Rack Up to 18 racks can be connected without requiring additional InfiniBand switches. InfiniBand cables to connect 3 racks are included in the rack Spares Kits. Additional optical InfiniBand cables required when connecting 4 or more racks. Memory Expand the memory in any individual node or any number of nodes from 128GB per node to 768GB per node BIG DATA APPLIANCE X5-2 SUPPORT SERVICES Hardware Warranty: 1 year with a 4 hour web/phone response during normal business hours (Mon-Fri 8AM-5PM), with 2 business day on-site response/parts Exchange Oracle Premier Support for Systems: Oracle Linux and integrated software support and 24x7 with 2 hour on-site hardware service response (subject to proximity to service center) Oracle Premier Support for Operating Systems Oracle Customer Data and Device Retention 8 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

System Installation Services Software Configuration Services System Expansion Support Services including hardware installation and software configuration Quarterly on-site patch deployment service Oracle Automatic Service Request (ASR) CONTACT US For more information about Big Data Appliance X5-2 and Oracle Big Data SQL, visit oracle.com or call +1.800.ORACLE1 to speak to an Oracle representative. CONNECT WITH US blogs.oracle.com/oracle facebook.com/oracle twitter.com/oracle oracle.com Copyright 2015, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. Cloudera, Cloudera CDH, and Cloudera Manager, Cloudera BDR and Cloudera Navigator are registered and unregistered trademarks of Cloudera, Inc. 0715 9 ORACLE BIG DATA APPLIANCE X5-2 DATA SHEET

O R A C L E D A T A S H E E T Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop with operations in Oracle Database. Optimized for Hadoop and Oracle Database, they allow you to use massive volumes of unstructured and structured data to discover critical business insights. Oracle Big Data Connectors enable extremely fast data load from Hadoop to Oracle Database, provide access to HDFS data from Oracle Database, and support R and XML processing in Hadoop with familiar tools. They are easy to use with existing skill sets, greatly simplifying development of Big Data solutions. Oracle Big Data Connectors delivers a rich set of features, high speed connectivity, fast performance, and security for Big Data applications. Oracle Big Data Appliance was especially attractive because of its extensibility, and by using Oracle Big Data Connectors, we could move data seamlessly between Oracle Big Data Appliance and Oracle Database Appliance, which supports high throughput and low latency. CRAIG FRYAR HEAD OF BUSINESS INTELLIGENCE WARGAMING K E Y BUS INE S S B E N E F I T S Quickly deliver data discovery applications to business users. High performance load of large volumes from Hadoop to Oracle Database. Query data in Hadoop from the database with Oracle SQL. Enable data scientists to run R on Hadoop for fast analytics using their favorite R IDE. Oracle Big Data Connectors Enterprises are developing Big Data solutions to process large volumes of raw data to disrupt competitors, get insights into individual customers, and exploit new analyses. These solutions use Hadoop for large scale data processing, data transformation, and exploratory analysis of raw data, integrating the results with business critical data in Oracle Database for real-time queries, advanced analytics, and complex data management. Oracle Big Data Connectors connect Hadoop with Oracle Database, providing an essential infrastructure for Big Data solutions. They include fast load, Oracle SQL access to data in HDFS, R and XML analytics in Hadoop, and data integration, enabling information discovery, deep analytics, and fast integration of all data in the enterprise. The components of this software suite are: Oracle SQL Connector for Hadoop Distributed File System (HDFS) Oracle Loader for Hadoop Oracle Advanced Analytics for Hadoop Oracle XQuery for Hadoop Oracle Data Integrator 1 Oracle Big Data Connectors work with Oracle s engineered systems, connecting Oracle Big Data Appliance with Oracle Exadata, Oracle SuperCluster, and Oracle Database Appliance, and with certified Hadoop distributions and Oracle Databases on commodity hardware. Click here for the latest certifications or search for Oracle Big Data Connectors in your favorite search engine and click on the certifications tab. 1 Oracle Big Data Connectors includes a restricted use license for Oracle Data Integrator only on Oracle Big Data Appliance. Additional licensing is required on other Hadoop platforms.

O R A C L E D A T A S H E E T Allow XML application developers to use the familiar XQuery language to process XML data in Hadoop. Reduce the complexities of Hadoop through graphical tooling. Optimized for Oracle Big Data Appliance. Quick development and deployment of Big Data solutions with easy-to-use tools for Hadoop and Oracle Developers. Certified with multiple database and Hadoop versions. K E Y F E A T U R E S Tight integration with Oracle Database. Leverage Hadoop compute resources. Enable Oracle SQL to access and load Hadoop data. Fast load from Hadoop into Oracle Database with minimal use of database CPU. Partition pruning of Hive tables during load and query. Graphical user interfaces of Oracle Data Integrator drive data transformation workflows in Hadoop. Automatically transform R programs into Hadoop jobs. Process large volumes of XML files in parallel and load XQuery results into Oracle Database. Secure access of HDFS data with Kerberos authentication. Oracle SQL Connector for Hadoop Distributed File System (HDFS) Oracle SQL Connector for HDFS allows you to query of Hadoop resident data from the database using Oracle SQL. The data is accessed via external tables, which can be queried like any other table in the database. Data can also be loaded by selecting data from the external table and inserting it into a table in the database. The load speed from Oracle Big Data Appliance to Oracle Exadata is 15 TB/hour. Oracle SQL Connector for HDFS can query or load data in text files or Hive tables over text files. When querying from a Hive partitioned table, Oracle SQL Connector for HDFS can be restricted to access a subset of Hive partitions, minimizing the data accessed for faster performance. Oracle SQL Connector for HDFS can also query files in Oracle Data Pump format, such as data files created by Oracle Loader for Hadoop for offline loading and data files copied from the database to Hadoop for archival. FEATURES Oracle SQL access to data in Hadoop Partition-aware access of Hive partitioned tables Parallel query and load Security Flexible and easy to use Access Oracle Data Pump files in HDFS with Oracle SQL Query Hive tables and text files in HDFS directly from Oracle Database Load or query only relevant Hive partitions, enabling partition pruning High speed parallel query and load into Oracle Database Authenticated access with Kerberos Tool to create external tables Archive database data as data pump files in Hadoop and query from Oracle Database Oracle Loader for Hadoop Oracle Loader for Hadoop is a high performance load tool for fast movement of data from Hadoop to Oracle Database. Oracle Loader for Hadoop takes advantage of Hadoop compute resources to sort, partition, and convert data into Oracle types on Hadoop before loading into the database. Transforming the data into Oracle types reduces database CPU usage during the load, minimizing impact on database applications. It alleviates competition for resources, a common issue when ingesting large data volumes. This feature makes this connector particularly useful for continuous and frequent loads. Oracle Loader for Hadoop uses an innovative sampling technique to intelligently distribute data across Hadoop nodes while loading data in parallel. This minimizes the effects of data skew, a common concern in parallel applications. Oracle Loader for Hadoop loads into tables with any database compression: Basic 2 ORACLE BIG DATA CONNECTORS

O R A C L E D A T A S H E E T Table compression, Advanced Row compression, and Hybrid Columnar Compression. Oracle Loader for Hadoop can load a wide variety of input formats. Natively it can load data from text files, Hive tables, log files (parse and load), Oracle NoSQL Database, and more. Through Hive it can also load from input formats (e.g.: Parquet, JSON files) and input sources (e.g.: HBase) accessible to Hive. In addition, Oracle Loader for Hadoop can read proprietary data formats through custom input format implementations provided by the user. O R A C L E B I G D A T A C O N N E C T O R S Oracle Big Data Connectors provides high performance integration between Hadoop and Oracle Database. R E L A T E D P R O D U C T S The following are related products available from Oracle Oracle Big Data Appliance Oracle Exadata Oracle NoSQL Database Oracle Big Data Discovery Oracle Exalytics Oracle Business Intelligence Oracle Data Integrator R E L A T E D S E R V I C E S The following services support Oracle Main Product: Features Offload data type conversion to Hadoop Parallel load Load balancing Security Input formats Partition-aware load Minimize impact on database CPU High speed parallel load leveraging Hadoop nodes Automatic load balancing to address data skew Authenticated access with Kerberos Load from a variety of input formats: text files, Hive, Parquet, JSON, sequence files, compressed files, log files, and more Load only partitions of interest from Hive partitioned tables Update Subscription Services Product Support Services Professional Services Oracle R Advanced Analytics for Hadoop Oracle R Advanced Analytics for Hadoop runs R code in Hadoop for scalable analytics. The connector hides complexities of Hadoop-based computing from the R user, executing R code in parallel in Hadoop from stand-alone desktop applications developed in any IDE the R user chooses. It additionally integrates with Oracle Advanced Analytics in Oracle Database, to execute R and in-database Data Mining computations directly in the database. Oracle R Advanced Analytics for Hadoop enables faster insights with the rich collection of scalable, high performance, parallel implementations of common statistical and predictive techniques in Hadoop, without requiring data movement to any other platform. A complete list of supported techniques is in the table below. Oracle R Advanced Analytics for Hadoop supports rapid development with R-style debugging capabilities of parallel R code on user desktops, simulating parallelism under the covers. The connector enables analysts to combine data from several environments client desktop, HDFS, Hive, Oracle Database and in-memory R data structures all in the context of a single analytic task execution, greatly simplifying data assembly and preparation. Oracle R Advanced Analytics for Hadoop also provides a general computation framework for execution of R code in parallel. 3 ORACLE BIG DATA CONNECTORS

O R A C L E D A T A S H E E T Features Scalable, distributed analytics for Big Data Ease-of-use and rapid deployment without requiring new skill sets Native distributed R analytics in Hadoop for transparent execution of R code in parallel Developer productivity: R code developed and debugged in a familiar R environment on a user s desktop without the need for parallel computing skills Simplified interfaces allow R users to leverage Hadoop s parallelism Support for hybrid data assembly and scalable data preparation Native distributed R analytics Statistics and Advanced Matrix Computation Covariance and Correlation matrix computation Reservoir Sampling Principal Component Analysis Matrix completion using low rank matrix factorization Non negative matrix factorization Regression Models Linear regression Single layer feed forward Neural Networks Generalized linear models Classification models Logistic regression based on generalized linear models Segmentation using k-means clustering Oracle XQuery for Hadoop Oracle XQuery for Hadoop enables the use of XQuery to process and transform text, XML, JSON and Avro content stored in Hadoop. It takes advantage of the parallelism of Hadoop to evaluate W3C XQuery expressions, using all nodes in the cluster. 4 ORACLE BIG DATA CONNECTORS

O R A C L E D A T A S H E E T Oracle XQuery for Hadoop is based on a Java implementation of the proven Oracle Database XQuery engine that has been optimized for Hadoop. The XQuery engine transparently evaluates XQuery expressions in parallel in Hadoop, distributing an XQuery expression to all nodes in the cluster. This takes XQuery processing to the data, rather than bringing the data to the XQuery processor. This method of query evaluation delivers much higher throughput than is available with other XQuery solutions. Typical use cases for Oracle XQuery for Hadoop include web log analysis and transformation operations on text, XML, JSON and Avro content. After processing data can be loaded into the database or indexed with Cloudera Search on the Hadoop platform. Features Scalable, native, XQuery processing Input data stores Integration with Hadoop technologies XQuery engines are automatically distributed across the Hadoop cluster, so XQueries execute where the data is located Process data stored in HDFS, Hive, or Oracle NoSQL Database Execute Oracle XQuery for Hadoop jobs from Apache Oozie workflows Cloudera Search XML/XQuery extensions for Hive Parallel XML parsing Fast load of XQuery results into Oracle Database Process very large XML documents extremely fast Fast load using Oracle Loader for Hadoop Oracle Data Integrator Enterprise Edition Oracle Data Integrator (ODI) has a comprehensive set of knowledge modules for native Hadoop integration within ODI. The modules allow complex data transformations and data movement operations to be specified and executed through a familiar graphical interface, greatly simplifying running jobs in Hadoop. Data movement from one source to another can be graphically defined (e.g.: from HBase to Oracle Database, HBase to Hive, Hadoop to third party databases and from third party databases to Hadoop, etc.). Knowledge modules for Oracle Loader for Hadoop and Oracle SQL Connector for HDFS provide high speed data movement from Hive and HDFS to Oracle Database. With graphical tooling and declarative specifications Oracle Data Integrator knowledge modules eliminate the need to write complex code typically required for Hadoop applications. Oracle Big Data Connectors includes a restricted use license for Oracle Data Integrator 5 ORACLE BIG DATA CONNECTORS

O R A C L E D A T A S H E E T when licensed on Oracle Big Data Appliance. Additional licensing is required on other Hadoop platforms. Features Optimized for developer productivity Familiar ODI graphical user interface End-to-end coordination of Hadoop jobs Hadoop jobs created and orchestrated by ODI Native integration with Hadoop Ability to represent Hive metadata within ODI Transformations and filtering occur directly in Hadoop Optimized for performance Optimized Hadoop ODI Knowledge Modules High performance load to Oracle Database using ODI with Oracle Loader for Hadoop and Oracle SQL Connector for HDFS. C O N T A C T U S For more information Oracle Big Data Connectors, visit oracle.com or call +1.800.ORACLE1 to speak to an Oracle representative. C O N N E C T W I T H U S blogs.oracle.com/oracle facebook.com/oracle twitter.com/oracle oracle.com 6 ORACLE BIG DATA CONNECTORS Copyright 2015, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. 0115