Distributed Architecture of Oracle Database In-memory



Distributed Architecture of Oracle Database In-memory (slide 1/25)
Niloy Mukherjee, Shasank Chavan, Maria Colgan, Dinesh Das, Mike Gleeson, Sanket Hase, Allison Holloway, Hui Jin, Jesse Kamp, Kartik Kulkarni, Tirthankar Lahiri, Juan Loaiza, Neil Macnaughton, Vineet Marwah, Atrayee Mullick, Andy Witkowski, Jiaqi Yan, Mohamed Zait
Håkon Åmdal, Suresh Kumar Mukhiya

Motivation (slide 2/25)
- Ad-hoc real-time analysis (OLAP)
  - High performance
  - Large amounts of data
- ... while still keeping the system suitable for transactional workloads (OLTP)
- All without explicit optimizer plan changes or query rewrites

Oracle Database In-Memory (slide 3/25)
- Dual format
  - Row storage for online transactional processing (OLTP)
  - Column storage for online analytical processing (OLAP)
- Database objects are optimized for memory, while still being persisted on disk
- Memory is cheap: it moves from being a cache for disk access to being primary storage

Need for a distributed architecture (slide 4/25)
Scaling out
- Allows for elastic expansion
- Avoids a single point of failure
- Hard to program?
Scaling up
- The majority of analytics jobs do not process huge datasets at the same time
- Cheap hardware on a single server can process 90% of Facebook's jobs
- The main memory bus might become a bottleneck
- Simpler implementation?
- Single point of failure
- Long recovery period
We are motivated by these observations to design an extremely scalable, highly available, fault-tolerant distributed architecture within the Oracle Database In-Memory Option.

Distributed Oracle DBIM (slide 5/25)
- Oracle Real Application Clusters (RAC) allows for scaling out across multiple machines
- Shared-nothing architecture
- Persisted in row-based blocks

In-Memory Compression Units (IMCUs) (slide 6/25)
- Compressed with user-defined compression levels
  - Optimized for OLTP
  - Optimized for OLAP
  - Optimized for storage
- High performance
  - Single Instruction, Multiple Data (SIMD) instructions
  - In-memory storage indexes
  - Bloom-filter-based joins
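
As a rough illustration of what an in-memory storage index buys, the sketch below keeps a min/max summary per column unit and skips units whose range cannot contain the predicate value. The class and function names (`ColumnUnit`, `scan_equal`) are hypothetical and are not Oracle's API; the actual IMCU layout is not described in these slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ColumnUnit:
    """Values of one column in a hypothetical IMCU, plus a min/max summary."""
    values: List[int]  # assumed non-empty

    def __post_init__(self) -> None:
        # The "storage index": computed once when the unit is populated.
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan_equal(units: List[ColumnUnit], target: int) -> Tuple[int, int]:
    """Count rows equal to target; return (matches, units_actually_scanned)."""
    matches = 0
    scanned = 0
    for unit in units:
        # Prune the whole unit if target lies outside its [min, max] range.
        if target < unit.min_value or target > unit.max_value:
            continue
        scanned += 1
        matches += sum(1 for v in unit.values if v == target)
    return matches, scanned
```

For example, with units holding `[1, 2, 3]`, `[10, 11, 12]` and `[3, 4, 5]`, a scan for the value 3 touches only the first and third unit.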

Shared Database Buffer Cache (slide 7/25)
- The Global Cache Service (GCS) tracks and maintains the locations and access modes of all data blocks in the global cache
- Handles all OLTP operations
- Guarantees strict ACID and robustness properties
- Cache Fusion protocol

In-memory column store (slide 8/25)
- Shared-nothing container of in-memory segments on each instance
- Distributed together with the underlying row blocks
- Falls back to disk storage if the data is not present in memory

In-memory transaction manager (slide 9/25)
- Each server is responsible for transactional consistency for incoming DML statements
- When looking up the transaction log causes too much overhead, the IMCUs are rebuilt
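
The rebuild rule can be pictured with a small sketch. It assumes, purely for illustration, that each IMCU keeps a read-only snapshot plus a per-row change log, and that a rebuild is triggered once the log exceeds a fixed fraction of the snapshot. The `Imcu` class, its `rebuild_threshold` parameter and the 20% default are hypothetical; the slides do not state how the overhead is actually measured.

```python
from typing import Any, Dict


class Imcu:
    """Hypothetical IMCU: a read-only snapshot plus a log of changed rows."""

    def __init__(self, rows: Dict[int, Any], rebuild_threshold: float = 0.2):
        self.snapshot = dict(rows)       # columnar data as of the last rebuild
        self.log: Dict[int, Any] = {}    # row id -> newest value written by DML
        self.rebuild_threshold = rebuild_threshold  # assumed tuning knob
        self.rebuilds = 0

    def apply_dml(self, row_id: int, value: Any) -> None:
        """Record a change; rebuild when the log grows past the threshold."""
        self.log[row_id] = value
        if len(self.log) / max(len(self.snapshot), 1) > self.rebuild_threshold:
            self.rebuild()

    def rebuild(self) -> None:
        """Fold logged changes into a fresh snapshot and clear the log."""
        self.snapshot.update(self.log)
        self.log.clear()
        self.rebuilds += 1

    def read(self, row_id: int) -> Any:
        """Reads consult the log first, which is the per-query overhead."""
        if row_id in self.log:
            return self.log[row_id]
        return self.snapshot[row_id]
```

The trade-off the sketch tries to show: every read pays for a log lookup, so once many rows are logged it becomes cheaper to rebuild the unit than to keep patching reads.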

Distribution manager (slide 10/25)
a. Extremely scalable, application-transparent distribution of IMCUs across a RAC cluster, allowing for efficient utilization of collective memory across in-memory column stores
b. High availability of IMCUs across the cluster, guaranteeing in-memory fault tolerance for queries
c. Application-transparent distribution of IMCUs across NUMA nodes within a single server to improve vertical scale-up performance
d. Efficient recovery against instance failures by guaranteeing minimal rebalancing of IMCUs on cluster topology changes
e. Seamless interaction with Oracle's SQL execution engine, ensuring affinitized high-performance parallel scan execution at local memory bandwidths, without explicit optimizer plan changes

Distribution scheme (slide 11/25)
- Partition
- Subpartition
- Block range
- Auto

Distribution mechanism (slide 12/25)
- Global phase for distribution consensus
- Decentralized population phase
- Each instance arrives at the same object location using a hashing algorithm
- IMCUs are spread equally across NUMA nodes
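
One well-known scheme that lets every instance compute the same placement without exchanging messages is rendezvous (highest-random-weight) hashing. The slides only say "a hashing algorithm", so using rendezvous hashing here is an assumption, and the names `weight` and `home_instance` are hypothetical.

```python
import hashlib
from typing import Iterable


def weight(instance: str, imcu_id: str) -> int:
    """Deterministic pseudo-random weight for an (instance, IMCU) pair."""
    digest = hashlib.sha256(f"{instance}|{imcu_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def home_instance(imcu_id: str, instances: Iterable[str]) -> str:
    """Pick the instance with the highest weight for this IMCU.

    Every instance that knows the same membership list computes the
    same answer locally, so no central coordinator is needed.
    """
    return max(instances, key=lambda inst: weight(inst, imcu_id))
```

Because the result depends only on the IMCU identifier and the set of live instances, the order in which an instance lists its peers does not matter.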

Redistribution (slide 13/25)
- If a server goes down, a new distribution is calculated
- Only the objects that have been redistributed are moved; the others stay where they are
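
The minimal-movement property can be checked with a short sketch. It assumes rendezvous (highest-random-weight) hashing as the placement rule, which the slides do not confirm, and all names (`weight`, `place`, `moved_after_failure`) are hypothetical.

```python
import hashlib
from typing import Dict, List


def weight(instance: str, imcu_id: str) -> int:
    """Deterministic pseudo-random weight for an (instance, IMCU) pair."""
    digest = hashlib.sha256(f"{instance}|{imcu_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(imcu_ids: List[str], instances: List[str]) -> Dict[str, str]:
    """Map every IMCU to the instance with the highest weight."""
    return {i: max(instances, key=lambda inst: weight(inst, i)) for i in imcu_ids}


def moved_after_failure(imcu_ids: List[str], instances: List[str],
                        failed: str) -> List[str]:
    """Return the IMCUs whose home changes when one instance fails."""
    before = place(imcu_ids, instances)
    survivors = [inst for inst in instances if inst != failed]
    after = place(imcu_ids, survivors)
    return [i for i in imcu_ids if before[i] != after[i]]
```

Under this rule, the IMCUs that move are exactly those that lived on the failed instance: for every other IMCU the highest-weight instance is still alive, so its placement is unchanged.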

Availability (slide 14/25)
- None
- 1-safe
- (N-1)-safe
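
Reading "k-safe" as "tolerates k instance failures", the three levels correspond to keeping 1, 2, or N copies of each IMCU in a cluster of N instances. The sketch below picks the copy holders as the top-ranked instances under rendezvous hashing. Both that reading and the placement rule are assumptions, and `replica_instances` is a hypothetical name.

```python
import hashlib
from typing import List


def weight(instance: str, imcu_id: str) -> int:
    """Deterministic pseudo-random weight for an (instance, IMCU) pair."""
    digest = hashlib.sha256(f"{instance}|{imcu_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def replica_instances(imcu_id: str, instances: List[str], copies: int) -> List[str]:
    """Return the instances holding a copy, highest weight first.

    copies=1              -> no redundancy ("None")
    copies=2              -> survives one failure ("1-safe")
    copies=len(instances) -> survives N-1 failures ("(N-1)-safe")
    """
    ranked = sorted(instances, key=lambda inst: weight(inst, imcu_id), reverse=True)
    return ranked[:copies]
```

With two copies, losing the first-ranked instance leaves the second-ranked copy available for queries without any repopulation from disk.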

Distributed SQL Execution (slide 15/25)

Uniqueness of Architecture (slide 16/25)
- The paper compares the architecture with SAP HANA and IBM DB2 with BLU with respect to:
  - Distribution
  - Scalability
  - Availability
  - Recovery
- The architecture provides a complete scale-out solution with collective memory utilization, redundancy, availability, and efficient failure handling by redistribution

Evaluation (slide 17/25)
- Hardware setup
- Distribution experiments
- Distributed query execution
- In-memory distribution awareness
- In-memory fault tolerance
- NUMA-aware query execution
- Evaluation to validate quality attributes such as:
  - Performance
  - Scalability
  - Availability

Evaluation - Hardware Setup (slide 18/25)
- Conducted on the Oracle Exadata Database Machine, a state-of-the-art database SMP server and storage cluster system
- The NUMA experiment is conducted on an X4-8 single-node machine with eight 15-core Intel Xeon processors and 2 TB of DRAM
- The rest of the experiments are conducted on an X4-2 RAC configuration comprising up to 8 database server nodes, each with two 12-core Intel Xeon processors and 256 GB of DRAM, and 14 shared storage servers amounting to 200 TB of total storage capacity

Evaluation - Distribution Experiments (slide 19/25)
- Goal: verify whether IMCU throughput scales out with the number of database server instances in the RAC cluster
- Two experiments were performed:
  - Non-partitioned table distribution: a 13-column, 1-billion-row non-partitioned atomic table with a size of 64 GB
  - Composite-partitioned table distribution: the TPC-H lineitem schema is chosen for this experiment

Evaluation - Distributed Query Execution (slide 20/25)
- A set of 3 experiments is performed on the 13-column, 64 GB atomics table
- The table is auto-distributed based on block ranges, without redundancy
- Four query sets are selected for each of these experiments

Evaluation - Distributed Query Execution (slide 21/25)
- Q1, Q2, Q3: nonlinear scale-out where queries are CPU-bound
- Queries in set 4 exercise in-memory storages. Throughput of such queries is not expected to scale with the number of instances.

Evaluation - In-Memory Distribution Awareness (slide 22/25)
- Goal: observe and validate the impact of in-memory distribution awareness on cluster-wide analytic query performance
- Performance gains on the order of 20x to 40x over executions with distribution awareness disabled in the parallel query granule generation phase
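
The idea of distribution-aware granule generation can be sketched as grouping scan work by the instance that already holds each IMCU in memory, so that every parallel server scans local data only. The function name and the dictionary-based inputs are hypothetical; the slides do not show how Oracle's granule generator is implemented.

```python
from collections import defaultdict
from typing import Dict, List


def assign_granules(imcu_home: Dict[str, str]) -> Dict[str, List[str]]:
    """Group IMCU scan granules by the instance that holds them in memory.

    imcu_home maps an IMCU identifier to its home instance. The result
    maps each instance to the granules it should scan locally.
    """
    plan: Dict[str, List[str]] = defaultdict(list)
    for imcu_id in sorted(imcu_home):
        plan[imcu_home[imcu_id]].append(imcu_id)
    return dict(plan)
```

Without this awareness, a granule could be handed to an instance that does not hold the IMCU, which would then have to read the data from disk or the buffer cache instead of its local column store.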

Evaluation - In-Memory Fault Tolerance (slide 23/25)
- Goal: validate in-memory fault tolerance of distributed query execution under 1-safe redundancy
- A single instance failure has no visible effect on the elapsed times of the queries

Evaluation - NUMA-Aware Query Execution (slide 24/25)
- Goal: observe whether IMCU NUMA-affined query execution yields better throughput than the same in-memory execution without NUMA awareness
- 150-250% improvements in query elapsed times when execution is IMCU NUMA-aware

Conclusion (slide 25/25)
This paper summarizes:
- The motivation for developing an in-memory database optimized for an OLTAP environment
- The fault-tolerant distributed architecture of the Oracle Database In-Memory Option
- The uniqueness of the architecture among its peers
- An evaluation with sets of experiments showing how performance can be enhanced with the in-memory option