In-memory database systems, NVDIMMs and data durability




Steve Graves - July 23, 2014

Database management system (DBMS) software is increasingly common in electronics, spurred by growing data management demands within technology ranging from communications equipment to avionics gear and industrial controllers, and facilitated by these devices' increasing on-board CPU, RAM and storage resources. The size of on-device databases varies, ranging from a few gigabytes of data to support a telecom billing/credit system's rating and balance management application, to 10+ GB for an IP router's control plane database, and more than 100 GB for a telecom call routing database. And DBMSs, once associated almost entirely with business, desktop and Web-based applications, have evolved to meet the needs of today's electronics.

Designers often turn to in-memory database systems (IMDSs), which store records in main memory, eliminating sources of latency such as caching and file management that are hardwired into DBMSs that store data persistently on hard disk or flash (these sources of latency are shown in Figure 1, below). As a result, IMDSs perform orders of magnitude faster than traditional on-disk DBMSs; their simpler design minimizes demand for CPU cycles, permitting the use of less powerful and less costly processors.

Application example: IMDS and Industrial Controller

In an industrial control system, integration of an IMDS within a controller supports a flattened control system architecture in which data is stored and processed, and some control decisions occur, at the level of individual controllers; in the opposing (and traditional) hierarchical system architecture, data stored at the controller level is typically limited to control variables.

Figure 1. Sources of latency in a traditional (on-disk) database system.

Volatility, however, is sometimes a concern. In the event of power loss or system failure, main memory's contents are gone. Some applications can tolerate this risk. For example, a RAM-based electronic programming guide stored in a set-top box will be lost if power fails, but can be re-built quickly with information from the cable head-end or satellite transponder.

However, other electronics require a higher level of database durability and recoverability. For example, some medical devices require a record of vital signs over time, to support clinical decisions; this data can't just vanish in the event of power failure. Network routers and switches store configuration data persistently, usually in flash. Keeping this configuration data in memory would make sense, to facilitate faster rebooting, but the data would need to be recoverable. Also challenged by DRAM's volatility are scanners that read fingerprints or faces and match these with biometric data in an on-device IMDS, in order to grant or deny access to secured facilities. If the access control system goes down, it must recover quickly.

Solutions to IMDS Volatility

Solutions have emerged to address this volatility. Non-volatile memory in the form of battery-backed RAM enables data held on a DRAM chip to survive a system power loss, but has not caught on widely, due to restrictive temperature requirements, risk of leakage, finite battery shelf life and other drawbacks.

The IMDS software itself can provide mechanisms for data durability. For example, with a transaction logging feature, the database system creates a record of transactions (groups of changes to the database that must complete or fail as units) in a log file, which can be used to restore the database after failure. But logging itself requires writes to persistent storage, and therefore carries a performance penalty.
Another IMDS feature to mitigate volatility is database replication, in which one or more standby in-memory databases on independent nodes are kept synchronized with the master or main database. If the master node goes down, one of these replicas takes over its role. Synchronization can take place quickly, although some latency is imposed by the processing that manages synchronization (and failover, if it occurs) and by communication between the nodes. The performance cost grows as the number of replicas or the physical distance between nodes increases.

Different replication strategies can be used to manage latency. Synchronous or 2-safe replication requires a database transaction to complete on replica nodes concurrently with completion on the master, while asynchronous or 1-safe replication allows transactions to commit on the main database before they're finalized on replicas. The asynchronous approach offers shorter resource holding time and hence faster performance, but with weaker consistency and durability.

NVDIMMs: Non-Volatile RAM, Minus the Battery

The emergence of non-volatile dual in-line memory modules, or NVDIMMs, adds a new tool for in-memory database durability. NVDIMMs take the form of standard memory sticks that plug into existing DIMM sockets, simplifying integration into off-the-shelf platforms. Typically they combine standard DRAM with NAND flash and an ultracapacitor power source. In normal operation, this technology provides the capabilities of high-speed DRAM. In the event of power loss, the ultracapacitor provides a burst of electricity that is used to write main memory contents to the NAND flash chip, where they can be held virtually indefinitely. Upon recovery, the NVDIMM restores data from NAND flash to DRAM. For in-memory databases, the NVDIMM's promise is similar to that of battery-backed RAM, but without the battery and its shortcomings.

McObject had previously added hooks enabling its eXtremeDB IMDS to work with battery-backed RAM, and was eager to try the IMDS using NVDIMMs as main memory storage. Several vendors now offer NVDIMMs. We tested eXtremeDB using the product from AgigA Tech because of our familiarity with its parent company, Cypress Semiconductor, and we limited our testing to their NVDIMMs (not testing, for example, the NVDIMMs from Viking Technology and Smart Modular Technologies), due largely to our limited time and resources.
Therefore the tests described in this article amount not to a product shootout so much as a proof-of-concept that an IMDS can operate with an NVDIMM as storage, achieve performance comparable to using conventional DRAM, and leverage the NVDIMM's recovery capability to restore an in-memory database that has been lost due to system failure.

Performance advantage

The tests addressed another question that often comes up when considering use of an IMDS in an application that requires both low latency and data recoverability, namely: to what extent will an IMDS with transaction logging retain its performance advantage over a disk-based DBMS? For these latter tests involving persistent storage (of the IMDS's transaction log, and the entire database in the case of the on-disk DBMS), the storage device consisted of a RAM-disk configured using the AGIGARAM NVDIMM. The reasons for using a RAM-disk instead of a conventional hard disk drive or solid state drive are described below.

The AgigA Tech NVDIMMs used in the tests are designed for use with Intel's Romley and Grantley platforms (taking in Sandy Bridge, Ivy Bridge, Haswell and Broadwell processor architectures). McObject used the 4GB AGIGARAM DDR3-1600 NVDIMM in an Intel Oak Creek Canyon reference motherboard with an Intel Pentium Dual Core CPU 1407 @ 2.8 GHz processor and 8 GB of Kingston conventional DDR3-1333 DRAM, running Debian Linux 2.6.32.5.

The test application performed five database operations, with each loop constituting a database transaction and containing at least two instances of the operation (see Figure 2). The benchmark application recorded the number of loops accomplished per millisecond for each of the two database types (on-disk DBMS and IMDS with transaction logging, or IMDS+TL) and both types of memory (NVDIMM and conventional DRAM). The test application used eXtremeDB's native C/C++ application programming interface (API).

Figure 2. Test application operations

Test application code enabling database recovery leveraged an eXtremeDB capability originally added to enable its use with battery-backed RAM as storage. This feature enables a process to reconnect to an NVRAM-hosted eXtremeDB database after a system reboots, initiate any needed cleanup, and resume normal operation. An application's recovery algorithm assumes that the memory block of the database memory device assigned as MCO_MEMORY_ASSIGN_DATABASE can be re-used after an application crash or power failure by re-opening it with the additional flag MCO_DB_OPEN_EXISTING.

Benchmark Results

Recovery from failure was tested by rebooting the test system mid-execution. When the system came back up, the test application re-started automatically, accessed the eXtremeDB database in its pre-failure state (upon recovery, the NVDIMM had loaded it from its flash into its DRAM), checked for database consistency and resumed operation, accessing the database from the same NVDIMM memory space that was used prior to the system restart.

In the tests comparing the speed of a pure IMDS (no transaction logging) with NVDIMM as main memory storage to the same database configuration using conventional DRAM, any gap between the two storage types was negligible. The difference in performance on all the database operations tested (inserts, updates, deletes, index searches and table traversals) was within the margin of error for the measurement technique used. One might attribute this equivalence to the entire database being loaded into CPU cache, with data access occurring from there rather than from DRAM or NVDIMM. However, at approximately 12MB, the test database size greatly exceeded the 5MB CPU cache size, and the test application relied on random keys to look up random pages from the database.

Effect of transaction logging

The remaining tests focused on the effect of transaction logging on IMDS performance.
IMDS vendors offer transaction logging to mitigate the volatility of pure in-memory data storage. However, transaction logging requires persistent storage (for the log), which could impact IMDS performance. For this reason, IMDS vendors are often asked whether their products, when deployed with transaction logging, still outperform on-disk DBMSs. The test sought to answer this question.

The hard disk used for persistent storage was actually a RAM disk (a memory-based analog of disk storage) using the NVDIMM as memory. This was done partly to further test AgigA Tech's product (i.e., to confirm that it would work to create a RAM-disk and have a database system interact with it), but also to shed light on the reason why an IMDS with transaction logging outperforms an on-disk DBMS. In-memory database systems differ from on-disk DBMSs in important ways beyond the storage devices they utilize (hard disk or solid state drive for on-disk DBMSs vs. DRAM for IMDSs). An IMDS eliminates cache management, file I/O and other sources of overhead inherent in traditional DBMS architecture. Eliminating the hard disk (replacing it with a RAM disk) eliminates overhead stemming from physical operation of the storage device, in order to highlight the latency effect of the IMDS's streamlined design vs. the on-disk DBMS's more complex processing.

The test showed that for insert, update and delete operations, the IMDS with transaction logging significantly outperformed the traditional on-disk DBMS (again, with both of these using a RAM-disk for their persistent storage). Figure 3 shows results in loops/ms for each of the configurations, as well as the performance multiple exhibited by the IMDS+TL over the on-disk DBMS. For example, in the test of database deletes, the IMDS+TL was 12.77 times faster than the on-disk DBMS. Figure 3 also shows the performance impact of turning off transaction logging and having eXtremeDB perform the operations as a pure IMDS using the NVDIMM as main memory storage.

Figure 3. Results

Database index searches and table traversals showed little to no performance change when moving from on-disk DBMS to IMDS+TL. This result was expected, because such database reads do not change database contents, and are typically much less costly, in performance terms, than insert, update and delete operations.

Discussion

NVDIMMs match the speed of conventional DRAM when used as IMDS storage, while delivering full in-memory database durability. Why, then, would anyone consider using an IMDS with latency-inducing transaction logging? There are several reasons, including cost, since, GB for GB, NVDIMMs cost more than DRAM; a desire to use platforms other than Intel's Romley and Grantley; and required database size (AgigA Tech's NVDIMMs support up to a 128GB total memory size).
As shown in the numbers presented above, adding transaction logging to achieve data durability slows IMDS performance, but the IMDS+TL combination still outperforms a traditional on-disk DBMS for insert, update and delete operations.

Another question for prospective users is whether their chosen in-memory database system supports the use of NVDIMMs as main memory storage. As mentioned above, McObject's eXtremeDB IMDS includes features, added early in the product's development to support its interaction with battery-backed RAM, that enabled database recovery to occur seamlessly with NVDIMMs. Using an IMDS without such features may entail more complexity, with significant development and testing required before reaching a workable solution.

It should also be noted that the database durability discussed in this article (that is, the assurance that the database and all committed transactions can be recovered in the event of system failure) differs from high availability, or the ability to operate without downtime. While both techniques aim to enable databases to withstand failure, high availability is usually achieved via replication, as described above, with failover time measured in milliseconds. In contrast, durability achieved in IMDSs through transaction logging or the use of NVDIMMs as main memory storage carries no such guarantee of eliminating downtime. Database recovery using either NVDIMMs or transaction logging is usually automated, but the most likely usage scenario for either is following an unexpected system shutdown, which implies a cold re-start (e.g., re-booting), a minutes-long process. Developers should understand the distinction between database high availability and durability when considering techniques to tame volatility.