Berkeley DB and Solid Data



Similar documents
Speed and Persistence for Real-Time Transactions

EMC XtremSF: Delivering Next Generation Performance for Oracle Database

Promise of Low-Latency Stable Storage for Enterprise Solutions

How To Store Data On An Ocora Nosql Database On A Flash Memory Device On A Microsoft Flash Memory 2 (Iomemory)

Highly Available Mobile Services Infrastructure Using Oracle Berkeley DB

EMC XtremSF: Delivering Next Generation Storage Performance for SQL Server

A Comparison of Oracle Berkeley DB and Relational Database Management Systems. An Oracle Technical White Paper November 2006

Red Hat Enterprise Linux as a

Symantec NetBackup 7 Clients and Agents

WHITE PAPER: customize. Best Practice for NDMP Backup Veritas NetBackup. Paul Cummings. January Confidence in a connected world.

Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale

Microsoft Windows Server in a Flash

INCREASING EFFICIENCY WITH EASY AND COMPREHENSIVE STORAGE MANAGEMENT

IS IN-MEMORY COMPUTING MAKING THE MOVE TO PRIME TIME?

Data Management for Portable Media Players

An Oracle White Paper May Exadata Smart Flash Cache and the Oracle Exadata Database Machine

VERITAS NetBackup 6.0 Database and Application Protection

Best Practices for Deploying Citrix XenDesktop on NexentaStor Open Storage

Veritas NetBackup 6.0 Database and Application Protection

With the increasing deployment of computers

SOLUTION BRIEF. Advanced ODBC and JDBC Access to Salesforce Data.

Running Oracle s PeopleSoft Human Capital Management on Oracle SuperCluster T5-8 O R A C L E W H I T E P A P E R L A S T U P D A T E D J U N E

EMC Backup and Recovery for Microsoft SQL Server 2008 Enabled by EMC Celerra Unified Storage

MySQL Cluster New Features. Johan Andersson MySQL Cluster Consulting johan.andersson@sun.com

CASE STUDY: Oracle TimesTen In-Memory Database and Shared Disk HA Implementation at Instance level. -ORACLE TIMESTEN 11gR1

Boost Database Performance with the Cisco UCS Storage Accelerator

EMC Business Continuity for Microsoft SQL Server Enabled by SQL DB Mirroring Celerra Unified Storage Platforms Using iscsi

Top 10 Reasons why MySQL Experts Switch to SchoonerSQL - Solving the common problems users face with MySQL

Running Highly Available, High Performance Databases in a SAN-Free Environment

An Oracle White Paper January A Technical Overview of New Features for Automatic Storage Management in Oracle Database 12c

Application Brief: Using Titan for MS SQL

Data Sheet: Backup & Recovery Symantec Backup Exec 12.5 for Windows Servers The gold standard in Windows data protection

Microsoft SQL Server Acceleration with SanDisk

DataStax Enterprise, powered by Apache Cassandra (TM)

Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software

WHITE PAPER: DATA PROTECTION. Veritas NetBackup for Microsoft Exchange Server Solution Guide. Bill Roth January 2008

The Benefits of Continuous Data Protection (CDP) for IBM i and AIX Environments

ATA DRIVEN GLOBAL VISION CLOUD PLATFORM STRATEG N POWERFUL RELEVANT PERFORMANCE SOLUTION CLO IRTUAL BIG DATA SOLUTION ROI FLEXIBLE DATA DRIVEN V

The 8Gb Fibre Channel Adapter of Choice in Oracle Environments

Performance Beyond PCI Express: Moving Storage to The Memory Bus A Technical Whitepaper

Integrated Grid Solutions. and Greenplum

Best Practices for Optimizing SQL Server Database Performance with the LSI WarpDrive Acceleration Card

Removing Performance Bottlenecks in Databases with Red Hat Enterprise Linux and Violin Memory Flash Storage Arrays. Red Hat Performance Engineering

Guide to Scaling OpenLDAP

Server Consolidation with SQL Server 2008

Eloquence Training What s new in Eloquence B.08.00

How SafeVelocity Improves Network Transfer of Files

COMPARING STORAGE AREA NETWORKS AND NETWORK ATTACHED STORAGE

Optimized data protection through one console for physical and virtual systems, including VMware and Hyper-V virtual systems

Online Transaction Processing in SQL Server 2008

Intel RAID Controllers

Accelerating Microsoft Exchange Servers with I/O Caching

Mission-Critical Java. An Oracle White Paper Updated October 2008

BEST PRACTICES FOR PROTECTING MICROSOFT EXCHANGE DATA

WHITE PAPER Optimizing Virtual Platform Disk Performance

technology brief RAID Levels March 1997 Introduction Characteristics of RAID Levels

How to Manage Critical Data Stored in Microsoft Exchange Server By Hitachi Data Systems

Using HP StoreOnce D2D systems for Microsoft SQL Server backups

Smart Tips. Enabling WAN Load Balancing. Key Features. Network Diagram. Overview. Featured Products. WAN Failover. Enabling WAN Load Balancing Page 1

Microsoft SQL Server Always On Technologies

All-Flash Arrays: Not Just for the Top Tier Anymore

ioscale: The Holy Grail for Hyperscale

Postgres Plus Advanced Server

Network Registrar Data Backup and Recovery Strategies

IBM Enterprise Linux Server

EMC VFCACHE ACCELERATES ORACLE

Overview of I/O Performance and RAID in an RDBMS Environment. By: Edward Whalen Performance Tuning Corporation

VERITAS Business Solutions. for DB2

EMC Invista: The Easy to Use Storage Manager

Veritas NetBackup 6.0 Server Now from Symantec

BrightStor ARCserve Backup for Windows

VERITAS Backup Exec 10 for Windows Servers AGENTS & OPTIONS MEDIA SERVER OPTIONS KEY BENEFITS AGENT AND OPTION GROUPS

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server

QLogic 2500 Series FC HBAs Accelerate Application Performance

Chapter 14: Recovery System

Microsoft Windows Server Hyper-V in a Flash

Performance Characteristics of VMFS and RDM VMware ESX Server 3.0.1

Symantec NetBackup OpenStorage Solutions Guide for Disk

An Oracle White Paper September Advanced Java Diagnostics and Monitoring Without Performance Overhead

SAN Conceptual and Design Basics

Archiving File Data with Snap Enterprise Data Replicator (Snap EDR): Technical Overview

Solving I/O Bottlenecks to Enable Superior Cloud Efficiency

RAID technology and IBM TotalStorage NAS products

Software-defined Storage at the Speed of Flash

Virtual Tape Library Solutions by Hitachi Data Systems

I N T E R S Y S T E M S W H I T E P A P E R INTERSYSTEMS CACHÉ AS AN ALTERNATIVE TO IN-MEMORY DATABASES. David Kaaret InterSystems Corporation

Maximum Availability Architecture

Using Data Domain Storage with Symantec Enterprise Vault 8. White Paper. Michael McLaughlin Data Domain Technical Marketing

How to Choose your Red Hat Enterprise Linux Filesystem

Benchmarking Couchbase Server for Interactive Applications. By Alexey Diomin and Kirill Grigorchuk

CDH AND BUSINESS CONTINUITY:

High Availability and Disaster Recovery Solutions for Perforce

HyperQ Remote Office White Paper

Answering the Requirements of Flash-Based SSDs in the Virtualized Data Center

QLogic 16Gb Gen 5 Fibre Channel for Database and Business Analytics

Recovery and the ACID properties CMPUT 391: Implementing Durability Recovery Manager Atomicity Durability

An Oracle White Paper November Achieving New Levels of Datacenter Performance and Efficiency with Software-optimized Flash Storage

Nexenta Performance Scaling for Speed and Cost

Configuring Apache Derby for Performance and Durability Olav Sandstå

Transcription:

Berkeley DB and Solid Data October 2002 A white paper by Sleepycat and Solid Data Systems

Table of Contents Introduction to Berkeley DB 1 Performance Tuning in Database Applications 2 Introduction to Solid Data 3 Integration with Solid Data 4 Performance Results 5 Conclusions 6

Introduction to Berkeley DB Berkeley DB, the most widely deployed embedded database system in the world, is used in systems ranging from large-scale email and directory server systems to high performance network switch and routers that handle core data transmission on the Internet. Across this range of applications, speed, reliability and availability are critical. Developers choose Berkeley DB because it provides enterprise-grade transaction management and recovery services in a compact package suitable for embedded deployment. Berkeley DB provides superior data management for these systems because it is designed for use inside applications, rather than as a stand alone database management system. Berkeley DB links directly into the application's address space, rather than operating as a separate server. As a result, the application can store and fetch data directly, rather than sending a message to a remote server every time a record is required. This drastic reduction in the run-time cost of database access speeds up applications dramatically. Berkeley DB's easy-to-use programmatic interfaces make it simple for software developers to save and search for data in its native format, without translation to some relational database format. Because there is no query language parser, planner, optimizer, and executor that operate at runtime, and because no data translation is required, access is simpler and faster than for conventional relational database systems. Demanding applications require not just speed, but also guaranteed reliability so that data is never lost, even if the application or system fails. Berkeley DB uses the same techniques for data integrity and failure recovery as the most popular enterprise relational systems on the market, and is able to provide the same reliability guarantees. These services, including two-phase locking and write-ahead logging, allow Berkeley DB to survive system failures without losing data. Recovery processing at system restart time ensures that all committed database updates are restored and available, and any interrupted work is rolled back so that it does not appear in the database. Applications that require high availability, so that they continue to run even if one or more parts of the system fail altogether, can use Berkeley DB's replication service. Replication keeps multiple independent systems synchronized. In the event that one or more of the systems fails, the others are able to take over for them. The application can continue to run without interruption. Like most database systems, Berkeley DB uses magnetic disk for storage of both actual database records and the log that it uses to guarantee that data cannot be lost. During normal processing, records are read from and written to the disk, and every time a transaction is completed, Berkeley DB writes log information to disk. Berkeley DB uses the file system interfaces provided by modern operating systems to access the disk. 1

Performance Tuning in Database Applications Like most enterprise-grade database systems, Berkeley DB performs extremely well when properly tuned for the application it supports. Performance tuning for database systems requires an understanding of the data stored, common search and update workloads, and the characteristics of the hardware platform on which the system runs. Database systems generally store information on a magnetic disk or other persistent storage device. As the application uses data, records are copied from disk into the computer's main memory. This transfer is a common source of performance problems in real-world applications. While main memory accesses can be very fast, reading information from magnetic disk involves moving the actuator arm that carries the disk head, waiting for the records of interest to rotate underneath the read head, and then copying the data from through the disk controller and into the operating system's memory. The entire operation can take tens of milliseconds using common magnetic disk systems, and that overhead translates directly into slower application performance. In addition, enterprise-grade database engines like Berkeley DB use logging techniques to guarantee that no data is ever lost, even if the system loses power or suffers a hardware or system software failure. Whenever the application makes a change to a record in the database, Berkeley DB writes a log entry to a special location. That log entry allows the database engine to redo or back out the change later, in the event of a failure. Log processing is generally tuned to minimize its performance impact, but writing log entries to disk during normal processing can be another source of performance problems in demanding applications. One of the best ways to improve the performance of an application that uses the database is to reduce the cost of I/O operations. Reducing the number of disk accesses and making every disk access faster makes the database engine, and therefore the application, run much faster. 2

Introduction to Solid Data Solid Data manufactures a line of high-performance, highly-available solid state storage systems widely used in the telecom and financial services markets. The Solid Data product family provides applications with a familiar file system interface. Unlike conventional magnetic disk, however, Solid Data's solid state storage system uses normal memory to store file data. Every solid state storage system includes a batterybacked power supply and intelligent monitoring system that watches for power failure. If a Solid Data device loses power, the battery-backed power supply allows it to continue running without shutting down. If the external power failure lasts long enough, the internal battery may run down. In that case, the monitoring systems write all the information in magnetic memory to an emergency backup disk inside the chassis. On restart, the memory image can be reloaded from the disk. During normal processing, Solid Data's solid state storage products use only memory for data storage. No information needs to be written to disk. This provides I/O intensive applications, including those that use database systems, with a dramatic performance improvement - typically as much as 1,000%. The most common sources of latency - I/O costs associated with data storage on magnetic disk and log management - disappear. 3

Integration with Solid Data Solid Data and Sleepycat Software have integrated Berkeley DB with the Solid Data solid state storage products to provide an embedded data management solution to the most demanding applications. Integration of the two products was straightforward. Berkeley DB uses the operating system's file system interfaces to store and fetch data, and for log management. Solid Data provides standard file system interfaces to its solid state storage product family. The result is that an application can use Berkeley DB with a Solid Data solid state storage system by simply locating an application's Berkeley DB directory in the Solid Data file system. Sleepycat ran exhaustive tests against the combined system. The Solid Data solid state storage system performed flawlessly with Berkeley DB. 4

Performance Results In order to characterize the performance of Berkeley DB using the Solid Data solid state system for storage, Sleepycat Software ran two different benchmarks reflecting two different application workloads. Both benchmarks ran on a single-processor Sun Enterprise 450 using a SCSI interface to the storage device. The disk tests ran against a conventional magnetic disk. The solid state tests ran against a Solid Data model 800 system. In all the tests, both the Berkeley DB log region and the database files were located on the same storage device. The first benchmark measured the performance of write-intensive applications using Berkeley DB for record storage. The benchmark used multiple threads of control, each of which fetched a ten-byte record from the database, updated a part of the record, and wrote the changed record back to the database. Each read/modify/write operation ran as a single transaction. The benchmark did no other processing of the records. As a result, it reflects the performance that a write-intensive application should expect. Using fifty concurrent threads, each continuously updating single records in the database, the disk-based version of Berkeley DB was able to deliver 849 transactions per second. When the Solid Data system was used instead, performance jumped to 3,266 transactions per second - nearly 300% performance improvement. The twin benefits of an embedded database engine and a high-performance solid state storage system as a backing store delivered outstanding throughput using a relatively inexpensive computer system. The second benchmark mixed read and write operations in a single workload. Eighty percent of the database accesses were single-record reads. Twenty percent were singlerecord writes. This time, each record was 200 bytes long. Using magnetic disk and 100 concurrent threads of control, Berkeley DB delivered exactly 100 transactions per second under this workload. After switching to the Solid Data system, Berkeley DB delivered 1,139 transactions per second - over 1,000% performance improvement. 5

Conclusions Berkeley DB is a high-performance embedded database system capable of delivering very high transaction throughput. When used with a conventional magnetic disk storage device on inexpensive hardware, benchmark applications achieved between 100 and 850 transactions per second throughput, depending on workload. When using a Solid Data solid state storage system in place of magnetic disk, throughput jumped dramatically, to between 1,139 and 3,266 transactions per second. The combination of Berkeley DB's embedded architecture and the high-performance, low-latency Solid Data solid state storage system delivers unmatched transaction rates for applications with the most demanding performance requirements. 6

About Sleepycat Sleepycat Software, Inc. (http://www.sleepycat.com), a private company with offices in California and Massachusetts, was founded in 1996 to commercially develop, support, and distribute Berkeley DB. Berkeley DB is the embedded database engine behind many of the largest ISPs and wired and wireless networks, as well as e-commerce sites around the world. Our solutions are used in mission-critical applications by some of the world's largest computing enterprises, including Alcatel, Amazon.com, Ask Jeeves, CMG, Cisco Systems, Motorola, RSA Security, and Sun Microsystems. Berkeley DB is open source and runs on all major operating systems, including embedded Linux, Linux, MacOS X, QNX, UNIX, VxWorks, and Windows. In January of 2002, Sleepycat won the prestigious Crossroads A-List Award for Berkeley DB. For more information, see www.sleepycat.com. Corporate Headquarters: 118 Tower Road Lincoln, MA 01773, USA www.sleepycat.com Emily Salus Telephone +1-510-923-9312 Email emily@sleepycat.com About Solid Data Solid Data Systems is the leading global provider of solid-state storage systems that multiply the throughput of business-critical, transaction-intensive applications. Solid Data's products are based on solid-state memory technology and are designed to provide near-instant access to data storage. These compact, rack-mount systems combine the speed of main memory with the persistence (non-volatility) of external disk storage - enabling enterprises to multiply the transaction performance of general-purpose systems, reduce complexity, and scale their infrastructures cost effectively. Customers include global telecommunications equipment integrators and financial service providers. Visit Solid Data on the web at www.soliddata.com. Corporate Headquarters: 3542 Bassett Street Santa Clara, CA 95054 USA Tel: +1.800.287.0373 +1.408.845.5700 Fax: +1.408.727.5496 Gary Taggart Telephone +1-408-845-5743 E mail gtaggart@soliddata.com The Solid Data logo is a registered trademark in the United States and Japan. All other brands, or products are the trademarks or registered trademarks of their respective owners. Solid Data disclaims any proprietary interest in 7 Solid Data Systems 3542 Bassett Street Santa Clara, CA 95054 USA Tel: +1.408.845.5700 Fax: +1.408.727.5496 www.soliddata.com Printed in the USA 10/02 Copyright Solid Data Systems 2002 All Rights Reserved.