Availability Digest. www.availabilitydigest.com. Raima's High-Availability Embedded Database. December 2011




Embedded processing systems are everywhere. You probably cannot go a day without interacting with dozens of these powerful systems. Software applications embedded in microprocessor chips control airplanes and toasters. They provide the intelligence for our smart phones and GPS devices. Manufacturing robots and patient monitoring systems depend upon them. The average car today is run by about forty microprocessors executing forty million lines of code. Many of these embedded applications depend upon sophisticated databases. RDM Embedded from Raima, Inc. (www.raima.com) fulfills this need with a highly available embedded database.

Database Fundamentals

An effective embedded database must meet several criteria:

- It must have a small footprint. RDM Embedded can use as little as 400 KB of RAM.
- It must be fast. RDM Embedded does not depend upon disk accesses.
- It must be reliable. RDM Embedded provides transaction semantics and mirroring to protect data.
- It must be scalable. RDM Embedded's Database Union and Multi-Version Concurrency Control (MVCC) features let applications work seamlessly across partitioned databases.
- It must be predictable. RDM Embedded's core database uses the network model with fixed-length records that make access times and database memory sizes easily predictable.
- It must provide a low level of control. The RDM Embedded API includes more than 150 low-level control functions implemented in C.
- It must optionally provide high-level access. RDM Embedded databases are accessible via C++ object models, SQL statements, and ODBC, among others.
- It should allow geographical distribution of data. RDM Embedded's replication facility and remote login capabilities let RDM Embedded serve as the database in highly distributed peer or hierarchical applications.

Architecture

The basic RDM Embedded database functionality is implemented in its core database engine. This engine is suitable for many applications and has been put to use in hundreds of thousands of installations over the last twenty-five years.
It has served many industries including aerospace, automotive, business, finance, government, healthcare, manufacturing, and telecommunications.

Raima offers several extensions to its core database engine. These include:

- High Availability: RDM Embedded's goal is to achieve five 9s of availability.
- DataFlow: An RDM Embedded core database can be replicated to one or more external SQL databases.
- Distributed: An RDM Embedded database can be partitioned across multiple locations and viewed as a single database.
- Interop: External applications can access an RDM Embedded database via ODBC, JDBC, and ADO.NET, and via a browser over HTTP.

[Diagram: the core database engine surrounded by its HA, Interop, Distributed, and DataFlow extensions.]

The Core Database Engine

The core database engine comprises the database itself, a Transaction File Server to manage the database, and a Runtime Library that applications use to issue database commands.

Database Models

Network Model Database

To achieve maximum efficiency in the use of space and time, the underlying database architecture of RDM Embedded is a network model database. It is based on owner/member relationships. A file comprises an owner record and many member records. The members form a linked list, with each member record pointing to the previous and next member records as well as to the owner. The owner record points to the first and last member records in the list.

All records in a file are fixed length. They are organized into fixed-length pages with multiple records per page. It is the page that is the block written to and read from disk.

The core database supports five types of files:

- Database Dictionary Files, which contain the data-definition schema for each file.
- Data Files, which contain the file data.
- Key Files, which contain B-tree indices on fields within a file.
- Hash Files, which can provide quicker lookup of records than a B-tree index, provided that key ordering is not needed by the application.
- Vardata Files, which hold variable-length data referenced by a pointer in a data record.

Files may be fully resident in memory.
A file may also be resident on disk, with memory used as a cache to hold the most recently used records from the disk-resident file. To maintain persistence, an application can sync the memory-resident content of a file at any time by writing it to disk.

Thus, a core file is a linked list of fixed-length records. The network model is extremely efficient when records must be read in sequence.
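The owner/member set structure described above can be sketched in C. This is a minimal illustration of the idea only, not Raima's actual record layout; in RDM Embedded the links are database addresses within fixed-length pages rather than heap pointers, and the structure names here are invented for the sketch.

```c
#include <stddef.h>

/* A fixed-length member record. Each member links to its neighbors
 * and back to its owner, as in a network-model set. */
typedef struct member {
    char data[16];            /* fixed-length payload          */
    struct member *prev;      /* previous member in the set    */
    struct member *next;      /* next member in the set        */
    struct owner  *owner;     /* back-pointer to the set owner */
} member;

/* The owner record points to the first and last members of its set. */
typedef struct owner {
    char data[16];
    member *first;
    member *last;
} owner;

/* Connect a member at the tail of an owner's set. */
static void connect_member(owner *o, member *m) {
    m->owner = o;
    m->next  = NULL;
    m->prev  = o->last;
    if (o->last) o->last->next = m;
    else         o->first = m;
    o->last = m;
}

/* Sequential scan: the access pattern at which the network model excels.
 * No index is consulted; each record leads directly to the next. */
static int count_members(const owner *o) {
    int n = 0;
    for (const member *m = o->first; m != NULL; m = m->next) n++;
    return n;
}
```

Because every record carries direct links, a sequential read never searches: following `next` is one pointer (in RDM Embedded, one database address) per record, which is what makes access times predictable.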

Relational Model Database

Through the use of its Key Files, the RDM Embedded database is extended to a relational database that sits on top of the core network model database. Records may be selected in key order via the keys in the Key Files. This allows the database to be used as a SQL database. SQL DDL statements are translated into core DDL statements and stored in the core Database Dictionary Files. 1

Relational equi-joins, based on primary/foreign-key relationships, are actually implemented in the core database as network model sets, resulting in very fast join processing in SQL.

Runtime Library

The Runtime Library provides applications with the functionality needed to make effective use of the database. A copy of the Runtime Library is bound into each application that needs to access RDM Embedded. Runtime Library functions include:

- Database Control: add and delete tables; open and close files.
- Record Set Control: create and delete records; connect or disconnect records from sets.
- Data Manipulation: read or write individual fields or entire records.
- Locking: lock records for shared reading or exclusive writing.
- Transaction Control: begin, commit, and abort transactions.
- Navigation: key lookup, hash lookup, and scanning.

Transaction File Server (TFS)

The Transaction File Server (TFS) serves the role of a database controller, much as a disk controller manages disk activity. An RDM Embedded configuration can have multiple TFSs, and they all operate independently. Though a TFS can manage multiple disks, typically only one disk is assigned to each TFS to achieve maximum parallelism for disk activity.

Any number of applications may connect to a specific TFS through their Runtime Libraries. It is the responsibility of the TFS to execute commands invoked by applications via their calls to the Runtime Library functions. A file partition is always contained on one disk. Therefore, all commands executed against a given file (or partition thereof) always flow through the same TFS.
The TFS manages the cache and the locks for its files.

Distributed Processing

In addition to multiple TFSs on the same computer, there can be TFSs on multiple computers. Any application can connect to TFSs on any computer. Therefore, the database may be distributed over several computers that may, in fact, be geographically distributed. Applications connect to the TFSs via TCP/IP. A TFS is identified by a domain name and a port. If an application is connecting to a TFS on its own computer, the TCP/IP stack is optimized for local communication.

1 For a benchmark comparison of network versus relational databases, see the Raima white paper entitled Database Management in Real-time and Embedded Systems. http://www.raima.com/wp-content/uploads/database_management_in_real_time_and_embedded_systems.pdf

TFS Versions

There are three different versions of the TFS:

- TFSt is a full-featured TFS that links directly to an application. It is multithreaded and can execute several runtime commands simultaneously. It is optimized for a multicore environment.
- TFSr has the same functionality as TFSt, but applications connect to it via RPCs (remote procedure calls). This capability is used if the application and the TFS are on different computers.
- TFSs is a single-threaded TFS intended for batch processing. It is not thread-safe, and the database should be backed up before processing it with TFSs.

Transaction Control

When an application issues a begin-transaction directive, the Runtime Library initiates a transaction. As data-manipulation commands are issued, the TFS is notified and locks the records that are being changed. The Runtime Library keeps a copy of the changed records.

Upon receipt of a commit transaction, the Runtime Library bundles all of the pages that have been newly created or updated by the transaction and sends them as a transaction log to the TFS. The TFS updates its cache with the changes and releases the locks on the changed records. It writes the transaction log to a disk-resident log for persistence. At this time, the transaction is safe-stored and is considered committed. The commit directive issued by the application is completed with a status response.

[Diagram: Transaction Processing. Each application's Runtime Library sends its transaction log (tx log) to the TFS, which applies the logs to the database.]

Periodically, the TFS will update the disk-resident database with the transaction logs that have accumulated since the last synchronization point. The synchronization interval is configurable from ten milliseconds to ten seconds.

If a transaction affects files managed by multiple TFSs, the changes are sent as separate transaction logs to each involved TFS. However, there is no two-phase commit protocol between the TFSs.
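The commit protocol above can be modeled in a few lines of C. This is a deliberately simplified sketch, not Raima's implementation: a "page" is one integer, the "transaction log" is an in-memory array, and the TFS is represented by a plain struct. It shows the essential property of log-based commit: changes accumulate in a per-transaction log and touch the database only at commit, so an abort simply discards the log.

```c
#define NPAGES 8
#define LOGCAP 8

/* The TFS's view of the database: a set of fixed-size pages. */
typedef struct { int pages[NPAGES]; } database;

/* A transaction buffers changed pages in a log held by the runtime.
 * (No overflow check: the sketch assumes at most LOGCAP changes.) */
typedef struct {
    int page_no[LOGCAP];
    int value[LOGCAP];
    int count;
} txn;

static void tx_begin(txn *t) { t->count = 0; }

/* Record a page change in the transaction log; replay order means
 * the last write to a page wins. The database is not touched yet. */
static void tx_write(txn *t, int page_no, int value) {
    t->page_no[t->count] = page_no;
    t->value[t->count]   = value;
    t->count++;
}

/* Commit: ship the log to the TFS, which applies it to its cache.
 * In RDM Embedded the log is also safe-stored on disk at this point,
 * making the transaction durable before the status response. */
static void tx_commit(txn *t, database *db) {
    for (int i = 0; i < t->count; i++)
        db->pages[t->page_no[i]] = t->value[i];
    t->count = 0;
}

/* Abort: discard the log; the database is untouched. */
static void tx_abort(txn *t) { t->count = 0; }
```

Note that recovery falls out of the same design: because the log is safe-stored before the commit completes, replaying surviving logs after a crash reproduces exactly the committed state.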

SQL Access

A fully functional SQL implementation is much too massive for an embedded system. Fortunately, many SQL functions are not typically needed in an embedded application. Raima has implemented a subset of SQL that provides the functions required by most embedded applications. 2

Raima's abridged SQL does not include Grant/Revoke security statements, views, triggers, or dynamic DDL. It typically executes precompiled SQL statements but, as an option, Raima provides a SQL compiler. The SQL engine is linked with the Runtime Library and uses the underlying core database to execute SQL statements. A SQL application can interact with multiple TFSs on the same or different computers.

Operating System Support

RDM Embedded runs under many operating systems, including Linux (many flavors), Windows, HP-UX, Solaris, iOS, AIX, and Mac OS.

Extensions

As described earlier, Raima offers several extensions to RDM Embedded.

High Availability

RDM Embedded provides many features to achieve high availability of its database. Raima's goal is to achieve five 9s of availability (downtime of about five minutes per year) and 100% protection of the data in the database.

Transaction Management

RDM Embedded availability starts with its transaction-management capabilities. Once a transaction is committed, it is durable. Thus, the database remains intact following a system failure. Upon recovery, the safe-stored transaction logs are read to replay any transactions that had not yet been materialized at the time of the failure.

Hot Backup

RDM Embedded provides a hot-backup facility to take full or incremental snapshots of active files and to write them to an offline medium. The backup facility does not interfere with normal processing, which continues unabated during the backup. Database backups allow the database to be restored should the disks suffer damage.

Mirroring

To provide rapid recovery from a total system failure, the database can be mirrored. Mirroring provides a byte-by-byte copy of the database on a backup system.
The backup system can then be put into service rapidly should the primary system go down. Database mirroring is accomplished by copying the transaction logs from the primary system to the backup system. The backup system replays the transaction logs against its copy of the database to keep the two databases synchronized.

2 See Is Using SQL in an Embedded Computer Application Like Trying to Squeeze an Elephant into a Mini?, Raima White Paper. http://www.raima.com/wp-content/uploads/sql_in_an_embedded_application.pdf

Mirroring can be either synchronous or asynchronous. With synchronous mirroring, both systems synchronize their databases with the accumulated transaction logs simultaneously. Database synchronization is not considered complete until both systems have materialized all of the data changes in the transaction logs. In this case, the primary and backup databases are always exact copies of each other. With asynchronous mirroring, the backup system synchronizes its database independently of the primary system. The backup database will lag the primary database by some small increment of time.

Raima supports third-party HA Managers. HA Managers run on each system and coordinate the mirroring process. It is the responsibility of the backup HA Manager to determine when the backup system should take over. A Raima-supplied HA Manager is scheduled to be available in the future, along with user-defined callouts for taking specified actions on certain failures.

DataFlow

Via DataFlow, the contents of an RDM Embedded database can be distributed to other systems for their use. DataFlow uses data replication rather than mirroring to do this. As opposed to mirroring, DataFlow replicates core database actions such as create, update, and delete rather than byte changes. The Runtime Library, when configured to do so, creates replication logs containing these actions for each transaction. The replication logs are sent asynchronously from the primary system to one or more target systems. Resident on each target system is a Replication Client that is responsible for mapping the replication log to the target system's database and for replaying the replication logs to that database.

A primary use of DataFlow is to move data from remote non-indexed databases to a heavily indexed persistent query database. For instance, it can feed SQL databases such as MySQL, SQL Server, Oracle, or another SQL database.
It can also be used to aggregate data from several sources onto a single target. For instance, data from multiple remote sensors can be aggregated in the database of a central control computer.

Distributed

Distributed RDM Embedded provides a unified view of a geographically distributed partitioned database. The partitions are identically structured, and each partition can reside on a different computer. Through RDM Embedded's Database Union feature, an application can access the distributed partitions as a single database without being aware of the fact that they reside on different computers.

RDM Embedded uses Multi-Version Concurrency Control (MVCC) to present a consistent snapshot of the database to a reader even while the database is being actively updated. This allows readers to view the database without having to lock records in order to maintain a consistent view. As a result, MVCC read activity is transparent to applications that are actively updating the database.
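The MVCC behavior described here can be illustrated with a minimal C sketch. It is a hypothetical model, not RDM Embedded's engine: one record, versions kept in a flat array, and a global commit counter standing in for the transaction clock. The point it demonstrates is that a reader fixes a snapshot number and sees only versions committed at or before it, so concurrent writers never block it and it never sees a half-updated state.

```c
#define MAXVER 16

/* One record kept as a chain of committed versions. */
typedef struct {
    int version[MAXVER];   /* commit number that created each version */
    int value[MAXVER];
    int nversions;
} mvcc_record;

/* Global transaction clock (a stand-in for the engine's commit counter). */
static int commit_counter = 0;

/* A writer installs a new version stamped with the next commit number;
 * older versions are left in place for readers that still need them. */
static void mvcc_write(mvcc_record *r, int value) {
    int i = r->nversions++;
    r->version[i] = ++commit_counter;
    r->value[i]   = value;
}

/* A reader's snapshot is simply the commit counter at the moment its
 * read transaction starts. */
static int mvcc_snapshot(void) { return commit_counter; }

/* Read the newest version visible at the given snapshot, without
 * taking any lock. Returns 0 if no visible version exists. */
static int mvcc_read(const mvcc_record *r, int snapshot) {
    int result = 0;
    for (int i = 0; i < r->nversions; i++)
        if (r->version[i] <= snapshot)
            result = r->value[i];
    return result;
}
```

A real engine garbage-collects versions no active snapshot can see; the sketch omits that, along with all concurrency control on the version array itself.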

Interop

Interop allows other systems to interoperate with an RDM Embedded database via standard protocols. Interop supports ODBC, JDBC, and ADO.NET. RDM Embedded databases can be read and updated with Internet browsers over HTTP using RDM Embedded's MicroHTTP Server.

Use Cases

A series of detailed use cases for RDM Embedded can be found in the Raima white paper entitled RDM Embedded 10.1 Architecture and Features. 3

Summary

Embedded systems are all around us and control many aspects of our lives without our ever thinking of them. Some are simple systems that control our coffee makers and dishwashers. Others are quite complex, controlling manufacturing processes and jet airplanes.

Complex embedded applications often require sophisticated, efficient, and high-performing databases. Raima's RDM Embedded database fulfills this need. It can be configured as anything from an elemental network database requiring less than 400 KB of memory all the way up to a sophisticated SQL relational database. RDM Embedded has many features to support high-availability applications, with a goal of achieving five 9s of availability. It can be geographically distributed in peer and hierarchical architectures.

RDM Embedded is a mature database. It has been used in hundreds of thousands of embedded installations over the last twenty-five years.

3 http://www.raima.com/wp-content/uploads/-10-1-technical-summary.pdf