Attunity RepliWeb Event Driven Jobs



Similar documents
Attunity RepliWeb Failover Guide

RMFT Event Viewer Messages

Online Transaction Processing in SQL Server 2008

Disaster Recovery for Oracle Database

Backup Strategies for Integrity Virtual Machines

Data Protection with IBM TotalStorage NAS and NSI Double- Take Data Replication Software

Installing RMFT on an MS Cluster

High Availability with Windows Server 2012 Release Candidate

Microsoft SQL Server 2008 R2 Enterprise Edition and Microsoft SharePoint Server 2010

Neverfail for Windows Applications June 2010

Neverfail Solutions for VMware: Continuous Availability for Mission-Critical Applications throughout the Virtual Lifecycle

Comparing Microsoft SQL Server 2005 Replication and DataXtend Remote Edition for Mobile and Distributed Applications

NETWORK ATTACHED STORAGE DIFFERENT FROM TRADITIONAL FILE SERVERS & IMPLEMENTATION OF WINDOWS BASED NAS

Stretching A Wolfpack Cluster Of Servers For Disaster Tolerance. Dick Wilkins Program Manager Hewlett-Packard Co. Redmond, WA dick_wilkins@hp.

HP OpenView Storage Mirroring application notes. Guidelines for testing a disaster recovery/high availability scenario

A SURVEY OF POPULAR CLUSTERING TECHNOLOGIES

Achieving High Availability & Rapid Disaster Recovery in a Microsoft Exchange IP SAN April 2006

(Formerly Double-Take Backup)

High Availability of VistA EHR in Cloud. ViSolve Inc. White Paper February

Deploying Global Clusters for Site Disaster Recovery via Symantec Storage Foundation on Infortrend Systems

Affordable Remote Data Replication

Ecomm Enterprise High Availability Solution. Ecomm Enterprise High Availability Solution (EEHAS) Page 1 of 7

Windows Server 2008 R2 Hyper-V Live Migration

W H I T E P A P E R. Disaster Recovery Virtualization Protecting Production Systems Using VMware Virtual Infrastructure and Double-Take

Windows Geo-Clustering: SQL Server

High Availability Storage

MailMarshal SMTP in a Load Balanced Array of Servers Technical White Paper September 29, 2003

Using HP StoreOnce Backup Systems for NDMP backups with Symantec NetBackup

Provisioning Server High Availability Considerations

Protecting Microsoft SQL Server

Introducing. Markus Erlacher Technical Solution Professional Microsoft Switzerland

CA ARCserve Family r15

Protect Microsoft Exchange databases, achieve long-term data retention

Veritas Storage Foundation High Availability for Windows by Symantec

Solution Brief Availability and Recovery Options: Microsoft Exchange Solutions on VMware

RepliWeb Topology Manager. User Guide

RDS Directory Synchronization

File Replication plus NAS enables Server-Less Remote Site Backup

Application Brief: Using Titan for MS SQL

Blackboard Managed Hosting SM Disaster Recovery Planning Document

CROSS PLATFORM AUTOMATIC FILE REPLICATION AND SERVER TO SERVER FILE SYNCHRONIZATION

Near-Instant Oracle Cloning with Syncsort AdvancedClient Technologies White Paper

Pervasive PSQL Meets Critical Business Requirements

Total Disaster Recovery in Clustered Storage Servers

Enabling comprehensive data protection for VMware environments using FalconStor Software solutions

IBM TSM DISASTER RECOVERY BEST PRACTICES WITH EMC DATA DOMAIN DEDUPLICATION STORAGE

CA ARCserve Replication and High Availability Deployment Options for Hyper-V

High Availability and Disaster Recovery Solutions for Perforce

Attunity RepliWeb PAM Configuration Guide

Backup Express and Network Appliance: A Cost-Effective Remote Site Business Continuity Solution White Paper

NetApp Big Content Solutions: Agile Infrastructure for Big Data

Fault Tolerant Servers: The Choice for Continuous Availability on Microsoft Windows Server Platform

Availability Digest. Stratus Avance Brings Availability to the Edge February 2009

User Guidance. CimTrak Integrity & Compliance Suite

Virtually Effortless Backup for VMware Environments

Using Live Sync to Support Disaster Recovery

CA ARCserve and CA XOsoft r12.5 Best Practices for protecting Microsoft SQL Server

Backup with synchronization/ replication

HP Operations Agent for NonStop Software Improves the Management of Large and Cross-platform Enterprise Solutions

Protecting enterprise servers with StoreOnce and CommVault Simpana

DISK IMAGE BACKUP. For Physical Servers. VEMBU TECHNOLOGIES TRUSTED BY OVER 25,000 BUSINESSES

HA / DR Jargon Buster High Availability / Disaster Recovery

Zerto Virtual Manager Administration Guide

Big data management with IBM General Parallel File System

UNINETT Sigma2 AS: architecture and functionality of the future national data infrastructure

Vijeo Citect Architecture and Redundancy Study Guide

An Oracle White Paper November Oracle Real Application Clusters One Node: The Always On Single-Instance Database

Veeam Cloud Connect. Version 8.0. Administrator Guide

next generation architecture created to safeguard in virtual & physical environments to deliver comprehensive UNIFIED DATA PROTECTION SOLUTION BRIEF

How To Manage The Sas Metadata Server With Ibm Director Multiplatform

SQL Server 2008 Performance and Scale

Protecting Data with a Unified Platform

Windows Server 2008 R2 Essentials

Remote Site Business Continuity with Syncsort XRS White Paper

Real-time Protection for Hyper-V

High Availability and Disaster Recovery for Exchange Servers Through a Mailbox Replication Approach

EMC VPLEX FAMILY. Continuous Availability and data Mobility Within and Across Data Centers

Windows Server Failover Clustering April 2010

Handling Hyper-V. In this series of articles, learn how to manage Hyper-V, from ensuring high availability to upgrading to Windows Server 2012 R2

Whitepaper Continuous Availability Suite: Neverfail Solution Architecture

Contingency Planning and Disaster Recovery

Project management integrated into Outlook

Infortrend EonNAS 3000 and 5000: Key System Features

HP StorageWorks Data Protection Strategy brief

How To Build A Clustered Storage Area Network (Csan) From Power All Networks

HP EVA to 3PAR Online Import for EVA-to-3PAR StoreServ Migration

Database Backup and Recovery Guide

Distributed Software Development with Perforce Perforce Consulting Guide

BackupEnabler: Virtually effortless backups for VMware Environments

Perforce Backup Strategy & Disaster Recovery at National Instruments

Transcription:

Attunity RepliWeb Event Driven Jobs Software Version 5.2 June 25, 2012 RepliWeb, Inc., 6441 Lyons Road, Coconut Creek, FL 33073 Tel: (954) 946-2274, Fax: (954) 337-6424 E-mail: info@repliweb.com, Support: http://support.repliweb.com

2012 Attunity Ltd. All rights reserved. The information in this manual has been compiled with care, but Attunity Ltd. makes no warranties as to its accuracy or completeness. The software described herein may be changed or enhanced from time to time. This information does not constitute a commitment or representation by Attunity and is subject to change without notice. The software described in this document is furnished under license and may be used and/or copied only in accordance with the terms of this license and the End User License Agreement. No part of this manual may be reproduced or transmitted, in any form, by any means (electronic, photocopying, recording or otherwise) without the express written consent of Attunity Ltd. Windows, Windows XP and Windows Vista are trademarks of Microsoft Corporation in the US and/or other countries. UNIX is a registered trademark of Bell Laboratories licensed to X/OPEN. Any other product or company names referred to in this document may be the trademarks of their respective owners. Please direct correspondence or inquiries to: RepliWeb, Inc. 6441 Lyons Road Coconut Creek, Florida 33073 USA Telephone: (954) 946-2274 Fax: (954) 337-6424 Sales & General Information: Documentation: Technical Support: Website: info@repliweb.com docs@repliweb.com http://support.repliweb.com http://www.repliweb.com ii

Table of Contents 1. Overview... 2 2. Introduction and Background... 3 3. Architecture and Distribution Flow... 5 4. I/O vs. File-level in Real-Time... 7 5. Conclusion... 9 1

1. Overview Organizations in need of real-time content deployment, distribution, or synchronization can employ an I/O-level or file-level solution. The former system is based on capturing disk I/O in real-time and replicating the changes to a remote-server in the same sequence they occurred on the source. The latter is based on capturing events on the file level, and subsequently replicating the changes to a remote server. While I/O level replication is necessary in database replication, it can introduce significant issues when deploying vast content depositories (web sites, corporate data, etc.) Enterprise projects that do not require write order replications enjoy significant improvements by using a file-based event-driven product such as R-1 over an I/O-based solution. Unlike I/O-based solutions that only monitor the source system, R-1 is scalable to hundreds of target systems and can: Integrate an event-driven replication with pre/post command processing Repair corrupt or missing data on the target systems Recover from system and communication errors without consuming significant amount of system resources Transfer data in multiple parallel streams All R-1 distribution jobs are available for monitoring through a single monitoring interface, and include the ability to incorporate e-mail notification and database entry of each file transferred and deleted. Furthermore, all enterprise-wide distribution policies run independently of one another and benefit from R-1 s market leading automatic recovery and resilience scalable to hundreds of targets without sacrificing reliability. 2

2. Introduction and Background Note: This document assumes the reader is familiar with R-1 in a scheduled data distribution context, and is evaluating the costs and benefits of employing a real-time solution. R-1 is a high-level one-to-many file distribution system that is engineered to perform in complex and dynamic IP networks (LAN, WAN, LFA, Internet, ecdn, satellite based, over Unicast and Multicast). It provides unattended, Center-to-Edge content deployment and synchronization that scales in volume of data, number of processes and number of target machines, employing heterogeneous (cross-platform) server environments (Microsoft Windows TM, UNIX). To enhance efficiency, each content deployment or synchronization job resolves the absolute minimum amount of data to be transferred or deleted between the distribution source and targets. In addition to determining the topology of the content distribution, a system administrator must determine when the data consolidation will take place a factor unique to each enterprise and project. R-1 uses an easily configurable job scheduler to initiate the content distributions according to user-determined policies (at specific times, on the creation of specific files, etc.). The nature of scheduled content distribution and synchronization can result in the source and target systems falling out of sync between the scheduled distributions. While some enterprises are satisfied with weekly, daily, or hourly distributions to synchronize the data between their source and target systems, others require more immediate data consolidation. Scheduler Start Pre-Commands Snapshots Snapshots Matrix Source Content 1:Many Synchronization Transfer Delete Permissions Edge / Center Post-Commands Figure 1 Distribution Flow. 3

Introduction and Background In a scheduled content distribution, this can be accomplished by scheduling distributions as close together as possible - at one (1) minute intervals. Due to the time and system resources needed to generate the system snapshots (CSM) and perform the subsequent data synchronization, a one (1) minute scheduling time can become resource intensive, resulting in a continuous loop of distribution after distribution. The only way of addressing this issue is to introduce a real-time replication solution in place of a scheduled one. As we will see in the following section, real-time solutions introduce their own synchronization issues, the severity of which depends on the project s requirements. This document discusses R-1 s Continuous Update feature (for Microsoft Windows) and how it eludes real-time synchronization issues by introducing an event-driven data consolidation stage within a scheduled data distribution framework. 4

Content Event Monitoring Transfer & Delete Processes Attunity RepliWeb Continuous Update Guide 3. Architecture and Distribution Flow An R-1 distribution goes through a series of independent stages designed to be completely stateless. This means that nothing in the system is taken for granted, and each stage of any replication process is fully fault tolerant and recoverable. The RepliWeb Event Driven Agent (REDA) is an event driven process - not a device driver - that detects, collects, and filters file level events on the source data that are relevant to each of the distribution s targets. A visual representation of the distribution flow can be seen in Figure 1 Distribution Flow. The source machine sends instructions to the target(s) requesting a snapshot of a given directory, while in turn building its own. The respective machines do all of the state processing locally, without use of the network or any conversational protocols. Once the targets have returned snapshots to the source, the source machine builds a matrix of which objects need to go to which machines, and which need to be deleted, subsequently performing the necessary operations. Initial Snapshots Input Handle Events Source Content Pick Consolidation Events Figure 2 Continuous Synchronization From the moment CSM generation is initiated, REDA begins detecting and collecting any relevant changes made to the source directory (relevant with respect to the source file and exclude specifications). REDA then continuously instructs the R-1 distribution to propagate the relevant file / directory creation and deletion events to the target systems. The continuous update distribution will autonomously force a full synchronization on a pre-determined periodic 5

Content Event Monitoring Handle Events Transfer & Delete Processes Attunity RepliWeb Continuous Update Guide Architecture and Distribution Flow basis as well as under special circumstances (for example, if a flood of events results in a dropped event). Scheduler Start Pre-Commands Snapshots Periodic Sweep Snapshots Matrix Initial Snapshots Input Source Content Pick Consolidation Events Edge / Center Post-Commands Figure 3 Combined Operation The forced synchronization ensures that at fixed intervals, any damaged, missing, or corrupt files on the target are corrected according to the latest source data, in addition to the execution of all distribution level tasks (On-exit commands, e-mail reporting, etc ) 6

4. I/O vs. File-level in Real-Time Real-time replication solutions can be divided into I/O-level and file-level solutions. I/O level solutions monitor disk write operations on the source system and propagate them to the target system. By replicating the disk write sequence, a critical aspect of database replication, I/O-level solutions sacrifice performance under many conditions and introduce device drivers on the source system. File-level event driven solutions such as R-1 monitor events on the files and directories themselves and enjoy the benefits of being incorporated into a larger distribution framework. For projects where both satisfy the functional requirements equally well, there are a few key demarcation points: Production issues - All real-time replication systems assume by design that both source and target systems begin from a synchronized state before the real-time system is activated. This is the single largest hurdle to implementing a real-time solution in production environments. Each and every system failure requires the halting and manual re-synchronization of the systems involved before the real-time system can continue to operate as designed. Production systems are designed to be both faulttolerant and highly available; properties that are difficult to maintain with a real-time solution. Furthermore, production systems often require the integration of pre/post replication commands into the replication process a feature I/O replication by nature cannot offer. R-1 Solution - RepliWeb solutions are stateless, holding nothing in RAM. Even if the source system were to fail and the collected events are lost before or during propagation, the recovery process ensures that any source data changes that occurred during the downtime are propagated from the exact point of failure when conditions allow. Pre/Post command processing can be integrated into the real-time distribution process on forced and periodic full synchronizations, allowing for production integration of the distribution process. High availability systems using cluster technology can exploit this recovery method as control is automatically passed to the fail over node if a node does go down. Target-side considerations A real-time system only monitors one side of the equation - the source machine. If a file becomes corrupted or removed from the target machine(s), there is no way for the source machine to detect the problem and retransfer the damaged file. The file on the target machine will remain unavailable indefinitely. Furthermore, if a file on the target cannot be updated (for example, if the target system rebooted, or failed), there is no guarantee that a memory resident realtime system will be able to propagate the changes if there is a subsequent failure on the source machine. R-1 Solution - If a file is damaged, lost or corrupted on the target machine(s), or locked in use, it is repaired on the next forced or periodic synchronization. This option is not 7

I/O vs. File-level in Real-Time available to real-time solutions, as they have no way of knowing what the status of a file is on the target machine. R-1 avoids these risks by forcing a full synchronization upon recovery from a system failure. Responding to communications failures When a communications failure does not permit the propagation of real-time changes, I/O monitoring systems continue to record all changes on the source system. Since these changes cannot be propagated, system memory and subsequently system disk space are exhausted. Each I/O write is accompanied with all the information necessary to write the changes to the target, usually four times more data than the changes themselves. This significantly degrades the performance of the source system during the communications outage, and can even result in a production system exhausting its resources, crippling its production responsibilities. These effects are multiplied in the event of multiple target systems. R-1 Solution During communication outages, R-1 continues to track file-level events. Once communication between the source and target systems is restored, the changes are propagated to the target systems in multiple parallel streams. System resources are never exhausted in the event of a system or communication failure. The distribution will attempt to recover through the active data transfer, however, if communication has not been restored after successive recovery attempts, the distribution pauses real-time replication. Once communication has been restored, a full synchronization is automatically carried out that by its nature integrates any changes that occurred during the downtime. Performance Considerations I/O level real-time systems operate on a queuing mechanism, propagating changes on the target systems in the same order as they occurred on the source systems. While this is necessary in database synchronization, it is inefficient and time consuming in file synchronization, especially at high I/O write rates. The performance of I/O level real-time systems degrades as the rate of event generation increases as the events must be queued until they can be replicated to the target systems one-by-one. If the queue becomes too long, the events are moved from memory to the hard disk. This results in a significant amount of disk space being used, as well as further performance degradation as the events need to be reloaded into memory before they can be propagated. R-1 Solution Since file level changes do not need to be propagated in I/O sequence, all file-level changes are transferred to the target system(s) using multiple parallel streams. This significantly improves performance in both LAN and WAN environments. Furthermore, performance is independent of the rate of event generation. If a breakeven is met whereby a flood of events overwhelms the source system, a full synchronization, which is based on RepliWeb s CSM technology, results in the most efficient propagation of the high number of changes. Only the bare minimum needed to bring the source and target systems is transferred over the network, such as permissions, security, or changes to the data itself. 8

5. Conclusion The introduction of the R-1 s Continuous Update feature provides system administrators with the reliability and resilience of an R-1 scheduled distribution while significantly improving performance in projects requiring immediate data consolidation. Through the implementation of REDA over a scheduled distribution solution, R-1 s Continuous Update overcomes the pitfalls of standard real-time replication solutions by regularly performing a full analysis and repair of the target system(s) involved at fixed intervals. The REDA framework allows it to be aware of the number of events generated, ensuring that a flood of events does not overload the system. If such a scenario does present itself, REDA will force a full synchronization ensuring that the system is always up to date. R-1 s Continuous Update has significant benefits over I/O level real-time solutions by enabling the integration of pre/post command processing into the real-time distribution process, recovering from system and communication errors transparently and without affecting the source system s performance and stability, and repairing the target file-system(s) periodically and after a system/communication failure. By integrating REDA within existing R-1 distribution jobs, all data distribution activity performed through a forced synchronization or through REDA is accessible through a single monitoring interface. This integration, when coupled with RepliWeb R-1 s market leading resiliency and reporting functionality (e-mail notification, database entry, etc.) results in a complete, enterprise-level solution that is capable of handling even the heaviest loads without sacrificing reliability. For any additional information, please contact us at support.repliweb.com 9