White Paper. Managing MapR Clusters on Google Compute Engine

MapR Technologies, Inc.

Introduction

Google Compute Engine is a proven platform for running MapR. Consistent, high-performance virtual machines, coupled with a high-bandwidth, low-latency network linking them together and to the rest of the Google Cloud Platform services, deliver a solid foundation for cloud-based data processing architectures. The ability to quickly instantiate numerous virtual machines on demand and the availability of per-minute pricing make Compute Engine well-suited and cost-effective for spinning up and turning off ad hoc clusters. Further, Compute Engine's advanced routing and data encryption features, together with Google's global network performance, enable the construction of secure, compelling hybrid solutions.

This paper presents several techniques for those who wish to manage their own MapR installations on Google Compute Engine, and select scenarios (migration across zones, disaster recovery and high availability) that arise when dealing with long-lived clusters and operating across multiple zones. This list is neither exhaustive nor authoritative, as other solutions certainly exist and might even be more applicable to specific situations. This paper, however, illustrates how the same features that make Compute Engine a powerful platform upon which to run MapR can provide the basis for MapR cluster management solutions.

Scenarios

The scenarios presented in this section are all based on the premise that the cluster under consideration ought to be preserved to the greatest extent possible when faced with an adverse environmental event (for example, a zone is shut down for maintenance or experiences an unanticipated outage). This is a common situation for long-lived clusters used in support of job pipelines and stream processing. On-demand clusters [1] require little or no management and are, generally speaking, outside the scope of this document.

Further, the scenarios presented assume that data consumed and generated by the cluster needs to reside durably in a self-managed MapR-FS (MapR File System) [2]. Administrators can plan for maintenance events and manage cluster availability accordingly. However, should an unexpected situation occur, jobs currently being processed can be impacted. Specific outcomes vary: in some cases these jobs are restarted rather than resumed, and in others they may need to be resubmitted.

Zone Migration

At some point, it may become desirable or necessary to move or clone a cluster from one zone to another. This section presents two different zone migration scenarios:

1. Minimize the time during which the cluster cannot accept and run MapReduce jobs, at the expense of operational complexity.
2. Accept additional unavailability in exchange for significantly simpler management.

[1] Google Compute Engine's fast virtual machine creation and sub-hour billing model enable customers to perform data processing using ad hoc clusters, without the overhead of managing resources constrained by timeboxed pricing.
[2] Relaxing this constraint enables one to take advantage of the preferred approach of leveraging Google Cloud Storage as the durable repository.

Zone Migration Scenario 1: Minimizing Unavailability

Many MapR clusters are integral parts of a business's daily operations. Any time users cannot submit work can be costly. This first approach attempts to reduce this duration as much as possible in a traditional deployment [3]. First, new FileServers and TaskTrackers are added to the cluster in the destination zone, and then the Container Location Database (CLDB) and JobTracker master nodes are switched over as well. This method bounds the cluster's unavailability to roughly the interval from the shutdown of the master services in the source zone to their startup in the destination zone.

Considerations

FileServers may use scratch disks, persistent disks, or both for MapR-FS. Persistent disks offer more storage space [3] as well as durability beyond the lifetime of an instance. With persistent disks, administrators have the option of reducing the MapR-FS replication factor while still protecting data in the event of a zone becoming unavailable. Using persistent root disks enables any node to survive an unexpected reboot.

This scenario makes use of MapR rack topologies to enable the cluster to continue running while nodes are decommissioned [4]. Topology describes the locations of nodes and racks in a cluster. The MapR software uses node topology to determine the location of replicated copies of data. An optimally defined cluster topology results in data being replicated to separate racks, providing continued data availability in the event of rack or node failure. Data is copied across zones when the Container Location Database (CLDB) distributes replicated data containers on separate racks. A rack is a logical collection of nodes configured with FileServers and TaskTrackers running in the same zone.

Figure 1: Minimizing unavailability during zone migration

[3] As of July 2013, up to 10TB of persistent disk storage can be mounted on standard instance types compared to 3.5TB maximum scratch disk space.
[4] Refer to Decommissioning a Node for more details on decommissioning nodes.
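The rack-placement property described above can be checked with a few lines of shell. This is only a sketch: the function names and the /<zone>/<rack> topology paths are hypothetical, and it does not query a real cluster. It simply tests whether the replicas of one data container sit on at least two distinct racks.

```shell
#!/bin/sh
# distinct_racks PATH...: number of distinct rack paths among the
# replicas of a single data container.
distinct_racks() {
    printf '%s\n' "$@" | sort -u | wc -l | tr -d ' '
}

# survives_rack_loss PATH...: succeeds when the replicas sit on at least
# two racks, so removing any single rack still leaves one copy.
survives_rack_loss() {
    [ "$(distinct_racks "$@")" -ge 2 ]
}

# Hypothetical topology paths of the form /<zone>/<rack>.
if survives_rack_loss /us-central2-a/rack1 /us-central1-a/rack3; then
    echo "container keeps a copy if one rack is decommissioned"
fi
```

A container whose replicas all report the same rack path would fail the check, which is the situation the separate rack identifiers are meant to rule out.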

Approach

1. Run the configure.sh script to configure new nodes. The nodes are automatically added to the MapR cluster when Warden (the service that starts and stops other services in a cluster) starts the services on the new node. You can create new instances for these nodes programmatically using the gcutil utility, REST API or Google client libraries, or manually via the Google Cloud Console. The new nodes should identify themselves with a pair of new rack identifiers, different from those in the source zone. Organizing nodes in different topologies ensures that only one copy of any data container will be removed from the zone when a rack is decommissioned. MapR-FS replication ensures that a copy of every data container will exist on at least two different racks and that every data container in the decommissioned rack will be available on a rack in the destination zone.

2. Migrate CLDB and JobTracker. There are several ways to implement this step. The automatic approach is recommended if persistent root and data disks have been used, as this greatly simplifies the process. The manual approach may be necessary when performing custom configurations and/or moving data from scratch disks.

Automatic Approach
a) Shut down the cluster.
b) Use the gcutil moveinstances command to migrate the nodes into the destination zone. This command copies instance configurations, takes snapshots of the persistent disks, deletes the existing instances, and then recreates the instances in the destination zone. Once the new instances are up and services are running, the cluster is available to accept jobs.

Manual Approach
a) Shut down the cluster.
b) If persistent root and/or data disks have been used, create snapshots for later use.
c) If any data residing on scratch disk must be preserved, take either of the following actions:
   i. Create and attach a new persistent disk, copy the data from the scratch disk, detach the persistent disk and create a snapshot.
   ii. Store the data in Google Cloud Storage. For example, use the following command to stream a compressed archive of the src directory directly into a bucket:
       tar zcf - <src> | gsutil cp - gs://<bucket>/src.tar.gz
d) Terminate the nodes in the source zone.
e) If applicable, create new persistent root and data disks from the snapshots created previously.
f) Use the new persistent root and data disks created in the previous step to create instances for the nodes in the destination zone.
g) If data has been temporarily stored in Google Cloud Storage, copy the data from Google Cloud Storage to the new instances.
h) Start ZooKeeper and Warden. Warden will start up the master services on the new nodes. Once the new instances are up and services are running, the cluster can accept jobs.

3. Decommission one rack of nodes and TaskTrackers. Move nodes assigned to racks in the source zone to an isolated topology and decommission [4] them using the following steps:
c) Blacklist TaskTrackers, and wait for the blacklist to finish.
d) Move the node to the /offline or /decommissioned topology.
e) When the node no longer has non-local data (the data has been "drained"), shut down the FileServer service.
f) Remove the node from the cluster using the maprcli node remove command.

4. Repeat step 3 for all racks.
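The archive-and-restore path in the manual approach (steps c.ii and g) can be rehearsed locally before touching a bucket. The sketch below is an illustration under stated assumptions: upload and download are hypothetical stand-ins that write to a temporary file so the script runs without cloud credentials, and the comments show the gsutil streaming forms they stand in for.

```shell
#!/bin/sh
# Stand-ins so the sketch runs without cloud credentials. With gsutil
# available, the streaming equivalents would be:
#   upload:   gsutil cp - gs://<bucket>/src.tar.gz
#   download: gsutil cp gs://<bucket>/src.tar.gz -
work=$(mktemp -d)
store="$work/src.tar.gz"   # hypothetical local stand-in for the bucket object
upload()   { cat > "$store"; }
download() { cat "$store"; }

# Sample content standing in for data on a scratch disk.
mkdir -p "$work/src" "$work/restore"
echo "block-0001" > "$work/src/data.txt"

# Archive: stream a compressed tarball of src straight into the store.
tar -zcf - -C "$work" src | upload

# Restore: stream the archive back and unpack it on the new instance.
download | tar -zxf - -C "$work/restore"

# Verify that the restored file is byte-identical to the original.
cmp "$work/src/data.txt" "$work/restore/src/data.txt" && echo "round trip ok"
```

Because both directions stream through a pipe, no intermediate archive needs to fit on the local disk. The temporary directory in $work is left in place for inspection and can be removed afterwards.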

Zone Migration Scenario 2: Simplifying Management

This scenario shows how to perform a migration with a single command, easing migration operations at the possible expense of longer cluster downtime. This process is nearly identical to the automatic migration of the Container Location Database (CLDB) and JobTracker instances described in scenario 1, except that gcutil moveinstances migrates the entire cluster between zones.

Considerations

Use persistent disks when adding data disks to nodes for MapR-FS (MapR File System).

Figure 2: Simplifying management for zone migration

Approach

Migrate the cluster. Use the gcutil moveinstances command to clone all of the nodes in the cluster and their persistent disks into the destination zone. For example, consider a small cluster with a single master node (mapr-maprm) and ten worker nodes (mapr-maprw-000, ..., mapr-maprw-009). The following single call will move the cluster from the us-central2-a zone and restart it in us-central1-a:

    gcutil --project=<project> moveinstances mapr-maprm mapr-maprw-00\\d+ --source_zone=us-central2-a --destination_zone=us-central1-a
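Because a single call moves every matching instance, it can be worth previewing which names the patterns select before running it. The following sketch makes several assumptions: the instance list is hypothetical (including the extra mapr-edge-001 node), the patterns are treated as full-name matches, and [0-9] replaces \d because POSIX extended regular expressions have no \d class. It does not call gcutil.

```shell
#!/bin/sh
# Hypothetical instance names; in practice these would come from the
# project's own instance listing.
instances="mapr-maprm
mapr-maprw-000
mapr-maprw-009
mapr-edge-001"

# matching_instances: print the names that match either pattern in full.
matching_instances() {
    printf '%s\n' "$instances" | grep -E '^(mapr-maprm|mapr-maprw-00[0-9]+)$'
}

matching_instances
```

With this list, the master and the two listed workers are selected and the edge node is left out, which is the kind of check that helps avoid moving an unrelated instance.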

Disaster Recovery

Zone-to-zone migration is an effective way to gracefully plan for and manage anticipated zone maintenance windows and other drivers of cluster relocation. However, it is always good practice to account for the unexpected; catastrophic zone-wide failures are extremely rare but can occur [5]. You can deploy a cluster across multiple zones for disaster recovery.

Multi-zone Cluster

When you deploy a cluster across multiple zones [6], commission another set of FileServers and TaskTrackers in a second zone. Add a standby Container Location Database (CLDB) and JobTracker to the second zone for failover.

Considerations

This pattern requires twice as many FileServers and TaskTrackers as would otherwise be deployed. It also requires substantial cross-zone communication. While this pattern results in twice the processing capability, operational (instance, storage and network) expenses will increase and overall performance could be degraded, especially if data is accessed across zones. MapR provides high availability for the Container Location Database (CLDB) and JobTracker.

Figure 3: A multi-zone cluster for disaster recovery

[5] Google Compute Engine Service Level Agreement.
[6] This option is presented for the sake of completeness and is not typically recommended because it incurs additional costs and may impact performance.

Approach

1. Distribute the cluster. FileServers and TaskTrackers are spread equally across both zones and fully commissioned. The FileServers within a zone should identify with the same rack and each zone should use a different rack identifier, ensuring that at least one copy of each data container exists in the second zone.
2. Fail over the CLDB. The CLDB can be restored in the second zone automatically.
3. Fail over the JobTracker. The JobTracker can be restored automatically in the second zone.

High Availability

MapR is the only distribution with high availability at the cluster and job levels to eliminate any single points of failure across the system. MapR distributes metadata across the cluster to avoid bottlenecks and improve cluster performance. The JobTracker in MapR has high availability. If the service crashes, a new JobTracker automatically picks up where the original JobTracker stopped, and the MapReduce job can continue without any restart or intervention. If the node itself crashes, a new node automatically takes over and continues the process.

Google Compute Engine Network DNS

One of the benefits of Google Compute Engine is the ability to address an instance on the network in any zone in any region by its user-provided hostname. Additionally, these names can be reused when instances are recycled (turning down one instance and spinning up another with the same name), which is convenient for deployments that want to rely on name-based addressing. Note that there are no guarantees around internal IP address assignment. Instance hostnames can be reused dynamically because DNS entries are cleaned up almost immediately after an instance is terminated.

ZooKeeper

The MapR cluster uses Apache ZooKeeper to coordinate services and to enable high availability (HA) and fault tolerance. The Warden will not start any services unless ZooKeeper is reachable and more than half of the configured ZooKeeper nodes (a quorum) are live.

Special consideration must be given to DNS when deploying ZooKeeper for MapR on Google Compute Engine. Currently, ZooKeeper resolves DNS names once at startup. This prevents the replacement of a member instance without restarting the remaining members of the ensemble. Given the behavior of Google Compute Engine DNS described above, any time a node running ZooKeeper is rebooted or replaced, all other ZooKeeper services must be restarted. Failure to do so will leave the existing ensemble with one less member, and the new instance will be running what amounts to an orphaned ZooKeeper instance.
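The quorum rule above ("more than half of the configured ZooKeeper nodes") reduces to simple arithmetic. As a minimal sketch, with function names that are illustrative and not part of MapR or ZooKeeper, the following computes the quorum size and the number of node failures an ensemble can absorb:

```shell
#!/bin/sh
# quorum_size N: smallest number of live nodes that is more than half
# of an ensemble of N configured ZooKeeper nodes.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: how many nodes can be lost before the ensemble
# of N drops below quorum.
tolerated_failures() {
    echo $(( $1 - $(quorum_size "$1") ))
}

for n in 1 3 5; do
    echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

An ensemble of four needs three live nodes, so it tolerates only one failure, the same as an ensemble of three; this is why odd ensemble sizes are the usual choice.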

Straddling Multiple Zones

High availability can also be considered in terms of the resilience of the cluster to zone failure. This might be applicable in scenarios where any interruption of long-running jobs or pipelines might be detrimental and/or incur substantial costs. Consider the scenario where zone A is scheduled for maintenance earlier than zone B. Given this knowledge, it is possible to construct a cluster spanning both zones such that it will continue to run in the event that either goes down.

Figure 4: Straddling multiple zones for high availability

1. Distribute the cluster. As previously described, FileServers and TaskTrackers are split equally across both zones and commissioned as part of the MapR-FS cluster; multiple MapR-FS racks are used to ensure data is fully replicated.
2. Distribute the ZooKeeper ensemble. The majority of ZooKeeper nodes must be deployed in zone B along with the standby CLDB and the active JobTracker.
3. Manage the CLDB. If zone A becomes unavailable, the ZooKeeper ensemble still maintains a quorum and can automatically facilitate the failover and promotion of the standby CLDB to active. If, instead, zone B experiences an outage, the active CLDB will continue to run. The cluster, however, will not be able to sustain an active node failure, as the ensemble has no quorum. Replacement ZooKeeper nodes must be added and the ensemble must be restarted.
4. Manage the JobTracker. The active JobTracker is deployed to zone B, so that if zone A becomes unavailable (as anticipated), any active jobs can continue running until completion.
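The asymmetry between losing zone A and losing zone B follows from where the ZooKeeper nodes are placed. The sketch below assumes a three-node ensemble with one node in zone A and two in zone B; those counts and the function name are hypothetical and chosen only to illustrate the reasoning in steps 2 and 3.

```shell
#!/bin/sh
# zone_outage_keeps_quorum ZK_IN_A ZK_IN_B LOST_ZONE
# Succeeds when the ZooKeeper nodes in the surviving zone are still
# more than half of the whole ensemble.
zone_outage_keeps_quorum() {
    total=$(( $1 + $2 ))
    case "$3" in
        A) live=$2 ;;
        B) live=$1 ;;
        *) echo "unknown zone: $3" >&2; return 2 ;;
    esac
    [ "$live" -ge $(( total / 2 + 1 )) ]
}

# Assumed layout: 1 ZooKeeper node in zone A, 2 in zone B.
for zone in A B; do
    if zone_outage_keeps_quorum 1 2 "$zone"; then
        echo "zone $zone lost: quorum kept"
    else
        echo "zone $zone lost: quorum lost"
    fi
done
```

An even split across two zones (for example two nodes in each) loses quorum whichever zone fails, which is why the majority is placed in the zone expected to stay up longer.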

Conclusion

Google Cloud Platform not only offers a high-performance platform upon which to run MapR, but also provides features and tools that can assist in the maintenance of clusters across zones to keep business-critical jobs running in zone migration, disaster recovery, and high availability scenarios. Persistent disks, Google Cloud Storage, and the network infrastructure enable efficient data and instance migration, and gcutil provides a rich set of commands to help accomplish cluster management tasks.

Additional Resources

MapR, Hive, and Pig on Google Compute Engine: a Google Cloud Platform solution with more on how to take advantage of Google Compute Engine, with support from Google Cloud Storage, to run a self-managed MapR cluster with Apache Hive and Apache Pig as part of a Big Data processing solution.

Google-compute-engine-cluster-for-hadoop: a sample application to assist in setting up Hadoop compute clusters and executing MapReduce tasks. Please note that this application does not perform any of the cluster management outlined in this paper.

MapR delivers on the promise of Hadoop with a proven, enterprise-grade platform that supports a broad set of mission-critical and real-time production uses. MapR brings unprecedented dependability, ease-of-use and world-record speed to Hadoop, NoSQL, database and streaming applications in one unified big data platform. MapR is used by more than 500 customers across financial services, retail, media, healthcare, manufacturing, telecommunications and government organizations as well as by leading Fortune 100 and Web 2.0 companies. Amazon, Cisco, Google and HP are part of the broad MapR partner ecosystem. Investors include Lightspeed Venture Partners, Mayfield Fund, NEA, and Redpoint Ventures. MapR is based in San Jose, CA. Connect with MapR on Facebook, LinkedIn, and Twitter.

MapR Technologies. All rights reserved. Apache Hadoop, HBase and Hadoop are trademarks of the Apache Software Foundation and not affiliated with MapR Technologies.


Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components Welcome to the unit of Hadoop Fundamentals on Hadoop architecture. I will begin with a terminology review and then cover the major components of Hadoop. We will see what types of nodes can exist in a Hadoop

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Multi-Datacenter Replication

Multi-Datacenter Replication www.basho.com Multi-Datacenter Replication A Technical Overview & Use Cases Table of Contents Table of Contents... 1 Introduction... 1 How It Works... 1 Default Mode...1 Advanced Mode...2 Architectural

More information

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014

Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ. Cloudera World Japan November 2014 Non-Stop Hadoop Paul Scott-Murphy VP Field Techincal Service, APJ Cloudera World Japan November 2014 WANdisco Background WANdisco: Wide Area Network Distributed Computing Enterprise ready, high availability

More information

COMPARING STORAGE AREA NETWORKS AND NETWORK ATTACHED STORAGE

COMPARING STORAGE AREA NETWORKS AND NETWORK ATTACHED STORAGE COMPARING STORAGE AREA NETWORKS AND NETWORK ATTACHED STORAGE Complementary technologies provide unique advantages over traditional storage architectures Often seen as competing technologies, Storage Area

More information

Deploying Hadoop with Manager

Deploying Hadoop with Manager Deploying Hadoop with Manager SUSE Big Data Made Easier Peter Linnell / Sales Engineer plinnell@suse.com Alejandro Bonilla / Sales Engineer abonilla@suse.com 2 Hadoop Core Components 3 Typical Hadoop Distribution

More information

5 FEATURES YOUR CLOUD DISASTER RECOVERY SOLUTION SHOULD HAVE

5 FEATURES YOUR CLOUD DISASTER RECOVERY SOLUTION SHOULD HAVE 5 FEATURES YOUR CLOUD DISASTER RECOVERY SOLUTION SHOULD HAVE 1 5 FEATURES YOUR CLOUD DISASTER RECOVERY SOLUTION SHOULD HAVE EXECUTIVE SUMMARY For organizations managing on-premises data centers, having

More information

Certified Big Data and Apache Hadoop Developer VS-1221

Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer VS-1221 Certified Big Data and Apache Hadoop Developer Certification Code VS-1221 Vskills certification for Big Data and Apache Hadoop Developer Certification

More information

Drobo How-To Guide. What You Will Need. Configure Replication for DR Using Double-Take Availability and Drobo iscsi SAN

Drobo How-To Guide. What You Will Need. Configure Replication for DR Using Double-Take Availability and Drobo iscsi SAN This document shows you how to use Drobo iscsi SAN storage with Double-Take Availability to deliver replication and DR for servers and applications. Double-Take Availability from Vision Solutions performs

More information

High Availability on MapR

High Availability on MapR Technical brief Introduction High availability (HA) is the ability of a system to remain up and running despite unforeseen failures, avoiding unplanned downtime or service disruption*. HA is a critical

More information

Big Data With Hadoop

Big Data With Hadoop With Saurabh Singh singh.903@osu.edu The Ohio State University February 11, 2016 Overview 1 2 3 Requirements Ecosystem Resilient Distributed Datasets (RDDs) Example Code vs Mapreduce 4 5 Source: [Tutorials

More information

HADOOP MOCK TEST HADOOP MOCK TEST I

HADOOP MOCK TEST HADOOP MOCK TEST I http://www.tutorialspoint.com HADOOP MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to Hadoop Framework. You can download these sample mock tests at

More information

Windows Server 2008 R2 Hyper-V Server and Windows Server 8 Beta Hyper-V

Windows Server 2008 R2 Hyper-V Server and Windows Server 8 Beta Hyper-V Features Comparison: Hyper-V Server and Hyper-V February 2012 The information contained in this document relates to a pre-release product which may be substantially modified before it is commercially released.

More information

Amazon EC2 Product Details Page 1 of 5

Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Product Details Page 1 of 5 Amazon EC2 Functionality Amazon EC2 presents a true virtual computing environment, allowing you to use web service interfaces to launch instances with a variety of

More information

IBM System x reference architecture for Hadoop: MapR

IBM System x reference architecture for Hadoop: MapR IBM System x reference architecture for Hadoop: MapR May 2014 Beth L Hoffman and Billy Robinson (IBM) Andy Lerner and James Sun (MapR Technologies) Copyright IBM Corporation, 2014 Table of contents Introduction...

More information

How to Manage Critical Data Stored in Microsoft Exchange Server 2010. By Hitachi Data Systems

How to Manage Critical Data Stored in Microsoft Exchange Server 2010. By Hitachi Data Systems W H I T E P A P E R How to Manage Critical Data Stored in Microsoft Exchange Server 2010 By Hitachi Data Systems April 2012 2 Table of Contents Executive Summary and Introduction 3 Mission-critical Microsoft

More information

All Clouds Are Not Created Equal THE NEED FOR HIGH AVAILABILITY AND UPTIME

All Clouds Are Not Created Equal THE NEED FOR HIGH AVAILABILITY AND UPTIME THE NEED FOR HIGH AVAILABILITY AND UPTIME 1 THE NEED FOR HIGH AVAILABILITY AND UPTIME All Clouds Are Not Created Equal INTRODUCTION Companies increasingly are looking to the cloud to help deliver IT services.

More information

Veritas Storage Foundation High Availability for Windows by Symantec

Veritas Storage Foundation High Availability for Windows by Symantec Veritas Storage Foundation High Availability for Windows by Symantec Simple-to-use solution for high availability and disaster recovery of businesscritical Windows applications Data Sheet: High Availability

More information

Cloud Based Application Architectures using Smart Computing

Cloud Based Application Architectures using Smart Computing Cloud Based Application Architectures using Smart Computing How to Use this Guide Joyent Smart Technology represents a sophisticated evolution in cloud computing infrastructure. Most cloud computing products

More information

A very short Intro to Hadoop

A very short Intro to Hadoop 4 Overview A very short Intro to Hadoop photo by: exfordy, flickr 5 How to Crunch a Petabyte? Lots of disks, spinning all the time Redundancy, since disks die Lots of CPU cores, working all the time Retry,

More information

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers

Modern IT Operations Management. Why a New Approach is Required, and How Boundary Delivers Modern IT Operations Management Why a New Approach is Required, and How Boundary Delivers TABLE OF CONTENTS EXECUTIVE SUMMARY 3 INTRODUCTION: CHANGING NATURE OF IT 3 WHY TRADITIONAL APPROACHES ARE FAILING

More information

Exchange Data Protection: To the DAG and Beyond. Whitepaper by Brien Posey

Exchange Data Protection: To the DAG and Beyond. Whitepaper by Brien Posey Exchange Data Protection: To the DAG and Beyond Whitepaper by Brien Posey Exchange is Mission Critical Ask a network administrator to name their most mission critical applications and Exchange Server is

More information

Hadoop implementation of MapReduce computational model. Ján Vaňo

Hadoop implementation of MapReduce computational model. Ján Vaňo Hadoop implementation of MapReduce computational model Ján Vaňo What is MapReduce? A computational model published in a paper by Google in 2004 Based on distributed computation Complements Google s distributed

More information

High Availability and Disaster Recovery Solutions for Perforce

High Availability and Disaster Recovery Solutions for Perforce High Availability and Disaster Recovery Solutions for Perforce This paper provides strategies for achieving high Perforce server availability and minimizing data loss in the event of a disaster. Perforce

More information

High Availability for Citrix XenServer

High Availability for Citrix XenServer WHITE PAPER Citrix XenServer High Availability for Citrix XenServer Enhancing XenServer Fault Tolerance with High Availability www.citrix.com Contents Contents... 2 Heartbeating for availability... 4 Planning

More information

Introduction to Apache Cassandra

Introduction to Apache Cassandra Introduction to Apache Cassandra White Paper BY DATASTAX CORPORATION JULY 2013 1 Table of Contents Abstract 3 Introduction 3 Built by Necessity 3 The Architecture of Cassandra 4 Distributing and Replicating

More information

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015 Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL May 2015 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document

More information

Jive and High-Availability

Jive and High-Availability Jive and High-Availability TOC 2 Contents Jive and High-Availability... 3 Supported High-Availability Jive Configurations...3 Designing a Single Data Center HA Configuration... 3 Designing a Multiple Data

More information

Skelta BPM and High Availability

Skelta BPM and High Availability Skelta BPM and High Availability Introduction Companies are now adopting cloud for hosting their business process management (BPM) tools. BPM on cloud can help control costs, optimize business processes

More information

Connectivity. Alliance Access 7.0. Database Recovery. Information Paper

Connectivity. Alliance Access 7.0. Database Recovery. Information Paper Connectivity Alliance Access 7.0 Database Recovery Information Paper Table of Contents Preface... 3 1 Overview... 4 2 Resiliency Concepts... 6 2.1 Database Loss Business Impact... 6 2.2 Database Recovery

More information

How Cisco IT Built Big Data Platform to Transform Data Management

How Cisco IT Built Big Data Platform to Transform Data Management Cisco IT Case Study August 2013 Big Data Analytics How Cisco IT Built Big Data Platform to Transform Data Management EXECUTIVE SUMMARY CHALLENGE Unlock the business value of large data sets, including

More information

Hyper-V Network Virtualization Gateways - Fundamental Building Blocks of the Private Cloud

Hyper-V Network Virtualization Gateways - Fundamental Building Blocks of the Private Cloud Hyper-V Network Virtualization Gateways - nappliance White Paper July 2012 Introduction There are a number of challenges that enterprise customers are facing nowadays as they move more of their resources

More information

Real-time Protection for Hyper-V

Real-time Protection for Hyper-V 1-888-674-9495 www.doubletake.com Real-time Protection for Hyper-V Real-Time Protection for Hyper-V Computer virtualization has come a long way in a very short time, triggered primarily by the rapid rate

More information

Apache Hadoop. Alexandru Costan

Apache Hadoop. Alexandru Costan 1 Apache Hadoop Alexandru Costan Big Data Landscape No one-size-fits-all solution: SQL, NoSQL, MapReduce, No standard, except Hadoop 2 Outline What is Hadoop? Who uses it? Architecture HDFS MapReduce Open

More information

Cloudera in the Public Cloud

Cloudera in the Public Cloud Cloudera in the Public Cloud Deployment Options for the Enterprise Data Hub Version: Q414-102 Table of Contents Executive Summary 3 The Case for Public Cloud 5 Public Cloud vs On-Premise 6 Public Cloud

More information

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop

More information

HIGH AVAILABILITY STRATEGIES

HIGH AVAILABILITY STRATEGIES An InterSystems Technology Guide One Memorial Drive, Cambridge, MA 02142, USA Tel: +1.617.621.0600 Fax: +1.617.494.1631 http://www.intersystems.com HIGH AVAILABILITY STRATEGIES HA Strategies for InterSystems

More information

New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN

New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN New Features in PSP2 for SANsymphony -V10 Software-defined Storage Platform and DataCore Virtual SAN Updated: May 19, 2015 Contents Introduction... 1 Cloud Integration... 1 OpenStack Support... 1 Expanded

More information

Apache HBase. Crazy dances on the elephant back

Apache HBase. Crazy dances on the elephant back Apache HBase Crazy dances on the elephant back Roman Nikitchenko, 16.10.2014 YARN 2 FIRST EVER DATA OS 10.000 nodes computer Recent technology changes are focused on higher scale. Better resource usage

More information

Confidently Virtualize Business-Critical Applications in Microsoft

Confidently Virtualize Business-Critical Applications in Microsoft Confidently Virtualize Business-Critical Applications in Microsoft Hyper-V with Veritas ApplicationHA Who should read this paper Windows Virtualization IT Architects and IT Director for Windows Server

More information

CA ARCserve Replication and High Availability Deployment Options for Hyper-V

CA ARCserve Replication and High Availability Deployment Options for Hyper-V Solution Brief: CA ARCserve R16.5 Complexity ate my budget CA ARCserve Replication and High Availability Deployment Options for Hyper-V Adding value to your Hyper-V environment Overview Server virtualization

More information

Hadoop and Map-Reduce. Swati Gore

Hadoop and Map-Reduce. Swati Gore Hadoop and Map-Reduce Swati Gore Contents Why Hadoop? Hadoop Overview Hadoop Architecture Working Description Fault Tolerance Limitations Why Map-Reduce not MPI Distributed sort Why Hadoop? Existing Data

More information

Assuring High Availability in Healthcare Interfacing Considerations and Approach

Assuring High Availability in Healthcare Interfacing Considerations and Approach Assuring High Availability in Healthcare Interfacing Considerations and Approach High availability is a term used in the software industry to indicate that the application is available a high percentage

More information

VMware VDR and Cloud Storage: A Winning Backup/DR Combination

VMware VDR and Cloud Storage: A Winning Backup/DR Combination VMware VDR and Cloud Storage: A Winning Backup/DR Combination 7/29/2010 CloudArray, from TwinStrata, and VMware Data Recovery combine to provide simple, fast and secure backup: On-site and Off-site The

More information

AUTOMATED DISASTER RECOVERY SOLUTION USING AZURE SITE RECOVERY FOR FILE SHARES HOSTED ON STORSIMPLE

AUTOMATED DISASTER RECOVERY SOLUTION USING AZURE SITE RECOVERY FOR FILE SHARES HOSTED ON STORSIMPLE AUTOMATED DISASTER RECOVERY SOLUTION USING AZURE SITE RECOVERY FOR FILE SHARES HOSTED ON STORSIMPLE Copyright This document is provided "as-is." Information and views expressed in this document, including

More information

EonStor DS remote replication feature guide

EonStor DS remote replication feature guide EonStor DS remote replication feature guide White paper Version: 1.0 Updated: Abstract: Remote replication on select EonStor DS storage systems offers strong defense against major disruption to IT continuity,

More information

Complete Storage and Data Protection Architecture for VMware vsphere

Complete Storage and Data Protection Architecture for VMware vsphere Complete Storage and Data Protection Architecture for VMware vsphere Executive Summary The cost savings and agility benefits of server virtualization are well proven, accounting for its rapid adoption.

More information

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics

Overview. Big Data in Apache Hadoop. - HDFS - MapReduce in Hadoop - YARN. https://hadoop.apache.org. Big Data Management and Analytics Overview Big Data in Apache Hadoop - HDFS - MapReduce in Hadoop - YARN https://hadoop.apache.org 138 Apache Hadoop - Historical Background - 2003: Google publishes its cluster architecture & DFS (GFS)

More information

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS)

Journal of science STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) Journal of science e ISSN 2277-3290 Print ISSN 2277-3282 Information Technology www.journalofscience.net STUDY ON REPLICA MANAGEMENT AND HIGH AVAILABILITY IN HADOOP DISTRIBUTED FILE SYSTEM (HDFS) S. Chandra

More information

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014

Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014 Highly available, scalable and secure data with Cassandra and DataStax Enterprise GOTO Berlin 27 th February 2014 About Us Steve van den Berg Johnny Miller Solutions Architect Regional Director Western

More information

Windows Geo-Clustering: SQL Server

Windows Geo-Clustering: SQL Server Windows Geo-Clustering: SQL Server Edwin Sarmiento, Microsoft SQL Server MVP, Microsoft Certified Master Contents Introduction... 3 The Business Need for Geo-Clustering... 3 Single-location Clustering

More information

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Introduction For companies that want to quickly gain insights into or opportunities from big data - the dramatic volume growth in corporate

More information

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015

Hadoop MapReduce and Spark. Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015 Hadoop MapReduce and Spark Giorgio Pedrazzi, CINECA-SCAI School of Data Analytics and Visualisation Milan, 10/06/2015 Outline Hadoop Hadoop Import data on Hadoop Spark Spark features Scala MLlib MLlib

More information

OPTIMIZING SERVER VIRTUALIZATION

OPTIMIZING SERVER VIRTUALIZATION OPTIMIZING SERVER VIRTUALIZATION HP MULTI-PORT SERVER ADAPTERS BASED ON INTEL ETHERNET TECHNOLOGY As enterprise-class server infrastructures adopt virtualization to improve total cost of ownership (TCO)

More information

OnX Big Data Reference Architecture

OnX Big Data Reference Architecture OnX Big Data Reference Architecture Knowledge is Power when it comes to Business Strategy The business landscape of decision-making is converging during a period in which: > Data is considered by most

More information

WHITE PAPER. Header Title. Side Bar Copy. Real-Time Replication Is Better Than Periodic Replication WHITEPAPER. A Technical Overview

WHITE PAPER. Header Title. Side Bar Copy. Real-Time Replication Is Better Than Periodic Replication WHITEPAPER. A Technical Overview Side Bar Copy Header Title Why Header Real-Time Title Replication Is Better Than Periodic Replication A Technical Overview WHITEPAPER Table of Contents Introduction...1 Today s IT Landscape...2 What Replication

More information