Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise
White Paper BY DATASTAX CORPORATION October
Table of Contents

  Abstract
  Introduction
  The Growth in Multiple Data Centers
  A Brief Multi-Data Center Database Checklist
  A Look at Apache Cassandra
  Cassandra and Multiple Data Centers
  Multi-Data Center Performance
  Running Analytics and Search Across Multiple Data Centers
  Options for Multi-Data Center Analytics and Search
  A Look at DataStax Enterprise
  What About the Cloud?
  Managing and Monitoring Multi-Data Center Deployments
  Multi-Data Center Customer Examples
  Conclusion
  About DataStax
Abstract

Many modern businesses serve customers all around the world, with database applications that need to be always available even if a disaster hits a particular region. A database that easily spans multiple data centers and/or the cloud ensures the fastest possible response times for customers and employees who are geographically separated. A multi-data center database also protects information from loss in the event that a single data center experiences a disaster. This paper discusses why multi-data center databases are fast becoming the new norm for database operations, and how Apache Cassandra and DataStax Enterprise can comprise a smart and agile data store that is truly location-independent.

Introduction

Many modern businesses have external-facing database applications that are dramatically growing, and which serve a customer base that is geographically dispersed. Numerous companies also have workforces that are highly distributed in nature, with each employee needing fast access to the same corporate information no matter where they happen to be located. A database that easily spans multiple data centers and/or the cloud ensures the fastest possible response times (both read and write) for customers and employees who are geographically separated. A multi-data center database also provides a number of other benefits such as protecting information from loss in the event that a single data center experiences a disaster.

This paper discusses why multi-data center databases are fast becoming the new norm for database operations, along with what characteristics a database must possess to run across many data centers and the cloud at once. Focus is then turned to how Apache Cassandra and DataStax Enterprise can be easily configured to run across multiple data centers and cloud providers to meet the requirements of those needing a smart and agile datastore that is truly location-independent.
The Growth in Multiple Data Centers

A 2012 article in InfoWorld reported interesting statistics about the rise and growth of multi-data center deployments. In its latest poll of data center managers, the Uptime Institute found that 80 percent of respondents had built a new data center or upgraded an existing facility within the past five years. [1] The same article cited another study of the North American data center market done by Digital Realty Trust. In that study, 92 percent of respondents said their companies would definitely or probably expand their data center space in 2012, the highest percentage reported in six years.

This news, coupled with the fact that data centers are primarily put in place to hold (no surprise) corporate data, makes it plain that the need for databases that can easily span and interact between multiple data centers is only going to escalate, and likely at a rapid clip.

[1] "Large enterprises handing off data center builds as demand booms," by Ann Bednarz, InfoWorld, April 23, 2012.
Why Multi-Data Center Datastores?

The reasons why a multi-data center datastore is needed vary. Some use cases involve just the simple desire for a good disaster recovery plan. But the majority of multi-data center use cases revolve around keeping one logical database in sync across one to N physical data centers and delivering the fastest possible response times to the users that each data center serves.

One other factor contributing to the multi-data center discussion is big data. Those familiar with the term big data normally can recite the three V's of what makes up big data: velocity, volume, and variety. However, one overlooked aspect of big data systems is complexity, which, according to Gartner Inc., involves the domain of managing data across many different data centers, time zones, geographies, and so forth. [2]

Distributing data across many different data centers and the cloud is not an easy task with traditional databases. When one adds data that arrives at extremely high rates from many places, data in varying formats, and data in heavy volumes, the job becomes even harder.

A Brief Multi-Data Center Database Checklist

Even outside of big data environments, legacy relational databases (RDBMSs), the primary datastores for most businesses, have traditionally provided minimal support for multiple data centers. Other than basic replication or one-way mirroring, all RDBMS vendors lack key built-in features needed by modern applications that require a datastore that spans many different data centers and/or cloud geographies.

This raises the question: What are the features and capabilities that a modern database / datastore needs to meet the demands of multi-data center operations? Does it just equate to log shipping, mirroring between data centers, or master-slave replication, or is it something else?
Increasingly, the must-have short list from those wanting modern multi-data center capabilities includes the following:

- The ability to span 1-N data centers, and not just two. This includes the agility to handle multiple cloud geo-zones as well.
- Multidirectional syncs between all participating data centers, and not just one way. Or, in other words, the desire to have truly location-independent, read and write anywhere freedom.
- Built-in network intelligence, so that data is smartly transferred between data centers to minimize bandwidth overload and latency issues.
- The ability to support the required type of data traffic across data centers (e.g. real-time, analytic, search).
- Capabilities for handling big data use cases in a way where all data centers appear as just one logical database to an end user application.

[2] "Big Data Is Only the Beginning of Extreme Information Management," by Beyer, et al., Gartner Group Inc., April 7, 2011.
Pulling this off is not easy unless one starts with the right database architecture and feature set. The traditional master-slave designs inherent in RDBMSs and some NoSQL solutions are often practically unworkable, because the requirement for true location independence cannot be met. Fortunately, Apache Cassandra possesses the right blend of technical features and big data capabilities to handle modern multi-data center and cloud deployments.

A Look at Apache Cassandra

Apache Cassandra is a massively scalable NoSQL database. Cassandra's technical roots can be found at companies recognized for their ability to effectively tackle big data: Google, Amazon, and Facebook. Used today by numerous modern businesses to manage their critical data infrastructure, Cassandra is known as the solution technical professionals turn to when they need a real-time NoSQL database that supplies high performance at massive scale and never goes down.

Rather than using a legacy master-slave or a manual and difficult-to-maintain sharded design, Cassandra has a peer-to-peer (or "masterless") distributed ring architecture that is elegant and easy to set up and maintain. In Cassandra, all nodes are the same; there is no concept of a master node, and all nodes communicate with each other via a gossip protocol.

Cassandra's built-for-scale architecture means that it is capable of handling terabytes of information and thousands of concurrent users/operations per second across one to many data centers as easily as it can manage much smaller amounts of data and user traffic. It also means that, unlike master-slave or sharded systems, Cassandra has no single point of failure and therefore is capable of offering true continuous availability.
Cassandra and Multiple Data Centers

Cassandra's architecture is tailor-made for multiple data centers. Its peer-to-peer design (vs. legacy master-slave implementations), coupled with online scale-out and full redundancy that offers no single point of failure and continuous availability, makes it ideal in multi-data center environments. Because Cassandra is a masterless architecture, all nodes are the same and all nodes offer full read/write capabilities in a database cluster, regardless of where those nodes are physically located.

A single Cassandra ring (or database cluster) can certainly exist at just one physical data center. However, Cassandra can easily support a single database spanning multiple data centers, where each data center holds its own copy of the database and can have as many nodes as needed for supporting that site.

Figure 2: A Single Cassandra Database with Multiple Data Centers

Creating a database that spans multiple data centers in Cassandra is easy and is done when a new database is defined. Once the database software has been installed on all machines in all participating data centers and is running, and network communication has been established among all the nodes, a keyspace (analogous to an RDBMS database) is created using Cassandra's CQL. Within the definition of a keyspace, each data center is identified (with the ID matching configuration parameters that have been previously set) along with the number of copies of the data that the keyspace will hold in each data center.
For example, the syntax below creates a new keyspace named Globalbiz, with three data centers (DC1, DC2, and DC3): the first and second each holding six copies of the data (for fault tolerance purposes) and the third data center holding three copies:

    CREATE KEYSPACE Globalbiz
      WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'DC1': 6, 'DC2': 6, 'DC3': 3};

Once this command successfully executes, all data will then be automatically and transparently replicated between all nodes in all data centers with no further work being necessary on the part of any developer or administrator.
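As a rough illustration of how the keyspace definition scales with the number of data centers, the following Python sketch assembles the same CREATE KEYSPACE statement from a mapping of data center name to copy count. The helper name `build_create_keyspace` and the `layout` variable are hypothetical and exist only for this example; the sketch produces a string and does not connect to a cluster.

```python
def build_create_keyspace(name, replicas_per_dc):
    """Return a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to the number of copies
    of the data that data center should hold. (Hypothetical helper.)
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, copies in replicas_per_dc.items():
        if copies < 1:
            raise ValueError(dc + ": replica count must be at least 1")
        options.append("'" + dc + "': " + str(copies))
    body = ", ".join(options)
    return "CREATE KEYSPACE " + name + " WITH REPLICATION = {" + body + "};"


# Same layout as the Globalbiz example in the text.
layout = {"DC1": 6, "DC2": 6, "DC3": 3}
statement = build_create_keyspace("Globalbiz", layout)
total_copies = sum(layout.values())

print(statement)
print("copies of each row across the cluster:", total_copies)
```

With the Globalbiz layout, every row is stored 15 times across the cluster (6 + 6 + 3), which is worth keeping in mind when sizing storage for each site.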
Multi-Data Center Performance

One reason for multi-data center deployment is to keep copies of a database close to users of a particular data center/geographic region, with the end result being faster performance for both reads and writes. But what about performance across data centers? Won't updating many nodes in many different data centers put too heavy a load on a database cluster?

To address this concern, Cassandra has built-in intelligence to send only a single data stream from one data center to each of the others participating in a multi-data center cluster. Once the data has reached one of the nodes in a different data center, that node then takes responsibility for updating the other nodes in its data center that hold that piece of data.

Figure 3: Cross-Data Center Writes in Cassandra

Running Analytics and Search Across Multiple Data Centers

In addition to managing real-time data across multiple data centers, many modern businesses also wish to run analytic and enterprise search operations that span more than one data center. As with real-time data, implementing cross-data center operations for analytics and search data has proven to be no easy task.

Options for Multi-Data Center Analytics and Search

The need for multi-data center support for analytics and enterprise search has not been lost on those developing and supporting analytics and search technology like Apache Hadoop and Apache Solr. Today, Apache Hadoop offers a warm standby option that can be configured to go to a different data center. Third-party Hadoop vendors also offer solutions with one-way mirror capabilities. For Solr, writes to Solr indexes in the community version of Solr cannot span multiple data centers. Instead, there is only replication support to another node in a different data center via rsync.
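The cross-data center write path described under Multi-Data Center Performance can be made concrete with a small counting model. The Python sketch below is only an illustration under stated assumptions: the function name `cross_dc_messages` is hypothetical, the copy counts reuse the Globalbiz example, and the model counts messages rather than reproducing Cassandra's actual implementation.

```python
def cross_dc_messages(replicas_per_dc, local_dc, forward_within_dc=True):
    """Count the messages that cross a data center boundary for one write.

    replicas_per_dc maps a data center name to the number of copies it holds.
    local_dc is the data center in which the write originates.
    (Hypothetical model for illustration only.)
    """
    if local_dc not in replicas_per_dc:
        raise KeyError(local_dc)
    remote = {dc: n for dc, n in replicas_per_dc.items() if dc != local_dc}
    if forward_within_dc:
        # One message per remote data center; the node that receives it
        # relays the write to the other replicas inside its own data center.
        return len(remote)
    # Naive alternative: the originating node contacts every remote replica.
    return sum(remote.values())


# Assumed layout, taken from the Globalbiz keyspace example.
layout = {"DC1": 6, "DC2": 6, "DC3": 3}
with_forwarding = cross_dc_messages(layout, "DC1")
without_forwarding = cross_dc_messages(layout, "DC1", forward_within_dc=False)

print("cross-DC messages with forwarding:", with_forwarding)
print("cross-DC messages without forwarding:", without_forwarding)
```

For a write that originates in DC1, the model gives 2 cross-data center messages with forwarding versus 9 without it, which is the bandwidth saving the paragraph on cross-data center writes describes.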
Both the open source versions of Hadoop and Solr and those offered by third-party software vendors miss the mark where the criteria for operating a datastore in a multi-data center environment are concerned. However, DataStax Enterprise, offered by DataStax, not only supplies multi-data center support that meets the criteria suggested earlier in this paper for real-time/online data, but also delivers the same enterprise support for running analytics and search on Cassandra data across multiple data centers.

A Look at DataStax Enterprise

DataStax is the most trusted provider of Cassandra, employing the Apache chair of the Cassandra project as well as most of the committers. For enterprises that want to use Cassandra in production, DataStax supplies DataStax Enterprise Edition, which includes an enterprise-ready version of Cassandra plus built-in security and the ability to run analytics and enterprise search operations on Cassandra data.

With DataStax Enterprise, modern businesses get a complete big data platform that contains:

- A certified version of Cassandra that has passed DataStax's rigorous internal certification process, which includes heavy quality assurance testing, performance benchmarking, and defect resolution.
- Integrated analytics on Cassandra data using Hadoop MapReduce, Hive, Pig, Mahout, and Sqoop.
- Bundled enterprise search support with Apache Solr.
- Automatic management services that transparently run and take care of many administration tasks without IT staff involvement.
- DataStax OpsCenter, a visual management and monitoring tool.
- Expert, 24x7x365 support.
- Certified maintenance releases and platform certification.

What About the Cloud?

Both Cassandra and DataStax Enterprise are fully cloud-enabled and capable of supporting multiple availability zones in a cloud provider. Further, hybrid deployments are supported so that a single cluster can span multiple on-premise installations as well as cloud-based implementations.
Figure 4: Cassandra supports hybrid on-premise/cloud deployments
Managing and Monitoring Multi-Data Center Deployments

Administering and monitoring the performance of any distributed database system can be challenging, especially when the database spans multiple geographical locations. However, DataStax makes it easy to manage multi-data center databases with DataStax OpsCenter.

DataStax OpsCenter is a visual management and monitoring solution for Cassandra. Because DataStax OpsCenter is web-based, developers or administrators can easily manage and monitor all aspects of their databases from any desktop, laptop, or tablet without installing any client software. This includes databases that span multiple data centers and the cloud.

Figure 5: Managing a 9-node Cassandra cluster with DataStax OpsCenter
Multi-Data Center Customer Examples

Figure 6: A sample of companies and organizations using Cassandra in production

Some DataStax customers using Cassandra and DataStax Enterprise across multiple data centers and the cloud include:

- Netflix has over 500 nodes of Cassandra running in multiple clusters and geo-zones on Amazon.
- eBay has over 200 TB in DataStax Enterprise across three data centers.
- HealthX supports their online patient and provider portal with DataStax Enterprise running in multiple geographies on Amazon.
- ReachLocal uses DataStax Enterprise in six different data centers across the world to support their global online advertising business.
- Pantheon Systems uses Cassandra across multiple data centers to deliver their cloud-based web development platform.
- Scandit runs Cassandra across three different data centers to support its mobile barcode and product scanning service.
Conclusion

Today's successful businesses are looking for a modern database management system that can easily span multiple data centers and handle real-time, analytic, and enterprise search operations. Cassandra and DataStax Enterprise meet the requirements these businesses have for multi-data center and cloud support.

To find out more about Cassandra and DataStax, and to obtain downloads of Cassandra and DataStax Enterprise software, please visit the DataStax website or send an email to info@datastax.com. Note that DataStax Enterprise Edition is completely free to evaluate in development environments, while production deployments require the purchase of a software subscription.

About DataStax

DataStax powers the big data applications that transform business for more than 300 customers, including startups and 20 of the Fortune 100. DataStax delivers a massively scalable, flexible, and continuously available big data platform built on Apache Cassandra. DataStax integrates enterprise-ready Cassandra and includes the ability to run analytics and search on Cassandra data across multiple data centers and in the cloud.

Companies such as Adobe, Healthcare Anytime, eBay, and Netflix rely on DataStax to transform their businesses. Based in San Mateo, Calif., DataStax is backed by industry-leading investors: Lightspeed Venture Partners, Crosslink Capital, and Meritech Capital Partners. For more information, visit DataStax or follow
More informationDelivering Real-World Total Cost of Ownership and Operational Benefits
Delivering Real-World Total Cost of Ownership and Operational Benefits Treasure Data - Delivering Real-World Total Cost of Ownership and Operational Benefits 1 Background Big Data is traditionally thought
More informationMonitis Project Proposals for AUA. September 2014, Yerevan, Armenia
Monitis Project Proposals for AUA September 2014, Yerevan, Armenia Distributed Log Collecting and Analysing Platform Project Specifications Category: Big Data and NoSQL Software Requirements: Apache Hadoop
More informationA Survey of Distributed Database Management Systems
Brady Kyle CSC-557 4-27-14 A Survey of Distributed Database Management Systems Big data has been described as having some or all of the following characteristics: high velocity, heterogeneous structure,
More informationOPTIMIZING PERFORMANCE IN AMAZON EC2 INTRODUCTION: LEVERAGING THE PUBLIC CLOUD OPPORTUNITY WITH AMAZON EC2. www.boundary.com
OPTIMIZING PERFORMANCE IN AMAZON EC2 While the business decision to migrate to Amazon public cloud services can be an easy one, tracking and managing performance in these environments isn t so clear cut.
More informationHow to Unlock Agility by Backing up to, from, and in the Cloud
WHITE PAPER: HOW TO UNLOCK AGILITY BY BACKING UP TO, FROM,....... AND.... IN.. THE.... CLOUD....................... How to Unlock Agility by Backing up to, from, and in the Cloud Who should read this paper
More informationGet More Scalability and Flexibility for Big Data
Solution Overview LexisNexis High-Performance Computing Cluster Systems Platform Get More Scalability and Flexibility for What You Will Learn Modern enterprises are challenged with the need to store and
More informationThe Multi-Model Database Cloud Applications in a Complex World
The Multi-Model Database Cloud Applications in a Complex World Table of Contents INTRODUCTION MULTI-MODEL: AN EVOLUTIONARY TALE FROM RDBMS TO NOSQL TO MULTI-MODEL DATASTAX ENTERPRISE AND MULTI-MODEL DECIDING
More informationReal-World Scale for Mobile IT: Nine Core Performance Requirements
White Paper Real-World Scale for Mobile IT: Nine Core Performance Requirements Mobile IT Scale As the leader in Mobile IT, MobileIron has worked with hundreds of Global 2000 companies to scale their mobile
More informationwww.basho.com Technical Overview Simple, Scalable, Object Storage Software
www.basho.com Technical Overview Simple, Scalable, Object Storage Software Table of Contents Table of Contents... 1 Introduction & Overview... 1 Architecture... 2 How it Works... 2 APIs and Interfaces...
More informationHow To Scale Out Of A Nosql Database
Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI
More informationPractical Guidelines for Selecting NoSQL vs. an RDBMS Deployment Considerations Conclusion About DataStax
TABLE OF CONTENTS Introduction Why NoSQL? NoSQL 101 Types of NoSQL Databases What are the Advantages of NoSQL Over an RDBMS? A NoSQL Example Apache Cassandra What Makes Cassandra Ideal for Modern Online
More information7 INSIGHTS FROM OUR 2014 CLOUD ADOPTION SURVEY
1 7 INSIGHTS FROM OUR 2014 CLOUD ADOPTION SURVEY THE NEW INDUSTRY PULSE ON CLOUD MIGRATION We asked nearly 200 IT professionals in industries ranging from healthcare and government to finance and media/
More informationWebinar: Modern Data Protection For Next-Gen Apps and Databases
Enterprise Strategy Group Getting to the bigger truth. Webinar: Modern Data Protection For Next-Gen Apps and Databases Nik Rouda, Senior Analyst, ESG Group Tarun Thakur, Co-Founder and CEO, Datos IO Speakers
More informationMakeMyTrip CUSTOMER SUCCESS STORY
MakeMyTrip CUSTOMER SUCCESS STORY MakeMyTrip is the leading travel site in India that is running two ClustrixDB clusters as multi-master in two regions. It removed single point of failure. MakeMyTrip frequently
More informationArchitectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase
Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform
More informationBIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
More informationHadoop Ecosystem B Y R A H I M A.
Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open
More informationTo run large data set applications in the cloud, and run them well,
How to Harness the Power of DBaaS and the Cloud to Achieve Superior Application Performance To run large data set applications in the cloud, and run them well, businesses and other organizations have embraced
More informationSecurity and Compliance in Big Data
Security and Compliance in Big Data White Paper BY DATASTAX CORPORATION AND GAZZANG, INC MAY 2013 Contents Executive Summary 3 A Brief Note About Compliance 3 HIPAA and HITECH Regulations 4 Payment Card
More informationModernizing Your Data Warehouse for Hadoop
Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist Audie.Wright@Microsoft.com O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking
More informationElasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack
Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch s analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper
More informationWhy NoSQL? Your database options in the new non- relational world. 2015 IBM Cloudant 1
Why NoSQL? Your database options in the new non- relational world 2015 IBM Cloudant 1 Table of Contents New types of apps are generating new types of data... 3 A brief history on NoSQL... 3 NoSQL s roots
More informationWhite Paper: What You Need To Know About Hadoop
CTOlabs.com White Paper: What You Need To Know About Hadoop June 2011 A White Paper providing succinct information for the enterprise technologist. Inside: What is Hadoop, really? Issues the Hadoop stack
More informationBig Data Explained. An introduction to Big Data Science.
Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of
More informationLarge scale processing using Hadoop. Ján Vaňo
Large scale processing using Hadoop Ján Vaňo What is Hadoop? Software platform that lets one easily write and run applications that process vast amounts of data Includes: MapReduce offline computing engine
More informationCan the Elephants Handle the NoSQL Onslaught?
Can the Elephants Handle the NoSQL Onslaught? Avrilia Floratou, Nikhil Teletia David J. DeWitt, Jignesh M. Patel, Donghui Zhang University of Wisconsin-Madison Microsoft Jim Gray Systems Lab Presented
More informationT a c k l i ng Big Data w i th High-Performance
Worldwide Headquarters: 211 North Union Street, Suite 105, Alexandria, VA 22314, USA P.571.296.8060 F.508.988.7881 www.idc-gi.com T a c k l i ng Big Data w i th High-Performance Computing W H I T E P A
More informationCloud Computing Backgrounder
Cloud Computing Backgrounder No surprise: information technology (IT) is huge. Huge costs, huge number of buzz words, huge amount of jargon, and a huge competitive advantage for those who can effectively
More informationDataStax Enterprise Reference Architecture. White Paper
DataStax Enterprise Reference Architecture White Paper BY DATASTAX CORPORATION January 2014 Table of Contents Abstract...3 Introduction...3 DataStax Enterprise Architecture...3 Management Interface...
More informationThings You Need to Know About Cloud Backup
Things You Need to Know About Cloud Backup Over the last decade, cloud backup, recovery and restore (BURR) options have emerged as a secure, cost-effective and reliable method of safeguarding the increasing
More informationWHITE PAPER. 5 Ways Your Organization is Missing Out on Massive Opportunities By Not Using Cloud Software
WHITE PAPER 5 Ways Your Organization is Missing Out on Massive Opportunities By Not Using Cloud Software Cloud software allows your organization to focus on its strengths and outsource tough data storage
More informationDell Reference Configuration for DataStax Enterprise powered by Apache Cassandra
Dell Reference Configuration for DataStax Enterprise powered by Apache Cassandra A Quick Reference Configuration Guide Kris Applegate kris_applegate@dell.com Solution Architect Dell Solution Centers Dave
More informationAdvanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
More informationScaleArc idb Solution for SQL Server Deployments
ScaleArc idb Solution for SQL Server Deployments Objective This technology white paper describes the ScaleArc idb solution and outlines the benefits of scaling, load balancing, caching, SQL instrumentation
More informationLeveraging Public Clouds to Ensure Data Availability
Systems Engineering at MITRE CLOUD COMPUTING SERIES Leveraging Public Clouds to Ensure Data Availability Toby Cabot Lawrence Pizette The MITRE Corporation manages federally funded research and development
More informationTable of Contents Abstract Introduction The Expanding Digitization of Business The Core of the Internet Enterprise
1 Table of Contents Abstract... Introduction... Definition... The Expanding Digitization of Business... The Core of the Internet Enterprise... Requirements leading to radical change... Success Factors
More informationBig Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
More informationCreate and Drive Big Data Success Don t Get Left Behind
Create and Drive Big Data Success Don t Get Left Behind The performance boost from MapR not only means we have lower hardware requirements, but also enables us to deliver faster analytics for our users.
More informationMaking the Business and IT Case for Dedicated Hosting
Making the Business and IT Case for Dedicated Hosting Overview Dedicated hosting is a popular way to operate servers and devices without owning the hardware and running a private data centre. Dedicated
More informationRPO represents the data differential between the source cluster and the replicas.
Technical brief Introduction Disaster recovery (DR) is the science of returning a system to operating status after a site-wide disaster. DR enables business continuity for significant data center failures
More informationSolution brief. HP CloudSystem. An integrated and open platform to build and manage cloud services
Solution brief An integrated and open platform to build and manage cloud services The industry s most complete cloud system for enterprises and service providers Approximately every decade, technology
More information