How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns
- Tyrone Price
- 10 years ago
Table of Contents
Abstract
Introduction
Definition
The Expanding Digitization of Business
The Core of the Internet Enterprise
Requirements leading to radical change
Success Factors for the Internet Enterprise
Global Scaling
Customer-Driven Development
Micro Thinking
Rise of the Global Database
Roadmap Toward the Internet Enterprise
How DataStax Helps Power the Internet Enterprise
Conclusion
About DataStax
Abstract

Data, a key strategic asset, must be used more effectively than ever before if businesses are to compete in today's Internet economy. Modern enterprises must leverage data collected from operational (transactional) systems to achieve fast time-to-insight that results in better decisions and better customer service. This paper explores how DataStax Enterprise makes it easy for Internet Enterprises to run operational analytics on data stored in Cassandra, and to integrate that data with historical Hadoop data warehouses/lakes, so that online applications can lead to better business.

Introduction

No one questions that data is a key strategic asset that businesses must use effectively to compete in today's Internet economy. Modern enterprises must use data collected from operational (transactional) systems in ways that provide the fastest possible time-to-insight, so they can quickly make decisions that better serve their customers and benefit their business.

Examples of modern Web and mobile applications that need fast turnaround of collected data into information, both to improve a customer's experience and to assist in making business decisions, include:

- Fraud detection systems that quickly detect identity theft and prevent loss to customers and the business.
- Media and entertainment applications that track a customer's viewing and listening preferences and make on-target recommendations that increase the customer's enjoyment of the service and result in additional purchases for the business.
- Home utility and appliance sensor applications that continuously ingest and analyze usage information, resulting in lower energy costs and better use of the product for the customer.

These and other types of Internet economy systems depend upon a data management platform that is architected from the foundation up to consume operational data and analyze it in a way that enables fast decision making, to the benefit of both the customer and the underlying business.
This paper explores how DataStax Enterprise supplies these analytic capabilities to today's Web and mobile applications that extend around the globe and must always be available for customer use.

"The data gathered by NREL comes in different formats, at different rates, from a wide variety of sensors, meters, and control networks. DataStax aligns it within one scalable database." Keith Searight, NREL

The Evolution of Analytics

A survey of how analytics on data collected through operational systems is performed today reveals that some IT practices used in the past remain intact while new trends are emerging for Web and mobile applications.

Operational versus Data Warehouse Analytics

For many decades, operational (online) databases and data warehouses have been kept separate, a separation characterized by the different types of workloads and applications each type of database serves. Operational or line-of-business (LOB) systems typically support transactions and queries that are short in duration, are both write and read intensive, and handle data in real time. By contrast, data warehouses are typified by workloads with long-running queries against very large data volumes that have been collected from multiple operational systems and are used for analysis and decision making.

Even though a data warehouse's primary purpose is to enable analysis on collected data, this does not mean that analytics reside only in the domain of the data warehouse. In fact, traditional RDBMSs such as Oracle and Microsoft SQL Server have all included various analytic functions (e.g. windowing, partition by) that allow analysis to be run on operational data.

The evolution of today's business into an Internet economy has not altered this paradigm, although, because of scaling and data distribution needs, the types of databases and data platforms being used have definitely changed to support the needs of modern online applications. As a result, legacy operational and data warehouse engines such as Oracle and Teradata have begun to lose ground to NoSQL databases that handle distributed line-of-business applications and to Hadoop, which services data warehouses or data lakes.

Figure 1 Contrasting legacy and Internet Enterprise platforms for operational and data warehousing.

As with legacy RDBMS operational and data warehouse applications, modern online systems using NoSQL need to perform analytics on transactional data and also integrate that data with data warehouses / data lakes that use Hadoop.

The Emergence of Transactional Analytics

Many of today's online applications have outgrown the traditional and basic ACID (atomic, consistent, isolated, durable) transaction of the relational era and have broadened it so that it can (1) be used across a widely distributed system; and (2) be more of an interaction, where the transaction may include analysis that is real/near time and possibly even historical. Once completed, the transaction is then used to trigger other events and make decisions that affect literally the next transaction the user makes, or internal activities such as business intelligence decision-making processes.

Examples of applications that are increasingly becoming transactional-analytic include fraud detection systems that field incoming purchase requests and analyze many specifics regarding the request, such as purchase location, frequency, amount, and much more. Other application use cases tailor-made for transactional analytics are online recommendation engines that constantly consume and analyze user activity and then quickly turn around recommendations on other suggested items to purchase, additional news stories to read, and more.

Figure 2 Transactional-analytical processing application.

Analyst groups such as Gartner Group classify this broadening of legacy transactions as hybrid transactional analytical processing, or HTAP. Additionally, Gartner states that the analytics required in many of these applications will be of varied tempos, meaning that the analysis will sometimes need to be carried out in real/near time, while other situations will best be handled by analytics that take longer to run.

Requirements for Running Analytics on Online Applications

Given the heightened priority of making fast and accurate decisions from data collected from online applications, what are the key requirements for supporting analytic functionality in a modern operational database? While each application is different, the following can serve as a general must-have checklist for today's operational databases:

- High-speed data consumption: the database should support fast data use cases where data is rapidly flowing into the system from user transactions, sensor inputs, and other similar feeds.
- Heterogeneous data type support: the system should support all types of data, including structured, semi-structured, and unstructured.
- Continuous availability: because analytics on operational data is not optional, the same uptime requirements used for OLTP operations apply to analytic workloads.
- Location independence: analytics on operational data must be capable of being run in any location that the underlying application serves.
- Performance at scale: the database should be able to run analytic operations that meet performance SLAs regardless of the underlying data volumes.
- Multi-workload support with isolation: analytic workloads performed on OLTP data should not impact OLTP operations; in other words, there should be a way to support both OLTP and analytic workloads with isolation between the two, so no competition exists for either compute or data resources.
- Minimization of data movement: the need to ETL (extract-transform-load) data to separate databases for analysis should be minimal, as constant data movement costs time.
- Multi-analytic tempo support: the database should be able to support multiple analytic tempos that satisfy applications needing more than one speed of analytics (e.g. both near/real time and long running/batch).
- Integration with data warehouses/lakes: easy back-and-forth integration with external data warehouses/lakes should be possible, going beyond simple ETL so that the data warehouse may access data directly in the operational data store and run analytic tasks remotely.

A New Approach: Analytics with DataStax Enterprise

Today's Internet Enterprises that use modern Web and mobile applications to engage and interact with their customers will find that running analytics on their operational data is made easy by DataStax Enterprise. DataStax Enterprise is the leading distributed database for today's digital world of always-on, connected-everywhere applications.
At the core of DataStax Enterprise is Apache Cassandra, the #1 open source massively scalable NoSQL database used by many Internet Enterprises today to power their online applications. Cassandra sports an always-on, continuously available architecture that future-proofs the success of business applications by providing linear scale performance against ever-increasing data volumes. The modern masterless ring architecture and distributed nature of Cassandra allow a business to easily support its customers no matter where they are geographically located, and also provide hybrid application support for systems that run partly in private data centers and partly on public cloud providers.

Figure 3 The distributed, masterless architecture of Cassandra makes distributing data anywhere in the world fast and easy.

DataStax Enterprise provides a production-ready version of Cassandra along with other important features that modernize traditional businesses into Internet Enterprises:

- Enterprise-class security that ensures data is safe and protected.
- Integrated analytics support on Cassandra data (more on this below).
- Integrated enterprise search capabilities on Cassandra data.
- Workload isolation and data replication that ensure OLTP, analytics, and search workloads do not compete with each other for data or compute resources.
- In-memory database option for both OLTP and analytic workloads.
- Automatic management services that transparently automate numerous database maintenance and performance monitoring tasks.
- Visual management and monitoring of all database clusters from any device (laptop, tablet, smart phone).
- Around-the-clock expert support.

Figure 4 DataStax Enterprise components.

When it comes to supporting analytic workloads on operational data, DataStax Enterprise provides three different options, any or all of which may be used in a database cluster.

Real-Time Analytics

For applications needing real-time analytics support, DataStax Enterprise provides the ability to run fast analytic operations on Cassandra data either in an application-based manner (i.e. developed in an application with a language like Java) or via ad-hoc queries executed through bundled database utilities or BI tools such as Tableau.

When creating a new database cluster, an architect or administrator simply specifies that some or all nodes in the new cluster be analytics enabled. After that, analytics can be run on any incoming data housed on those nodes.

A number of different deployment scenarios may be used, such as combining OLTP and analytics on the same nodes or segregating OLTP and analytics on different nodes. The latter accomplishes workload isolation, so that OLTP and analytics workloads do not compete with each other for data or compute resources. This capability is enabled by DataStax Enterprise's built-in replication, which automatically replicates data from OLTP nodes to analytic nodes where analytic operations may be carried out.

Figure 5 Deploying a new cluster with segregated OLTP and analytics nodes.

For real-time analytics, DataStax Enterprise uses Spark, which provides in-memory as well as disk-based support for running fast analytics across a distributed, shared-nothing architecture.
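The paper does not show how replication between OLTP and analytics nodes is configured. As a minimal sketch, assuming the analytics nodes are grouped into their own logical data center, the segregated deployment can be expressed through Cassandra's keyspace replication settings, where `NetworkTopologyStrategy` takes a replica count per data center. The helper `keyspace_cql`, the keyspace name `retail`, and the data center names `oltp_dc` and `analytics_dc` are hypothetical and chosen only for illustration; the code builds the CQL text and does not connect to a cluster.

```python
from typing import Mapping


def keyspace_cql(name: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement that sets replicas per data center.

    The keyspace and data center names are supplied by the caller; the ones
    used below are hypothetical examples, not names from the paper.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ", ".join(
        f"'{dc}': {count}" for dc, count in replicas_per_dc.items()
    )
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {name} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Three replicas serve OLTP traffic; two further replicas live on the
# (assumed) analytics data center, so analytic jobs read their own copies.
statement = keyspace_cql("retail", {"oltp_dc": 3, "analytics_dc": 2})
print(statement)
```

With a layout like this, every write accepted by the OLTP data center is also replicated to the analytics data center, which is the mechanism that lets analytic jobs avoid competing with OLTP reads for the same replicas.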
Analytic applications may be developed in languages such as Java, Scala, and Python, while ad-hoc queries are supported in three ways: (1) SparkSQL, which offers a subset of SQL-92 compatible syntax and allows SQL-styled queries to be run against Cassandra data; (2) Shark, a Hadoop Hive-compatible utility that allows Hive-styled queries to be run against Cassandra data; and (3) BI tools such as Tableau, which are enabled through a free ODBC driver that connects directly to a DataStax Enterprise cluster.

Further, DataStax Enterprise also enables streaming analytics on high-velocity, in-flight data streams via support for Spark Streaming. This shortens the time between a transaction and its impact on analytical insight, which is especially important for use cases such as Internet of Things (IoT) applications.

A primary benefit of DataStax Enterprise real/near-time analytics is very fast response times, made possible by various technology enablers including in-memory processing. It should be noted that DataStax Enterprise's OLTP in-memory option may be used in conjunction with in-memory analytics, with the combination delivering a full in-memory solution for transactional-analytic workloads and fast turnaround times for use cases such as recommendation engines, online retail re-pricing, fraud detection, and others.
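To make the transactional-analytic idea concrete, the following is a minimal sketch of analysis performed as each transaction arrives, using the fraud detection signals the paper names (purchase location, frequency, and amount). It is plain Python over an in-memory sliding window, not Spark, Spark Streaming, or any DataStax Enterprise API; the `Purchase` and `FraudScreen` names, the thresholds, and the sample events are all hypothetical.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass
class Purchase:
    card: str
    timestamp: float  # seconds since an arbitrary start
    amount: float
    location: str


class FraudScreen:
    """Checks each purchase against the card's recent history."""

    def __init__(self, window_seconds: float, max_purchases: int, max_amount: float):
        self.window_seconds = window_seconds
        self.max_purchases = max_purchases
        self.max_amount = max_amount
        self._recent: Dict[str, Deque[Purchase]] = defaultdict(deque)

    def check(self, purchase: Purchase) -> List[str]:
        recent = self._recent[purchase.card]
        # Drop purchases that have fallen out of the sliding time window.
        while recent and purchase.timestamp - recent[0].timestamp > self.window_seconds:
            recent.popleft()
        reasons = []
        if len(recent) >= self.max_purchases:
            reasons.append("frequency")
        if purchase.amount > self.max_amount:
            reasons.append("amount")
        if recent and recent[-1].location != purchase.location:
            reasons.append("location")
        # The purchase becomes history for the next transaction on this card.
        recent.append(purchase)
        return reasons


screen = FraudScreen(window_seconds=600, max_purchases=2, max_amount=500)
events = [
    Purchase("card-1", 0, 40.0, "Austin"),
    Purchase("card-1", 60, 25.0, "Austin"),
    Purchase("card-1", 120, 900.0, "Lagos"),
    Purchase("card-1", 2000, 30.0, "Lagos"),
]
results = [screen.check(event) for event in events]
print(results)
```

Only the third purchase is flagged: it is the third purchase inside the ten-minute window, exceeds the amount limit, and comes from a different location than the previous one. The fourth passes because the earlier purchases have aged out of the window by then. In a real deployment the per-card history would live in the database rather than in process memory.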
Integrated Batch Analytics

For situations where analytics use cases on operational data are batch-oriented (or longer in duration), DataStax Enterprise provides built-in batch analytics capabilities that allow longer-running analytic tasks to be executed directly on Cassandra data. As with real/near-time analytics, nodes in a DataStax Enterprise cluster may be specifically marked out for such operations.

Figure 6 Specifying that a node in a cluster be devoted to batch analytics.

Analytic tasks may be run internally and directly on Cassandra data in a DataStax Enterprise cluster with MapReduce, Hive, Pig, and Mahout functions. Enabling both real/near-time and batch analytics in a cluster provides full support for the multiple analytic tempos required by many of today's online applications.

The standard use case for integrated batch analytics in DataStax Enterprise involves situations where there is a need to perform longer-running analytic tasks on Cassandra data that may include numerous computations and be programmatic in nature (e.g. a health-care company that analyzes patient procedures for billing).

It is important to note that the integrated batch analytics feature should not be used as a replacement for a Hadoop data warehouse/lake and is not meant to handle the types of very large data warehouse workloads that are better served by standalone Hadoop implementations. Instead, integration between DataStax Enterprise and such deployments is made available for linking hot and cold/historical data together.

External Batch Analytics and Integration with Data Warehouses

Because there are situations where operational and historical data must be combined for decision making, DataStax Enterprise supports integration with Hadoop data warehouses/lakes such as those offered by Cloudera and Hortonworks. The integration allows three things:

1. Components from an external Hadoop vendor (e.g. Hive, Pig) can be installed directly on nodes in a DataStax Enterprise cluster and execute directly on Cassandra data.
2. Cassandra tables may be linked with external Hadoop objects (e.g. a Hive table) and queried / joined together.
3. Results from analytic tasks may be sent back to a Hadoop data warehouse.

To enable integration, Hadoop task trackers and other desired components are installed and configured on specified nodes in a DataStax Enterprise cluster. Once these are running, analytic tasks can be run against Cassandra data and can optionally link Cassandra and external Hadoop objects together, with output results being sent back to a Hadoop deployment.

Figure 7 Integration with external Hadoop data warehouses is easily handled with DataStax Enterprise.
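What linking hot and cold/historical data means in practice can be illustrated with a small sketch. It assumes the operational side supplies recent rows and the warehouse side supplies one historical summary row per customer, then joins the two by customer id. This is plain Python standing in for a Hive-style join, not DataStax Enterprise or Hadoop code; the function `join_hot_and_cold` and the field names `customer_id`, `amount`, and `avg_amount` are hypothetical.

```python
from typing import Dict, List


def join_hot_and_cold(hot_rows: List[Dict], cold_rows: List[Dict]) -> List[Dict]:
    """Inner-join recent operational rows with historical summaries."""
    # Index the historical (cold) side once so each hot row is a dict lookup.
    history = {row["customer_id"]: row for row in cold_rows}
    joined = []
    for row in hot_rows:
        past = history.get(row["customer_id"])
        # Skip customers with no history or an unusable historical average.
        if past is None or past["avg_amount"] == 0:
            continue
        joined.append(
            {
                "customer_id": row["customer_id"],
                "amount": row["amount"],
                "historical_avg": past["avg_amount"],
                "ratio": row["amount"] / past["avg_amount"],
            }
        )
    return joined


# Hot rows: today's purchases from the operational store (made-up values).
hot = [
    {"customer_id": "c1", "amount": 120.0},
    {"customer_id": "c2", "amount": 15.0},
    {"customer_id": "c3", "amount": 50.0},
]
# Cold rows: per-customer averages from the data warehouse (made-up values).
cold = [
    {"customer_id": "c1", "avg_amount": 40.0},
    {"customer_id": "c2", "avg_amount": 30.0},
]
joined = join_hot_and_cold(hot, cold)
for row in joined:
    print(row["customer_id"], row["ratio"])
```

Customer `c3` drops out because the warehouse holds no history for it, which is the behavior of an inner join; a left join would keep the row with an empty historical side.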
Evaluating DataStax Enterprise for Modern Analytics

The following table describes how DataStax Enterprise meets the analytic requirements of today's online applications.

REQUIREMENT: COMMENTS
High-speed Data Consumption: One of Cassandra's hallmarks is being the fastest write engine of any database, RDBMS or NoSQL
Modern Data Type Support: Supports all data types
Continuous Availability: Has no single point of failure and provides capabilities for no downtime
Location Independence: Best multi-datacenter and cloud support of any database, allowing data to be read, written, and analyzed anywhere
Performance at Scale: Only database to provide true linear scale performance; nodes are added online to increase performance
Minimization of Data Movement: Built-in replication removes the need to move data to different systems for real-time analysis and search
Integration with Data Warehouses: Easily integrates with external Hadoop data warehouses

Conclusion

DataStax Enterprise makes it easy for Internet Enterprises to run operational analytics on data stored in Cassandra, as well as integrate that data with historical Hadoop data warehouses/lakes, so that online applications can better serve both the needs of the target customer and the internal decision-making requirements of the business.

For downloads of DataStax Enterprise, online documentation, tutorials, client drivers, getting started materials and more, visit

About DataStax

DataStax, the leading distributed database management system, delivers Apache Cassandra to the world's most innovative enterprises. DataStax is built to be agile, always-on, and predictably scalable to any size.

DataStax has more than 500 customers in 45 countries, including leaders such as Netflix, Rackspace, and Pearson Education, and spans verticals including web, financial services, telecommunications, logistics, and government.

Based in Santa Clara, Calif., DataStax is backed by industry-leading investors including Lightspeed Venture Partners, Meritech Capital, and Crosslink Capital. For more information, visit DataStax.com or follow EU
The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success
The Modern Online Application for the Internet Economy: 5 Key Requirements that Ensure Success 1 Table of Contents Abstract... 3 Introduction... 3 Requirement #1 Smarter Customer Interactions... 4 Requirement
Introduction to Apache Cassandra
Introduction to Apache Cassandra White Paper BY DATASTAX CORPORATION JULY 2013 1 Table of Contents Abstract 3 Introduction 3 Built by Necessity 3 The Architecture of Cassandra 4 Distributing and Replicating
Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise
Introduction to Multi-Data Center Operations with Apache Cassandra and DataStax Enterprise White Paper BY DATASTAX CORPORATION October 2013 1 Table of Contents Abstract 3 Introduction 3 The Growth in Multiple
Introduction to Multi-Data Center Operations with Apache Cassandra, Hadoop, and Solr WHITE PAPER
Introduction to Multi-Data Center Operations with Apache Cassandra, Hadoop, and Solr WHITE PAPER By DataStax Corporation August 2012 Contents Introduction...3 The Growth in Multiple Data Centers...3 Why
Big Data: Beyond the Hype
Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER Big Data: Beyond the Hype Why Big Data Matters to You By DataStax Corporation October 2011 Table of Contents Introduction...4 Big Data
Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS)
Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS) White Paper BY DATASTAX CORPORATION August 2013 1 Table of Contents Abstract 3 Introduction 3 Overview of HDFS 4
Big Data: Beyond the Hype. Why Big Data Matters to You. White Paper
Big Data: Beyond the Hype Why Big Data Matters to You White Paper BY DATASTAX CORPORATION October 2013 Table of Contents Abstract 3 Introduction 3 Big Data and You 5 Big Data Is More Prevalent Than You
Big Data: Beyond the Hype
Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER By DataStax Corporation March 2012 Contents Introduction... 3 Big Data and You... 5 Big Data Is More Prevalent Than You Think... 5 Big
Simplifying Database Management with DataStax OpsCenter
Simplifying Database Management with DataStax OpsCenter Table of Contents Table of Contents... 2 Abstract... 3 Introduction... 3 DataStax OpsCenter... 3 How Does DataStax OpsCenter Work?... 3 The OpsCenter
INTRODUCTION TO CASSANDRA
INTRODUCTION TO CASSANDRA This ebook provides a high level overview of Cassandra and describes some of its key strengths and applications. WHAT IS CASSANDRA? Apache Cassandra is a high performance, open
Highly available, scalable and secure data with Cassandra and DataStax Enterprise. GOTO Berlin 27 th February 2014
Highly available, scalable and secure data with Cassandra and DataStax Enterprise GOTO Berlin 27 th February 2014 About Us Steve van den Berg Johnny Miller Solutions Architect Regional Director Western
Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale
WHITE PAPER Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale Sponsored by: IBM Carl W. Olofson December 2014 IN THIS WHITE PAPER This white paper discusses the concept
Implementing Search in Web, Mobile, and IOT Applications An Overview of DataStax Enterprise Search
Implementing Search in Web, Mobile, and IOT Applications An Overview of DataStax Enterprise Search Table of Contents Introduction... 3 Why Search?... 3 General Search Requirements... 3 Traditional Deployment
Big Data & QlikView. Democratizing Big Data Analytics. David Freriks Principal Solution Architect
Big Data & QlikView Democratizing Big Data Analytics David Freriks Principal Solution Architect TDWI Vancouver Agenda What really is Big Data? How do we separate hype from reality? How does that relate
EMC Federation Big Data Solutions. Copyright 2015 EMC Corporation. All rights reserved.
EMC Federation Big Data Solutions 1 Introduction to data analytics Federation offering 2 Traditional Analytics! Traditional type of data analysis, sometimes called Business Intelligence! Type of analytics
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap
Aligning Your Strategic Initiatives with a Realistic Big Data Analytics Roadmap 3 key strategic advantages, and a realistic roadmap for what you really need, and when 2012, Cognizant Topics to be discussed
Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS) WHITE PAPER
Comparing the Hadoop Distributed File System (HDFS) with the Cassandra File System (CFS) WHITE PAPER By DataStax Corporation September 2012 Contents Introduction... 3 Overview of HDFS... 4 The Benefits
Tap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
HDP Hadoop From concept to deployment.
HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some
Architecting for the Internet of Things & Big Data
Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to
Integrating a Big Data Platform into Government:
Integrating a Big Data Platform into Government: Drive Better Decisions for Policy and Program Outcomes John Haddad, Senior Director Product Marketing, Informatica Digital Government Institute s Government
Virtualizing Apache Hadoop. June, 2012
June, 2012 Table of Contents EXECUTIVE SUMMARY... 3 INTRODUCTION... 3 VIRTUALIZING APACHE HADOOP... 4 INTRODUCTION TO VSPHERE TM... 4 USE CASES AND ADVANTAGES OF VIRTUALIZING HADOOP... 4 MYTHS ABOUT RUNNING
Big Data on Microsoft Platform
Big Data on Microsoft Platform Prepared by GJ Srinivas Corporate TEG - Microsoft Page 1 Contents 1. What is Big Data?...3 2. Characteristics of Big Data...3 3. Enter Hadoop...3 4. Microsoft Big Data Solutions...4
Next-Generation Cloud Analytics with Amazon Redshift
Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional
CA Technologies Big Data Infrastructure Management Unified Management and Visibility of Big Data
Research Report CA Technologies Big Data Infrastructure Management Executive Summary CA Technologies recently exhibited new technology innovations, marking its entry into the Big Data marketplace with
Hadoop Evolution In Organizations. Mark Vervuurt Cluster Data Science & Analytics
In Organizations Mark Vervuurt Cluster Data Science & Analytics AGENDA 1. Yellow Elephant 2. Data Ingestion & Complex Event Processing 3. SQL on Hadoop 4. NoSQL 5. InMemory 6. Data Science & Machine Learning
Big Data: Are You Ready? Kevin Lancaster
Big Data: Are You Ready? Kevin Lancaster Director, Engineered Systems Oracle Europe, Middle East & Africa 1 A Data Explosion... Traditional Data Sources Billing engines Custom developed New, Non-Traditional
Oracle s Big Data solutions. Roger Wullschleger. <Insert Picture Here>
s Big Data solutions Roger Wullschleger DBTA Workshop on Big Data, Cloud Data Management and NoSQL 10. October 2012, Stade de Suisse, Berne 1 The following is intended to outline
The Future of Data Management
The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class
AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW
AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this
A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel
A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated
Executive Summary... 2 Introduction... 3. Defining Big Data... 3. The Importance of Big Data... 4 Building a Big Data Platform...
Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure Requirements... 5 Solution Spectrum... 6 Oracle s Big Data
Complying with Payment Card Industry (PCI-DSS) Requirements with DataStax and Vormetric
Complying with Payment Card Industry (PCI-DSS) Requirements with DataStax and Vormetric Table of Contents Table of Contents... 2 Overview... 3 PIN Transaction Security Requirements... 3 Payment Application
From Spark to Ignition:
From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for
Big Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
THE JOURNEY TO A DATA LAKE
THE JOURNEY TO A DATA LAKE 1 THE JOURNEY TO A DATA LAKE 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA ACCORDING TO IDC, AS MUCH AS 85% OF DATA GROWTH BY 2020 WILL COME FROM NEW TYPES OF DATA,
Comparing Oracle with Cassandra / DataStax Enterprise
Comparing Oracle with Cassandra / DataStax Enterprise Table of Contents Table of Contents... 2 Abstract... 3 Introduction... 3 Oracle and Today s Online Applications... 3 Architectural Limitations... 3
Native Connectivity to Big Data Sources in MSTR 10
Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single
Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database
Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica
So What s the Big Deal?
So What s the Big Deal? Presentation Agenda Introduction What is Big Data? So What is the Big Deal? Big Data Technologies Identifying Big Data Opportunities Conducting a Big Data Proof of Concept Big Data
End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ
End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,
No-SQL Databases for High Volume Data
Target Conference 2014 No-SQL Databases for High Volume Data Edward Wijnen 3 November 2014 The New Connected World Needs a Revolutionary New DBMS Today The Internet of Things 1990 s Mobile 1970 s Mainfram
Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances
INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com
Big Data Are You Ready? Thomas Kyte http://asktom.oracle.com The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
Modernizing Your Data Warehouse for Hadoop
Modernizing Your Data Warehouse for Hadoop Big data. Small data. All data. Audie Wright, DW & Big Data Specialist [email protected] O 425-538-0044, C 303-324-2860 Unlock Insights on Any Data Taking
Transforming the Telecoms Business using Big Data and Analytics
Transforming the Telecoms Business using Big Data and Analytics Event: ICT Forum for HR Professionals Venue: Meikles Hotel, Harare, Zimbabwe Date: 19 th 21 st August 2015 AFRALTI 1 Objectives Describe
www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach
www.objectivity.com Choosing The Right Big Data Tools For The Job A Polyglot Approach Nic Caine NoSQL Matters, April 2013 Overview The Problem Current Big Data Analytics Relationship Analytics Leveraging
BIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
Enabling SOX Compliance on DataStax Enterprise
Enabling SOX Compliance on DataStax Enterprise Table of Contents Table of Contents... 2 Introduction... 3 SOX Compliance and Requirements... 3 Who Must Comply with SOX?... 3 SOX Goals and Objectives...
Big Data Technologies Compared June 2014
Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development
Big Data Technology ดร.ชูชาติ หฤไชยะศักดิ์ Choochart Haruechaiyasak, Ph.D.
Big Data Technology ดร.ชูชาติ หฤไชยะศักดิ์ Choochart Haruechaiyasak, Ph.D. Speech and Audio Technology Laboratory (SPT) National Electronics and Computer Technology Center (NECTEC) National Science and Technology
Addressing Open Source Big Data, Hadoop, and MapReduce limitations
Addressing Open Source Big Data, Hadoop, and MapReduce limitations 1 Agenda What is Big Data / Hadoop? Limitations of the existing Hadoop distributions Going enterprise with Hadoop 2 How Big are Data?
Evaluating Apache Cassandra as a Cloud Database White Paper
Evaluating Apache Cassandra as a Cloud Database White Paper BY DATASTAX CORPORATION October 2013 1 Table of Contents Abstract 3 Introduction 3 Why Move to a Cloud Database? 3 The Cloud Promises Transparent
Big Data Analytics with EMC Greenplum and Hadoop. Ofir Manor Pre Sales Technical Architect EMC Greenplum
Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All
Using Tableau Software with Hortonworks Data Platform
Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani
A Tour of the Zoo the Hadoop Ecosystem Prafulla Wani Technical Architect - Big Data Syntel Agenda Welcome to the Zoo! Evolution Timeline Traditional BI/DW Architecture Where Hadoop Fits In 2 Welcome to
Big Data Are You Ready? Jorge Plascencia Solution Architect Manager
Big Data Are You Ready? Jorge Plascencia Solution Architect Manager Big Data: The Datafication Of Everything Thoughts Devices Processes Thoughts Things Processes Run the Business Organize data to do something
How To Use Big Data For Telco (For A Telco)
ON-LINE VIDEO ANALYTICS EMBRACING BIG DATA David Vanderfeesten, Bell Labs Belgium ANNO 2012 YOUR DATA IS MONEY BIG MONEY! Your click stream, your activity stream, your electricity consumption, your call
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy. Presented by: Jeffrey Zhang and Trishla Maru
Tapping Into Hadoop and NoSQL Data Sources with MicroStrategy Presented by: Jeffrey Zhang and Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connect to Hadoop?
Data Integration Checklist
The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media
Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014
Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/
Oracle Database 12c Plug In. Switch On. Get SMART.
Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing
Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics
Building Your Big Data Team
Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.
INTELLIGENT BUSINESS STRATEGIES WHITE PAPER
INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Improving Access to Data for Successful Business Intelligence Part 2: Supporting Multiple Analytical Workloads in a Changing Analytical Landscape By Mike Ferguson
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies
Big Data and Advanced Analytics Applications and Capabilities Steven Hagan, Vice President, Server Technologies 1 Copyright 2011, Oracle and/or its affiliates. All rights Big Data, Advanced Analytics:
Real-Time Big Data Analytics + Internet of Things (IoT) = Value Creation
Real-Time Big Data Analytics + Internet of Things (IoT) = Value Creation January 2015 Market Insights Report Executive Summary According to a recent customer survey by Vitria, executives across the consumer,
Datenverwaltung im Wandel - Building an Enterprise Data Hub with
Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees
Dell Reference Configuration for DataStax Enterprise powered by Apache Cassandra
Dell Reference Configuration for DataStax Enterprise powered by Apache Cassandra A Quick Reference Configuration Guide Kris Applegate [email protected] Solution Architect Dell Solution Centers Dave
HDP Enabling the Modern Data Architecture
HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,
How to Enhance Traditional BI Architecture to Leverage Big Data
BIG DATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...
5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
An Oracle White Paper June 2013. Oracle: Big Data for the Enterprise
An Oracle White Paper June 2013 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5 Infrastructure
Interactive data analytics drive insights
Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has
Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch's analytics software stack
Elasticsearch on Cisco Unified Computing System: Optimizing your UCS infrastructure for Elasticsearch's analytics software stack HIGHLIGHTS Real-Time Results Elasticsearch on Cisco UCS enables a deeper
Oracle Big Data SQL Technical Update
Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical
Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth
MAKING BIG DATA COME ALIVE Big Data Architecture & Analytics A comprehensive approach to harness big data architecture and analytics for growth Steve Gonzales, Principal Manager [email protected]
The Inside Scoop on Hadoop
The Inside Scoop on Hadoop Orion Gebremedhin National Solutions Director BI & Big Data, Neudesic LLC. VTSP Microsoft Corp. [email protected] [email protected] @OrionGM The Inside Scoop
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control
Developing Scalable Smart Grid Infrastructure to Enable Secure Transmission System Control EP/K006487/1 UK PI: Prof Gareth Taylor (BU) China PI: Prof Yong-Hua Song (THU) Consortium UK Members: Brunel University
BIG DATA & DATA SCIENCE
BIG DATA & DATA SCIENCE ACADEMY PROGRAMS IN-COMPANY TRAINING PORTFOLIO 2 TRAINING PORTFOLIO 2016 Synergic Academy Solutions BIG DATA FOR LEADING BUSINESS Big data promises a significant shift in the way
Big Data Open Source Stack vs. Traditional Stack for BI and Analytics
Big Data Open Source Stack vs. Traditional Stack for BI and Analytics Part I By Sam Poozhikala, Vice President Customer Solutions at StratApps Inc. 4/4/2014 You may contact Sam Poozhikala at [email protected].
Real Time Big Data Processing
Real Time Big Data Processing Cloud Expo 2014 Ian Meyers Amazon Web Services Global Infrastructure Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
An Oracle White Paper October 2011. Oracle: Big Data for the Enterprise
An Oracle White Paper October 2011 Oracle: Big Data for the Enterprise Executive Summary... 2 Introduction... 3 Defining Big Data... 3 The Importance of Big Data... 4 Building a Big Data Platform... 5
SAP and Hortonworks Reference Architecture
SAP and Hortonworks Reference Architecture Hortonworks. We Do Hadoop. June Page 1 2014 Hortonworks Inc. 2011 2014. All Rights Reserved A Modern Data Architecture With SAP DATA SYSTEMS APPLICATIONS Statistical
Cloudwick. CLOUDWICK LABS Big Data Research Paper. Nebula: Powering Enterprise Private & Hybrid Cloud for DataStax Big Data
Nebula: Powering Enterprise Private & Hybrid Cloud for DataStax Big Data was commissioned to evaluate and test the Nebula One Private and Hybrid Cloud Appliance using DataStax, a leading Apache Cassandra
Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.
Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner
IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS
The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader
Big Data Explained. An introduction to Big Data Science.
Big Data Explained An introduction to Big Data Science. 1 Presentation Agenda What is Big Data Why learn Big Data Who is it for How to start learning Big Data When to learn it Objective and Benefits of
THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS
THE DEVELOPER GUIDE TO BUILDING STREAMING DATA APPLICATIONS WHITE PAPER Successfully writing Fast Data applications to manage data generated from mobile, smart devices and social interactions, and the
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy. Presented by: Trishla Maru
Tapping into Hadoop and NoSQL Data Sources in MicroStrategy Presented by: Trishla Maru Agenda Big Data Overview All About Hadoop What is Hadoop? How does MicroStrategy connect to Hadoop? Customer Case
Elastic Application Platform for Market Data Real-Time Analytics for E-Commerce
Elastic Application Platform for Market Data Real-Time Analytics for E-Commerce Can you deliver real-time pricing, on high-speed market data, for real-time critical decisions? Market Data Analytics applications
Introducing Oracle Exalytics In-Memory Machine
Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle
How To Handle Big Data With A Data Scientist
III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution
