Actian SQL in Hadoop Buyer s Guide

Size: px
Start display at page:

Download "Actian SQL in Hadoop Buyer s Guide"

Transcription

1 Actian SQL in Hadoop Buyer s Guide

2 Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop Decision Criteria... 6 Actian: The High- performance SQL in Hadoop Database... 8 Summary Learn more Try it for free!

3 Introduction: Big Data and Hadoop When most people think of big data, they also think of Hadoop. Why? Because Hadoop has emerged as the de facto data operating system platform to address the biggest challenge posed by big data: ginormous amounts of data continuously being generated. Organizations have come to recognize that they can use Hadoop to store massive data sets by distributing them across inexpensive commodity servers at a tenth of the cost of storing data in a data warehouse. All this Hadoop data stored in an ever- growing reservoir provides an incredible opportunity for organizations who can mine this massive big data to create transformational business outcomes, including: Delighting customers with exceptional service to increase revenue or reduce churn. Responding faster or using new insights to create a competitive advantage. Reducing risk and fraud, which impact the bottom line of every organization. Uncovering opportunities for new product or service offerings. However, while Hadoop has helped address the big data challenge of how to inexpensively store large amounts of data of all shapes and sizes, it has fallen short as an analytics platform to deliver on the promise of big data. Partly, because of the limitations of the Hadoop technology: Not a database Hadoop consists of multiple components but fundamentally is a file system Hadoop Distributed File System (HDFS) with a specialized programming framework (MapReduce) to build programs that can process the data stored in HDFS. It is not easy to find individual records or record sets. And, Hadoop lacks most of the capabilities of a typical database to organize, access, and manage data. Batch vs low latency Hadoop is designed for large batch processing where it can take hours or even days to analyze a single data set. Because Hadoop is a file system, it requires scanning all files just to find a single record. Batch processing also makes it difficult for data scientists to explore data and build analytic models. Moreover, Hadoop cannot support the interactive, ad hoc queries required by most business analysts and applications. Programming required Hadoop requires the use of complex, specialized MapReduce programs to process data. MapReduce programmers are in short supply and rarely understand the business objectives for how the Hadoop data is to be used. Plus, writing, testing, and running a MapReduce job to prepare and analyze data often can take weeks. This can become a huge bottleneck as every data request from a data scientist or business analyst must go through a Java programmer. Thus, the need for rare and expensive skillsets, long and error- prone implementation cycles, lack of support for popular reporting and BI tools, and inadequate execution speed have led to a search for an alternative that combines all of the benefits of Hadoop with SQL the world s most widely used data querying language. This guide is designed to help organizations understand what capabilities are most important when making a SQL on Hadoop solution purchase decision. 3

4 SQL on Hadoop Benefits SQL is THE standard trade language for anyone who uses databases and works with data. There are more than one million SQL trained users and only 150,000 MapReduce programmers (with an estimated shortage of 170,000). Almost every enterprise application and business analyst tool uses SQL to manage data. It makes sense that SQL should be the way business users want to access Hadoop data. Here are six more reasons why it makes sense to consider a SQL on Hadoop solution: 1. Use existing SQL trained users no retraining needed for user access to Hadoop data. 2. Use existing BI tools immediate productivity using existing SQL- based business intelligence and visualization tools. 3. Use existing SQL apps no need to rewrite queries to access Hadoop- based data. 4. No hunting for Hadoop skills no searching for hard to find, expensive MapReduce programmers. 5. No transferring of Hadoop data no need to move the data out of Hadoop in order for business users to be able to use SQL to access Hadoop data. 6. No managing duplicate Hadoop data duplicating and maintaining another copy of Hadoop data in a separate data management environment negates the cost- benefit of Hadoop and also requires additional IT resources for support. Approaches to SQL on Hadoop The SQL on Hadoop marketplace is crowded with solutions that broadly can be viewed in three main categories: Hadoop Connector: With this approach, the organization must deploy both a Hadoop cluster and a DBMS cluster, on the same or separate hardware, and use a connector to pass data back and forth between the two systems. The approach is expensive and hard to manage and, most often, is adopted by traditional data warehouse vendors. Solutions that fall into this category include HP Vertica, Teradata, and Oracle. SQL and Hadoop: In this approach, vendors have taken an existing SQL engine and modified it such that when generating a query execution plan, it can determine which parts of the query should be executed via MapReduce and which parts should be executed via SQL operators. Data that is processed via SQL operators is copied from the HDFS into a local table structure. This again requires management of data in both HDFS and local tables. Solutions that fall into this category include Hadapt, RainStor, Citus Data, and Splice Machine. SQL on Hadoop: These vendors are building SQL engines from the ground up that enable native SQL processing of data in HDFS while avoiding the use of MapReduce. These products have limited SQL language support, rudimentary query optimizers that can require handcrafting queries to specify the join order, and no support for trickle updates. Product immaturity is reflected in the lack of support for internationalization, limited security features, and the lack of workload management. Solutions that fall into this category include Impala, Drill, Stinger, and HAWQ. 4

5 Unfortunately, all of these categories fall short of what is needed to deliver true SQL access to Hadoop data. Thus, a new category called SQL in Hadoop is needed for solutions that provide an industrialized, high- performance analytics database able to run natively in Hadoop on top of HDFS. The Top 10 SQL in Hadoop Capabilities Actian has worked with hundreds of customers to understand their big data requirements, including SQL access to Hadoop data. Here are the top 10 key capabilities required of a true SQL in Hadoop solution. 1. Comprehensive: A SQL in Hadoop capability shouldn t be used in isolation. Organizations need to view SQL in Hadoop as part of a complete analytics solution that covers the end- to- end process from connecting to raw big data, analyzing data natively on Hadoop, all the way through delivering analytics results that impact business. Capabilities should include data integration; data blending and enrichment; discovery analytics and data science workbench; high- performance analytical processing; and SQL access to analytical results all natively on Hadoop. 2. Data Quality: Organizations need to ensure that Hadoop data accessed by business users has been curated for high- quality data. This especially is true if businesses will use the Hadoop data for operational decision making. The SQL in Hadoop solution should be part of an analytics platform that supports data blending, enrichment, and quality validation. 3. SQL: The solution needs to support standard ANSI SQL to make it easy and flexible to work with SQL- based applications, as well as standard business intelligence or visualization tools. This also ensures that any existing SQL queries will be able to access Hadoop data without the need for modification. In addition, the SQL in Hadoop capability should support advanced analytics, such as cubing and window functions. 4. Performance Optimized: The solution should be mature technology strongly rooted in data management with published TPC benchmarks. It should include a proven cost- based query planner and optimizer that make optimal use of all available resources from the node, memory, cache, and all the way down to the CPU. 5. Security: The solution should include native security expected of any enterprise quality database management system (DBMS), including authentication, user- and role- based security, data protection, and encryption. 6. Read/Write: The ability to not only access but also update/append Hadoop data is important. The SQL in Hadoop solution should be able to both read and write data to the HDFS nodes. It also should be fully ACID- compliant, with multi- version read consistency, plus system- wide failover protection. 7. Compression: Hadoop can reduce storage costs but it still requires replicating copies for high availability. The SQL in Hadoop solution should compress data by at least a factor of 10 and store it in a native columnar format for faster SQL performance. 5

6 8. Manageability: The SQL in Hadoop solution should be certified as YARN- ready to ensure it uses the Hadoop YARN capability to automatically manage resources and nodes on the cluster. The solution also should include a Web- based systems management console to monitor analytics and query processing. 9. Architecture: The SQL in Hadoop solution should be able to expand and scale as the size of the cluster grows. It should be able to handle extremely complex queries, thousands of nodes and SQL users, and petabytes and more of data. 10. Native: SQL in Hadoop should run natively without the need to move the data off the HDFS nodes into a separate database or file system. It also should be flexible to support the most commonly used Hadoop distributions including, Apache, Cloudera, Hortonworks, and MapR. SQL in Hadoop Decision Criteria Choosing a SQL in Hadoop solution is a critical decision that will have a long- term impact on your application infrastructure and your ability to scale to support the needs of business users. While you may not need a solution that meets all of these criteria immediately, it s important to consider how your requirements may change over time. 1. Comprehensive: Is the solution part of a complete analytics platform with: Connectors to all types of disparate data (DBMS, file systems, social media, machine/sensors, SaaS applications)? Parallelized integration and loading of data into Hadoop? Visual data science workbench to build re- usable data and analytics workflows? Comprehensive set of hundreds of analytics and transformation functions? Ability to analyze massive amounts of structured and unstructured data without need for sampling? High- performance data flow engine that is 10x faster than MapReduce for processing analytic workflows? Ability to blend and enrich Hadoop data with non- Hadoop data to improve context and improve analytical accuracy? 2. Data Quality: If business users are to use the Hadoop data for operational decision making, is the solution part of a complete analytics platform with ability to perform: Profiling to assess the data to discover inconsistencies? Data cleansing and validation to improve data quality and consistency? 3. SQL: Does the solution support: Standard ANSI SQL enabling the use of existing SQL without rewrite? Advanced analytics, including cubing, grouping, and window functions? Standard enterprise and desktop BI and visualization tools? 6

7 4. Performance Optimized: Does the solution support/use/have: Mature and proven cost- based query planner? Optimal use of all available resources, including node, memory, cache, and CPU? Use cases requiring responses in milliseconds to seconds versus minutes to hours? Fast lookups on small subsets of the data? Fast performance on massive joins? Published performance benchmarks? 5. Security: Does the solution support: Role- based security? Authentication through LDAP or Active Directory? Column- level data encryption? User- level and role- level data access? 6. Read/Write: Does the solution use/provide: Ability to both read and append/update data in HDFS? Concurrent reads and updates without locking up the database? Full ACID- compliance with multi- version read consistency? 7. Compression: Does the solution use/provide: Compress the data by at least a factor of 10 to reduce the amount of Hadoop storage? Store the data in a columnar format for faster access? 8. Manageability: Does the solution use/provide: YARN for automated Hadoop cluster resource management? Certification as YARN- ready? Web- based management console for monitoring analytic/query processing? 9. Architecture: Is/does the solution: Based on a mature, proven data management platform? Scale linearly to handle thousands of users, nodes, and petabytes of data? Provide system- wide failover protection? 10. Native: Is/does the solution: Run natively on Hadoop (HDFS) without the need to move the data or use a separate integration layer? Natively run on the most common Hadoop distributions, such as Apache, Cloudera, Hortonworks, or MapR 11. Vendor: Is/does the vendor: Built upon years of deep data management and Hadoop experience? Well- financed and stable? Have offices worldwide? Have global customer support? Have a strong customer base? 7

8 Actian: The High-performance SQL in Hadoop Database Actian Vector in Hadoop uniquely offers the first true SQL in Hadoop solution. Actian Vector in Hadoop, a key capability of the Actian Analytics Platform Hadoop SQL Edition, represents the next generation of innovation at Actian and employs a unique approach for bringing all of the benefits of industrialized, high- performance SQL together with Hadoop. Figure 1: Actian Analytics Platform Hadoop SQL Edition Capabilities 8

9 Actian Vector in Hadoop contains a mature RDBMS engine that performs native SQL processing of data in HDFS. It has rich SQL language support, an advanced query optimizer, support for trickle updates, and has been certified for use with the most popular BI tools. It is built from mature vector database technology that has been hardened in the enterprise and includes support for localization and internationalization, advanced security features, and workload management. Plus, it has been benchmarked to perform more than 30 times faster than other approaches to SQL on Hadoop. Figure 2: Actian Analytics Platform offers a fully functional analytics database in Hadoop To understand what s special about Actian Vector in Hadoop, here are key performance features that make it unique: 1. CPU Exploitation: Actian Vector was written from the ground up to take advantage of performance features in modern CPUs, resulting in dramatically higher data processing rates compared to other relational databases. Actian Vector in Hadoop leverages these innovations and brings this unbridled processing power to the data nodes in a Hadoop cluster. 9

10 2. Single Instruction, Multiple Data (SIMD): SIMD enables a single operation to be applied on every entity in a set of data all at once. Actian Vector takes advantage of SIMD instructions by processing vectors of data through the Streaming SIMD Extensions instruction set. Because typical data analysis queries process large volumes of data, the SIMD results in the average computation against a single data value taking less than a single CPU cycle. 3. Parallel Execution: Actian implements a flexible, adaptive parallel execution algorithm and can be scaled- up or scaled- out to meet specific workload requirements. Actian Vector can execute statements in parallel using any number of CPU cores on a server or across any number of data nodes on a Hadoop cluster. Taking the raw power of the Vector data processing engine to the HDFS data is what gives Actian Vector in Hadoop its unique performance characteristics. 4. Updates via Positional Delta Trees (PDTs): Actian Vector in Hadoop implements a fully ACID- compliant transactional database with multi- version read consistency. One of the biggest challenges with HDFS is that it is not designed for incremental updates. Actian addresses this challenge with high- performance, in- memory Positional Delta Trees (PDTs), which are used to store small incremental changes (inserts that are not appends), as well as updates and deletes. 5. Out- of- the- box YARN Support: YARN provides resource negotiation and management for the entire Hadoop cluster and all applications running on it. Actian Vector is the first SQL in Hadoop capability certified as YARN- ready. This means Actian query workloads can run as first- class citizens on Hadoop clusters, sharing resources side- by- side with MapReduce- based applications. Summary If you need to analyze large volumes of data on Hadoop and deliver scalable, enterprise- grade SQL access to Hadoop data to your business, you should strongly consider Actian for its industrialized, high- performance SQL in Hadoop solution. Actian s SQL in Hadoop is the foundation for revolutionary performance gains in database processing gains that are so game- changing that they appear on our competitor s long- term roadmaps. Implement an easy to deploy, easy to use, ANSI- compliant solution and benefit from significantly better query performance than any other SQL on Hadoop solution. Learn more Learn more about the Actian Analytics Platform Hadoop SQL Edition, which includes Actian Vector in Hadoop, our high- performance SQL in Hadoop capability: actian.com/hadoop Try it for free! Experience the difference of having a true analytics database for high- performance SQL access to Hadoop data. Download and try out Actian s SQL in Hadoop capability for 30 days for free: bigdata.actian.com/sql- in- hadoop 10

11 About Actian: Accelerating Big Data 2.0 Actian transforms big data into business value for any organization not just the privileged few. sources of revenue, business opportunities, and ways of mitigating risk with high- performance, in- database analytics complemented with extensive connectivity and data preparation. Actian makes Hadoop enterprise- grade by providing high- performance data blending and enrichment, visual design, and SQL analytics on Hadoop without the need for MapReduce skills. Among the tens of thousands of organizations using Actian are innovators using analytics for a competitive advantage in industries such as financial services, telecommunications, digital media, healthcare, and retail. The company is headquartered in Silicon Valley and has offices worldwide Arguello Street, Ste. 200, Redwood City, CA [Toll Free] [Tel] 2014 Actian Corporation. Actian, Big Data for the Rest of Us, Accelerating Big Data 2.0, and Actian Analytics Platform are trademarks of Actian and its subsidiaries. All other trademarks, trade names, service marks, and logos referenced herein belong to their respective companies. (WP )

Actian Vector in Hadoop

Actian Vector in Hadoop Actian Vector in Hadoop Industrialized, High-Performance SQL in Hadoop A Technical Overview Contents Introduction...3 Actian Vector in Hadoop - Uniquely Fast...5 Exploiting the CPU...5 Exploiting Single

More information

Actian SQL Analytics in Hadoop

Actian SQL Analytics in Hadoop Actian SQL Analytics in Hadoop The Fastest, Most Industrialized SQL in Hadoop A Technical Overview 2015 Actian Corporation. All Rights Reserved. Actian product names are trademarks of Actian Corp. Other

More information

Using Tableau Software with Hortonworks Data Platform

Using Tableau Software with Hortonworks Data Platform Using Tableau Software with Hortonworks Data Platform September 2013 2013 Hortonworks Inc. http:// Modern businesses need to manage vast amounts of data, and in many cases they have accumulated this data

More information

Splice Machine: SQL-on-Hadoop Evaluation Guide www.splicemachine.com

Splice Machine: SQL-on-Hadoop Evaluation Guide www.splicemachine.com REPORT Splice Machine: SQL-on-Hadoop Evaluation Guide www.splicemachine.com The content of this evaluation guide, including the ideas and concepts contained within, are the property of Splice Machine,

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Interactive data analytics drive insights

Interactive data analytics drive insights Big data Interactive data analytics drive insights Daniel Davis/Invodo/S&P. Screen images courtesy of Landmark Software and Services By Armando Acosta and Joey Jablonski The Apache Hadoop Big data has

More information

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX

UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX 1 Successful companies know that analytics are key to winning customer loyalty, optimizing business processes and beating their

More information

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

Vectorwise 3.0 Fast Answers from Hadoop. Technical white paper

Vectorwise 3.0 Fast Answers from Hadoop. Technical white paper Vectorwise 3.0 Fast Answers from Hadoop Technical white paper 1 Contents Executive Overview 2 Introduction 2 Analyzing Big Data 3 Vectorwise and Hadoop Environments 4 Vectorwise Hadoop Connector 4 Performance

More information

Actian Vortex Express 3.0

Actian Vortex Express 3.0 Actian Vortex Express 3.0 Quick Start Guide AH-3-QS-09 This Documentation is for the end user's informational purposes only and may be subject to change or withdrawal by Actian Corporation ("Actian") at

More information

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?

Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Bringing the Power of SAS to Hadoop. White Paper

Bringing the Power of SAS to Hadoop. White Paper White Paper Bringing the Power of SAS to Hadoop Combine SAS World-Class Analytic Strength with Hadoop s Low-Cost, Distributed Data Storage to Uncover Hidden Opportunities Contents Introduction... 1 What

More information

VIEWPOINT. High Performance Analytics. Industry Context and Trends

VIEWPOINT. High Performance Analytics. Industry Context and Trends VIEWPOINT High Performance Analytics Industry Context and Trends In the digital age of social media and connected devices, enterprises have a plethora of data that they can mine, to discover hidden correlations

More information

Tap into Hadoop and Other No SQL Sources

Tap into Hadoop and Other No SQL Sources Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data

More information

Ganzheitliches Datenmanagement

Ganzheitliches Datenmanagement Ganzheitliches Datenmanagement für Hadoop Michael Kohs, Senior Sales Consultant @mikchaos The Problem with Big Data Projects in 2016 Relational, Mainframe Documents and Emails Data Modeler Data Scientist

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

TUT NoSQL Seminar (Oracle) Big Data

TUT NoSQL Seminar (Oracle) Big Data Timo Raitalaakso +358 40 848 0148 rafu@solita.fi TUT NoSQL Seminar (Oracle) Big Data 11.12.2012 Timo Raitalaakso MSc 2000 Work: Solita since 2001 Senior Database Specialist Oracle ACE 2012 Blog: http://rafudb.blogspot.com

More information

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering

MySQL and Hadoop: Big Data Integration. Shubhangi Garg & Neha Kumari MySQL Engineering MySQL and Hadoop: Big Data Integration Shubhangi Garg & Neha Kumari MySQL Engineering 1Copyright 2013, Oracle and/or its affiliates. All rights reserved. Agenda Design rationale Implementation Installation

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Please give me your feedback

Please give me your feedback Please give me your feedback Session BB4089 Speaker Claude Lorenson, Ph. D and Wendy Harms Use the mobile app to complete a session survey 1. Access My schedule 2. Click on this session 3. Go to Rate &

More information

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software

Real-Time Big Data Analytics SAP HANA with the Intel Distribution for Apache Hadoop software Real-Time Big Data Analytics with the Intel Distribution for Apache Hadoop software Executive Summary is already helping businesses extract value out of Big Data by enabling real-time analysis of diverse

More information

Big Data and Big Data Modeling

Big Data and Big Data Modeling Big Data and Big Data Modeling The Age of Disruption Robin Bloor The Bloor Group March 19, 2015 TP02 Presenter Bio Robin Bloor, Ph.D. Robin Bloor is Chief Analyst at The Bloor Group. He has been an industry

More information

Next-Generation Cloud Analytics with Amazon Redshift

Next-Generation Cloud Analytics with Amazon Redshift Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Native Connectivity to Big Data Sources in MSTR 10

Native Connectivity to Big Data Sources in MSTR 10 Native Connectivity to Big Data Sources in MSTR 10 Bring All Relevant Data to Decision Makers Support for More Big Data Sources Optimized Access to Your Entire Big Data Ecosystem as If It Were a Single

More information

Microsoft Big Data. Solution Brief

Microsoft Big Data. Solution Brief Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,

More information

Tap into Big Data at the Speed of Business

Tap into Big Data at the Speed of Business SAP Brief SAP Technology SAP Sybase IQ Objectives Tap into Big Data at the Speed of Business A simpler, more affordable approach to Big Data analytics A simpler, more affordable approach to Big Data analytics

More information

Quantcast Petabyte Storage at Half Price with QFS!

Quantcast Petabyte Storage at Half Price with QFS! 9-131 Quantcast Petabyte Storage at Half Price with QFS Presented by Silvius Rus, Director, Big Data Platforms September 2013 Quantcast File System (QFS) A high performance alternative to the Hadoop Distributed

More information

Introducing Oracle Exalytics In-Memory Machine

Introducing Oracle Exalytics In-Memory Machine Introducing Oracle Exalytics In-Memory Machine Jon Ainsworth Director of Business Development Oracle EMEA Business Analytics 1 Copyright 2011, Oracle and/or its affiliates. All rights Agenda Topics Oracle

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved.

Collaborative Big Data Analytics. Copyright 2012 EMC Corporation. All rights reserved. Collaborative Big Data Analytics 1 Big Data Is Less About Size, And More About Freedom TechCrunch!!!!!!!!! Total data: bigger than big data 451 Group Findings: Big Data Is More Extreme Than Volume Gartner!!!!!!!!!!!!!!!

More information

Il mondo dei DB Cambia : Tecnologie e opportunita`

Il mondo dei DB Cambia : Tecnologie e opportunita` Il mondo dei DB Cambia : Tecnologie e opportunita` Giorgio Raico Pre-Sales Consultant Hewlett-Packard Italiana 2011 Hewlett-Packard Development Company, L.P. The information contained herein is subject

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

The Next Wave of Data Management. Is Big Data The New Normal?

The Next Wave of Data Management. Is Big Data The New Normal? The Next Wave of Data Management Is Big Data The New Normal? Table of Contents Introduction 3 Separating Reality and Hype 3 Why Are Firms Making IT Investments In Big Data? 4 Trends In Data Management

More information

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat

ESS event: Big Data in Official Statistics. Antonino Virgillito, Istat ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web

More information

Imagine business analytics at the speed of thought

Imagine business analytics at the speed of thought Imagine business analytics at the speed of thought We are overwhelmed by the performance gains we have seen from Ingres VectorWise Database. The creation of prediction models was even faster than for inmemory

More information

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014

5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014 5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for

More information

The big data revolution

The big data revolution The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

More information

Big Data and Market Surveillance. April 28, 2014

Big Data and Market Surveillance. April 28, 2014 Big Data and Market Surveillance April 28, 2014 Copyright 2014 Scila AB. All rights reserved. Scila AB reserves the right to make changes to the information contained herein without prior notice. No part

More information

Big Data must become a first class citizen in the enterprise

Big Data must become a first class citizen in the enterprise Big Data must become a first class citizen in the enterprise An Ovum white paper for Cloudera Publication Date: 14 January 2014 Author: Tony Baer SUMMARY Catalyst Ovum view Big Data analytics have caught

More information

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM

A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM A REVIEW PAPER ON THE HADOOP DISTRIBUTED FILE SYSTEM Sneha D.Borkar 1, Prof.Chaitali S.Surtakar 2 Student of B.E., Information Technology, J.D.I.E.T, sborkar95@gmail.com Assistant Professor, Information

More information

locuz.com Big Data Services

locuz.com Big Data Services locuz.com Big Data Services Big Data At Locuz, we help the enterprise move from being a data-limited to a data-driven one, thereby enabling smarter, faster decisions that result in better business outcome.

More information

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved.

Mike Maxey. Senior Director Product Marketing Greenplum A Division of EMC. Copyright 2011 EMC Corporation. All rights reserved. Mike Maxey Senior Director Product Marketing Greenplum A Division of EMC 1 Greenplum Becomes the Foundation of EMC s Big Data Analytics (July 2010) E M C A C Q U I R E S G R E E N P L U M For three years,

More information

Using RDBMS, NoSQL or Hadoop?

Using RDBMS, NoSQL or Hadoop? Using RDBMS, NoSQL or Hadoop? DOAG Conference 2015 Jean- Pierre Dijcks Big Data Product Management Server Technologies Copyright 2014 Oracle and/or its affiliates. All rights reserved. Data Ingest 2 Ingest

More information

Data Refinery with Big Data Aspects

Data Refinery with Big Data Aspects International Journal of Information and Computation Technology. ISSN 0974-2239 Volume 3, Number 7 (2013), pp. 655-662 International Research Publications House http://www. irphouse.com /ijict.htm Data

More information

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions

G-Cloud Big Data Suite Powered by Pivotal. December 2014. G-Cloud. service definitions G-Cloud Big Data Suite Powered by Pivotal December 2014 G-Cloud service definitions TABLE OF CONTENTS Service Overview... 3 Business Need... 6 Our Approach... 7 Service Management... 7 Vendor Accreditations/Awards...

More information

In-Memory Analytics for Big Data

In-Memory Analytics for Big Data In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management

Big Data and New Paradigms in Information Management. Vladimir Videnovic Institute for Information Management Big Data and New Paradigms in Information Management Vladimir Videnovic Institute for Information Management 2 "I am certainly not an advocate for frequent and untried changes laws and institutions must

More information

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014

Forecast of Big Data Trends. Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Forecast of Big Data Trends Assoc. Prof. Dr. Thanachart Numnonda Executive Director IMC Institute 3 September 2014 Big Data transforms Business 2 Data created every minute Source http://mashable.com/2012/06/22/data-created-every-minute/

More information

Microsoft Analytics Platform System. Solution Brief

Microsoft Analytics Platform System. Solution Brief Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

Hadoop Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2012 2018

Hadoop Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2012 2018 Transparency Market Research Hadoop Market - Global Industry Analysis, Size, Share, Growth, Trends, and Forecast, 2012 2018 Buy Now Request Sample Published Date: July 2013 Single User License: US $ 4595

More information

Big Data and Hadoop for the Executive A Reference Guide

Big Data and Hadoop for the Executive A Reference Guide Big Data and Hadoop for the Executive A Reference Guide Overview The amount of information being collected by companies today is incredible. Wal- Mart has 460 terabytes of data, which, according to the

More information

BigMemory and Hadoop: Powering the Real-time Intelligent Enterprise

BigMemory and Hadoop: Powering the Real-time Intelligent Enterprise WHITE PAPER and Hadoop: Powering the Real-time Intelligent Enterprise BIGMEMORY: IN-MEMORY DATA MANAGEMENT FOR THE REAL-TIME ENTERPRISE Terracotta is the solution of choice for enterprises seeking the

More information

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW

AGENDA. What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story. Our BIG DATA Roadmap. Hadoop PDW AGENDA What is BIG DATA? What is Hadoop? Why Microsoft? The Microsoft BIG DATA story Hadoop PDW Our BIG DATA Roadmap BIG DATA? Volume 59% growth in annual WW information 1.2M Zetabytes (10 21 bytes) this

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

MapR: Best Solution for Customer Success

MapR: Best Solution for Customer Success 2015 MapR Technologies 2015 MapR Technologies 1 MapR: Best Solution for Customer Success Best Product High Growth 700+ Customers Premier Investors Apache Open Source 2X 2X Growth In Direct Customers Growth

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum

Big Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All

More information

Big Data Management and Security

Big Data Management and Security Big Data Management and Security Audit Concerns and Business Risks Tami Frankenfield Sr. Director, Analytics and Enterprise Data Mercury Insurance What is Big Data? Velocity + Volume + Variety = Value

More information

The Business Analyst s Guide to Hadoop

The Business Analyst s Guide to Hadoop White Paper The Business Analyst s Guide to Hadoop Get Ready, Get Set, and Go: A Three-Step Guide to Implementing Hadoop-based Analytics By Alteryx and Hortonworks (T)here is considerable evidence that

More information

SQL Server 2012 Performance White Paper

SQL Server 2012 Performance White Paper Published: April 2012 Applies to: SQL Server 2012 Copyright The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.

More information

Innovative technology for big data analytics

Innovative technology for big data analytics Technical white paper Innovative technology for big data analytics The HP Vertica Analytics Platform database provides price/performance, scalability, availability, and ease of administration Table of

More information

Cisco Data Preparation

Cisco Data Preparation Data Sheet Cisco Data Preparation Unleash your business analysts to develop the insights that drive better business outcomes, sooner, from all your data. As self-service business intelligence (BI) and

More information

Product Innovation with Big Data

Product Innovation with Big Data Product Innovation with Big Data A resource for software product organizations and enterprise IT groups Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of

More information

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand Build a Streamlined Data Refinery An enterprise solution for blended data that is governed, analytics-ready, and on-demand Introduction As the volume and variety of data has exploded in recent years, putting

More information

Empowering the Masses with Analytics

Empowering the Masses with Analytics Empowering the Masses with Analytics THE GAP FOR BUSINESS USERS For a discussion of bridging the gap from the perspective of a business user, read Three Ways to Use Data Science. Ask the average business

More information

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data

Understanding Your Customer Journey by Extending Adobe Analytics with Big Data SOLUTION BRIEF Understanding Your Customer Journey by Extending Adobe Analytics with Big Data Business Challenge Today s digital marketing teams are overwhelmed by the volume and variety of customer interaction

More information

The Five Most Common Big Data Integration Mistakes To Avoid O R A C L E W H I T E P A P E R A P R I L 2 0 1 5

The Five Most Common Big Data Integration Mistakes To Avoid O R A C L E W H I T E P A P E R A P R I L 2 0 1 5 The Five Most Common Big Data Integration Mistakes To Avoid O R A C L E W H I T E P A P E R A P R I L 2 0 1 5 Executive Summary Big Data projects have fascinated business executives with the promise of

More information

and Hadoop Technology

and Hadoop Technology SAS and Hadoop Technology Overview SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2015. SAS and Hadoop Technology: Overview. Cary, NC: SAS Institute

More information

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Oracle Big Data Appliance Releases 2.5 and 3.0 Ralf Lange Global ISV & OEM Sales Agenda Quick Overview on BDA and its Positioning Product Details and Updates Security and Encryption New Hadoop Versions

More information

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren

News and trends in Data Warehouse Automation, Big Data and BI. Johan Hendrickx & Dirk Vermeiren News and trends in Data Warehouse Automation, Big Data and BI Johan Hendrickx & Dirk Vermeiren Extreme Agility from Source to Analysis DWH Appliances & DWH Automation Typical Architecture 3 What Business

More information

SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here

SAP HANA PLATFORM Top Ten Questions for Choosing In-Memory Databases. Start Here PLATFORM Top Ten Questions for Choosing In-Memory Databases Start Here PLATFORM Top Ten Questions for Choosing In-Memory Databases. Are my applications accelerated without manual intervention and tuning?.

More information

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches

Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Powerful Duo: MapR Big Data Analytics with Cisco ACI Network Switches Introduction For companies that want to quickly gain insights into or opportunities from big data - the dramatic volume growth in corporate

More information

Constructing a Data Lake: Hadoop and Oracle Database United!

Constructing a Data Lake: Hadoop and Oracle Database United! Constructing a Data Lake: Hadoop and Oracle Database United! Sharon Sophia Stephen Big Data PreSales Consultant February 21, 2015 Safe Harbor The following is intended to outline our general product direction.

More information

Dell* In-Memory Appliance for Cloudera* Enterprise

Dell* In-Memory Appliance for Cloudera* Enterprise Built with Intel Dell* In-Memory Appliance for Cloudera* Enterprise Find out what faster big data analytics can do for your business The need for speed in all things related to big data is an enormous

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase

Architectural patterns for building real time applications with Apache HBase. Andrew Purtell Committer and PMC, Apache HBase Architectural patterns for building real time applications with Apache HBase Andrew Purtell Committer and PMC, Apache HBase Who am I? Distributed systems engineer Principal Architect in the Big Data Platform

More information

Hadoop Ecosystem B Y R A H I M A.

Hadoop Ecosystem B Y R A H I M A. Hadoop Ecosystem B Y R A H I M A. History of Hadoop Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open

More information

Data Modeling for Big Data

Data Modeling for Big Data Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes

More information

SAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI. May 2013

SAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI. May 2013 SAP BusinessObjects Business Intelligence 4.1 One Strategy for Enterprise BI May 2013 SAP s Strategic Focus on Business Intelligence Core Self-service Mobile Extreme Social Core for innovation Complete

More information

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1

Powerful analytics. and enterprise security. in a single platform. microstrategy.com 1 Powerful analytics and enterprise security in a single platform microstrategy.com 1 Make faster, better business decisions with easy, powerful, and secure tools to explore data and share insights. Enterprise-grade

More information

Sybase IQ Supercharges Predictive Analytics

Sybase IQ Supercharges Predictive Analytics SOLUTIONS BROCHURE Sybase IQ Supercharges Predictive Analytics Deliver smarter predictions with Sybase IQ for SAP BusinessObjects users Optional Photos Here (fill space) www.sybase.com SOLUTION FEATURES

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise

An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our

More information

Harnessing the power of advanced analytics with IBM Netezza

Harnessing the power of advanced analytics with IBM Netezza IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced

More information

Understanding How Sensage Compares/Contrasts with Hadoop

Understanding How Sensage Compares/Contrasts with Hadoop Frequently Asked Questions Understanding How Sensage Compares/Contrasts with Hadoop 1. How does Sensage s approach to managing large, distributed data systems compare/contrast with Hadoop in terms of storage,

More information

SQL Server 2012 Parallel Data Warehouse. Solution Brief

SQL Server 2012 Parallel Data Warehouse. Solution Brief SQL Server 2012 Parallel Data Warehouse Solution Brief Published February 22, 2013 Contents Introduction... 1 Microsoft Platform: Windows Server and SQL Server... 2 SQL Server 2012 Parallel Data Warehouse...

More information

SQL Server 2005 Features Comparison

SQL Server 2005 Features Comparison Page 1 of 10 Quick Links Home Worldwide Search Microsoft.com for: Go : Home Product Information How to Buy Editions Learning Downloads Support Partners Technologies Solutions Community Previous Versions

More information

An Oracle White Paper May 2012. Oracle Database Cloud Service

An Oracle White Paper May 2012. Oracle Database Cloud Service An Oracle White Paper May 2012 Oracle Database Cloud Service Executive Overview The Oracle Database Cloud Service provides a unique combination of the simplicity and ease of use promised by Cloud computing

More information