WHY IN-MEMORY TECHNOLOGY WILL DOMINATE BIG DATA


In-Memory and the New BI
Robin Bloor, Ph.D.
WHITE PAPER

A Tale of Two Markets

Big data is primarily about business intelligence (BI). A few areas of big data activity are not focused on the opportunity to gather data from new sources, possibly combine it with data from established sources and then analyze it. For the most part, however, the goal of data analysis is, as always, to discover new knowledge about some area of an organization's activity and exploit it.

Big data itself may be gathered from a variety of sources: agents and partners up and down a supply chain; RFID data and data from embedded systems and sensors; mobile applications data; data from public data providers of many kinds, including the plethora of real-time data streams; social media data; data from the large number of log files (web site logs, computer logs, network logs, application logs, etc.); and more. Some of these data sources are new, but others were simply too voluminous to use until recently.

In some ways it is the forward march of Moore's Law, delivering ever more powerful technology at every level, that is responsible for the dawn of the Big Data Era. However, that computer power has been usefully complemented by the emergence of open source Hadoop and the rapid deployment capabilities of cloud computing. The fact is that organizations are investing in big data projects because they can afford to and because they expect to reap significant benefits.

Old BI and New BI

BI applications have evolved considerably since the 1990s, when business intelligence first began to flower. Nowadays, we can divide BI into two distinct areas of activity: old BI and new BI. The old BI applications include the traditional areas of business monitoring such as reporting, data visualizations, analytical processing, dashboards and KPIs. What we regard as new BI has developed out of the area of data analytics, which had, to some extent, been a cloistered activity, usually fed from a data warehouse and focused mainly on company data.
This is now spoken of as the domain of the data scientist. It includes exploration by data analysts and data scientists to reveal new insights that can be applied to the business, and advanced predictive analytics, which reveals knowledge that can be acted on swiftly, often to preemptively manage business situations (for example, to take advantage of a trend as it develops). These activities have been enlivened by the availability of new (big) data that was previously not subjected to analysis of any kind. Without wishing to disparage the useful contribution of old BI applications, it is clearly with the new BI applications that companies look for and find valuable insights into their business activities.

The Value of Information

From an IT perspective the new BI applications can be problematic to support. The need is for a highly productive environment that enables the data analyst to carry out analysis swiftly and to implement the discovered knowledge quickly, so that it is delivered to the individual or software application that can use it just in time or sooner.

Even so, there are three different latencies that an organization will experience with the new BI.

1. The Time to Discovery: The time it takes for a data scientist to explore a collection of data and discover useful knowledge in it.
2. The Time to Deployment: The time it takes to implement the discovered knowledge within the business processes that it can enrich.
3. Knowledge Delivery Time: The time it takes for the BI application to deliver its knowledge in real time.

New BI is distinctive in requiring very fast processing on large or even very large volumes of data: tens to hundreds of terabytes and beyond. And there is a simple economic imperative for carrying out new BI as swiftly as possible.

Consider, hypothetically, an analytical project where a data scientist discovers a data pattern that can save a business one million dollars per week when implemented. Now assume it takes the data scientist six months of analytical activity to make the discovery and prove the soundness of it: time to discovery is six months. Imagine, as is likely, that to utilize that discovery an analytical engine needs to be permanently analyzing streams of data, not only identifying the patterns but also following any trends they exhibit. It may take significant time, maybe three months, to implement this: time to deployment is three months. When implemented there will inevitably be a time lag before the analytical engine can feed usable knowledge to BI users, let's say five minutes: knowledge delivery time is five minutes.

If it were possible to cut the discovery time and the deployment time to one month each, the time to value would fall from nine months to two, a reduction of seven months. At roughly four weeks per month, that would be worth about an extra $28 million to the business, and that value would begin to be realized seven months earlier.

In some situations the knowledge delivery time will be an important factor. This is contextual and it is easiest to see in competitive situations. Consider the stock market.
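The time-to-value arithmetic in the hypothetical project above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the figure of four weeks per month is an assumption inferred from the $28 million total rather than something the example states.

```python
# Sketch of the time-to-value arithmetic from the hypothetical project.
# All names are illustrative; WEEKS_PER_MONTH = 4 is an assumption
# inferred from the $28 million figure quoted in the example.
WEEKS_PER_MONTH = 4


def time_to_value_months(discovery_months, deployment_months):
    """Time to value is the discovery time plus the deployment time."""
    return discovery_months + deployment_months


def value_of_acceleration(baseline_months, accelerated_months, weekly_saving,
                          weeks_per_month=WEEKS_PER_MONTH):
    """Return (months saved, extra value earned during those months)."""
    months_saved = baseline_months - accelerated_months
    return months_saved, months_saved * weeks_per_month * weekly_saving


baseline = time_to_value_months(6, 3)      # six months discovery, three deployment
accelerated = time_to_value_months(1, 1)   # one month each
months_saved, extra_value = value_of_acceleration(baseline, accelerated, 1_000_000)

print(f"Time to value cut by {months_saved} months, worth ${extra_value:,}")
```

A more exact conversion (about 4.3 weeks per month) would put the figure nearer $30 million, so the $28 million in the example is best read as a round, conservative estimate.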
Stock prices change in fractions of a second. If data analysis shows that a specific trigger will move a stock price, then you will make the most profit in the sale or purchase of that stock at the moment the trigger occurs. If you take action five minutes later, you may make less profit or miss the opportunity completely. Today, of course, automated trading happens in fractions of a second for very small stock price movements that obey known statistical patterns.

There are many other business situations where the delivery of knowledge needs to be very fast. Many are in web businesses: ad placement on web sites, web site behavior, computer gaming, the identification of trends on Twitter and so on. There are also many transportation, telecoms, retail and media applications.

The Strategic Deployment of In-Memory Technology

As a rough rule of thumb, reading from memory is more than 3,300 times faster than reading from disk. A simple calculation would suggest that if it takes an hour to read a set of information from disk, it would take just over a second to read it from memory. The difference is dramatic, but of course memory is more expensive than disk. The relative cost, at the time of writing, is about 90:1; a terabyte of disk costs about $50 and a terabyte of memory about $4,500. Clearly it makes no economic sense to hold all corporate data in memory, even given an unlimited budget, as most data is rarely accessed.

There are also other factors to consider. Solid state disk (SSD) has become an important data storage option. In practice SSDs speed up data access by a factor of between three and ten, depending on workload, so they do not come close to in-memory speeds. They are about ten times the cost per byte of HDD (normal disk) at the moment. The picture is further complicated by the fact that data volumes grow fairly rapidly for many businesses, and workloads can vary significantly from one BI project to another.

If the goal is to accelerate BI activities dramatically, the natural approach is to have an in-memory processing resource that can be used where it makes a difference, flowing the data from disk through SSD to memory in order to support those BI workloads. In other words, data is kept in memory when the value obtained from processing it is high, and data stays on disk when it is inactive or the value from processing it is low.

The Morphing of the Database

We have entered a stimulating era of database innovation. It began when the scalability of the traditional relational databases tailed off, leading to a focus on column-store and NoSQL databases with scale-out architectures that were better suited to processing very large amounts of data. At almost the same time, the inexpensive and highly scalable Hadoop appeared: not a database per se, but a very useful data store and platform. It can be, and is, used effectively as a data reservoir and as a staging area for flowing data to analytical databases. Although Hadoop was not engineered for performance in the way that many databases are, it enabled businesses to inexpensively collect much larger volumes of data than before.
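The rule-of-thumb figures quoted in the discussion of in-memory deployment (memory more than 3,300 times faster than disk, SSD three to ten times faster than disk, and the per-terabyte prices) can be turned into a quick back-of-envelope check. The sketch below uses only those quoted figures; the constant and function names are illustrative and not part of any product.

```python
# Back-of-envelope check of the rule-of-thumb figures quoted in the text.
# Constant and function names are illustrative only.
MEMORY_SPEEDUP = 3300            # memory vs disk read speed (rule of thumb)
SSD_SPEEDUP_RANGE = (3, 10)      # SSD vs disk, depending on workload
DISK_COST_PER_TB = 50            # US dollars, at the time of writing
MEMORY_COST_PER_TB = 4500        # US dollars, at the time of writing


def read_seconds(disk_seconds, speedup):
    """Estimated read time on a faster tier, given the disk read time."""
    return disk_seconds / speedup


one_hour = 3600
memory_seconds = read_seconds(one_hour, MEMORY_SPEEDUP)
ssd_best = read_seconds(one_hour, SSD_SPEEDUP_RANGE[1])
ssd_worst = read_seconds(one_hour, SSD_SPEEDUP_RANGE[0])
cost_ratio = MEMORY_COST_PER_TB / DISK_COST_PER_TB

print(f"Memory: {memory_seconds:.2f} s for an hour's worth of disk reads")
print(f"SSD: between {ssd_best:.0f} s and {ssd_worst:.0f} s")
print(f"Memory costs {cost_ratio:.0f} times as much per terabyte as disk")
```

On these numbers an hour of disk reads takes just over a second from memory but still six to twenty minutes from SSD, while memory costs about 90 times as much per terabyte as disk. That gap is why the text argues for keeping only high-value data in memory.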
The data collected was not just archived and log file data, but also external data from social media, public data sources, partners and suppliers. This capability was a blessing in giving companies more valuable data to mine, but it also increased the computer workload of data analyst activities significantly. As a consequence, for many businesses Hadoop has become an integral component of data exploration and data analysis activity. But if performance is important, and in this area it nearly always is, its use needs to be considered carefully.

In simple architectural terms, you can either move the processing to where the data is or you can move the data to where the processing is. When you have very large amounts of data, it will obviously be better to move the processing to the data. But Hadoop is not lightning fast, nor is it easy to program. This creates a definite challenge, especially if one needs the muscle that in-memory processing provides, because neither Hadoop nor any of its components have in-memory capabilities.

The nature of data analysis further complicates the situation. With analytics, the data analyst conducts a fairly extensive dialogue with the data that may involve many steps. In modeling a problem the analyst may read just a small sample of the data and interact with it using various statistical techniques. So the first query might hit all the data, perhaps hundreds of terabytes or even petabytes, to extract the required sample. Hadoop might process that query reasonably swiftly. But the rest of the data analyst's interactions, including data preparation and cleansing, might involve a relatively small amount of data which could be held in memory and against which various mathematical routines execute. This is not going to work well with Hadoop.

Once the data analyst is finished with modeling, he or she may wish to query a much larger volume of data, but again this might fit into memory if a sufficiently large memory configuration were available. And if not, then it might be best to flow the data to memory anyway to take maximum advantage of in-memory speed.

The important point is that the data analyst activity is an iterative conversation with the data, and for maximum speed, the in-memory resource needs to be used optimally, not just for querying, transforming and cleansing data, but also for applying mathematical functions to it. This demands more than an in-memory database; it requires an in-memory analytical database that can apply parallel processing both to data queries and to mathematical calculations. Such an architecture will produce two useful and profitable outcomes:

1. The data analyst achieves greater productivity.
2. The knowledge gained by the data analyst can be acted on more swiftly.

In-Memory Analytical Platforms

There are currently just a few software platforms that can work in the manner we have described. Here we describe one that is offered by Kognitio, illustrated in Figure 1.

[Figure 1. Kognitio Overview. An application layer (analyst tools, BI tools, MS Excel, OLAP clients) exchanges queries and results with the Kognitio Analytical Platform, which can be extended with SSD or near-line storage. The platform is fed from a persistence layer (Hadoop, EDW, cloud storage, other stores), which also supports reporting.]

Kognitio is both an in-memory database and an analytical engine that can execute analytical functions. As such it naturally fits in between a persistence layer, which stores persistent data, and a layer of applications that can exploit its capabilities. It can gather and/or store data from other data sources, including many databases, Hadoop and cloud storage services, and has the option to read from and write to available local storage from memory if memory resources become exhausted.

This analytical platform can be added incrementally into almost any computing environment. As the diagram indicates, while data feeds of any type might flow into the persistence layer, many reporting and BI functions can carry on as they did before. The customer is only likely to deploy this product for those BI tasks that will benefit from a big acceleration in execution time.

The database runs on commodity servers, usually with large amounts of configured memory on each and, possibly, extended by the addition of SSD or near-line storage. It employs a massively parallel processing (MPP) engine that scales linearly. It can be upgraded simply by adding more servers (i.e., more CPUs and memory) as needed. It loads data in parallel from the persistence layer, distributing it across the grid to create its in-memory data store and balancing the resources available to it. It can continue to ingest data in real time, without disturbing any of the queries it is executing. It serves a BI applications layer which can include data analyst tools and more conventional BI tools, including Microsoft Excel and OLAP clients.

Analytical Tasks

Kognitio supports the industry-standard SQL interface (via ODBC and JDBC) as well as MDX for OLAP queries. To satisfy MDX queries, it builds virtual in-memory cubes. However, it is not just a query engine. It can also execute analytical processing, running third-party binaries embedded directly within SQL commands. In this way, it delivers full MPP execution of WPS libraries, R, Python, Java and other languages, so that the whole workload is executed in parallel. By wrapping it in SQL, it can include any script or binary that will run under Linux. It can also be configured to isolate workloads.
It achieves this by treating part of the MPP configuration as a hub, automatically building external table instances on other nodes of the grid for each isolated workload. In this way, for example, data analyst workloads can be given a greater share of the available resources than other BI workloads and OLAP workloads.

Overall, the engine works in a distinctly different manner than, for example, a scalable column store database. As it has a full in-memory architecture, it does not locate data on disk. Thus there is no disk I/O when a query executes, only direct memory access. When analytical routines execute they run in parallel with the query, operating on the results of the query as data becomes available. With a column store database an analytical task can be thought of as a three-step process: reading data from disk into cache, answering the query and then applying the analytical routine to get a result. With an in-memory architecture, it is a one-step process that runs the query and the analytical calculation in parallel at a much greater speed.

Hadoop Integration

Naturally, situations will arise where the data is too large to fit into memory and disk I/O is necessary. Enter Hadoop. Nearly every vendor out there offers some sort of Hadoop integration, and for good reason: it can process very, very large volumes of data. Its interface, however, requires a level of expertise that many companies simply do not have. As a result, vendors have either developed native connectors to Hadoop or, more recently, put SQL right on top of Hadoop, both of which mitigate the skill set gap.

Kognitio has an interesting solution: it integrates directly with the Hadoop environment, both at the MapReduce level and the Hadoop Distributed File System (HDFS) level, and it offers a SQL interface. By placing an agent on every node within the Hadoop cluster, it can filter data from very large files after the manner of MapReduce, but at higher speeds. A SQL query, for example, will pass selections and relevant predicates to the agents. Filtering and projection are performed locally on each node, and only the data required to satisfy the query is transferred and loaded into memory, in parallel. Resolving a query to a large Hadoop cluster at the node level thus greatly reduces processing time.

Similarly, connecting directly to Hadoop's file system produces faster loads of complete files. For example, when a connector defines access to HDFS, it allows an external table to grab the row-based data that is stored within it. That data can then be pinned into memory to return the query results.

As mentioned, many vendors are offering, and have been offering, Hadoop connectors. It makes sense. It also makes sense to allow users to query data that is stored in Hadoop by using standard SQL. Hive is Hadoop's answer to SQL, but it must be configured. Using a SQL wrapper for Hadoop eliminates the need to learn and deploy yet another interface. The end user can continue to use SQL as a data management language and can query from Hadoop directly.

A tight integration with Hadoop essentially creates a separation of concerns. In other words, Hadoop is left to store and filter data, and the analytical engine can focus on complex analytics.

The Business Benefits of New BI

In-memory analytics is not just about time to value; it is also about volume processing.
While this may seem counter-intuitive, there are good reasons why this is so. The fact is that without an analytical engine of the kind we have described, some analytical activity will not even be attempted because the results will arrive far too late.

Part of this relates to data analyst productivity. If data analysts can get results in seconds or minutes instead of hours, then they can test many more hypotheses than before, and they will. They are also more likely to consider engaging in activities they previously thought impractical, such as adding to their data sources and pursuing new analytical projects. The point is that memory is three orders of magnitude (more than 1,000 times) faster than disk, and an increase in speed of that magnitude does not just open up possibilities for the data analyst, it completely changes the approach to analysis projects.

The same applies to other areas of BI. If BI dashboards can be accurate up-to-the-second then it may be possible to support individual decision making in the moment, rather than report after the fact what the optimum way to handle a specific opportunity or business problem might have been.
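The node-level filtering described under Hadoop Integration (predicates and column selections passed to an agent on each node, with only the matching data transferred into memory in parallel) can be illustrated with a small simulation. This is a generic sketch of predicate and projection pushdown, not Kognitio's or Hadoop's actual API: the node data, `node_agent` and `pushdown_query` are all hypothetical, and threads stand in for cluster nodes.

```python
# Generic simulation of predicate and projection pushdown.
# Everything here is hypothetical: it is not Kognitio's or Hadoop's API,
# and threads merely stand in for the nodes of a cluster.
from concurrent.futures import ThreadPoolExecutor

# Each inner list stands in for the rows stored on one node.
NODES = [
    [{"region": "EU", "sales": 120, "note": "a"},
     {"region": "US", "sales": 80, "note": "b"}],
    [{"region": "EU", "sales": 200, "note": "c"},
     {"region": "APAC", "sales": 50, "note": "d"}],
    [{"region": "US", "sales": 300, "note": "e"}],
]


def node_agent(rows, predicate, columns):
    """Filter and project locally, returning only what the query needs."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]


def pushdown_query(nodes, predicate, columns):
    """Run the agent on every node in parallel and gather the survivors."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda rows: node_agent(rows, predicate, columns), nodes)
        # Only the filtered, projected rows are brought together "in memory".
        return [row for part in partials for row in part]


result = pushdown_query(NODES, lambda r: r["region"] == "EU", ["region", "sales"])
total_eu_sales = sum(row["sales"] for row in result)
print(result, total_eu_sales)
```

Of the five stored rows, only the two matching rows (and only two of their three columns) leave the nodes, which is the effect the text attributes to resolving a query at the node level.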

About The Bloor Group

The Bloor Group is a consulting, research and technology analysis firm that focuses on open research and the use of modern media to gather knowledge and disseminate it to IT users. The Bloor Group is the sole copyright holder of this publication.

Austin, TX
www.BloorGroup.com

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

Actian SQL in Hadoop Buyer s Guide

Actian SQL in Hadoop Buyer s Guide Actian SQL in Hadoop Buyer s Guide Contents Introduction: Big Data and Hadoop... 3 SQL on Hadoop Benefits... 4 Approaches to SQL on Hadoop... 4 The Top 10 SQL in Hadoop Capabilities... 5 SQL in Hadoop

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS! The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities

Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities Technology Insight Paper Converged, Real-time Analytics Enabling Faster Decision Making and New Business Opportunities By John Webster February 2015 Enabling you to make the best technology decisions Enabling

More information

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process

ORACLE OLAP. Oracle OLAP is embedded in the Oracle Database kernel and runs in the same database process ORACLE OLAP KEY FEATURES AND BENEFITS FAST ANSWERS TO TOUGH QUESTIONS EASILY KEY FEATURES & BENEFITS World class analytic engine Superior query performance Simple SQL access to advanced analytics Enhanced

More information

Big Data and Big Data Modeling

Big Data and Big Data Modeling Big Data and Big Data Modeling The Age of Disruption Robin Bloor The Bloor Group March 19, 2015 TP02 Presenter Bio Robin Bloor, Ph.D. Robin Bloor is Chief Analyst at The Bloor Group. He has been an industry

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering

QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering QLIKVIEW INTEGRATION TION WITH AMAZON REDSHIFT John Park Partner Engineering June 2014 Page 1 Contents Introduction... 3 About Amazon Web Services (AWS)... 3 About Amazon Redshift... 3 QlikView on AWS...

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

BIG DATA TECHNOLOGY. Hadoop Ecosystem

BIG DATA TECHNOLOGY. Hadoop Ecosystem BIG DATA TECHNOLOGY Hadoop Ecosystem Agenda Background What is Big Data Solution Objective Introduction to Hadoop Hadoop Ecosystem Hybrid EDW Model Predictive Analysis using Hadoop Conclusion What is Big

More information

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy

Native Connectivity to Big Data Sources in MicroStrategy 10. Presented by: Raja Ganapathy Native Connectivity to Big Data Sources in MicroStrategy 10 Presented by: Raja Ganapathy Agenda MicroStrategy supports several data sources, including Hadoop Why Hadoop? How does MicroStrategy Analytics

More information

Einsatzfelder von IBM PureData Systems und Ihre Vorteile.

Einsatzfelder von IBM PureData Systems und Ihre Vorteile. Einsatzfelder von IBM PureData Systems und Ihre Vorteile demirkaya@de.ibm.com Agenda Information technology challenges PureSystems and PureData introduction PureData for Transactions PureData for Analytics

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

The big data revolution

The big data revolution The big data revolution Friso van Vollenhoven (Xebia) Enterprise NoSQL Recently, there has been a lot of buzz about the NoSQL movement, a collection of related technologies mostly concerned with storing

More information

Microsoft Analytics Platform System. Solution Brief

Microsoft Analytics Platform System. Solution Brief Microsoft Analytics Platform System Solution Brief Contents 4 Introduction 4 Microsoft Analytics Platform System 5 Enterprise-ready Big Data 7 Next-generation performance at scale 10 Engineered for optimal

More information

SQLSaturday #399 Sacramento 25 July, 2015. Big Data Analytics with Excel

SQLSaturday #399 Sacramento 25 July, 2015. Big Data Analytics with Excel SQLSaturday #399 Sacramento 25 July, 2015 Big Data Analytics with Excel Presenter Introduction Peter Myers Independent BI Expert Bitwise Solutions BBus, SQL Server MCSE, SQL Server MVP since 2007 Experienced

More information

In-Memory Analytics for Big Data

In-Memory Analytics for Big Data In-Memory Analytics for Big Data Game-changing technology for faster, better insights WHITE PAPER SAS White Paper Table of Contents Introduction: A New Breed of Analytics... 1 SAS In-Memory Overview...

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

Oracle Big Data Building A Big Data Management System

Oracle Big Data Building A Big Data Management System Oracle Big Building A Big Management System Copyright 2015, Oracle and/or its affiliates. All rights reserved. Effi Psychogiou ECEMEA Big Product Director May, 2015 Safe Harbor Statement The following

More information

Cloud Integration and the Big Data Journey - Common Use-Case Patterns

Cloud Integration and the Big Data Journey - Common Use-Case Patterns Cloud Integration and the Big Data Journey - Common Use-Case Patterns A White Paper August, 2014 Corporate Technologies Business Intelligence Group OVERVIEW The advent of cloud and hybrid architectures

More information

In-Database Analytics

In-Database Analytics Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing

More information

Getting Started Practical Input For Your Roadmap

Getting Started Practical Input For Your Roadmap Getting Started Practical Input For Your Roadmap Mike Ferguson Managing Director, Intelligent Business Strategies BA4ALL Big Data & Analytics Insight Conference Stockholm, May 2015 About Mike Ferguson

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Annex: Concept Note. Big Data for Policy, Development and Official Statistics New York, 22 February 2013

Annex: Concept Note. Big Data for Policy, Development and Official Statistics New York, 22 February 2013 Annex: Concept Note Friday Seminar on Emerging Issues Big Data for Policy, Development and Official Statistics New York, 22 February 2013 How is Big Data different from just very large databases? 1 Traditionally,

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

NoSQL for SQL Professionals William McKnight

NoSQL for SQL Professionals William McKnight NoSQL for SQL Professionals William McKnight Session Code BD03 About your Speaker, William McKnight President, McKnight Consulting Group Frequent keynote speaker and trainer internationally Consulted to

More information

From Spark to Ignition:

From Spark to Ignition: From Spark to Ignition: Fueling Your Business on Real-Time Analytics Eric Frenkiel, MemSQL CEO June 29, 2015 San Francisco, CA What s in Store For This Presentation? 1. MemSQL: A real-time database for

More information
