WHY IN-MEMORY TECHNOLOGY WILL DOMINATE BIG DATA
In-Memory and the New BI

Robin Bloor, Ph.D.

WHITE PAPER
A Tale of Two Markets

Big data is primarily about business intelligence (BI). A few areas of big data activity are not focused on the opportunity to gather data from new sources, possibly combine it with data from established sources and then analyze it; for the rest, the goal of data analysis is, as always, to discover new knowledge about some area of an organization's activity and to exploit it. Big data itself may be gathered from a variety of sources: agents and partners up and down a supply chain; RFID data and data from embedded systems and sensors; mobile application data; data from public data providers of many kinds, including the plethora of real-time data streams; social media data; data from the large number of log files (web site logs, computer logs, network logs, application logs, etc.); and more. Some of these data sources are new, but others were simply too voluminous to use until recently.

In some ways it is the forward march of Moore's Law, delivering ever more powerful technology at every level, that is responsible for the dawn of the big data era. However, raw computer power has been usefully complemented by the emergence of open source Hadoop and the rapid deployment capabilities of cloud computing. The fact is that organizations are investing in big data projects because they can now afford to and because they expect to reap significant benefits.

Old BI and New BI

BI applications have evolved considerably since the 1990s, when business intelligence first began to flower. Nowadays we can divide BI into two distinct areas of activity: old BI and new BI. The old BI applications include the traditional areas of business monitoring such as reporting, data visualization, analytical processing, dashboards and KPIs. What we regard as new BI has developed out of the area of data analytics, which had to some extent been a cloistered activity, usually fed from a data warehouse and focused mainly on company data.
This is now spoken of as the domain of the data scientist. It includes:

- Exploration by data analysts/data scientists to reveal new insights that can be applied to the business.
- Advanced predictive analytics, which reveals knowledge that can be acted on swiftly, often to preemptively manage business situations, i.e., to take advantage of a trend as it develops.

These activities have been enlivened by the availability of new (big) data that was previously not subjected to analysis of any kind. Without wishing to disparage the useful contribution of old BI applications, it is clearly with the new BI applications that companies look for, and find, valuable insights into their business activities.

The Value of Information

From an IT perspective, the new BI applications can be problematic to support. The need is for a highly productive environment that enables the data analyst to carry out analysis swiftly and to implement the discovered knowledge quickly, so that it is delivered to the individual or software application that can use it just-in-time or sooner.
Even so, there are three different latencies that an organization will experience with the new BI:

1. The Time to Discovery: the time it takes for a data scientist to explore a collection of data and discover useful knowledge in it.
2. The Time to Deployment: the time it takes to implement the discovered knowledge within the business processes that it can enrich.
3. The Knowledge Delivery Time: the time it takes for the BI application to deliver its knowledge in real time.

New BI is distinctive in requiring very fast processing on large or even very large volumes of data: tens to hundreds of terabytes and beyond. And there is a simple economic imperative for carrying out new BI as swiftly as possible. Consider, hypothetically, an analytical project where a data scientist discovers a data pattern that can save a business one million dollars per week when implemented. Now assume it takes the data scientist six months of analytical activity to make the discovery and prove its soundness: the time to discovery is six months. Imagine, as is likely, that to exploit that discovery an analytical engine needs to be permanently analyzing streams of data, not only identifying the pattern but following any trends it exhibits. It may take significant time, perhaps three months, to implement this: the time to deployment is three months. Once implemented, there will inevitably be a time lag before the analytical engine can feed usable knowledge to BI users; let's say five minutes. The knowledge delivery time is five minutes.

If it were possible to cut the discovery time and the deployment time to one month each, this would reduce the time to value by seven months. Assuming four weeks per month, that would be worth an extra $28 million to the business, and the value would begin to be realized seven months earlier.

In some situations the knowledge delivery time will also be an important factor. This is contextual, and it is easiest to see in competitive situations. Consider the stock market.
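The time-to-value arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch; the four-weeks-per-month simplification is ours, chosen to match the $28 million figure.

```python
# Back-of-the-envelope check of the hypothetical time-to-value example.
# Figures come from the text; four weeks per month is a simplification.

WEEKS_PER_MONTH = 4
SAVING_PER_WEEK = 1_000_000          # $1M saved per week once deployed

baseline_months = 6 + 3              # time to discovery + time to deployment
improved_months = 1 + 1              # both phases cut to one month each

months_saved = baseline_months - improved_months
extra_value = months_saved * WEEKS_PER_MONTH * SAVING_PER_WEEK

print(months_saved)                  # 7: value arrives seven months earlier
print(extra_value)                   # 28000000: an extra $28 million
```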
Stock prices change in fractions of a second. If data analysis shows that a specific trigger will move a stock price, then you will make the most profit on the sale or purchase of that stock at the moment the trigger occurs. If you take action five minutes later, you may make less profit or miss the opportunity completely. Today, of course, automated trading happens in fractions of a second for very small stock price movements that obey known statistical patterns. But there are many other business situations where the delivery of knowledge needs to be very fast. There are many in web businesses: ad placement on web sites, web site behavior, computer gaming, the identification of trends on Twitter and so on. There are also many transportation, telecoms, retail and media applications.

The Strategic Deployment of In-Memory Technology

As a rough rule of thumb, reading from memory is more than 3,300 times faster than reading from disk. A simple calculation suggests that if it takes an hour to read a set of information from disk, it would take just over a second to read it from memory. The difference is dramatic, but of course memory is more expensive than disk. The relative cost, at the time of writing, is about 100:1: a terabyte of disk costs about $50 and a terabyte of memory about $4,500. Clearly it makes no economic sense to hold all corporate data in memory, even given an unlimited budget, as most data is rarely accessed.

There are also other factors to consider. Solid state disk (SSD) has become an important data storage option. In practice SSDs speed up data access by a factor of between three and ten (it depends on the workload), so they do not come close to in-memory speeds. They are currently about ten times the cost per byte of HDD (normal disk). The picture is further complicated by the fact that data volumes grow fairly rapidly for many businesses, and workloads can vary significantly from one BI project to another.

If the goal is to accelerate BI activities dramatically, the natural approach is to have an in-memory processing resource that can be used where it makes a difference, flowing the data from disk through SSD to memory in order to support those BI workloads. In other words, data is kept in memory when the value obtained from processing it is high, and data stays on disk when it is inactive or the value from processing it is low.

The Morphing of the Database

We have entered a stimulating era of database innovation. It began when the scalability of the traditional relational databases tailed off, leading to a focus on column-store and NoSQL databases with scale-out architectures that were better suited to processing very large amounts of data. At almost the same time, the inexpensive and highly scalable Hadoop appeared: not a database per se, but a very useful data store and platform. It can be, and is, used effectively as a data reservoir and as a staging area for flowing data to analytical databases. Although Hadoop was not engineered for performance in the way that many databases are, it enabled businesses to inexpensively collect much larger volumes of data than before.
This was not just archived and log file data, but also external data from social media, public data sources, partners and suppliers. This capability was a blessing in giving companies more valuable data to mine, but it also increased the computing workload of data analyst activities significantly. As a consequence, for many businesses Hadoop has become an integral component of data exploration and data analysis activity. But if performance is important, and in this area it nearly always is, how Hadoop is used needs to be considered carefully. In simple architectural terms, you can either move the processing to where the data is or move the data to where the processing is. When you have very large amounts of data, it will obviously be better to move the processing to the data. But Hadoop is not lightning fast, nor is it easy to program. This creates a definite challenge, especially if one needs the muscle that in-memory processing provides, because neither Hadoop nor any of its components has in-memory capabilities.

The nature of data analysis further complicates the situation. With analytics, the data analyst conducts a fairly extensive dialogue with the data that may involve many steps. In modeling a problem the analyst may read just a small sample of the data and interact with it using various statistical techniques. So the first query might hit all the data, perhaps hundreds of terabytes or even petabytes, to extract the required sample. Hadoop might process that query reasonably swiftly. But the rest of the data analyst's interactions, including data preparation and cleansing, might involve a relatively small amount of data which could be held in memory and against which various mathematical routines execute. This is not going to work well with Hadoop. Once the data analyst has finished modeling, he or she may wish to query a much larger volume of data, but again this might fit into memory if a sufficiently large memory configuration were available. And if not, it might be best to flow the data to memory anyway to take maximum advantage of in-memory speed.

The important point is that data analyst activity is an iterative conversation with the data, and for maximum speed the in-memory resource needs to be used optimally: not just for querying, transforming and cleansing data, but also for applying mathematical functions to it. This demands more than an in-memory database; it requires an in-memory analytical database that can apply parallel processing both to data queries and to mathematical calculations. Such an architecture will produce two useful and profitable outcomes:

1. The data analyst achieves greater productivity.
2. The knowledge gained by the data analyst can be acted on more swiftly.

In-Memory Analytical Platforms

There are currently just a few software platforms that can work in the manner we have described. Here we describe one that is offered by Kognitio, illustrated in Figure 1.

[Figure 1. Kognitio Overview: an application layer of analyst tools, BI tools, MS Excel and OLAP clients exchanges queries and results with the Kognitio in-memory analytical platform, extended by SSD and near-line storage, which is fed from a persistence layer of Hadoop clusters, an EDW, cloud storage and other stores; reporting continues to run against the persistence layer.]

Kognitio is both an in-memory database and an analytical engine that can execute analytical functions. As such it naturally fits between a persistence layer, which stores persistent data, and a layer of applications that can exploit its capabilities. It can gather and/or store data from other data sources, including many databases, Hadoop and cloud storage services, and has the option to read from and write to available local storage if memory resources become exhausted. This analytical platform can be added incrementally to almost any computing environment. As the diagram indicates, while data feeds of any type might flow into the persistence layer, many reporting and BI functions can carry on as they did before. The customer is only likely to deploy this product for those BI tasks that will benefit from a big acceleration in execution time.

The database runs on commodity servers, usually with large amounts of memory configured on each and possibly extended by the addition of SSD or near-line storage. It employs a massively parallel processing (MPP) engine that scales linearly, and it can be upgraded simply by adding more servers (i.e., more CPUs and memory) as needed. It loads data in parallel at the persistence layer, distributing it across the grid to create its in-memory data store and balancing the resources available to it. It can continue to ingest data in real time without disturbing any of the queries it is executing. It serves a BI application layer which can include data analyst tools and more conventional BI tools, including Microsoft Excel and OLAP clients.

Analytical Tasks

Kognitio supports industry-standard SQL (via ODBC and JDBC) as well as MDX for OLAP queries. To satisfy MDX queries, it builds virtual in-memory cubes. However, it is not just a query engine. It can also execute analytical processing, running third-party binaries embedded directly within SQL commands. In this way it delivers full MPP execution of WPS libraries, R, Python, Java and other languages, so that the whole workload is executed in parallel. It can, by wrapping it in SQL, include any script or binary that will run under Linux. It can also be configured to isolate workloads.
It achieves this by treating part of the MPP configuration as a hub, automatically building external table instances on other nodes of the grid for each isolated workload. In this way, for example, data analyst workloads can be given a greater share of the available resources than other BI and OLAP workloads.

Overall, the engine works in a distinctly different manner from, for example, a scalable column-store database. As it has a fully in-memory architecture, it does not locate data on disk. Thus there is no disk I/O when a query executes, only direct memory access. When analytical routines execute, they run in parallel with the query, operating on the results of the query as data becomes available. With a column-store database, an analytical task can be thought of as a three-step process: reading data from disk into cache, answering the query and then applying the analytical routine to get a result. With an in-memory architecture, it is a one-step process that runs the query and the analytical calculation in parallel at much greater speed.

Hadoop Integration

Naturally, situations will arise where the data is too large to fit into memory and disk I/O is necessary. Enter Hadoop. Nearly every vendor offers some sort of Hadoop integration, and for good reason: Hadoop can process very, very large volumes of data. Its interface, however, requires a level of expertise that many companies simply cannot cater for. As a result, vendors have either developed native connectors to Hadoop or, more recently, put SQL right on top of Hadoop, both of which mitigate the skill-set gap.

Kognitio has an interesting solution: it integrates directly with the Hadoop environment, both at the MapReduce level and at the Hadoop Distributed File System (HDFS) level, and it offers a SQL interface. By placing an agent on every node within the Hadoop cluster, it can filter data from very large files after the manner of MapReduce, but at higher speeds. A SQL query, for example, will pass selections and relevant predicates to the agents. Filtering and projection are performed locally on each node, and only the data required to satisfy the query is transferred and loaded into memory, in parallel. Resolving a query at the node level of a large Hadoop cluster thus greatly reduces processing time. Similarly, connecting directly to Hadoop's file system produces faster loads of complete files. For example, when a connector defines access to HDFS, it allows an external table to grab the row-based data stored within it. That data can then be pinned into memory to return the query results.

As mentioned, many vendors are and have been offering Hadoop connectors. It makes sense. It also makes sense to allow users to query data that is stored in Hadoop by using standard SQL. Hive is Hadoop's answer to SQL, but it must be configured and learned. Using a SQL wrapper for Hadoop eliminates the need to learn and deploy yet another interface: the end user can continue to use SQL as a data management language and can query Hadoop directly. A tight integration with Hadoop essentially creates a separation of concerns: Hadoop is left to store and filter data, and the analytical engine can focus on complex analytics.

The Business Benefits of New BI

In-memory analytics is not just about time to value; it is also about volume processing.
While this may seem counter-intuitive, there are good reasons why it is so. Without an analytical engine of the kind we have described, some analytical activity will not even be attempted, because the results would arrive far too late. Part of this relates to data analyst productivity. If data analysts can get results in seconds or minutes instead of hours, then they can test many more hypotheses than before, and they will. They are also more likely to consider engaging in activities they previously thought impractical, such as adding to their data sources and pursuing new analytical projects. The point is that memory is three orders of magnitude (more than 1,000 times) faster than disk, and an increase in speed of that magnitude does not just open up possibilities for the data analyst; it completely changes the approach to analysis projects. The same applies to other areas of BI. If BI dashboards can be accurate up to the second, then it may be possible to support individual decision making in the moment, rather than report after the fact on what the optimum way to handle a specific opportunity or business problem might have been.
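To see how a speedup of this size changes analyst behaviour, the rule of thumb from earlier (memory roughly 3,300 times faster than disk, one hour for an exploratory query on disk) can be turned into a hypothesis-testing throughput estimate. This is an illustrative sketch only; the eight-hour analyst workday is our assumption, not a figure from the text.

```python
# Sketch: how the disk-to-memory speedup changes hypothesis-testing throughput.
# The 3,300x ratio and the one-hour disk query come from the text;
# the eight-hour workday is an illustrative assumption.

SPEEDUP = 3300
WORKDAY_SECONDS = 8 * 3600

disk_query_s = 3600                      # one exploratory query per hour on disk
memory_query_s = disk_query_s / SPEEDUP  # just over one second in memory

hypotheses_on_disk = WORKDAY_SECONDS // disk_query_s
# Integer arithmetic: scale the day's query budget by the speedup.
hypotheses_in_memory = WORKDAY_SECONDS * SPEEDUP // disk_query_s

print(round(memory_query_s, 2))  # 1.09 seconds per query in memory
print(hypotheses_on_disk)        # 8 queries per day at disk speed
print(hypotheses_in_memory)      # 26400 queries per day at in-memory speed
```

Eight exploratory queries a day versus tens of thousands is the difference between rationing hypotheses and testing every idea that occurs to the analyst, which is the behavioural change the section describes.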
About The Bloor Group

The Bloor Group is a consulting, research and technology analysis firm that focuses on open research and the use of modern media to gather knowledge and disseminate it to IT users. Visit both and for more information. The Bloor Group is the sole copyright holder of this publication. PO Box, Austin, TX. www.BloorGroup.com
More informationBIG DATA TRENDS AND TECHNOLOGIES
BIG DATA TRENDS AND TECHNOLOGIES THE WORLD OF DATA IS CHANGING Cloud WHAT IS BIG DATA? Big data are datasets that grow so large that they become awkward to work with using onhand database management tools.
More informationImplement Hadoop jobs to extract business value from large and varied data sets
Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to
More informationInternational Journal of Advanced Engineering Research and Applications (IJAERA) ISSN: 2454-2377 Vol. 1, Issue 6, October 2015. Big Data and Hadoop
ISSN: 2454-2377, October 2015 Big Data and Hadoop Simmi Bagga 1 Satinder Kaur 2 1 Assistant Professor, Sant Hira Dass Kanya MahaVidyalaya, Kala Sanghian, Distt Kpt. INDIA E-mail: simmibagga12@gmail.com
More informationHarnessing the power of advanced analytics with IBM Netezza
IBM Software Information Management White Paper Harnessing the power of advanced analytics with IBM Netezza How an appliance approach simplifies the use of advanced analytics Harnessing the power of advanced
More informationMicroStrategy Cloud Reduces the Barriers to Enterprise BI...
MicroStrategy Cloud Reduces the Barriers to Enterprise BI... MicroStrategy Cloud reduces the traditional barriers that organizations face when implementing enterprise business intelligence solutions. MicroStrategy
More informationImplementing Data Models and Reports with Microsoft SQL Server
Course 20466C: Implementing Data Models and Reports with Microsoft SQL Server Course Details Course Outline Module 1: Introduction to Business Intelligence and Data Modeling As a SQL Server database professional,
More informationENABLING OPERATIONAL BI
ENABLING OPERATIONAL BI WITH SAP DATA Satisfy the need for speed with real-time data replication Author: Eric Kavanagh, The Bloor Group Co-Founder WHITE PAPER Table of Contents The Data Challenge to Make
More informationHow to Enhance Traditional BI Architecture to Leverage Big Data
B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...
More informationArchitectures for Big Data Analytics A database perspective
Architectures for Big Data Analytics A database perspective Fernando Velez Director of Product Management Enterprise Information Management, SAP June 2013 Outline Big Data Analytics Requirements Spectrum
More informationSimplifying Big Data Analytics: Unifying Batch and Stream Processing. John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!!
Simplifying Big Data Analytics: Unifying Batch and Stream Processing John Fanelli,! VP Product! In-Memory Compute Summit! June 30, 2015!! Streaming Analy.cs S S S Scale- up Database Data And Compute Grid
More informationIntegrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics
Paper 1828-2014 Integrated Big Data: Hadoop + DBMS + Discovery for SAS High Performance Analytics John Cunningham, Teradata Corporation, Danville, CA ABSTRACT SAS High Performance Analytics (HPA) is a
More informationThe Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn
The Big Data Ecosystem at LinkedIn Roshan Sumbaly, Jay Kreps, and Sam Shah LinkedIn Presented by :- Ishank Kumar Aakash Patel Vishnu Dev Yadav CONTENT Abstract Introduction Related work The Ecosystem Ingress
More informationCloud Computing and Advanced Relationship Analytics
Cloud Computing and Advanced Relationship Analytics Using Objectivity/DB to Discover the Relationships in your Data By Brian Clark Vice President, Product Management Objectivity, Inc. 408 992 7136 brian.clark@objectivity.com
More informationCapitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes
Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate
More informationMicrosoft Big Data. Solution Brief
Microsoft Big Data Solution Brief Contents Introduction... 2 The Microsoft Big Data Solution... 3 Key Benefits... 3 Immersive Insight, Wherever You Are... 3 Connecting with the World s Data... 3 Any Data,
More informationSAP HANA SAP s In-Memory Database. Dr. Martin Kittel, SAP HANA Development January 16, 2013
SAP HANA SAP s In-Memory Database Dr. Martin Kittel, SAP HANA Development January 16, 2013 Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase
More informationBig Data Defined Introducing DataStack 3.0
Big Data Big Data Defined Introducing DataStack 3.0 Inside: Executive Summary... 1 Introduction... 2 Emergence of DataStack 3.0... 3 DataStack 1.0 to 2.0... 4 DataStack 2.0 Refined for Large Data & Analytics...
More informationAffordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale
WHITE PAPER Affordable, Scalable, Reliable OLTP in a Cloud and Big Data World: IBM DB2 purescale Sponsored by: IBM Carl W. Olofson December 2014 IN THIS WHITE PAPER This white paper discusses the concept
More informationHadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time?
Hadoop and Data Warehouse Friends, Enemies or Profiteers? What about Real Time? Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de Disclaimer! These opinions are my own and do not necessarily
More informationSELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM
David Chappell SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Business
More informationIntegrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013
Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the
More informationAccelerating Hadoop MapReduce Using an In-Memory Data Grid
Accelerating Hadoop MapReduce Using an In-Memory Data Grid By David L. Brinker and William L. Bain, ScaleOut Software, Inc. 2013 ScaleOut Software, Inc. 12/27/2012 H adoop has been widely embraced for
More informationWHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics
WHITE PAPER Harnessing the Power of Advanced How an appliance approach simplifies the use of advanced analytics Introduction The Netezza TwinFin i-class advanced analytics appliance pushes the limits of
More informationAchieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks
WHITE PAPER July 2014 Achieving Real-Time Business Solutions Using Graph Database Technology and High Performance Networks Contents Executive Summary...2 Background...3 InfiniteGraph...3 High Performance
More informationBig Data Analytics. with EMC Greenplum and Hadoop. Big Data Analytics. Ofir Manor Pre Sales Technical Architect EMC Greenplum
Big Data Analytics with EMC Greenplum and Hadoop Big Data Analytics with EMC Greenplum and Hadoop Ofir Manor Pre Sales Technical Architect EMC Greenplum 1 Big Data and the Data Warehouse Potential All
More informationParallel Data Warehouse
MICROSOFT S ANALYTICS SOLUTIONS WITH PARALLEL DATA WAREHOUSE Parallel Data Warehouse Stefan Cronjaeger Microsoft May 2013 AGENDA PDW overview Columnstore and Big Data Business Intellignece Project Ability
More informationAccelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software
WHITEPAPER Accelerating Enterprise Applications and Reducing TCO with SanDisk ZetaScale Software SanDisk ZetaScale software unlocks the full benefits of flash for In-Memory Compute and NoSQL applications
More informationBIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES
BIG DATA ANALYTICS REFERENCE ARCHITECTURES AND CASE STUDIES Relational vs. Non-Relational Architecture Relational Non-Relational Rational Predictable Traditional Agile Flexible Modern 2 Agenda Big Data
More informationTap into Hadoop and Other No SQL Sources
Tap into Hadoop and Other No SQL Sources Presented by: Trishla Maru What is Big Data really? The Three Vs of Big Data According to Gartner Volume Volume Orders of magnitude bigger than conventional data
More informationIBM Software Information Management Creating an Integrated, Optimized, and Secure Enterprise Data Platform:
Creating an Integrated, Optimized, and Secure Enterprise Data Platform: IBM PureData System for Transactions with SafeNet s ProtectDB and DataSecure Table of contents 1. Data, Data, Everywhere... 3 2.
More informationSAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES
SAS BIG DATA SOLUTIONS ON AWS SAS FORUM ESPAÑA, OCTOBER 16 TH, 2014 IAN MEYERS SOLUTIONS ARCHITECT / AMAZON WEB SERVICES AWS GLOBAL INFRASTRUCTURE 10 Regions 25 Availability Zones 51 Edge locations WHAT
More informationIntegrating SAP and non-sap data for comprehensive Business Intelligence
WHITE PAPER Integrating SAP and non-sap data for comprehensive Business Intelligence www.barc.de/en Business Application Research Center 2 Integrating SAP and non-sap data Authors Timm Grosser Senior Analyst
More informationAn Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise
An Integrated Analytics & Big Data Infrastructure September 21, 2012 Robert Stackowiak, Vice President Data Systems Architecture Oracle Enterprise Solutions Group The following is intended to outline our
More informationBig Data Analytics - Accelerated. stream-horizon.com
Big Data Analytics - Accelerated stream-horizon.com Legacy ETL platforms & conventional Data Integration approach Unable to meet latency & data throughput demands of Big Data integration challenges Based
More informationNetezza and Business Analytics Synergy
Netezza Business Partner Update: November 17, 2011 Netezza and Business Analytics Synergy Shimon Nir, IBM Agenda Business Analytics / Netezza Synergy Overview Netezza overview Enabling the Business with
More informationInnovate and Grow: SAP and Teradata
Partners Innovate and Grow: SAP and Teradata Lily Gulik, Teradata Director, SAP Center of Excellence Wayne Boyle, Chief Technology Officer Strategy, Teradata R&D Table of Contents Introduction: The Integrated
More information5 Keys to Unlocking the Big Data Analytics Puzzle. Anurag Tandon Director, Product Marketing March 26, 2014
5 Keys to Unlocking the Big Data Analytics Puzzle Anurag Tandon Director, Product Marketing March 26, 2014 1 A Little About Us A global footprint. A proven innovator. A leader in enterprise analytics for
More informationUnderstanding traffic flow
White Paper A Real-time Data Hub For Smarter City Applications Intelligent Transportation Innovation for Real-time Traffic Flow Analytics with Dynamic Congestion Management 2 Understanding traffic flow
More informationArchitecting for the Internet of Things & Big Data
Architecting for the Internet of Things & Big Data Robert Stackowiak, Oracle North America, VP Information Architecture & Big Data September 29, 2014 Safe Harbor Statement The following is intended to
More informationlocuz.com Big Data Services
locuz.com Big Data Services Big Data At Locuz, we help the enterprise move from being a data-limited to a data-driven one, thereby enabling smarter, faster decisions that result in better business outcome.
More informationESS event: Big Data in Official Statistics. Antonino Virgillito, Istat
ESS event: Big Data in Official Statistics Antonino Virgillito, Istat v erbi v is 1 About me Head of Unit Web and BI Technologies, IT Directorate of Istat Project manager and technical coordinator of Web
More informationOLAP Services. MicroStrategy Products. MicroStrategy OLAP Services Delivers Economic Savings, Analytical Insight, and up to 50x Faster Performance
OLAP Services MicroStrategy Products MicroStrategy OLAP Services Delivers Economic Savings, Analytical Insight, and up to 50x Faster Performance MicroStrategy OLAP Services brings In-memory Business Intelligence
More informationSafe Harbor Statement
Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment
More information