InDetail. RainStor archiving


InDetail: RainStor archiving
An InDetail Paper by Bloor Research
Author: Philip Howard
Publish date: November 2013

"Archival is a no-brainer when it comes to return on investment and total cost of ownership. This is particularly true as data volumes grow."
Philip Howard

Executive summary
Archiving, by which we mean on-line or nearline archiving rather than off-line, tape-based archiving, has been a technology ahead of its time. It has been around for a number of years but it has never been as widely adopted as its merits would recommend. However, we believe that this is changing. On the one hand, the amount of data held by organisations is greatly increasing and, at the same time, compliance requirements are mandating that companies retain more and more data. On the other hand, compression technologies have improved and we have seen the introduction of Hadoop and other low-cost compute platforms that combine to further improve the cost-of-ownership benefits of archiving. When combined with the ability to query archived data directly, there is no good argument against adopting this technology and lots of good reasons in favour of it. We expect archiving to move much higher up the CIO's priority list over the coming months and years.
That said, we should perhaps explain what we mean by an (enterprise) data archive, because perceptions of this may vary across organisations. For example, if you are collecting machine-generated data (from networks, sensors or smart meters, say) then the data in your archive could range from being as little as a day old to being several years old, depending on how long you want to keep this information for analysis purposes. On the other hand, if we think about archiving transactional data then it will often be the case that the youngest data in the archive is at least six months old, while the age of the oldest data will often be set by compliance requirements of, say, seven years. Again, if you are archiving from a data warehouse, the youngest data in the archive might be as much as three years old, depending on how long you keep the data for live analysis purposes. Of course, an enterprise data archiving strategy may encompass all of these.
With current rates of data growth, archiving is coming more and more to the fore. Return on investment therefore becomes a critical issue and so do standards, both for the platform(s) that will be used for archiving and for the processes and best practices used to archive the data. These have certainly been issues that the companies we have spoken to (see later) have faced in adopting their archiving solution.
Fast facts
RainStor's eponymous product is marketed both as an analytics data hub and as an archiving product. In this paper we are concerned with the product's capabilities in the latter area. Note that the product is specific to storing, retrieving and managing archived data and not to deciding what data should be archived in the first place. In terms of its archiving capabilities RainStor supports three different approaches, all of which are based on the same software, as follows:
Archiving in the cloud (archiving as a service)
Archiving on Hadoop
Archiving on conventional in-house platforms
In this paper we will discuss the facilities provided, supported by the views of a number of RainStor customers to whom we have spoken.
Key findings
In the opinion of Bloor Research, the following represent the key facts of which prospective users should be aware:
RainStor guarantees 10x compression and normally expects compression to be between 20x and 40x. For some data, compression rates can be as high as 45x.
RainStor supports ANSI standard queries (via ODBC/JDBC) against its archive, so you should be able to use existing business intelligence tools to query the archived data. Queries perform very efficiently, thanks to the use of Bloom filters, which are also used to improve performance for free text search. Federated queries are supported (via third party tools) across both the archive and live systems.
If the data being archived is relational then you can ingest the data schema along with the data.
RainStor understands the concept of schema evolution so that you can query the data within the context of the schema that was active at the time the data was live.

While RainStor provides a connector called FastConnect for rapid loading of data from Teradata data warehouses, there are no equivalent products for other environments. While users are fine with the load performance they are getting in non-Teradata environments, it would be nice if a similar higher-performance loader were available for other popular database environments like Oracle and Netezza (IBM). We understand that RainStor is planning exactly this.
The Hadoop version of RainStor provides security that Hadoop otherwise lacks, and RainStor supports MapReduce as well as SQL in this environment. However, there are no current facilities to embed SQL statements within MapReduce jobs, or vice versa.
The bottom line
It is the combination of capabilities offered by RainStor that is most impressive: compression as high as 99%, the ability to access archived data via SQL, the performance you get for such queries (thanks to the use of Bloom filters), the support for schema evolution, and the choice of platform for storage (cloud, Hadoop or conventional). There are not many, if any, competitors that can offer this range of cost, performance, scalability and compliance features.
The initial process of selecting the data to archive is typically the domain of data profiling and discovery tools, which ensure that business objects are archived in toto. We would like to see RainStor partner with one or more vendors that specialise in this area.

The product
RainStor is currently in version 5.5, which was released in June. It is targeted at both the archiving and analytics markets. In the latter case the idea is that you store raw data, such as log files or any network machine-generated data, in RainStor, while holding aggregated data in your data warehouse. As we shall see, the query capabilities provided by RainStor make this a practical proposition. For Teradata environments (Teradata is a partner) there is a FastConnect product, which specifically enables communications between the Teradata and RainStor systems, as well as a recently introduced product called FastForward, which allows users to reinstate data into RainStor from Teradata tape archives.
As far as archiving is concerned, the software is the same as you would use for analytics but the emphasis is different. As previously noted, this solution may be deployed in conjunction with scale-out NAS environments like EMC Isilon and popular WORM storage, such as EMC Centera, which is the traditional approach adopted by RainStor. Alternatively, since release 5 (Jan 2012), you can implement RainStor on top of a Hadoop cluster. Moreover, you can combine these options with an implementation, for example, of EMC Isilon running on HDFS.
The solution can also be deployed in the cloud via an archiving as a service offering, which is perhaps best described as providing the facility for IT to offer archiving as a service to multiple business units. Multi-tenancy is supported for customers archiving to a central repository across multiple applications.
In the bulk of this paper we will discuss the architecture of the product, the potential benefits it offers and how these are achieved, and we will back this up with reports on the discussions we have had with a couple of RainStor customers.

Figure 1: RainStor architecture. Layers, from top to bottom: visualization (BI tools and dashboards, via ODBC/JDBC connectivity); programming languages and computation (Hive, Pig and Java MapReduce for batch processing under the distributed programming framework, plus SQL 92 with Oracle, SQL Server and Sybase IQ extensions); security and compliance (encryption, masking, audit trail, data disposition, Kerberos, LDAP/Active Directory, immutability); database (the RainStor database, with up to 40x data compression); and storage (HDFS, or NAS, SAN, CAS and NFS, on-premise or in the cloud). The legend distinguishes Apache projects, RainStor components and vendor-specific components.
Architecture
RainStor describes its product as being founded on a database. From a purist's point of view this is not correct, at least in the sense that there is no data model that is inherent to the environment. In fact, the underlying structure is a file system. It could, of course, be regarded as a NoSQL environment such as that offered by Hadoop, which is also based on a file system and is schema-free. However, this would be misleading since RainStor has strong SQL support and understands the concept of database schemas. In fact the most accurate description of the storage architecture is as a data repository that stores data in files. This means that RainStor is very easy to install and implement and it requires virtually no administration.
The repository element of this is important because it implies the presence of metadata. This does three things. First of all it allows the ingestion of SQL schemas and supports schema evolution. This is a patented process, which means that you can query archived data against the schema that was in place at the time the data was created. This is important not just for business purposes but also in some compliance environments where you need to be able to prove what you knew at some point in time.
The second important aspect of this repository relates to how you query data in the archive.
RainStor uses what are known as Bloom filters (named after their inventor, Burton H. Bloom), which are space-efficient probabilistic data structures (held as metadata) that are used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Put simply, these filters tell the system where the data is not, so that the software looks for the data it needs only within relevant data blocks. The advantage of using these Bloom filters (which, technically, are based on bit vectors) is that they greatly increase performance and, at the same time, they require much less management and overhead than indexes. Note that filters, like the data, are compressed.
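To make the idea of filters that "tell the system where the data is not" concrete, here is a minimal sketch of a Bloom filter used to skip data blocks. It is an illustration only and not RainStor's implementation: the class and function names, the filter size, the number of hash positions and the use of SHA-256 to derive bit positions are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter backed by a bit vector (a Python int)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions by salting the value with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_filters(blocks):
    """Build one filter per data block (hypothetical block layout)."""
    filters = []
    for block in blocks:
        bloom = BloomFilter()
        for value in block:
            bloom.add(value)
        filters.append(bloom)
    return filters


def search(blocks, filters, value):
    """Return (indexes of blocks holding value, number of blocks scanned)."""
    hits, scanned = [], 0
    for index, (block, bloom) in enumerate(zip(blocks, filters)):
        if not bloom.might_contain(value):
            continue  # the filter rules this block out without reading it
        scanned += 1  # a false positive costs one wasted scan, nothing more
        if value in block:
            hits.append(index)
    return hits, scanned


blocks = [["alice", "bob"], ["carol", "dave"], ["erin", "frank"]]
filters = build_filters(blocks)
hits, scanned = search(blocks, filters, "carol")
```

Because a filter can never report a stored value as absent, every block that really holds the value is always scanned; the saving comes from the blocks that are ruled out before any data is read.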

While on the subject of compression, RainStor guarantees 10x compression, which is about on a par with the best that the major database vendors can offer (and much better than you would expect from Hadoop). However, 20x to 40x compression is commonplace in practice and, as we shall see, one of RainStor's users that we spoke to claimed 99% compression (that is, a 99% reduced footprint; strictly, the 45x figure quoted by RainStor corresponds to a reduction of just under 98%). These figures are significantly better than you would get from any merchant database vendor. From a technical perspective RainStor uses a form of tokenisation with byte-level compression, combined with a linked list to enable data value and pattern de-duplication.
Querying the data
While we have mentioned some aspects of RainStor's query abilities already (Bloom filters and schema evolution) and we discuss the use of Hadoop in a later section, more generally RainStor includes a query engine that supports (translates) incoming SQL (ANSI standard SQL 92) so that you can run conventional business intelligence environments against RainStor. In addition, the company has recently introduced free text search capabilities that combine Bloom filters with some of the parsing aspects of Lucene, bringing the performance advantages of the former (according to RainStor, one to two orders of magnitude) to this requirement.
Compliance
RainStor provides an append-only environment. That is, you can add new records but you cannot update records. If you need to do the latter you must delete the original record (there is a record-level delete function), provided that you are allowed to (records may be designated as subject to legal hold), and then insert a new record. This could be a limitation in analytic environments but shouldn't be a problem for archiving. In fact, compliance-driven archiving actually requires data disposition and retention capabilities.
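The append-only behaviour described above (no updates; a correction is a delete followed by an insert, and a legal hold blocks the delete) can be sketched as follows. This is a toy in-memory model under stated assumptions, not RainStor's API: the names AppendOnlyArchive, ArchiveRecord, LegalHoldError and correct are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


class LegalHoldError(Exception):
    """Raised when a delete is attempted on a record under legal hold."""


@dataclass
class ArchiveRecord:
    record_id: int
    payload: dict
    legal_hold: bool = False
    deleted: bool = False


class AppendOnlyArchive:
    """Hypothetical append-only store: insert and delete, never update."""

    def __init__(self):
        self._records = {}
        self._next_id = 1

    def insert(self, payload, legal_hold=False):
        record = ArchiveRecord(self._next_id, dict(payload), legal_hold)
        self._records[record.record_id] = record
        self._next_id += 1
        return record.record_id

    def delete(self, record_id):
        record = self._records[record_id]
        if record.legal_hold:
            raise LegalHoldError(f"record {record_id} is under legal hold")
        record.deleted = True  # record-level delete

    def correct(self, record_id, new_payload):
        """An 'update' is modelled as delete-then-insert, as in the text."""
        self.delete(record_id)
        return self.insert(new_payload)

    def live_records(self):
        return [r for r in self._records.values() if not r.deleted]


archive = AppendOnlyArchive()
original_id = archive.insert({"invoice": 1001, "amount": 10})
corrected_id = archive.correct(original_id, {"invoice": 1001, "amount": 12})
held_id = archive.insert({"invoice": 1002, "amount": 99}, legal_hold=True)
```

Calling archive.delete(held_id) in this sketch raises LegalHoldError, which mirrors the rule that a record under legal hold cannot be removed and therefore cannot be corrected either.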
RainStor provides record-level expiry and auto-delete capabilities as well as the ability to add comments and to audit these (and there are broader auditing capabilities, as you might expect). Where compliance is a major consideration, WORM (write once read many) storage may be the preferred option, even though RainStor's Hadoop implementation has done much to improve the security of the Hadoop environment (see next section). MD5 fingerprinting is provided along with tamper-proofing.
Hadoop
One of the problems with Hadoop is that the security isn't great. It's not that you don't get any security with Hadoop distributions (Kerberos authentication is fairly standard, for example) but it is not as detailed as it needs to be for enterprise-level requirements and certainly won't be compliant with requirements such as PCI. Moreover, it is frequently the case that Kerberos gets turned off for performance reasons, so a more robust solution is required. RainStor, in its latest 5.5 release, has set out to resolve this issue. It now offers not just Kerberos but also LDAP and Active Directory support. In addition, the company offers data masking for both SQL and MapReduce functions. While this is not intended to be a full-blown data masking tool, it will be useful for masking data in, say, log files. Masking can be done in a consistent fashion so that the same piece of data is always masked in the same way.
The way that RainStor is implemented on Hadoop is that RainStor partitions are stored within HDFS (Hadoop distributed file system). You can choose what data you want to flatten. So, because RainStor understands SQL, you can store relevant data in tabular format if you want to, and you can flatten other data where that is not relevant. Also, of course, the SQL supported by RainStor means that you can perform functions like joins, which would not otherwise be possible.
It is also worth noting that RainStor files are treated as first-class objects within the context of Hadoop and MapReduce, so there is no need to change existing scripts (in Pig, say), only to change a single parameter that points the query to RainStor partitions. The company has added native MapReduce support to its existing SQL capabilities and has partnered with Cloudera, Hortonworks and MapR. What you can't do is combine MapReduce and SQL within a single query: you have to use one or the other, although we understand that RainStor is planning to introduce this capability in due course. Finally, the one other point to note is that RainStor understands time stamps, which is not something that is native to Hadoop.
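The consistent masking mentioned in this section, where the same piece of data is always masked in the same way, can be illustrated with a keyed hash. The sketch below is an assumption about one way to achieve that property, not a description of how RainStor does it: the function names, the key=value log format and the use of HMAC-SHA256 are all hypothetical choices.

```python
import hashlib
import hmac


def mask_value(value: str, secret_key: bytes, length: int = 12) -> str:
    """Deterministically mask a value: same input and key, same output."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def mask_log_line(line: str, sensitive_fields: set, secret_key: bytes) -> str:
    """Mask selected key=value tokens in a whitespace-separated log line."""
    masked_tokens = []
    for token in line.split():
        key, separator, value = token.partition("=")
        if separator and key in sensitive_fields:
            token = f"{key}={mask_value(value, secret_key)}"
        masked_tokens.append(token)
    return " ".join(masked_tokens)


demo_key = b"demo-key-not-for-production"  # illustrative key only
first = mask_log_line("user=alice action=login", {"user"}, demo_key)
second = mask_log_line("user=alice action=logout", {"user"}, demo_key)
```

Because the masked token for a given user is identical in both lines, masked log files can still be grouped or joined on that field without exposing the original value.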

Case Studies
We have interviewed two of RainStor's users of its archiving solutions to confirm (or otherwise) the company's claims.
Dell
In addition to being a partner, Dell has also adopted RainStor as an archiving solution within its own IT department, within the data warehousing and business intelligence group. The company uses Teradata as its main data warehousing platform and then archives seldom-used data, which it nevertheless wishes to or must retain, onto RainStor. Note that this is an archiving solution rather than one more focused on analytics.
The initial justification for implementing RainStor was performance. The customer has a number of tables in its warehouse that are multi-terabyte in size and the company could halve the size of these tables, and their related indexes, by archiving rarely accessed data. This in turn improves the query performance achieved from the data warehouse without needing costly hardware upgrades.
The organisation did not undertake any product comparisons before opting in favour of RainStor, partly because the company was already a partner and partly because the company had previously looked at Informatica ILM (which embedded RainStor at the time; this partnership has since ended), so the company was familiar with the technology. It did, however, undertake a proof of concept in which RainStor demonstrated 40x compression rates on 2TB of data and, according to the Enterprise BI Technical Architecture lead we spoke with, the company is fully expecting at least 20x against the 50TB to be archived.
From this statement it will be clear that this implementation is not yet complete. In part, the reason for this is that the customer has opted to implement RainStor on top of Hadoop. It was originally going to implement RainStor natively but then RainStor released its Hadoop option and the IT team lobbied management to move to a Hadoop environment, in part because the data warehouse team had no previous experience of Hadoop.
In practice, according to the user we spoke to, "we learned some things the hard way about Hadoop". Nevertheless, management is now very pleased that it opted for an approach based upon Hadoop. While the company initially thought it would not need support for Hadoop, it has subsequently decided that this would be a good idea and is licensing this from Cloudera.
The data stored in RainStor is varied: it consists of sales, service, manufacturing and inventory data. It is loaded using RainStor FastConnect in Informatica PowerCenter workflows. TOAD Decision Point is used to query the data. The customer has also performed POCs using SAP Business Objects to federate queries across data stored within the warehouse and the archive.
As far as support is concerned, the customer was complimentary: he described the company as "very responsive and supportive" and said "we couldn't have asked for better support".
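The compression figures quoted in the Dell proof of concept can be turned into footprint estimates with simple arithmetic. The helper functions below are a sketch with hypothetical names; they assume that an Nx compression ratio means the stored size is the raw size divided by N, which is the usual reading but is not spelled out in the paper.

```python
def footprint_reduction(ratio: float) -> float:
    """Fraction of space saved at a given compression ratio (e.g. 10x -> 0.9)."""
    return 1.0 - 1.0 / ratio


def ratio_for_reduction(reduction: float) -> float:
    """Compression ratio needed for a given footprint reduction."""
    return 1.0 / (1.0 - reduction)


def compressed_size_tb(raw_tb: float, ratio: float) -> float:
    """Estimated stored size in TB for raw_tb of data at the given ratio."""
    return raw_tb / ratio


poc_size_tb = compressed_size_tb(2, 40)      # proof of concept: 2TB at 40x
planned_size_tb = compressed_size_tb(50, 20)  # planned archive: 50TB at 20x
```

On these assumptions the 2TB proof of concept would occupy about 0.05TB, and the planned 50TB archive about 2.5TB at the expected minimum of 20x. The same functions show that the guaranteed 10x equates to a 90% reduction, 45x to just under 98%, and that a 99% reduction implies a ratio of about 100x.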

CenturyLink
CenturyLink is one of the largest telecommunications companies in the United States. Unlike Dell, which is a recent RainStor customer, it has been using the product since 2010 to archive data that is too expensive to retain within its Oracle Exadata data warehouse. It originally evaluated Vertica (now HP) and Greenplum (now EMC) as alternatives to RainStor but felt that RainStor offered a better architecture for supporting archiving. Indeed, the principal architect for the organisation stated that "we are getting Exadata speeds out of RainStor". He explained that sometimes it is (much) faster: "we have reports that ran in Oracle for 5.5 hours and errored out that finished in RainStor in less than 20 minutes". Access to the data is primarily through SAP Business Objects and QlikView, and queries can be federated.
Apart from the compression, which the customer said was "upwards of 99%: really tremendous", the other big advantage of implementing RainStor is that "the issue of back-up and recovery goes away, you just have to tune the environment, so we save a bundle of money on storage". The only difficulty is on the Oracle side, which he described as "going stupid with the idea of an external database". Loading times were described as acceptable; you can't trickle feed the data into RainStor: you have to present a file, sort and load it.
The company has multiple implementations of RainStor in different locations and it expects its volume of archived data to surpass 1PB at some point next year. Currently this is all NAS-based but the company is going through a Hadoop proof of concept process, which it will use for unstructured archiving while retaining the existing RainStor implementations for structured and semi-structured data. The big advantage of RainStor on Hadoop, according to the customer, is that it adds security.
It is interesting to note that the spokesperson for CenturyLink had the same comments about its Hadoop implementation (that is, "listen to RainStor") as the previous high tech customer running on HDFS.
The customer we spoke to was complimentary about RainStor without being gushing. He described the company's service as "somewhere between excellent and above average", said that bug fixes were "immediate", and was pleased that "we have had a lot fewer issues with RainStor than with other vendors" and that the system is "very stable".

The vendor
RainStor was previously Clearpace Software. The company was founded by ex-MoD staff in the UK who had been working on a way to effectively store data derived from battlefield simulations. As you may imagine, this means very large volumes of data, which need to be ingested rapidly, stored for a long time and easily retrieved, hence the product's excellent levels of compression. In 2007 the company changed its name to RainStor and moved its headquarters to the United States, although the bulk of development remains within the UK.
Initially the company focused on reseller partnerships and an OEM model, notably with Dell, Teradata, Anritsu, AdaptiveMobile, Group2000, Informatica (which formerly embedded RainStor in its ILM product), and HP, as well as a number of systems integrators. However, more recently the company has moved towards a hybrid model, which includes both direct and indirect sales.
RainStor focuses particularly on the banking and financial services, government, utilities, and telecommunications and media sectors. Given the scale of their data retention requirements, one might imagine that the healthcare and pharmaceutical sectors might also be fruitful for RainStor. The company has in excess of 150 customers worldwide.

Summary
Archival is a no-brainer when it comes to return on investment and total cost of ownership. This is particularly true as data volumes grow and when you can get the sort of performance and scale that RainStor can offer. Moreover, the choice of archiving in the cloud or on Hadoop, as well as in conventional environments, means that archiving is within the pecuniary reach of organisations of all sizes.
Bloor Research believes that archiving should be a standard part of any company's infrastructure and, moreover, it should be higher up the priority list than it has been historically. With big data already on the agenda this is already happening and RainStor is well-placed to take advantage of this opportunity. The company's tag line says "taking the big out of data", and this is absolutely true.
Further Information
Further information about this subject is available from Bloor Research.

Bloor Research overview
Bloor Research is one of Europe's leading IT research, analysis and consultancy organisations. We explain how to bring greater Agility to corporate IT systems through the effective governance, management and leverage of Information. We have built a reputation for telling the right story with independent, intelligent, well-articulated communications content and publications on all aspects of the ICT industry. We believe the objective of telling the right story is to:
Describe the technology in context to its business value and the other systems and processes it interacts with.
Understand how new and innovative technologies fit in with existing ICT investments.
Look at the whole market and explain all the solutions available and how they can be more effectively evaluated.
Filter noise and make it easier to find the additional information or news that supports both investment and implementation.
Ensure all our content is available through the most appropriate channel.
Founded in 1989, we have spent over two decades distributing research and analysis to IT user and vendor organisations throughout the world via online subscriptions, tailored research services, events and consultancy projects. We are committed to turning our knowledge into business value for you.
About the author
Philip Howard, Research Director - Data Management
Philip started in the computer industry way back in 1973 and has variously worked as a systems analyst, programmer and salesperson, as well as in marketing and product management, for a variety of companies including GEC Marconi, GPT, Philips Data Systems, Raytheon and NCR. After a quarter of a century of not being his own boss Philip set up his own company in 1992 and his first client was Bloor Research (then ButlerBloor), with Philip working for the company as an associate analyst. His relationship with Bloor Research has continued since that time and he is now Research Director focused on Data Management.
Data management refers to the management, movement, governance and storage of data and involves diverse technologies that include (but are not limited to) databases and data warehousing, data integration (including ETL, data migration and data federation), data quality, master data management, metadata management and log and event management. Philip also tracks spreadsheet management and complex event processing.
In addition to the numerous reports Philip has written on behalf of Bloor Research, Philip also contributes regularly to IT-Director.com and IT-Analysis.com and was previously editor of both Application Development News and Operating System News on behalf of Cambridge Market Intelligence (CMI). He has also contributed to various magazines and written a number of reports published by companies such as CMI and The Financial Times. Philip speaks regularly at conferences and other events throughout Europe and North America.
Away from work, Philip's primary leisure activities are canal boats, skiing, playing Bridge (at which he is a Life Master), dining out and walking Benji the dog.

Copyright & disclaimer
This document is copyright 2013 Bloor Research. No part of this publication may be reproduced by any method whatsoever without the prior consent of Bloor Research. Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research's intent to claim these names or trademarks as our own. Likewise, company logos, graphics or screen shots have been reproduced with the consent of the owner and are subject to that owner's copyright. Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.

2nd Floor, St John Street, London, EC1V 4PY, United Kingdom


More information

Oracle Database 12c Plug In. Switch On. Get SMART.

Oracle Database 12c Plug In. Switch On. Get SMART. Oracle Database 12c Plug In. Switch On. Get SMART. Duncan Harvey Head of Core Technology, Oracle EMEA March 2015 Safe Harbor Statement The following is intended to outline our general product direction.

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

White Paper. When email archiving is best done in the cloud. ease of use a prime consideration

White Paper. When email archiving is best done in the cloud. ease of use a prime consideration White Paper When email archiving is best done in the cloud A White Paper by Bloor Research Author : Fran Howarth Publish date : June 2010 An email archiving service provided in the cloud is a viable alternative

More information

Delivering Real-World Total Cost of Ownership and Operational Benefits

Delivering Real-World Total Cost of Ownership and Operational Benefits Delivering Real-World Total Cost of Ownership and Operational Benefits Treasure Data - Delivering Real-World Total Cost of Ownership and Operational Benefits 1 Background Big Data is traditionally thought

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Informatica Version 10 Features and Advancements

Informatica Version 10 Features and Advancements Informatica Version 10 Features and Advancements Created: 01-22-2016 Author: Mahendra Mannan Last Updated: 01-25-2015 Version Number: 0.5 Contact Info: mahendram@logandata.com krishnak@logandata.com 1.

More information

Data Integration Checklist

Data Integration Checklist The need for data integration tools exists in every company, small to large. Whether it is extracting data that exists in spreadsheets, packaged applications, databases, sensor networks or social media

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

Building Your Big Data Team

Building Your Big Data Team Building Your Big Data Team With all the buzz around Big Data, many companies have decided they need some sort of Big Data initiative in place to stay current with modern data management requirements.

More information

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database

An Oracle White Paper June 2012. High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database An Oracle White Paper June 2012 High Performance Connectors for Load and Access of Data from Hadoop to Oracle Database Executive Overview... 1 Introduction... 1 Oracle Loader for Hadoop... 2 Oracle Direct

More information

White Paper. Architecting the security of the next-generation data center. why security needs to be a key component early in the design phase

White Paper. Architecting the security of the next-generation data center. why security needs to be a key component early in the design phase White Paper Architecting the security of the next-generation data center A White Paper by Bloor Research Author : Fran Howarth Publish date : August 2011 teams involved in modernization projects need to

More information

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing

Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Architecting for Big Data Analytics and Beyond: A New Framework for Business Intelligence and Data Warehousing Wayne W. Eckerson Director of Research, TechTarget Founder, BI Leadership Forum Business Analytics

More information

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013

Integrating Hadoop. Into Business Intelligence & Data Warehousing. Philip Russom TDWI Research Director for Data Management, April 9 2013 Integrating Hadoop Into Business Intelligence & Data Warehousing Philip Russom TDWI Research Director for Data Management, April 9 2013 TDWI would like to thank the following companies for sponsoring the

More information

BIG DATA-AS-A-SERVICE

BIG DATA-AS-A-SERVICE White Paper BIG DATA-AS-A-SERVICE What Big Data is about What service providers can do with Big Data What EMC can do to help EMC Solutions Group Abstract This white paper looks at what service providers

More information

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract

W H I T E P A P E R. Deriving Intelligence from Large Data Using Hadoop and Applying Analytics. Abstract W H I T E P A P E R Deriving Intelligence from Large Data Using Hadoop and Applying Analytics Abstract This white paper is focused on discussing the challenges facing large scale data processing and the

More information

The Future of Data Management

The Future of Data Management The Future of Data Management with Hadoop and the Enterprise Data Hub Amr Awadallah (@awadallah) Cofounder and CTO Cloudera Snapshot Founded 2008, by former employees of Employees Today ~ 800 World Class

More information

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE

ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE ENABLING GLOBAL HADOOP WITH EMC ELASTIC CLOUD STORAGE Hadoop Storage-as-a-Service ABSTRACT This White Paper illustrates how EMC Elastic Cloud Storage (ECS ) can be used to streamline the Hadoop data analytics

More information

Global Investment Bank Saves Millions with RainStor

Global Investment Bank Saves Millions with RainStor SUCCESS STORY Global Investment Bank Saves Millions with RainStor Reduces data footprint by 97%, meets stringent compliance regulations and achieves payback in 18 months www.rainstor.com Background Financial

More information

Informatica and the Vibe Virtual Data Machine

Informatica and the Vibe Virtual Data Machine White Paper Informatica and the Vibe Virtual Data Machine Preparing for the Integrated Information Age This document contains Confidential, Proprietary and Trade Secret Information ( Confidential Information

More information

Object Storage: Out of the Shadows and into the Spotlight

Object Storage: Out of the Shadows and into the Spotlight Technology Insight Paper Object Storage: Out of the Shadows and into the Spotlight By John Webster December 12, 2012 Enabling you to make the best technology decisions Object Storage: Out of the Shadows

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

Performance and Scalability Overview

Performance and Scalability Overview Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics Platform. Contents Pentaho Scalability and

More information

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION KEY FEATURES Out-of-box integration with databases, ERPs, CRMs, B2B systems, flat files, XML data, LDAP, JDBC, ODBC Knowledge

More information

Buying vs. Building Business Analytics. A decision resource for technology and product teams

Buying vs. Building Business Analytics. A decision resource for technology and product teams Buying vs. Building Business Analytics A decision resource for technology and product teams Introduction Providing analytics functionality to your end users can create a number of benefits. Actionable

More information

WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution

WHITEPAPER. A Technical Perspective on the Talena Data Availability Management Solution WHITEPAPER A Technical Perspective on the Talena Data Availability Management Solution BIG DATA TECHNOLOGY LANDSCAPE Over the past decade, the emergence of social media, mobile, and cloud technologies

More information

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya

Oracle Database - Engineered for Innovation. Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database - Engineered for Innovation Sedat Zencirci Teknoloji Satış Danışmanlığı Direktörü Türkiye ve Orta Asya Oracle Database 11g Release 2 Shipping since September 2009 11.2.0.3 Patch Set now

More information

Performance and Scalability Overview

Performance and Scalability Overview Performance and Scalability Overview This guide provides an overview of some of the performance and scalability capabilities of the Pentaho Business Analytics platform. PENTAHO PERFORMANCE ENGINEERING

More information

Making Sense of the Madness

Making Sense of the Madness Making Sense of the Madness Deploying Big Data techniques to deal with real world Bigish Data issues Copyright James Mitchell 2014 1 Introduction Warning! Parental Guidance Recommended Please read the

More information

White Paper. Big Data Analytics with Hadoop and Sybase IQ

White Paper. Big Data Analytics with Hadoop and Sybase IQ White Paper Big Data Analytics with Hadoop and Sybase IQ A White Paper by Bloor Research Author : Philip Howard Publish date : April 2012 Big data is important because it enables you to analyse large amounts

More information

Data Governance in the Hadoop Data Lake. Michael Lang May 2015

Data Governance in the Hadoop Data Lake. Michael Lang May 2015 Data Governance in the Hadoop Data Lake Michael Lang May 2015 Introduction Product Manager for Teradata Loom Joined Teradata as part of acquisition of Revelytix, original developer of Loom VP of Sales

More information

Contents. Pentaho Corporation. Version 5.1. Copyright Page. New Features in Pentaho Data Integration 5.1. PDI Version 5.1 Minor Functionality Changes

Contents. Pentaho Corporation. Version 5.1. Copyright Page. New Features in Pentaho Data Integration 5.1. PDI Version 5.1 Minor Functionality Changes Contents Pentaho Corporation Version 5.1 Copyright Page New Features in Pentaho Data Integration 5.1 PDI Version 5.1 Minor Functionality Changes Legal Notices https://help.pentaho.com/template:pentaho/controls/pdftocfooter

More information

White Paper. White Paper by Bloor Author Philip Howard Publish date March 2012. The business case for Data Quality

White Paper. White Paper by Bloor Author Philip Howard Publish date March 2012. The business case for Data Quality White Paper White Paper by Bloor Author Philip Howard Publish date March 2012 The business case for Data Quality there is much to be said in favour of a platform-based approach to data quality. Author

More information

White Paper. Agile data management with X88

White Paper. Agile data management with X88 White Paper Agile data management with X88 A White Paper by Bloor Research Author : Philip Howard Publish date : June 2011 This paper is a call for some more forward thinking from data management practitioners

More information

HDP Enabling the Modern Data Architecture

HDP Enabling the Modern Data Architecture HDP Enabling the Modern Data Architecture Herb Cunitz President, Hortonworks Page 1 Hortonworks enables adoption of Apache Hadoop through HDP (Hortonworks Data Platform) Founded in 2011 Original 24 architects,

More information

HDP Hadoop From concept to deployment.

HDP Hadoop From concept to deployment. HDP Hadoop From concept to deployment. Ankur Gupta Senior Solutions Engineer Rackspace: Page 41 27 th Jan 2015 Where are you in your Hadoop Journey? A. Researching our options B. Currently evaluating some

More information

Oracle Big Data SQL Technical Update

Oracle Big Data SQL Technical Update Oracle Big Data SQL Technical Update Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data, Hadoop, NoSQL Databases, Relational Databases, SQL, Security, Performance Introduction This technical

More information

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database

Managing Big Data with Hadoop & Vertica. A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Managing Big Data with Hadoop & Vertica A look at integration between the Cloudera distribution for Hadoop and the Vertica Analytic Database Copyright Vertica Systems, Inc. October 2009 Cloudera and Vertica

More information

Information Architecture

Information Architecture The Bloor Group Actian and The Big Data Information Architecture WHITE PAPER The Actian Big Data Information Architecture Actian and The Big Data Information Architecture Originally founded in 2005 to

More information

IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop IBM BigInsights for Apache Hadoop Efficiently manage and mine big data for valuable insights Highlights: Enterprise-ready Apache Hadoop based platform for data processing, warehousing and analytics Advanced

More information

Informatica Application Information Lifecycle Management

Informatica Application Information Lifecycle Management Informatica Application Information Lifecycle Management Cost-Effectively Manage Every Phase of the Information Lifecycle brochure Controlling Explosive Data Growth The era of big data presents today s

More information

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper

Offload Enterprise Data Warehouse (EDW) to Big Data Lake. Ample White Paper Offload Enterprise Data Warehouse (EDW) to Big Data Lake Oracle Exadata, Teradata, Netezza and SQL Server Ample White Paper EDW (Enterprise Data Warehouse) Offloads The EDW (Enterprise Data Warehouse)

More information

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload

Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Dell Cloudera Syncsort Data Warehouse Optimization ETL Offload Drive operational efficiency and lower data transformation costs with a Reference Architecture for an end-to-end optimization and offload

More information

Teradata and RainStor in a Big Data World Augment Your Data Warehouse Strategy with an Active Archive Running on Hadoop.

Teradata and RainStor in a Big Data World Augment Your Data Warehouse Strategy with an Active Archive Running on Hadoop. WHITE PAPER Teradata and RainStor in a Big Data World Augment Your Data Warehouse Strategy with an Active Archive Running on Hadoop. If you could store the years of history that your business needs and

More information

How to Enhance Traditional BI Architecture to Leverage Big Data

How to Enhance Traditional BI Architecture to Leverage Big Data B I G D ATA How to Enhance Traditional BI Architecture to Leverage Big Data Contents Executive Summary... 1 Traditional BI - DataStack 2.0 Architecture... 2 Benefits of Traditional BI - DataStack 2.0...

More information

Analytics framework: creating the data-centric organisation to optimise business performance

Analytics framework: creating the data-centric organisation to optimise business performance Research Report Analytics framework: creating the data-centric organisation to optimise business performance October 2013 Justin van der Lande 2 Contents [1] Slide no. 5. Executive summary 6. Executive

More information

INVESTOR PRESENTATION. First Quarter 2014

INVESTOR PRESENTATION. First Quarter 2014 INVESTOR PRESENTATION First Quarter 2014 Note to Investors Certain non-gaap financial information regarding operating results may be discussed during this presentation. Reconciliations of the differences

More information

Big Data for Investment Research Management

Big Data for Investment Research Management IDT Partners www.idtpartners.com Big Data for Investment Research Management Discover how IDT Partners helps Financial Services, Market Research, and Investment Management firms turn big data into actionable

More information

<Insert Picture Here> Big Data

<Insert Picture Here> Big Data Big Data Kevin Kalmbach Principal Sales Consultant, Public Sector Engineered Systems Program Agenda What is Big Data and why it is important? What is your Big

More information

Data Modeling for Big Data

Data Modeling for Big Data Data Modeling for Big Data by Jinbao Zhu, Principal Software Engineer, and Allen Wang, Manager, Software Engineering, CA Technologies In the Internet era, the volume of data we deal with has grown to terabytes

More information

Datenverwaltung im Wandel - Building an Enterprise Data Hub with

Datenverwaltung im Wandel - Building an Enterprise Data Hub with Datenverwaltung im Wandel - Building an Enterprise Data Hub with Cloudera Bernard Doering Regional Director, Central EMEA, Cloudera Cloudera Your Hadoop Experts Founded 2008, by former employees of Employees

More information

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look

IBM BigInsights Has Potential If It Lives Up To Its Promise. InfoSphere BigInsights A Closer Look IBM BigInsights Has Potential If It Lives Up To Its Promise By Prakash Sukumar, Principal Consultant at iolap, Inc. IBM released Hadoop-based InfoSphere BigInsights in May 2013. There are already Hadoop-based

More information

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns

How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns How Transactional Analytics is Changing the Future of Business A look at the options, use cases, and anti-patterns Table of Contents Abstract... 3 Introduction... 3 Definition... 3 The Expanding Digitization

More information

EMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst

EMC s Enterprise Hadoop Solution. By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst White Paper EMC s Enterprise Hadoop Solution Isilon Scale-out NAS and Greenplum HD By Julie Lockner, Senior Analyst, and Terri McClure, Senior Analyst February 2012 This ESG White Paper was commissioned

More information

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM David Chappell SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Business

More information

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM

QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QLIKVIEW DEPLOYMENT FOR BIG DATA ANALYTICS AT KING.COM QlikView Technical Case Study Series Big Data June 2012 qlikview.com Introduction This QlikView technical case study focuses on the QlikView deployment

More information

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION EXECUTIVE SUMMARY Oracle business intelligence solutions are complete, open, and integrated. Key components of Oracle business intelligence

More information

Agile Business Intelligence Data Lake Architecture

Agile Business Intelligence Data Lake Architecture Agile Business Intelligence Data Lake Architecture TABLE OF CONTENTS Introduction... 2 Data Lake Architecture... 2 Step 1 Extract From Source Data... 5 Step 2 Register And Catalogue Data Sets... 5 Step

More information

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems

IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems IBM InfoSphere Guardium Data Activity Monitor for Hadoop-based systems Proactively address regulatory compliance requirements and protect sensitive data in real time Highlights Monitor and audit data activity

More information

White Paper. Exploiting the Internet of Things with investigative analytics

White Paper. Exploiting the Internet of Things with investigative analytics White Paper Exploiting the Internet of Things with investigative analytics A White Paper by Bloor Research Author : Philip Howard Publish date : May 2013 The Internet of Things has the potential to change

More information

Cost-Effective Business Intelligence with Red Hat and Open Source

Cost-Effective Business Intelligence with Red Hat and Open Source Cost-Effective Business Intelligence with Red Hat and Open Source Sherman Wood Director, Business Intelligence, Jaspersoft September 3, 2009 1 Agenda Introductions Quick survey What is BI?: reporting,

More information

Big Data and Data Science: Behind the Buzz Words

Big Data and Data Science: Behind the Buzz Words Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing

More information

BIG DATA SOLUTION DATA SHEET

BIG DATA SOLUTION DATA SHEET BIG DATA SOLUTION DATA SHEET Highlight. DATA SHEET HGrid247 BIG DATA SOLUTION Exploring your BIG DATA, get some deeper insight. It is possible! Another approach to access your BIG DATA with the latest

More information

A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle

A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle A Big Data Storage Architecture for the Second Wave David Sunny Sundstrom Principle Product Director, Storage Oracle Growth in Data Diversity and Usage 1.8 Zettabytes of Data in 2011, 20x Growth by 2020

More information

Move Data from Oracle to Hadoop and Gain New Business Insights

Move Data from Oracle to Hadoop and Gain New Business Insights Move Data from Oracle to Hadoop and Gain New Business Insights Written by Lenka Vanek, senior director of engineering, Dell Software Abstract Today, the majority of data for transaction processing resides

More information

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances

Well packaged sets of preinstalled, integrated, and optimized software on select hardware in the form of engineered systems and appliances INSIGHT Oracle's All- Out Assault on the Big Data Market: Offering Hadoop, R, Cubes, and Scalable IMDB in Familiar Packages Carl W. Olofson IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Big Data Technologies Compared June 2014

Big Data Technologies Compared June 2014 Big Data Technologies Compared June 2014 Agenda What is Big Data Big Data Technology Comparison Summary Other Big Data Technologies Questions 2 What is Big Data by Example The SKA Telescope is a new development

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand

Build a Streamlined Data Refinery. An enterprise solution for blended data that is governed, analytics-ready, and on-demand Build a Streamlined Data Refinery An enterprise solution for blended data that is governed, analytics-ready, and on-demand Introduction As the volume and variety of data has exploded in recent years, putting

More information

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects

Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 1 Who Am I? Mark Cusack Chief Architect 9 years@rainstor Founding developer Ex UK Ministry of Defence Research InfoSec projects 2 RainStor: a SQL Database on Hadoop SCALE (MPP, Shared everything) LOAD

More information

master data management and data integration: complementary but distinct

master data management and data integration: complementary but distinct master data management and data integration: complementary but distinct A White Paper by Bloor Research Author : Philip Howard Review date : October 2006 Put simply, if you ignore data integration or do

More information

Business Intelligence and Service Oriented Architectures. An Oracle White Paper May 2007

Business Intelligence and Service Oriented Architectures. An Oracle White Paper May 2007 Business Intelligence and Service Oriented Architectures An Oracle White Paper May 2007 Note: The following is intended to outline our general product direction. It is intended for information purposes

More information

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS!

IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS! The Bloor Group IBM AND NEXT GENERATION ARCHITECTURE FOR BIG DATA & ANALYTICS VENDOR PROFILE The IBM Big Data Landscape IBM can legitimately claim to have been involved in Big Data and to have a much broader

More information

Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer

Automated Data Ingestion. Bernhard Disselhoff Enterprise Sales Engineer Automated Data Ingestion Bernhard Disselhoff Enterprise Sales Engineer Agenda Pentaho Overview Templated dynamic ETL workflows Pentaho Data Integration (PDI) Use Cases Pentaho Overview Overview What we

More information

EMC BACKUP MEETS BIG DATA

EMC BACKUP MEETS BIG DATA EMC BACKUP MEETS BIG DATA Strategies To Protect Greenplum, Isilon And Teradata Systems 1 Agenda Big Data: Overview, Backup and Recovery EMC Big Data Backup Strategy EMC Backup and Recovery Solutions for

More information

SQL Server 2012 and PostgreSQL 9

SQL Server 2012 and PostgreSQL 9 SQL Server 2012 and PostgreSQL 9 A Detailed Comparison of Approaches and Features SQL Server White Paper Published: April 2012 Applies to: SQL Server 2012 Introduction: The question whether to implement

More information

Big Data must become a first class citizen in the enterprise

Big Data must become a first class citizen in the enterprise Big Data must become a first class citizen in the enterprise An Ovum white paper for Cloudera Publication Date: 14 January 2014 Author: Tony Baer SUMMARY Catalyst Ovum view Big Data analytics have caught

More information

Why Big Data in the Cloud?

Why Big Data in the Cloud? Have 40 Why Big Data in the Cloud? Colin White, BI Research January 2014 Sponsored by Treasure Data TABLE OF CONTENTS Introduction The Importance of Big Data The Role of Cloud Computing Using Big Data

More information

How To Handle Big Data With A Data Scientist

How To Handle Big Data With A Data Scientist III Big Data Technologies Today, new technologies make it possible to realize value from Big Data. Big data technologies can replace highly customized, expensive legacy systems with a standard solution

More information

Embedded Analytics Vendor Selection Guide. A holistic evaluation criteria for your OEM analytics project

Embedded Analytics Vendor Selection Guide. A holistic evaluation criteria for your OEM analytics project Embedded Analytics Vendor Selection Guide A holistic evaluation criteria for your OEM analytics project Introduction Integrating a rich analytics offering into your software product can bring substantial

More information

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management

Datalogix. Using IBM Netezza data warehouse appliances to drive online sales with offline data. Overview. IBM Software Information Management Datalogix Using IBM Netezza data warehouse appliances to drive online sales with offline data Overview The need Infrastructure could not support the growing online data volumes and analysis required The

More information