Spotlight. Big data and the mainframe


A Spotlight Paper by Bloor Research
Author: Philip Howard
Publish date: March 2014

"...there needs to be an infrastructure in place to manage the inter-relationship between the new big data world on the one hand, and traditional environments, such as the mainframe, on the other."
Philip Howard

Introduction

The discussion about big data to date has focused primarily on the volume, variety or velocity of data. We are assuming that readers are familiar with big data and, indeed, with the concept of the Internet of Things (IoT), so we will not be discussing the three Vs. However, now that big data implementations are moving beyond the trial stage, it is time to consider some of the issues that arise when they transition beyond skunk works and into general-purpose use and, in particular, the issues that arise when organisations integrate their mainframe system of record with big data implementations. Note that by mainframe we mean an IBM z/OS environment, for which we are using mainframe as shorthand.

More or less every major organisation in the world has a mainframe at the heart of its enterprise and it is critical that big data deployments are viewed from that perspective, rather than treated as isolated efforts that are distinct from the mainframe environment. Big data must ultimately (if not sooner) link back to the system of record, the mainframe.

What we will discuss in this paper are some of the issues surrounding these big data deployments and how they might be resolved within the context of a mainframe environment.

A Bloor Spotlight Paper Bloor Research

Issues around big data deployments

There are at least six issues around big data deployments. Not all of these are specific to mainframe environments and some of them are more concerned with providing data analytics than with the infrastructure required to support analytics, which is the subject of this paper. Nevertheless, we include all of these issues here for the sake of completeness but also to highlight the fact that big data deployments need careful consideration and planning.

It is all too clear that many companies are making the same mistakes with big data as they have historically with little data. That is, they are building piecemeal solutions to resolve particular short-term requirements. This will lead to the same sort of siloed mess that bedevils conventional environments and which requires seemingly endless amounts of IT plumbing to join the whole thing up. According to the latest research from The Gartner Group, over 90% of existing IT budgets are taken up with maintenance of existing applications and their supporting infrastructure. A large part of the reason for this is the siloed, duplicated and complex nature of the existing environments. Doing the same with big data implementations will only exacerbate the situation and it is therefore really important to consider big data from a strategic rather than a merely tactical perspective.

The issues are:

1. Many big data and IoT deployments are already business critical and this is likely to be even more the case as time progresses. This means that you need enterprise-class products to support them, with high (or preferably continuous) availability and no single point of failure.

2. Big data environments are potentially complex and this complexity manifests itself in a variety of different ways:

a. There are often multiple sources of data from which information needs to be combined, often in a variety of different formats (relational data, social media, clickstream data, machine-generated data, spreadsheets et al). In this paper we are specifically concerned with environments where at least some of that data, customer data for example, is held on the mainframe.

b. The query environment itself is often complex, involving conventional data warehouses, data marts, stream processing products and platforms for data discovery such as Hadoop, which need to be combined from an architectural point of view.

c. There may well be multiple big data or IoT initiatives within the same organisation, each of which is likely to involve multiple platforms, some of which may overlap with one another and some of which may not.

d. There are multiple deployment options: some analytic results may need to be embedded within business processes (this is especially true of the Internet of Things) while in other cases users will want direct, self-service access to the data along with advanced visualisation techniques. Moreover, presentation of the data will typically be required across a range of different device types.

e. Deployment options are further complicated by the fact that some platforms may be running on premise while others may be in public or private clouds.

f. Many big data deployments require real-time or near real-time performance and, even when this is not the case, timely performance is still a requirement for enterprise-class implementations. While there are certainly some NoSQL products that are geared towards high performance, Hadoop, for example, is essentially a batch-based environment. This is fine where timeliness is not critical and, to be fair to the Hadoop community, it has been trying to improve performance, but it is not there yet. However, it is not just the performance of the database (whether warehouse, mart or discovery platform) that is at issue: it is the performance of the whole environment, including the ingestion of data in the first place and its presentation once analysed. The short answer to how you get performance across the environment is that you not only need high-performing analytics but also appropriate and fast pipes to get the data into, out of and through the analytic environment.

3. Big data deployments typically require expensive skills that are in short supply. The most obvious example is the need to program using the MapReduce framework. In the case of MapReduce, of course, there are SQL initiatives in place but these are limited not only by the sophistication (or lack thereof) of the SQL being implemented but also by the absence of a database optimiser which is, in our view, essential to good SQL performance. Again, the community is working on this: for the latest version of Impala, for example, Cloudera has implemented join optimisations, but this is not yet a fully developed database optimiser. The other problem area related to skills is with respect to data scientists. In theory, data scientists are highly trained to ask the right questions of the data, given the particular industry they are working in and the organisation's marketing and business strategy. In practice, the required skill sets are simply too broad: not many people have extensive knowledge of their business, and statistical capabilities, and know about both data and IT, so a business model based on a presumption of the availability of data scientists is not going to be sustainable. Moreover, data scientists are expensive: average salaries are around $350,

4. We have already mentioned, when discussing complexity, that a lot of the insight that is derived from big data will be embedded into automated business processes and applications. This raises another issue, which is that the big data environment needs to integrate with business process management and business rules management software in order to enable this process of embedding. If we think of the need to present information to users in various different forms as BYOD (bring your own device), we can extend that concept to BYOI (bring your own interface) if we also incorporate this sort of insight automation.

5. It is a fallacy, which all too many people fall into, to think that big data does not need the governance, compliance and security that is thought appropriate for structured data stored in relational and other databases. You need to know, for example, if someone has mistakenly embedded his or her social security number in a tweet that you are analysing, and you need to do something about it, for regulatory reasons around storing personal data. Similarly, a great deal of machine-generated data requires de-duplication. There is also a reverse factor when it comes to social media and other personal information: customers need to know that they can trust you to safeguard their data.

As mentioned, we are not going to discuss a number of these issues: what you choose to use for a discovery platform (Hadoop or some other NoSQL database or something else entirely) is outside these discussions, as are (big) data governance and the sort of analytical capabilities you might need. What we are concerned with is the interplay between the mainframe and the analytic environment and how that can be implemented in an enterprise-class manner.
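Although governance is outside the scope of this paper, the two examples given under the fifth issue (a social security number embedded in a tweet, and duplicated machine-generated data) are easy to illustrate. The following is a minimal sketch only: the pattern, the function names and the choice of exact-match de-duplication are our assumptions for illustration, and real governance tooling would need to do considerably more.

```python
import hashlib
import re
from typing import List, Set, Tuple

# Assumption: SSNs appear in the common NNN-NN-NNNN form. Other formats are not caught.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact_ssns(text: str) -> Tuple[str, bool]:
    """Mask anything that looks like an SSN and report whether a match was found."""
    redacted, count = SSN_PATTERN.subn("[REDACTED]", text)
    return redacted, count > 0


def deduplicate(records: List[str]) -> List[str]:
    """Drop exact duplicates (ignoring surrounding whitespace), keeping first-seen order."""
    seen: Set[str] = set()
    unique: List[str] = []
    for record in records:
        digest = hashlib.sha256(record.strip().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique
```

A screening step of this kind would sit in the ingestion path, before the data reaches the discovery platform, so that flagged items can be handled according to whatever policy applies.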

Mainframe big data issues

The best way to discuss this is probably by example. Suppose that you are doing an analysis based on social media data and clickstream data stored in a Hadoop cluster. Moreover, this is not a one-time exercise but something that you are going to be doing on an ongoing basis to support, for example, one-to-one marketing. And further suppose that you want to enrich this social media data with information about customers that is normally stored on your mainframe. This immediately brings in various issues, as in Figure 1.

Get the data from the mainframe to Hadoop so that you can enrich the data you are analysing
Get the results back to the mainframe from Hadoop
Implement repeatable data movement processes, in both directions
Ensure that the information coming back to the mainframe is appropriately formatted and of appropriate quality

Figure 1: Mainframe Big Data: some typical issues
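To make the first of these issues concrete, the sketch below decodes a fixed-width customer record held in EBCDIC and uses it to enrich social media events keyed by customer. Everything specific here is an assumption made for illustration: the record layout, the field names and the use of code page 037 (available in Python as the `cp037` codec) are hypothetical, and real mainframe extracts frequently contain packed decimal and binary fields that this sketch does not handle.

```python
from typing import Dict, List

# Hypothetical fixed-width layout: (field name, start offset, length in bytes).
CUSTOMER_LAYOUT = [("customer_id", 0, 6), ("name", 6, 20), ("segment", 26, 4)]


def decode_record(raw: bytes, codec: str = "cp037") -> Dict[str, str]:
    """Decode one fixed-width EBCDIC record into a dict of trimmed strings."""
    text = raw.decode(codec)  # cp037 is single-byte, so byte and character offsets match
    return {name: text[start:start + length].strip()
            for name, start, length in CUSTOMER_LAYOUT}


def enrich(events: List[Dict[str, str]],
           customers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Attach customer attributes to each event that has a matching customer_id."""
    by_id = {customer["customer_id"]: customer for customer in customers}
    enriched = []
    for event in events:
        customer = by_id.get(event["customer_id"])
        if customer is not None:
            enriched.append({**event,
                             "name": customer["name"],
                             "segment": customer["segment"]})
    return enriched
```

Events with no matching customer are dropped in this sketch. Whether that is the right behaviour depends on the analysis, and a production process would more likely log or quarantine them.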

The situation can actually get more complex than this, because you may have analytic capability within your mainframe environment (for example, you might have an Analytics Accelerator [Netezza]). Now you have options about where you perform queries: you could ship the data to Hadoop and do all the query processing there, or you could do initial analysis on Hadoop and ship the results to the mainframe where the final analysis is performed, or you could do some query processing on each platform and then join the results (which, again, could be performed in either location). Any one of these approaches might make sense for either cost or performance reasons, and which approach is most suitable will depend on the analytics in question and how they are going to be used.

Another issue is the avoidance of irrevocable assumptions that may become outdated: for example, workloads should run on the most appropriate platform, and what that is may change as technology changes. For instance, extract, transform and load (ETL, possibly to a non-mainframe platform) may be seen as giving assurance that big data analytics will run cheaply and can't impact production performance. However, a mainframe can run different workloads entirely independently, at the same time, with well-defined priorities, so it may become feasible to run some near real-time analytics directly on the production data instead of on a copy. The availability of specialist processors (essentially, free computing power for specialised tasks, including the running of Java Virtual Machines) may, in future, make the mainframe an unexpectedly cost-effective host for some big data analytics or associated workloads.

You might also want to consider possible cultural clashes. Mainframe DBAs often see themselves as the custodians, not the owners, of the data in their databases. They are concerned with its provenance, its safety and its quality but leave the business to decide what to do with it. In the non-relational world, the custodian of the data (the big data administrator) may also be (or have been) the business owner of the data and may want to use it to support a particular point of view or career opportunity. In terms of governance, mixing these two points of view may be an issue.

For many companies it is still early days for big data deployments. That has resulted in many such implementations being cobbled together, often with manual processes for moving data around or, where tools are used, with a plethora of them. This is not satisfactory.

Resolving mainframe big data issues

How do we square all of these circles? In practice, many of the issues we have raised do not form part of the mainframe equation specifically. For example, if you are using Hadoop or some other NoSQL database then you need to ensure that the distribution you use is suitably robust for deployment in an enterprise environment. That is not related directly to the mainframe environment. Similarly, you probably want to move away from MapReduce, because Java developers are expensive, and deploy using SQL where that is feasible but, again, that is a big data issue rather than a mainframe one.

Leaving those issues aside, therefore, what should you look for in terms of managing the interaction between the mainframe and the big data environment? The first thing is that there needs to be a repeatable process for moving data between the two environments, as opposed to something that is done in an ad hoc manner. This means that you will need some sort of data integration capability that probably needs to include both ETL and data virtualisation or federation capability, where the former supports the physical movement of data between the different environments and the latter allows you to create queries that span both environments without physically moving the data. Both of these techniques are likely to be useful for different types of analytics. The ETL approach, in particular, will be required for data discovery purposes (the sort of work that data scientists do). In addition, data integration may require replication (moving copies of the data in real time) if real-time or near real-time analytics needs to be performed.

However, being able to move the data backwards and forwards is one thing but it is not everything. You will need a scheduler or, more likely, you will want data integration processes to work with your existing mainframe scheduler. It is, of course, true that data integration products are typically delivered with their own capabilities in this regard but why have two sets of functionality to do the same thing when one will do? Having two just adds to the management overhead.

You must also make sure that any data moving back into the mainframe is clean and synchronised with the original database schemas. The semantics of customer, for example, in the big data environment may be very different to those in the relational environment, so if you move data back, you may find that you have extra customers, which may or may not be valid in the relational context.

Speaking of management overheads, the other thing that you are really going to want is to be able to manage the whole relationship between the mainframe and the big data environments. This starts with having a graphical representation of the whole environment so that you can see where there are links, where data movement is taking place or is scheduled to take place, where there are failures and how you remediate them, and so on. In other words, you need a full management console that spans the two environments. Further, this management console needs to include collaborative capabilities since it is likely to be used both by database administrators in the mainframe world and by the administrators looking after Hadoop and other platforms within the big data environment. While this may not prevent any cultural clashes between the individuals in these two areas, collaborative capabilities should at least help to reduce tensions.
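The requirement that data returning to the mainframe be clean and consistent with the original schemas can be sketched as a validation gate that runs before any load. The schema, the column names and the rule that unknown customers are rejected rather than created are all assumptions for illustration; as noted above, extra customers may or may not be valid, so a real process would route them for review rather than simply discarding them.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical target schema: column name -> (expected Python type, max length or None).
TARGET_SCHEMA = {"customer_id": (str, 6), "score": (float, None)}

Row = Dict[str, Any]


def validate_for_load(rows: List[Row],
                      known_customer_ids: Set[str]) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split rows into those safe to load and those rejected, with reasons."""
    accepted: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in rows:
        problems: List[str] = []
        for column, (expected_type, max_len) in TARGET_SCHEMA.items():
            value = row.get(column)
            if not isinstance(value, expected_type):
                problems.append(f"{column}: expected {expected_type.__name__}")
            elif max_len is not None and len(value) > max_len:
                problems.append(f"{column}: longer than {max_len}")
        # Only check the system of record once the row itself is well formed.
        if not problems and row["customer_id"] not in known_customer_ids:
            problems.append("customer_id: not in system of record")
        if problems:
            rejected.append((row, problems))
        else:
            accepted.append(row)
    return accepted, rejected
```

Keeping the rejection reasons alongside each row gives the administrators on both sides something concrete to discuss, which fits the collaborative console described above.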

Mainframe and big data futures

Finally, we should address the point we made earlier about changing conditions. Processing that currently takes place on the mainframe may, at some point in the future, move to the big data environment. It is equally true that some of the processing that we now think will be Hadoop-based may be moved back to the mainframe (especially as new specialist processors are introduced) at some future point. The implication of this is that the management framework put in place for co-ordinating the mainframe and big data environments needs to be flexible enough to cater for these changing conditions.

In particular, as this paper makes clear, it is necessary today to treat the mainframe as a first-class player in any enterprise big data strategy. However, future repositioning of the mainframe as just another enterprise server, albeit one with particular qualities of resilience, performance and availability, providing high quality cloud services in a box, could take this a step further. Bloor envisages a future where smart schedulers place workloads dynamically on the most appropriate platform, in accordance with predefined policies for security, availability, service level and so on, together with feedback from customer and end-user experience (using big data analytics). This is only feasible in practice for enterprises with a fully formulated cloud strategy but could start to give market leaders a competitive edge in the short to medium term.

This vision is probably a little way out for most enterprises today, but they should start thinking about the possibilities now, in order to avoid making tactical decisions that limit their future choices.
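The idea of a scheduler that places workloads according to predefined policies can be illustrated with a very small sketch. This is not a description of any existing product: the classes, the attributes and the rule of choosing the cheapest platform that satisfies the policy are all hypothetical, and the feedback from end-user experience mentioned above is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Platform:
    name: str
    security_level: int    # hypothetical scale: higher means stronger controls
    availability: float    # fraction of time available, e.g. 0.999
    cost_per_hour: float   # hypothetical relative cost


@dataclass
class Workload:
    name: str
    min_security_level: int
    min_availability: float


def place(workload: Workload, platforms: List[Platform]) -> Optional[Platform]:
    """Return the cheapest platform that meets the workload's policy, or None."""
    eligible = [p for p in platforms
                if p.security_level >= workload.min_security_level
                and p.availability >= workload.min_availability]
    return min(eligible, key=lambda p: p.cost_per_hour, default=None)
```

Because the policy lives in data rather than in the workload itself, a change in platform characteristics (a new specialist processor, say) changes the placement without any change to the workload definition, which is the flexibility argued for above.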

Conclusion

Big data and the Internet of Things present huge opportunities for companies to become more efficient in a variety of ways (for example, by improving customer intimacy and increasing operational efficiency) that will ultimately improve financial results. However, according to Forrester Research, only 1% of the world's data is currently analysed, so there is enormous scope for growth in this analysis. Unfortunately, this scope threatens uncontrolled growth in the deployment of big data analyses, and that poses a significant risk that projects will get out of hand and that management and oversight will fall by the wayside.

In our view, big data initiatives need careful planning, not just in their own right, but also within the context of future big data projects that can already be foreseen. In addition, there needs to be an infrastructure in place to manage the inter-relationship between the new big data world on the one hand, and traditional environments, such as the mainframe, on the other. Without such a management framework administrative costs will increase, efficiency will be reduced and it is all too likely that siloed applications will cause as much pain and suffering in the future as they have in the past.

Further Information

Further information is available from

Bloor Research overview

Bloor Research is one of Europe's leading IT research, analysis and consultancy organisations. We explain how to bring greater Agility to corporate IT systems through the effective governance, management and leverage of Information. We have built a reputation for telling the right story with independent, intelligent, well-articulated communications content and publications on all aspects of the ICT industry. We believe the objective of telling the right story is to:

Describe the technology in context to its business value and the other systems and processes it interacts with.
Understand how new and innovative technologies fit in with existing ICT investments.
Look at the whole market and explain all the solutions available and how they can be more effectively evaluated.
Filter noise and make it easier to find the additional information or news that supports both investment and implementation.
Ensure all our content is available through the most appropriate channel.

Founded in 1989, we have spent over two decades distributing research and analysis to IT user and vendor organisations throughout the world via online subscriptions, tailored research services, events and consultancy projects. We are committed to turning our knowledge into business value for you.

About the author

Philip Howard
Research Director - Data Management

Philip started in the computer industry way back in 1973 and has variously worked as a systems analyst, programmer and salesperson, as well as in marketing and product management, for a variety of companies including GEC Marconi, GPT, Philips Data Systems, Raytheon and NCR.

After a quarter of a century of not being his own boss Philip set up his own company in 1992 and his first client was Bloor Research (then ButlerBloor), with Philip working for the company as an associate analyst. His relationship with Bloor Research has continued since that time and he is now Research Director focused on Data Management. Data management refers to the management, movement, governance and storage of data and involves diverse technologies that include (but are not limited to) databases and data warehousing, data integration (including ETL, data migration and data federation), data quality, master data management, metadata management and log and event management. Philip also tracks spreadsheet management and complex event processing.

In addition to the numerous reports Philip has written on behalf of Bloor Research, Philip also contributes regularly to IT-Director.com and IT-Analysis.com and was previously editor of both Application Development News and Operating System News on behalf of Cambridge Market Intelligence (CMI). He has also contributed to various magazines and written a number of reports published by companies such as CMI and The Financial Times. Philip speaks regularly at conferences and other events throughout Europe and North America.

Away from work, Philip's primary leisure activities are canal boats, skiing, playing Bridge (at which he is a Life Master), dining out and walking Benji the dog.

Copyright & disclaimer

This document is copyright 2014 Bloor Research. No part of this publication may be reproduced by any method whatsoever without the prior consent of Bloor Research.

Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research's intent to claim these names or trademarks as our own. Likewise, company logos, graphics or screen shots have been reproduced with the consent of the owner and are subject to that owner's copyright.

Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.

2nd Floor, St John Street LONDON, EC1V 4PY, United Kingdom Tel: +44 (0) Fax: +44 (0) Web:


More information

Data Integration Platforms - Talend

Data Integration Platforms - Talend Data Integration Platforms - Talend Author : Philip Howard Publish date : July 2008 page 1 Introduction Talend is an open source provider of data integration products. However, while many open source

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

Big Data at Cloud Scale

Big Data at Cloud Scale Big Data at Cloud Scale Pushing the limits of flexible & powerful analytics Copyright 2015 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For

More information

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ

End to End Solution to Accelerate Data Warehouse Optimization. Franco Flore Alliance Sales Director - APJ End to End Solution to Accelerate Data Warehouse Optimization Franco Flore Alliance Sales Director - APJ Big Data Is Driving Key Business Initiatives Increase profitability, innovation, customer satisfaction,

More information
