The New Rules for Integration


A Unified Integration Approach for Big Data, the Cloud, and the Enterprise

A Whitepaper

Rick F. van der Lans
Independent Business Intelligence Analyst
R20/Consultancy

September 2013

Sponsored by Talend

Copyright 2013 R20/Consultancy. All rights reserved. The Talend Platform for Data Services, Talend Open Studio, and The Talend Unified Platform are registered trademarks or trademarks of Talend Inc. Trademarks of other companies referenced in this document are the sole property of their respective owners.


Table of Contents

1 Management Summary
2 Dispersion of Business Data Across a Labyrinth of Systems
3 From Integration Silos to an Integration Labyrinth
4 The New Rules for Integration
5 Rule 1: Unified Integration Platform
6 Rule 2: Generating Integration Specifications
7 Rule 3: Big Data-Ready
8 Rule 4: Cloud-Ready (Hybrid Integration)
9 Rule 5: Enterprise-Ready
10 Talend and the New Rules for Integration
About the Author Rick F. van der Lans
About Talend Inc.

1 Management Summary

Data is increasingly becoming a crucial asset for organizations that want to survive in today's fast-moving business world. And data becomes more valuable if enriched and/or fused with other data. Unfortunately, most organizations have dispersed their enterprise data over numerous systems, all using different technologies. Bringing all that data together is, and has always been, a major technological challenge. For each system that requires data from other systems, a different integration solution is deployed. In other words, integration silos have been developed that over time have led to a complex integration labyrinth. The disadvantages are clear:

- Inconsistent integration specifications
- Inconsistent results
- Increased time to market
- Increased development costs
- Increased maintenance costs

The bar for integration tools and technology has been raised: the integration labyrinth has to disappear. It must become easier to integrate systems, and integration solutions should be easier to design and maintain to keep up with the fast-changing business world. In addition, organizations are now confronted with new technologies, such as big data systems and applications running in the cloud. All these new demands are changing the rules of the integration game. This whitepaper discusses the following five crucial new rules for integration:

1. Unified integration platform
2. Generating integration specifications
3. Big data-ready
4. Cloud-ready (hybrid integration)
5. Enterprise-ready

In addition, the whitepaper describes The Talend Platform for Data Services, which fully supports data integration and application integration with one unified platform. It also explains how Talend's product meets the new rules for integration.

2 Dispersion of Business Data Across a Labyrinth of Systems

The Synergetic Effect of Data Integration
The term synergetic effect applies very strongly to the world of business data. By bringing data from multiple IT systems together, the business value of that integrated data is greater than the sum of the individual data elements; one plus one is clearly three here.

For example, knowing which customers have purchased which products is valuable. Knowing which customers have returned products is valuable as well. But it could be even more valuable to bring these data elements together. This may reveal, for example, that a particular customer who purchases many products returns most of them. This customer is probably not as valuable as some may think. Data becomes more valuable if enriched and/or fused with other data.

Dispersion of Business Data Across Many IT Systems
When all the business data is stored in one IT system, bringing data together is technically easy. Unfortunately, in most organizations business data has been dispersed over many different systems. For example, data on a particular customer may be distributed across the sales system, the finance system, the complaints system, the customer accounts system, the data warehouse, the master data management system, and so on.

Usually, the underlying reasons for this situation are historical. Through the years, organizations have created and acquired new systems; they have merged with other organizations that brought their own systems; and they have rented systems in the cloud. In addition, when new systems were developed to replace older ones, rarely did the new systems fully replace the old. In most cases, these legacy systems have been kept alive and are still operational. The consequence of all this is a dispersion of business data.

Besides the fact that data is stored in many systems, an additional complexity is that many systems use different implementation technologies. Some use SQL databases to store data, others use pre-SQL systems, such as IDMS, IDS, and Total, and more and more data is available through APIs such as SOAP and REST. And don't forget the new generation of NoSQL systems. The use of heterogeneous technologies for storing data raises the complexity of integrating data.

In conclusion, today it's hard for users to find and integrate the data they need to get the desired synergetic effect. To them, it feels as if their business data has been hidden deep inside a complex data labyrinth.

3 From Integration Silos to an Integration Labyrinth

The Integration Silos
More and more IT systems need to retrieve and manipulate data stored in multiple systems. For example, a new portal for supporting customer questions needs access to a production database and an ERP system to get all the required data to handle the customer requests. Another example is a website designed for customers to order products online, which needs to query data from, and insert and update data in, various production applications. A third example is a report that shows what has happened with a particular business process, which also requires integrated data from multiple systems.

For most of these systems dedicated integration solutions have been developed. For example, the website may use an integration solution developed with an ESB (Enterprise Service Bus). This bus is used to extract data from and insert data into the underlying production systems.

The company portal, on the other hand, may use a dedicated portal server, whereas the reporting environment may be supported by an integration solution based on a data warehouse and an ETL tool. Some newer self-service reporting tools deploy their own lightweight data integration technology.

The Integration Labyrinth
Considering all these data integration efforts, one can easily speak of integration silos, because for each application or group of applications a dedicated integration solution is developed. This is a highly undesirable situation, because eventually this approach leads to an integration labyrinth.

Disadvantages of Integration Silos
Although having a dedicated integration solution may be handy for the system involved, this approach clearly has some weaknesses:

- Inconsistent integration specifications: Because the integration specifications are distributed over many integration solutions, it's difficult to guarantee that rules in different solutions for integrating the same data are implemented consistently.
- Inconsistent results: If different sets of integration specifications are applied, the results from different integration solutions may be inconsistent. This inconsistency will reduce trust in the data and the supporting systems.
- Increased time to market: Because the integration specifications are replicated, changing them enterprise-wide in all relevant solutions is time-consuming. This slows down the implementation, and thus the time-to-market, of new systems.
- Increased development costs: When the same systems are integrated by different integration solutions, the same integration specifications have to be implemented multiple times, thus increasing the development costs.
- Increased maintenance costs: Changing integration specifications in multiple solutions implies changing them in many different tools and programming languages, which requires different development skills. This raises the costs of changing integration specifications considerably.

4 The New Rules for Integration

The new business demands described in the previous sections have raised the bar for integration tools and technology; the integration labyrinth has to disappear. It must become easier to integrate systems, and integration specifications should be easier to change and maintain to keep up with the fast-changing business world. These new demands are changing the rules of the integration game.

This whitepaper describes the following five crucial new rules for integration:

1. Unified integration platform
2. Generating integration specifications
3. Big data-ready
4. Cloud-ready (hybrid integration)
5. Enterprise-ready

In the next sections, each of these five rules is explained in detail.

5 Rule 1: Unified Integration Platform

Multitude of Integration Technologies
Organizations can select from a multitude of technologies for integrating systems and data sources, such as ESBs (Enterprise Service Buses), ETL tools, data replicators, portals, data virtualization servers, and homemade code. Today, they can even use the lightweight integration technology embedded in self-service reporting tools to integrate data sources.

It's good that this wide range of integration styles exists, because no single integration style is perfect for every integration problem. For example, when data is integrated from multiple data sources and copied to one data source using a batch-oriented approach, ETL is the preferred approach, whereas when individual data messages must be transmitted from one application to another, an ESB is undoubtedly the recommended solution. So, different problems require different solutions.

However, the current situation is that organizations really are deploying all these different solutions, and this leads to a duplication of integration specifications. For example, when an ETL solution has been designed to extract data from a particular database, and a data replicator uses that same database, there is a big chance that comparable integration specifications have been entered for both solutions. Or, when an application is accessed by an ESB and a portal to retrieve data, comparable specifications have probably been developed for each of them.

A Unified Integration Platform
In the situation described here, the wheel is reinvented over and over again. It leads to integration silos, and the previous section lists their disadvantages. To solve this problem, it's essential that integration tools support many different integration styles with one single design and runtime platform. Developers should not have to switch to another tool if they want to switch from, for example, ESB-style to ETL-style integration. Nor should they have to switch to another tool if data is moved to another database platform, or if the applications use another API type. There should be one integrated platform in which all the integration specifications (logical and technical) are stored only once.

Rule 1: Integration tools must support unified integration capabilities.
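To illustrate what storing an integration specification only once buys, here is a minimal, hypothetical Python sketch (not any product's API; all names are invented): a single transformation specification is reused by both an ETL-style batch job and an ESB-style message handler.

```python
# A minimal sketch (hypothetical): one integration specification, reused by
# an ETL-style batch run and an ESB-style message handler, so the logic is
# defined and maintained in exactly one place.

def transform(record: dict) -> dict:
    """The single, shared integration specification."""
    return {
        "customer_id": record["id"],
        "name": record["name"].strip().title(),
        "country": record.get("country", "unknown").upper(),
    }

def run_etl_batch(source_rows: list) -> list:
    """ETL style: apply the specification to a whole batch of rows."""
    return [transform(row) for row in source_rows]

def on_message(message: dict) -> dict:
    """ESB style: apply the same specification to one message in flight."""
    return transform(message)

if __name__ == "__main__":
    rows = [{"id": 1, "name": " jane doe ", "country": "nl"}]
    print(run_etl_batch(rows))   # batch-oriented integration
    print(on_message(rows[0]))   # message-oriented integration
```

Because both styles call the same transform, a change to the specification automatically reaches the batch path and the message path; with integration silos, that change would have to be made twice.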

6 Rule 2: Generating Integration Specifications

Logical Versus Technical Integration Specifications
Regardless of the integration technology in use, developers have to enter integration specifications. These specifications can be classified in two groups: logical and technical integration specifications. The former group deals with the what and the latter with the how. Logical specifications describe the structure of source and target systems, the required transformations, merge and join specifications, cleansing rules, and so on. They purely indicate what should be done, not how. That's where the technical specifications come in. They deal with the specific APIs of source and target systems, performance and efficiency aspects, etcetera.

It's everyone's dream that developers of integration solutions would only need to focus on the logical aspects of integration, and not on the technical aspects. For example, it should be irrelevant for developers whether data has to be extracted from a classic SQL system, from a Hadoop system, or from a Salesforce.com application running in the cloud. They should be focusing on the logical structure of the source data, the logical structure of the target system, and which transformations to apply. They should not have to focus on specific APIs, database concepts used, encryption aspects, etcetera. Unfortunately, in many integration tools this is not the case. Developers do have to know how to extract data from Hadoop using Hive, Pig, or HBase, via an ESB using a SOAP-based interface, via a REST interface, via JMS, or via one of the many other alternatives.

Abstraction Through Code Generation
An integration solution should hide all the technical integration aspects and let developers focus on the logical aspects. Code generation is a proven technique to hide technical aspects, and it has been applied very successfully in the IT industry for many years. Numerous examples exist where code generation is used to hide technical aspects from developers:

- Starting in the 1960s, Cobol compilers generated assembler code, and by doing so they concealed many of the technical difficulties of assembler programming.
- Another successful example is SQL. The distinction between the what and the how has been the basis for SQL. SQL queries only deal with the what: what data should be retrieved from the database? Queries do not indicate how the data should be retrieved efficiently and quickly from disk. SQL database servers generate internal code to access the data and with that hide the technical aspects of data storage and data access.
- Probably the most popular example is Java. Java compilers generate Java byte code.

By generating code, an abstraction layer is created that hides technical details and thus increases the productivity of developers.
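The what/how split can be made concrete with a small, hypothetical sketch (the specification format and generator are invented for illustration): a purely logical specification from which executable SQL is generated.

```python
# A hypothetical code-generation sketch: a logical specification (the what)
# from which technical SQL code (the how) is generated.

LOGICAL_SPEC = {
    "source": "customers",
    "columns": ["id", "name", "country"],
    "filter": ("country", "=", "NL"),   # what to select, not how
    "target": "dutch_customers",
}

def generate_sql(spec: dict) -> str:
    """Generate the technical 'how' (SQL) from the logical 'what'."""
    cols = ", ".join(spec["columns"])
    col, op, val = spec["filter"]
    return (
        f"INSERT INTO {spec['target']} "
        f"SELECT {cols} FROM {spec['source']} "
        f"WHERE {col} {op} '{val}';"
    )

if __name__ == "__main__":
    print(generate_sql(LOGICAL_SPEC))
```

If the target platform changes, only the generator has to change; the logical specification stays untouched, which is exactly the abstraction this rule asks for.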

Advantages of Code Generation
In general, the advantages of code generation are:

- Transparency
- Portability
- Productivity
- Maintenance

The perfect integration tool allows integration developers to focus on the logical specifications and hides all the technical details. For example, developers should not have to learn all the technical peculiarities of Hadoop to be able to extract data, they should not have to study all the details of an ESB, and they should not have to investigate how to extract data from a legacy database, or how to insert data into a column-family store such as Cassandra. Integration tools should understand what the most efficient and fastest way is to work with all those source and target systems.

Is Generated Code Efficient?
The following question has been raised countless times since tools started to generate code: is generated code efficient? Is it as efficient as code written by hand? Maybe the answer is no. Maybe generated code is, in most cases, generic code and therefore not the most efficient code possible. However, if code has to be written by hand, does an organization have the specialists on board who can write more efficient code? And if they can, what are the costs of writing code by hand? In addition, how maintainable is code written by hand? Imagine that the IT industry had never outgrown the world of assembler languages; productivity would have been horrendous.

The discussion on code generation should not be limited to the efficiency of code. A comparison should be made that doesn't restrict the focus to performance and efficiency, but also includes productivity and maintenance. Nowadays, these latter two aspects are considered more important than brute performance and efficiency.

Integration Speed-Up Through Generating Integration Code
The need to integrate systems keeps increasing, and so does the pressure to finish integration projects quickly. What is needed are tools that offer integration speed-up. Because a generator hides all the technical details and developers only need to focus on the what, less time has to be spent on development.

Rule 2: Integration tools must offer integration speed-up through code generation.

7 Rule 3: Big Data-Ready

The Big Data Train Keeps Rolling
There is no stopping it: the big data train left the station a few years ago and continues to travel the world. Many organizations have already adopted big data, some are already relying on these systems, and others are in the process of adopting big data or are studying what big data could mean for them. In a nutshell, big data systems enrich the analytical capabilities of an organization.

Gartner [1] predicts that big data will drive $232 billion in spending through 2016, Wikibon [2] claims that by 2017 big data revenue will have grown to $47.8 billion, and McKinsey Global Institute [3] indicates that big data has the potential to increase the value of the US health care industry by $300 billion and to increase the industry value of Europe's public sector administration by €250 billion.

NoSQL and Hadoop
Developing big data systems is not easy, for the following reasons:

- The sheer size of a big data system makes it a technological challenge.
- A large portion of big data is unstructured, multi-structured, or semi-structured. This means that to analyze big data, structure must be assigned to it when it's being read. This is called schema-on-read and can be complex and resource-intensive.
- Some big data is sensor data. In most cases, sensor data is highly cryptic and heavily coded. These codes may indicate machines, customers, sensor devices, and so on. To be able to analyze this coded data, it must be enriched with meaningful data that exists in other, more traditional data stores, which requires some form of integration.

To tackle the above aspects, many organizations have decided to deploy NoSQL systems for storing and managing these massive amounts of data. NoSQL systems are designed for big data workloads and are powerful and scalable, but they are different from the well-known SQL systems that many developers are familiar with. Here are some of the differences:

- A NoSQL system, as the name implies, does not support the popular SQL database language, nor the familiar relational concepts, such as tables, columns, and records. This means that developers of integration solutions must learn how to handle these new concepts and how to merge them with classic concepts.
- Each NoSQL system supports its own API, database language, and set of database concepts. This means that expertise with one product can't easily be reused with another.
- NoSQL skills are still scarce. Most organizations don't have these skills, and external specialists are not easy to find.

Integration Tools Must Be Big Data-Ready
To be able to exploit the value hidden in these big data systems, the data has to be analyzed and enriched. In other words, data from big data systems has to be integrated with data from traditional systems. Thus, integration tools must be able to integrate these two types of data sources, and this requires that they support NoSQL systems. In other words, today integration tools must be big data-ready.

Rule 3: Integration tools must be big data-ready.

[1] Gartner, October 2012.
[2] Wikibon, Big Data Vendor Revenue and Market Forecast, August 26, 2013.
[3] McKinsey Global Institute, Big Data: The Next Frontier for Innovation, Competition, and Productivity, June 2011.

Requirements for being big data-ready include:

- Data scalability: Integration tools must be able to process massive amounts of data. Technically this means that when an ETL style of integration is deployed, they must support the native load and unload facilities offered by these NoSQL systems (if available).
- Pushdown: NoSQL systems are highly scalable platforms because they can distribute the processing of logic over an enormous set of processors. To be able to exploit this power, integration tools must be able to push down as much of the integration logic into the NoSQL systems as possible. For example, if Hadoop MapReduce is used, an integration tool must be able to generate a MapReduce program that extracts data from HDFS and executes as much of the transformation processing as it can.
- NoSQL concepts: Integration tools must understand the new database concepts supported by NoSQL systems, such as hierarchical data structures, multi-structured tables, column families, and repeating groups. They must be able to transform such concepts to flat relational concepts, and vice versa.
- Bi-directional: Integration tools must be able to read from and write to NoSQL systems.
- Schema-on-read: Integration tools must support schema-on-read. In other words, integration tools must be able to assign a schema to big data when it's unloaded from a NoSQL system (a minimal sketch follows after this list).
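To illustrate schema-on-read, the hypothetical Python sketch below stores raw, unparsed lines (as a NoSQL system might) and assigns a schema only at read time; the field names and line format are invented for illustration.

```python
# A minimal schema-on-read sketch (hypothetical): raw lines are stored
# without structure, and a schema is assigned only when the data is read.

RAW_EVENTS = [
    "2013-09-01T12:00:00;sensor-17;23.5",
    "2013-09-01T12:00:05;sensor-17;23.9",
]

# The schema lives with the reader, not with the stored data.
SCHEMA = [("timestamp", str), ("sensor_id", str), ("temperature", float)]

def read_with_schema(line: str) -> dict:
    """Assign the schema to one raw line at read time."""
    values = line.split(";")
    return {name: cast(value) for (name, cast), value in zip(SCHEMA, values)}

if __name__ == "__main__":
    for line in RAW_EVENTS:
        print(read_with_schema(line))
```

Because structure is applied only at read time, the same stored data can later be read with a different schema, without reloading anything.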

8 Rule 4: Cloud-Ready (Hybrid Integration)

The Success of the Cloud
More and more applications and data are moving to the cloud. To illustrate the growing success of cloud-based systems, here are two quotes from a 2012 IDC study [4]:

- "Worldwide spending on public IT cloud services will be more than $40 billion in 2012 and is expected to approach $100 billion in 2016."
- "By 2016, public IT cloud services will account for 16% of IT revenue in five key technology categories: applications, system infrastructure software, platform as a service (PaaS), servers, and basic storage. More significantly, cloud services will generate 41% of all growth in these categories by 2016."

Not only commercial organizations are moving to the cloud. According to a recent Gartner study [5] into IT spending by governments, their interest in the cloud is growing rapidly as well:

- "Cloud computing continues to increase compared with prior years, driven by economic conditions and a shift from capital expenditure to operational expenditure, as well as potentially more important factors such as faster deployment and reduced risk."
- "Between 30 to 50 per cent of respondents are seeking to adopt public and private cloud-based services and will sign up for an active IT services contract within the next 12 months."

[4] IDC, Worldwide and Regional Public IT Cloud Services Forecast, August 2012.
[5] Gartner, User Survey Analysis: IT Spending Priorities in Government, Worldwide, 2013, 25 January 2013.

When systems are moved to the cloud, technically data is moved to the cloud, from the cloud, or within the cloud (from one system to another).

The Blurring of the Cloud
In the beginning, the cloud was special. There was a clear boundary between systems running in the cloud and systems running on-premises. Today, that boundary is becoming fuzzy. For example, enterprise systems are spilling into the cloud: they run on-premises but access services running in the cloud, data stored on-premises has to be integrated with data stored in the cloud, and on-premises applications are migrated to the cloud or vice versa. The consequence is that more and more hybrid (cloud and non-cloud) systems exist. Furthermore, there are different types of clouds, ranging from public clouds to private clouds, which also blurs the distinction between cloud and on-premises. It's becoming more and more of a sliding scale from 100% cloud to 100% on-premises. Financial, privacy, security, and performance reasons determine where applications and data are best placed. In conclusion, things can and will change over time.

Cloud and Integration
What does the cloud mean for integration? In general, it should be irrelevant for developers of an integration solution where data and applications reside, in the cloud or not. Therefore, integration tools should understand what the most efficient way is to transport data and messages into, from, and within the cloud. In addition, when applications and data sources move into the cloud or back, this should not change the logical integration specifications. The reason is that when data or applications are moved, the logical integration aspects don't change, only the technical aspects. Integration tools must hide technical cloud aspects from integration developers.

Integration Requirements for the Cloud
Technical requirements for integration tools to be able to operate successfully in the cloud are:

- Efficient data transmission: Moving data across the cloud/on-premises boundary, and moving data within the cloud, is not as fast as moving data between local systems. It's therefore important that integration solutions deploy efficient techniques for data transmission. For example, smart compression techniques should be supported.

- Location transparency: It should be hidden from developers whether a system or data source runs in the cloud or not. When a system is moved from on-premises to the cloud, this should have no effect on the logical integration specifications. Only technical integration specifications dealing with location and network communications should have to be changed (see the sketch after this list).
- Support for cloud APIs: With the cloud came various new applications and systems, introducing new APIs and languages. For example, special APIs are available for extracting data from Salesforce.com, Facebook, and Twitter, and new APIs exist for inserting data into some cloud systems as well. Integration solutions should support as many of these typical cloud APIs as possible.
- Secure data transmission: Data and messages that are transmitted over public communication mechanisms must be protected against unauthorized access. Integration solutions should support various encryption mechanisms. Again, it's important that these encryption specifications are independent of the logical integration specifications, so that when another encryption mechanism is required, or when a system is moved and therefore another encryption mechanism becomes relevant, it has no impact on the logical specifications. All the integration work should still work.

To summarize, due to the cloud and all its forms, vendors of integration solutions should invest in supporting all the required features: integration tools must be cloud-ready.

Rule 4: Integration tools must be cloud-ready.

Frank Gens [4] (senior vice president and chief analyst at IDC) worded it as follows: "Quite simply, vendor failure in cloud services will mean stagnation."
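As a rough illustration of location transparency, the hypothetical sketch below separates a logical integration job from a technical connection profile; when the source system moves to the cloud, only the profile is swapped, never the job. All names and endpoints are invented.

```python
# A minimal location-transparency sketch (hypothetical): the logical job is
# written once; only the technical connection profile knows where data lives.

from dataclasses import dataclass

@dataclass
class ConnectionProfile:
    """Technical specification: location, protocol, and so on."""
    name: str
    endpoint: str

# Swapping profiles is the only change when the system moves to the cloud.
ON_PREMISES = ConnectionProfile("crm", "jdbc://crm.internal:5432/sales")
IN_THE_CLOUD = ConnectionProfile("crm", "https://crm.example-cloud.com/api")

def copy_new_customers(source: ConnectionProfile) -> str:
    """Logical specification: what to do, independent of where 'crm' runs."""
    return f"copy new customers from '{source.name}' via {source.endpoint}"

if __name__ == "__main__":
    print(copy_new_customers(ON_PREMISES))   # before the migration
    print(copy_new_customers(IN_THE_CLOUD))  # after it: same logical job
```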

9 Rule 5: Enterprise-Ready

For completeness' sake, the rule of enterprise-ready has been added. Evidently, this is a rule that has always applied: integration technology has always had to be enterprise-ready. More and more organizations are relying on these solutions, and thus the need for enterprise-readiness is crucial. Enterprise-ready means the following:

- Integration tools must offer a high level of robustness, scalability, and performance.
- Integration tools must be enterprise-grade with respect to support.
- Integration tools must support all relevant security technologies, including authorization, authentication, and encryption.
- Integration tools must be DTAP-ready (Development, Testing, Acceptance, Production).
- Integration tools must be easy to monitor and manage.

However, being enterprise-ready is a moving target. What was a reasonable level of scalability five years ago can be far from sufficient today. For example, data warehouses are growing in size, the amount of data to be analyzed grows phenomenally, the number of applications to be integrated increases, and the number of messages to be transmitted between applications grows. When enterprise demands increase, integration technology must follow.

Rule 5: Integration tools must be enterprise-ready.

Open Source Software Is Enterprise-Ready
Some IT specialists still think that open source software is buggy, that it doesn't scale, that its functionality is poor, that the products are simpler versions of their closed source counterparts, and so on. This is an erroneous view of open source software. Open source tools and systems are being deployed in the largest IT systems, and they can easily compete with their closed source competitors. Still, some myths related to open source tools persist:

Myth 1: Open source software is not market-ready. This myth suggests that open source software requires a lot of tweaking before it can be used: no simple install procedure exists, functionality is missing, and the software is buggy. In a way, the suggestion is made that the products are not finished. This is incorrect; professional open source tools are as market-ready as their proprietary counterparts.

Myth 2: Open source products are evaluation versions. For many open source products, commercial and community versions exist. The community versions can be used for evaluation purposes, but they are mature enough to be deployed in operational environments. In fact, many organizations run operational systems that use community versions. The commercial versions may add enterprise-class features or may be more scalable than the community versions.

Myth 3: Open source software has a steep learning curve. This is not true. Whether a product is open source or not has no relationship with the learning curve. It all depends on how the interface of the product has been developed. Many open source products are as easy to use as closed source products, and vice versa: there are closed and open source products that are very hard to use.

Myth 4: No stable and predictable pricing model. Evidently, different pricing models apply for community and commercial versions of open source software. Different vendors use different pricing models for their commercial versions, and some of these models are crystal clear while others are somewhat muddy. This is not much different from the pricing models of closed source software. It's a myth to suggest that all open source software vendors have unclear pricing models.

Myth 5: The loosely united community leads to weak software. The community of developers working on open source software may literally be spread out over the world. Some of these developers are on the payroll of the vendors and some are not. The tools and project management techniques exist today that make geo-distributed development easy. In fact, vendors of closed source software are using this development model more and more as well. This style of development does not lead to weak software.

Myth 6: No enterprise-grade support. Whether vendors offer enterprise-grade support does not depend on whether they offer open source software. It all depends on the maturity of the vendor itself and its willingness to invest in support. More and more open source software vendors are offering enterprise-grade support.

Myth 7: Minimal connectivity options. For integration tools it's important to offer a wide range of connectors for different technologies. One of the strengths of open source software is that it's designed so that others can contribute and add features as well. When customers use very exotic systems for which no connector is available, they can develop that connector themselves and make it public. In other words, this openness makes it possible for a large community to develop a large and fast-growing set of connectivity options. With closed source software, the vendor must build all the features themselves.

10 Talend and the New Rules for Integration

Talend in a Nutshell
Talend Inc. was founded in 2005. They were the first open source vendor of data integration software. In November 2006 they released their first product, the ETL tool called Talend Open Studio. In November 2011, Gartner rated Talend a visionary in their well-known Magic Quadrant for data integration tools. On November 10, 2010, Talend acquired Sopera and with that gained access to a successful, high-end, open source ESB for application integration. With this, Talend had the products, the know-how, and the technology in the two main integration areas: data integration and application integration.

Since the acquisition, they have worked hard to unify the two integration solutions. The result is the solution called The Talend Platform for Data Services, which fully supports data integration and application integration. This approach seriously minimizes the proliferation of integration specifications. It makes the goal of a unified view real. Developers trained in data integration solutions can now re-use their skills when switching to other types of integration solutions.

Meeting the Five Rules for Integration
Section 4 lists the following five new rules for integration:

1. Unified integration platform
2. Generating integration specifications
3. Big data-ready
4. Cloud-ready
5. Enterprise-ready

Rule 1: Integration tools must support unified integration capabilities
By merging the data integration and application integration technologies, Talend offers a real unified integration platform. It consists of the following modules:

- Common graphical development environment: Designers and developers can use a single development environment, called Talend Open Studio, to enter and maintain integration specifications and develop solutions. This module is based on the popular Eclipse extensible integrated development environment. There is no need for developers to learn multiple integration tools. In other words, whether a developer wants to use Talend's ESB or ETL solution, they'll use the same development environment.

- Common repository: Technical integration specifications, such as connectors to data sources and schema definitions, need to be defined only once and are stored in one common repository. This allows them to be shared across different integration solutions. Whether an integration specification is used for ETL-style integration or for ESB-style integration, it's stored only once in the common repository.
- Common runtime environment: Next to seeing only one development environment, Talend developers see one common runtime environment. In reality, however, the code they develop is deployed on various runtime environments. Talend now supports four runtime environments: Java, SQL, Hadoop MapReduce, and Camel. This makes it possible to generate native, optimized code for these environments. So, a developer writing code to extract data from a source system doesn't have to deal with the technical aspects that are specific to SQL systems, Hadoop systems, or cloud-based applications. If integration specifications have to run on MapReduce, optimized MapReduce code is generated, and if they have to execute on a SQL database server, optimized SQL code is generated for that platform.
- Common deployment mechanism: Whether integration logic should run ETL-like or ESB-like, developers use the same deployment mechanism. They don't have to study different deployment mechanisms; the code generator will generate correct and efficient code.
- Common monitoring: Integration must be monitored. Talend supports one monitoring environment: whether ESB or ETL is deployed, the same monitoring tool is used. There is no need to learn and install multiple different monitoring environments. This simplifies the entire integration environment considerably.

Rule 2: Integration tools must offer integration speed-up through code generation
In Talend, whether data integration or application integration is selected, developers design their logical integration specifications independent of the source and target systems. The code generator, which generates code for the various runtime environments, handles the rest. For example, Talend can generate the MapReduce code needed to access data stored in Hadoop HDFS. Developers do not have to study all the complexities of this platform to deploy HDFS. In fact, when, in the future, a new interface for HDFS is invented, Talend will probably support that interface as well. This means that existing integration specifications don't have to be changed: new code is generated for the new interface based on the existing specifications.
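To picture how one logical specification can drive several runtimes, here is a small hypothetical sketch (not Talend's generator or its specification format; all names are invented): the same logical filter is turned into SQL for a database runtime and into a plain function for an in-process, record-at-a-time runtime.

```python
# A hypothetical multi-runtime sketch: one logical specification, two
# generated targets. Illustrates the principle only; it is not Talend's
# generator or specification format.

from typing import Callable

SPEC = {"source": "calls", "column": "duration", "op": ">", "value": 60}

def generate_sql(spec: dict) -> str:
    """Target 1: emit SQL for a database runtime."""
    return (f"SELECT * FROM {spec['source']} "
            f"WHERE {spec['column']} {spec['op']} {spec['value']};")

def generate_predicate(spec: dict) -> Callable[[dict], bool]:
    """Target 2: emit an in-process filter for a record-at-a-time runtime."""
    ops = {">": lambda a, b: a > b, "=": lambda a, b: a == b}
    return lambda record: ops[spec["op"]](record[spec["column"]], spec["value"])

if __name__ == "__main__":
    print(generate_sql(SPEC))                 # runs on the SQL runtime
    is_long_call = generate_predicate(SPEC)   # runs record by record
    print(is_long_call({"duration": 95}))     # True: same logic, other runtime
```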

Rule 3: Integration tools must be big data-ready
Talend is big data-ready. As indicated, its common runtime environment supports Hadoop MapReduce: code is generated for and executed by Hadoop MapReduce using the pushdown technique. How this is done is completely hidden from developers. In other words, integration developers don't have to learn the specifics of Hadoop MapReduce, or any of the other Hadoop layers, to be able to extract data. In addition, existing integration specifications developed for a SQL system can easily be migrated to Hadoop because of this feature; no or minimal specifications have to be altered. So, if such a migration is required because of scalability problems, the current investment in integration specifications remains safe. In addition, Talend is also available for Redshift and Google BigQuery.

Rule 4: Integration tools must be cloud-ready
For Talend, applications in the cloud, databases in the cloud, and messages transmitted through the cloud are all sources or targets. As with big data, the common runtime platform hides the specific aspects of these cloud systems. The result is that applications can be moved from on-premises to the cloud (or vice versa) without having to change integration specifications. The same applies when data is moved from on-premises to the cloud. Talend makes the cloud transparent for integration developers.

Rule 5: Integration tools must be enterprise-ready
Talend has always been enterprise-ready. Many customers are using Talend in large-scale environments, so there is no doubt about its enterprise-readiness. For illustration purposes, here are some large, anonymous business cases where Talend is being deployed. The numbers show how large some of these environments are:

- A large e-commerce company specializes in searching and booking business trips and vacations. They use Talend. They have three systems, all running on MySQL databases. The system supports 300,000 users and processes tens of thousands of auctions every day, generating huge amounts of data. This big data stream is stored in a central warehouse where it can be accessed by the applications. The data warehouse currently holds over one terabyte of data, and this figure is rising rapidly.
- A leading mobile service provider uses Talend in an environment with more than 30 million customers and approximately 200 million phone calls per day. The company has to manage huge volumes of data in quasi real-time. In order to improve its services, invoicing, and marketing practices, the operator needed to extract different types of information from the call detail records and then integrate this data into three different systems for marketing campaigns, pricing simulation, and revenue assurance management.
- The third case is a company that makes complex real estate data available on the web and mobile devices. They receive data for roughly 2 million MLS (Multiple Listing Service) property listing records on a daily basis. The listings include agent and office data and about 17 million photo files. The company consolidates and standardizes all this data in order to efficiently provide property listings to major real estate companies and web portals.

About the Author Rick F. van der Lans

Rick F. van der Lans is an independent analyst, consultant, author, and lecturer specializing in data warehousing, business intelligence, service-oriented architectures, data virtualization, and database technology. He works for R20/Consultancy, a consultancy company he founded. Rick is chairman of the annual European Enterprise Data and Business Intelligence Conference (organized in London). He writes for the eminent B-eye-Network and other websites. He introduced the business intelligence architecture called the Data Delivery Platform in 2009 in a number of articles, all published at BeyeNetwork.com.

He has written several books on SQL. Published in 1987, his popular Introduction to SQL [8] was the first English book on the market devoted entirely to SQL. After more than twenty years, this book is still being sold and has been translated into several languages, including Chinese, German, and Italian. His latest book, Data Virtualization for Business Intelligence Systems [9], was published in 2012.

For more information, send an email to rick@r20.nl. You can also get in touch with him via LinkedIn and Twitter.

[8] R.F. van der Lans, Introduction to SQL: Mastering the Relational Database Language, fourth edition, Addison-Wesley.
[9] R.F. van der Lans, Data Virtualization for Business Intelligence Systems, Morgan Kaufmann Publishers, 2012.

About Talend Inc.

Talend provides integration solutions that truly scale for any type of integration challenge, any volume of data, and any scope of project, no matter how simple or complex. Only Talend's highly scalable data, application, and business process integration platform enables organizations to effectively leverage all of their information assets. Talend unites integration projects and technologies to dramatically accelerate the time-to-value for the business. Ready for big data environments, Talend's flexible architecture easily adapts to future IT platforms. Talend's unified solutions portfolio includes data integration, data quality, master data management (MDM), enterprise service bus (ESB), and business process management (BPM). A common set of easy-to-use tools implemented across all Talend products maximizes the skills of integration teams. Unlike traditional vendors offering closed and disjointed solutions, Talend offers an open and flexible platform, supported by a predictable and scalable value-based subscription model.


More information

Enterprise Data Integration

Enterprise Data Integration Enterprise Data Integration Access, Integrate, and Deliver Data Efficiently Throughout the Enterprise brochure How Can Your IT Organization Deliver a Return on Data? The High Price of Data Fragmentation

More information

IBM System x reference architecture solutions for big data

IBM System x reference architecture solutions for big data IBM System x reference architecture solutions for big data Easy-to-implement hardware, software and services for analyzing data at rest and data in motion Highlights Accelerates time-to-value with scalable,

More information

Big Data: Beyond the Hype

Big Data: Beyond the Hype Big Data: Beyond the Hype Why Big Data Matters to You WHITE PAPER Big Data: Beyond the Hype Why Big Data Matters to You By DataStax Corporation October 2011 Table of Contents Introduction...4 Big Data

More information

Modern Data Integration

Modern Data Integration Modern Data Integration Whitepaper Table of contents Preface(by Jonathan Wu)... 3 The Pardigm Shift... 4 The Shift in Data... 5 The Shift in Complexity... 6 New Challenges Require New Approaches... 6 Big

More information

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel

A Next-Generation Analytics Ecosystem for Big Data. Colin White, BI Research September 2012 Sponsored by ParAccel A Next-Generation Analytics Ecosystem for Big Data Colin White, BI Research September 2012 Sponsored by ParAccel BIG DATA IS BIG NEWS The value of big data lies in the business analytics that can be generated

More information

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization

Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization Transparently Offloading Data Warehouse Data to Hadoop using Data Virtualization A Technical Whitepaper Rick F. van der Lans Independent Business Intelligence Analyst R20/Consultancy February 2015 Sponsored

More information

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS

WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS WINDOWS AZURE DATA MANAGEMENT AND BUSINESS ANALYTICS Managing and analyzing data in the cloud is just as important as it is anywhere else. To let you do this, Windows Azure provides a range of technologies

More information

A Modern Data Architecture with Apache Hadoop

A Modern Data Architecture with Apache Hadoop Modern Data Architecture with Apache Hadoop Talend Big Data Presented by Hortonworks and Talend Executive Summary Apache Hadoop didn t disrupt the datacenter, the data did. Shortly after Corporate IT functions

More information

WHITE PAPER. Data Migration and Access in a Cloud Computing Environment INTELLIGENT BUSINESS STRATEGIES

WHITE PAPER. Data Migration and Access in a Cloud Computing Environment INTELLIGENT BUSINESS STRATEGIES INTELLIGENT BUSINESS STRATEGIES WHITE PAPER Data Migration and Access in a Cloud Computing Environment By Mike Ferguson Intelligent Business Strategies March 2014 Prepared for: Table of Contents Introduction...

More information

Bringing Big Data into the Enterprise

Bringing Big Data into the Enterprise Bringing Big Data into the Enterprise Overview When evaluating Big Data applications in enterprise computing, one often-asked question is how does Big Data compare to the Enterprise Data Warehouse (EDW)?

More information

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

An Oracle White Paper November 2010. Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics An Oracle White Paper November 2010 Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics 1 Introduction New applications such as web searches, recommendation engines,

More information

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes

Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Capitalize on Big Data for Competitive Advantage with Bedrock TM, an integrated Management Platform for Hadoop Data Lakes Highly competitive enterprises are increasingly finding ways to maximize and accelerate

More information

Data Virtualization. Paul Moxon Denodo Technologies. Alberta Data Architecture Community January 22 nd, 2014. 2014 Denodo Technologies

Data Virtualization. Paul Moxon Denodo Technologies. Alberta Data Architecture Community January 22 nd, 2014. 2014 Denodo Technologies Data Virtualization Paul Moxon Denodo Technologies Alberta Data Architecture Community January 22 nd, 2014 The Changing Speed of Business 100 25 35 45 55 65 75 85 95 Gartner The Nexus of Forces Today s

More information

Five Steps to Integrate SalesForce.com with 3 rd -Party Systems and Avoid Most Common Mistakes

Five Steps to Integrate SalesForce.com with 3 rd -Party Systems and Avoid Most Common Mistakes Five Steps to Integrate SalesForce.com with 3 rd -Party Systems and Avoid Most Common Mistakes This white paper will help you learn how to integrate your SalesForce.com data with 3 rd -party on-demand,

More information

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance

Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance Klarna Tech Talk: Mind the Data! Jeff Pollock InfoSphere Information Integration & Governance IBM s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice

More information

OPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT

OPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT WHITEPAPER OPEN MODERN DATA ARCHITECTURE FOR FINANCIAL SERVICES RISK MANAGEMENT A top-tier global bank s end-of-day risk analysis jobs didn t complete in time for the next start of trading day. To solve

More information

ORACLE DATA INTEGRATOR ENTERPRISE EDITION

ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION ORACLE DATA INTEGRATOR ENTERPRISE EDITION KEY FEATURES Out-of-box integration with databases, ERPs, CRMs, B2B systems, flat files, XML data, LDAP, JDBC, ODBC Knowledge

More information

Datamation. Find the Right Cloud Computing Solution. Executive Brief. In This Paper

Datamation. Find the Right Cloud Computing Solution. Executive Brief. In This Paper Find the Right Cloud Computing Solution In This Paper There are three main cloud computing deployment models: private, public, and hybrid The true value of the cloud is achieved when the services it delivers

More information

Big Data Integration: A Buyer's Guide

Big Data Integration: A Buyer's Guide SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology

More information

Capital Market Day 2015

Capital Market Day 2015 Capital Market Day 2015 Digital Business Platform & Product Roadmap Dr. Wolfram Jost Chief Technology Officer February 4, 2015 1 For Internal use only. Market Application infrastructure and middleware

More information

Big Data and Apache Hadoop Adoption:

Big Data and Apache Hadoop Adoption: Expert Reference Series of White Papers Big Data and Apache Hadoop Adoption: Key Challenges and Rewards 1-800-COURSES www.globalknowledge.com Big Data and Apache Hadoop Adoption: Key Challenges and Rewards

More information

The Principles of the Business Data Lake

The Principles of the Business Data Lake The Principles of the Business Data Lake The Business Data Lake Culture eats Strategy for Breakfast, so said Peter Drucker, elegantly making the point that the hardest thing to change in any organization

More information

Talend Global Leader in OSS Data Management

Talend Global Leader in OSS Data Management Talend Global Leader in OSS Data Management Cédric Carbone CTO ccarbone(at)talend.com 2010 Scilab-OW2 Programing Context 2010-09-21 Talend 2010 1 Corporate Overview Leading provider of open source data

More information

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM

SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM David Chappell SELLING PROJECTS ON THE MICROSOFT BUSINESS ANALYTICS PLATFORM A PERSPECTIVE FOR SYSTEMS INTEGRATORS Sponsored by Microsoft Corporation Copyright 2014 Chappell & Associates Contents Business

More information

Chapter 6. Foundations of Business Intelligence: Databases and Information Management

Chapter 6. Foundations of Business Intelligence: Databases and Information Management Chapter 6 Foundations of Business Intelligence: Databases and Information Management VIDEO CASES Case 1a: City of Dubuque Uses Cloud Computing and Sensors to Build a Smarter, Sustainable City Case 1b:

More information

The big data business model: opportunity and key success factors

The big data business model: opportunity and key success factors MENA Summit 2013: Enabling innovation, driving profitability The big data business model: opportunity and key success factors 6 November 2013 Justin van der Lande EVENT PARTNERS: 2 Introduction What is

More information

Building your Big Data Architecture on Amazon Web Services

Building your Big Data Architecture on Amazon Web Services Building your Big Data Architecture on Amazon Web Services Abhishek Sinha @abysinha sinhaar@amazon.com AWS Services Deployment & Administration Application Services Compute Storage Database Networking

More information

The Ultimate Guide to Buying Business Analytics

The Ultimate Guide to Buying Business Analytics The Ultimate Guide to Buying Business Analytics How to Evaluate a BI Solution for Your Small or Medium Sized Business: What Questions to Ask and What to Look For Copyright 2012 Pentaho Corporation. Redistribution

More information

Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration

Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration Oracle Data Integration: CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration Julien Testut Principal Product Manager, Oracle Data Integration Sumit Sarkar Principal Systems Engineer,

More information

TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION

TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION TRANSFORM BIG DATA INTO ACTIONABLE INFORMATION Make Big Available for Everyone Syed Rasheed Solution Marketing Manager January 29 th, 2014 Agenda Demystifying Big Challenges Getting Bigger Red Hat Big

More information

How To Scale Out Of A Nosql Database

How To Scale Out Of A Nosql Database Firebird meets NoSQL (Apache HBase) Case Study Firebird Conference 2011 Luxembourg 25.11.2011 26.11.2011 Thomas Steinmaurer DI +43 7236 3343 896 thomas.steinmaurer@scch.at www.scch.at Michael Zwick DI

More information

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges

Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Managing Cloud Server with Big Data for Small, Medium Enterprises: Issues and Challenges Prerita Gupta Research Scholar, DAV College, Chandigarh Dr. Harmunish Taneja Department of Computer Science and

More information

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014

Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Big Data, Why All the Buzz? (Abridged) Anita Luthra, February 20, 2014 Defining Big Not Just Massive Data Big data refers to data sets whose size is beyond the ability of typical database software tools

More information

How To Integrate With Salesforce Crm

How To Integrate With Salesforce Crm Introduction Turbo-Charge Salesforce CRM with Dell Integration Services By Chandar Pattabhiram January 2010 Fueled by today s fiercely competitive business environment, IT managers must deliver rapid,

More information

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data

Alexander Nikov. 5. Database Systems and Managing Data Resources. Learning Objectives. RR Donnelley Tries to Master Its Data INFO 1500 Introduction to IT Fundamentals 5. Database Systems and Managing Data Resources Learning Objectives 1. Describe how the problems of managing data resources in a traditional file environment are

More information

White Paper. Unified Data Integration Across Big Data Platforms

White Paper. Unified Data Integration Across Big Data Platforms White Paper Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using

More information

Unified Data Integration Across Big Data Platforms

Unified Data Integration Across Big Data Platforms Unified Data Integration Across Big Data Platforms Contents Business Problem... 2 Unified Big Data Integration... 3 Diyotta Solution Overview... 4 Data Warehouse Project Implementation using ELT... 6 Diyotta

More information

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics

HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics HADOOP SOLUTION USING EMC ISILON AND CLOUDERA ENTERPRISE Efficient, Flexible In-Place Hadoop Analytics ESSENTIALS EMC ISILON Use the industry's first and only scale-out NAS solution with native Hadoop

More information

Luncheon Webinar Series May 13, 2013

Luncheon Webinar Series May 13, 2013 Luncheon Webinar Series May 13, 2013 InfoSphere DataStage is Big Data Integration Sponsored By: Presented by : Tony Curcio, InfoSphere Product Management 0 InfoSphere DataStage is Big Data Integration

More information

Solution White Paper Connect Hadoop to the Enterprise

Solution White Paper Connect Hadoop to the Enterprise Solution White Paper Connect Hadoop to the Enterprise Streamline workflow automation with BMC Control-M Application Integrator Table of Contents 1 EXECUTIVE SUMMARY 2 INTRODUCTION THE UNDERLYING CONCEPT

More information

The Ultimate Guide to Buying Business Analytics

The Ultimate Guide to Buying Business Analytics The Ultimate Guide to Buying Business Analytics How to Evaluate a BI Solution for Your Small or Medium Sized Business: What Questions to Ask and What to Look For Copyright 2012 Pentaho Corporation. Redistribution

More information

SAP INTEGRATION APPROACHES

SAP INTEGRATION APPROACHES SAP INTEGRATION APPROACHES Best Practices for SAP application integration projects Abstract: One of the most pervasive challenges for SAP shops is integrating SAP to other applications within their organization.

More information

Apache Hadoop: The Big Data Refinery

Apache Hadoop: The Big Data Refinery Architecting the Future of Big Data Whitepaper Apache Hadoop: The Big Data Refinery Introduction Big data has become an extremely popular term, due to the well-documented explosion in the amount of data

More information

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM

Using Big Data for Smarter Decision Making. Colin White, BI Research July 2011 Sponsored by IBM Using Big Data for Smarter Decision Making Colin White, BI Research July 2011 Sponsored by IBM USING BIG DATA FOR SMARTER DECISION MAKING To increase competitiveness, 83% of CIOs have visionary plans that

More information

Data processing goes big

Data processing goes big Test report: Integration Big Data Edition Data processing goes big Dr. Götz Güttich Integration is a powerful set of tools to access, transform, move and synchronize data. With more than 450 connectors,

More information

GigaSpaces Real-Time Analytics for Big Data

GigaSpaces Real-Time Analytics for Big Data GigaSpaces Real-Time Analytics for Big Data GigaSpaces makes it easy to build and deploy large-scale real-time analytics systems Rapidly increasing use of large-scale and location-aware social media and

More information

Testing Big data is one of the biggest

Testing Big data is one of the biggest Infosys Labs Briefings VOL 11 NO 1 2013 Big Data: Testing Approach to Overcome Quality Challenges By Mahesh Gudipati, Shanthi Rao, Naju D. Mohan and Naveen Kumar Gajja Validate data quality by employing

More information

Next-Generation Cloud Analytics with Amazon Redshift

Next-Generation Cloud Analytics with Amazon Redshift Next-Generation Cloud Analytics with Amazon Redshift What s inside Introduction Why Amazon Redshift is Great for Analytics Cloud Data Warehousing Strategies for Relational Databases Analyzing Fast, Transactional

More information

Implement Hadoop jobs to extract business value from large and varied data sets

Implement Hadoop jobs to extract business value from large and varied data sets Hadoop Development for Big Data Solutions: Hands-On You Will Learn How To: Implement Hadoop jobs to extract business value from large and varied data sets Write, customize and deploy MapReduce jobs to

More information

Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data

Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data Transforming Data into Intelligence Scalable Enterprise Data Integration Your business agility depends on how fast you can access your complex data Big Data Data Warehousing Data Governance and Quality

More information

BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining

BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining BUSINESS INTELLIGENCE Bogdan Mohor Dumitrita 1 Abstract A Business Intelligence (BI)-driven approach can be very effective in implementing business transformation programs within an enterprise framework.

More information