Sagent Data Flow from Group 1 Software
An extract from the Bloor Research report, Data Integration, Volume 1
Sagent Data Flow

Fast facts

Sagent Data Flow, which is now provided by Group 1 Software following the latter's acquisition of Sagent, is aimed at two markets: data integration and data federation. Put simply, it provides both conventional ETL (extract, transform and load) and EII (enterprise information integration) capabilities, either together or singly, using a common platform and approach.

Our view is that there is clear and growing customer demand for a single solution for all data movement requirements and, as one of the first vendors to provide a solution aimed at satisfying this market, Sagent is ahead of the vast majority of its competitors.

Key findings

In the opinion of Bloor Research the following represent the key facts of which prospective users should be aware:

Group 1's Sagent Data Flow is more extensive than almost every other data integration product in the market, in the sense that it provides EII capability as well as ETL, using the same platform.

Data Flow differs significantly from other data integration solutions in that it was originally designed not just as a workbench to visualise transformation processes but also to visualise the data itself. This has two major consequences:

» In conjunction with the product's process-driven approach, the development of data movement workflow is very much simpler. In particular, it makes the development process much more easily understandable by end users and it automates much of the attribute-level mapping that is required by more conventional tools.

» It also means that you can use Sagent Data Flow as a business intelligence tool in its own right, or you can use it directly in conjunction with third party business intelligence tools. According to Group 1, around half of its customer base deploys the product in conjunction with this business intelligence option, and it is particularly popular in departmental environments.
In addition to the sort of icon-based graphical workflow that you would expect when defining data movement processes, the product also provides further workflow capabilities to support scheduling, event triggers and actions, and so forth. This is separate from the process workflow in Data Flow and has a different user interface. While it is fully functional, we think the look and feel of this additional workflow could do with some tidying up.

Bloor Research 2004
We particularly like the fact not just that Group 1 provides a business intelligence front-end, but that Data Flow itself has been designed specifically to support more complex analytic applications, with built-in statistical and other functions that can be used in analytic or data mining environments.

The purchase of Sagent by Group 1 should mean close integration between Data Flow on the one hand and Group 1's data quality products on the other. Group 1 already has a partnership with Evoke for data profiling.

The bottom line

Sagent Data Flow has three unique propositions: first, it is much easier for end users to understand than other products in the marketplace; second, it offers both ETL and EII capabilities in a single package; and, third, it offers direct business intelligence capability via either its own or third party front-ends. While we can see this last facility being useful in the mid-market and for departmental solutions, especially because it means that you can provide business intelligence capabilities without a data warehouse, at the enterprise level it is likely to be the first two considerations that are most important.

Ease of use is always a concern and is a significant advantage for Sagent. However, in the short term (that is, until other vendors extend their products to provide EII capabilities, which could be well into 2005) it is the broader applicability of Sagent Data Flow that should make it most appealing to customers wanting to cut their costs by standardising on a single platform.
Vendor Information

Background information

Sagent was formed in 1995 to address the data integration and business intelligence market. In 1999 it floated on the stock market and, shortly thereafter, introduced aggressive expansion plans. Unfortunately, the subsequent downturn in the market started to make life more difficult and this was exacerbated when the company was the victim of fraud, resulting in its having to re-state its financial position. As a result the company got into financial difficulties and it was acquired by Group 1 in a purchase that was eventually completed in October.

Group 1 Software was founded in 1982, initially as a company specialising in address management. Today it has several product lines in addition to the Sagent data integration products: its data quality solutions; direct marketing and business geographics applications; and DOC1, its flagship customer communications management solution. DOC1 manages every aspect of a company's critical business documents, from data acquisition and content creation through multi-channel delivery, archiving and web-based customer care. Historically, the company's data quality solutions have been aimed at US-based organisations, perhaps with an overseas dimension, but not overseas companies per se.

There are, in fact, several reasons why Group 1 purchased Sagent. First, it was looking to replace its previous data integration partner anyway. Secondly, Sagent had sales and distribution operations in twenty countries worldwide, which would enable Group 1 to gain additional product distribution capabilities in a number of growing markets including Europe, Japan, China, South Africa, and others. In order to leverage these offices it is likely that Group 1 will expand its data quality offerings out from a primarily US-based position in the near future.
The company uses Sagent Data Flow both as a companion product to its data quality technologies and to import data into DOC1 and, of course, it is marketed worldwide as a stand-alone product in its own right.

Group1/Sagent web address:

Product availability

Sagent Data Flow is currently in release 5.0, which runs on Windows platforms and Sun Solaris. AIX and HP-UX implementations are currently in development and should be available shortly. Linux support is another possibility, although this is likely to be dependent on Linux platform support in the DOC1 Suite and its success.

For source and target connectors, Sagent supplies some of its own technology but otherwise usually partners with iWay. For real-time environments where you
need change data capture facilities, the company previously partnered with Striva before that company was acquired by Informatica. However, Group 1's technology is open, so you could use comparable facilities from a company such as Attunity.

The product's repository is based on relational technology and can be run on top of SQL Server, Oracle, DB2, Informix or Sybase. For data profiling, Group 1 partners with Evoke Software.

Financial results

Group 1 is a public company, quoted on NASDAQ. In the most recent quarter (Q3 2003/4) it reported revenues of $31.3m, up 17.4% compared to the same period last year. Net income was reduced, however, due to the acquisition of Sagent, falling from $2.3m in the same period last year to $1m in the most recent quarter.

In the last full accounting year (2002/3), Group 1 reported total revenues of $104.3m, up 17% compared to the previous year, with net income almost doubling from $4.4m to $8.7m. In the first nine months of the current year revenues stand at a total of $80.8m versus $75.1m for the first three quarters of 2002/3, with net income also improved at $5.6m as opposed to $5.2m.

The company has 650 employees and international offices in Canada, the UK, Germany, Italy, France, Japan and Singapore. It also operates in a number of other European, Latin American and Asian countries.
Product Information

Introduction

Sagent Data Flow differs from other data integration products in three distinctive ways.

The first is that the product is focused on processes rather than mapping. Using most data integration products, the primary emphasis in development is on creating the definitions that allow you to map a data field or record in the source data to a relevant target field or record. This is essentially an IT exercise. In other words, it puts you into the equivalent of a development situation in which the user agrees a specification and then the developer goes away and produces software that meets that specification. The problem with this sort of approach is that it typically fails to capture the exceptions that are common in all business environments. This is why most development environments no longer work simply on the basis of a written specification and why a more interactive environment between the users and developers is to be preferred. However, this is only a reasonable proposition when the IT department can talk to end users using a terminology that the latter can understand. This is what the emphasis on processes, operating at a higher level than mappings, provides: it is the equivalent of rapid application development (RAD) applied to data movement.

The second big difference is that, unlike other products in the marketplace, the Sagent product does not stop with the traditional ETL (extract, transform and load) processes that make up data integration. Indeed, the company describes its approach as the visualisation of processes and data. The key point here is that Sagent Data Flow can actually present the results of its E and T processes (without the "L") to the user. In other words, you can see the processed data, graph it and so on, directly within the Sagent environment.
Alternatively, you can also view the data using third party business intelligence tools or spreadsheets or, of course, you can load it into a target database in a conventional manner. One of the advantages of being able to see the results of the E and T processes is precisely that you can support the user community in ensuring that you are actually providing what is required (as discussed in the previous paragraph), as well as its being useful in its own right. Further, this facility provides limited but useful data profiling capabilities. That is, you can visually inspect the data not just for exceptions but also for errors, though we would normally recommend the use of Evoke Axio or another specialist product for this purpose.

Finally, the third major difference between Group 1's approach and that of most of its competitors derives from the fact that it uses an engine-based approach. This, in itself, is by no means unique. However, the engine itself is effectively a rules engine that can be set to suit different environments. In particular, it can be tuned to provide mass movement of data where the number of users involved is minimal. Alternatively, the company provides a second configuration in which
the engine has been optimised for large numbers of users but relatively small amounts of data. These two configurations are called the Data Load Server and the Data Access Server respectively.

In practice, what this means is that the Data Load Server provides conventional data movement capabilities à la ETL, while the Data Access Server provides what Sagent calls ETP (extract, transform and present), which would more commonly be referred to as EII (enterprise information integration) or data federation. We differentiate these two technologies on the basis that EII is solely for the retrieval of data while data federation also allows updates. Since updating data is in the nature of ETL products, we would categorise Sagent Data Flow as providing data federation, although it does not support two-phase commit as some specialist products do.

There are some other features that you need for data federation or EII. You need to be able to capture real-time data, which you can do through third party connectors such as those provided by Attunity. Also, it is advantageous to be able to process data on the source system (to do joins and so on locally) where appropriate. Group 1 can do this by implementing its engine on that source, when it is a supported platform. However, this precludes this capability on mainframes in particular, unless you use intermediary technology from a vendor such as Corigin (which provides access to mainframe data directly from an open platform). To extend its data federation capabilities we would therefore like to see a closer relationship between Sagent and some of the companies mentioned.

We regard this combined ETL and EII capability as particularly significant. We expect this to be one of the directions in which the market moves in the years ahead, because there are obvious synergies between the two approaches.
The fact that Group 1 is one of the first (if not actually the first) vendors to recognise this trend is significant in itself and will give the company a major advantage if the market moves in the direction we expect.

Architecture

The architecture of Sagent Data Flow is illustrated in the diagram left.

[Figure: Data Flow technical overview]

With respect to the discussion in the previous section, it is important to note the flexibility of Data Flow. You can implement both a Data Load Server and a Data Access Server, which will provide the sort of business intelligence output we have referred to (via either WebLink Server or Sagent OpenLink); or you can implement just a Data Load Server, which will provide conventional data movement capabilities; or you can implement a Data Access Server on its own, which will support a data federation solution. The other elements in this diagram are self-descriptive and we will discuss these in the sections that follow.
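The difference between the load-oriented and access-oriented roles described above can be illustrated with a minimal sketch. It is not Sagent code and says nothing about how Data Flow is implemented: it uses Python's standard sqlite3 module, and the table names, column names and function names (orders, customer_totals, load_style, access_style) are hypothetical, chosen only for illustration. Both paths share the same extract and transform step; the load-style path writes the result to a target table, while the access-style path simply returns it to the caller.

```python
import sqlite3


def make_source() -> sqlite3.Connection:
    """Build a toy operational source (hypothetical schema and data)."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE orders (customer TEXT, amount_pence INTEGER)")
    src.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("acme", 1250), ("acme", 750), ("globex", 4000)],
    )
    src.commit()
    return src


def extract_transform(src: sqlite3.Connection) -> list[tuple[str, float]]:
    """E and T: read from the source, total per customer, convert pence to pounds."""
    rows = src.execute(
        "SELECT customer, SUM(amount_pence) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    return [(customer, total / 100) for customer, total in rows]


def load_style(src: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """ETL: persist the transformed rows in a target table; return the row count."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS customer_totals "
        "(customer TEXT PRIMARY KEY, total_pounds REAL)"
    )
    rows = extract_transform(src)
    target.executemany("INSERT OR REPLACE INTO customer_totals VALUES (?, ?)", rows)
    target.commit()
    return len(rows)


def access_style(src: sqlite3.Connection) -> list[tuple[str, float]]:
    """ETP/EII: hand the transformed rows straight to the caller; nothing is stored."""
    return extract_transform(src)


if __name__ == "__main__":
    source = make_source()
    warehouse = sqlite3.connect(":memory:")
    load_style(source, warehouse)

    # A new order arrives in the source after the load has run.
    source.execute("INSERT INTO orders VALUES ('globex', 1000)")
    source.commit()

    loaded = warehouse.execute(
        "SELECT customer, total_pounds FROM customer_totals ORDER BY customer"
    ).fetchall()
    print("loaded copy:", loaded)
    print("on-demand view:", access_style(source))
```

In this toy setup the loaded copy stays as it was until the next load runs, whereas the on-demand view reflects the source at query time, which is the trade-off between the two configurations that the text describes.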
Development

The core elements for development are the Design Studio, Repository and Transforms as illustrated above. As we have mentioned, Data Flow is process-driven. That is, the emphasis is at a high level so that developers can work in conjunction with users, and as much as possible of the lower level functionality is automated.

[Figure: Process-level development (Plan)]

An example of this process-level development, which Sagent refers to as a plan, is illustrated left, where the yellow boxes (objects) represent individual operations and the blue boxes represent a set of operations that have been previously defined and stored in the repository so that they can be reused. You can click on a blue box to see the plan that it represents. Each of the objects can have business function or process names, and comments can be added to further specify the transformation taking place, for documentation and collaboration purposes.

In practice, the developer would step through this plan with the user, showing him or her the data that was generated at each stage of the process. This is where the product really differs from other approaches. Conventionally, you do not present data at this stage, which is why developers have to spend a lot of time mapping attributes in source data to attributes that will apply on the target. By concentrating on the data that you want to see, this level of detail can be automated by the software.

This data will typically be displayed in a grid (table) format below the plan, where the layout of the grid is defined by dragging and dropping the relevant columns into the display pane. This data can then be manipulated in a variety of ways. For example, you might want to use the Analytical Calculator, shown below, to perform a variety of statistical and other functions, using set-based logic. Note that when Group 1 refers to this as an analytical calculator, it really means it.
This was originally designed to support analytic applications and you can, for example, use this for segmentation purposes.

[Figure caption: Allowing data in the view to be manipulated using set-based logic]

In addition, there is an Expression Builder, as illustrated, which allows you to define additional functions (transformations) that manipulate the data inside a plan, using Boolean logic and other options. Specifically, there are facilities to allow new (calculated) data (virtual columns) to be added to the view of the data.

Sagent Data Flow fits into the general class of a black-box product as opposed to a code-generating product. You can, however, view the (native) SQL that is generated for access to source data and edit this. It would be unusual to want to do this, though, except where you wanted to use existing stored procedures or to edit join paths, which may be important in EII environments. Any such edits
remain within the Sagent environment and, thus, the audit trail that the product provides will encompass any such changes. The same is not true if you want to extend the transformation environment (which you can do, using VBScript, Perl and so on). However, this again is an unusual requirement.

Other facilities relevant to development include the generation of documentation, which derives from the repository, as well as a number of standard reports that will run against the repository.

Deployment

We have already discussed the Data Flow engines to some extent. However, it is worth adding that Sagent has a massively parallel data pipeline processing engine, allowing it to support large data volumes for parallel extraction and loading capabilities, as well as other performance and scalability features. In addition, there are some other elements of the Sagent environment that deserve mention, notably the Automation module.

The Automation component of the product is used to set up schedules, event triggers and associated alerts (which can be sent to a variety of device types), and so on. The stages in an automation process are defined using a fairly standard icon-driven, drag-and-drop approach to create a data flow diagram that includes branching and the other sorts of facilities that one might expect. The one comment that we would make about these diagrams is that they look a little old-fashioned: there is, for example, no facility to automatically produce orthogonal lines, which means that diagrams tend to look messy.

If you do not wish to use Sagent's own scheduling capabilities then you can use a third party scheduling product, provided that it has a command line interface.

Business Intelligence

We have already detailed three reasons for wanting to be able to see data within a data movement environment.
One is to assist in the development process and in the interchanges that are necessary between the end user and the developer; the second is an extension to the first, for identifying errors rather than exceptions; and the third is for its own sake, to support decision making.

In terms of the business intelligence facilities that are provided, Sagent offers two products: WebLink Server and Sagent OpenLink.

WebLink Server is, as its name suggests, a browser-based product, which is used to present the business intelligence capabilities of Sagent Data Flow to the end user. Results are displayed in report format, in crosstab format, or as charts. In the latter case, a Chart Wizard is provided to allow you to select the type of presentation that you want, ranging from simple bar charts to scatter plots and sophisticated 3D objects such as doughnuts, as well as the style, layout and axes definitions that best suit your requirements.
The OLAP cubes that underpin this reporting are constructed in memory alongside the data that will populate the cube, although you can take a snapshot of the data at any time, which can be stored for reuse. This in-memory facility should be particularly useful in an EII environment because it will provide more complex capabilities (drill-down, pivot, rollup and so forth) that are not normally available from other vendors in this sector.

Other business intelligence facilities include filtering, embedded calculations, publish and subscribe capabilities for report delivery, sorting, ranking, exception highlighting, and concatenation. Aggregations can be performed either on the source database or within the WebLink client, as required.

Finally, it is worth mentioning the product's integration with Microsoft Excel. In this case you can work completely within the Excel environment and simply point at Sagent as a source. The fact that Data Flow is actually performing a whole series of processes behind the scenes is completely transparent to the user.

Sagent OpenLink allows you to leverage existing investments in third party business intelligence tools by presenting the data collected by Sagent Data Flow to these environments through ODBC or JDBC calls to the BI tools.

Summary

Group 1's Sagent Data Flow is likely to be particularly appealing to the mid-market, and for departmental solutions within larger organisations. The fact that it is both easy to use and can provide business intelligence capabilities without complex additional requirements (such as a data warehouse) should be especially attractive. Moreover, the fact that this business intelligence can address heterogeneous operational data sources from within the same environment is a significant differentiator.

At the enterprise level, the business intelligence capabilities of Sagent Data Flow are less likely to be of interest because of existing investments.
Nevertheless, the ability to present data as part of the data integration exercise will still be useful in ensuring a smooth implementation. However, it is the combination of data movement and federation (ETL and EII) in a single platform that is likely to be of most interest to this market. Few, if any, other vendors have a product that spans this space and it is obviously a significant advantage to be able to do both of these things from a single platform.
Copyright & Disclaimer

This document is subject to copyright. No part of this publication may be reproduced by any method whatsoever without the prior consent of Bloor Research.

Due to the nature of this material, numerous hardware and software products have been mentioned by name. In the majority, if not all, of the cases, these product names are claimed as trademarks by the companies that manufacture the products. It is not Bloor Research's intent to claim these names or trademarks as our own.

Whilst every care has been taken in the preparation of this document to ensure that the information is correct, the publishers cannot accept responsibility for any errors or omissions.