Making SAP Information Steward a Key Part of Your Data Governance Strategy




Part 2: SAP Information Steward Overview and Data Insight Review

Part 1 in our series on Data Governance defined the concept of Data Governance and gave suggestions on how to go about implementing an initial program at a corporate level. The definition that we use is:

Data Governance is your organization's management strategy to meet the data quality needs of final data users and consumers. It verifies that data meets your organization's security requirements and ensures that it complies with any regulatory laws. It is the marriage of data quality, data management, and risk management principles. It is implemented via corporate policies, procedures, controls, and software.

Now that we know what it is and how to start a program, let's discuss how SAP Information Steward can fit into a data governance initiative.

SAP Information Steward is an enterprise-level data quality solution that allows you to profile data, perform impact and lineage analysis, construct a corporate dictionary, and define custom cleansing rules for incoming data. Each of these functions is performed by a different module of the software: Data Insight, Metadata Management, Metapedia, and Cleansing Package Builder. Your initial data governance goal will determine which of these to use first.

Data Insight is the data profiling tool and data quality monitor. Metadata Management is the impact and lineage analysis tool that can determine where a piece of data is used throughout the enterprise and what may affect that data. Metapedia is the corporate dictionary where business terms can be defined for use throughout the organization. Finally, Cleansing Package Builder is the data quality tool that allows subject-matter experts to define transformations and cleansing rules in order to standardize a particular set of data.
This post will cover Data Insight in detail, while subsequent posts will break down the other modules of the Information Steward tool.

Data Insight allows you to profile data from a range of sources that include standard relational databases, SAP HANA, SAP ERP, SAP Master Data Services, and even flat files. Data profiling is simply the process of analyzing the data that exists in a source and collecting statistics from that analysis. It answers the question "What does my data source actually contain?", as there is often a disparity between what a source should contain and what it contains in reality. Data profiling is the starting point for data integration tasks, data warehouse projects, and many data governance programs. Without this baseline, one cannot properly measure the data quality improvements that are achieved through a data governance or data quality program.

There are several flavors of data profiling. The basic Column Profiling option collects statistics for a column such as the percentage of null values, the distinct values found, the minimum and maximum values in the field, common patterns in the data, and much more. Advanced profiling is also possible with Address profiling, Redundancy profiling, Uniqueness profiling, and Dependency profiling.

Address profiling uses the Data Services job engine to parse address data through its Address Cleanse transform and return a percentage breakdown of good and bad addresses in the data. It also displays the percentage of addresses that would be correctable if the data were run through a Data Services Address Cleanse transform. Redundancy profiling checks the amount of overlap in data between two sources, and is good for rooting out referential integrity issues. Uniqueness profiling determines the percentage of unique values in a field. Lastly, Dependency profiling allows you to determine the degree of dependency that two or more columns of data have upon each other.

The screenshot below shows the results of column data profiling of a database table. Note that certain results only appear when appropriate; for example, an average appears only for numeric columns and string length results appear only for character columns.

Figure 1 Sample results of column data profiling

Information Steward also stores sample rows of data that fit a certain profiling result. This allows you to drill down to record-level detail without having to leave the tool. In the screenshot below, I first selected the "Value" result of the GPO_NAME column. I then selected the value NOVATION that appeared in the list of values in the right-hand panel. Note that these values are sorted from most to least frequent, as determined by the number of occurrences of each value in the dataset. The bottom panel contains raw data that fits these selections.
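To make the profiling types described above more concrete, here is a minimal Python sketch of the kind of statistics involved. It is not how Information Steward computes its results; the function names, the pattern notation (9 for a digit, X or x for a letter), the reading of "uniqueness" as distinct values over non-null values, and the sample values other than GPO_NAME and NOVATION are all assumptions made for illustration.

```python
from collections import Counter


def pattern_of(value: str) -> str:
    """Reduce a string to a coarse pattern: digit -> 9, letter -> X or x."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("X" if ch.isupper() else "x")
        else:
            out.append(ch)  # keep punctuation and spaces as they are
    return "".join(out)


def profile_column(values):
    """Basic column profile; None stands in for NULL.

    Non-null values are assumed to be mutually comparable (all strings
    or all numbers) so that min and max are well defined.
    """
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0,
        "distinct_count": len(counts),
        # Assumption: uniqueness = distinct values / non-null values.
        "uniqueness_pct": 100.0 * len(counts) / len(non_null) if non_null else 0.0,
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        # Most frequent values first, like the value list in the tool.
        "top_values": counts.most_common(3),
        "patterns": Counter(pattern_of(str(v)) for v in non_null).most_common(3),
    }


def redundancy_pct(child_values, parent_values):
    """Share of distinct non-null child values that also exist in the parent."""
    child = {v for v in child_values if v is not None}
    parent = {v for v in parent_values if v is not None}
    return 100.0 * len(child & parent) / len(child) if child else 0.0


def dependency_pct(rows, determinant, dependent):
    """Share of rows consistent with 'determinant decides dependent'.

    For each determinant value, the most common dependent value is
    treated as the expected one; other rows count as violations.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[determinant], Counter())[row[dependent]] += 1
    consistent = sum(c.most_common(1)[0][1] for c in groups.values())
    return 100.0 * consistent / len(rows) if rows else 0.0


if __name__ == "__main__":
    # Made-up sample column; only GPO_NAME and NOVATION come from the post.
    gpo_name = ["NOVATION", "NOVATION", "PREMIER", None]
    print(profile_column(gpo_name))
```

A redundancy result below 100 percent for a foreign key column against its parent key is the kind of signal that points to a referential integrity problem, which is why the overlap is measured from the child side here.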

Figure 2 The list of values for the column GPO_NAME appears on the right and sample data rows appear in the bottom panel for the value NOVATION

Data Insight's second focus is that of a data quality monitor. This is achieved through the use of custom data validation rules and data quality scorecards. A data validation rule defines the form that a piece of data should take and scores the results when the rule is run against one or more sources. For example, if a database column should contain only US zip codes, a rule for that column could be that the data in it should be either five digits in length or five digits followed by a hyphen followed by four more (e.g. 99999 or 99999-9999). The rule that is created can then be bound to any source field that should contain US zip codes. Furthermore, the scores from each run of a rule are stored in the Information Steward repository and can be analyzed over time to determine whether data quality is improving or getting worse. The rule building tool offers a wide range of flexibility to build rules that will suit your organization's needs.

The screenshot below shows the various score results of a rule that has been selected in the left panel. This rule has been bound to multiple sources and has been run multiple times. Take note of the scores per source binding, the color that corresponds to each score (configurable per binding), and the "From Last" arrow that indicates whether the score improved or got worse since the last time the score was calculated. This screen also displays the total number of rows that were analyzed and the number that failed the rule.
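The zip code rule described above can be sketched outside the tool as a regular expression plus a small scoring function. This is only an illustration of the idea, not Information Steward's rule syntax; the names `passes_zip_rule`, `score_rule`, and `trend`, the use of a pass percentage as the score, and the sample values are assumptions.

```python
import re

# Format from the text: five digits, optionally a hyphen and four more digits.
ZIP_RULE = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def passes_zip_rule(value) -> bool:
    """True only when the whole value matches 99999 or 99999-9999."""
    return isinstance(value, str) and ZIP_RULE.fullmatch(value) is not None


def score_rule(values, rule):
    """Run a rule against a bound column.

    Returns (pass percentage, rows analyzed, failed values). Using a
    percentage as the score is an assumption made for this sketch.
    """
    total = len(values)
    failed = [v for v in values if not rule(v)]
    score = 100.0 * (total - len(failed)) / total if total else 0.0
    return score, total, failed


def trend(previous_score, current_score) -> str:
    """Direction of change since the last run, like a 'From Last' indicator."""
    if current_score > previous_score:
        return "up"
    if current_score < previous_score:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Made-up column values bound to the rule.
    zip_column = ["30301", "30301-1234", "3030", "ABCDE", None]
    score, analyzed, failed = score_rule(zip_column, passes_zip_rule)
    print(score, analyzed, failed)
```

Because the rule is a plain function of a single value, the same rule can be passed to `score_rule` for any column that should hold US zip codes, which mirrors the idea of binding one rule to several sources.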

Figure 3 Rules are displayed in the left panel, with the selected rule's bindings on the right. Scores are calculated and kept historically

Scorecards are the final visualization piece of data quality monitoring. They allow data validation rules to be aggregated to higher levels of analysis in Data Quality Dimensions and Key Data Domains. Data Quality Dimensions are generic categories that rules fall under, such as Quality, Accuracy, and Completeness, and are assigned when a rule is created. Key Data Domains are user-driven areas of focus for which data quality is to be analyzed. These can be as broad or granular as desired. The difference between the two is that Key Data Domains are the highest level at which data quality is measured, and Data Quality Dimensions are broken out within each Key Data Domain. Information Steward calculates total scores for each and keeps historical scores for analysis over time as well.

In the screenshot below, a scorecard is broken down into Key Data Domains of Distributors, which I've blurred out because they are customer-sensitive information. Data Quality Dimension scores are broken out in each Key Data Domain panel to show each result. Finally, the Quality Trend line at the bottom of each panel shows the score of the Key Data Domain over time.
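The roll-up from rules to Data Quality Dimensions to Key Data Domains can be pictured with a short Python sketch. The unweighted averaging used here is an assumption chosen for simplicity, not a description of how Information Steward calculates its scores, and the domain names, rule names, and score values are made up.

```python
from collections import defaultdict
from statistics import mean


def build_scorecard(rule_scores):
    """Aggregate rule scores into dimension and domain scores.

    rule_scores: iterable of (key_data_domain, dimension, rule_name, score).
    Assumption: each level is the plain mean of the level below it.
    """
    by_dimension = defaultdict(list)
    for domain, dimension, _rule_name, score in rule_scores:
        by_dimension[(domain, dimension)].append(score)

    # Dimension score = mean of the rule scores filed under that dimension.
    dimensions = defaultdict(dict)
    for (domain, dimension), scores in by_dimension.items():
        dimensions[domain][dimension] = mean(scores)

    # Domain score = mean of its dimension scores.
    return {
        domain: {"dimensions": dims, "score": mean(list(dims.values()))}
        for domain, dims in dimensions.items()
    }


if __name__ == "__main__":
    # Hypothetical rule results for one Key Data Domain.
    results = [
        ("Distributor A", "Accuracy", "zip_format", 80.0),
        ("Distributor A", "Accuracy", "state_code", 90.0),
        ("Distributor A", "Completeness", "name_not_null", 100.0),
    ]
    print(build_scorecard(results))
```

Storing the output of each run alongside a run date would give the series of values that a trend line like the one on the scorecard is drawn from.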

Figure 4 Key Data Domains are broken out into separate panels. Quality Dimensions are displayed as well, and an overall Quality Trend for the Key Data Domain appears at the bottom

The next screenshot shows the drill-down screen for the right-most Key Data Domain of the above screenshot. This is accessed by selecting "Show more" in the upper right of the Key Data Domain panel. Notice that Key Data Domains, Data Quality Dimensions, and Validation Rules each appear in a list with a score breakdown. You can select a value from any of these lists to display its Quality Trend in the lower left, as Information Steward keeps historical data for all three levels.

Figure 5 The "Show more..." screen appears with a selection made from the list of Key Data Domains on the left. Data Quality Dimensions and Validation Rules can also be selected on this screen to display the respective data quality trend in the lower left.

One last thing to note is the "View failed data" button that appears in the lower right panel. Clicking this will bring up the rows of data that failed validation rules for the selected Key Data Domain (or Data Quality Dimension or Validation Rule if one of those is selected instead). With this button you can see the raw data records, and which validation rule(s) they failed, directly within the tool. An option to export those records to Excel is also available if further analysis is necessary.

In conclusion, the Data Insight module within SAP Information Steward can be a valuable asset in starting a data governance program or data quality initiative. Its data analysis and data quality monitoring capabilities give you measurable profiling results and rule scores to use in building a case for data governance in your organization. In Part 3 of the series, we will discuss the remaining modules of the SAP Information Steward solution: Metadata Management, Metapedia, and Cleansing Package Builder. In subsequent posts we will discuss some case studies where we use Information Steward in the real world.

Rich Hauser, Senior Business Intelligence Consultant
Decision First Technologies
Richard.Hauser@decisionfirst.com

Rich is a senior business intelligence consultant specializing in Enterprise Information Management. He has delivered customized SAP BusinessObjects solutions for customers of all sizes across a variety of industries. With Decision First Technologies, Rich utilizes SAP Data Services and SAP Information Steward.