DATA MANAGEMENT USER GROUP

DATA MANAGEMENT USER GROUP, MANCHESTER, 9TH MARCH 2016

AGENDA: SAS DATA MANAGEMENT USER GROUP
9th March 2016, SAS Manchester

Time            Topic                   Speaker
9-9.30am        Coffee and networking
9.30-9.35am     Introductions           All
9.35-10.15am    SAS Data Integration    Janice Newell
10.15-11.00am   SAS Data Quality        Rajeeve Narula
11.00am         Coffee Break
11.20-11.40am   SAS Data Governance     Dave Smith
11.40-12.20pm   SAS Data Federation     Dave Smith
12.20-12.30pm   Wrap up                 Sophie Ainley
12.30-2pm       Lunch and networking    All

THE DATA PROCESS: A DAY IN THE LIFE
[Diagram]
- Stages: Data Access, Data Assessment, Data Cleansing, Data Monitoring, Data Transformation
- Tools shown against the stages: SAS DI, DM Studio, Fed Server
- Data Governance: Lineage, BDN

DATA INTEGRATION

DATA INTEGRATION
[Diagram]
- Sources: Transactional Systems, External Data Feeds, Spreadsheets
- Steps: Reshape, Adjust Time Dim, Standardise, Clean, Analyse
- Outputs: Decision Making, Regulatory Data, Reporting Requirements
- How did you get there?
Copyright 2013, SAS Institute Inc. All rights reserved.

SAS DATA INTEGRATION SERVER
- Generates code through GUI
  - SAS Code
  - Database Code
  - Library of transformations and functions
- Manages Metadata
  - Records steps taken to build output data
  - Able to trace any table or column forwards and backwards through the process
- Much more efficient than hand coding
  - Usually 50% faster through clarity, reusability, coding speed
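Tracing a table or column forwards and backwards, as described above, amounts to walking a directed graph of dependencies in either direction. The Python sketch below is illustrative only: the column names and the `build_graphs` and `trace` helpers are hypothetical, and it does not use or represent any SAS metadata API.

```python
from collections import defaultdict, deque

# Hypothetical column-level lineage: each edge reads "source feeds target".
EDGES = [
    ("crm.customer.postcode", "stage.customer.postcode_std"),
    ("stage.customer.postcode_std", "mart.customer_dim.region"),
    ("feeds.regions.region_name", "mart.customer_dim.region"),
    ("mart.customer_dim.region", "report.sales_by_region.region"),
]

def build_graphs(edges):
    """Return (downstream, upstream) adjacency maps for the edge list."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream

def trace(start, graph):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

DOWNSTREAM, UPSTREAM = build_graphs(EDGES)

# Forward trace (impact analysis): what is affected if this column changes?
impact = trace("crm.customer.postcode", DOWNSTREAM)
# Backward trace (lineage): where did this report column come from?
origin = trace("report.sales_by_region.region", UPSTREAM)
```

Keeping both adjacency maps means the same `trace` function answers the impact question and the origin question without any extra logic.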

ANALYTICAL PROCESS DATA REQUIREMENT
[Diagram; labels in reading order]
- Source Data, Analysis Ready Data, Dashboards, Deployed Models
- Stakeholders, Productionised Process
- Analysts: Source Data, Analysis Ready Data, Visualisation, Modelling
- Data Managers: Granular Security, Business Repeatability, Assured Quality, IT Support, Governance and Clarity

ANALYTICAL DATA INTEGRATION
- Preparing data for analysis
  - Can pass data mining metadata to Enterprise Miner (Target etc.)
- Embedding analytical procedures
  - Summarisation, esp. medians (on all data)
  - Time series preparation
  - Multi-row data operations
  - Creating correlation indexes
- Scoring Data
  - Rapid model deployment
  - Including managing in-database scoring
  - Model monitoring data creation

DI DEMO

DEMO: DAY IN THE LIFE OF DI ANALYST
- SAS data
- External files
- Join tables
- Create new fields
- Map data
- Check errors
- Control order
- Tables
- Fields
- Impact analysis
- Define data
- Test job

ANALYTICAL DATA MANAGEMENT DETAIL
What SAS DMA provides, each item followed by the benefit listed alongside it:
- Development framework to create SAS job flows, including a documentation framework
  Clear evidence of process and development ownership
- Inbuilt versioning framework
  Historical traceability of DI changes
- Inbuilt custom transformation framework to provide re-use of complex processing
  Simplify DI flows and speed up development
- Metadata impact analysis and search facility
  Importance/usage of data items within a SAS data flow
- Deployment of SAS data flows to a scheduling tool
  Integration with production processes
- SAS analytical modelling code integration (uses Enterprise Miner)
  Support predictive model factory concept
- Data quality integration framework: bring in DQ processing (uses DMA)
  Ensure trust in results, build defensive process controls
- Data governance framework: share lineage through a browser (DMA 9.4M2)
  Build business driven definitions of data items

DATA QUALITY

THE DATA PROCESS: A DAY IN THE LIFE
[Diagram]
- Stages (each in DM Studio): Data Access, Data Assessment, Data Cleansing, Data Monitoring
- Activities: Data Connection, Profiling, Standardise, Business Rules, Data Job, Dashboard

DATA GOVERNANCE

WHY GOVERN DATA?
- Regulation
- Risk
- Efficiency
- Opportunity
Financial organisation

CHALLENGE: MAP EVERYTHING
- Customers are under increasing pressure to be able to link data in disparate systems at a logical level, that is, to show how metadata is connected.
- At the same time, with the advent of Big Data systems and the concept of the Data Lake, it is ever more important, from a practical, user-driven point of view, to have a system that tells data users where the data resides.
- Business Term? Data Item? Where in the Lake is my data?

[Diagram] ETL, Data Quality, Database, Physical Data Model, Logical Data Model, Database, Database, Mainframe Data via COBOL, Analytics and Reporting

TYPICAL REQUIREMENTS
Business Glossary
- Search / Discover data & metadata
- Business terms & technical data attributes
- Ownership, Structure, Usage Context
- Secured, governed, and workflow enabled
Data Monitoring
- Users specify data quality controls/checks
- Proactively monitor data
- Validate data
- Enforce policy
- Alerts
Metadata Lineage
- Automated collection of metadata
- Services to manage the metadata
- Maintain relationships
- Provide context
- Metadata analysis
Also shown: Consensus, Collaboration, Transparency
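The Data Monitoring requirements above (user-specified checks, validation, alerts) can be pictured as a set of named rules applied to every record, with each failure raised as an alert. The following Python sketch is a generic illustration under that assumption; the `Rule` class, the rule names, and the sample records are all hypothetical and do not reflect how any SAS product defines business rules.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    check: Callable[[dict], bool]  # returns True when the record passes

# Hypothetical rules a data steward might specify.
RULES = [
    Rule("customer_id present", lambda r: bool(r.get("customer_id"))),
    Rule("balance not negative", lambda r: r.get("balance", 0) >= 0),
]

def monitor(records, rules):
    """Validate each record against each rule and collect alerts for failures."""
    alerts = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                alerts.append({"row": index, "rule": rule.name})
    return alerts

RECORDS = [
    {"customer_id": "C001", "balance": 120.0},
    {"customer_id": "", "balance": 35.5},
    {"customer_id": "C003", "balance": -10.0},
]

alerts = monitor(RECORDS, RULES)
```

Because rules are plain data, new checks can be added by business users without changing the monitoring loop itself.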

METADATA LINEAGE: SAS RELATIONSHIP SERVICE AND LINEAGE VIEWER

SAS RELATIONSHIP (LINEAGE) REPOSITORY
One repository to store metadata from multiple environments.
[Diagram: SAS Relationship Repository]

A ROBUST SET OF SERVICES TO MANAGE THE REPOSITORY
- Automated metadata collection
- Easy to access
- Many ways to analyze
[Diagram] Metabridge Loader, Relationship Loader, Lineage Viewer, Relationship Reporter, REST Services, Relationship Repository

RESULT: DATASTAGE ETL JOB

CLEAR GOVERNANCE AND OWNERSHIP: SAS BUSINESS DATA NETWORK

DATA GOVERNANCE: PEOPLE, PROCESS, TECHNOLOGY
[Diagram] Business User, Business Term, Technical User, Rule, Business Data Network, Data Stewards, Alerts

DEMO

SAS FEDERATION SERVER

DATA FEDERATION: WHAT IS IT?
[Diagram] Source 1, Source 2 and Source 3 feed the Federation Server, which serves Applications.
Federation Server features:
- Web Administration
- Logging
- Federated Views
- Row and Column Access Control
- Caching views
- Scheduling
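To make the idea of a federated view with row and column access control concrete, the Python sketch below combines two in-memory sources at query time and applies a per-user policy before returning anything. It is a conceptual illustration only: the sources, user names, and policy structure are hypothetical and are not SAS Federation Server syntax.

```python
# Hypothetical rows from two sources, combined into one federated view.
SOURCE_1 = [{"customer": "A", "region": "North", "salary": 30000}]
SOURCE_2 = [{"customer": "B", "region": "South", "salary": 42000}]

# Hypothetical per-user policy: which columns are visible, which rows pass.
POLICIES = {
    "analyst_north": {
        "columns": {"customer", "region"},
        "row_filter": lambda row: row["region"] == "North",
    },
    "auditor": {
        "columns": {"customer", "region", "salary"},
        "row_filter": lambda row: True,
    },
}

def federated_view():
    """Union the sources at query time; nothing is copied to a new store."""
    yield from SOURCE_1
    yield from SOURCE_2

def query(user):
    """Apply the user's row filter, then project only the permitted columns."""
    policy = POLICIES[user]
    return [
        {col: val for col, val in row.items() if col in policy["columns"]}
        for row in federated_view()
        if policy["row_filter"](row)
    ]

north_rows = query("analyst_north")
audit_rows = query("auditor")
```

The point of the sketch is that applications query one view, and what each user sees is decided centrally rather than in each source system.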

DATA FEDERATION: ENABLING COLLABORATION
[Diagram] Organisation 1, Organisation 2, Organisation 3; Secure data filter; Collaboration Environment

DATA FEDERATION: MASKING SENSITIVE INFORMATION
[Diagram] Data Lake; Data Masking; Analysis Environment

DATA FEDERATION: AUDITING ANALYTICAL USAGE
[Diagram] EDW; Logging; Analysis Environment; Notifiable Queries; Workflow

SAS FEDERATION SERVER: WHAT'S NEW IN 4.2?

WHAT'S NEW? SUMMARY
- SAS Metadata Server and Web Infrastructure Platform (WIP) integration
- SAS Metadata Server replaces DataFlux Authentication Server for authentication and persistence of users, groups, logins (for example, personal, group, and shared) and domains

WHAT'S NEW? SUMMARY
- Read/Write access to Hadoop (HIVE) using the new SAS Federation Server Driver for Apache Hive
- Access to SAS data sets secured with metadata-bound libraries
- Access to shared data sources across multiple SAS Federation Servers using a new Federation Server Driver
- Enhanced data masking and encryption support

WHAT'S NEW? SUMMARY
- Embedded data quality and cleansing functions in data views
- Support for SAS DS2
- Cache enhancements that include in-memory data cache
- A new migration guide is available for SAS Federation Server 4.2
- Proc ASExport

USES SAS METADATA SERVER
[Screenshot callout] This refresh icon can be used to show newly created Authentication Domains
- SAS Metadata Server replaces Authentication Server for authentication and other permission-based functions
- SAS Metadata Server provides access for user and group objects and other permission-based functions such as shared logins and trusted users

ENHANCED DATA MASKING
- Enhanced data masking and encryption support
- New data masking features include TRANC, which transliterates characters from the input string to characters in the output string (to transliterate: to change letters, words, etc. into corresponding characters of another alphabet or language)
- A series of random data masking rules is also available
[Screenshot] The current set of available Data Masking Functions

ENHANCED DATA MASKING
- TRANC, which transliterates characters
- RANDOM rules are also available
[Screenshots] The current set of available Data Masking Functions; example of masking a numeric column; example of masking a character column
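The two masking ideas on these slides, character transliteration and random rules, can be sketched generically in Python. This is not the TRANC function or the RANDOM rules themselves, whose syntax is not shown in the slides; the substitution alphabet, the `mask_characters` and `mask_number` helpers, and the sample values are all assumptions made for illustration.

```python
import random
import string

# Hypothetical substitution alphabet: each input letter maps to one output letter.
PLAIN = string.ascii_lowercase
CIPHER = "qwertyuiopasdfghjklzxcvbnm"
TABLE = str.maketrans(PLAIN + PLAIN.upper(), CIPHER + CIPHER.upper())

def mask_characters(value: str) -> str:
    """Transliterate letters one-for-one; digits and punctuation pass through."""
    return value.translate(TABLE)

def mask_number(value: float, spread: float, rng: random.Random) -> float:
    """Replace a number with a random value within +/- spread of the original."""
    return value + rng.uniform(-spread, spread)

rng = random.Random(2016)  # seeded only so the sketch is repeatable
masked_name = mask_characters("John Smith")   # character column
masked_salary = mask_number(30000.0, 500.0, rng)  # numeric column
```

Transliteration keeps the length and shape of the value, which helps downstream analysis, while the random rule keeps a numeric value in a plausible range without exposing the original.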

CACHE ENHANCEMENTS
- Cache enhancements that include cache refresh for data held in memory
- We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE
- Federation Server now has the capability of refreshing cached data, including MDS, after a server restart
- In previous releases, cached data that was held in memory was deleted if the server was restarted or shut down

FED. SERVER 4.2: CACHE ENHANCEMENTS
- After a Fed. Server restart, the views are re-run in the background
- We can now cache queries to the MDS (Memory Data Store) = FAST PERFORMANCE
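The restart behaviour described above (cached views re-run in the background so the in-memory copy is rebuilt) can be pictured with a small Python sketch. It is a conceptual model only: the `ViewCache` class and the sample view are hypothetical and say nothing about how Federation Server implements its cache or the MDS.

```python
import threading

class ViewCache:
    """In-memory results keyed by view name, rebuilt from stored definitions."""

    def __init__(self, definitions):
        # definitions: view name -> zero-argument callable that runs the query.
        # In this sketch they stand in for whatever survives a restart.
        self.definitions = definitions
        self.data = {}
        self._lock = threading.Lock()

    def refresh(self, name):
        """Run one view and store its result in memory."""
        result = self.definitions[name]()
        with self._lock:
            self.data[name] = result

    def refresh_all_in_background(self):
        """Start one worker thread that re-runs every cached view."""
        worker = threading.Thread(
            target=lambda: [self.refresh(n) for n in self.definitions]
        )
        worker.start()
        return worker

    def get(self, name):
        """Serve from memory when cached, otherwise run the view directly."""
        with self._lock:
            if name in self.data:
                return self.data[name]
        return self.definitions[name]()

# Simulated restart: a fresh cache starts empty, then repopulates itself.
cache = ViewCache({"sales_by_region": lambda: [("North", 10), ("South", 7)]})
worker = cache.refresh_all_in_background()
worker.join()
cached_rows = cache.get("sales_by_region")
```

Queries that arrive before the background refresh finishes still get an answer in this sketch, because `get` falls back to running the view directly.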

FED. SERVER 4.2: EMBEDDED DATA QUALITY
Data Quality: where the DQ functions live
- Parsing: Mr. Roy G Biv Jr
- Extraction: Blue mens long-sleeved buttondown collar denim shirt
- Pattern Analysis: 999-999-9999
- Identification Analysis: John Smith = Name / SAS = Organization
- Gender Analysis: Jane Smith = F, Sam Adams = M
- Standardization, Casing: 919.6778000 = (919) 677-8000
- Matching: John Smith / J. Smith / Mr. Jon Smith
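Two of the functions listed above, pattern analysis and standardization, are simple enough to illustrate in a few lines of Python using the slide's own phone number examples. This is a sketch of the concepts, not of the SAS QKB definitions: the `pattern_of` and `standardize_phone` helpers are hypothetical, and representing letters as `A` is an assumption (the slide only shows digits as `9`).

```python
import re

def pattern_of(value: str) -> str:
    """Pattern analysis: digits become 9, letters become A, the rest is kept."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def standardize_phone(value: str) -> str:
    """Standardization: reformat any 10-digit number as (NNN) NNN-NNNN."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 10:
        return value  # leave anything unexpected untouched
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"

phone_pattern = pattern_of("919-677-8000")     # "999-999-9999"
phone_std = standardize_phone("919.6778000")   # "(919) 677-8000"
```

Profiling a column by pattern first shows which formats actually occur, which in turn tells you which standardization rules are worth writing.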

EMBEDDED DATA QUALITY
- Embedded data quality and cleansing functions in data views
- Implemented using SAS Quality Knowledge Base (QKB) with FedSQL and DS2
- The data quality methods use data quality rules from the SAS QKB in order to cleanse data
[Screenshot callout] The standardized primary_state_code

FED. SERVER 4.2: EMBEDDED DATA QUALITY

DATA STEP 2 (DS2) LANGUAGE
- SAS Federation Server now supports the DATA Step 2 (DS2) language. DS2 includes:
  - additional data types
  - ANSI SQL types
  - programming structure elements
  - user-defined methods and packages
- To invoke DS2, you must configure a DSN that uses the DS2 dialect
- Processing gets automatically pushed down:
  - if Code Accelerator is present in the corresponding data platform
  - if DS2 code conforms to a pushable format (e.g. threads defined, etc.)

FED. SERVER 4.2: DATA STEP 2 (DS2) LANGUAGE
- Actually, our DQ functions are DS2 methods invoked from SQL
- Customers can write any DS2 code with if/then/else logic, iterating over column data and producing programmatic results
- This integrates nicely with SQL and is a very useful way to use DS2
[Screenshot callout] DS2 equivalent

READ/WRITE ACCESS TO HADOOP (HIVE)
- Read/Write access to Hadoop (HIVE) using the SAS Federation Server Driver for Apache Hive
The Driver for Hive:
- uses FedSQL and also provides limited support for HiveQL
- supports multiple versions of Hadoop
- can use Kerberos
- does not support Write operations such as insert, update, and delete

FED. SERVER 4.2: READ/WRITE ACCESS TO HADOOP (HIVE)
[Screenshots]
- Access Hadoop using SAS Studio to Federation Server
- Create a table in Hadoop
- The configuration of the Hadoop Data Service using the native Apache HIVE driver

QUESTIONS?

CUSTOMER LOYALTY: UK USER GROUPS
To register: www.sas.com/uk/usergroups

USEFUL INFORMATION

THANK YOU FOR YOUR TIME