Using a Data Warehouse to Audit a Transactional System



Similar documents
SQL Injection. SQL Injection. CSCI 4971 Secure Software Principles. Rensselaer Polytechnic Institute. Spring

Geodatabase Programming with SQL

Data Warehousing. Jens Teubner, TU Dortmund Winter 2014/15. Jens Teubner Data Warehousing Winter 2014/15 1

New Features in MySQL 5.0, 5.1, and Beyond

Data Domain Profiling and Data Masking for Hadoop

Physical Design. Meeting the needs of the users is the gold standard against which we measure our success in creating a database.

Session #25832 March 12, 2008 Alliance 2008 Conference Las Vegas, Nevada

Oracle Database 12c: Introduction to SQL Ed 1.1

Introduction to Triggers using SQL

OLH: Oracle Loader for Hadoop OSCH: Oracle SQL Connector for Hadoop Distributed File System (HDFS)

Vendor: Crystal Decisions Product: Crystal Reports and Crystal Enterprise

D61830GC30. MySQL for Developers. Summary. Introduction. Prerequisites. At Course completion After completing this course, students will be able to:

database abstraction layer database abstraction layers in PHP Lukas Smith BackendMedia

Duration Vendor Audience 5 Days Oracle End Users, Developers, Technical Consultants and Support Staff

ETL Process in Data Warehouse. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

Extraction Transformation Loading ETL Get data out of sources and load into the DW

Product: DQ Order Manager Release Notes

Partitioning under the hood in MySQL 5.5

MySQL for Beginners Ed 3

Oracle SQL. Course Summary. Duration. Objectives

Windows Scheduled Task and PowerShell Scheduled Job Management Pack Guide for Operations Manager 2012

A basic create statement for a simple student table would look like the following.

Resources You can find more resources for Sync & Save at our support site:

Karl Lum Partner, LabKey Software Evolution of Connectivity in LabKey Server

Instant SQL Programming

Websense SQL Queries. David Buyer June 2009

Implementing a Data Warehouse with Microsoft SQL Server 2012 MOC 10777

Programming with SQL

Oracle Database: SQL and PL/SQL Fundamentals

Governance, Risk, and Compliance Controls Suite. Preventive Controls Governor Audit Rules User s Guide. Software Version

Microsoft Access 3: Understanding and Creating Queries

Content Management System (CMS)

Oracle Data Integrator: Administration and Development

Sybase Replication Server 15.6 Real Time Loading into Sybase IQ

Oracle Database 11g SQL

1. To start Installation: To install the reporting tool, copy the entire contents of the zip file to a directory of your choice. Run the exe.

Database Migration from MySQL to RDM Server

Data Archiving - Solutions, Challenges, Considerations & 3rd Party Tools. Putting Customer First

API Endpoint Methods NAME DESCRIPTION GET /api/v4/analytics/dashboard/ POST /api/v4/analytics/dashboard/ GET /api/v4/analytics/dashboard/{id}/ PUT

Technology Foundations. Conan C. Albrecht, Ph.D.

Vendor: Brio Software Product: Brio Performance Suite

Programming Database lectures for mathema

Serious Threat. Targets for Attack. Characterization of Attack. SQL Injection 4/9/2010 COMP On August 17, 2009, the United States Justice

PeopleSoft Development: Overview of Application Engine and the Query Tool. Presented by: Judi Doolittle (Judi Hotsinpller) and Barbara Sandoval

Implementing a Data Warehouse with Microsoft SQL Server

IronKey Enterprise File Audit Admin Guide

Database Design Patterns. Winter Lecture 24

ETL as a Necessity for Business Architectures

Oracle Database: SQL and PL/SQL Fundamentals NEW

Retrieving Data Using the SQL SELECT Statement. Copyright 2006, Oracle. All rights reserved.

SAP Business Objects Business Intelligence platform Document Version: 4.1 Support Package Data Federation Administration Tool Guide

Implementing a Data Warehouse with Microsoft SQL Server MOC 20463

COURSE OUTLINE MOC 20463: IMPLEMENTING A DATA WAREHOUSE WITH MICROSOFT SQL SERVER

Pervasive Data Integrator. Oracle CRM On Demand Connector Guide

Teach Yourself InterBase

SQL Server for developers. murach's TRAINING & REFERENCE. Bryan Syverson. Mike Murach & Associates, Inc. Joel Murach

Optimizing Your Data Warehouse Design for Superior Performance

SQL Injection. Sajjad Pourali CERT of Ferdowsi University of Mashhad

TimesTen Auditing Using ttaudit.java

Chapter 6: Physical Database Design and Performance. Database Development Process. Physical Design Process. Physical Database Design

SQL 2: GETTING INFORMATION INTO A DATABASE. MIS2502 Data Analytics

Fine Grained Auditing In Oracle 10G

CDW DATA QUALITY INITIATIVE

Installation Instruction STATISTICA Enterprise Small Business

SQL Server 2008 Core Skills. Gary Young 2011

Oracle Database: SQL and PL/SQL Fundamentals

Managing Objects with Data Dictionary Views. Copyright 2006, Oracle. All rights reserved.

Extracting META information from Interbase/Firebird SQL (INFORMATION_SCHEMA)

Microsoft SQL connection to Sysmac NJ Quick Start Guide

BIG DATA HANDS-ON WORKSHOP Data Manipulation with Hive and Pig

ETL Overview. Extract, Transform, Load (ETL) Refreshment Workflow. The ETL Process. General ETL issues. MS Integration Services

STATISTICA VERSION 12 STATISTICA ENTERPRISE SMALL BUSINESS INSTALLATION INSTRUCTIONS

PART 1. PeopleSoft Basics

Human Resources (HR) Query Basics

Introduction This document s purpose is to define Microsoft SQL server database design standards.

A Dashboard for the JCC

Advanced BIAR Participant Guide

Implementing a Data Warehouse with Microsoft SQL Server

Basics Series-4004 Database Manager and Import Version 9.0

Oracle Database: Introduction to SQL

Course Outline. Module 1: Introduction to Data Warehousing

ODBC Driver Version 4 Manual

AUTHENTICATION... 2 Step 1:Set up your LDAP server... 2 Step 2: Set up your username... 4 WRITEBACK REPORT... 8 Step 1: Table structures...

SQL. Short introduction

COSC 6397 Big Data Analytics. 2 nd homework assignment Pig and Hive. Edgar Gabriel Spring 2015

Writing a Logical Decoding Plug-In. Christophe Pettus PostgreSQL Experts, Inc. FOSDEM 2015

Introduction to SQL for Data Scientists

Oracle Database: Introduction to SQL

ETL PROCESS IN DATA WAREHOUSE

Search help. More on Office.com: images templates

SQL Simple Queries. Chapter 3.1 V3.0. Napier University Dr Gordon Russell

SAP Business Objects Data Services Setup Guide

How To Create A Table In Sql (Ahem)

Teamstudio USER GUIDE

Transcription:

Using a Data Warehouse to Audit a Transactional System Mike Glasser Office of Institutional Research University of Maryland Baltimore County

Agenda Introduction Why Audit How to Audit The Setup The Audit Audit Results Questions 2

UMBC University of Maryland - Baltimore County Located in suburban Baltimore County, between Baltimore, MD and Washington, DC One of the three public research campuses in the University of Maryland System 10K undergraduate and 2K graduate students 2,200 Employees; 715 full-time faculty 8 Pan-Am chess championships 3

UMBC Data Warehouse Custom HR and Student tables for IR 5 years Oracle Relational tables Purchased istrategy DW for Campus Legacy student > 1 year PS Student Admin live for 1 month SQL Server 2005 15 fact tables 100 custom reporting tables for IR 75 GB in size 3 IR staff + 1.5 IT staff 4

What is an Audit? Need to identify and reconcile the data that was changed yesterday 5

Why Audit in the DW? Audit HR data entry Piecing together 4 pages every day Learn about changes to codes Why in the Data Warehouse? Take pressure off OLTP No custom tables No changes to transactional tables No need to turn on audit 6

Auditing at UMBC 10 tables PS_Job, PS_Citizenship, Tax_Data PS_Acad_Plan_Tbl (majors) Translation table Error Messages Average 825,000 records audited per night Average 1,000+ changes per night Average about 45 seconds for complete audit Max < 4 minutes with 825,000 changes 7

How to Audit Make a copy of yesterday Refresh the table with today Compare yesterday with today Report the differences 8

How We Audit Nightly_YDAY_Tables Copy Source tables to _YDAY Refresh Source tables Nightly_Audit Load_Data_Changes for each table Data_Changes_Staging Parse updates by field Email_Audit_Summary 9

Setup Create YDAY table Create Current view (optional) Index YDAY table Add to Nightly YDAY procedure Add to Nightly Audit procedure 10

YDAY table Make a copy of yesterday Setup is one time copy of the table to be audited Tablename with suffix _YDAY Index on primary key YDAY table populated every night Copy of yesterday s source table Before nightly refresh of source tables Truncated to preserve indexes 11

Setup Create YDAY table Create Current view (optional) Index YDAY table Add to Nightly YDAY procedure Add to Nightly Audit procedure 12

Special Case Effective Dating Transaction tables keep a history Current data is based on effective date New effective dated record inserted when something is updated Want to treat new record as UPDATE 13

Current View Create view (virtual table) of rows currently in effect Populate YDAY table from view Primary key does not include Effdt Compare YDAY to view of rows currently in effect Skip Effdt column during comparison 14

Effective Dating Yesterday Emplid (PK) Effdt (PK) Major 1000 2/3/2008 Mathematics Today Emplid (PK) Effdt (PK) Major 1000 2/3/2008 Mathematics 1000 4/1/2009 English 15

Effective Dating Yesterday Emplid (PK) Effdt Major 1000 2/3/2008 Mathematics Today View Emplid (PK) Effdt Major 1000 4/1/2009 English Emplid 1000 updated major from Math to English The changed EFFDT is ignored during comparison 16

Setup Create YDAY table Create Current view (optional) Index YDAY table Create primary key Add to Nightly YDAY procedure Add to Nightly Audit procedure 17

Nightly YDAY Procedure BEGIN TRANSACTION TRUNCATE TABLE Stage.PS_Citizenship_YDAY INSERT INTO Stage.PS_Citizenship_YDAY SELECT * FROM Source.PS_Citizenship COMMIT TRANSACTION 18

Nightly Audit Procedure SELECT @table = 'PS_CITIZENSHIP' EXECUTE Admin.Load_Data_Changes @p_base_table = 'Stage.PS_Citizenship_YDAY', @p_comp_table = 'Source.PS_Citizenship', @p_details = 'YES', @p_changes_tablename = @table 19

Setup Created YDAY table Created Current view (maybe) Created Primary Key Added to Nightly YDAY procedure Added to Nightly Audit procedure PS_Citizenship table is now going to be audited every night 20

The Audit How do we know what changed? SELECT @table = 'PS_CITIZENSHIP' EXECUTE Admin.Load_Data_Changes @p_base_table = 'Stage.PS_Citizenship_YDAY', @p_comp_table = 'Source.PS_Citizenship', @p_details = 'YES', @p_changes_tablename = @table 21

Delete, Add, Update? Compare Primary Keys Deletes Updates Adds Yesterday Today 22

Compare Yesterday to Today Read database metadata to determine primary key Read metadata to build dynamic SQL Dynamic SQL is built for ADDs and DELETEs Dynamic SQL is built for UPDATEs The dynamic SQL inserts records into staging table Final step parses staging table 23

Sample Table Sample Column Name Key1 Key2 Field1 Field2 Type Varchar(4) Integer Varchar(10) Datetime 24

Sample Data Yesterday Key1 Key2 Field1 Field2 UMBC 1 Update Me 7/8/2009 12:00 AM UMBC 2 Delete Me 6/7/2009 12:00 AM Today Key1 Key2 Field1 Field2 UMBC 1 Updated 5/1/2009 12:00 AM UMBC 3 Add Me 3/4/2009 12:00 AM 25

Staging Audit Data Table Name Action Key Fields Key Values Changes Sample Delete Key1~Key2 UMBC~2 Sample Add Key1~Key2 UMBC~3 Sample Update Key1~Key2 UMBC~1 ~~Field1 ~~Field1=[Update Me^^Updated] ~Field2=[Jul 8 2009 12:00AM^^May 1 2009 12:00AM]~ ~ field delimiter ^^ old/new value delimiter 26

Audit SQL for Add INSERT INTO Stage.DATA_CHANGES (Tablename, Action, Key_Fields, Key_Values, Changes) SELECT 'Sample' tname, 'ADD' action, 'Key1~Key2' kfields, convert(varchar,a.key1) + '~' + convert(varchar,a.key2) kvalues, null FROM mglasser.sample A LEFT OUTER JOIN mglasser.sample_yday B ON A.Key1 = B.Key1 and A.Key2 = B.Key2 WHERE B.Key1 is null 27

Create Dynamic SQL for Add SET @SQL = 'INSERT INTO Stage.DATA_CHANGES (Tablename,Action,Key_Fields,Key_Values,Changes) SELECT ''' + @p_changes_tablename + ''' tname, ' ADD'' action, ''' + @Key_Fields + ''' kfields, ' + @Key_Values + ' kvalues, null ' + @Outer_Join + ' ON ' + @Join_Keys + ' WHERE B.' + substring( @Key_Fields,1,charindex('~',@Key_Fields+'~')-1) + ' is null EXECUTE (@SQL) 28

Primary Key Fields Derived from Primary Key metadata SQL Server Table_Constraints Key_Column_Usage Oracle All_Indexes All_Ind_Columns Tells us the name of the primary key Tells us the fields in the primary key 29

Primary Key SQL SELECT stuff(( select '~' + k1.column_name from information_schema.key_column_usage k1 where k1.constraint_name = c.constraint_name order by ordinal_position for xml path('') ), 1,1,'') FROM INFORMATION_SCHEMA.Table_Constraints c WHERE c.table_schema = @Schema1 AND c.table_name = @Table1 AND c.constraint_type = 'PRIMARY KEY' GROUP BY constraint_name Result for @Key_Fields: Key1~Key2 30

Audit SQL for Add INSERT INTO Stage.DATA_CHANGES (Tablename, Action, Key_Fields, Key_Values, Changes) SELECT 'Sample' tname, 'ADD' action, 'Key1~Key2' kfields, convert(varchar,a.key1) + '~' + convert(varchar,a.key2) kvalues, null FROM mglasser.sample A LEFT OUTER JOIN mglasser.sample_yday B ON A.Key1 = B.Key1 and A.Key2 = B.Key2 WHERE B.Key1 is null 31

Staging Audit Data Table Name Action Key Fields Key Values Changes Sample Delete Key1~Key2 UMBC~2 Sample Add Key1~Key2 UMBC~3 Sample Update Key1~Key2 UMBC~1 ~~Field1 ~~Field1=[Update Me^^Updated] ~Field2=[Jul 8 2009 12:00AM^^May 1 2009 12:00AM]~ ~ field delimiter ^^ old/new value delimiter 32

Audit for Update Join tables by primary key Compare each field in YDAY with same field in today s table String together results of comparison Delimited by ~ If fields are equal, results are empty If not equal, then put fieldname and both values In one SQL insert statement Created dynamically from metadata 33

Comparison Results No updates in Sample data ~~~~ Updates from Sample data ~~Field1~Field2~ Insert into staging table only those records where result has something other than ~ 34

SQL for Comparison SQL Server CASE WHEN YDAY.Field1 = TODAY.Field1 THEN '~' ELSE 'Field1=['+ YDAY.Field1 + '^^' + TODAY.Field1 + ']~' END Oracle DECODE( YDAY.Field1, TODAY.Field1, null, 'Field1=[ YDAY.Field1 '^^' TODAY.Field1 ']~') 35

Create Dynamic SQL for Update SELECT @Changes = stuff( ( select '+ case when isnull(convert(varchar,a.' + c1.column_name + '),'''') = isnull(convert(varchar,b.' + c1.column_name + '),'''') then ''~'' else ''' + c1.column_name + '=[''+isnull(convert(varchar,a.' + c1.column_name + '),'''') + ''^^'' +isnull(convert(varchar,b.' + c1.column_name + '),'''') + '']~'' end ' from information_schema.columns c1 where c1.column_name not in ('LOAD_DTTM','LASTUPDDTTM','DW_LOAD_DTTM', 'UW_LOAD_DTTM','AGE','EFFDT','EFFSEQ') and c1.table_schema = c2.table_schema and c1.table_name = c2.table_name for xml path('') ), 1,1,'') FROM Information_Schema.Columns c2 WHERE table_schema = @Schema1 and table_name = @Table1 GROUP BY c2.table_schema, table_name 36

Staging Audit Data Table Name Action Key Fields Key Values Changes Sample Delete Key1~Key2 UMBC~2 Sample Add Key1~Key2 UMBC~3 Sample Update Key1~Key2 UMBC~1 ~~Field1 ~~Field1=[Update Me^^Updated] ~Field2=[Jul 8 2009 12:00AM^^May 1 2009 12:00AM]~ ~ field delimiter ^^ old/new value delimiter 37

Final Audit Data Table Name Action Key Fields Key Values Sample Delete Key1~Key2 UMBC~2 Sample Update Key1~Key2 UMBC~1 Sample Update Key1~Key2 UMBC~1 Fieldname Old Value New Value DW Load Dttm 4/15/2009 14:30 Field1 Update Me Updated 4/15/2009 14:30 Field2 Jul 8 2009 12:00AM May 1 2009 12:00AM 4/15/2009 14:30 38

Report the Differences The audit is complete The keys for deleted or new records The old and new values for updated fields Notify users of summary results Allow reporting of details 39

Email Outputs Email sent to HR data manager and OIR nightly Records Add Delete Update Table Name ------- --- ------ ------ ---------- 45 1 0 338 Employees_Cur 52 8 0 602 Jobs_Cur 23 19 1 3 PS_CITIZENSHIP 5 1 0 20 PS_FED_TAX_DATA 53 8 0 461 PS_JOB 19 15 0 8 PS_PERS_DATA_EFFDT 5 1 0 7 PS_STATE_TAX_DATA 12 8 0 6 PS_UM_JOB_INFO Fields 40

Crystal Report 41

Potential Alternatives Database logs Utility program from vendors that read the logs the database keeps Not available choice for us Costs money (free ones?) Output may not be flexible? Smaller views Only audits the fields in the view Can control for specific records Smaller YDAY tables Faster performance 42

Poor Alternatives Separate program for each table Longer development Potentially huge SQL Maintenance for table changes Separate dynamic SQL for each field Horrible performance Reads the tables once for each field 43

Alternative Uses Debugging & development Table structure changes ALL_TAB_COLS Information_Schema.Columns Deltas for data warehouse loads Monitor or analyze database actions over time??? 44

Wrap Up Any Questions? Mike Glasser University of Maryland - Baltimore County mglasser@umbc.edu (410) 455-3577 Source code is available upon request 45