ETL Team Development Standards - DRAFT




Table of Contents

ETL Development Checklist
    Technical Specification Development Guidelines
    Informatica Naming Conventions
    Informatica Version Control Guidelines
    Documentation
    Miscellaneous Guidelines
    Optimization/Security Considerations
    Change Management Considerations
Appendix A: Informatica Administration and Maintenance
    Informatica Server Reboot Procedures
Appendix B: BAIRS Outages
Appendix C: Issues Log
Appendix D: Month-End Procedures
Appendix E: Archiving Procedures
Appendix F: Student Data Warehouse Support
    Access Request Process
Appendix G - Gaelen Standards
Appendix H - Peoplesoft Table Considerations
Appendix I - Environment Objectives

ETL Development Checklist

- Develop and publish technical specifications based on functional requirements.
- Create Informatica mapping specifications from the ETL template.
- Publish to Sharepoint ETL Home Shared Documents. Create a new folder as necessary.
- Develop the new Informatica map in the development folder.
- For revisions to existing Informatica mappings, copy mappings from the info_prod repository to the info_dev repository within the Designer tool. Change the version number per the version control guidelines below.
- For new Informatica mappings, create the mapping in the info_dev repository using the Designer tool, per the Informatica naming conventions.
- Update the "Informatica Maps in Development" list on the ETL Sharepoint site, or drag a separate copy of the map into the "In Development" folder.
- Maps may be developed in personal folders within the info_dev repository, but should be migrated to the dev copy of the relevant production folder after initial development is complete and before QA testing begins.
- Review the design with teammates.

Technical Specification Development Guidelines

To develop technical specifications for a project, use the "PROJECT NAME Data Requirements" template under Data Warehouse Groups > Deliverable Templates:
https://calshare.berkeley.edu/sites/rapo/edw/reporting/deliverable%20templates/forms/allitems.aspx

Informatica Naming Conventions

Map Naming Conventions

Within Informatica Designer, maps should be named using the following template:

    Area_TargetName_Qualifier_Action_vX_Y

where:

Area is one of
    stg - Staging
    dw - EDW fact/dimension tables

TargetName is the final target table name, all in upper case.

Action is one of
    del - Delete
    ins - Insert
    updt - Update
    scd - Slowly Changing Dimensions
    copy - Copy (no transformation logic between sources and targets; mainly used for source-to-stage copies, creation of test data, and ad hoc data movement)

Qualifier is a description of the functionality of the mapping. This only needs to be added if multiple mappings use the same target table.

v stands for version.

X is the major version number. It is initially set to 1 when a map is first created and is incremented by one for each subsequent major change to that mapping. Major changes involve fundamental changes to a map design, e.g. new sources, transformations and/or targets, or replacing or significantly augmenting existing functionality. For minor mapping revisions, the major version number remains constant.

Y is the minor version number. It is initially set to 0 when a map is first created and is incremented by one for each minor change to a given mapping (e.g. a change to a Filter transformation condition or a change to derived values within an Expression transformation). When the major version number (X above) is incremented, the minor version number is reset to 0.

Session Naming Conventions

Within Informatica Workflow Manager, sessions should be named using the following template:

    s_MappingName_Qualifier

where:

s stands for session.

MappingName is the name of the Informatica mapping associated with a given session.

Qualifier is a description of the functionality of the session. This only needs to be added if the mapping is associated with multiple sessions.

Workflow Naming Conventions

Within Informatica Workflow Manager, workflows should be named using the following template:

    wf_WorkflowName_Frequency

where:

wf stands for workflow.

WorkflowName is a description of the functionality contained within the workflow, e.g. HR_ADM_WKFORCE.

Frequency is how often the workflow runs, e.g. Monthly, Daily, Weekly. Daily can be used for workflows which run Monday through Saturday or Monday through Friday.

Informatica Version Control Guidelines

Version control in Informatica is managed as follows:

- Maps and associated sessions should share the same version number.
- See the naming conventions above for details on how version numbers should be maintained for mappings and sessions.

Documentation

Documentation should be posted on the appropriate BearShare site. BearShare is backed up nightly, with 2-hour snapshots taken during the day. More information is available at:
https://bearshare.berkeley.edu/c4/implementing%20bearshare/default.aspx
Michael Leefers is the contact for BearShare questions.

Miscellaneous Guidelines

- Source to Stage Mapping Guidelines
- Update strategy: maintenance of code values not matching in the source system.
- Workflows should be updated in the info_dev repository to match production before new/revised maps, sessions and/or workflows are moved to production.
- Include considerations of shortcut folder management when existing maps are to be modified.
- All documents on this list should be updated as part of any new development work.
- Report Inventories: https://bearshare.berkeley.edu/sites/rapo/edw/reporting/reports/shared%20documents/forms/allitems.aspx
  Contains three documents with report inventories for BAIRS, BIS and HR. It would be useful to augment these spreadsheets by adding the underlying tables associated with the listed views.

Optimization/Security Considerations

Developer roles (ADM_RO and ADM_HR_RO) should always be given read access to new database objects.

Access privileges may prevent developers from being able to view data contained in database views. There are two ways to deal with this:

1. Update the HRMS_OPR_XREF table (in QA) to allow access. Example:

       UPDATE BAIR_HRMS_OPR_XREF
       SET userid = 'BISWJC'
       WHERE oprid = '011502567';   (or '011504738')

2. Apply for security access through SARA (HRMS Dept. Security, Administer Workforce).

Change Management Considerations

Resources:

- ASD_EDW_change_process_notification_flow, under Data Warehouse Groups Shared Documents (https://bearshare.berkeley.edu/sites/rapo/edw/reporting/shared%20documents/forms/allitems.aspx).

To request mainframe production changes:

- Send 2 emails, one to each of:
    asdhelp@berkeley.edu
    ist-as-production@lists.berkeley.edu
- Enter the objects to be moved into TSO MIGMGR. Describe what needs to be moved and when (e.g. move members xxx from EDW.PUB.STAGE.INCLIB to ASD.P.CTM.BIS.AEVARS).

Report Migrations

- Contact bairpthelp@berkeley.edu for report migration requests.
- Non-standard between 5-6 PM, 8-9 AM.
- Report users/ESS staff are to review before general access is allowed.

Appendix A: Informatica Administration and Maintenance

Informatica Server Reboot Procedures

To bring down the Informatica server:

1. Confirm that no jobs are running.
2. Ensure all developers have saved their work and closed desktop clients.
3. In Server Manager, right-click on the modoc_712 icon and choose "Shut down server".
4. Log onto modoc as informat and check for remaining processes:

       ps -ef | grep pm

5. Open the Admin console and connect to modoc.
6. Repositories > info_prod > stop.
7. Modoc shutdown (repository server).

To bring up the Informatica server:

From Unix as informat on modoc:

    cd /apps/informatica7.1.2/repository_server
    pmrepserver pmrepserver.cfg
    cd ../server
    ./pmserver pmserver_prod.cfg

From Unix as informat on tehama:

    cd /apps/informatica7.1.2/
    pmrepserver pmrepserver.cfg
    cd server
    ./pmserver pmserver_prod.cfg

Appendix B: BAIRS Outages

If there's a problem loading EDW:

- Ask Quin/oracledoctor to put the database in restricted mode so users cannot run reports.
- Post a message in the report portal as soon as it is available.
- Send listserv notices per the doc in the Support folder.
- Ask Michael to make specific reports or folders unavailable to users so we can allow other users to run reports until the fin/pb load is complete.
- When loads have completed, tell Quin/the DBAs: please reset access so users can run reports (take the ucbdw1p database out of restricted access mode).
- Add servicedesk@berkeley.edu to any BAIRS outage notifications so Kevin Haney can post it on the CCS Status page.

Appendix C: Issues Log

All production issues should be entered in the Sharepoint issues log at
https://bearshare.berkeley.edu/sites/rapo/edw/reporting/lists/dw%20production%20issues/allitems.aspx

Issues include:

- Reconciliation problems
- Informatica production map failures
- Etc.

Appendix D: Month-End Procedures

Duplicate rows in Fact_JD_Open were inserted on 10/21, the day after we processed Foundation JD data for Period 3 close. This happened because our current daily process loads two days' worth of Fact JD data into the Open table. Next time we can avoid this by waiting an extra day before starting the monthly process for Foundation JD data.

Appendix E: Archiving Procedures

EDW data is backed up to tape on the following schedule:

- Nightly backups, retained for 30 days
- Weekly backups, retained for 90 days
- Monthly backups, retained for 1 year

Backups are not encrypted. Tapes are stored by Iron Mountain.

See X:\RAPO\EDW\CCS-EDW\Support\Financials\ProductionSupport\AP PO Archive procedure.doc for details on AP/PO archiving procedures.

Appendix F: Student Data Warehouse Support

Access Request Process

1. Requests for access to the pilot SDW are approved through Dennis or designate.
2. Approval is forwarded to the Oracle DBAs.
3. The DBAs verify that the allotted number of users (24) is not exceeded. If the user count is greater than 24, Dennis or designate is asked to name a user whose access will be removed.
4. The DBAs grant/remove the appropriate access and send notification to Dennis.
5. Dennis or designate notifies the user.

Appendix G - Gaelen Standards

Worksheet "ALL" contains all tables, columns and words from HRMSDIM, BAIRDIM and BAIRFACT, broken out as follows. For example:

    Owner = BAIRDIM
    Table = ACCOUNT_TREE

    Column = ACCOUNT_CODE
    Abbreviation = ACCOUNT
    Full English Name = ACCOUNT

Worksheet "Glossary" contains all the distinct abbreviations and their full English names. The ones highlighted in red are outstanding questions that I would like to go over.

Worksheet "Class words" contains a set of class words I am proposing to use here at Berkeley. All columns would end with a class word.

Worksheet "Questions" lists a number of abbreviations for which I do not know the full English name. If you could fill out as much as you know and send it back, I would appreciate it.

Appendix H - Peoplesoft Table Considerations

The Translate table in PeopleSoft will be extracted into the staging area with a Current flag, DW_FEFF_DT and DW_LEFF_DT, then moved to the ODS/DW, changing target names where necessary. Views will be set up for the codes as the need arises: one view for the current code set and another view for the active code set.

Appendix I - Environment Objectives

Development
- Shared environment for DBAs, ETL and Report Developers.
- Not for data validation.
- Not intended for functional users.

QA
- Data validation.
- Performance testing.
- Non-production data can be loaded for test cases and then refreshed with production data (must be coordinated).

Production
- Ready for general user access.
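
Illustration: naming and version rules

The map, session and workflow templates under Informatica Naming Conventions can be sketched in a few lines of Python. This is only an illustration of the rules as written, not a team tool: the class and function names (MapName, bump_minor, bump_major, session_name, workflow_name) are hypothetical, and FACT_JD_OPEN is used purely as a sample target name.

```python
from dataclasses import dataclass, replace

# Allowed values taken from the Map Naming Conventions section.
AREAS = {"stg", "dw"}
ACTIONS = {"del", "ins", "updt", "scd", "copy"}


@dataclass(frozen=True)
class MapName:
    """Hypothetical helper for Area_TargetName_Qualifier_Action_vX_Y."""
    area: str
    target: str
    action: str
    major: int = 1      # X starts at 1 for a new map
    minor: int = 0      # Y starts at 0 for a new map
    qualifier: str = ""  # only needed when several maps share a target

    def __post_init__(self):
        if self.area not in AREAS:
            raise ValueError(f"unknown area: {self.area}")
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action}")
        if self.target != self.target.upper():
            raise ValueError("target table name must be upper case")

    def __str__(self):
        parts = [self.area, self.target]
        if self.qualifier:
            parts.append(self.qualifier)
        parts += [self.action, f"v{self.major}_{self.minor}"]
        return "_".join(parts)

    def bump_minor(self):
        """Minor change: increment Y, keep X."""
        return replace(self, minor=self.minor + 1)

    def bump_major(self):
        """Major change: increment X, reset Y to 0."""
        return replace(self, major=self.major + 1, minor=0)


def session_name(mapping, qualifier=""):
    """s_MappingName_Qualifier; the session carries the map's version."""
    return "_".join(p for p in ("s", str(mapping), qualifier) if p)


def workflow_name(name, frequency):
    """wf_WorkflowName_Frequency."""
    return f"wf_{name}_{frequency}"


m = MapName("dw", "FACT_JD_OPEN", "ins")
print(m)                                   # new map
print(m.bump_minor())                      # after a minor change
print(m.bump_minor().bump_major())         # after a later major change
print(session_name(m))
print(workflow_name("HR_ADM_WKFORCE", "Monthly"))
```

Because the session name is derived from the full map name, a map and its session share the same version number automatically, which is the one rule stated under Informatica Version Control Guidelines.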