CDCR EA Data Warehouse / Strategy Overview. February 12, 2010



Similar documents
US Department of Education Federal Student Aid Integration Leadership Support Contractor January 25, 2007

Whitepaper Data Governance Roadmap for IT Executives Valeh Nazemoff

Applied Business Intelligence. Iakovos Motakis, Ph.D. Director, DW & Decision Support Systems Intrasoft SA

The Role of the BI Competency Center in Maximizing Organizational Performance

Introduction to Business Intelligence

How To Develop An Enterprise Architecture

Lection 3-4 WAREHOUSING

Chapter 6 Basics of Data Integration. Fundamentals of Business Analytics RN Prasad and Seema Acharya

Enterprise Information Management and Business Intelligence Initiatives at the Federal Reserve. XXXIV Meeting on Central Bank Systematization

The New Jersey Enterprise Data Warehouse. State of New Jersey

Building for the future

Structure of the presentation

APPROACH TO EIM. Bonnie O Neil, Gambro-BCT Mike Fleckenstein, PPC

Establish and maintain Center of Excellence (CoE) around Data Architecture

Enterprise Information Management Capability Maturity Survey for Higher Education Institutions

Moving Large Data at a Blinding Speed for Critical Business Intelligence. A competitive advantage

Data Governance 8 Steps to Success

Foundations of Business Intelligence: Databases and Information Management

Emerging Technologies Shaping the Future of Data Warehouses & Business Intelligence

By Makesh Kannaiyan 8/27/2011 1

Outline Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications

Paper DM10 SAS & Clinical Data Repository Karthikeyan Chidambaram

Introduction to Datawarehousing

BENEFITS OF AUTOMATING DATA WAREHOUSING

MDM and Data Warehousing Complement Each Other

Software Defined Hybrid IT. Execute your 2020 plan

Business Intelligence Project Management 101

TRENDS IN THE DEVELOPMENT OF BUSINESS INTELLIGENCE SYSTEMS

Bussiness Intelligence and Data Warehouse. Tomas Bartos CIS 764, Kansas State University

California Enterprise Architecture Framework

Building a Comprehensive Strategy for Enterprise Data Management An Executive Overview

Foundations of Business Intelligence: Databases and Information Management

Introduction to the BI Architecture Framework and Methods

Introduction to Oracle Business Intelligence Standard Edition One. Mike Donohue Senior Manager, Product Management Oracle Business Intelligence

Technical Management Strategic Capabilities Statement. Business Solutions for the Future

Practical Considerations for Real-Time Business Intelligence. Donovan Schneider Yahoo! September 11, 2006

The HP Neoview data warehousing platform for business intelligence Die clevere Alternative

EAI vs. ETL: Drawing Boundaries for Data Integration

Enterprise Data Quality

Knowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success

Value to the Mission. FEA Practice Guidance. Federal Enterprise Architecture Program Management Office, OMB

Business Intelligence (BI) Data Store Project Discussion / Draft Outline for Requirements Document

Business Intelligence Maturity Model. Wayne Eckerson Director of Research The Data Warehousing Institute

Data Management Roadmap

Human Services Decision Support System Data and Knowledge Management

Master Data Management. Zahra Mansoori

Before getting started, we need to make sure we. Business Intelligence Project Management 101: Managing BI Projects Within the PMI Process Group

INFORMATION TECHNOLOGY STANDARD

Enterprise Data Governance

MDM for the Enterprise: Complementing and extending your Active Data Warehousing strategy. Satish Krishnaswamy VP MDM Solutions - Teradata

IST722 Data Warehousing

Business Intelligence. A Presentation of the Current Lead Solutions and a Comparative Analysis of the Main Providers

Course DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

Using SAP Master Data Technologies to Enable Key Business Capabilities in Johnson & Johnson Consumer

Management Update: The Cornerstones of Business Intelligence Excellence

Custom Consulting Services Catalog

Analance Data Integration Technical Whitepaper

University of Michigan Medical School Data Governance Council Charter

UC Berkeley Data Warehouse Roadmap. Data Warehouse Architecture

MICHIGAN AUDIT REPORT OFFICE OF THE AUDITOR GENERAL THOMAS H. MCTAVISH, C.P.A. AUDITOR GENERAL

Service Oriented Data Management

Turning Data into Knowledge: Creating and Implementing a Meta Data Strategy

State of Michigan Department of Technology, Management & Budget

Oracle Role Manager. An Oracle White Paper Updated June 2009

Vertical Data Warehouse Solutions for Financial Services

Guidelines for Best Practices in Data Management Roles and Responsibilities

ORACLE BUSINESS INTELLIGENCE, ORACLE DATABASE, AND EXADATA INTEGRATION

Business Intelligence Enabling Transparency across the Enterprise

Microsoft Data Warehouse in Depth

OLAP and OLTP. AMIT KUMAR BINDAL Associate Professor M M U MULLANA

The Business in Business Intelligence. Bryan Eargle Database Development and Administration IT Services Division

Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization

NASCIO EA Development Tool-Kit Solution Architecture. Version 3.0

Table of Contents. State of Colorado Data Strategy

LEARNING SOLUTIONS website milner.com/learning phone

SAS BI Course Content; Introduction to DWH / BI Concepts

META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING

Business Intelligence & IT Governance

JOURNAL OF OBJECT TECHNOLOGY

Analytics Strategy Information Architecture Data Management Analytics Value and Governance Realization

Class News. Basic Elements of the Data Warehouse" 1/22/13. CSPP 53017: Data Warehousing Winter 2013" Lecture 2" Svetlozar Nestorov" "

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Chapter 5. Warehousing, Data Acquisition, Data. Visualization

can you effectively plan for the migration and management of systems and applications on Vblock Platforms?

Revenue and Sales Reporting (RASR) Business Intelligence Platform

BUSINESS INTELLIGENCE. Keywords: business intelligence, architecture, concepts, dashboards, ETL, data mining

Advanced Data Management Technologies

Published by: PIONEER RESEARCH & DEVELOPMENT GROUP ( 28

Data Virtualization. Paul Moxon Denodo Technologies. Alberta Data Architecture Community January 22 nd, Denodo Technologies

<Insert Picture Here> Extending Hyperion BI with the Oracle BI Server

Master data deployment and management in a global ERP implementation

SQL Server 2012 Business Intelligence Boot Camp

W H I T E P A P E R B u s i n e s s I n t e l l i g e n c e S o lutions from the Microsoft and Teradata Partnership

Information Management & Data Governance

Transcription:

CDCR EA Data Warehouse / Business Intelligence / Reporting Strategy Overview February 12, 2010

Agenda 1. Purpose - Present a high-level Data Warehouse (DW) / Business Intelligence (BI) / Reporting Strategy 2. Define Terms 3. Key References 4. Project Scope for Strategy Overview 5. Concerns 6. Context Diagram 7. Mappings by Programs, Projects & Initiatives (As-Is) 8. DW Maturity y/ Readiness Matrix 9. Strategy 10. 3 Phased Targets 11. Dependencies 12. EDW Best Practices Appendices 2

Define Terms Data Warehouse (DW) Is a repository of an organization's electronically stored data. DWs are designed to facilitate reporting and analysis. DW provides a single 360 degree view of the business and a platform for business intelligence tasks ranging from predictive analysis to near real-time strategic and tactical decision support. Three keys to DW to get correct are 1) the hardware configuration, 2) the data model, and 3) the data loading process. DW supports research by providing data organized and optimized for results. Business Intelligence (BI) Refers to skills, processes, technologies, applications and practices used to support decision making. BI technologies provide historical, current, and predictive views of business operations. Common functions of Business Intelligence technologies are reporting, analytics, data mining, business performance management, predictive analytics. BI supports research by providing triggers, outcomes, data mining and trending, and makes results available to decision makers. Reporting Ability to provide strategic appropriate authorized relevant accurate reports to enable design making, including portal services to control access and provide the intelligence to the correct users. 3

Define Terms cont. DW Workload: Research Ability to mine, analyze, report and provide analytics on available enterprise data to the executive staff concerning the impact of potential new policies Compliance Alignment to reporting bodies (legislature, courts, governor s office) Operations Day to day IT support to programs by mission critical application Outcomes Office of Research (OR) provides outcomes based on their analysis and analytics; also specified as outcomes of CDCR strategic plan goals except clinical focused outcomes Performance COMPSTAT provides performance based on institution reported data except clinical focused performance 4

Define Terms cont. Technology Stack: Governance The governance process across the different Technology Stack components Portal An Enterprise Portal is a Web software infrastructure providing access to, and interaction with, relevant information assets (information/content, applications and business processes), knowledge assets and human assets, by select targeted audiences, delivered in a highly personalized manner. There are two types of Portals: Vertical portals focus on accessing specific applications or business functions; Horizontal portals seek to integrate and aggregate information from multiple cross enterprise applications, as well as specific line-of-business tools and applications. Reporting Ad hoc, canned and variable reports available on-line and batch, hardcopy or electronic Business Intelligence (BI) described by the Technology Bricks as: Business intelligence (BI) refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business information and sometimes to the information itself. The purpose of business intelligence is to support better business decision making. Thus, BI is also described as a decision support system (DSS). Data Mart (DM) A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization. Data Warehouse (DW) Data Warehouse specific software tools that provide specific DW functions; described by the Technology Bricks as: A data warehouse is a central repository for all or significant parts of the data that an enterprise's various business systems collect. Typically, a data warehouse is housed on an enterprise server. Data from various online transaction processing (OLTP) applications and other sources is selectively extracted and organized on the data warehouse database for use by OLAP or data mining analytical applications and user queries. Data warehousing emphasizes the capture of data from diverse sources for useful analysis and access, but does not generally start from the point-of-view of the end user or knowledge worker who may need access to specialized, sometimes local databases. (This definition is specific to the Technology Stack, different from the higher h level l DW strategy t definition iti in the previous slide.) Data Element level data that describes CDCR s business which are organized into tables, schemas, views, and databases held in what can be considered a data staging server environment Software Layered software supporting the operations of the hardware such as operating systems, tools, middleware Hardware Server level hardware, in this case specifically supporting the software, data, DW and BI components; may include database servers, application servers, reporting servers and web servers if needed Enterprise Identity Management (EIdM) broad administrative area that deals with identifying individuals in a system (such as a country, a network or an organization) and controlling the access to the resources in that system by placing restrictions on the established identities. 5

Key References 1. EDAC Enterprise Data Architecture Committee Charter http://portal.cdcr.ca.gov/admin/eis/pmo/ea/team/shared%20documents/enterprise%20data% ca 2. FEA Data Reference Model Version 2.0 http://www.whitehouse.gov/omb/asset.aspx?assetid=561 3. DAMA International Foundation http://www.dama.org/i4a/pages/index.cfm?pageid=1 4. A Call to Action for State Government: Guidance for Opening the Doors to State Data http://www.nascio.org/publications/index.cfm#118 org/publications/index cfm#118 5. Data Governance Part I: Data Governance www.nascio.org/publications/documents/nascio- DataGovernance-Part1.pdf - 2009-02-06 6. Data Governance Part II: Maturity www.nascio.org/publications/researchbriefs.cfm 7. Data Governance Part III: Frameworks www.nascio.org/publications/researchbriefssubject2.cfm?category=14 8. The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government www.data.gov 9. California Logical Model www.cdcr.ca.gov/news/reentry / y_ slideshow/logic _ model.pdf 10. Little Hoover Commission Report: A New Legacy System: Using Technology to Driver Performance http://www.lhc.ca.gov/studies/193/report193.html 6

Key References cont. Respected CA State Sources 1. Governor s Reorganization Plan 2. CDCR Strategic Plan 3. CDCR IT Strategic Plan 4. CDCR IT Capital Plan 5. COMPSTAT Charter 6. Office of Research (OR) Charter 7. SOMS Contract 8. BIS Contract t 9. OCIO Statewide Data Strategy Report 7

Key References cont. Technology Stack Mapped to Key References and Respected Sources Respected CA State Sources (from page 7) Key References (from page 6) ( p g ) Technology Stack 1 2 3 4 5 6 7 8 9 10 11 1 2 3 4 5 6 7 8 9 Governance Portal Su upports Reporting BI DW DM Data Software Hardware Legend: Green reflects that the reference supports the given technology stack component 8

Project Scope for Strategy Overview Scope 1. Define the lifecycle of a data warehouse specific implementation 2. Define data warehouse industry best practices principles i and frameworks of thinking 3. Leverage current production data capabilities as-is (including OR & COMPSTAT) 4. Leverage the technology opportunities and strategic needs of approved projects 5. Align to Enterprise Data Architecture Committee Charter (EDAC) Principles* 6. Produce the overall strategy for ongoing maturity 7. Produce a plan for the first target step mapped to the lifecycle 8. Effort limited to 70 hours planned for 10/05/09 11/30/09 9. Focus was on CDCR s as-is is environment. CPHCS environment details will be incorporated into the next planning cycle as CPHCS falls into the future state architecture. CDCR and CPHCS future state EDW, BI, and reporting will be integrated. Out-of-Scope Meetings were conducted with Budget/Funding not included COMPSTAT Required PYs not included EA Due to the limited scope of this engagement, g follow-on tasks may be needed EdFIRST OISB SOMS BIS * See Appendix - C for EDAC alignment CPHCS 9

Concerns 18 specific concerns resulted during meetings with the various teams which can be summarized as follows*: Data quality / data cleansing needed / common vocabulary needed** Lack of governance confusion on who has authority to make decisions & policies Insufficient performance metrics (fidelity) Misaligned reports (not supportive to current business needs) Tactical, ad hoc, silo'ed, reactive, inconsistent applications Technical performance needs improvement (bandwidth) Cost effectiveness (new Executive order) Economics of scale (support model) ** Does not include CPHCS targeted data quality requirements * Details are included in Appendix - A 10

Context Diagram (As-Is) COMPSTAT Program 17 CPHCS And 13 Other CDCR Systems SOMS Project Performance OISB Compliance Outcomes DB DW / BI / Infrastructure Reporting Initiative (Led by Scott Research Operations MacDonald) Office of Research (OISB, ARB, JRB) Program 11 BIS Project Green Blue Yellow Legend: Programs, Projects, & Initiatives DW Workloads DW / BI / Reporting Environments

Mapping by Programs (As-Is) Findings: Programs are silo ed Programs COMPSTAT OR: OISB, ARB, JRB DW Workloads Compliance Performance Outcomes Operations Research Green Blue Yellow White Light Green Tan Red White Legend: Programs DW Workloads DW Environments Technology Stack COMPSTAT Program OR Program Gaps Overlaps 12 DW Environments COMPSTAT OISB DB Infrastructure OR SOMS BIS Technology Stack Governance Portal Reporting BI DM DW Data Software Hardware

Mapping by Projects (As-Is) cont. Findings: Projects are silo ed Projects SOMS BIS DW Workloads Compliance Performance Outcomes Operations Research Green Blue Yellow White Light Green Tan Red White Legend: Projects DW Workloads DW Environments Technology Stack SOMS Project BIS Project Gaps Overlaps 13 DW Environments COMPSTAT OISB DB Infrastructure OR SOMS BIS Technology Stack Governance Portal Reporting BI DM DW Data Software Hardware

Mapping by Initiatives (As-Is) cont. Findings: Initiatives are silo ed Initiatives OISB DB Infrastructure DW Workloads Compliance Performance Outcomes Operations Research Green Blue Yellow White Light Green Tan Red White Legend: DW Environments COMPSTAT OISB DB Infrastructure OR SOMS BIS Initiatives DW Workloads DW Environments Technology Stack OISB DB Infrastructure Initiative Gaps Overlaps Technology Stack Governance Portal Reporting BI DM DW Data Software Hardware 14

Strategy Vision Build a strategic consolidated Enterprise Data Warehouse (EDW) technical infrastructure with virtualized Data Marts (DM), Business Intelligence (BI) and Reporting functions. As-Is To-Be Technology Stack Distributed Consolidated Technology Stack Virtualized Consolidated Governance Yes Governance Yes Portal Yes Portal Yes Reporting Yes Reporting* Yes BI Yes BI* Yes DM Yes DM* Yes DW Yes EDW Yes Data Yes Data Yes Software Yes Software Yes Hardware Yes Hardware Yes * The business will own and use their own tools from the datacenter. 15

Strategy cont. As-Is Numerous Redundant Data Feeds To Silo ed Environments 16

Strategy cont. Phased To-Be Consolidated, Integrated, Quality, Governed Data To EA Aligned TRM 17

Strategy - cont. Troux As-Is Diagram 18

Strategy cont. Troux To-Be Diagram 19

Drivers (not prioritized, sorted alpha) CDCR Enterprise-wide Integration Executive (Budgeting) Outcome Measures Support the Business Values: Strategy cont. Authoritative Drivers Leverage prior investments to implement new business functions in a more timely and cost effective manner (SOMS, EdFIRST, DECATS, DORMS) Expose CDCR business process gaps, allowing them to be resolved BEFORE litigation occurs Provide crucial planning information required to integrate CDCR and CPHCS Expedite the integration and/or migration of legacy and new systems Support Court Requirements e e Support CPHCS Support Current Data Warehouses (COMPSTAT, OR, BIS, SOMS, others) Support EA Best Practices Support Enterprise Data Architecture Committee (EDAC)* Support Legislative Requirements (CROB / AB900) Support Operational Needs Support Reportable Projects (PLM,CITIP, SOMS, RSTS, DECS, EdFIRST, ARNAT, WINPLO, BIS, RACS, CIPS) * See Appendix - C for EDAC alignment 20

Strategy cont. Strategies Governance - Clarify authoritative sources (who does what), organization authority, organization structure and funding mechanisms with consistent consolidated governance Data - Leverage CODB, OISB & COMPSTAT processes as a starting point to standardize and provide best practices. Consolidate data feeds single centralized data source (quality, timeliness, accuracy) Hardware/Software - Leverage OISB DB Infrastructure initiative to create infrastructure and performance baseline and align to EA standards for HW / SW (Portal, DW, BI, reporting tools) consolidating on EDW / BI tools and leverage licenses from SOMS, OR, COMPSTAT Strategic Reports - Provide consistent integrated reports to provide a common look and feel 21

Phased Targets Phase I Target Proof-of-Concept (anticipated to be completed in Year 1) Phase II Target (anticipated to be completed in Year 1) Phase III Target Distributed Reporting BI DM DW Consolidated Governance Portal Data Software Hardware Virtualized Consolidated d Virtualized Consolidated d Governance Portal Governance Portal Reporting* BI* Reporting* BI* DM* DM* EDW EDW Data Data Software Software Hardware Hardware * The business will own and use their own tools from the datacenter. 22

Phased Targets* - cont. As-Is To-Be Target Steps Plan: Phase I Target Fund and execute OISB DB Infrastructure initiative as a proof-ofconcept with one metric as the first step, focusing on consolidating Governance, Data, Software and Hardware Provide long-term funding and budget Phase II Target Focus on funding, data issues, scalability and metrics: Resolve key data issues for single central source, common vocabulary and data quality Build virtualized Reporting, BI and DM with consolidated EDW Plan scalability, capacity Build out infrastructure based on Phase I proof-of-concept Implement additional metrics through a consolidated governance process BIS integrated data with SOMS Focus on legacy systems migration Phase III Target Bring CPHCS and SOMS/EdFIRST, OR, and COMPSTAT feeds onto the EDW environment with Reporting Complete Reporting, BI, DM and EDW * See details in Troux 23

Planning Consideratons SOMS SOMS is working on a roadmap for Release 1A scheduled Q1 2011, efforts can be dovetailed to maximize benefits EIdM The EIdM project is planning to release with SOMS 1A for SOMS use, full functionality will be in Release 3 of SOMS BIS BIS will be data integrated with SOMS 1A SharePoint SP is the CDCR Enterprise portal, no dependencies have been found CPHCS CPHCS is focused on their Medication Records project for 2011-2012 and does not have any direct dependencies; An opportunity may exist to use Oracle Health BI once SOMS is delivered to leverage the Oracle licensing. SAS SAS analytics used at CPHCS and OR 24

EDW Best Practices Start by defining your user population s requirements in terms of complexity and frequency of queries, both today and in the future Factor in how your data volume is expected to grow, the frequency of data loads, and the mix of workloads Consider scalability of data storage, network bandwidth, and processing capacity Establish your analytic requirements Look for an integrated analytics solution for best performance Design in acceptable performance criteria and SLAs Determine security requirements Map the skills and resources required to implement and manage your data warehouse today and in the future to what you already have. Determine your time-to-market requirements Analyze ongoing management costs and long-term data warehousing investment Identified a list of solution providers that meet your requirements Talk to reference customers Run benchmark tests or proof of concepts Provide reasonable timelines in your project plans Use real source data from operational systems Reference: http://www.tdwi.org/publications/bijournal/display.aspx?id=8266tdwi aspx?id=8266 25

EDW Best Practice Links SAP - http://searchsap.techtarget.com/tip/1,289483,sid21_gci1134359,00.html Oracle - http://www.oracle.com/technology/products/oracle-data- / h l / / l d integrator/pdf/odi-bestpractices-datawarehouse-whitepaper.pdf Microsoft - http://msdn.microsoft.com/en-us/library/cc719165.aspx Microsoft - http://www.microsoft.com/presspass/press/2009/feb09/02- com/presspass/press/2009/feb09/02 23SQLFastTrackPR.mspx?rss_fdn=Press%20Releases Gartner - http://www.gartner.com/displaydocument?id=1221936 Gartner magic quadrant - http://www.gartner.com/technology/mediaproducts/reprints/ncr/article2/article2.html Federal Enterprise Architecture - http://www.whitehouse.gov/omb/e-gov/fea/ NASCIO Maturity Model - http://www.nascio.org/publications/documents/nascio-eamm.pdf 26

Questions???

Appendix Appendix A 18 Concerns Details Appendix B DW Workloads Mapped to FEA PRM Framework Appendix C EDAC Alignment 28

Appendix A - 18 Concerns Details 1. Data quality / cleansing issues 2. Metrics not well established 3. Ad hoc tactical environment 4. Silo ed data warehouses 5. Too many inconsistent portals 6. Lack of governance 7. Competition for operational data feeds is taxing providers 8. Security issues 9. Technical Performance 10. Requests are not well thought out causing miss communication and confusion of reports 11. EIS services (SLAs) lacking 12. Data definition and semantics inconsistencies between data warehouses 13. Don t understand the lifecycle clearly which re enforces a reactive mode 14. Don t reconcile conflicts and inconsistencies of demands, results in dissatisfied customers 15. DW / BI / reporting currently is tactical ad hoc 16. Too many projects in flight saturates ability to respond 17. Performance measurements are inconsistent, not correctly stated, does not provide a consistent story 18. Enterprise policies or direction not established resulting in tactical silo ed decisions 29

Appendix B - DW Workloads Mapped to FEA PRM Framework FEA PRM Framework DW Workload Processes & Activities Mission & Business Results Customer Results Human Capital Technology Other Fixed Assets Research Compliance Operations Outcomes Performance Legend: Green notes workloads that support the FEA PRM Framework 30

Appendix C - EDAC Alignment The DW / BI / Reporting Strategy is positioned to support EDAC alignment and is leveraged by the first strategy noted above (Clarify authoritative sources, organization authority, organization structure and funding mechanisms): Manage from a consolidated enterprise-wide perspective Collect and manage information as an asset in accordance with its business values Promote application consolidation, standardization, and integration Secure information assets from unauthorized access, use, modification, destruction, and / or disclosure Delegate security authorization authority regarding information classification and access Promote a high level of security awareness to employees and partners Assign a single steward or authoritative source for each individual data item with clearly defined locations and data accessibility Separate production databases from data warehouses used for decision support Maintain common and consistent information / data definitions across all business units (e.g., data dictionary) Ensure data accessibility to all authorized entities Store and transmit data electronically to avoid data transcription and manual re-entry Preserve and maintain data such that the information remains accessible and useable for the designated retention period Ensure system data is accessed through the use of business rules 31