COURSE OUTLINE

Track 1: Advanced Data Modeling, Analysis and Design

TDWI Advanced Data Modeling Techniques

Module One: Data Modeling Concepts
    Data Models in Context
        Zachman Framework Overview
        Levels of Data Models: Enterprise Perspective
        Levels of Data Models: Project Perspective
    Entity Relationship Diagram Overview
        Entity Relationship Diagram and Its Components
    Normalized and Dimensional Models
    Standards
    Normalization
        Normalization to Third Normal Form
        Higher Normal Form

Module Two: Business Data Model Development
    Model Components
        Entities
        Supertype and Subtype
        Relationships
        Attributes
    Business Data Model Development Approaches
        Top-Down
        Bottom-Up
        Generic Models
        Limited Depth Model
    Business Stakeholder Data Modeling Roles
        Business Stakeholder
        Data Steward
        Subject Matter Expert
        Business Analyst
        Data Analyst / Data Modeler
    Business Data Model Application
        Basis for System Data Model
        Transformation and Integration Foundation
        Data Profiling
        Package Selection
        Business Communications
    Special Considerations
        Recursive Relationships
        State and Status in ERD
        Diagramming Options
    Metadata
    Tool Exploitation

Module Three: System and Physical Data Model Development
    Data Modeling Roles
        Data Analyst / Data Modeler
        Database Administrator
        Business Analyst
        Developer
    Application Implications
    Model Differences
    Denormalization Overview
    Time Dependencies
    History
    Optimization
        Indexing
        Horizontal Partitioning
        Vertical Partitioning
    Special Considerations
        Surrogate Keys
        Columnar Databases
        Point-in-Time vs. Over-Time Models
        Data Warehouse Load Implications
    Metadata

Module Four: Additional Concepts
    Complementary Models
        State Transition Model
        Function Models
        Process Models
        Use Case Model
    Model Management
        Model Validation and Testing
        Model Synchronization
    Tools
        Data Modeling Tools
        Repositories

Module Five: Summary and Conclusions
    Critical Success Factors
    Common Mistakes

Appendix A: Bibliography and References

Appendix B
    1: Standards
    2: Normalization to Third Normal Form
    3: Normalization to Higher Normal Forms
    4: Subject Areas
    5: Entity-Level Business Data Model
    6: Attribute-Level Business Data Model
    7: Model Application for Data Profiling
    8: Application System Model Development
    9: History Implications
    10: Indexing
    11: Model Synchronization

Advanced Dimensional Modeling Techniques for Practitioners

Module 1: Review and Architecture
    Guiding principles
    Review dimensional modeling basics
        Star schema tables and characteristics
        Surrogate keys vs. natural keys
        Additivity and non-additivity
        Slowly changing dimensions type 1, type 2
    Three architectures that include dimensional models
        Inmon / Imhoff Corporate Information Factory
        Kimball Dimensional Data Warehouse
        Stand-alone data marts
        Common components of all solutions
    Incorporating aggregates and OLAP
    Quiz: Test your design skills
        Critique a star schema design

Module 2: Multiple Fact Tables
    Beyond a single fact table design
        Challenge: facts with differing grain or periodicity
        Pitfalls of single fact table design
        Designing for multiple fact tables
    Querying multiple fact tables
        Why a single query returns wrong results
        The concept of drilling across
    Conformance
        Ensuring subject areas work together
        Enabling incremental implementation
    Schema design exercise: Design a multiple fact table solution based on provided scenario

Module 3: Advanced Fact Table Design
    Snapshots
        Challenge: determining status from transaction history
        The snapshot solution
        Semi-additive facts
        Use of snapshot vs. transaction tables
        Implications on density and slowly changing dimensions
    Accumulating snapshots
        Challenge: studying fixed process milestones
        The accumulating snapshot
        Performing lag analysis
        Implications for ETL process (fact table is repeatedly touched)
    Heterogeneous Dimensions
        Challenge: heterogeneous attributes within a dimension
        Implications for single fact table
        Building custom fact tables for homogeneous subsets
        The view alternative
    Factless fact tables
        Challenge: nothing to measure but dimensions
        The factless fact table
        Using the factless fact table (counts)
        Coverage tables for tracking things that did not happen
    Schema design exercise: Employ advanced fact table techniques to design a solution supporting provided business scenario

Module 4: ETL Processing
    Processing sequence
    Initial load
        Useful for prototyping
        Dimension processing
        Fact processing and lookups
    Incremental load
        Real world loads
        Processing slow changes
        Fact processing
    Designing for ETL
        Design elements that support ETL processing
        Changed data identification
        Hashing strategies and other optimizations
    Dealing with bad data
        Cleaning up data
        What not to clean up
        When automated cleansing fails
    Paper exercise: filling in warehouse tables
        Given a schema design and source data dumps, what should the end result of the load process look like?

Module 5: Advanced Dimension Table Design
    Dimension reuse
        Roles
        Querying with roles
        Outriggers
    Understanding basic dimensional hierarchies
        Drilling without hierarchies
        Hierarchies
        Multiple hierarchies in a single dimension table
        Why to model hierarchies
        Snowflakes
    Mini-dimensions
        Challenge: large and expanding dimensions
        The use of a mini-dimension
        Loading the mini-dimension
    Transaction dimensions
        Challenge: point in time analysis of dimensional data
        Where the type 2 change breaks down
        A transaction dimension
        Identifying status at a specific point in time
        Using in relation to a fact table
    More Slow Change Techniques
        Type 3 Slow Changes
        Hybrid Slow Changes
    Schema design exercise: Employ advanced dimension techniques in support of the supplied scenario

Module 6: Many to Many Relationships and Bridge Tables
    Dimension Bridge Tables
        Challenge: a multi-valued dimension
        Why flattening does not solve the problem
        The dimension bridge table
        Using the bridge: avoid double counting
        Variations: allocation
        Hiding the bridge
    Attribute Bridge Tables
        Challenge: a multi-valued attribute
        The attribute bridge table
        Use of the bridge
    Hierarchy Bridge Table
        Challenge: the organizational hierarchy dilemma
        Capturing all parent/subsidiary relationships: the bridge table
        Using the bridge to analyze data under an organization
        Using the bridge to analyze data above an organization
        Working without the bridge
    Design exercise: Modify a supplied schema design to support a business case that involves many-to-many relationships

Module 7: Designing for Scalability and Performance
    Dispelling common myths
        Exploration warehouse or mart in Corporate Information Factory
        Standalone data mart
        Dimensional data warehouse
    Scaling in Scope
        Conformance across subject areas
        Conformed dimensions that are not identical
        Tools for planning conformance
    Scaling for Performance: Aggregate Schemas
        Principles of aggregation
        Single table approach and drawbacks
        Aggregate fact tables
        Invisible aggregates
    Scaling for Performance: Derived Schemas
        Merged fact tables
        Pivoted fact table
        Set operations on fact tables
        Sliced fact tables
    Conformance quiz
        Develop a conformance matrix based on supplied scenario

Module 8: The Project
    Designing and building a dimensional schema
        Early data load
        Value of iterative design
    Activities and deliverables for typical project stages
    Project initiation
        Project definition & control of scope
        Architectural issues
        Roles
    Strategy and Initial Design
        Requirements analysis
        Source system analysis
        Dimensional design
        Project strategy
        Documenting requirements and design
    Design and Build
        Initial load and feedback
        Design review and modification
        Controlling scope
        Developing the load
        Developing information products
    Test and Deploy
        Testing procedures
        Finalization and documentation
        Transitioning to production
        Training
    Maintenance
        Load maintenance
        Maintaining a BI competency center