Raw Data vs. Business Information Module 1: Introduction to Data Warehousing and OLAP Capturing Raw Data Gathering data recorded in everyday operations Deriving Business Information Deriving meaningful information from raw data Turning Data into Information Implementing a decision support system OLTP Source Systems Data Warehouse Characteristics OLTP System Characteristics Processes real-time transactions of a business Contains data structures optimized for entries and edits Provides limited decision support capabilities OLTP Examples Order tracking Service-based sales Customer service Banking functions Point-of-sales Provides Data for Business Analysis Processes Integrates Data from Heterogeneous Source Systems Combines Validated Source Data Organizes Data into Non-Volatile, Subject-Specific Groups Stores Data in Structures that Are Optimized for Extraction and Querying Data Warehouse System Components Defining OLAP Solutions Data Sources Staging Area Data Warehouse Data Marts User Data Access OLAP Databases Common OLAP Applications Data Marts and OLAP Cubes OLAP in SQL Server 2000 Data Input Data Access 1
OLAP Databases Common OLAP Applications Optimized Schema for Fast User Queries Robust Calculation Engine for Numeric Analysis Conceptual, Intuitive Data Model Multidimensional View of Data Drill down and drill up Pivot views of data Executive Information Systems Performance measures Exception reporting Sales/Marketing Applications Booking/billing Product analysis Customer analysis Financial Applications Reporting Planning Analysis Operations Applications Manufacturing Customer service Product cost Data Marts and OLAP Cubes OLAP in SQL Server 2000 Data Storage Data Content Data Sources Data Mart Data Structure Detailed and Summarized Data and Non-relational Sources OLAP Cube N-dimensional Data structure Summarized Data and Non-relational Sources Microsoft Is One of Several OLAP Vendors Analysis Services Is Bundled with Microsoft SQL Server 2000 Analysis Services Include OLAP engine Data mining technology Data Retrieval Fast Performance for for Data Extract Queries Faster Performance for for Data Extract Queries Understanding Data Warehouse Design The Star Schema Fact Table Components Dimension Table Characteristics The Snowflake Schema The Star Schema Employee_Dim EmployeeKey EmployeeID Dimension Table Time_Dim Fact Table TimeKey Sales_Fact TheDate TimeKey EmployeeKey ProductKey CustomerKey ShipperKey Sales Amount Unit Sales Shipper_Dim ShipperKey ShipperID Product_Dim ProductKey ProductID Customer_Dim CustomerKey CustomerID 2
Fact Table Components Dimension Table Characteristics Dimension Tables customer_dim 201 ALFI Alfreds sales_fact Table Foreign Keys Measures product_dim 25 123 Chai customer_key product_key time_key quantity_sales amount_sales 201 25 134 400 10,789 time_dim 134 1/1/2000 The grain of the sales_fact table is defined by the lowest level of detail stored in each dimension Describes Business Entities Contains Attributes That Provide Context to Numeric Data Presents Data Organized into Hierarchies The Snowflake Schema Understanding OLAP Models Defines Hierarchies by Using Multiple Dimension Tables Is More Normalized than a Single Table Dimension Is Supported within Analysis Services OLAP Database Components OLAP Dimensions vs. Dimensions Dimension Fundamentals Dimension Family Relationships Cube Measures Data Sources OLAP Database Components OLAP Dimensions vs. Dimensions Numeric Measures Dimensions Cubes OLAP REGION West CA OR East MA NY REGION West East STATE REGION CA West OR West MA East NY East 3
Dimension Fundamentals Dimension Family Relationships USA North West Oregon Washington South West California USA is the parent of North West and South West North West and South West are children of USA North West and California are descendants of USA North West and USA are ancestors of Washington North West and South West are siblings Oregon and California are cousins All are dimension members Cube Measures Data Sources Are the Numeric Values of Principle Interest Correspond to Fact Table Facts Intersect All Dimensions at All Levels Are Aggregated at All Levels of Detail Form a Dimension Star and Snowflake Schemas Are required to build a cube with Analysis Services Fact Table Contains measures Contains keys that join to dimension tables Dimension Tables Must exist in same database as fact table Contain primary keys that identify each member Applying OLAP Cubes Defining a Cube Defining a Cube Querying a Cube Defining a Cube Slice Market Dimension Working with Dimensions and Hierarchies Visualizing Cube Dimensions Connecting to an OLAP Cube Detroit 4
Querying a Cube Defining a Cube Slice Sales Fact Markets Dimension Markets Dimension Dallas Detroit Working with Dimensions and Hierarchies Visualizing Cube Dimensions Dimensions Allow You to Slice Dice Hierarchies Allow You to Drill Down Drill Up Connecting to an OLAP Cube Review State USA Sales Units 6000 5000 4000 3000 2000 1000 0 Bagels Muffins Sliced Bread Cheese Deli Meats Frozen Chicken Bread Dairy Meat Level 02 Year Quarter Sheri Now mer - 2001 - Quarter 4 Sheri Now mer - 2001 - Quarter 3 Sheri Now mer - 2001 - Quarter 2 Sheri Now mer - 2001 - Quarter 1 Sheri Now mer - 2000 - Quarter 4 Sheri Now mer - 2000 - Quarter 3 Sheri Now mer - 2000 - Quarter 2 Sheri Now mer - 2000 - Quarter 1 Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes Category Subcategory 5