Data Warehousing. Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

1 Data Warehousing & Data Mining Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig

2 5. Queries
5.1 OLAP query languages
- OLAP operators in SQL
- MDX (MultiDimensional eXpressions)
5.2 Data modeling
- Logical modeling and implementation
Data Warehousing & OLAP, Wolf-Tilo Balke, Institut für Informationssysteme, TU Braunschweig

3 5.1 How does OLAP work?
[Figure: architecture overview. Several presentation clients access an OLAP interface; behind it sit a HOLAP server and a ROLAP server, backed by multidimensional databases (MDDB) and relational databases (RDBMS).]

4 5.1 How does OLAP work?
OLAP systems use a client/server architecture
- The client displays reports and lets the end user perform the OLAP operations and other custom queries
- The server is responsible for providing the requested data
- How it does so depends on whether it is a MOLAP, ROLAP or HOLAP server

5 5.1 How does OLAP work?
OLAP server
- A high-capacity, multi-user data manipulation engine specifically designed to support and operate on multidimensional data structures
- Optimized for fast, flexible calculation and transformation of raw data based on formulaic relationships

6 5.1 How does OLAP work?
An OLAP server may either
- Physically stage the processed multidimensional information to deliver consistent and rapid response times to end users (MOLAP)
- Populate its data structures in real time from relational or other databases (ROLAP)
- Or offer a choice of both (HOLAP)

7 5.1 How does OLAP work?
We have seen that
- The best way to represent data at the presentation level is multidimensional
- This holds regardless of whether the storage is multidimensional (MOLAP) or relational (ROLAP)
- It is optimal for analysis purposes: easy for decision makers to understand, a natural representation of business data, etc.

8 5.1 OLAP query languages
Getting from OLAP operations to the data
- As in the relational model, through queries
- In OLTP we have SQL as the standard query language
- However, OLAP operations are hard to express in SQL
- There is no standard query language for OLAP
- The choices are SQL-99 and MDX (Multidimensional expressions)

9 5.1 OLAP query languages
SQL-99 prepares SQL for OLAP queries
- New SQL commands: GROUPING SETS, ROLLUP, CUBE
- New aggregate functions
- Queries of type "top k"

10 5.1 SQL-99
Shortcomings of SQL/92 with regard to OLAP queries
- Hard or impossible to express in SQL: multiple aggregations, comparisons (with aggregation), reporting features
- Performance penalty: poor execution of queries with many AND and OR conditions
- Lack of support for statistical functions

11 5.1 SQL-99
Multiple aggregations in SQL/92
- Goal: a 2D spreadsheet that shows the sum of sales by make as well as by car model
- Each subtotal requires a separate aggregate query
- All branches of a UNION must return the same number of columns, so the aggregated-away columns are padded with NULL
[Table: models SUV, Sedan, Sport as columns; makes BMW, Mercedes as rows; a "By make" subtotal column, a "By model" subtotal row and the grand SUM.]
    SELECT model, make, SUM(amt) FROM sales GROUP BY model, make
    UNION ALL
    SELECT model, NULL, SUM(amt) FROM sales GROUP BY model
    UNION ALL
    SELECT NULL, make, SUM(amt) FROM sales GROUP BY make
    UNION ALL
    SELECT NULL, NULL, SUM(amt) FROM sales;
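The "one query per subtotal" pattern can be tried out with Python's built-in sqlite3 module. This is only a sketch: the table contents are made-up toy data, and the aggregated-away columns are padded with NULL so that every UNION ALL branch returns three columns.

```python
import sqlite3

# Hypothetical toy data: (model, make, amt)
rows = [
    ("SUV", "BMW", 10), ("SUV", "Mercedes", 20),
    ("Sedan", "BMW", 30), ("Sport", "Mercedes", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (model TEXT, make TEXT, amt INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# One aggregate query per subtotal level, glued together with UNION ALL
query = """
SELECT model, make, SUM(amt) FROM sales GROUP BY model, make
UNION ALL
SELECT model, NULL, SUM(amt) FROM sales GROUP BY model
UNION ALL
SELECT NULL, make, SUM(amt) FROM sales GROUP BY make
UNION ALL
SELECT NULL, NULL, SUM(amt) FROM sales
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Four separate scans of the sales table are needed for two grouping columns; this is the cost that the grouping operators of SQL-99 are meant to avoid.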

12 5.1 SQL-99
Comparisons in SQL/92
- This year's sales vs. last year's sales for each product
- Requires a self-join
    CREATE VIEW v_sales AS
      SELECT prod_id, year, SUM(qty) AS sale_sum
      FROM sales
      GROUP BY prod_id, year;

    SELECT cur.prod_id, cur.year, cur.sale_sum, last.year, last.sale_sum
    FROM v_sales cur, v_sales last
    WHERE cur.year = (last.year + 1)
      AND cur.prod_id = last.prod_id;

13 5.1 SQL-99
Reporting features in SQL/92
- Too complex to express: RANK (top k) and NTILE ("top X%" of all products), median, running total, moving average, cumulative totals
- E.g., moving average over a 3-day window of total sales for each product (the view exposes time_id, so the query must use it; END is a reserved word, so the aliases are s and e)
    CREATE OR REPLACE VIEW v_sales AS
      SELECT prod_id, time_id, SUM(qty) AS sale_sum
      FROM sales
      GROUP BY prod_id, time_id;

    SELECT e.prod_id, e.time_id, AVG(s.sale_sum)
    FROM v_sales s, v_sales e
    WHERE e.prod_id = s.prod_id
      AND e.time_id >= s.time_id
      AND e.time_id <= s.time_id + 2
    GROUP BY e.prod_id, e.time_id;

14 5.1 SQL-99
Grouping operators: extensions to the GROUP BY operator
- GROUPING SETS
- CUBE
- ROLLUP

15 5.1 Grouping operators
GROUPING SETS
- Used for reporting purposes
- Replaces the series of UNIONed queries
    SELECT dept_name, CAST(NULL AS CHAR(10)) AS job_title, COUNT(*)
    FROM personnel
    GROUP BY dept_name
    UNION ALL
    SELECT CAST(NULL AS CHAR(8)) AS dept_name, job_title, COUNT(*)
    FROM personnel
    GROUP BY job_title;
Can be rewritten as:
    SELECT dept_name, job_title, COUNT(*)
    FROM personnel
    GROUP BY GROUPING SETS (dept_name, job_title);

16 5.1 Grouping set
The issue of NULL values
- The new grouping functions generate NULL values at the subtotal levels
- So we have generated NULLs and real NULLs from the data itself
- How do we tell the difference? Through the return value of the GROUPING function
- GROUPING(job_title) returns 0 for a NULL in the data and 1 for a generated NULL
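To make the distinction between real and generated NULLs concrete, here is a small pure-Python emulation of GROUPING SETS over two columns. It is a sketch, not how a database implements the operator; the personnel rows and the helper name grouping_sets_count are hypothetical, and None stands for NULL.

```python
from collections import Counter

# Hypothetical personnel rows: (dept_name, job_title); None models a real NULL
personnel = [("IT", "Dev"), ("IT", None), ("HR", "Dev")]

def grouping_sets_count(rows, sets):
    """Emulate COUNT(*) ... GROUP BY GROUPING SETS over columns 0 and 1.

    Each output row is (dept_name, job_title, grouping_job_title, count).
    grouping_job_title is 1 when job_title was aggregated away
    (generated NULL) and 0 when the value comes from the data.
    """
    out = []
    for cols in sets:
        # Columns outside the grouping set are replaced by a generated NULL
        counts = Counter(
            tuple(r[i] if i in cols else None for i in range(2)) for r in rows
        )
        flag = 0 if 1 in cols else 1
        for (dept, job), n in counts.items():
            out.append((dept, job, flag, n))
    return out

result = grouping_sets_count(personnel, [(0,), (1,)])
for row in result:
    print(row)
```

The row (None, None, 0, 1) counts the employee whose job_title is really NULL, while rows such as ("IT", None, 1, 2) carry a generated NULL; only the flag tells them apart.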

17 5.1 Grouping operators
ROLLUP
- Produces a result set that contains subtotal rows in addition to the regular grouped rows
- GROUP BY ROLLUP (a, b, c) is equivalent to GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ())
- N elements of the ROLLUP translate to (N+1) grouping sets
- Order is significant to ROLLUP: GROUP BY ROLLUP (c, b, a) is equivalent to the grouping sets (c, b, a), (c, b), (c), ()
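The translation from ROLLUP to grouping sets is just "every prefix of the column list, longest first". A minimal sketch (the function name rollup is chosen here for illustration):

```python
def rollup(columns):
    """Return the grouping sets of ROLLUP(columns): every prefix, longest first."""
    return [tuple(columns[:k]) for k in range(len(columns), -1, -1)]

print(rollup(["a", "b", "c"]))  # prefixes of (a, b, c) down to the empty set
print(rollup(["c", "b", "a"]))  # a different order gives different subtotals
```

The empty tuple stands for the grand total, so N columns always yield N+1 grouping sets.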

18 5.1 ROLLUP
ROLLUP operation, e.g.:
    SELECT year, brand, SUM(qty)
    FROM sales
    GROUP BY ROLLUP (year, brand);
[Result table: columns Year, Brand, SUM(qty); the numeric values were lost. For each year, the (year, brand) rows for Mercedes, BMW and VW are followed by a (year) subtotal row; the last row is the (ALL) grand total.]

19 5.1 Grouping operators
CUBE operator
- Contains all the subtotal rows of a ROLLUP and, in addition, cross-tabulation rows
- Can also be thought of as a series of GROUPING SETS
- All combinations (subsets) of the cubed grouping expressions are computed, along with the grand total
- N elements of a CUBE translate to 2^N grouping sets
- GROUP BY CUBE (a, b, c) is equivalent to GROUP BY GROUPING SETS ((a, b, c), (a, b), (a, c), (b, c), (a), (b), (c), ())
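CUBE expands to one grouping set per subset of the column list, which is where the 2^N comes from. A minimal sketch using itertools (the function name cube is chosen here for illustration):

```python
from itertools import combinations

def cube(columns):
    """Return the grouping sets of CUBE(columns): all subsets, largest first."""
    return [
        subset
        for k in range(len(columns), -1, -1)
        for subset in combinations(columns, k)
    ]

sets = cube(["a", "b", "c"])
print(len(sets))  # number of grouping sets for three columns
print(sets)
```

Because the count doubles with every extra column, CUBE over many dimensions quickly becomes expensive to compute and to read.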

20 5.1.1 CUBE operator
[Figure: the data cube and the sub-space aggregates. Four panels of growing dimensionality: a single aggregate (Sum); a group-by with total (SUV, Sedan, Sport by model, plus Sum); a cross tab (by model and by make for BMW and Mercedes, plus Sum); and the full cube with aggregates by make, by model, by year, by make & model, by make & year, by model & year, plus Sum.]

21 5.1 CUBE
E.g., CUBE operator:
    SELECT year, brand, SUM(qty)
    FROM sales
    GROUP BY CUBE (year, brand);
[Result table: columns Year, Brand, SUM(qty); most numeric values were lost. In addition to the (year, brand) rows, the (year) subtotals and the (ALL) grand total of the ROLLUP, the result contains one (brand) subtotal row per brand (Mercedes, BMW, VW).]

22 5.1 OLAP functions
[Figure: OLAP function evaluation as a pipeline: partitioning, sorting, dynamic window, aggregation. The stages correspond to OVER (PARTITION BY ... ORDER BY ... ROWS BETWEEN ...) applied to functions such as RANK() and SUM().]

23 5.1 OLAP functions
The window clause
- Specifies that we want to perform an action over a set of rows
- 3 sub-clauses: partitioning, ordering and aggregation grouping
General format:
    <aggregate function> OVER ([PARTITION BY <column list>]
                               ORDER BY <sort column list>
                               [<aggregation grouping>])

24 5.1 Window clause
- Moving averages are hard to compute with SQL-92: they involve multiple self-joins of the fact table
- With the window clause we can create dynamic windows, expressed in the <aggregation grouping>
    SELECT AVG(sales) OVER (PARTITION BY region
                            ORDER BY month ASC
                            ROWS 2 PRECEDING) AS SMA3
- ROWS 2 PRECEDING covers the current row and the two rows before it: a moving average over 3 rows
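What the window clause computes can be spelled out in a few lines of plain Python. This is a sketch of the semantics of PARTITION BY region ORDER BY month ROWS 2 PRECEDING, not of how a database evaluates it; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict

def moving_average(rows, preceding=2):
    """rows: iterable of (region, month, sales).

    Returns (region, month, average) where the average covers the current
    row and up to `preceding` earlier rows of the same region.
    """
    partitions = defaultdict(list)
    for region, month, sales in rows:          # PARTITION BY region
        partitions[region].append((month, sales))
    out = []
    for region, part in partitions.items():
        part.sort()                            # ORDER BY month ASC
        for i, (month, _) in enumerate(part):
            window = [s for _, s in part[max(0, i - preceding): i + 1]]
            out.append((region, month, sum(window) / len(window)))
    return out

sample = [("North", 1, 10), ("North", 2, 20), ("North", 3, 30),
          ("North", 4, 40), ("South", 1, 100)]
sma3 = moving_average(sample)
for row in sma3:
    print(row)
```

The first rows of each partition have fewer than three rows available, so their average is taken over a shorter window, which matches the behavior of ROWS 2 PRECEDING.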

25 5.1 Ranking in SQL
Ranking operators in SQL
- Row numbering is the most basic ranking function
- ROW_NUMBER() returns, for each row, its sequential number within the result set
E.g.,
    SELECT SalesOrderID, CustomerID,
           ROW_NUMBER() OVER (ORDER BY SalesOrderID) AS RunningCount
    FROM Sales
    WHERE SalesOrderID > ...
    ORDER BY SalesOrderID;
[Result table: columns SalesOrderID, CustomerID, RunningCount; the values were lost.]

26 5.1 Ranking in SQL
ROW_NUMBER doesn't consider tied values
- Two rows with equal values in the ordering column still get two different row numbers
- The behavior is non-deterministic: the numbers assigned to tied rows may be swapped from one execution to the next
[Result table: columns SalesOrderID, RunningCount; the values were lost.]

27 5.1 Ranking in SQL
RANK and DENSE_RANK functions
- Allow ranking items in a group
- The difference between RANK and DENSE_RANK is that DENSE_RANK leaves no gaps in the ranking sequence when there are ties
Syntax:
    RANK ( ) OVER ( [query_partition_clause] order_by_clause )
    DENSE_RANK ( ) OVER ( [query_partition_clause] order_by_clause )
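The three numbering schemes differ only in how they treat ties, which a short pure-Python sketch makes visible. The function name rank_all and the sample values are made up for illustration; the ordering is descending, as in a typical sales ranking.

```python
def rank_all(values):
    """Return (value, row_number, rank, dense_rank) tuples, ordered descending."""
    ordered = sorted(values, reverse=True)
    out = []
    rank = dense = 0
    prev = object()  # sentinel that compares unequal to every value
    for row_number, v in enumerate(ordered, start=1):
        if v != prev:
            rank = row_number   # RANK jumps to the current row position
            dense += 1          # DENSE_RANK only advances by one
            prev = v
        out.append((v, row_number, rank, dense))
    return out

ranking = rank_all([300, 500, 300, 100])
for row in ranking:
    print(row)
```

With the two tied values of 300, RANK assigns 2 to both and continues with 4, DENSE_RANK continues with 3, and ROW_NUMBER assigns 2 and 3 in an order that the data alone does not determine.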

28 5.1 Ranking in SQL
SQL-99 ranking, e.g. (the slide shows only the beginning of the query):
    SELECT channel, calendar,
           TO_CHAR(TRUNC(SUM(amount_sold), -6), '9,999,999') SALES,
           RANK() OVER (ORDER BY TRUNC(SUM(amount_sold), -6) DESC) AS RANK,
           DENSE_RANK() OVER (ORDER BY TRUNC(SUM(amount_sold), -6) DESC) AS DENSE_RANK
    FROM sales, products ...
[Result table: columns CHANNEL, CALENDAR, SALES, RANK, DENSE_RANK, with two rows for Direct sales, two for Internet and one for Partners; the values were lost.]

29 5.1 Ranking in SQL
Other flavors of ranking: group ranking
- The RANK function can operate within groups: the rank is reset whenever the group changes
- A single query can contain more than one ranking function, each partitioning the data into different groups

30 5.1 Group Ranking
This is accomplished with the PARTITION BY clause, e.g.:
    SELECT ... RANK() OVER (PARTITION BY channel
                            ORDER BY SUM(amount_sold) DESC) AS RANK_BY_CHANNEL
    CHANNEL        CALENDAR   SALES    RANK_BY_CHANNEL
    Direct sales   ...        ...,000  1
    Direct sales   ...        ...,000  2
    Internet       ...        ...,000  1
    Internet       ...        ...,000  1
    Partners       ...        ...,000  1

31 5.1 Ranking in SQL
The treatment of NULL values
- NULLs are treated as normal values
- For ranking purposes, a NULL value is equal to another NULL value
- They are given ranks according to the ASC | DESC option provided for the measure and the NULLS FIRST | NULLS LAST clause
[Table: columns MONTH, SOLD and the resulting ranks under NULLS FIRST ASC, NULLS LAST ASC, NULLS FIRST DESC and NULLS LAST DESC; the values were lost.]

32 5.1 Ranking in SQL
Top k ranking
- Enclose the RANK function in a sub-query and then apply a filter condition outside the sub-query
    SELECT *
    FROM (SELECT country_id,
                 SUM(amount_sold) SALES,
                 RANK() OVER (ORDER BY SUM(amount_sold) DESC) AS COUNTRY_RANK
          FROM sales, products, customers, times, channels
          WHERE ...
          GROUP BY country_id)
    WHERE COUNTRY_RANK <= 5;
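The "rank inside, filter outside" idea can be sketched in plain Python as follows. The function name top_k_countries and the sales tuples are hypothetical; the inner part aggregates and ranks, the final list comprehension plays the role of the outer WHERE clause.

```python
from collections import defaultdict

def top_k_countries(sales, k):
    """sales: iterable of (country_id, amount_sold). Returns rows with rank <= k."""
    totals = defaultdict(int)
    for country, amount in sales:              # GROUP BY country_id, SUM(amount_sold)
        totals[country] += amount
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    ranked = []
    for pos, (country, total) in enumerate(ordered, start=1):
        # RANK(): tied totals share the rank of the first row of the tie group
        rank = ranked[-1][2] if ranked and ranked[-1][1] == total else pos
        ranked.append((country, total, rank))
    return [row for row in ranked if row[2] <= k]   # outer WHERE COUNTRY_RANK <= k

sales = [("DE", 50), ("US", 70), ("DE", 20), ("FR", 30), ("IT", 10)]
top2 = top_k_countries(sales, 2)
print(top2)
```

Because the filter is on the rank and not on the row count, ties at the cut-off can return more than k rows.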

33 5.1 NTILE
NTILE
- Not part of the SQL-99 standard, but adopted by major vendors
- Splits a set into groups of (nearly) equal size
- It divides an ordered partition into buckets and assigns a bucket number to each row in the partition
- Buckets are calculated so that the row counts of any two buckets differ by at most 1

34 5.1 NTILE
    SELECT ... NTILE(3) OVER (ORDER BY sales) NT_3 FROM ...
    CHANNEL        CALENDAR   SALES    NT_3
    Direct sales   ...        ...,000  1
    Direct sales   ...        ...,000  1
    Internet       ...        ...,000  2
    Internet       ...        ...,000  2
    Partners       ...        ...,000  3
- NTILE(4): quartiles
- NTILE(100): percentiles
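The bucket assignment can be sketched in a few lines of Python. The sketch assumes the common convention that the leftover rows go to the first buckets, which is consistent with the 1, 1, 2, 2, 3 pattern of the example; the function name ntile and the row labels are chosen for illustration.

```python
def ntile(n_buckets, ordered_rows):
    """Assign bucket numbers 1..n_buckets to an already ordered list of rows."""
    base, extra = divmod(len(ordered_rows), n_buckets)
    out = []
    start = 0
    for bucket in range(1, n_buckets + 1):
        # The first `extra` buckets receive one additional row
        size = base + (1 if bucket <= extra else 0)
        out.extend((row, bucket) for row in ordered_rows[start:start + size])
        start += size
    return out

buckets = ntile(3, ["r1", "r2", "r3", "r4", "r5"])
print(buckets)
```

Five rows split into three buckets give sizes 2, 2 and 1, so no bucket holds more than one row above any other.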

35 5.1 MDX
MDX (MultiDimensional eXpressions)
- Developed by Microsoft
- Not really brilliant, but adopted by major OLAP providers due to Microsoft's market-leader position
- Used in OLE DB for OLAP (ODBO), with API support
- Used in XML for Analysis (XMLA), a specification of web services for OLAP

36 5.1 MDX
Similar to SQL syntax:
    SELECT {Deutschland, Niedersachsen, Bayern, Frankfurt} ON COLUMNS,
           {Qtr1.CHILDREN, Qtr2, Qtr3} ON ROWS
    FROM SalesCube
    WHERE (Measures.Sales, Time.[2008], Products.[All Products]);
- SELECT: axis dimensions, on columns and rows
- FROM: data source, i.e. the cube specification; if cubes are joined, they must share dimensions
- WHERE: slicer, restricts the data area

37 5.1 MDX
Lists
- Enumeration of elementary nodes from different classification levels, e.g. {Deutschland, Niedersachsen, [Frankfurt am Main], USA}
Generated elements
- Methods which lead to new sets of the classification levels
- Deutschland.CHILDREN generates {Niedersachsen, Bayern, ...}
- Niedersachsen.PARENT generates Deutschland
- Time.Quarter.MEMBERS generates all the elements of the classification level
Functional generation of sets
- DESCENDANTS(USA, Cities): the descendants of USA at the given classification level (Cities)
- GENERATE({USA, France}, DESCENDANTS(Geography.CURRENT, Cities)): enumerates all the cities in USA and France

38 5.1 MDX
Set nesting combines individual coordinates to reduce dimensionality:
    SELECT CROSSJOIN({Deutschland, Sachsen, Hannover, BS},
                     {Ikeea, [H&M-Möbel]}) ON COLUMNS,
           {Qtr1.CHILDREN, Qtr2} ON ROWS
    FROM salescube
    WHERE (Measure.Sales, Time.[2008], Products.[All Products]);
[Result layout: the columns are the locations Deutschland, Sachsen, Hannover and BS, each split into Ikeea and H&M-Möbel; the rows are Jan 08, Feb 08, Mar 08 and Qtr2.]

39 5.1 MDX
Relative selection uses the order in the dimensional structures
- Time.[2008].LastChild: last quarter of 2008
- [2008].NextMember: {[2009]}
- [2008].[Qtr4].Nov.Lead(2): Jan 2009
- [2006]:[2009] represents [2006], ..., [2009]
Methods for hierarchy information extraction
- Deutschland.LEVEL: country
- Time.LEVELS(1): Year
Brackets
- {}: sets, e.g. {Hannover, BS, John}
- []: text interpretation of numbers, of names with spaces between words, or of other symbols, e.g. [2008], [Frankfurt am Main], [H&M]
- (): tuples, e.g. WHERE (Measure.Sales, Time.[2008], Products.[All Products])

40 5.1 MDX
Special functions and filters
Special functions: TOPCOUNT(), TOPPERCENT(), TOPSUM(), e.g.
    SELECT {Time.CHILDREN} ON COLUMNS,
           {TOPCOUNT(Deutschland.CHILDREN, 5, Sales.turnover)} ON ROWS
    FROM salescube
    WHERE (Measure.Sales, Time.[2008]);
Filter function, e.g.
    SELECT FILTER(Deutschland.CHILDREN,
                  ([2008], Turnover) > ([2007], Turnover)) ON COLUMNS,
           Quarters.MEMBERS ON ROWS
    FROM salescube
    WHERE (Measure.Sales, Time.[2008], Products.Electronics);

41 5.1 MDX
Time series
Set value expressions: choosing time intervals
- PERIODSTODATE(Quarter, [15-Nov-2008]): returns the days from the start of that quarter up to and including [15-Nov-2008]
- LASTPERIODS(3, [Sept-2008]): returns the three periods ending with the given member, i.e. [July-2008], [Aug-2008], [Sept-2008]
Member value expressions: pre-periods
- PARALLELPERIOD(Year, 3, [Sep-2008]): returns [Sep-2005]
Numerical functions
- COVARIANCE, CORRELATION
- LINEAR REGRESSION

42 5.1 mdXML
XMLA (XML for Analysis)
- Most recent attempt at a standardized API for OLAP
- Allows client applications to talk to multidimensional data sources
- In XMLA, mdXML wraps MDX statements in XML
- Underlying technologies: XML, SOAP, HTTP
Service primitives
- DISCOVER: retrieves information about available data sources, data schemas and the server
- EXECUTE: transmits a query and returns the corresponding result

43 5.2 Data Modeling
Now we know
- What OLAP looks like from the outside
- What it looks like from the inside: SQL-99, MDX
- How data is modeled at the conceptual level (mUML, ME/R) and at the logical level (cubes, dimensions)
[Figure labels: Store dimension, Product dimension.]

44 5.2 Data Modeling
But how do we implement it
- On a logical level
- On a physical level
[Figure: database design process. Requirement analysis produces the data requirements; conceptual design produces the conceptual schema; logical design produces the logical schema; physical design follows. The phases are marked as DBMS independent or DBMS dependent. A second track runs from functional analysis through application program design and transaction implementation to the application.]

45 5.2 Implementation
The multidimensional data model can be implemented in two ways
- Relational: snowflake schema, star schema
- Multidimensional: array technique

46 5.2 Implementation
Relational implementation, main goals:
- Lose as little semantic knowledge as possible, e.g., classification hierarchies
- The translation of multidimensional queries must be efficient
- The RDBMS should be able to run the translated queries efficiently
- Maintenance of the existing tables should be easy and fast, e.g., when loading new data

47 5.2 Implementation
Going from multidimensional to relational
- We need representations for cubes, dimensions, classification hierarchies and attributes
- Implementing cubes without the classification hierarchies is easy
- A table can be seen as a cube
- A column of the table can be considered a dimension mapping
- A tuple in the table represents a cell in the cube
- If we interpret only part of the columns as dimensions, we can use the rest as measures
- The resulting table is called a fact table
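The table-as-cube reading can be illustrated with a dictionary keyed by the dimension columns. Everything below is a hypothetical sketch: the rows, the dates and the sales figures are invented, and slice_cube is a helper name chosen here.

```python
# Hypothetical fact table: the first three columns are dimensions
# (article, store, day), the last column is the measure (sales)
fact_table = [
    ("Laptops", "Hannover, Saturn", "2009-01-13", 6),
    ("Mobile Phones", "Hannover, Saturn", "2009-01-13", 24),
    ("Laptops", "Braunschweig, Saturn", "2009-01-13", 3),
]
N_DIMENSIONS = 3

# Each tuple becomes one cell of the cube: coordinates -> measures
cube = {row[:N_DIMENSIONS]: row[N_DIMENSIONS:] for row in fact_table}

def slice_cube(cube, dim_index, member):
    """Keep only the cells whose coordinate in one dimension equals `member`."""
    return {coords: m for coords, m in cube.items() if coords[dim_index] == member}

laptop_sales = sum(m[0] for m in slice_cube(cube, 0, "Laptops").values())
print(laptop_sales)
```

Moving a column from the key to the value, or back, is all it takes to reinterpret it as a measure or as a dimension.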

48 5.2 Implementation
[Figure: a cube with the dimensions Product (Laptops, Mobile p.), Time and Geography, with one cell value (818) highlighted, next to the corresponding fact table with the columns Article, Store, Day and Sales. Example rows: Laptops / Hannover, Saturn; Mobile Phones / Hannover, Saturn; Laptops / Braunschweig, Saturn. The Day and Sales values were lost.]

49 5.2 Snowflake Schema
Snowflake schema
- Simple idea: use a table for each classification level
- This table includes the ID of the classification level and other attributes
- Two neighboring classification levels are connected by 1:n relationships, e.g., from n Days to 1 Month
- The measures of a cube are maintained in a fact table
- Besides the measures, the fact table holds the foreign key IDs of the finest classification levels

50 5.2 Snowflake Schema
Why "snowflake"?
- The facts/measures are in the center
- The dimensions spread out in each direction and branch out with their granularity

51 5.2 Snowflake Example
Fact table
- Sales: Product_ID, Day_ID, Store_ID, Sales, Revenue
Dimension tables (each table is in an n:1 relationship with the next coarser level)
- Product: Product_ID, Description, Brand, Product_group_ID
- Product group: Product_group_ID, Description, Product_categ_ID
- Product category: Product_category_ID, Description
- Day (time): Day_ID, Description, Month_ID, Week_ID
- Week: Week_ID, Description, Year_ID
- Month: Month_ID, Description, Quarter_ID
- Quarter: Quarter_ID, Description, Year_ID
- Year: Year_ID, Description
- Store (location): Store_ID, Description, State_ID
- State: State_ID, Description, Region_ID
- Region: Region_ID, Description, Country_ID
- Country: Country_ID, Description

52 5.2 Snowflake Schema
Snowflake schema: advantages
- The size of the dimension tables is reduced and queries run faster in the following cases:
- If a dimension is very sparse (most measures corresponding to the dimension have no data)
- And/or if a dimension has a long list of attributes which may be queried

53 5.2 Snowflake Schema
Snowflake schema: disadvantages
- Fact tables are responsible for 90% of the storage requirements, so normalizing the dimensions usually leads to insignificant savings
- Normalization of the dimension tables can reduce the performance of the DW because it leads to a large number of tables
- E.g., when connecting dimensions at coarse granularity, these tables have to be joined with each other during queries
- A query which connects Product category with Year and Country clearly does not perform well (10 tables need to be connected)

54 5.2 Snowflake Example
[Figure: the snowflake schema of slide 51, repeated to illustrate the tables involved in such a query.]

55 5.2 Star Schema
Star schema
- Basic idea: use a denormalized schema for all the dimensions
- A star schema can be obtained from the snowflake schema by denormalizing the tables that belong to a dimension
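Denormalizing a dimension amounts to following the foreign keys of the snowflake tables once and storing the result in a single table. The sketch below does this for a product dimension; the rows are illustrative values, and the dictionary layout and the function name denormalize_product are assumptions made for the example.

```python
# Snowflake tables of a product dimension (illustrative rows)
product = [  # (Product_ID, Description, Brand, Prod_group_ID)
    (10, "E71", "Nokia", 4),
    (11, "PS-42A", "Samsung", 2),
]
product_group = {  # Prod_group_ID -> (Description, Prod_categ_ID)
    2: ("TV", 11),
    4: ("Mobile Phones", 11),
}
product_category = {11: "Electronics"}  # Prod_categ_ID -> Description

def denormalize_product(products, groups, categories):
    """Join product -> group -> category into one star-schema dimension table."""
    star = []
    for product_id, description, _brand, group_id in products:
        group_name, category_id = groups[group_id]
        star.append((product_id, description, group_name, categories[category_id]))
    return star

star_product = denormalize_product(product, product_group, product_category)
for row in star_product:
    print(row)
```

The category name now appears once per product row instead of once overall, which is exactly the redundancy a star schema accepts in exchange for fewer joins at query time.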

56 5.2 Star Schema - Example
Fact table
- Sales: Product_ID, Time_ID, Geo_ID, Sales, Revenue
Dimension tables (each in a 1:n relationship with Sales)
- Product: Product_ID, Product group, Product category, Description
- Time: Time_ID, Day, Week, Month, Quarter, Year
- Geography: Geo_ID, Store, State, Region, Country

57 5.2 Star Schema
Advantages
- Improves query performance for often-used data
- Fewer tables and a simple structure
- Efficient query processing with regard to dimensions
Disadvantages
- In some cases, high overhead of redundant data

58 5.2 Snowflake vs. Star
Snowflake
- The structure of the classifications is expressed in the table schemas
- The fact table and the dimension tables are normalized
Star
- The entire classification is expressed in just one table
- The fact table is normalized, while the normalization of the dimension tables is broken
- This leads to redundant information in the dimension tables

59 5.2 Examples
Snowflake (some values were lost and are shown as ...)
    Product_ID  Description  Brand    Prod_group_ID
    10          E71          Nokia    4
    11          PS-42A       Samsung  2
    12          ...          Nokia    4
    13          Bold         Berry    4

    Prod_group_ID  Description   Prod_categ_ID
    2              TV            11
    4              Mobile Pho..  11

    Prod_categ_ID  Description
    11             Electronics
Star
    Product_ID  Description  Prod. group  Prod. categ
    10          E71          Mobile Ph..  Electronics
    11          PS-42A       TV           Electronics
    12          ...          Mobile Ph..  Electronics
    13          Bold         Mobile Ph..  Electronics

60 5.2 Snowflake to Star
When should we go from snowflake to star? A heuristics-based decision:
- When typical queries relate to coarser granularity (like product category)
- When the volume of data in the dimension tables is relatively low compared to the fact table; in this case a star schema adds negligible overhead through redundancy, but improves performance
- When modifications of the classifications are rare compared to insertions of fact data; in this case the modifications are controlled through the data load process of the ETL, which reduces the risk of data anomalies

61 5.2 Do we have a winner?
Snowflake or star?
- It depends on what is needed: fast query processing or efficient space usage
- However, most of the time a mixed form is used
- The starflake schema: some dimensions stay normalized as in the snowflake schema, while others are denormalized as in the star schema

62 5.2 Our forces combined
The starflake schema: the decision on how to deal with each dimension is influenced by
- Frequency of modifications: if the dimensions change often, normalization leads to better results
- Number of dimension elements: the bigger the dimension tables, the more space normalization saves
- Number of classification levels in a dimension: more classification levels introduce more redundancy in the star schema
- Materialization of aggregates for the dimension levels: if the aggregates are materialized, normalizing the dimension can bring better response times

63 5.2 More Schemas
Galaxies
- In practice we usually have several measures described by different dimensions
- Thus, there are several fact tables
[Figure: two fact tables sharing dimension tables. Fact tables: Sales (Product_ID, Store_ID, Sales, Revenue) and Receipts (Product_ID, Date_ID). Dimension tables: Store (Store_ID), Product (Product_ID), Date (Date_ID), Vendor (Vendor_ID).]

64 5.2 More Schemas
Other schemas
- Fact constellations: pre-calculated aggregates
- Factless fact tables: fact tables that have no non-key data; they can be used for event tracking or to inventory the set of possible occurrences

65 5.2 Multidimensional?
Multidimensional implementation
- Multidimensional data can be implemented relationally with a finite set of transformation steps, however:
- Multidimensional queries first have to be translated to the relational representation
- Direct interaction with the relational data model is not suitable for the end user

66 5.2 Multidimensional?
Data structures
- The basic data structure for multidimensional data storage is the array
- The elementary data structures are the cubes and the dimensions: $C = ((D_1, \ldots, D_n), (M_1, \ldots, M_m))$
- The storage is intuitive: arrays of arrays, physically linearized
- Linearization and related issues will be discussed in the lecture on optimization
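As a preview of what "physically linearized" means, the sketch below maps the coordinates of a cell to a single offset in a flat array. It assumes a row-major layout, which is only one possible linearization and not necessarily the one used in the optimization lecture; the function name linearize is chosen here for illustration.

```python
def linearize(coords, sizes):
    """Row-major offset of a cell, given one index and one size per dimension."""
    if len(coords) != len(sizes):
        raise ValueError("one coordinate per dimension is required")
    offset = 0
    for index, size in zip(coords, sizes):
        if not 0 <= index < size:
            raise IndexError("coordinate out of range")
        offset = offset * size + index
    return offset

# A cube with |D_1| = 3, |D_2| = 4, |D_3| = 2 occupies 3 * 4 * 2 = 24 cells
sizes = (3, 4, 2)
print(linearize((0, 0, 1), sizes))
print(linearize((2, 1, 0), sizes))
print(linearize((2, 3, 1), sizes))
```

With this layout, neighboring cells along the last dimension are adjacent in storage, while neighbors along the first dimension are far apart, which is why the choice of dimension order matters for query performance.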

67 5.2 Physical Model?
Defining the physical structures
- Setting up the database environment
- Setting up the appropriate security
- Preliminary performance tuning strategies: indexing, partitioning, materialization
- Goal: define the actual storage architecture, i.e. decide how the data is to be accessed and how it is arranged
- Again, this will be discussed in more detail in the lecture on optimization

68 Next lecture
Optimization
- Indexes


Outline. Data Warehousing. What is a Warehouse? What is a Warehouse? Outline Data Warehousing What is a data warehouse? Why a warehouse? Models & operations Implementing a warehouse 2 What is a Warehouse? Collection of diverse data subject oriented aimed at executive, decision

More information

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina

Data Warehousing. Read chapter 13 of Riguzzi et al Sistemi Informativi. Slides derived from those by Hector Garcia-Molina Data Warehousing Read chapter 13 of Riguzzi et al Sistemi Informativi Slides derived from those by Hector Garcia-Molina What is a Warehouse? Collection of diverse data subject oriented aimed at executive,

More information

Database Applications. Advanced Querying. Transaction Processing. Transaction Processing. Data Warehouse. Decision Support. Transaction processing

Database Applications. Advanced Querying. Transaction Processing. Transaction Processing. Data Warehouse. Decision Support. Transaction processing Database Applications Advanced Querying Transaction processing Online setting Supports day-to-day operation of business OLAP Data Warehousing Decision support Offline setting Strategic planning (statistics)

More information

Data Warehouse Snowflake Design and Performance Considerations in Business Analytics

Data Warehouse Snowflake Design and Performance Considerations in Business Analytics Journal of Advances in Information Technology Vol. 6, No. 4, November 2015 Data Warehouse Snowflake Design and Performance Considerations in Business Analytics Jiangping Wang and Janet L. Kourik Walker

More information

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in

Introduction to Data Warehousing. Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Introduction to Data Warehousing Ms Swapnil Shrivastava swapnil@konark.ncst.ernet.in Necessity is the mother of invention Why Data Warehouse? Scenario 1 ABC Pvt Ltd is a company with branches at Mumbai,

More information

Building Data Cubes and Mining Them. Jelena Jovanovic Email: jeljov@fon.bg.ac.yu

Building Data Cubes and Mining Them. Jelena Jovanovic Email: jeljov@fon.bg.ac.yu Building Data Cubes and Mining Them Jelena Jovanovic Email: jeljov@fon.bg.ac.yu KDD Process KDD is an overall process of discovering useful knowledge from data. Data mining is a particular step in the

More information

Database Design Patterns. Winter 2006-2007 Lecture 24

Database Design Patterns. Winter 2006-2007 Lecture 24 Database Design Patterns Winter 2006-2007 Lecture 24 Trees and Hierarchies Many schemas need to represent trees or hierarchies of some sort Common way of representing trees: An adjacency list model Each

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing CSE 544 Principles of Database Management Systems Magdalena Balazinska Fall 2007 Lecture 16 - Data Warehousing Class Projects Class projects are going very well! Project presentations: 15 minutes On Wednesday

More information

Overview of Data Warehousing and OLAP

Overview of Data Warehousing and OLAP Overview of Data Warehousing and OLAP Chapter 28 March 24, 2008 ADBS: DW 1 Chapter Outline What is a data warehouse (DW) Conceptual structure of DW Why separate DW Data modeling for DW Online Analytical

More information

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1

Copyright 2007 Ramez Elmasri and Shamkant B. Navathe. Slide 29-1 Slide 29-1 Chapter 29 Overview of Data Warehousing and OLAP Chapter 29 Outline Purpose of Data Warehousing Introduction, Definitions, and Terminology Comparison with Traditional Databases Characteristics

More information

Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1

Decision Support. Chapter 23. Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Decision Support Chapter 23 Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful

More information

When to consider OLAP?

When to consider OLAP? When to consider OLAP? Author: Prakash Kewalramani Organization: Evaltech, Inc. Evaltech Research Group, Data Warehousing Practice. Date: 03/10/08 Email: erg@evaltech.com Abstract: Do you need an OLAP

More information

A Technical Review on On-Line Analytical Processing (OLAP)

A Technical Review on On-Line Analytical Processing (OLAP) A Technical Review on On-Line Analytical Processing (OLAP) K. Jayapriya 1., E. Girija 2,III-M.C.A., R.Uma. 3,M.C.A.,M.Phil., Department of computer applications, Assit.Prof,Dept of M.C.A, Dhanalakshmi

More information

Mario Guarracino. Data warehousing

Mario Guarracino. Data warehousing Data warehousing Introduction Since the mid-nineties, it became clear that the databases for analysis and business intelligence need to be separate from operational. In this lecture we will review the

More information

Querying Microsoft SQL Server

Querying Microsoft SQL Server Course 20461C: Querying Microsoft SQL Server Module 1: Introduction to Microsoft SQL Server 2014 This module introduces the SQL Server platform and major tools. It discusses editions, versions, tools used

More information

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex,

Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Turning your Warehouse Data into Business Intelligence: Reporting Trends and Visibility Michael Armanious; Vice President Sales and Marketing Datex, Inc. Overview Introduction What is Business Intelligence?

More information

OLAP Systems and Multidimensional Queries II

OLAP Systems and Multidimensional Queries II OLAP Systems and Multidimensional Queries II Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master

More information

Multi-dimensional index structures Part I: motivation

Multi-dimensional index structures Part I: motivation Multi-dimensional index structures Part I: motivation 144 Motivation: Data Warehouse A definition A data warehouse is a repository of integrated enterprise data. A data warehouse is used specifically for

More information

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations

Unit -3. Learning Objective. Demand for Online analytical processing Major features and functions OLAP models and implementation considerations Unit -3 Learning Objective Demand for Online analytical processing Major features and functions OLAP models and implementation considerations Demand of On Line Analytical Processing Need for multidimensional

More information

University of Gaziantep, Department of Business Administration

University of Gaziantep, Department of Business Administration University of Gaziantep, Department of Business Administration The extensive use of information technology enables organizations to collect huge amounts of data about almost every aspect of their businesses.

More information

Introducing Microsoft SQL Server 2012 Getting Started with SQL Server Management Studio

Introducing Microsoft SQL Server 2012 Getting Started with SQL Server Management Studio Querying Microsoft SQL Server 2012 Microsoft Course 10774 This 5-day instructor led course provides students with the technical skills required to write basic Transact-SQL queries for Microsoft SQL Server

More information

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT

BUILDING BLOCKS OF DATAWAREHOUSE. G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT BUILDING BLOCKS OF DATAWAREHOUSE G.Lakshmi Priya & Razia Sultana.A Assistant Professor/IT 1 Data Warehouse Subject Oriented Organized around major subjects, such as customer, product, sales. Focusing on

More information

DATA CUBES E0 261. Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES

DATA CUBES E0 261. Jayant Haritsa Computer Science and Automation Indian Institute of Science. JAN 2014 Slide 1 DATA CUBES E0 261 Jayant Haritsa Computer Science and Automation Indian Institute of Science JAN 2014 Slide 1 Introduction Increasingly, organizations are analyzing historical data to identify useful patterns and

More information

Data Warehousing and OLAP

Data Warehousing and OLAP 1 Data Warehousing and OLAP Hector Garcia-Molina Stanford University Warehousing Growing industry: $8 billion in 1998 Range from desktop to huge: Walmart: 900-CPU, 2,700 disk, 23TB Teradata system Lots

More information

Lecture Data Warehouse Systems

Lecture Data Warehouse Systems Lecture Data Warehouse Systems Eva Zangerle SS 2013 PART A: Architecture Chapter 1: Motivation and Definitions Motivation Goal: to build an operational general view on a company to support decisions in

More information

Designing a Dimensional Model

Designing a Dimensional Model Designing a Dimensional Model Erik Veerman Atlanta MDF member SQL Server MVP, Microsoft MCT Mentor, Solid Quality Learning Definitions Data Warehousing A subject-oriented, integrated, time-variant, and

More information

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH OLAP & DATA MINING CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 Online Analytic Processing OLAP 2 OLAP OLAP: Online Analytic Processing OLAP queries are complex queries that Touch large amounts of data Discover

More information

Course ID#: 1401-801-14-W 35 Hrs. Course Content

Course ID#: 1401-801-14-W 35 Hrs. Course Content Course Content Course Description: This 5-day instructor led course provides students with the technical skills required to write basic Transact- SQL queries for Microsoft SQL Server 2014. This course

More information

Introduction. Introduction to Data Warehousing

Introduction. Introduction to Data Warehousing Introduction to Data Warehousing Pasquale LOPS Gestione della Conoscenza d Impresa A.A. 2003-2004 Introduction Data warehousing and decision support have given rise to a new class of databases. Design

More information

Querying Microsoft SQL Server 20461C; 5 days

Querying Microsoft SQL Server 20461C; 5 days Lincoln Land Community College Capital City Training Center 130 West Mason Springfield, IL 62702 217-782-7436 www.llcc.edu/cctc Querying Microsoft SQL Server 20461C; 5 days Course Description This 5-day

More information

14. Data Warehousing & Data Mining

14. Data Warehousing & Data Mining 14. Data Warehousing & Data Mining Data Warehousing Concepts Decision support is key for companies wanting to turn their organizational data into an information asset Data Warehouse "A subject-oriented,

More information

Monitoring Genebanks using Datamarts based in an Open Source Tool

Monitoring Genebanks using Datamarts based in an Open Source Tool Monitoring Genebanks using Datamarts based in an Open Source Tool April 10 th, 2008 Edwin Rojas Research Informatics Unit (RIU) International Potato Center (CIP) GPG2 Workshop 2008 Datamarts Motivation

More information

ETL TESTING TRAINING

ETL TESTING TRAINING ETL TESTING TRAINING DURATION 35hrs AVAILABLE BATCHES WEEKDAYS (6.30AM TO 7.30AM) & WEEKENDS (6.30pm TO 8pm) MODE OF TRAINING AVAILABLE ONLINE INSTRUCTOR LED CLASSROOM TRAINING (MARATHAHALLI, BANGALORE)

More information

SQL SERVER TRAINING CURRICULUM

SQL SERVER TRAINING CURRICULUM SQL SERVER TRAINING CURRICULUM Complete SQL Server 2000/2005 for Developers Management and Administration Overview Creating databases and transaction logs Managing the file system Server and database configuration

More information

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes

CSE 544 Principles of Database Management Systems. Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes CSE 544 Principles of Database Management Systems Magdalena Balazinska Winter 2009 Lecture 15 - Data Warehousing: Cubes Final Exam Overview Open books and open notes No laptops and no other mobile devices

More information

CS2032 Data warehousing and Data Mining Unit II Page 1

CS2032 Data warehousing and Data Mining Unit II Page 1 UNIT II BUSINESS ANALYSIS Reporting Query tools and Applications The data warehouse is accessed using an end-user query and reporting tool from Business Objects. Business Objects provides several tools

More information

Querying Microsoft SQL Server 2012

Querying Microsoft SQL Server 2012 Course 10774A: Querying Microsoft SQL Server 2012 Length: 5 Days Language(s): English Audience(s): IT Professionals Level: 200 Technology: Microsoft SQL Server 2012 Type: Course Delivery Method: Instructor-led

More information

Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g.

Overview. Data Warehousing and Decision Support. Introduction. Three Complementary Trends. Data Warehousing. An Example: The Store (e.g. Overview Data Warehousing and Decision Support Chapter 25 Why data warehousing and decision support Data warehousing and the so called star schema MOLAP versus ROLAP OLAP, ROLLUP AND CUBE queries Design

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Course 10774A: Querying Microsoft SQL Server 2012

Course 10774A: Querying Microsoft SQL Server 2012 Course 10774A: Querying Microsoft SQL Server 2012 About this Course This 5-day instructor led course provides students with the technical skills required to write basic Transact-SQL queries for Microsoft

More information

Course 20461C: Querying Microsoft SQL Server Duration: 35 hours

Course 20461C: Querying Microsoft SQL Server Duration: 35 hours Course 20461C: Querying Microsoft SQL Server Duration: 35 hours About this Course This course is the foundation for all SQL Server-related disciplines; namely, Database Administration, Database Development

More information

TIM 50 - Business Information Systems

TIM 50 - Business Information Systems TIM 50 - Business Information Systems Lecture 15 UC Santa Cruz March 1, 2015 The Database Approach to Data Management Database: Collection of related files containing records on people, places, or things.

More information

Course 10774A: Querying Microsoft SQL Server 2012 Length: 5 Days Published: May 25, 2012 Language(s): English Audience(s): IT Professionals

Course 10774A: Querying Microsoft SQL Server 2012 Length: 5 Days Published: May 25, 2012 Language(s): English Audience(s): IT Professionals Course 10774A: Querying Microsoft SQL Server 2012 Length: 5 Days Published: May 25, 2012 Language(s): English Audience(s): IT Professionals Overview About this Course Level: 200 Technology: Microsoft SQL

More information

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server

70-467: Designing Business Intelligence Solutions with Microsoft SQL Server 70-467: Designing Business Intelligence Solutions with Microsoft SQL Server The following tables show where changes to exam 70-467 have been made to include updates that relate to SQL Server 2014 tasks.

More information

Module 1: Introduction to Data Warehousing and OLAP

Module 1: Introduction to Data Warehousing and OLAP Raw Data vs. Business Information Module 1: Introduction to Data Warehousing and OLAP Capturing Raw Data Gathering data recorded in everyday operations Deriving Business Information Deriving meaningful

More information

SQL Server 2012 Business Intelligence Boot Camp

SQL Server 2012 Business Intelligence Boot Camp SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations

More information

UNIT-3 OLAP in Data Warehouse

UNIT-3 OLAP in Data Warehouse UNIT-3 OLAP in Data Warehouse Bharati Vidyapeeth s Institute of Computer Applications and Management, New Delhi-63, by Dr.Deepali Kamthania U2.1 OLAP Demand for Online analytical processing Major features

More information

AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014

AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014 AV-005: Administering and Implementing a Data Warehouse with SQL Server 2014 Career Details Duration 105 hours Prerequisites This career requires that you meet the following prerequisites: Working knowledge

More information

DBMS / Business Intelligence, SQL Server

DBMS / Business Intelligence, SQL Server DBMS / Business Intelligence, SQL Server Orsys, with 30 years of experience, is providing high quality, independant State of the Art seminars and hands-on courses corresponding to the needs of IT professionals.

More information

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course

M2074 - Designing and Implementing OLAP Solutions Using Microsoft SQL Server 2000 5 Day Course Module 1: Introduction to Data Warehousing and OLAP Introducing Data Warehousing Defining OLAP Solutions Understanding Data Warehouse Design Understanding OLAP Models Applying OLAP Cubes At the end of

More information

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC

The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC Paper 139 The Art of Designing HOLAP Databases Mark Moorman, SAS Institute Inc., Cary NC ABSTRACT While OLAP applications offer users fast access to information across business dimensions, it can also

More information

Optimizing Your Data Warehouse Design for Superior Performance

Optimizing Your Data Warehouse Design for Superior Performance Optimizing Your Data Warehouse Design for Superior Performance Lester Knutsen, President and Principal Database Consultant Advanced DataTools Corporation Session 2100A The Problem The database is too complex

More information

MOC 20461 QUERYING MICROSOFT SQL SERVER

MOC 20461 QUERYING MICROSOFT SQL SERVER ONE STEP AHEAD. MOC 20461 QUERYING MICROSOFT SQL SERVER Length: 5 days Level: 300 Technology: Microsoft SQL Server Delivery Method: Instructor-led (classroom) COURSE OUTLINE Module 1: Introduction to Microsoft

More information

Data warehousing with PostgreSQL

Data warehousing with PostgreSQL Data warehousing with PostgreSQL Gabriele Bartolini http://www.2ndquadrant.it/ European PostgreSQL Day 2009 6 November, ParisTech Telecom, Paris, France Audience

More information

Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Length: Delivery Method: 3 Days Instructor-led (classroom) About this Course Elements of this syllabus are subject

More information

CHAPTER 5: BUSINESS ANALYTICS

CHAPTER 5: BUSINESS ANALYTICS Chapter 5: Business Analytics CHAPTER 5: BUSINESS ANALYTICS Objectives The objectives are: Describe Business Analytics. Explain the terminology associated with Business Analytics. Describe the data warehouse

More information

OLAP OLAP. Data Warehouse. OLAP Data Model: the Data Cube S e s s io n

OLAP OLAP. Data Warehouse. OLAP Data Model: the Data Cube S e s s io n OLAP OLAP On-Line Analytical Processing In contrast to on-line transaction processing (OLTP) Mostly ad hoc queries involving aggregation Response time rather than throughput is the main performance measure.

More information

SQL Server Administrator Introduction - 3 Days Objectives

SQL Server Administrator Introduction - 3 Days Objectives SQL Server Administrator Introduction - 3 Days INTRODUCTION TO MICROSOFT SQL SERVER Exploring the components of SQL Server Identifying SQL Server administration tasks INSTALLING SQL SERVER Identifying

More information

Cognos 8 Best Practices

Cognos 8 Best Practices Northwestern University Business Intelligence Solutions Cognos 8 Best Practices Volume 2 Dimensional vs Relational Reporting Reporting Styles Relational Reports are composed primarily of list reports,

More information

Querying Microsoft SQL Server Course M20461 5 Day(s) 30:00 Hours

Querying Microsoft SQL Server Course M20461 5 Day(s) 30:00 Hours Área de formação Plataforma e Tecnologias de Informação Querying Microsoft SQL Introduction This 5-day instructor led course provides students with the technical skills required to write basic Transact-SQL

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Part I: Data Warehousing Gao Cong gaocong@cs.aau.dk Slides adapted from Man Lung Yiu and Torben Bach Pedersen Course Structure Business intelligence: Extract knowledge

More information

Data Warehousing. Paper 133-25

Data Warehousing. Paper 133-25 Paper 133-25 The Power of Hybrid OLAP in a Multidimensional World Ann Weinberger, SAS Institute Inc., Cary, NC Matthias Ender, SAS Institute Inc., Cary, NC ABSTRACT Version 8 of the SAS System brings powerful

More information

Data Warehousing. Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.

Data Warehousing. Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs. Data Warehousing & Mining Techniques Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 2. Architecture 2. Architecture 2.1

More information

DB2 V8 Performance Opportunities

DB2 V8 Performance Opportunities DB2 V8 Performance Opportunities Data Warehouse Performance DB2 Version 8: More Opportunities! David Beulke Principal Consultant, Pragmatic Solutions, Inc. DBeulke@compserve.com 703 798-3283 Leverage your

More information

Part 22. Data Warehousing

Part 22. Data Warehousing Part 22 Data Warehousing The Decision Support System (DSS) Tools to assist decision-making Used at all levels in the organization Sometimes focused on a single area Sometimes focused on a single problem

More information

CS54100: Database Systems

CS54100: Database Systems CS54100: Database Systems Date Warehousing: Current, Future? 20 April 2012 Prof. Chris Clifton Data Warehousing: Goals OLAP vs OLTP On Line Analytical Processing (vs. Transaction) Optimize for read, not

More information

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example

Data warehousing in Oracle. SQL extensions for data warehouse analysis. Available OLAP functions. Physical aggregation example Data warehousing in Oracle Materialized views and SQL extensions to analyze data in Oracle data warehouses SQL extensions for data warehouse analysis Available OLAP functions Computation windows window

More information

LearnFromGuru Polish your knowledge

LearnFromGuru Polish your knowledge SQL SERVER 2008 R2 /2012 (TSQL/SSIS/ SSRS/ SSAS BI Developer TRAINING) Module: I T-SQL Programming and Database Design An Overview of SQL Server 2008 R2 / 2012 Available Features and Tools New Capabilities

More information

DATA WAREHOUSING AND OLAP TECHNOLOGY

DATA WAREHOUSING AND OLAP TECHNOLOGY DATA WAREHOUSING AND OLAP TECHNOLOGY Manya Sethi MCA Final Year Amity University, Uttar Pradesh Under Guidance of Ms. Shruti Nagpal Abstract DATA WAREHOUSING and Online Analytical Processing (OLAP) are

More information

Creating BI solutions with BISM Tabular. Written By: Dan Clark

Creating BI solutions with BISM Tabular. Written By: Dan Clark Creating BI solutions with BISM Tabular Written By: Dan Clark CONTENTS PAGE 3 INTRODUCTION PAGE 4 PAGE 5 PAGE 7 PAGE 8 PAGE 9 PAGE 9 PAGE 11 PAGE 12 PAGE 13 PAGE 14 PAGE 17 SSAS TABULAR MODE TABULAR MODELING

More information

Data W a Ware r house house and and OLAP II Week 6 1

Data W a Ware r house house and and OLAP II Week 6 1 Data Warehouse and OLAP II Week 6 1 Team Homework Assignment #8 Using a data warehousing tool and a data set, play four OLAP operations (Roll up (drill up), Drill down (roll down), Slice and dice, Pivot

More information

Data Mining and Data Warehousing Henryk Maciejewski Data Warehousing and OLAP

Data Mining and Data Warehousing Henryk Maciejewski Data Warehousing and OLAP Data Mining and Data Warehousing Henryk Maciejewski Data Warehousing and OLAP Part II Data Warehousing Contents OLAP Approach to Data Analysis Database for OLAP = Data Warehouse Logical model Physical

More information

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days or 2008 Five Days Prerequisites Students should have experience with any relational database management system as well as experience with data warehouses and star schemas. It would be helpful if students

More information

Module 1: Getting Started with Databases and Transact-SQL in SQL Server 2008

Module 1: Getting Started with Databases and Transact-SQL in SQL Server 2008 Course 2778A: Writing Queries Using Microsoft SQL Server 2008 Transact-SQL About this Course This 3-day instructor led course provides students with the technical skills required to write basic Transact-

More information

Querying Microsoft SQL Server 2012

Querying Microsoft SQL Server 2012 Querying Microsoft SQL Server 2012 Duration: 5 Days Course Code: M10774 Overview: Deze cursus wordt vanaf 1 juli vervangen door cursus M20461 Querying Microsoft SQL Server. This course will be replaced

More information

Review. Data Warehousing. Today. Star schema. Star join indexes. Dimension hierarchies

Review. Data Warehousing. Today. Star schema. Star join indexes. Dimension hierarchies Review Data Warehousing CPS 216 Advanced Database Systems Data warehousing: integrating data for OLAP OLAP versus OLTP Warehousing versus mediation Warehouse maintenance Warehouse data as materialized

More information

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc. Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

More information

Business Intelligence, Data warehousing Concept and artifacts

Business Intelligence, Data warehousing Concept and artifacts Business Intelligence, Data warehousing Concept and artifacts Data Warehousing is the process of constructing and using the data warehouse. The data warehouse is constructed by integrating the data from

More information

Creating Hybrid Relational-Multidimensional Data Models using OBIEE and Essbase by Mark Rittman and Venkatakrishnan J

Creating Hybrid Relational-Multidimensional Data Models using OBIEE and Essbase by Mark Rittman and Venkatakrishnan J Creating Hybrid Relational-Multidimensional Data Models using OBIEE and Essbase by Mark Rittman and Venkatakrishnan J ODTUG Kaleidoscope Conference June 2009, Monterey, USA Oracle Business Intelligence

More information

CHAPTER 4: BUSINESS ANALYTICS

CHAPTER 4: BUSINESS ANALYTICS Chapter 4: Business Analytics CHAPTER 4: BUSINESS ANALYTICS Objectives Introduction The objectives are: Describe Business Analytics Explain the terminology associated with Business Analytics Describe the

More information

OLAP Is Different From What You Think. Rittman Mead BI Forum Spring 2012

OLAP Is Different From What You Think. Rittman Mead BI Forum Spring 2012 OLAP Is Different From What You Think Rittman Mead BI Forum Spring 2012 Dan Vlamis Vlamis Software Solutions 816-781-2880 http://www.vlamis.com Dan Vlamis and Vlamis Software Solutions Vlamis Software

More information

New Approach of Computing Data Cubes in Data Warehousing

New Approach of Computing Data Cubes in Data Warehousing International Journal of Information & Computation Technology. ISSN 0974-2239 Volume 4, Number 14 (2014), pp. 1411-1417 International Research Publications House http://www. irphouse.com New Approach of

More information

Oracle OLAP What's All This About?

Oracle OLAP What's All This About? Oracle OLAP What's All This About? IOUG Live! 2006 Dan Vlamis dvlamis@vlamis.com Vlamis Software Solutions, Inc. 816-781-2880 http://www.vlamis.com Vlamis Software Solutions, Inc. Founded in 1992 in Kansas

More information

The Cubetree Storage Organization

The Cubetree Storage Organization The Cubetree Storage Organization Nick Roussopoulos & Yannis Kotidis Advanced Communication Technology, Inc. Silver Spring, MD 20905 Tel: 301-384-3759 Fax: 301-384-3679 {nick,kotidis}@act-us.com 1. Introduction

More information

2.1 Basics: Indexing. 2.1 Primary Index. 2.1 Secondary Index. 2.1 Secondary Index. 2.1 Indexes. 2.1 Indexes 14.04.2009.

2.1 Basics: Indexing. 2.1 Primary Index. 2.1 Secondary Index. 2.1 Secondary Index. 2.1 Indexes. 2.1 Indexes 14.04.2009. 2. Architecture Data Warehousing & Mining Techniques Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 2. Architecture 2.1

More information

Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734

Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734 Cleveland State University Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734 SS Chung 14 Build a Data Mining Model using Data

More information