Building a Hybrid Data Warehouse Model
by James Madison

As suggested by this reference implementation, in some cases blending the relational and dimensional models may be the right approach to data warehouse design.

Published April 2007

Download: Oracle Database Sample Code

Relational and dimensional modeling are often used separately, but they can be successfully incorporated into a single design when needed. Doing so starts with a normalized relational model and then adds dimensional constructs, primarily at the physical level. The result is a single model that provides the strengths of both parent models fairly well: it represents entities and relationships with the precision of the traditional relational model, and it processes dimensionally filtered, fact-aggregated queries with speed approaching that of the traditional dimensional model.

Real-world experience was the motivation for this analysis: on three separate data warehousing projects where I worked as programmer, architect, and manager, respectively, I found a consistent pattern of data and database behavior that lent itself far more to a hybrid combination of dimensional and relational modeling than to either one alone.

This article discusses the hybrid design and provides a fully functional reference implementation. The system runs on Oracle Database 10g. It contains all the code needed to build the database schemas, generate sample data, load it into the schemas, build the indexes and materialized views, run the sample queries, capture the runtimes, and produce statistics on the runtimes.

The hybrid model is not a one-size-fits-all solution. Many projects are best served by using only one of the traditional models or by using both models separately with a feed between them. But if the objective is to create a single database that can both store data in its properly normalized form and run aggregation queries with good performance, the hybrid model is a design pattern to consider.

Sample Business Domain

The sample business domain is in the insurance industry and uses the following entities:

Entity     Description
ACCOUNT    Information about a customer and its activities with the insurance company
POLICY     An insurance contract representing a specific agreement with the customer
VEHICLE    A vehicle belonging to the customer and covered by a policy
COVERAGE   The kinds of losses that are covered for a vehicle on this policy
PREMIUM    A monthly payment from the customer for coverage on vehicles in this policy

The sample business questions used to analyze the performance of the system have some parallel with reality but also cover extremes of behavior: scanning the fact table for many rows, retrieving a tiny percentage of fact rows, restricting to only the top table, restricting to every table, restricting to only the lower tables, and so on. They are the kinds of questions business users ask of dimensional models, not the kinds typically asked of relational models. Questions of a relational nature, such as "Show me all the vehicles on this policy," are not addressed, because it is assumed that the relational model will outperform the dimensional model for them.

The questions used in this analysis are the following:

Business Questions of a Dimensional Nature

ID#  Question
1    What was the total premium collected by year as far back as we can go?
2    What was the premium collected in the New England states in 2002?
3    How much premium did we get for medium catastrophe risks in Connecticut as far back as we can go?
4    How much premium did we get for time-managed plan types in California in 2001?
5    How many passenger cars had collision coverage in November 2003?
6    What was the premium for red vehicles in Vermont with primary usage that had a $1,000 deductible? Break the numbers down per person and by accident limits.
7    What was the premium for coverages with a $1,000 deductible, a $100,000 per-person limit, and an $800,000 accident limit in 2000?
8    What was the monthly premium in 1999 for red cars with 750cc engines?
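To make the entity hierarchy concrete, here is a minimal DDL sketch of the five entities in their normalized, relational form. The column names and nonkey attributes are illustrative assumptions, not the actual build scripts (the real DDL is in the implementation's build*.* files); the point is the parent-to-child chain, ending with COVERAGE as a weak entity whose key carries the VEHICLE identifier.

    -- Sketch only; column names are assumed for illustration.
    CREATE TABLE account (
      account_id   NUMBER PRIMARY KEY,
      region       VARCHAR2(30)
    );

    CREATE TABLE policy (
      policy_id    NUMBER PRIMARY KEY,
      account_id   NUMBER NOT NULL REFERENCES account,
      plan_type    VARCHAR2(30)
    );

    CREATE TABLE vehicle (
      vehicle_id   NUMBER PRIMARY KEY,
      policy_id    NUMBER NOT NULL REFERENCES policy,
      color        VARCHAR2(20)
    );

    -- COVERAGE is a weak entity: the vehicle identifier is part of its key.
    CREATE TABLE coverage (
      vehicle_id     NUMBER       NOT NULL REFERENCES vehicle,
      coverage_type  VARCHAR2(20) NOT NULL,
      deductible     NUMBER,
      CONSTRAINT coverage_pk PRIMARY KEY (vehicle_id, coverage_type)
    );

    CREATE TABLE premium (
      vehicle_id     NUMBER       NOT NULL,
      coverage_type  VARCHAR2(20) NOT NULL,
      premium_month  DATE         NOT NULL,
      amount         NUMBER(12,2),
      CONSTRAINT premium_pk PRIMARY KEY (vehicle_id, coverage_type, premium_month),
      CONSTRAINT premium_coverage_fk FOREIGN KEY (vehicle_id, coverage_type)
        REFERENCES coverage (vehicle_id, coverage_type)
    );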
Models

The three models are presented in Figures 1, 2, and 3. The hybrid model is based on the relational model, with two changes that derive from dimensional modeling practices: (1) create a relationship from the PREMIUM table to each table in the upper portion of the hierarchy, and (2) add the time dimension.

Figure 1. Relational model

Figure 2. Dimensional model

Figure 3. Hybrid model
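One plausible way to express those two changes physically, continuing the illustrative sketch above rather than quoting the implementation's build*.* scripts, is to add a time dimension table, carry its key plus the upper-hierarchy keys directly on PREMIUM, and bitmap-index those columns so that star transformations can use them. (The plans shown later in the article reference the implementation's own indexes, such as BX_PREMIUM_TIME and BX_PREMIUM_COVERAGE.)

    -- Sketch only; assumes the illustrative tables above.
    CREATE TABLE time_dim (
      time_key     NUMBER PRIMARY KEY,
      calendar_dt  DATE,
      month_nbr    NUMBER,
      year_nbr     NUMBER
    );

    -- (1) Direct relationships from PREMIUM to the tables above it in the hierarchy,
    -- and (2) the time dimension key on PREMIUM.
    ALTER TABLE premium ADD (
      account_id  NUMBER REFERENCES account,
      policy_id   NUMBER REFERENCES policy,
      time_key    NUMBER REFERENCES time_dim
    );

    -- Bitmap indexes on the new columns support star transformation.
    CREATE BITMAP INDEX bx_premium_time    ON premium (time_key);
    CREATE BITMAP INDEX bx_premium_account ON premium (account_id);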
Implementation

Largely standard techniques were used to convert the models into their physical implementations in database schemas. The relational schema was created with normalized modeling techniques, and the dimensional schema was built according to Ralph Kimball's work. Creating the hybrid meant copying the relational schema and then layering the dimensional constructs on top of it. (The "File Descriptions" sidebar lists the most important files in the implementation, including the files with the DDL, the system validation, the queries, and the automated analysis used to generate the sample code.)

Because only three nonkey attributes are used, a SIZING attribute of type CHAR(100) is added to each table to make the row size more realistic.

Certain database parameters must be set so that star joins will occur and materialized views will be used. The important parameters are shown here:

NAME                           VALUE
compatible
optimizer_features_enable
optimizer_mode                 first_rows
pga_aggregate_target
query_rewrite_enabled          true
query_rewrite_integrity        stale_tolerated
sga_target
star_transformation_enabled    true
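A hedged sketch of how such parameters can be set follows. The values for the memory targets are placeholders, and the use of ALTER SYSTEM rather than an init file is an assumption; the settings actually used by the reference implementation are in its oracle_config directory.

    -- Illustrative only; the implementation's oracle_config files hold the real settings.
    ALTER SYSTEM SET query_rewrite_enabled       = TRUE            SCOPE = BOTH;
    ALTER SYSTEM SET query_rewrite_integrity     = STALE_TOLERATED SCOPE = BOTH;
    ALTER SYSTEM SET star_transformation_enabled = TRUE            SCOPE = BOTH;
    ALTER SYSTEM SET optimizer_mode              = FIRST_ROWS      SCOPE = BOTH;
    -- Memory targets depend on the host; these numbers are placeholders.
    ALTER SYSTEM SET sga_target           = 512M SCOPE = SPFILE;
    ALTER SYSTEM SET pga_aggregate_target = 256M SCOPE = SPFILE;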
Verifying that a star join is occurring is done with EXPLAIN PLAN, as detailed in the Oracle documentation.

All three schemas were loaded with the same data. The best evidence of consistent data loading is that all three schemas produce the same answers for the sample queries. The volume of data used for the analysis is shown below.

OWNER  TABLE_NAME     NUM_ROWS  AVG_ROW_LEN  LAST_ANALYZED
DIM    ACCOUNT_DIM
DIM    COVERAGE_DIM
DIM    POLICY_DIM
DIM    PREMIUM_FACT
DIM    TIME_DIM
DIM    VEHICLE_DIM
HYB    ACCOUNT
HYB    COVERAGE
HYB    POLICY
HYB    PREMIUM
HYB    TIME_DIM
HYB    VEHICLE
REL    ACCOUNT
REL    COVERAGE
REL    POLICY
REL    PREMIUM
REL    VEHICLE

The goal was to provide a volume large enough to prevent the optimizer from taking shortcuts, such as reading entire tables instead of using indexes, that would undermine the analysis. According to Oracle Database Data Warehousing Guide 10g Release 2 (10.2), Schema Modeling Techniques, a star transformation might not occur if the optimizer finds "tables that are too small for the transformation to be worthwhile." A fairly arbitrary goal of the implementation was to have at least 1 million rows in the fact table. Given that all dimensional and hybrid query plans generated by QUERIES.SQL meet the criteria of star joins, the data volume used appears to be sufficient for the current analysis.

The number of rows in COVERAGE_DIM is smaller than in the COVERAGE tables of the other two schemas because of the way a weak entity has to be represented in the dimensional schema.
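The volume listing above has the shape of a data dictionary query run after statistics have been gathered. Here is a sketch of one way to produce it; the schema names come from the article, while the use of DBMS_STATS and DBA_TABLES is an assumption about how the implementation gathers and reports the numbers.

    -- Gather statistics so NUM_ROWS, AVG_ROW_LEN, and LAST_ANALYZED are populated.
    BEGIN
      DBMS_STATS.GATHER_SCHEMA_STATS('DIM');
      DBMS_STATS.GATHER_SCHEMA_STATS('HYB');
      DBMS_STATS.GATHER_SCHEMA_STATS('REL');
    END;
    /

    SELECT owner, table_name, num_rows, avg_row_len, last_analyzed
      FROM dba_tables
     WHERE owner IN ('DIM', 'HYB', 'REL')
     ORDER BY owner, table_name;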
Here is the amount of space consumed by the various schemas:

OWNER  TOTAL_SIZE
DIM    129,499,136
HYB    244,056,064
REL    130,023,424

Because the hybrid schema is a combination of the relational and the dimensional, it follows that it should be roughly the size of both combined, minus any common elements, and the numbers bear this out.

Running the System

Each of the queries was run 21 times, and the median runtime was used as the representative value, as shown below.

EVENT  WINNER_TIME     RNR_UP_TIME     LOSER_TIME
1.     DIM = 00:00:    REL = 00:00:    HYB = 00:00:
2.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
3.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
4.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
5.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
6.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
7.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
8.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:

Converting to a percentage scale, to make the values relative rather than absolute, and defining the fastest schema as 100 percent produces these percentages:

EVENT  WINNER_OFFSET  RNR_UP_OFFSET  LOSER_OFFSET
1.     DIM = 100%     REL = 149%     HYB = 159%
2.     DIM = 100%     HYB = 190%     REL = 193%
3.     DIM = 100%     HYB = 145%     REL = 159%
4.     DIM = 100%     HYB = 136%     REL = 4993%
5.     HYB = 100%     DIM = 497%     REL = 4136%
6.     DIM = 100%     HYB = 263%     REL = 1034%
7.     DIM = 100%     HYB = 302%     REL = 1533%
8.     DIM = 100%     HYB = 159%     REL = 408%
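For reference, here is a sketch of the kind of aggregation that turns the 21 captured runtimes into the median and percentage figures shown above. The RUN_TIMES table and its columns are hypothetical stand-ins for whatever analysis.sql actually records, not the implementation's real structures.

    -- Hypothetical capture table: one row per (query, schema, run) with elapsed seconds.
    -- CREATE TABLE run_times (query_nbr NUMBER, schema_name VARCHAR2(3),
    --                         run_nbr NUMBER, elapsed_sec NUMBER);
    SELECT query_nbr,
           schema_name,
           MEDIAN(elapsed_sec)                          AS median_sec,
           ROUND(100 * MEDIAN(elapsed_sec)
                     / MIN(MEDIAN(elapsed_sec))
                         OVER (PARTITION BY query_nbr)) AS offset_pct
      FROM run_times
     GROUP BY query_nbr, schema_name
     ORDER BY query_nbr, median_sec;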
Comparing Relational and Dimensional

Showing that the dimensional schema outperforms the relational schema when running dimensional queries functions as the control of the experiment and provides the baseline from which to consider the hybrid schema's performance. As you can see below, the dimensional schema consistently outperforms the relational schema, as expected.

1.  DIM = 100%   REL = 149%
2.  DIM = 100%   REL = 193%
3.  DIM = 100%   REL = 159%
4.  DIM = 100%   REL = 4993%
5.  DIM = 497%   REL = 4136%
6.  DIM = 100%   REL = 1034%
7.  DIM = 100%   REL = 1533%
8.  DIM = 100%   REL = 408%

In the case of query #4, the difference is nearly 50-fold. Query #4 is the most extreme case, in which the only nontime restriction is on attributes of the topmost table. In the relational schema, this means that all the tables down the hierarchy must be joined to get to the numerical information, which is an expensive operation. In the dimensional schema, the join is a direct connection from one dimension right into the fact table, which is an efficient operation.

Hybrid vs. Dimensional

Whether the hybrid schema performs as well as the dimensional is the core question in this analysis. As you can see below, the hybrid schema works reasonably well, but it is not as fast as a purely dimensional one.

1.  DIM = 100%   HYB = 159%
2.  DIM = 100%   HYB = 190%
3.  DIM = 100%   HYB = 145%
4.  DIM = 100%   HYB = 136%
5.  HYB = 100%   DIM = 497%
6.  DIM = 100%   HYB = 263%
7.  DIM = 100%   HYB = 302%
8.  DIM = 100%   HYB = 159%

Query #5 is a deviant case, but for all the other queries, the hybrid takes between 136 percent and 302 percent of the time required by the dimensional schema. This immediately shows that there are some limitations to the performance of the hybrid schema, but understanding why requires analysis of the query plans. A review of the plans captured during a system run indicates three categories of behavior:

- Queries whose plans are identical between dimensional and hybrid (queries #1, #3, #4, #8)
- Queries whose plans differ between dimensional and hybrid (queries #2, #6, #7)
- Perfect alignment of the query with the hybrid schema (query #5)
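To ground the plan comparison that follows, here is roughly what query #1 ("total premium collected by year, as far back as we can go") looks like against the dimensional schema. The column names are assumptions; the real statements are in the implementation's q????.sql files.

    -- Illustrative only; column names are assumed, not taken from the build scripts.
    SELECT t.year_nbr,
           SUM(f.amount) AS total_premium
      FROM premium_fact f
      JOIN time_dim     t ON t.time_key = f.time_key
     GROUP BY t.year_nbr
     ORDER BY t.year_nbr;

The hybrid version would be the same statement with PREMIUM as the fact table instead of PREMIUM_FACT, which is why the dimensional and hybrid plans below can match.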
Identical plans. Here are the three plans for query #1:

Query #1, relational schema plan:

SELECT STATEMENT (rows=195)
  SORT GROUP BY (rows=195)
    TABLE ACCESS FULL PREMIUM (rows= )

Query #1, dimensional schema plan:

SELECT STATEMENT (rows=300)
  SORT GROUP BY (rows=300)
    HASH JOIN (rows= )
      TABLE ACCESS FULL TIME_DIM (rows=3600)
      TABLE ACCESS FULL PREMIUM_FACT (rows= )

Query #1, hybrid schema plan:

SELECT STATEMENT (rows=300)
  SORT GROUP BY (rows=300)
    HASH JOIN (rows= )
      TABLE ACCESS FULL TIME_DIM (rows=3600)
      TABLE ACCESS FULL PREMIUM (rows= )

Note that the relational plan is different from the dimensional plan, as would be expected. Note also that the dimensional and hybrid plans are identical. This shows the optimizer's ability to detect the dimensional nature of the query and match it to the dimensional constructs of the hybrid schema, which is the desired behavior. The pattern of the dimensional and hybrid plans being identical also holds for queries #3, #4, and #8.

The slower performance despite the identical plans leads to the conclusion that the hybrid schema is slower simply because of its sheer size. As previously discussed, the hybrid schema requires about twice the space of either of the other two schemas. This means fewer rows per block, more total reads for any given operation, and more bytes in motion than in the dimensional schema. It may very well be that having all these extra bytes in motion simply slows things down.

Different plans. Now review the three plans for query #7:

Query #7, relational schema plan:

SELECT STATEMENT (rows=6)
  SORT GROUP BY (rows=6)
    HASH JOIN (rows=77)
      TABLE ACCESS FULL COVERAGE (rows=800)
      TABLE ACCESS FULL PREMIUM (rows=13773)

Query #7, dimensional schema plan:

SELECT STATEMENT (rows=1)
  SORT GROUP BY (rows=1)
    HASH JOIN (rows=1)
      TABLE ACCESS BY INDEX ROWID COVERAGE_DIM (rows=6)
        BITMAP CONVERSION TO ROWIDS (rows=)
          BITMAP AND (rows=)
            BITMAP INDEX SINGLE VALUE BX_COVERAGE_ACCD_LIMIT (rows=)
            BITMAP INDEX SINGLE VALUE BX_COVERAGE_DEDUCTIBLE (rows=)
            BITMAP INDEX SINGLE VALUE BX_COVERAGE_PERS_LIMIT (rows=)
      TABLE ACCESS BY INDEX ROWID PREMIUM_FACT (rows=48)
        BITMAP CONVERSION TO ROWIDS (rows=)
          BITMAP AND (rows=)
            BITMAP MERGE (rows=)
              BITMAP KEY ITERATION (rows=)
                TABLE ACCESS FULL TIME_DIM (rows=12)
                BITMAP INDEX RANGE SCAN BX_PREMIUM_TIME (rows=)
            BITMAP MERGE (rows=)
              BITMAP KEY ITERATION (rows=)
                TABLE ACCESS BY INDEX ROWID COVERAGE_DIM (rows=6)
                  BITMAP CONVERSION TO ROWIDS (rows=)
                    BITMAP AND (rows=)
                      BITMAP INDEX SINGLE VALUE BX_COVERAGE_ACCD_LIMIT (rows=)
                      BITMAP INDEX SINGLE VALUE BX_COVERAGE_DEDUCTIBLE (rows=)
                      BITMAP INDEX SINGLE VALUE BX_COVERAGE_PERS_LIMIT (rows=)
                BITMAP INDEX RANGE SCAN BX_PREMIUM_COVERAGE (rows=)

Query #7, hybrid schema plan:

SELECT STATEMENT (rows=1)
  TEMP TABLE TRANSFORMATION (rows=)
    LOAD AS SELECT SYS_TEMP_0FD9D697C_1278CF0 (rows=)
      TABLE ACCESS BY INDEX ROWID COVERAGE (rows=6)
        INDEX FULL SCAN UX_COVERAGE_COVERAGE_KEY (rows=6)
    SORT GROUP BY (rows=1)
      HASH JOIN (rows=1)
        TABLE ACCESS FULL SYS_TEMP_0FD9D697C_1278CF0 (rows=6)
        TABLE ACCESS BY INDEX ROWID PREMIUM (rows=47)
          BITMAP CONVERSION TO ROWIDS (rows=)
            BITMAP AND (rows=)
              BITMAP MERGE (rows=)
                BITMAP KEY ITERATION (rows=)
                  TABLE ACCESS FULL TIME_DIM (rows=12)
                  BITMAP INDEX RANGE SCAN BX_PREMIUM_TIME (rows=)
              BITMAP MERGE (rows=)
                BITMAP KEY ITERATION (rows=)
                  TABLE ACCESS FULL SYS_TEMP_0FD9D697C_1278CF0 (rows=1)
                  BITMAP INDEX RANGE SCAN BX_PREMIUM_COVERAGE (rows=)

Again the relational plan matches neither of the other two, but this time the dimensional and hybrid plans are not identical. This shows the undesirable optimizer behavior of not using a dimensional plan even though one could exist, because the schema has all the necessary constructs. The pattern of not generating a dimensional plan on the hybrid schema also holds for queries #2 and #6. The reasonable conclusion is that the availability of the relational constructs when the optimizer is processing dimensional queries causes it to generate a plan that is not as effective as the purely dimensional plan generated when only dimensional constructs are available in the schema.

It is important to note that all four cases that use a dimensional plan against the hybrid schema consistently outperform all three cases in which something other than a purely dimensional plan is used against the hybrid schema. This reinforces the value of using a dimensional plan whenever possible.

Perfect alignment. Query #5 presents a unique case in which the hybrid wins because the nature of the query happens to lend itself artificially well to the nature of the hybrid schema. Specifically, query #5 uses the VEHICLE, COVERAGE, and TIME tables. COVERAGE is a weak entity, and as such, it carries the entire VEHICLE identifier in its primary key in the relational and hybrid schemas. In the dimensional schema, the VEHICLE attributes were taken out so that the COVERAGE dimension would be a "pure" dimension, pure in the sense that it stands alone and any association it has with VEHICLE is through the fact table. Although this does make for a proper dimensional schema, it also separates the VEHICLE and COVERAGE tables far more in the dimensional schema than in the relational and hybrid schemas.
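The key structures behind that difference can be sketched as follows. The column names are assumptions carried over from the earlier illustrative DDL, not the implementation's actual definitions.

    -- Dimensional schema: COVERAGE_DIM stands alone. Contrast this with the
    -- relational/hybrid COVERAGE sketched earlier, whose primary key
    -- (vehicle_id, coverage_type) already binds each coverage row to its vehicle.
    CREATE TABLE coverage_dim (
      coverage_key  NUMBER PRIMARY KEY,   -- surrogate key; no vehicle identifier
      deductible    NUMBER,
      pers_limit    NUMBER,
      accd_limit    NUMBER
    );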
When it comes time for the optimizer to use VEHICLE and COVERAGE in query #5, it has to bring them together from scratch in the dimensional schema, but in the hybrid and relational schemas, it can already find them together in the key of the COVERAGE table. That the hybrid schema has such constructs available when the optimizer needs them is one of its stated advantages, but this is not a general-case advantage. It exists only when the query and schema happen to align properly, as is the case for query #5.

Conclusion of Query Analysis

Generalizing these conclusions: if the hybrid schema aligns perfectly with the nature of the query, the hybrid schema can significantly outperform the dimensional schema, but this is not the general case (one query of the eight). In most cases (four of the eight), the hybrid plan will be identical to the dimensional plan, but the hybrid schema will run more slowly, likely because of the number of bytes in motion. In the remaining cases (three of the eight), the plan for the hybrid schema will be different and less than optimal, and the result is runtimes that are considerably longer than those of the dimensional schema.

Materialized View Aggregates

Aggregates are commonly used in dimensional modeling to increase performance, and materialized views are commonly used to create the aggregates. To demonstrate the effect of materialized view aggregates (MVAs) on the hybrid schema, two MVAs were added. Table 1 shows that four of the queries can undergo a query rewrite to use the MVA named in the rightmost column.

Query  Account dim.  Policy dim.  Vehicle dim.  Coverage dim.  Time dim.  Aggregate utilized
1      Agg           Agg                                       Qry/Agg    Agg_acct_pol_time
2      Qry           Qry                                       Qry
3      Qry/Agg       Qry/Agg                                   Qry/Agg    Agg_acct_pol_time
4      Qry/Agg       Agg                                       Qry/Agg    Agg_acct_pol_time
5                                  Qry           Qry           Qry
6      Qry/Agg       Qry/Agg      Qry/Agg       Qry/Agg                   Agg_acct_pol_veh_cov
7                                                Qry           Qry
8                                  Qry                         Qry

Table 1. Materialized view aggregates added to optimize specific queries

"Qry" indicates that the query references the dimension in its WHERE clause, and "Agg" indicates that the MVA preserves the reference to the dimension. For a rewrite to occur, no "Qry" may appear alone in a cell.
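Here is a hedged sketch of what an MVA such as Agg_acct_pol_time might look like on the hybrid schema. The grouping columns are assumptions based on the aggregate's name and the earlier illustrative DDL; the real definitions are in the implementation's agg_*.sql scripts.

    -- Illustrative only; the agg_*.sql files contain the actual definitions.
    CREATE MATERIALIZED VIEW agg_acct_pol_time
      BUILD IMMEDIATE
      ENABLE QUERY REWRITE
    AS
    SELECT p.account_id,
           p.policy_id,
           p.time_key,
           SUM(p.amount) AS total_premium,
           COUNT(*)      AS premium_cnt
      FROM premium p
     GROUP BY p.account_id, p.policy_id, p.time_key;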
The performance change from the MVAs is, overall, very positive, as would be expected. Also of interest is that the MVAs cause the hybrid schema to do quite well relative to the dimensional schema, as shown here:

EVENT  WINNER_TIME     RNR_UP_TIME     LOSER_TIME
1.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
2.     DIM = 00:00:    REL = 00:00:    HYB = 00:00:
3.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
4.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
5.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
6.     HYB = 00:00:    DIM = 00:00:    REL = 00:00:
7.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:
8.     DIM = 00:00:    HYB = 00:00:    REL = 00:00:

EVENT  WINNER_OFFSET  RNR_UP_OFFSET  LOSER_OFFSET
1.     HYB = 100%     DIM = 147%     REL = 950%
2.     DIM = 100%     REL = 189%     HYB = 191%
3.     HYB = 100%     DIM = 134%     REL = 572%
4.     HYB = 100%     DIM = 150%     REL = 6151%
5.     HYB = 100%     DIM = 421%     REL = 2065%
6.     HYB = 100%     DIM = 125%     REL = 744%
7.     DIM = 100%     HYB = 288%     REL = 1152%
8.     DIM = 100%     HYB = 157%     REL = 404%

In fact, in 100 percent of the cases in which aggregates are used (four of the eight queries), the hybrid schema takes first place for performance. This shows not only that the use of materialized views allows both the dimensional and hybrid schemas to benefit but also that it consistently favors the hybrid schema. This is a significant finding, because it indicates that the use of MVAs on a hybrid schema achieves full dimensional performance while still keeping all the relational relationships.

Future Research and Other Considerations

"Can" vs. "Should". As noted at the outset, the motivation for this research was my need to build a single system to meet both OLTP-like and DSS-like business requirements. However, if a project needs only one of the two types of behavior, or has the money and time to fund and build two separate environments with the appropriate feeds between them, it may be better to avoid the hybrid design.

Human understanding. This analysis looked at implementation aspects of the hybrid design, but one of the main values of the dimensional design is that it is easy for those who are not database experts to understand. The hybrid design not only loses this ease of understanding but also produces the most complex design of the three discussed. This is a major drawback for systems that need to give power users direct access to the database.

Other physical optimization. No partitioning was used in the current analysis. Bitmap join indexes were added and analyzed, but the results showed little advantage (see the code for details). These and other physical optimization techniques should be analyzed to determine whether they benefit the hybrid design as much as they benefit the dimensional design.
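For reference, a bitmap join index on the hybrid schema might look like the following sketch. The column names follow the earlier illustrative DDL rather than the implementation's bitmapjoin.sql.

    -- Illustrative only; see bitmapjoin.sql in the implementation for the real indexes.
    CREATE BITMAP INDEX bjx_premium_acct_region
      ON premium (account.region)
      FROM premium, account
      WHERE premium.account_id = account.account_id;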
Platform considerations. This system was run on the Windows XP operating system using Intel Pentium-class hardware. Three such platforms were tested, with RAM varying from 256MB to 1GB. Other platforms may run faster or slower overall, but platform changes may also change the performance of the schemas relative to each other. This possibility is even more likely given the ability of the Oracle database software to detect the state of platform resources and adjust plans accordingly.

Conclusion

In general, the numbers show that the idea of combining relational and dimensional designs is feasible, if not without issues. But note that perfection is not an option: a purely relational design cannot achieve dimensional performance, and a purely dimensional design cannot represent relations efficiently. Each of the three designs has its own limitations. Given that, the slight performance degradation that comes from the bigger row size and the increase in physical complexity are relatively small prices to pay when the benefit is the ability to have both full relational representation and much improved performance.

James Madison has been working in the information technology field since the early 1990s and has worked on Oracle database systems for the majority of that time. He greatly welcomes feedback at [email protected].

Sidebar: File Descriptions

The reference implementation has several dozen source files and generates nine output files during a run. Only the important or potentially confusing files and directories are listed here. The less significant ones should be easily understood by tracing the code and output.

File/Pattern       Purpose
go.cmd             The main entry point into the system. This runs everything. It has 13 environment variables that guide the behavior of the run; review them carefully. Except for a few minor utilities, all system code is connected to this root.
*master*.*         The code uses a fourth schema, "master", to build and run the main three.
build*.*           The DDL for the schemas.
validation.sql     Validates that the schema builds and data loading were done correctly.
queries.sql        Runs the queries.
q????.sql          The eight queries across the three models. They are broken out this way to make them callable across the three schemas for both running and plan analysis. The queries.sql file handles all the needed parameterization and sequencing.
analysis.sql       Performs the analysis. Produces the runtime output tables shown in the figures.
agg_*.sql          Builds the aggregates.
bitmapjoin.sql     Builds the bitmap join indexes. Not discussed above, but still available here.
runs               The directory into which generated files are placed.
runs\base_*.txt    Three files with the validation, query output, and analysis for the schemas without materialized views or bitmap join indexes.
runs\mv_*.txt      The same three files with materialized views in place.
runs\bmji_*.txt    The same three files with bitmap join indexes in place.
oracle_config      A directory with files to do some system setup. Not meant for reuse; your system will vary.
