Avoiding Common Analysis Services Mistakes
  Craig Utley
Who Am I?
  Craig Utley, Mentor with Solid Quality Mentors
  craig@solidq.com
  Consultant specializing in development with Microsoft technologies and data warehousing
  Published author of books, whitepapers, articles, and courseware
  Operator of LearnMicrosoftBI.com
My Data Warehousing Experience
  Worked on first data warehousing course from Microsoft (1998)
  Working with BI in various forms for past 11 years
  Just over one year on Microsoft's SQL Customer Advisory Team (SQLCAT), focused on BI
  Built warehouses or consulted on BI projects in financial, healthcare, manufacturing, and CPG industries
  Conference speaker and article author on BI topics
Agenda
  What is Analysis Services?
    An Overview of BI Apps
    Dimensional Modeling Basics
    Analysis Services
  Common Problems with SSAS
    Dimension Problems
    Cube Problems
    Other Problems
What is Analysis Services?
  Analysis Services is the cube building engine from Microsoft
  Cubes are part of a data warehouse or data mart
  Analysis Services 2005/2008 are Microsoft's third and fourth generation of cube building technology
    They are a radical change from the previous two versions, however
The BI Process
  SSIS, SSAS, SSRS, etc.
OLAP
  OLTP is the more common use for relational databases
    Normalized, built for transactions
  OLAP (Online Analytical Processing) is a denormalized structure built for queries
    Sometimes called the relational data warehouse, star schema, or snowflake schema
    Consists of Dimension and Fact tables
Dimension Tables
  Short and fat, heavily denormalized
  Represents a person, place, thing, or concept
  Contains full descriptions, usually flattens hierarchical structures into a single relational table
  Contains how the person wants to see the data
  Joins to zero-to-many records in the fact table
Fact Tables
  Tall and thin
  Usually contains a foreign key for each dimension table and then a series of numeric facts (aka measures)
  Contains what the person wants to see
  Despite being narrow, usually accounts for 98%+ of the storage of the data
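The split between dimension and fact tables can be sketched in a few lines of Python. This is only an illustration: the table contents, column names, and the sales_by helper are all hypothetical and do not come from the presentation. The dimension rows carry the descriptions (how the data is viewed), the fact rows carry keys and measures (what is viewed).

```python
from collections import defaultdict

# Hypothetical dimension table: one wide, descriptive row per product key.
dim_product = {
    1: {"name": "LaserJet 100", "subcategory": "Printers", "category": "Hardware"},
    2: {"name": "InkJet 20", "subcategory": "Printers", "category": "Hardware"},
    3: {"name": "Office Suite", "subcategory": "Productivity", "category": "Software"},
}

# Hypothetical fact table: narrow rows holding foreign keys and numeric measures.
fact_sales = [
    {"product_key": 1, "date_key": 20080101, "sales_amount": 300.0},
    {"product_key": 2, "date_key": 20080101, "sales_amount": 150.0},
    {"product_key": 1, "date_key": 20080102, "sales_amount": 300.0},
    {"product_key": 3, "date_key": 20080102, "sales_amount": 120.0},
    {"product_key": 3, "date_key": 20080103, "sales_amount": 80.0},
]


def sales_by(attribute):
    """Total sales_amount grouped by one descriptive dimension attribute."""
    totals = defaultdict(float)
    for row in fact_sales:
        # Follow the foreign key from the fact row to its dimension row.
        member = dim_product[row["product_key"]][attribute]
        totals[member] += row["sales_amount"]
    return dict(totals)
```

Calling sales_by("category") or sales_by("subcategory") answers two different questions from the same fact rows, which is the point of keeping the descriptions in the dimension.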
Analysis Services
  Analysis Services takes data, usually from a relational warehouse, and builds cubes
  Cubes are a different physical format from relational tables by default
  Cubes are designed to quickly respond to queries both by how they store data and by pre-calculating aggregates
Cube Components
  Cubes contain dimensions, which usually contain hierarchies
  Cubes also contain measures, which may come from one or more fact tables
  Cubes may contain calculations, KPIs, translations, perspectives, actions, aggregations, and partitions
Common SSAS Mistakes
  There are a number of common mistakes companies make when creating cubes
  Dimensions appear to be especially problematic for companies
  Cubes contain their own issues
  Many companies fail to properly deliver data to the organization effectively
Attribute Relationships
  Many dimensions will contain one or more hierarchies
  Attribute relationships define how dimension attributes are related to each other
  The first thing companies should do is correctly identify attribute relationships
    Not identifying them is bad
    Incorrectly identifying them is worse
Attribute Keys
  Sometimes an attribute, by itself, is not enough to determine uniqueness
  A combination of attributes is sometimes necessary to ensure proper analysis
  Attribute relationships with flexible relationship types can cause problems with non-unique attributes
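A small Python sketch of the uniqueness problem, under assumed data: a month name alone does not identify a month once the dimension spans more than one year, so a key built from year plus month is needed. The rows and the members helper are hypothetical and only model the idea; they are not SSAS code.

```python
from collections import defaultdict

# Hypothetical date dimension rows joined to a measure: (year, month_name, sales).
rows = [
    (2007, "January", 100),
    (2007, "February", 110),
    (2008, "January", 130),
    (2008, "February", 90),
]


def members(key_fn):
    """Group rows into attribute members using key_fn as the attribute key."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row[2]
    return dict(totals)


# Keyed on the name alone, both Januaries collapse into a single member.
by_name_only = members(lambda r: r[1])

# Keyed on (year, month), every month stays a distinct member.
by_composite_key = members(lambda r: (r[0], r[1]))
```

With the name-only key there are two members and January 2007 is silently merged with January 2008; with the composite key there are four members, one per real month.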
Demo
  Attribute Relationships and Attribute Keys
Measure Groups
  Measure groups are tied to specific dimensions that make sense for the records in the fact table
  Browsing by unrelated dimensions can lead to problems
  Decide when IgnoreUnrelatedDimensions should be true
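The effect of IgnoreUnrelatedDimensions can be modeled roughly as follows. This is a simplified Python sketch, not SSAS behavior in full: the quota figures, the dimension names, and the quota_cell function are hypothetical. It assumes a quota measure group that is related to a Date dimension but not to a Product dimension.

```python
# Hypothetical measure group: quotas are stored by year and have no Product key.
RELATED_DIMENSIONS = {"Date"}
QUOTA_BY_YEAR = {"2007": 500, "2008": 700}
QUOTA_TOTAL = sum(QUOTA_BY_YEAR.values())


def quota_cell(dimension, member, ignore_unrelated_dimensions):
    """Value shown for the quota measure when sliced by one dimension member."""
    if member == "All":
        # The All member always shows the grand total.
        return QUOTA_TOTAL
    if dimension in RELATED_DIMENSIONS:
        return QUOTA_BY_YEAR.get(member)
    if ignore_unrelated_dimensions:
        # The unrelated member is ignored, so the grand total repeats on every row.
        return QUOTA_TOTAL
    # Otherwise the cell is empty for members of the unrelated dimension.
    return None
```

Browsing quota by Product shows 1200 on every product row when the setting is true, and empty cells when it is false. Which is less confusing depends on the audience, which is why the setting deserves a deliberate decision.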
Aggregations
  Aggregations are one of the greatest benefits to cubes
    They can provide a huge speed boost
  By default, SSAS doesn't build aggregations!
  The Usage Based Optimization wizard should provide better results
  Aggregations can be built by wizards or manually
    BUT BE CAREFUL
    A separate tool in 2005, integrated in 2008
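Why a pre-calculated aggregate speeds up queries can be shown with a toy Python model. Everything here is assumed for illustration (the fact rows, the choice of a year-by-category aggregate, and the function names); real SSAS aggregation design is considerably more involved.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, category, sales_amount).
fact_rows = [
    (2007, "Hardware", 300),
    (2007, "Software", 120),
    (2008, "Hardware", 450),
    (2008, "Hardware", 150),
    (2008, "Software", 80),
]


def scan_total(year, category):
    """Answer a query the slow way: read every fact row at query time."""
    return sum(a for y, c, a in fact_rows if y == year and c == category)


def build_aggregation(rows):
    """Pre-calculate totals at the (year, category) grain, once, at processing time."""
    totals = defaultdict(int)
    for year, category, amount in rows:
        totals[(year, category)] += amount
    return dict(totals)


aggregation = build_aggregation(fact_rows)


def aggregated_total(year, category):
    """Answer the same query with a single lookup in the pre-built aggregate."""
    return aggregation.get((year, category), 0)
```

Both functions return the same numbers; the difference is that the aggregate is paid for once during processing and takes extra storage, which is the trade-off behind the warning to be careful when designing aggregations by hand.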
Processing Cubes
  Processing contains a number of defaults
  These defaults are not always optimal and can be tweaked for better performance
  The number of parallel objects to process is decided by the server by default
    Changing this can have a noticeable impact on processing performance
ASCMD
  To script operations, use ASCMD
  Allows for the command-line execution of XMLA scripts, MDX queries, and DMX statements
  Useful for extremely large warehouses where UI tools might slow down
Processing with ASCMD
  For best performance use ASCMD.EXE and XMLA
  Use <Parallel> </Parallel> to group processing tasks together until Server is using maximum resources
  Proper use of <Transaction> </Transaction>
  ProcessData and ProcessIndex separately instead of ProcessFull for a more predictable CPU usage pattern
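As a rough sketch of the split between data and index processing, the Python below generates two XMLA batches, each wrapping its Process commands in a Parallel element. The database, cube, measure group, and partition IDs are hypothetical, as is the build_batch helper. Note that in XMLA the index processing type is spelled ProcessIndexes. Transaction handling is left out of this sketch.

```python
import xml.etree.ElementTree as ET

# Namespace of the Analysis Services XMLA engine commands.
XMLA_NS = "http://schemas.microsoft.com/analysisservices/2003/engine"


def build_batch(database_id, cube_id, measure_group_id, partition_ids, process_type):
    """Return an XMLA Batch that processes the given partitions in parallel."""
    batch = ET.Element("Batch", {"xmlns": XMLA_NS})
    parallel = ET.SubElement(batch, "Parallel")
    for partition_id in partition_ids:
        process = ET.SubElement(parallel, "Process")
        obj = ET.SubElement(process, "Object")
        ET.SubElement(obj, "DatabaseID").text = database_id
        ET.SubElement(obj, "CubeID").text = cube_id
        ET.SubElement(obj, "MeasureGroupID").text = measure_group_id
        ET.SubElement(obj, "PartitionID").text = partition_id
        ET.SubElement(process, "Type").text = process_type
    return ET.tostring(batch, encoding="unicode")


# Hypothetical object IDs, used only for illustration.
partitions = ["Sales_2007", "Sales_2008"]
data_batch = build_batch("SalesDW", "Sales", "Fact Sales", partitions, "ProcessData")
index_batch = build_batch("SalesDW", "Sales", "Fact Sales", partitions, "ProcessIndexes")
```

Each string could be saved to a file and run through ASCMD as an input script, first the data batch and then the index batch, so that the I/O-heavy and CPU-heavy phases do not overlap.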
MDX Tips
  Set Non-empty behavior property on calculations when possible
  Explicitly reference cells when possible
    Products.Printers instead of Dimensions(2).Printers
    Products.Printers instead of Products.CurrentMember
  Move simple calculations (such as Measure1 * Measure2) from Calculations to the DSV
  Learn how to use the Scope statement
  Read the Microsoft SQL Server 2005 Analysis Services Performance Guide
Scope vs. Nested IIF
  CREATE MEMBER CURRENTCUBE.[Measures].[Foo] AS
    IIF([Time].[Time Season].CurrentMember.Level IS [Time].[Time Season].[Day],
      NULL,
      IIF([Time].[Time Season].CurrentMember.Level IS [Time].[Time Season].[Week],
        IIF(IsEmpty([Measures].[WTD Net Tot Sls]) OR [Measures].[WTD Net Tot Sls] = 0,
          NULL,
          [Measures].[TY Str Stk] / [Measures].[WTD Net Tot Sls]),
        IIF([Time].[Time Season].CurrentMember.Level IS [Time].[Time Season].[Period],
          IIF(IsEmpty([Measures].[MTD Net Tot Sls]) OR [Measures].[MTD Net Tot Sls] = 0,
            NULL,
            [Measures].[TY Str Stk] / [Measures].[MTD Net Tot Sls]),
          IIF(IsEmpty([Measures].[STD Net Tot Sls]) OR [Measures].[STD Net Tot Sls] = 0,
            NULL,
            [Measures].[TY Str Stk] / [Measures].[STD Net Tot Sls]))))
Scope vs. Nested IIF cont.
  -- Default to NULL (this covers the Day level), then override one level at a time.
  CREATE MEMBER CURRENTCUBE.[Measures].[Foo] AS NULL;
  SCOPE([Measures].[Foo]);
    ([Time].[Time Season].[Week].MEMBERS) =
      IIF([Measures].[WTD Net Tot Sls] = 0, NULL,
        [Measures].[TY Str Stk] / [Measures].[WTD Net Tot Sls]);
    ([Time].[Time Season].[Period].MEMBERS) =
      IIF([Measures].[MTD Net Tot Sls] = 0, NULL,
        [Measures].[TY Str Stk] / [Measures].[MTD Net Tot Sls]);
    ([Time].[Time Season].[Season].MEMBERS) =
      IIF([Measures].[STD Net Tot Sls] = 0, NULL,
        [Measures].[TY Str Stk] / [Measures].[STD Net Tot Sls]);
  END SCOPE;
Server Properties
  There are several server properties that can be changed for better performance
    Especially in high concurrency situations
  CoordinatorExecutionMode controls the number of processes per core
  ThreadPool settings control the number of threads for queries and processing
  Read the Microsoft SQL Server 2005 Analysis Services Performance Guide
Troubleshooting Slow Queries
  Troubleshooting slow SSAS queries is not as easy as troubleshooting slow relational queries
  Learn MDX inside and out
  Read "Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services"
Delivering Data to Users
  The single biggest barrier I have seen to successful BI projects is a lack of effective delivery to end users
  Client tools may include PerformancePoint Server, SQL Server Reporting Services, ProClarity, Excel, custom applications, etc.
  Learn when to use which tools by knowing your users and their needs for the data
Summary
  Analysis Services is a powerful tool for delivering data that is easy to analyze in a variety of ways
  Analysis Services is also complex and covers many areas, so knowing every facet is nearly impossible
  Read the best practices guides from Microsoft
Books
  Microsoft SQL Server 2005 Analysis Services
  Applied Microsoft Analysis Services 2005: And Microsoft Business Intelligence Platform
  MDX Solutions: With Microsoft SQL Server Analysis Services 2005 and Hyperion Essbase