Building Advanced Data Models with SAP HANA
Werner Steyn
Customer Solution Adoption, SAP Labs, LLC.
Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent. 2011 SAP AG. All rights reserved. 2
Agenda
- SAP HANA Studio: features and modeling overview
- Calculation View: features relevant for advanced data modeling
- SQLScript: focusing on the use case with calculation views
- Modeling recommendations: how to build content
- Best practices: especially those related to performance
  - Working with multiple fact tables
SAP HANA Studio Features and Modeling
SAP HANA Studio Features
- Modeling
  - Information Models: create multiple views of transactional data that can be used for analytical purposes
  - Choice to publish and consume at 3 levels of modeling: Attribute View, Analytic View, Calculation View
  - Database Views / Column Stores
- Import/Export
  - Models
  - Data Source schemas (metadata): mass and selective load
  - Landscape
- Data Provisioning (both initial load and replication)
- Troubleshooting / Trace / Log
- Data Preview
  - Physical Tables
  - Information Models
SAP HANA Modeling Terminology
- Data
  - Attributes: descriptive data (known as characteristics in SAP BW terminology)
    - Calculated Attributes
  - Measures: data that can be quantified and calculated (known as key figures in SAP BW)
    - Calculated Measures & Restricted Measures
- Views
  - Attribute Views, i.e. dimensions / JOIN views
  - Analytic Views, i.e. cubes / OLAP views
  - Calculation Views: similar to a virtual/MultiProvider with services concept in BW
    - Graphical Calculation View
    - Script Calculation View (SQLScript, CE functions)
- Procedures
  - Functions: reusable functionality from within Script Calculation Views
- Analytic Privilege
  - Security object: who has access to which report, including restrictions on row-level data
Modeling for SAP HANA 1.0 (1/3): Using SAP HANA Studio
Step 1: Attribute View
- Separate master data modeling from fact data
- Build the needed master data objects as Attribute Views
Step 2: Analytical View
- Create a cube-like view by joining attribute views to fact data
- Build a Data Foundation based on the transactional table
- Join attribute views to the data foundation
Modeling for SAP HANA 1.0 (2/3): Using SAP HANA Studio
Step 3: Calculation View
- When working with multiple fact tables, or when joins are not sufficient, create a Calculation View: an object that looks like a view and has SQLScript inside
- Composite view of other views (tables, JOIN views, OLAP views)
- Consists of a graphical and a script-based editor
- SQLScript is a HANA-specific functional script language
Modeling for SAP HANA 1.0 (3/3): Using SAP HANA Studio
Step 4: Analytic Privileges
- Analysis authorizations for row-level security
Calculation View Features relevant for advanced data modeling
Characteristics of Calculation View
Calculation Views are:
- A column view that is visible to reporting tools
- When the view is accessed, a function is implicitly executed
- 2 types of Calculation Views (Graphical & Script)
- Calculation Views are side-effect-free / READ-ONLY functions
Diagram: accessing the Column View implicitly executes the function behind the Calculation View.
2 Types of Calculation Views
- Graphical Calculation View: composite views, re-uses Analytical and Attribute views
- SQL Script Calculation View: SQL / SQL Script / custom functions
Diagram (both types): Analytical View, Projection, Union
Calculation View Graphical
- No SQL / SQL Script coding needed
- Can consume other Analytical Views, Attribute Views, Calculation Views & tables
- Union, Join, Projection nodes provided; these enhance existing view functionality
- UNION: combines multiple Analytical Views, with 2..N input sources
- Use Union with Constant Values when working with multiple (2..N) Analytical Views / fact tables
Calculation View Graphical - Projection
- Projection nodes improve performance by narrowing the data set
- Further optimization can be done by applying filters
- Define Calculated Columns (example: midstr(string("erdat"),strlen(string("erdat"))-9,4))
- Calculated Columns are calculated before aggregation
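As a rough illustration of what the example calculated-column expression computes, the sketch below mimics it in Python. It assumes that midstr(s, start, length) uses a 1-based start position and that string("erdat") yields a 10-character date such as 2011-05-17. Both points are assumptions made for illustration, and the helper names are hypothetical.

```python
def midstr(s: str, start: int, length: int) -> str:
    """Substring with a 1-based start position (assumed semantics)."""
    return s[start - 1:start - 1 + length]


def year_of(erdat: str) -> str:
    """Mimics midstr(string("erdat"), strlen(string("erdat")) - 9, 4)."""
    return midstr(erdat, len(erdat) - 9, 4)


# Under the assumptions above, the expression picks out the year part.
print(year_of("2011-05-17"))  # prints 2011
```

With a 10-character input, strlen - 9 evaluates to 1, so the expression takes the first four characters of the string.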
Calculation View Graphical - Union
- Use UNION to combine multiple Analytical Views
- 2 modeling options with Unions:
  - Standard Union
  - Union with Constant Values
Calculation View Graphical - Output
- Add Attributes and Measures to the Output
- Define the structure of the column store
- Define Calculated Measures across subject areas
Diagram labels: Column Store, Activate, Calculated Measures Across Subject Areas
Calculation View SQLScript (Script-based)
- SQL or SQLScript is required to create script-based Calculation Views
- Write SQL SELECT statements against existing raw tables or Column Stores (preferred)
- Define the output structure; activation creates a column store based on the script output
Diagram: Analytical View, Projection, Union
SQLScript
SQLScript
SQLScript is a collection of SQL extensions to push data-intensive logic into the DB layer.
- Functional extension: allows the definition of (side-effect-free) functions which can be used to express and encapsulate complex data flows
- Data type extension: allows the definition of types without corresponding tables
Traditional model (data to code): the code runs in the application layer, which requires massive data copies from the DB layer and creates a bottleneck.
New model (code to data): the code runs in the DB layer and only results are transferred to the application layer. Will be implemented as Calculation Views.
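The difference between the two models can be sketched with any SQL database. The example below uses Python's built-in sqlite3 module purely as a stand-in (it is not SAP HANA), and the table name sales and its rows are hypothetical. It computes the same totals twice: once by copying all rows to the application, and once by letting the database aggregate.

```python
import sqlite3

# In-memory database used only as a stand-in for the DB layer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("1000", 100), ("1000", 100), ("2000", 200)],
)

# Data to code: copy every row to the application, then aggregate there.
rows = conn.execute("SELECT customer, amount FROM sales").fetchall()
totals_in_app = {}
for customer, amount in rows:
    totals_in_app[customer] = totals_in_app.get(customer, 0) + amount

# Code to data: the database aggregates; only the result is transferred.
result = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
).fetchall()
totals_in_db = dict(result)

print(len(rows), "rows transferred vs", len(result))
```

Both approaches give the same totals; what changes is how many rows cross the boundary between the DB layer and the application layer.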
SQLScript
- Built-in (Calculation Engine) functions are an alternative to using SQL; they should be used in Calculation Views exclusively where possible
- Calculation Engine functions should not be mixed with standard SQL statements
- Client queries can be well optimized and parallelized by the engine
- Usually much better performance results than a calculation view via SQL
- Preferred: parallel query execution; only selected fields will be fetched
SQLScript Built-in Functions
Each operation is shown as plain SQL and as the equivalent CE built-in function (preferred).

SELECT on column table
  SQL: SELECT A, B, C FROM "COLUMN_TABLE"
  CE:  CE_COLUMN_TABLE("COLUMN_TABLE", [A, B, C])

SELECT on attribute view
  SQL: SELECT A, B, C FROM "ATTRIBUTE_VIEW"
  CE:  CE_JOIN_VIEW("ATTRIBUTE_VIEW", [A, B, C])

SELECT on analytical view
  SQL: SELECT A, B, C, SUM(D) FROM "ANALYTIC_VIEW" GROUP BY A, B, C
  CE:  CE_OLAP_VIEW("ANALYTIC_VIEW", [A, B, C]);

SELECT on calculation view
  SQL: SELECT A, B, C, SUM(D) FROM "CALC_VIEW" GROUP BY A, B, C
  CE:  CE_CALC_VIEW("CALC_VIEW", [A, B, C]);

WHERE / HAVING
  SQL: SELECT A, B, C, SUM(D) FROM "ANALYTIC_VIEW" WHERE B = 'value' AND C = 'value'
  CE:  var_tab = CE_COLUMN_TABLE("COLUMN_TABLE");
       CE_PROJECTION(:var_tab, [A, B, C], ' "B" = ''value'' AND "C" = ''value'' ');

GROUP BY
  SQL: SELECT A, B, C, SUM(D) FROM "COLUMN_TABLE" GROUP BY A, B, C
  CE:  var_tab = CE_COLUMN_TABLE("COLUMN_TABLE");
       CE_AGGREGATION(:var_tab, SUM(D), [A, B, C]);

INNER JOIN
  SQL: SELECT A, B, Y, SUM(D) FROM "COLTAB1" INNER JOIN "COLTAB2"
       ON "COLTAB1"."KEY1" = "COLTAB2"."KEY1" AND "COLTAB1"."KEY2" = "COLTAB2"."KEY2"
  CE:  CE_JOIN("COLTAB1", "COLTAB2", [KEY1, KEY2], [A, B, Y, D])

LEFT OUTER JOIN
  SQL: SELECT A, B, Y, SUM(D) FROM "COLTAB1" LEFT OUTER JOIN "COLTAB2"
       ON "COLTAB1"."KEY1" = "COLTAB2"."KEY1" AND "COLTAB1"."KEY2" = "COLTAB2"."KEY2"
  CE:  CE_LEFT_OUTER_JOIN("COLTAB1", "COLTAB2", [KEY1, KEY2], [A, B, Y, D])

SQL expressions
  SQL: SELECT A, B, C, SUBSTRING(D,2,5) FROM "COLUMN_TABLE"
  CE:  var_tab = CE_COLUMN_TABLE("COLUMN_TABLE");
       CE_PROJECTION(:var_tab, ["A", "B", "C", CE_CALC('midstr("D",2,5)', string)]);

UNION ALL
  SQL: var_tab1 = SELECT A, B, C, D FROM "COLUMN_TABLE1";
       var_tab2 = SELECT A, B, C, D FROM "COLUMN_TABLE2";
       SELECT * FROM :var_tab1 UNION ALL SELECT * FROM :var_tab2;
  CE:  var_tab1 = CE_COLUMN_TABLE("COLUMN_TABLE1", [A, B, C, D]);
       var_tab2 = CE_COLUMN_TABLE("COLUMN_TABLE2", [A, B, C, D]);
       CE_UNION_ALL(:var_tab1, :var_tab2);
SQLScript CE Built-in Function Example
SQLScript CE Built-in Function (CE_OLAP_VIEW)
- Parameter 1: Analytical View name
- Parameter 2 (optional): field names (1..N)
- The result is assigned to a variable name / temporary table
SQL equivalent:
  VAR1 = SELECT MATNR, KUNNR, REGIO, LAND1, ... SUM(CM2) FROM CEA1_00
SQLScript CE Built-in Function (CE_PROJECTION)
- Parameter 1: input data set variable
- Parameter 2: projection field names (projected fields, functions, expressions)
SQL equivalent:
  VAR2 = SELECT MATNR, KUNNR, REGIO, LAND1 AS LANDX, 0 AS KPLIKZ, NetRevenue AS NETREV, CM2 FROM :VAR1;
SQLScript CE Built-in Function (CE_CALC)
SQL equivalent:
  VAR3 = SELECT MATNR, KUNNR, REGIO, LAND1, ... CM2 FROM CEP1_00
SQLScript CE Built-in Function (CE_PROJECTION & FILTER)
- Parameter 3: filter
SQL equivalent:
  VAR4 = SELECT MATNR, KUNNR, REGIO, LAND1 AS LANDX, 0 AS KPLIKZ, NetRevenue AS NETREV, CM2 FROM :VAR3 WHERE KUNNR != 001 AND PERIO = 5
SQLScript CE Built-in Function (CE_UNION_ALL)
- var_out = table type / output of function
- 2 parameters (data set var1, data set var2)
SQL equivalent:
  VAR_OUT = SELECT * FROM :VAR3 UNION ALL SELECT * FROM :VAR4
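The read, project, filter and union-all sequence walked through on the preceding slides can be approximated in plain SQL. The sketch below is a simplified illustration only: it uses Python's sqlite3 module instead of SAP HANA, temporary views instead of SQLScript table variables, and hypothetical tables (actual, plan) with made-up rows. The KPLIKZ flag values and the choice of inputs to the final union are assumptions for the sake of a small runnable example, not the script from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two sources read by the script.
    CREATE TABLE actual (KUNNR TEXT, PERIO INTEGER, CM2 INTEGER);
    CREATE TABLE plan   (KUNNR TEXT, PERIO INTEGER, CM2 INTEGER);
    INSERT INTO actual VALUES ('001', 5, 10), ('002', 5, 20), ('002', 5, 5);
    INSERT INTO plan   VALUES ('001', 5, 11), ('002', 5, 22), ('002', 6, 33);

    -- Step 1: read and aggregate the first source.
    CREATE TEMP VIEW var1 AS
        SELECT KUNNR, PERIO, SUM(CM2) AS CM2 FROM actual GROUP BY KUNNR, PERIO;
    -- Step 2: projection that adds a constant column.
    CREATE TEMP VIEW var2 AS
        SELECT KUNNR, PERIO, 0 AS KPLIKZ, CM2 FROM var1;
    -- Step 3: read the second source.
    CREATE TEMP VIEW var3 AS
        SELECT KUNNR, PERIO, CM2 FROM plan;
    -- Step 4: projection with a filter (flag value 1 is an assumption).
    CREATE TEMP VIEW var4 AS
        SELECT KUNNR, PERIO, 1 AS KPLIKZ, CM2 FROM var3
        WHERE KUNNR != '001' AND PERIO = 5;
    -- Step 5: combine both intermediate results without removing duplicates.
    CREATE TEMP VIEW var_out AS
        SELECT * FROM var2 UNION ALL SELECT * FROM var4;
""")

var_out = conn.execute(
    "SELECT * FROM var_out ORDER BY KPLIKZ, KUNNR"
).fetchall()
for row in var_out:
    print(row)
```

Each temporary view plays the role of one intermediate variable, so the data flow stays readable step by step while the database remains free to evaluate it as one query.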
SQLScript Table Type
- Allows for the definition of new table types
- Similar to a database table, but does not have an instance
- Used to define function parameters
- Created when the Calculation View is activated
Diagram labels: var_out, Column Store
SQLScript Procedures
- Read-only procedures can be created
- The following restrictions apply to procedures created in the information modeler:
  - IN (input) parameters can be of scalar or table type
  - OUT (output) parameters must be of type table
- Table types required for the signature are generated automatically
- An activated procedure can be called by other procedures
- Name of procedure: _SYS_BIC. <package-name>/<proc>
Recommendations - How to build content
Recommendations - How to build content (2/2)

1: Column Table
  Usage: Used for simple applications and showcases.
  Pros: No additional modeling required. Easy to consume for most clients.
  Cons: No support for analytical privileges, multi-language and client handling. Complex calculation and logic / currency conversion / security shifted to the client side. In general low performance.

2: Analytical View
  Usage: Used for analytical purposes where read operations on mass data are required.
  Pros: Very high performance on SELECT. Supported by modeling. Well optimized.
  Cons: Limitations with regard to functions. Supports measures from a single fact table only.

3: Calculation View (SQL)
  Usage: Used for simple calculations where only a few fields are used.
  Pros: Building calculation views via SQL syntax is easy.
  Cons: Client queries can be less optimized and could be significantly slower compared to other models.

4: Calculation View (CE Functions) (Preferred)
  Usage: Used for analytical purposes that cannot be expressed using Attribute or Analytical views. Perform statements against existing Attribute & Analytical Views.
  Pros: Client queries can be well optimized and parallelized. Usually better performance results than SQL.
  Cons: Syntax is different from the well-known SQL language. Limitations with regard to functions.

5: Calculation View (Graphical) (Preferred)
  Usage: Used for analytical purposes that cannot be expressed using Attribute or Analytical views. Model execution flow between existing Analytical and Calculation Views.
  Pros: No SQL or SQL Script knowledge required. Union with Constant Values fully supported. Client queries can be well optimized and parallelized.
  Cons: Limitations with regard to functions.
Recommendations - How to build content
Diagram (layers, top to bottom): Calculation View, Analytical View, Attribute View, Tables
Best Practices Especially, recommendations related to performance
How to work with multiple fact tables (1/5): Design considerations
How to combine multiple Analytic Views?
- The business situation may require:
  - The combination of two or more fact tables
  - Measures that originate from all fact tables
- DO NOT DO: create an analytic view for each fact table and join the two together (Analytic View 1 joined with Analytic View 2). This approach may have severe performance implications.
- Recommended solution: model one analytic view for each fact table and union them (Analytic View 1 U Analytic View 2).
  - If the views have the same measures, the solution is straightforward.
  - If the measures differ, use a union with constant values.
How to work with multiple fact tables (2/5): Union with Constant Values
Situation
- You have measures coming from two fact tables
Solution
- Create one analytic view for each of the fact tables
- Create a calculation view that implements a union over both analytic views
- Set the counterparts for measures to constant zeros for all tables that do not provide that measure
- Set the counterparts for dimensions to constant NULLs for all tables that do not provide the dimension
When to consider
- You can also use multiple (2..N) analytic views in one union operation
- The pattern follows the concept of a MultiProvider in BW
Standard Union (3/5)

Analytical View A
CUSTOMER  AMOUNT  FLAG
1000      100     A
1000      100     A
2000      200     A

Analytical View P
CUSTOMER  AMOUNT  FLAG
1000      100     P
2000      200     P
2000      200     P

Standard Union
CUSTOMER  AMOUNT  FLAG
1000      100     A
1000      100     A
1000      100     P
2000      200     A
2000      200     P
2000      200     P

Result after aggregation
CUSTOMER  AMOUNT  FLAG
1000      200     A
1000      100     P
2000      200     A
2000      400     P
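The standard union on this slide can be reproduced with ordinary SQL. The sketch below uses Python's sqlite3 module as a stand-in for the database (it is not SAP HANA), and the table names view_a and view_p are hypothetical substitutes for Analytical View A and Analytical View P. The rows are the ones shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE view_a (customer INTEGER, amount INTEGER, flag TEXT);
    CREATE TABLE view_p (customer INTEGER, amount INTEGER, flag TEXT);
    INSERT INTO view_a VALUES (1000, 100, 'A'), (1000, 100, 'A'), (2000, 200, 'A');
    INSERT INTO view_p VALUES (1000, 100, 'P'), (2000, 200, 'P'), (2000, 200, 'P');
""")

# Standard union: both sources feed the same AMOUNT column, and the FLAG
# column keeps the rows of each source apart during aggregation.
standard_union = conn.execute("""
    SELECT customer, SUM(amount) AS amount, flag
    FROM (SELECT * FROM view_a UNION ALL SELECT * FROM view_p)
    GROUP BY customer, flag
    ORDER BY customer, flag
""").fetchall()

for row in standard_union:
    print(row)
```

The aggregated result has one row per customer and flag, which matches the "Result after aggregation" table.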
Union with Constant Values (Preferred) (4/5)

Analytical View A
CUSTOMER  AMOUNT_A
1000      100
1000      100
2000      200

Analytical View P
CUSTOMER  AMOUNT_P
1000      100
2000      200
2000      200

Union with Constant Values
CUSTOMER  AMOUNT_A  AMOUNT_P
1000      100       0
1000      100       0
2000      200       0
1000      0         100
2000      0         200
2000      0         200

Result after aggregation
CUSTOMER  AMOUNT_A  AMOUNT_P
1000      200       100
2000      200       400
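The union with constant values on this slide can also be written as ordinary SQL. As a hedged sketch, the example below again uses Python's sqlite3 module rather than SAP HANA, with hypothetical table names view_a and view_p standing in for the two analytical views and the rows taken from the slide. Each branch of the union supplies a constant 0 for the measure it does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE view_a (customer INTEGER, amount_a INTEGER);
    CREATE TABLE view_p (customer INTEGER, amount_p INTEGER);
    INSERT INTO view_a VALUES (1000, 100), (1000, 100), (2000, 200);
    INSERT INTO view_p VALUES (1000, 100), (2000, 200), (2000, 200);
""")

# Each source fills its own measure column; the missing counterpart is a
# constant zero, so aggregation yields one row per customer.
union_with_constants = conn.execute("""
    SELECT customer, SUM(amount_a) AS amount_a, SUM(amount_p) AS amount_p
    FROM (
        SELECT customer, amount_a, 0 AS amount_p FROM view_a
        UNION ALL
        SELECT customer, 0 AS amount_a, amount_p FROM view_p
    )
    GROUP BY customer
    ORDER BY customer
""").fetchall()

for row in union_with_constants:
    print(row)
```

Compared with the standard union, both measures end up side by side on a single row per customer, which matches the "Result after aggregation" table.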
CO-PA: Union vs. Union with Constant Values (5/5)
Comparison of Standard Union and Union with Constant Values, each combining Actual Fact / Analytical View 1 with Planned Fact / Analytical View 2.
Demo