SAP High-Performance Analytic Appliance 1.0 (SAP HANA)
A First Look at the System Architecture
Marc Bernard, SAP Technology Regional Implementation Group
February 2011
Disclaimer This presentation outlines our general product direction and should not be relied on in making a purchase decision. This presentation is not subject to your license agreement or any other agreement with SAP. SAP has no obligation to pursue any course of business outlined in this presentation or to develop or release any functionality mentioned in this presentation. This presentation and SAP's strategy and possible future developments are subject to change and may be changed by SAP at any time for any reason without notice. This document is provided without a warranty of any kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. SAP assumes no responsibility for errors or omissions in this document, except if such damages were caused by SAP intentionally or grossly negligent. 2011 SAP AG. All rights reserved. / Page 2
Agenda
1. Architecture Overview
2. Row Store
3. Column Store
4. Persistence Layer
5. Modeling
6. Q&A
Architecture Overview: In-Memory Computing Engine and Surroundings
[Figure: architecture diagram. Components shown:]
Clients (planned, e.g.): MS Excel, BI4 Explorer, Dashboard Design, SAP BI4 universes (WebI, ...), BI4 Analysis
In-Memory Computing Studio: Administration, Modeling
SAP BusinessObjects BI4: SBO BI4 Information Design Tool, SBO BI4 servers (program for client)
Data provisioning: ERP with Replication Agent and ERP DB, Load Controller, Replication Server, Data Services, Data Services Designer
Other source systems: SAP NetWeaver BW, 3rd party
In-Memory Computing Engine:
  Session Management
  Request Processing / Execution Control: SQL Parser, MDX, SQL Script, Calc Engine
  Relational Engines: Row Store, Column Store
  Transaction Manager, Authorization Manager, Metadata Manager
  Persistence Layer: Page Management, Logger
Disk Storage: Data Volumes, Log Volumes
Architecture Overview (continued)
[Figures: the architecture diagram above is repeated on five further slides, each titled for one area: The Engine, Loading Data into SAP HANA, Data Modeling, Reporting, Administration.]
SAP High-Performance Analytic Appliance 1.0
[Figure: component diagram. Labels shown, in extraction order:]
(optional) SAP BusinessObjects BI clients
(optional) SAP In-Memory Computing Studio: admin & model
Interfaces: SQL, MDX, JDBC, ODBC, ODBO, BICS, SQL DBC
Authentication, content mgmt: SAP BusinessObjects BI 4.0 Repository
SAP In-Memory Computing Engine
(existing) SAP Business Application, Replication Server, SAP HANA Replication Agent: sync, load
(optional) SAP BusinessObjects Data Services 4.0: any source
DB Server
Request Processing and Execution Control: Conceptual View
Standard SQL
  Processed directly by the DB engine
SQL Script, MDX and planning engine interface
  Domain-specific programming languages or models
  Converted into calculation models
Calc Engine
  Creates the logical execution plan for calculation models
  Executes user-defined functions
Relational Engine
  DB optimizer produces the physical execution plan
  Access to row and column store
Calc Engine for Dummies
The easiest way to think of Calculation Models is to see them as dataflow graphs, where the modeler defines data sources as inputs and places different operations (join, aggregation, projection, ...) on top of them for data manipulation.
The Calculation Engine breaks up a model, for example some SQL Script, into operations that can be processed in parallel (rule-based model optimizer).
These operations are then passed to the database optimizer, which determines the best plan for accessing the row or column store (algebraic transformations and cost-based optimizations based on database statistics).
Calc Engine for Dummies: Example
[Figure: example diagram, not recoverable from the extracted text.]
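The dataflow-graph idea can be illustrated with a minimal sketch. This is not the example from the slide and not the actual Calc Engine: the model, its node names and the function `parallel_levels` are hypothetical. It only shows how a graph of operations can be grouped into levels whose members do not depend on each other and could therefore be processed in parallel.

```python
from typing import Dict, List, Set

# Hypothetical calculation model: each operation lists the operations
# (or data sources) whose output it consumes.
MODEL: Dict[str, List[str]] = {
    "sales": [],                          # data source
    "customers": [],                      # data source
    "proj_sales": ["sales"],              # projection
    "proj_cust": ["customers"],           # projection
    "join": ["proj_sales", "proj_cust"],  # join of both projections
    "aggregate": ["join"],                # aggregation on top of the join
}


def parallel_levels(model: Dict[str, List[str]]) -> List[List[str]]:
    """Group operations into levels. Operations within one level have
    no dependencies on each other and could run in parallel."""
    remaining = dict(model)
    done: Set[str] = set()
    levels: List[List[str]] = []
    while remaining:
        # An operation is ready once all of its inputs have been produced.
        ready = sorted(name for name, deps in remaining.items()
                       if all(dep in done for dep in deps))
        if not ready:
            raise ValueError("cycle in dataflow graph")
        levels.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return levels
```

Calling `parallel_levels(MODEL)` yields four levels: both data sources, both projections, the join, and the aggregation. How the real optimizer splits and orders operations is not described in this deck.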
In-Memory Computing Engine High-Level Architecture: Row Store
  One of the relational engines
  Interfaced from the calculation / execution layer
  Pure in-memory store
  Persistence managed in the persistence layer
[Figure: high-level architecture diagram of the SAP in-memory computing engine (HANA).]
Row Store Architecture: Row Store Block Diagram
Transactional Version Memory
  Contains temporary versions
  Needed for multi-version concurrency control (MVCC)
Segments
  Contain the actual data (content of row-store tables) in pages
Page Manager
  Memory allocation
  Keeping track of free/used pages
Version Memory Consolidation
  Think garbage collector for MVCC
Persistence Layer
  Invoked in write operations (log)
  And when performing savepoints (checkpoint writer)
Row Store Architecture: Highlights
Write Operations
  Mainly go into Transactional Version Memory
  INSERT also writes to the Persisted Segment
Persisted Segment
  Contains data that may be seen by any ongoing transaction (data that was committed before any active transaction started)
Version Consolidation
  Moves visible versions from Transactional Version Memory into the Persisted Segment (based on Commit ID)
  Clears outdated record versions from Transactional Version Memory
Memory Handling
  Row store tables are a linked list of memory pages
  Pages are grouped in segments
  Page size: 16 KB
[Figure: main memory with Transactional Version Memory (recent versions of changed records) and Persisted Segment (data that may be seen by all active transactions); write operations, read operations and version memory consolidation are shown.]
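The interplay of version memory, persisted segment and consolidation can be sketched as follows. This is a strongly simplified illustration under stated assumptions, not HANA's implementation: the class `RowStoreSketch` and all method names are hypothetical, every write is treated as immediately committed, and a reader's view is reduced to a single snapshot commit ID.

```python
from typing import Dict, List, Optional, Tuple


class RowStoreSketch:
    """Toy model of version memory plus persisted segment (hypothetical)."""

    def __init__(self) -> None:
        self.persisted: Dict[str, str] = {}              # "persisted segment"
        self.versions: List[Tuple[int, str, str]] = []   # (commit_id, key, value)
        self.last_commit_id = 0

    def write(self, key: str, value: str) -> int:
        """Store a new committed version in version memory; return its commit ID."""
        self.last_commit_id += 1
        self.versions.append((self.last_commit_id, key, value))
        return self.last_commit_id

    def read(self, key: str, snapshot: int) -> Optional[str]:
        """Return the newest version visible to a reader with this snapshot."""
        for commit_id, k, value in reversed(self.versions):
            if k == key and commit_id <= snapshot:
                return value
        return self.persisted.get(key)

    def consolidate(self, oldest_active_snapshot: int) -> None:
        """Move versions that every active reader can see into the persisted
        segment and clear them from version memory (MVCC garbage collection)."""
        keep: List[Tuple[int, str, str]] = []
        for commit_id, key, value in self.versions:  # ascending commit order
            if commit_id <= oldest_active_snapshot:
                self.persisted[key] = value
            else:
                keep.append((commit_id, key, value))
        self.versions = keep
```

In this sketch a reader with an older snapshot keeps seeing the older version until consolidation has moved past it, which is the behavior the slide attributes to the Commit ID based consolidation.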
Indexes for Row Store Tables: Primary Index / Row ID / Index Persistence
Each row-store table has a primary index
  The primary index maps between the ROW ID and the primary key of the table
  ROW ID: a number specifying for each record its memory segment and page
How to find the memory page for a table record?
  A structure called ROW ID contains the segment and the page for the record
  The page can then be searched for the record based on the primary key
  ROW ID is part of the primary index of the table
  Secondary indexes can be created if needed
Persistence of indexes in row store
  Indexes in row store only exist in memory
  No persistence of index data
  Index definition is stored with the table metadata
  Indexes are filled on the fly when the system loads tables into memory on system start-up
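A minimal sketch of the lookup path and of rebuilding the index at load time, assuming a very simple page layout. The names `RowId`, `build_primary_index` and `lookup`, and the representation of pages as Python lists, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class RowId(NamedTuple):
    """Locates a record: the memory segment and the page within it."""
    segment: int
    page: int


# Assumed layout: (segment, page) -> records on that page as (primary_key, payload)
Pages = Dict[Tuple[int, int], List[Tuple[int, str]]]


def build_primary_index(pages: Pages) -> Dict[int, RowId]:
    """Fill the in-memory primary index while the table pages are loaded,
    since the index data itself is not persisted."""
    index: Dict[int, RowId] = {}
    for (segment, page), records in pages.items():
        for primary_key, _payload in records:
            index[primary_key] = RowId(segment, page)
    return index


def lookup(pages: Pages, index: Dict[int, RowId], primary_key: int) -> str:
    """Use the ROW ID to find the page, then search the page by primary key."""
    row_id = index[primary_key]
    for key, payload in pages[(row_id.segment, row_id.page)]:
        if key == primary_key:
            return payload
    raise KeyError(primary_key)
```

For example, with two pages in segment 0, `build_primary_index` maps each primary key to its `RowId`, and `lookup` then scans only the one page the index points to.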
In-Memory Computing Engine High-Level Architecture: Column Store
  One of the relational engines
  Interfaced from the calculation / execution layer
  Pure in-memory store
  Persistence managed in the persistence layer
  Optimized for high performance of read operations
  Good performance of write operations
  Efficient data compression
[Figure: high-level architecture diagram of the SAP in-memory computing engine (HANA).]
Column Store Architecture: Column Store Block Diagram
Optimizer and Executor
  Handles queries and the execution plan
Main and Delta Storage
  Compressed data for fast read
  Delta data for fast write
  Asynchronous delta merge
Consistent View Manager
Transaction Manager
Persistence Layer
Column Store Highlights
Storage Separation (Main & Delta)
  Enables high compression and high write performance at the same time
Write Operations
  Only in delta storage, because it is write optimized
  An update is performed by inserting a new entry into the delta storage
Data Compression in Main Storage
  Compression by creating a dictionary and applying further compression methods
  Speeds up: data load into CPU cache, equality checks, search
  The compression is computed during the delta merge operation
Read Operations
  Always have to read from both main and delta storage and merge the results
  The engine uses multi-version concurrency control (MVCC) to ensure consistent read operations
Delta Merge Operation
  See next slide
[Figure: main memory with Main storage (compressed and read optimized) and Delta storage (write optimized); write operations go to Delta, read operations go to both.]
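Dictionary compression, and why it speeds up equality checks, can be shown with a small sketch. The functions `dictionary_encode` and `rows_equal_to` are hypothetical, and the use of a sorted dictionary is an assumption made for this illustration; the "further compression methods" mentioned above are not modeled.

```python
from bisect import bisect_left
from typing import List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Replace each value by its position in a sorted dictionary of
    distinct values, so the column becomes a list of small integers."""
    dictionary = sorted(set(column))
    value_ids = [bisect_left(dictionary, value) for value in column]
    return dictionary, value_ids


def rows_equal_to(dictionary: List[str], value_ids: List[int], value: str) -> List[int]:
    """Equality search: one dictionary lookup, then integer comparisons only."""
    pos = bisect_left(dictionary, value)
    if pos == len(dictionary) or dictionary[pos] != value:
        return []  # value does not occur in the column at all
    return [row for row, vid in enumerate(value_ids) if vid == pos]
```

For a column such as `["DE", "US", "DE", "FR", "US", "DE"]` the dictionary holds three entries and the column itself stores only value IDs, so a search for one country compares integers instead of strings.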
Column Store Delta Management: Delta Merge Operation
Purpose
  To move changes in delta storage into the compressed and read-optimized main storage
Characteristics
  Happens asynchronously
  Even during the merge operation, the columnar table is still available for read and write operations
  To fulfil this requirement, a second delta and main storage are used internally
[Figure: three phases. Before merge: Main, Delta. During merge: Main, Delta, Main New, Delta New, with the merge operation in progress. After merge: Main New, Delta New. Write operations and read operations are shown in every phase.]
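The three phases can be mimicked with a toy table. This is a sketch under stated assumptions, not the engine's merge algorithm: the class `ColumnTableSketch` and its methods are hypothetical, the merge is split into explicit begin and finish calls instead of running asynchronously, and recompression of the new main storage is not modeled.

```python
class ColumnTableSketch:
    """Toy column table with main, delta and a temporary second delta."""

    def __init__(self):
        self.main = []         # read-optimized part, replaced only by a merge
        self.delta = []        # write-optimized part, receives new entries
        self.delta_new = None  # second delta, exists only while a merge runs

    def insert(self, value):
        # While a merge is running, writes go to the second delta.
        target = self.delta if self.delta_new is None else self.delta_new
        target.append(value)

    def read_all(self):
        # Reads combine main and delta (and the second delta during a merge).
        return self.main + self.delta + (self.delta_new or [])

    def begin_merge(self):
        if self.delta_new is not None:
            raise RuntimeError("merge already running")
        self.delta_new = []

    def finish_merge(self):
        if self.delta_new is None:
            raise RuntimeError("no merge running")
        # "Main New": old main plus old delta (recompression is not modeled).
        self.main = self.main + self.delta
        self.delta = self.delta_new
        self.delta_new = None
```

Inserting between `begin_merge()` and `finish_merge()` lands in the second delta, and `read_all()` returns the same rows before, during and after the merge, which is the availability property the slide describes.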
Persistence Layer: Purpose and Scope
Why Does an In-Memory Database Need a Persistence Layer?
  Main memory is volatile. What happens upon database restart? Power outage? ...
  Data needs to be stored in a non-volatile way
  Backup and restore
The SAP in-memory computing engine offers one persistence layer, which is used by both row store and column store
  Regular savepoints: full persisted image of the DB at the time of the savepoint
  Logs capturing all DB transactions since the last savepoint (redo logs and undo logs written): restore the DB from the latest savepoint onwards
  Ability to create "snapshots": used for backups
Persistence Layer: System Restart and Population of In-Memory Stores
Actions During System Restart
  The last savepoint must be restored, plus:
    Undo logs must be read for uncommitted transactions saved with the last savepoint
    Redo logs must be read for committed transactions since the last savepoint
  The complete content of the row store is loaded into memory
  Column store tables may be marked for preload or not
    Only tables marked for preload are loaded into memory during startup
    If a table is marked for loading on demand, the restore procedure is invoked on first access
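The restart sequence (savepoint image, then undo, then redo) can be sketched on a key-value state. This is a simplified illustration, not HANA's recovery procedure: the function `recover`, the log entry layout and the set of committed transaction IDs are all assumptions introduced here.

```python
from typing import Dict, List, Optional, Set, Tuple

UndoEntry = Tuple[int, str, Optional[str]]  # (transaction_id, key, old_value or None)
RedoEntry = Tuple[int, str, str]            # (transaction_id, key, new_value)


def recover(savepoint: Dict[str, str],
            undo_log: List[UndoEntry],
            redo_log: List[RedoEntry],
            committed: Set[int]) -> Dict[str, str]:
    """Rebuild the database state after a restart."""
    # 1. Start from the last savepoint image.
    state = dict(savepoint)
    # 2. Roll back changes of uncommitted transactions that were captured
    #    in the savepoint, newest change first.
    for txn, key, old_value in reversed(undo_log):
        if txn not in committed:
            if old_value is None:
                state.pop(key, None)   # the key did not exist before
            else:
                state[key] = old_value
    # 3. Replay changes of transactions committed since the savepoint.
    for txn, key, new_value in redo_log:
        if txn in committed:
            state[key] = new_value
    return state
```

In this sketch a savepoint that contains an uncommitted insert loses that row during step 2, while committed changes logged after the savepoint are reapplied in step 3.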
Row Store vs. Column Store: When to Use Which Store
Modeling Is Only Possible for Column Tables
  This answers the frequently asked question: "Where should I put a table: row store or column store?"
  Information Modeler only works with column tables
  Replication Server creates tables in the column store by default
  Data Services creates tables in the column store by default
  SQL to create a column table: "CREATE COLUMN TABLE..."
  The store can be changed with "ALTER TABLE ..."
System Tables Are Created Where They Fit Best
  Administrative tables in row store:
    Schema SYS: caches, administrative tables of the engine
    Tables from the statistics server
  Administrative tables in column store:
    Schema _SYS_BI: metadata of created views + master data for MDX
    Schema _SYS_BIC: some generated tables for MDX
    Schema _SYS_REPO: e.g. lists of active/modified versions of models
SAP In-Memory Computing Studio: Look and Feel
[Screenshot: studio layout with Quick Launch View, Navigator View and Properties View.]
SAP In-Memory Computing Studio Features: Information Modeler Features
Modeling
  No materialized aggregates
  Database views
  Choice to publish and consume at 4 levels of modeling: Attribute View, Analytic View, Analytic View enhanced with Attribute View, Calculation View
Data Preview
  Physical tables
  Information Models
Import/Export
  Models
  Data Source schemas (metadata): mass and selective load
  Landscapes
Data Provisioning for SAP Business Applications (both initial load and replication)
Analytic Privileges / Security
Modeling Process Flow
Import Source System Metadata
  Physical tables are created dynamically (1:1 schema definition of source system tables)
Provision Data
  Physical tables are loaded with content
Create Information Models
  Database views are created: Attribute Views, Analytic Views, Calculation Views
Deploy
  Column views are created and activated
Consume
  Consume with a choice of client tools: BICS, SQL, MDX
SAP In-Memory Computing Studio Terminology: Information Modeler Terminology
Data
  Attributes: descriptive data (known as characteristics in SAP BW terminology)
  Measures: data that can be quantified and calculated (known as key figures in SAP BW)
Views
  Attribute Views: i.e. dimensions
  Analytic Views: i.e. cubes
  Calculation Views: similar to a virtual provider with services concept in BW
Hierarchies
  Level-based hierarchy (based on multiple attributes)
  Parent-child hierarchy
Analytic Privilege: security object
SAP In-Memory Computing Studio: Navigator View - Default Catalog
[Screenshot: Navigator view. Callouts:]
  HANA Instance (<USER>): HANA server name and instance number, user
  Database schema
  Schema content: Column Views, Functions, Tables, Views
SAP In-Memory Computing Studio: Navigator View - Information Models
  Information Models are organized in packages
  Attribute Views, Analytic Views, Calculation Views and Analytic Privileges are organized in folders
Attribute Views
What is an Attribute View?
  Attributes add context to data
  Attributes are modeled using Attribute Views
  Can be regarded as master data tables
  Can be linked to fact tables in Analytic Views
  A measure, e.g. weight, can be defined as an attribute
Table Joins and Properties
  Join types: leftouter, rightouter, fullouter, texttable
  Cardinality: 1:1, N:1, 1:N
  Language column
Analytic View
  An Analytic View can be regarded as a cube
  Analytic Views do not store any data. The data is stored in the column store or table view, based on the Analytic View structure
Attributes and Measures
  Can create attribute filters
  Must have at least one attribute
  Must have at least one measure
  Can create restricted measures
  Can create calculated measures
  Can rename attributes and measures on the property tab
Analytic View: Data Preview
There are three main views one can select from when previewing data:
  Raw Data: table format of data
  Distinct Values: graphical and text format identifying unique values
  Analysis: select fields (attributes and measures) to display in graphical format
Calculation View (Scripting)
  Define the table output structure
  Write the SQL statement. Ensure that the selected fields correspond to the previously defined output table structure of the function.
Example:
  SQL_A = SELECT MATNR, KUNNR, ... FROM <COPA_ACTUAL_ANALYTICAL VIEW 1>
  SQL_P = SELECT MATNR, KUNNR, ... FROM <COPA_PROJECTED_ANALYTICAL VIEW 2>
  TABLE_OUTPUT_STRUCTURE = SELECT * FROM <SQL_A> UNION SELECT * FROM <SQL_P>;
SAP In-Memory Computing Studio: Pre-Delivered Administration Console
[Screenshot: administration console with Navigator View, Administration View and Properties View.]
Thank you! 2011 SAP AG. All rights reserved. / Page 41
Further Information on SAP HANA and In-Memory Technologies
In-Memory Computing
  http://www.sap.com/platform/in-memory-computing
Real-Real Time Business with HANA
  http://www.youtube.com/watch?v=uuqtuw-m7mq
SAP Community Network Topic Page
  http://www.sdn.sap.com/irj/sdn/in-memory
SAP Community Forum
  http://forums.sdn.sap.com/forum.jspa?forumid=491
The SAP NetWeaver BW SAP HANA Relationship
  http://www.sdn.sap.com/irj/scn/weblogs?blog=/pub/wlg/21575
SAP HANA Ramp-Up Knowledge Transfer (login required)
  http://service.sap.com/rkt-hana
SAP HANA Documentation (login required during ramp-up)
  https://cw.sdn.sap.com/cw/community/docupedia/hana