Big Data Analytics Using SAP HANA Dynamic Tiering
Balaji Krishna, SAP Labs
SESSION CODE: BI474
LEARNING POINTS
- How Dynamic Tiering reduces the TCO of a HANA solution
- Data aging concepts using in-memory and on-disk storage
- Single install, administration, and monitoring
IDC Predictions for 2015 / IDC predictions for 2014
- Cloud: Cloud spending will surge by 25%, reaching over $100 billion. There will be a doubling of cloud data centers.
- Data explosion: Data volumes will continue to explode to 6 billion petabytes.
- Internet of Things: 30 billion devices and sensors in 2020, driving $8.9 trillion in revenue.
- Social networking: Social networking will become embedded in cloud platforms and most enterprise apps and processes.
SAP End to End Data Management for Real Time Business
[Diagram labels: Workforce of the Future; Cloud; Big Data; Industries; Internet of Things; Custom Development; Business & Consumer Applications; ISVs & OEMs; ERP; SAP DATA MANAGEMENT: TRANSACT, STORE, ANALYZE, PREDICT]
SAP Data Management Portfolio
End-to-End Data Management & App Platform for Real-Time Business
- REAL-TIME APPLICATIONS: Consumer Engagement; Sense & Respond; Planning & Optimization
- REAL-TIME ANALYTICS: Operational Analytics; Big Data Warehousing; Predictive, Spatial & Text Analytics
- SAP HANA PLATFORM: Real-time transactions + end-to-end analytics
- SAP HANA platform components: Extended Application Services; Processing Engine; Database Services; SAP HANA dynamic tiering; Application Function Lib. & Data Models; Integration Services
- Portfolio products: SAP ASE; SAP ESP; Replication Server; SAP SQL Anywhere; SAP IQ; SAP Data Services
Time Value of Data
- Value of immediate data access declines
[Figure: Value plotted against Time. Labels: Last time accessed; When you need it again; Access Event; Regulatory audit; Business critical reference data; Source data; Archive]
Warm/Cold Data Management
Questions about SAP HANA dynamic tiering
Why is warm data management important for SAP HANA?
- Size and cost constraints may prohibit an all in-memory solution
- Not all data has the same value
- Warm data has less stringent latency requirements than hot data
Why is SAP HANA dynamic tiering the best solution for warm data management?
- SAP HANA dynamic tiering utilizes disk backed, smart column store technology based on SAP IQ
- SAP HANA dynamic tiering excels at ad hoc queries on structured data from terabyte to petabyte scale
- SAP HANA dynamic tiering is a deeply integrated, high performance solution in a single system
What about Hadoop for warm data storage and processing?
- Hadoop has unlimited capacity for raw data processing
- Hadoop is best suited for batch processing of raw, unstructured data
- Hadoop is an external data store with technical integration into HANA, and it carries a higher TCO because an additional system must be managed
Introducing SAP HANA dynamic tiering
Requirements from our customers
- Manage data cost effectively, yet with desired performance based on SLAs
- Handle very large data sets, from terabytes to petabytes
- Update and query all data seamlessly via HANA tables
- Application defines which data is hot, and which data is warm
- Native Big Data solution to handle a large percentage of enterprise data needs without Hadoop
[Diagram: SAP HANA system with dynamic tiering option. A HANA application connects to the HANA database. Hot store: column tables and row tables on worker hosts. Warm store: extended table on the ES host. Fast data movement and optimized push down query processing between the stores.]
Data Qualities and Data Temperatures
How to think about it
Data in the database: different data temperatures (all part of the database's data image, SAP HANA Platform)
- Hot data: maximum access performance, always in memory. Data for daily reporting, other high-priority data.
- Warm data: reduced access performance, not (always) in memory. Other data required to operate the application.
Externalized data, moved out of the database: different data qualities (data is stored and managed outside of the application database)
- Near-line storage (NLS): available for read access. Data that is (normally) not updated and infrequently accessed.
- Traditional archive: not accessible without IT process. Data that's kept for legal reasons or similar.
SAP HANA dynamic tiering
Map data priorities to data management
Hot Store: classic HANA tables (hot data)
- Primary data image in memory
- DB algorithms optimized for in-memory data
- Persistence on disk to guarantee durability
Warm Store: extended tables (warm data)
- Primary data image on disk
- Data processing using algorithms optimized for disk-based data
- Main memory used for caching and processing
[Diagram: all in one SAP HANA database. Hot store: primary image in memory (RAM), durability on disk. Warm store (dynamic tiering): cache / processing in memory, primary image on disk.]
Technical Details Implementation choices
SAP HANA dynamic tiering
One database / one experience for HANA application developers and admins
- Reduced TCO
- Optimized for performance
- Single database experience
- Centralized operational control
- Centralized monitoring / admin
- Integrated security
- Common installer and licensing model
- Unified backup and restore
- High speed data ingest
- Optimized query processing
SAP HANA dynamic tiering
The overall system layout
SAP HANA with dynamic tiering consists of two types of hosts:
- Regular worker hosts (running the classical HANA processes: indexserver, nameserver, daemon, xsserver, ...). HANA hosts can be single-node or scale-out; appliance or TDI.
- ES hosts (running nameserver, daemon, and esserver). esserver is the database process of the warm store.
One single SAP HANA database: one SID, one instance number.
All client communication happens through index server / XS server.
[Diagram: SAP HANA system with dynamic tiering service. The client application connects to the hot store on the worker hosts (column tables, row tables; standby hosts not shown). The warm store holds extended tables on the ES host (controller), with further ES hosts possible. Fast data movement and optimized push down query processing between the stores. Common storage system.]
HANA Extended Tables
- HANA extended table schema is part of HANA database catalog
- HANA extended table data resides in warm store
- HANA extended table is a first class database object with full ACID compliance
[Diagram: HANA database. The database catalog holds the table definitions for both table types. Classical HANA column/row table: data in the hot store. Extended table (warm table): data in the warm store.]
High Speed Data Ingest
- Import from CSV files:
    IMPORT FROM CSV FILE 'bigfile.csv' INTO t1
- Bulk array insert:
    INSERT INTO t1 (col1, col2, col3...) VALUES (val1, val2, val3...)
- High-speed data movement between HANA tables and HANA extended tables:
    INSERT INTO t_extended SELECT c1 FROM t_hana
- Concurrent inserts from multiple connections: a HANA extended table may be a DELTA enabled table, which allows multiple concurrent writes
Data movement between hot and warm store
[Diagram: HANA database. CSV data is loaded into the warm extended table with IMPORT FROM CSV FILE 'data.csv' INTO t_extended. Data moves between the hot HANA column table and the warm extended table with INSERT ... SELECT. Label: Materialization.]
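The INSERT ... SELECT movement from a hot table to a warm table can be tried out without a HANA system. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, not HANA itself: two ordinary SQLite tables named t_hana and t_extended (names taken from the slide) play the roles of the hot and warm stores. The helper move_to_warm, the cutoff value, and the follow-up DELETE are assumptions for illustration; the slide itself shows only the INSERT ... SELECT step.

```python
import sqlite3

def move_to_warm(conn, cutoff):
    """Copy rows with c1 < cutoff from t_hana to t_extended, then delete them from t_hana.

    Hypothetical helper: the DELETE step is an assumption about how an
    application might complete the move after the INSERT ... SELECT.
    """
    with conn:  # one transaction: both statements commit together or roll back together
        conn.execute(
            "INSERT INTO t_extended (c1) SELECT c1 FROM t_hana WHERE c1 < ?", (cutoff,)
        )
        cur = conn.execute("DELETE FROM t_hana WHERE c1 < ?", (cutoff,))
    return cur.rowcount

# SQLite stand-ins for the hot table and the extended (warm) table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_hana (c1 INTEGER)")
conn.execute("CREATE TABLE t_extended (c1 INTEGER)")
conn.executemany("INSERT INTO t_hana (c1) VALUES (?)", [(2010,), (2011,), (2014,)])
conn.commit()

moved = move_to_warm(conn, 2012)
hot_rows = [r[0] for r in conn.execute("SELECT c1 FROM t_hana ORDER BY c1")]
warm_rows = [r[0] for r in conn.execute("SELECT c1 FROM t_extended ORDER BY c1")]
print(moved, hot_rows, warm_rows)
```

On a real system t_extended would be a HANA extended table and the statements would run through the HANA SQL interface; the sketch only shows the statement pattern.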
Optimized Query Processing
- Parallel query processing: data is pulled from HANA hot store into HANA warm store query processing engine using multiple streams, and processed in parallel
- Push/Pull query optimization and transformation: query operations ship to hot or warm store as appropriate for native performance
- Extended tables may be used in HANA CALC views
- HANA Calc engine and HANA SQL engine share extended table query performance optimizations
[Diagram labels: Ordering; Grouping; Joining; T1, T2, T3, T4]
Example Query Plan
    select "account_num", count(*) as account_count
    from VXM_FOODMART.CUSTOMER C      -- Customer is a native HANA table in HANA memory
    where "lname" >= 'Ga' and "lname" < 'Gb'
    and exists
      ( select * from VXM_IQSTORE.PRODUCT P      -- Product is a HANA extended table in the warm store
        where "product_id" = "customer_id" )
    group by "account_num"
    order by "account_num";
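The shape of this example query can be checked on toy data. The sketch below again uses sqlite3 as a stand-in: both tables are plain SQLite tables, so it shows only the query semantics (range filter on lname, EXISTS semi-join, grouping), not the hot/warm push-down that HANA performs. The sample rows are invented, and the schema prefixes VXM_FOODMART and VXM_IQSTORE are dropped because they have no counterpart here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data; in the slide, CUSTOMER is a hot table and PRODUCT an extended table.
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER, account_num INTEGER, lname TEXT);
CREATE TABLE product  (product_id INTEGER);
INSERT INTO customer VALUES (1, 100, 'Gable'), (2, 100, 'Garcia'),
                            (3, 200, 'Gates'), (4, 300, 'Gbadamosi'),
                            (9, 400, 'Gallo');
INSERT INTO product VALUES (1), (2), (3), (4);
""")

query = """
SELECT account_num, COUNT(*) AS account_count
FROM customer c
WHERE lname >= 'Ga' AND lname < 'Gb'
  AND EXISTS (SELECT * FROM product p WHERE p.product_id = c.customer_id)
GROUP BY account_num
ORDER BY account_num
"""
rows = conn.execute(query).fetchall()
print(rows)
```

'Gbadamosi' falls outside the lname range and 'Gallo' has no matching product row, so only accounts 100 and 200 remain in the result.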
HANA Monitoring and Administration
- HANA Cockpit: new, web based monitoring and administration console for HANA
- HANA Studio will be used for design and modeling of HANA extended tables
- HANA Cockpit displays status, CPU/memory/storage resource utilization, table usage statistics
- Provides access to and search of server logs and custom traces
- Shows alerts triggered by extended storage
- Enables administration of extended storage: add and drop storage, or increase size of file
[Screenshot: HANA Cockpit tile "Extended Storage User Tables", by top usage]
Unified Backup and Restore
- Data backups (manual or scheduled)
- Log backups (automatic, or none)
- Data backups with log backups allow restore to a point in time or to the most recent state: t1 -> t3
- Data backups alone allow restore to a specific backup only: t1 or t2
- HANA backup manages backup of both hot and warm store
- Point in Time Recovery (PITR) is supported
[Diagram: timeline for HANA and extended storage showing data backups, log backups, the log area, and a system crash; times t1, t2, t3; backup history; restore]
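The restore rules above can be written down as a small decision function. This is an illustrative sketch only, not a HANA API: the function name restorable, the numeric timestamps, and the assumption that log backups cover the time line without gaps up to log_backup_until are all made up for the example.

```python
def restorable(target, data_backups, log_backup_until=None):
    """Return the data backup to start from if `target` can be restored, else None.

    data_backups: times at which data backups completed.
    log_backup_until: latest time covered by log backups (assumed gap-free),
    or None if no log backups are taken.
    """
    earlier = [t for t in sorted(data_backups) if t <= target]
    if not earlier:
        return None                 # no data backup at or before the target
    base = earlier[-1]
    if target == base:
        return base                 # restoring exactly to a data backup
    if log_backup_until is not None and target <= log_backup_until:
        return base                 # replay logs from the base backup up to the target
    return None                     # point in time not reachable without logs

t1, t2, t3 = 10, 20, 30             # t1, t2: data backups; t3: most recent state
print(restorable(t2, [t1, t2]))                        # data backup alone
print(restorable(25, [t1, t2]))                        # between backups, no logs
print(restorable(25, [t1, t2], log_backup_until=t3))   # point in time with logs
```

With data backups alone only t1 or t2 can be restored; adding log backups makes any point between t1 and t3 reachable, which is the distinction the slide draws.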
High Availability and Disaster Recovery
High availability
- Compute node failure will result in failover to standby node (manual for warm store nodes)
- Handling of a storage failure depends on the storage vendor's inherent disk mirroring and fault tolerance capabilities
- Hot and warm store should use the same storage to facilitate auto-failover in the future
Disaster recovery
- HANA without dynamic tiering supports continuous replication to maintain a disaster recovery site
- HANA with dynamic tiering will maintain a disaster recovery site through backup and restore capabilities only
- Disaster recovery through system replication is planned for a future release
- Disaster recovery through storage replication may be added independently from software releases
[Diagram: hot store (classical HANA services): compute node with auto-failover to a standby node, mirrored storage. Warm store service: compute node with manual failover to a standby node, mirrored storage.]
SAP HANA Multitenant Database Containers
- Each extended store is dedicated to exactly one tenant database
[Diagram: HANA cluster of four compute nodes. Two tenant databases each have their own extended store; one tenant database has no ES.]
Hardware Layout View
Recommended Option: Use Homogeneous Hardware for All Hosts
- HANA clients (DB clients, HANA Studio, ...)
- Networks: (1) client network, (2) intra-node network, (3) storage network for HANA and ES
- HANA system (one SID) on four certified hardware boxes: HANA scale-out Node 1, Node 2, and Standby Node, plus the ES DB node
- Storage for HANA: hot data and redo logs
- Storage for ES: warm data and logs. ES may be added to certified HANA storage, or may use individual storage
- Non-certified storage for /hana/shared/ (binaries, traces, core dumps)
Hardware Layout View
Alternative Option: Use Individual Hardware
- HANA clients (DB clients, HANA Studio, ...)
- Networks: (1) client network, (2) intra-node network, (3) HANA storage network, (4) ES storage network
- HANA system (one SID): three certified hardware boxes for HANA scale-out (Node 1, Node 2, Standby Node) and one non-certified hardware box for the ES DB node
- Certified storage for data and redo logs of HANA (hot data, redo logs)
- Non-certified storage for ES (warm data, logs)
- Non-certified storage for /hana/shared/ (binaries, traces, core dumps)
Use Cases SAP BW and native HANA applications
SAP NetWeaver BW powered by SAP HANA
Data Classification by Object Type
- Data categories in a BW system: BW Operational Data; Archived
- Layers shown: Analytic Mart; Business Transformation; EDW Propagation; EDW Transformation; Staging Layer; Corporate Memory; Archive/NLS
- Usage classes: frequent reporting and/or HANA-native operations; limited reporting, limited HANA-native operations; old, out-of-use data; archive, read-only, different SLAs
2014 SAP SE or an SAP affiliate company. All rights reserved. Public
Extended Tables in HANA BW
Use Case: Staging and Corporate Memory
Object classification in BW
- Data Sources and write-optimized DSOs can have the property Extended Table
- Generated tables are of type Extended
- All BW standard operations are supported, no changes
- Only minor temporary RAM is required in HANA
- InfoCubes and regular or advanced DSOs generate standard column tables
[Diagram: BW system on the SAP HANA database. Staging area: Data Source, PSA table. Corporate memory: write-optimized DSO, active table. Both keep the table schema in the database catalog and the data in the warm store. Data mart: InfoCube, fact table, with table schema and data in the hot store.]
SAP HANA dynamic tiering for Big Data
SAP HANA with Dynamic Tiering provides a native Big Data solution
SAP HANA: hot data
- Cutting edge, in-memory platform
- Transact/analyze in real-time
- Native predictive, text, and spatial algorithms
HANA extended tables: petascale, warm structured data
- Petascale extension to HANA with disk backed, columnar database technology
- Expand HANA capacity with warm/cool structured data in HANA warm store
- Tight integration between HANA hot store and HANA warm store for optimal performance
HANA with Dynamic Tiering
Native Big Data solution for a multitude of use cases
SAP HANA Dynamic Tiering for Big Data: use cases across industries
- Airline route profitability analysis: SAP HANA analyzes revenue, variable operating costs (fuel, landing fees...), and fixed operating costs in real time to make decisions on network, pricing, and marketing to determine where to fly, when, and how often. All data must be analyzed in real time.
- Public utilities: enterprise data stored in SAP HANA and large amounts of smart meter data stored in HANA extended tables, to identify operational problems and establish incentive pricing for more efficient energy use.
- Financial services: stock tick data streamed into SAP HANA for immediate price fluctuation analysis and trading actions, with historical stock price data stored in HANA extended tables for trend analysis and portfolio management.
- Telecommunications: network service data in HANA extended tables analyzed and correlated with customer loyalty data in SAP HANA, to anticipate customer churn and initiate customer retention response activities.
Future Direction Where are we headed?
SAP HANA dynamic tiering roadmap
PLANNED
- SAP HANA dynamic tiering available to be used by any HANA application
- Common installer
- Unified administration and monitoring using HANA Cockpit
- Extended Storage (ES) engine is part of HANA topology
- Single authentication model
- Single licensing model
- Combined error log / trace handling
- Fully integrated backup/restore
FUTURE
- HANA ES host auto-failover (HA)
- SAP HANA system replication for disaster recovery
- Enhanced backup and restore (BACKINT and storage snapshots)
- Hybrid extended tables with rule based automatic data movement / aging
- Further performance optimizations for HANA Calculation Engine
- Series data support in extended tables
- Support of extended tables in Core Data Services (CDS)
Hybrid extended tables
- Single HANA table that spans hot and warm stores
- Hot partitions in HANA memory; remaining partitions in warm store
- Automatic, rules-based, asynchronous data movement between hot and warm stores
[Diagram: hybrid extended table with hot data in the HANA tier and warm data in the warm tier; labels: aging, 2012, regulatory audit]
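Hybrid extended tables are presented here as a future item, and the slides show no syntax for them. The sketch below only illustrates the idea of rules-based placement in plain Python: partitions are keyed by year, and a hypothetical rule keeps the most recent hot_years years in the hot store and assigns everything older to the warm store. The function name place_partitions, the year-based partitioning, and the rule itself are assumptions for illustration, not product behavior.

```python
def place_partitions(partition_years, current_year, hot_years):
    """Map each partition year to 'hot' or 'warm' under a simple age rule."""
    cutoff = current_year - hot_years   # partitions with year <= cutoff are aged out
    return {
        year: "warm" if year <= cutoff else "hot"
        for year in sorted(partition_years)
    }

placement = place_partitions(
    [2010, 2011, 2012, 2013, 2014], current_year=2014, hot_years=2
)
print(placement)
```

In an actual system the movement would run asynchronously in the background, as the slide notes; the function only computes where each partition should end up.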
THANK YOU FOR PARTICIPATING Please provide feedback on this session by completing a short survey via the event mobile application. SESSION CODE: BI474 For ongoing education on this area of focus, visit www.asug.com