Data Storage Options for SAS Applications: SAS Intelligent Storage




Bill Gibson, Chief Technology Officer, SAS Asia Pacific

Presentation Aim
- Outline the architecture of SAS Intelligent Storage.
- Highlight some existing features that you may not know about.
- Compare and contrast SAS storage with a traditional RDBMS.
- Outline future directions for SAS storage.

Agenda: SAS Intelligent Storage
- Overview
- SAS Tables Features
- SAS Storage Architecture
- Data Servers
- Open Access to SAS Data
- OLAP Storage
- RDBMS Storage
- Metadata (outside scope)
- Conclusion

Intelligent Storage
Data storage is about reactive capture. Intelligent storage is about proactive exploitation.
[Diagram: data sources (transaction systems, web logs, demographic data, handheld devices, surveys, call center data, campaign data, market analysis) feed a customer-focused data warehouse, which supports these uses: highlight KPIs, compare, trend, forecast, identify, manage campaigns, predict.]

Intelligent Storage Types Based on Business Needs
Business need:
- Sorting, ranking
- Compare/contrast
- Trends, pattern, predict
- Historical
- Realtime
- Mix structured/unstructured
Storage type:
- Relational
- HOLAP
- Parallel
- Nearline / offline
- Standards support
- Metadata/portal

SAS Intelligence Storage Design Aims
- Proactive exploitation
- Scalable for data AND intelligence requirements
- Usability
- Manageability
- Interoperability
- Optimised for warehousing and decision support:
  - Bulk, single-user loading in scheduled windows
  - Read-only queries processing large amounts of data
  - Focus on intelligence applications, not OLTP

Intelligence Storage Considerations
[Chart: storage options plotted by size (mega, giga, tera) against level of data (detail to summary). Relational: SAS tables, SPD Server, parallel SAS. Multidimensional: MDDB, HOLAP.]

SAS Relational Storage: Overview
- Relational: based around tables
- Accessed via SQL and the SAS 4GL
- SAS offers much richer data manipulation than pure SQL: set-based and procedural logic
- Multiple physical storage formats via the Multiple Engine Architecture
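To make the contrast between set-based and procedural logic concrete, here is a minimal sketch. The table WORK.SALES and its columns REGION and AMOUNT are hypothetical and were not part of the original slides.

```sas
/* Hypothetical sample data: WORK.SALES is an assumption for illustration */
data work.sales;
   input region $ amount;
   datalines;
East 100
East 250
West 75
West 300
;
run;

/* Set-based: one SQL statement produces a total per region */
proc sql;
   create table work.region_totals as
   select region, sum(amount) as total
   from work.sales
   group by region;
quit;

/* Procedural: the DATA step reads rows in order and carries state forward */
proc sort data=work.sales;
   by region;
run;

data work.running;
   set work.sales;
   by region;
   if first.region then running_total = 0;   /* reset at each new region */
   running_total + amount;                   /* sum statement retains its value across rows */
run;
```

A row-by-row running total like this is awkward in plain SQL but takes a single sum statement in the DATA step.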

SAS Multiple Engine Architecture
- Decouples applications from physical storage.
- Transparent access to many physical data stores.
- View engines for virtual tables:
  - SQL view
  - DATA step view
- Foreign engines for:
  - Other database tables
  - XML
  - ERP
  - Flat files
[Diagram: SAS applications call the engine supervisor, which dispatches to engines (Base, Remote, Tape, ..., SASSPDS); each engine accesses its own data on the host.]
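The two kinds of view engine can be sketched as follows. This is an illustrative example only; the table WORK.ORDERS and its columns are assumptions, not taken from the presentation.

```sas
/* Hypothetical base table */
data work.orders;
   input order_id amount cost;
   datalines;
1 1500 900
2 400 250
3 2200 1800
;
run;

/* SQL view: a stored query, evaluated each time the view is read */
proc sql;
   create view work.big_orders as
   select order_id, amount
   from work.orders
   where amount > 1000;
quit;

/* DATA step view: stored DATA step logic, also evaluated on read */
data work.order_margin / view=work.order_margin;
   set work.orders;
   margin = amount - cost;
run;

/* Both views are used like ordinary tables */
proc print data=work.big_orders;
run;

proc print data=work.order_margin;
run;
```

Because the application only names a table, it does not need to know whether the data is stored or computed when read.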

Base Engine
Full functionality of the SAS table:
- Supports all the functionality required by SAS statements and procedures.
- Creates, maintains, and uses indexes.
- Supports compressed observations.
- Enforces integrity constraints.
- Creates audit trails.
- Cross-Environment Data Access (CEDA) without a remote server.
- Open via ODBC.
- Special features:
  - Generations of tables
  - In-memory tables
[Diagram labels: Local, File Server]
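A few of these Base engine features can be sketched in a short program. The table WORK.CUSTOMERS and the constraint name are hypothetical; this shows the kind of syntax involved rather than a recommended design.

```sas
/* Create a compressed table with a simple index on CUST_ID */
data work.customers(compress=yes index=(cust_id));
   input cust_id name $;
   datalines;
1 Ann
2 Bob
3 Cho
;
run;

/* Add an integrity constraint: NAME may not be missing */
proc datasets library=work nolist;
   modify customers;
   ic create name_required = not null (name);
quit;

/* A WHERE clause on the indexed column is a candidate for index use */
data work.one_customer;
   set work.customers;
   where cust_id = 2;
run;
```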

Physical Limits on SAS Tables (V8)
- Max columns: 32,767
- Max row length: 16,777,191
- Max rows: virtually unlimited, operating-system dependent
- Max table size: limited by the operating system (4 giga-gigabytes on NT with NTFS, 4 TB on VMS,

When Is Data Too Big for a Standard SAS Table?
- No hard rules: it depends on the environment and the application.
- Answer: when the time to process it doesn't meet user needs!
- Note that it is table size, not database size, that is the issue. Many SAS databases are in the terabyte range.
- A standard SAS table is a single physical file, so speed is limited by the I/O subsystem.
- Rule of thumb: over 10 GB per table starts to get big. Processing 10 GB takes 10-30 minutes on small hardware.

V9 New Features
- Scalable partitioned data format (SPDE) available:
  - 2 G columns
  - 2^63 rows
  - Adequate for most needs
- Support for threaded procedures
- Pipe engine for scalable cooperative processing

The SAS Storage Model
- Tables
  - Columns and rows
  - Embedded metadata in the table header
- Libraries
  - Collections of tables, typically in a single directory (except on OS/390)
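As a small illustration of the model, a library can be assigned to a directory and the metadata embedded in each table header can be listed. The directory path and the table PROJ.ORDERS are hypothetical.

```sas
/* Assign a library to a directory (hypothetical path, must already exist) */
libname proj '/data/project';

/* Create a small table in that library; it becomes one file in the directory */
data proj.orders;
   input order_id amount;
   datalines;
1 1500
2 400
;
run;

/* List the tables in the library without the per-table detail */
proc contents data=proj._all_ nods;
run;

/* Show the header metadata (columns, types, lengths) for one table */
proc contents data=proj.orders;
run;
```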

SAS Storage Model: Terminology

SAS Term       RDBMS Term
Data set       Table
Observation    Row
Variable       Column
Library        Logical schema or table space?
?              Database
Missing        Null

General Purpose RDBMS Architecture
[Diagram: users/applications connect to a database server (DBMS), which owns the database files and runs on the operating system.]
- All users share a single DBMS server.
- The DBMS server manages all physical data access, concurrency, rollback/recovery, constraints, and security.
- Only the DBMS server understands the proprietary database file structure.

SAS Multi-Engine Architecture
[Diagram: SAS application, engine supervisor, Engine 1 and Engine 2, operating system.]
- Each user has their own copy of SAS.
- Each library has an engine assigned, depending on the library type.
- The engine understands the table structure for that library.
- The operating system sees each table as an individual file and controls:
  - Access to tables
  - Physical data management and security
  - Backup and recovery

SAS Multi-Engine Architecture: Multiple Users
[Diagram: two SAS applications, each with its own engine supervisor, Engine 1 and Engine 2, over a shared operating system.]
- The operating system manages:
  - Locking at table level
  - Multi-user buffering

Multi-Engine Architecture (Base SAS Engine)
Advantages
- Simplicity
- Leverages OS features: openness, buffering, security, backup/archiving
- Multiple engines / storage formats
Disadvantages
- OS cannot see inside tables
- No row-level security or backup/rollback
- Less portable?
- Table = single physical file
- Less scalable?

SAS Internal Table Organisation
- Fixed-length storage for numeric and character columns, giving fixed-length rows (Note 1)
- No row header information
- Rows stored sequentially:
  - Any column can be accessed by its offset from the row start.
  - Any row can be accessed by its offset from the table start, in any order.
[Diagram label: Metadata]
Note 1: Provided the default features are specified (REUSE=NO, COMPRESS=NO).
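Access to a row by its position can be sketched with the POINT= option of the SET statement, which reads observations directly by number. The table WORK.WIDE is a hypothetical example built only for this sketch.

```sas
/* Build a hypothetical uncompressed table with 1000 rows */
data work.wide;
   do id = 1 to 1000;
      value = id * 2;
      output;
   end;
run;

/* Read three rows directly by observation number, in arbitrary order */
data work.sample;
   do rownum = 10, 500, 3;
      set work.wide point=rownum;   /* jump straight to that row */
      output;
   end;
   stop;   /* needed: direct access never reaches end of file */
run;
```

Direct access like this relies on the fixed-length layout described in Note 1.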

SAS Internal Table Organisation: Benefits
- Very short CPU paths
- Efficient I/O for accessing subsets by row number
- Very rapid sequential access
- Rapid access to any column in wide tables, making wide tables manageable:
  - Many databases have small limits on column numbers (Oracle 8: 1000 columns).
  - Wide tables are needed for tasks such as data mining.

Database Internals: Oracle (as an Example)

Oracle Structure
"Everything should be made as simple as possible, but not simpler." (Einstein)

Why Do SAS Tables Perform So Well?
- Many documented references (SEUGI papers) show SAS processing, especially loading and sequential processing for summarisation and subsetting, greatly outperforming other DBMSs.
- Understanding the internals helps explain why:
  - Simple structure designed specifically for intelligence
  - 25 years of optimisation

SAS Database Servers
- Usually the engines in Base SAS provide all the database services that are needed.
- Network file services and CEDA provide some client/server functions.
- Specialist servers are available if required:
  - SAS/Share
  - SPD Server

SAS Database Servers: SAS/Share
[Diagram: SAS Application 1 and SAS Application 2 each connect through their own engine supervisor and Share engine to the SAS Share server, which has its own engine supervisor and engines (Engine 1, Engine 2) on the operating system.]

SAS/Share
- Provides multi-user row-level locking for SAS tables
- Supports the Multi-Engine Architecture
- Can serve up data from SAS tables and other data sources
- Can process cross-source SQL joins etc.

    proc sql;
       connect to remote (server=vegemite.shr9);
       select * from connection to remote
          ( /* join SQL statement */ );
       disconnect from remote;
    quit;

SAS Database Servers Summary: SAS/Share
- Lightweight OLTP server, row-level concurrent update
- Control files, metadata
- Low-volume SQL query server
- Standard SAS tables, other engines
- Hybrid joins
- Supports ODBC clients (V8)
Don't use Share if you can use a network file system for read-only access by SAS clients.

SPD Server: Gigabyte-Terabyte per Table Storage
- SMP parallel processing
- Advanced indexing
- Read / Write / Alter / Control permissions
- Universal / Group / Individual access rights
- Row- and column-level security
- Login and data encryption

SAS Database Servers: SPD Server
[Diagram: SAS Application 1 and SAS Application 2 each connect through their engine supervisor and SPDS engine, over TCP, to the SPD Server. An SPD proxy process runs SPDS threads 1 to n on the operating system.]
- Each table is partitioned.
- Parallel processing gives higher throughput.

Case Study: Telecom Italia Mobile
TIM's huge customer base meant that performance was a critical issue in the company's choice of an analytical CRM vendor. TIM processes about 100 million call records per day and has built a 3-terabyte SAS data warehouse that is accessed by SAS analytical CRM software. "SAS met our criteria for scalability and performance," says Cardone.

V9 SPDE Engine
- Single-user parallel partitioned engine
- Derived from SPDS
- Does not include the multiple-user, management and administration features of SPDS
- File format compatible with SPDS
- Great match for the new scalable procedures
- Needs multiple CPUs and filesystems to perform
[Diagram: V9 Base SAS applications call the supervisor, which dispatches to engines (Tape, Remote, Base, ..., SPDE); the SPDE engine spreads its data across several data files on the host.]

SAS Database Servers Summary: SPD Server/SPDE
- Heavyweight warehouse engine
- Very large table support (1-100+ GB)
- Requires multiple CPUs and filesystems to perform
See other SEUGI papers for more information.
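The need for multiple filesystems can be illustrated with an SPDE library assignment that spreads data partitions over several paths. All directory names and the table WORK.CALLS are hypothetical, and the partition size shown is an arbitrary choice, not a tuning recommendation.

```sas
/* Assign an SPDE library: metadata in one directory, data partitions
   spread across several filesystems, indexes on another (all paths
   are hypothetical and must already exist) */
libname dw spde '/spde/meta'
   datapath=('/disk1/spde' '/disk2/spde' '/disk3/spde')
   indexpath=('/disk4/spde')
   partsize=256;   /* partition size in megabytes */

/* Load a table; WORK.CALLS is an assumed source table */
data dw.calls;
   set work.calls;
run;
```

With partitions on separate devices, a threaded read can draw on several I/O paths at once, which is why a single CPU or a single filesystem limits the benefit.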

Open Access to SAS Data
Tables
1. Universal ODBC Driver (no SAS required)
2. ODBC, JDBC & OLE DB & Local SAS Data Provider
3. ODBC, JDBC & OLE DB & SAS Server (servers: SHARE, SPD, IOM)
4. Exporting
   1. SAS XML Engine
   2. SAS/Access engines for DBMSs and PC file formats
MDDB
- SAS Open OLAP Server: access for any client that supports the OLE DB for OLAP standard.

Open Data Access Summary
- Universal ODBC: great for reading single SAS tables sequentially. No SQL, no indexed WHERE. Data source setup required per table. Remote data accessed via FTP/HTTP.
- Local/Share and ODBC: great for accessing collections of tables. SQL and indexed WHERE supported. Remote data transparently accessed. Any SAS library accessible, including DBMS libraries.
Note: ODBC or OLE DB is also available for SPD Server data.

Storing Data in Non-SAS Formats
- The SAS Multiple Engine Architecture in V8 allows LIBNAME-based read/write/update to most relational databases.
- SAS/Access licence(s) needed.
- SAS/Access provides:
  - Transparent reading from relational data stores
  - SQL queries to relational data stores
  - Sophisticated and transparent writing/update

It's as Simple as a Libname

    libname ora oracle user=scott pass=xxxxx path='sas1';
    NOTE: Libref ORA was successfully assigned as follows:
          Engine:        ORACLE
          Physical Name: sas1

SAS/Access Features
- Many query optimisations
- Lots of V8-8.2 enhancements
- 4GL DATA step interfaces (keyed access using indexes)
- Bulk-load interfaces for high-volume DW loading (not all DBMSs)
- V9: parallel access!
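A minimal sketch of reading and bulk loading through a SAS/Access library follows. It reuses the Oracle libref from the earlier slide; the tables ORA.ORDERS and WORK.SUMMARY and the column ORDER_YEAR are hypothetical, and bulk-load behaviour depends on the DBMS and its client setup.

```sas
/* Same style of assignment as on the earlier slide */
libname ora oracle user=scott pass=xxxxx path='sas1';

/* Read through the libref: the engine can pass suitable WHERE
   clauses to the database instead of fetching every row */
data work.recent;
   set ora.orders(where=(order_year = 2002));
run;

/* Write with the bulk-load interface for high-volume loading */
data ora.order_summary(bulkload=yes);
   set work.summary;
run;
```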

Summary Storage: SAS MDDBs
Mega-Gigabyte summary storage
- Scalable
- Flexible security
- Base of HOLAP

Summary Storage: Hybrid OLAP
100s of MB to 100s of GB
- Data stored in multiple MDDBs
- SAS tables
- SPD tables
- RDBMS tables
- See other SEUGI papers
[Diagram: on the server, a HOLAP data provider draws on MDDBs, SAS, DB2 and SQL Server sources and serves a SAS client.]

Conclusion
So, what is the SAS database?

Sample SAS Application Storage Architecture
- Design the storage to meet the application's needs.
- Source data read from operational applications using SAS/Access
- Application metadata (control tables) in SAS/Share-controlled tables
- Working storage in SAS tables on the application server
- Main detail store in SPDS tables
- Reporting data in OLAP MDDBs
- Analysts' personal data in SAS tables on personal laptops, for offline access

Putting It Together
[Diagram: data from databases, data from a remote SAS server, and local data.]

Putting It Together: Libnames Are the Key

What Is the SAS Database? Intelligent Storage for SAS Applications!
- A distributed virtual database
  - Defined by the LIBNAMEs in effect
  - Central servers optional, reducing bottlenecks
- Optimised for intelligence applications
- Open via:
  - ODBC/JDBC, OLE DB
  - SQL library
  - XML interchange
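The idea of a virtual database defined by the LIBNAMEs in effect can be sketched as below. Every path, server name, table and column here is a placeholder chosen for illustration (the Share server name and Oracle credentials simply echo the earlier slides), and each engine requires the corresponding SAS product to be licensed.

```sas
/* Each libref points at a different physical store (all hypothetical) */
libname local 'C:\analyst\data';                            /* Base engine, local disk  */
libname ctl   '/apps/control' server=vegemite.shr9;         /* via a SAS/Share server   */
libname ora   oracle user=scott pass=xxxxx path='sas1';     /* via SAS/Access to Oracle */
libname xout  xml '/exchange/segment.xml';                  /* XML engine for export    */

/* One query joins tables from two different stores */
proc sql;
   create table local.my_segment as
   select c.cust_id, c.segment, p.status
   from ora.customers as c, ctl.campaign_status as p
   where c.cust_id = p.cust_id;
quit;

/* Export the result as XML for interchange */
data xout.my_segment;
   set local.my_segment;
run;
```

The application code only names librefs, so changing where a table lives means changing a LIBNAME statement rather than the program logic.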

Summary
- SAS Intelligent Storage is designed specifically for intelligence applications.
- It provides an efficient and cost-effective solution, with low overhead.
- It is being continually improved and refined.

Questions?