SAS Programming Tips, Tricks, and Techniques

Similar documents
Five Little Known, But Highly Valuable, PROC SQL Programming Techniques. a presentation by Kirk Paul Lafler

Simple Rules to Remember When Working with Indexes Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California

Exploring DATA Step Merges and PROC SQL Joins

DATA Step versus PROC SQL Programming Techniques

Demystifying PROC SQL Join Algorithms Kirk Paul Lafler, Software Intelligence Corporation

Using SAS as a Relational Database

CHAPTER 1 Overview of SAS/ACCESS Interface to Relational Databases

Using Macros to Automate SAS Processing Kari Richardson, SAS Institute, Cary, NC Eric Rossland, SAS Institute, Dallas, TX

Powerful and Sometimes Hard-to-find PROC SQL Features

Fun with PROC SQL Darryl Putnam, CACI Inc., Stevensville MD

Efficiency Techniques for Beginning PROC SQL Users Kirk Paul Lafler, Software Intelligence Corporation

Dynamic Dashboards Using Base-SAS Software

Top Ten SAS Performance Tuning Techniques

Integrity Constraints and Audit Trails Working Together Gary Franklin, SAS Institute Inc., Austin, TX Art Jensen, SAS Institute Inc.

Eliminating Tedium by Building Applications that Use SQL Generated SAS Code Segments

Using the SQL Procedure

Big Data, Fast Processing Speeds Kevin McGowan SAS Solutions on Demand, Cary NC

Chapter 1 Overview of the SQL Procedure

Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA.

Same Data Different Attributes: Cloning Issues with Data Sets Brian Varney, Experis Business Analytics, Portage, MI

Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC

Downloading, Configuring, and Using the Free SAS University Edition Software

Search and Replace in SAS Data Sets thru GUI

Using DATA Step MERGE and PROC SQL JOIN to Combine SAS Datasets Dalia C. Kahane, Westat, Rockville, MD

Managing Tables in Microsoft SQL Server using SAS

Effective Use of SQL in SAS Programming

OBJECT_EXIST: A Macro to Check if a Specified Object Exists Jim Johnson, Independent Consultant, North Wales, PA

ing Automated Notification of Errors in a Batch SAS Program Julie Kilburn, City of Hope, Duarte, CA Rebecca Ottesen, City of Hope, Duarte, CA

An Introduction to SAS/SHARE, By Example

One problem > Multiple solutions; various ways of removing duplicates from dataset using SAS Jaya Dhillon, Louisiana State University

Transferring vs. Transporting Between SAS Operating Environments Mimi Lou, Medical College of Georgia, Augusta, GA

Oracle Database 10g: Introduction to SQL

Overview. NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT

A Method for Cleaning Clinical Trial Analysis Data Sets

Utilizing Clinical SAS Report Templates Sunil Kumar Gupta Gupta Programming, Thousand Oaks, CA

A Macro to Create Data Definition Documents

Parallel Data Preparation with the DS2 Programming Language

Physical Database Design Process. Physical Database Design Process. Major Inputs to Physical Database. Components of Physical Database Design

Analyzing the Server Log

Efficient Techniques and Tips in Handling Large Datasets Shilong Kuang, Kelley Blue Book Inc., Irvine, CA

Building a Better Dashboard Using Base SAS Software

Performing Queries Using PROC SQL (1)

Performance Tuning for the Teradata Database

Oracle 10g PL/SQL Training

Instant SQL Programming

Improving Maintenance and Performance of SQL queries

AN INTRODUCTION TO THE SQL PROCEDURE Chris Yindra, C. Y. Associates

Technical Paper. Defining an ODBC Library in SAS 9.2 Management Console Using Microsoft Windows NT Authentication

THE POWER OF PROC FORMAT

SAS PROGRAM EFFICIENCY FOR BEGINNERS. Bruce Gilsen, Federal Reserve Board

SQL SUBQUERIES: Usage in Clinical Programming. Pavan Vemuri, PPD, Morrisville, NC

9.1 SAS. SQL Query Window. User s Guide

More Tales from the Help Desk: Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

Subsetting Observations from Large SAS Data Sets

Managing very large EXCEL files using the XLS engine John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT

SAS/IntrNet 9.4: Application Dispatcher

Chapter 2 The Data Table. Chapter Table of Contents

FINDING ORACLE TABLE METADATA WITH PROC CONTENTS

FHE DEFINITIVE GUIDE. ^phihri^^lv JEFFREY GARBUS. Joe Celko. Alvin Chang. PLAMEN ratchev JONES & BARTLETT LEARN IN G. y ti rvrrtuttnrr i t i r

Creating HTML Output with Output Delivery System

Normalizing SAS Datasets Using User Define Formats

Top Ten SAS DBMS Performance Boosters for 2009 Howard Plemmons, SAS Institute Inc., Cary, NC

SAS Views The Best of Both Worlds

Paper An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois

Tips and Tricks for Creating Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS. Vincent DelGobbo, SAS Institute Inc.

The prerelease version of SQL was called SEQUEL (for Structured English Query Language), and some people still pronounce SQL as sequel.

Debugging Complex Macros

Introduction to Proc SQL Steven First, Systems Seminar Consultants, Madison, WI

Make Better Decisions with Optimization

SAS Credit Scoring for Banking 4.3

Katie Minten Ronk, Steve First, David Beam Systems Seminar Consultants, Inc., Madison, WI

From The Little SAS Book, Fifth Edition. Full book available for purchase here.

SQL Query Evaluation. Winter Lecture 23

Alternative Methods for Sorting Large Files without leaving a Big Disk Space Footprint

DBMS Questions. 3.) For which two constraints are indexes created when the constraint is added?

ODBC Chapter,First Edition

You have got SASMAIL!

Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

Oracle Database: SQL and PL/SQL Fundamentals

Integrating SAS and Excel: an Overview and Comparison of Three Methods for Using SAS to Create and Access Data in Excel

Statistics and Analysis. Quality Control: How to Analyze and Verify Financial Data

SAS University Edition: Installation Guide for Windows

SAS/ACCESS 9.3 Interface to PC Files

Databases What the Specification Says

Data Presentation. Paper Using SAS Macros to Create Automated Excel Reports Containing Tables, Charts and Graphs

Transcription:

SAS Programming Tips, Tricks, and Techniques A presentation by Kirk Paul Lafler

Copyright 2001-2012 by Kirk Paul Lafler, Software Intelligence Corporation All rights reserved. SAS is the registered trademark of SAS Institute Inc., Cary, NC, USA. All other company and product names mentioned are used for identification purposes only and may be trademarks of their respective owners.

Presentation Objectives - Explore Useful SAS System Options Interesting PROC SQL Options A Userdefined Function using PROC FCMP Integrity Constraints for Tables A Userdefined Dictionary Table and SASHELP View Tool

Example Datasets / Tables Movies Actors

Exploring SAS System Options

SOURCE versus SOURCE2 Primary Source Statements: OPTIONS SOURCE2; %INCLUDE c:\workshops\logcontroloptions.sas ; PROC PRINT DATA=MOVIES NOOBS; VAR TITLE RATING CATEGORY; RUN; Secondary Source Statements (Included Code): OPTIONS MSGLEVEL=I ; /* DISPLAY SORT, MERGE PROCESSING, AND INDEX USAGE */;

SOURCE versus SOURCE2 Log Results SAS Log Results: OPTIONS SOURCE2; %INCLUDE 'c:\workshops\logcontroloptions.sas'; NOTE: %INCLUDE (level 1) file c:\workshops\logcontroloptions.sas is file c:\workshops\logcontroloptions.sas. +OPTIONS MSGLEVEL=I +/* DISPLAY SORT, MERGE PROCESSING, AND INDEX USAGE */ +; NOTE: %INCLUDE (level 1) ending. PROC PRINT DATA=MOVIES NOOBS; VAR TITLE RATING CATEGORY; RUN;

Exploring PROC SQL Options

SQL Join Algorithms Nested Loop (aka Brute force) join algorithm Sort-merge join algorithm Index join algorithm Hash join algorithm

Influencing the SQL Optimizer

Specifying MAGIC=101 PROC SQL MAGIC=101; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; QUIT; SAS Log Results PROC SQL MAGIC=101; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; NOTE: PROC SQL planner chooses sequential loop join. QUIT;

Specifying MAGIC=102 PROC SQL MAGIC=102; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; QUIT; SAS Log Results PROC SQL MAGIC=102; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; NOTE: PROC SQL planner chooses merge join. QUIT;

Specifying MAGIC=103 PROC SQL MAGIC=103; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; QUIT; SAS Log Results PROC SQL MAGIC=103; SELECT * FROM MOVIES, ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; NOTE: PROC SQL planner chooses merge join. NOTE: A merge join has been transformed to a hash join. QUIT;

Specifying IDXWHERE=Yes The IDXWHERE= dataset option can be specified to influence the SQL optimizer to use the most efficient index available (if one exists) to execute a query.

Specifying IDXWHERE=Yes OPTIONS MSGLEVEL=I; PROC SQL; SELECT * FROM MOVIES(IDXWHERE=Yes), ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; QUIT; SAS Log Results PROC SQL; SELECT * FROM MOVIES(IDXWHERE=Yes), ACTORS WHERE MOVIES.TITLE = ACTORS.TITLE; INFO: Index Rating selected for WHERE clause optimization. QUIT;

Exploring a Userdefined Function using PROC FCMP

PROC FCMP Advantages User-Defined functions created with the FCMP procedure provide specific advantages: Code can be easier to read, write and modify Code is modular and callable Code is independent and not affected by its implementation Code is reusable by any program that has access to the dataset where the function is stored

User-defined Function (Part 1) /* Constructing a User Defined Function in FCMP */ proc fcmp outlib=sasuser.myfunctions.examples ; function Age_of_Movie_function(year) ; if year NE. then do ; Age_of_Movie = year(today()) year ; end ; return(age_of_movie); endsub ; quit ;

User-defined Function (Part 2) options cmplib=sasuser.myfunctions ; /* Call the User Defined Function created in FCMP */ data Age_of_Movie ; set mydata.movies ; Age_of_Movie = Age_of_Movie_function(year) ; put Year= Age_of_Movie= ; run ;

User-defined Function (Log Results)

Exploring Integrity Constraints for Tables

Methods of Building Integrity Data integrity problems such as missing information, duplicate values, and invalid data values can affect user confidence in a database environment. The objective is to establish rules in the database table(s) to safeguard and protect data. Application programs Older / less reliable Database table environment Newer / more reliable PROC DATASETS PROC SQL

Column and Table Constraints Data integrity is maintained by assigning column and table constraints. Modifications made through update and delete operations can have referential integrity constraints built into the database environment. Column and Table Constraints NOT NULL UNIQUE CHECK

NOT NULL Constraint To prevent null values from appearing in any row of a table for a specified column, a NOT NULL constraint can be coded. PROC SQL; CREATE TABLE work.rental_info (TITLE CHAR(30) NOT NULL, RENTAL_AMT NUM FORMAT=DOLLAR6.2); QUIT;

UNIQUE Constraint The UNIQUE constraint prevents rows containing duplicate values for a specified column from being added to a table. PROC SQL; CREATE TABLE work.rental_info (TITLE CHAR(30) UNIQUE, RENTAL_AMT NUM FORMAT=DOLLAR6.2); QUIT;

CHECK Constraint A CHECK constraint can be specified to assign specific rules that a column must adhere to. PROC SQL; ALTER TABLE MOVIES ADD CONSTRAINT CHECK_RATING CHECK (RATING IN ( G, PG, PG-13, R )); QUIT;

Exploring a Userdefined Tool using Dictionary Tables

Dictionary Tables / SASHELP Views SAS collects information about a session Session information is captured as read-only content Tables are accessible using PROC SQL Specify table in FROM clause of a SELECT DICTIONARY libref is automatically assigned SASHELP Views are accessed in a DATA step or with any of your favorite PROCs

Viewing Dictionary Tables / Views # of DICTIONARY Tables/Views: 22 in SAS 9.1 29 in SAS 9.2 30 in SAS 9.3

Cross-reference Listing PROC PRINT PROC PRINT DATA=SASHELP.VCOLUMN NOOBS; VAR LIBNAME MEMNAME NAME TYPE LENGTH; WHERE UPCASE(LIBNAME)=UPCASE( SASUSER") AND UPCASE(NAME)=UPCASE("TITLE"); RUN; Cross-reference listing for the column TITLE Library Column Column Name Member Name Column Name Type Length SASUSER ACTORS Title char 30 SASUSER MOVIES Title char 30

Cross-reference Listing PROC PRINT %MACRO CROSSREF(LIB, COLNAME); PROC PRINT DATA=SASHELP.VCOLUMN NOOBS; VAR LIBNAME MEMNAME NAME TYPE LENGTH; WHERE UPCASE(LIBNAME)=UPCASE("&LIB") AND UPCASE(NAME)=UPCASE("&COLNAME"); RUN; %MEND CROSSREF; %CROSSREF(SASUSER,TITLE);

Cross-reference Listing PROC PRINT PROC PRINT DATA=SASHELP.VCOLUMN NOOBS; VAR LIBNAME MEMNAME NAME TYPE LENGTH; WHERE UPCASE(LIBNAME)="SASUSER" AND UPCASE(NAME)="TITLE"; RUN; Cross-reference listing for the column TITLE Library Column Column Name Member Name Column Name Type Length SASUSER ACTORS Title char 30 SASUSER MOVIES Title char 30

Conclusion User-defined Tool using Dictionary Tables Useful SAS System Options PROC SQL Options Integrity Constraints for Tables User-defined Function using PROC FCMP

Thank You for Attending! Questions? A presentation by Kirk Paul Lafler KirkLafler@cs.com @sasnerd