Developing Applications Using BASE SAS and UNIX




Joe Novotny, GlaxoSmithKline Pharmaceuticals, Inc., Collegeville, PA

ABSTRACT

How many times have you written simple SAS programs to view the contents of a SAS dataset, determine a frequency count of a variable or create SAS transport files? What if you could perform these tasks with a few simple keystrokes from the UNIX command line? This paper highlights several simple BASE SAS techniques that allow you to take advantage of SAS's ability to interface with UNIX. The paper demonstrates practical applications of: 1) reading the UNIX command line into a SAS program, 2) printing SAS output directly to the UNIX terminal screen and 3) techniques that allow you to utilize UNIX information and commands from within SAS programs. These techniques will help you automate many everyday tasks, increase your programming productivity and provide the basis for developing powerful applications.

INTRODUCTION

Many companies have chosen UNIX as the operating platform for SAS code development. Along with the benefits of using the UNIX system itself, SAS offers many techniques for utilizing UNIX functionality within the SAS language, which enable programmers to efficiently transfer useful information between SAS and UNIX systems. This paper discusses a number of these techniques and demonstrates practical applications using them. Topics covered include: 1) piping UNIX command line information into a SAS DATA step using the INFILE statement, 2) using the FILENAME statement with the TERMINAL argument and PROC PRINTTO to route SAS output directly to the UNIX terminal, 3) executing UNIX commands from within a SAS program using the X statement, the CALL SYSTEM routine and the %SYSEXEC MACRO statement, 4) using UNIX environment variables within SAS programs.

BACKGROUND AND ASSUMPTIONS

1. I assume readers are familiar with basic concepts of the UNIX environment (e.g., UNIX command line, basic UNIX commands, directory structures, environment variables, the keyboard as standard input, the terminal screen as standard output, etc.) or at least have an interest in learning about them. I do not assume readers are power users or shell scripting gurus. You will benefit if you are looking to augment your understanding of how SAS and UNIX can communicate. The focus is on how SAS can utilize UNIX information to facilitate your SAS programming.

2. I assume readers have an intermediate or greater level of understanding of Base SAS and SAS MACRO.

3. Unless otherwise noted, the UNIX command line examples in this paper (denoted with the greater-than sign, >) are run using tcsh shell syntax to interface with UNIX. Tcsh is a C shell variant. Some UNIX commands may have slightly different syntax in other UNIX shells such as Korn, Bash, etc., although most commands referenced in this paper are basic commands such as ls -l.

PIPING COMMAND LINE INFORMATION INTO YOUR SAS PROGRAMS AND SENDING OUTPUT TO THE TERMINAL

PROBLEM: How many times have you had to write and run short SAS programs to determine the contents of a SAS data set or determine a simple frequency count of a variable? Over the lifespan of a project you may need to remind yourself of variable names, data types, lengths, labels, etc. numerous times. You are probably not making the best use of your time if you spend much of it opening up tmp.sas and typing something similar to the following:

    libname mylib '/home/userid/mydata';
    proc contents data=mylib.mydsname;
    run;

You then check that your tmp.log file contains no ERROR: or WARNING: messages, open up tmp.lst and scroll down to search for the variable you are looking for. This seems a small task. But add it up for the variables on each data set you use, perhaps many times over the lifespan of a project, and it becomes a tedious component of our job. Surely there is a better way.

SOLUTION 1: One way to avoid this repetitive work is to write a simple little macro that does three basic things: 1) reads what you type at the UNIX command line into a SAS program, 2) does the SAS work for you and 3) sends the output to your terminal screen. After the initial code development, all of this happens after typing a few words and hitting enter. The example macro contents.sas below performs these operations. In the example, I simply type the following at the UNIX command prompt:

> echo "mydsname" | sas contents

and the contents macro does the rest.

     1  %macro contents;
     2
     3  data _null_;
     4    infile stdin;
     5    length ds $ 200;
     6    input ds;
     7    call symput("ds",compress(ds));
     8  run;
     9
    10  libname tmpcont '.';
    11
    12  proc contents data=tmpcont.&ds. noprint out=tmpcont;
    13  run;
    14
    15  filename term terminal;
    16
    17  proc printto new print=term;
    18  run;
    19  proc print data=tmpcont noobs;
    20    var memname nobs name type length label;
    21  run;
    22
    23  proc printto;
    24  run;
    25  %mend contents;
    26  %contents;

Line 4 uses the INFILE statement to read in UNIX standard input.

Line 7 uses the CALL SYMPUT routine to create a macro variable containing the name of my data set, in this case mydsname, passed from the command line. I can then use this macro variable within the program to refer to the data set of interest.

Line 10 assigns a LIBNAME to the current directory. (Note that the code then functions only when run in the same directory as the existing data set. I'll show one way to increase flexibility by using a UNIX shell script later in the paper.)

Line 12 uses the CONTENTS procedure to generate a working data set containing the contents information about the permanent data set.

Line 15 uses the FILENAME statement to assign a FILEREF to the terminal screen for use as our output destination later.

Line 17 uses the PRINTTO procedure to send all printed output to the term FILEREF assigned previously.

Lines 19-21 use the PRINT procedure to display the required information.

Line 23 calls the PRINTTO procedure with no options, which restores the default print destination.

To increase this program's flexibility, I use a simple UNIX shell script to call the SAS MACRO from any directory. (Note that this still assumes the data set exists in the current directory and that the directory holding the shell script is found in your UNIX $PATH variable.) This ensures that program functionality is no longer dependent on the SAS program and the SAS data set residing in the same directory, and allows you to type the following at the UNIX command line:

> contents mydsname

and receive the requested information printed directly to the UNIX terminal screen. Code for the UNIX shell script named contents is presented below:

     1  #! /bin/ksh
     2
     3  if (( $# != 1 ))
     4  then
     5    echo
     6    echo Please enter the name of a single data set from the current directory\.
     7    echo
     8  else
     9    echo $* | sas $HOME/code/contents -log /tmp
    10    rm -f /tmp/contents.log
    11  fi

Line 1 establishes that the shell language to be used is the Korn shell.

Lines 3-7 ensure only one data set is passed to the script. $# resolves to the number of arguments passed from the command line to the shell script (the name of the script itself is not counted, so in the example above $# resolves to 1).

On line 9, $* resolves to all of the arguments passed to the script [again, the script name itself is not included, so in this example $* resolves to the text string "mydsname" (without the double quotes)]. The script pipes this text into the command that runs SAS on the contents.sas program residing in the user's $HOME/code directory. It also sends the SAS log to the /tmp directory (note that you must have write access to the /tmp directory).

Line 10 cleans up the log file produced by the SAS program. During code development, add this line only after you have verified that no further debugging is needed.

Line 11 ends the if statement started on line 3.

SOLUTION 2: To simplify the SAS program from Solution 1 using another of SAS's UNIX interface capabilities, the SYSPARM option can be used when invoking SAS. Using this option populates the automatic macro variable SYSPARM with the text enclosed in quotes (see below). So on the command line, we would type:

> sas -sysparm "mydsname" contents

The SYSPARM macro variable is populated with mydsname, and we eliminate the need to use the DATA step and CALL SYMPUT to create the macro variable containing the data set name. We can then use PROC CONTENTS as follows, and the rest of the program remains the same:

    proc contents data=tmpcont.&sysparm noprint out=tmpcont;

Note that Solution 2 also requires a slight modification to the UNIX script in order to run the contents mydsname command at the UNIX prompt. The required change is on line 9 below:

     1  #! /bin/ksh
     2
     3  if (( $# != 1 ))
     4  then
     5    echo
     6    echo Please enter the name of a single data set from the current directory\.
     7    echo
     8  else
     9    sas -sysparm $* $HOME/code/contents -log /tmp
    10    rm -f /tmp/contents.log
    11  fi

Note that while the sysparm technique above is more efficient for passing a single data set to the SAS program, passing more than a single parameter to the SAS program via the UNIX command line may require adding a bit more complexity to your SAS program and/or the use of the DATA step for reading the information into SAS. For example, creating a similar utility program using PROC FREQ to produce a cross-tabulation of multiple variables may require code to parse the following: var1\*var2\*var3 (the escape character \ prevents UNIX from interpreting the asterisk as a special character on the command line). The SAS Macro and shell script in the Appendix at the end of the paper demonstrate using these techniques to display cross-tabulation frequency counts of SAS dataset variables from the command line.

With a bit of creativity, you can design utility programs that simplify many of the everyday tasks used in getting to know your data (e.g., the CONTENTS, FREQ, PRINT, etc. procedures). Routine tasks such as creating SAS transport files can easily be automated using these techniques. By reducing the amount of repetitive coding required, you can also eliminate many common and time-consuming coding errors.

EXECUTING UNIX COMMANDS WITHIN SAS PROGRAMS

In addition to receiving UNIX information from the command line, SAS can also interface with UNIX by executing UNIX commands directly from within your current SAS session. In this section I will discuss using the X statement, the CALL SYSTEM routine and the %SYSEXEC MACRO statement to run UNIX commands within SAS programs.

PROBLEM: You need to populate a SAS data set with metadata information from the files in a given UNIX directory (e.g., filenames, date/time of last modification, etc.). This can be useful for management of SAS programs and output in the UNIX production environment. The particular business need in the author's case was to create a data set that drives an application archiving SAS output into a document repository.

SOLUTION 1: The required file information can be obtained by storing the output from the UNIX ls -l command in a permanent file and then reading the information in this file into a SAS data set as shown below.

> ls -l > myfiles.txt

For this example, myfiles.txt now contains the following information:

    total 3588
    -rw-r--r--  1 myid9999 mygroup  836333 Jun 15 10:27 file1.lst
    -rw-r--r--  1 myid9999 mygroup   70919 Jun 15 10:27 file2.lst
    -rw-r--r--  1 myid9999 mygroup    6467 Jun 15 10:27 file3.lst
    -rw-r--r--  1 myid9999 mygroup   15463 Jun 15 10:27 file4.lst
    -rw-r--r--  1 myid9999 mygroup  556031 Jun 15 10:27 file5.lst
    -rw-r--r--  1 myid9999 mygroup    1975 Jun 15 10:27 file6.lst
    -rw-r--r--  1 myid9999 mygroup       0 Jun 15 14:03 myfiles.txt

Both the first line of the file ("total 3588", the total block count) and the last line (containing information for the myfiles.txt file itself) are unwanted for our purposes. To eliminate them and make the file more easily readable by SAS, we can manually delete the first and last lines of myfiles.txt. We can then read the remaining information into SAS with the following DATA step:

    data myfiles;
      infile './myfiles.txt' lrecl=400;
      length permiss filelink owner group size month day time $20 filename $200;
      input permiss filelink owner group size month day time filename $;
    run;

Results of the PRINT procedure for the resulting data set are shown below:

    Obs  PERMISS     FILELINK  OWNER     GROUP    SIZE    MONTH  DAY  TIME   FILENAME
    1    -rw-r--r--  1         myid9999  mygroup  836333  Jun    15   10:27  file1.lst
    2    -rw-r--r--  1         myid9999  mygroup  70919   Jun    15   10:27  file2.lst
    3    -rw-r--r--  1         myid9999  mygroup  6467    Jun    15   10:27  file3.lst
    4    -rw-r--r--  1         myid9999  mygroup  15463   Jun    15   10:27  file4.lst
    5    -rw-r--r--  1         myid9999  mygroup  556031  Jun    15   10:27  file5.lst
    6    -rw-r--r--  1         myid9999  mygroup  1975    Jun    15   10:27  file6.lst

From this point, we can use the information just like any other SAS data set. Note that two manual steps were used to generate our input file for this task: 1) the UNIX command to create it and 2) file editing to allow easier input to SAS. For a single iteration of this process, this represents two points of human contact where errors may be introduced. If the task is to be repeated as new files are added to the directory or as the current files are updated, the possibility for error increases. A higher degree of validation and repeatability can be achieved if the process is automated. Solution 2 below presents a more automated approach.

SOLUTION 2: We can automate the process described above by using SAS's ability to execute UNIX commands directly from a SAS session. The X statement, the CALL SYSTEM routine and the %SYSEXEC MACRO statement allow us to do this. Instead of manually creating the myfiles.txt file above, we can create it and remove it on the fly using the X statement as shown below.

     1  x ls -l . | tail +2 > myfiles.txt;
     2
     3  data myfiles;
     4    infile 'myfiles.txt';
     5    length permiss filelink owner group size month day time $20 filename $200;
     6    input permiss filelink owner group size month day time filename $;
     7    if not(index(filename,'myfiles')) and not(index(filename,'readfiles'));
     8  run;
     9
    10  x rm -f myfiles.txt;

Line 1 uses the X statement to execute the UNIX ls -l command within the SAS session. By piping the output of this command through the UNIX tail +2 command, we write everything from the ls -l output, starting at the second line (which eliminates the total block count), into myfiles.txt.

Lines 3-6 read the file, assign attributes and input the information into the DATA step.

Line 7 subsets the output data set, removing records for the myfiles.txt file (created by line 1) and for this running SAS program (which I've named readfiles.sas in this example).

Line 10 programmatically removes the myfiles.txt file using the X statement to execute the UNIX rm command on the file. (The -f option on the rm command eliminates the need to respond to the UNIX prompt asking for confirmation prior to removing the file. Without the -f option, the prompt is sent to the screen and requires user input before the SAS session can finish.)

The %SYSEXEC MACRO statement allows you to execute these same tasks using a slightly different syntax for lines 1 and 10 above:

     1  %sysexec(ls -l . | tail +2 > myfiles.txt);
        .....
    10  %sysexec(rm -f myfiles.txt);

Both the X statement and the %SYSEXEC MACRO statement cause the UNIX command to execute immediately. Both also result in the assignment of operating environment return codes to the SAS automatic macro variable SYSRC.

The above tasks can also be performed by using the CALL SYSTEM routine to execute the UNIX commands within SAS. The significant difference between using CALL SYSTEM and using the X or %SYSEXEC MACRO statements is that the CALL SYSTEM routine must be run within a DATA step. One benefit of this is that the UNIX commands can be run conditionally if desired (using familiar SAS syntax as opposed to shell scripting language). An example of using the CALL SYSTEM routine to perform one of the example tasks is shown below:

    data _null_;
      call system('ls -l . | tail +2 > myfiles.txt');
    run;

SOLUTION 3: We can also eliminate the need to create a permanent file by streaming the output from the ls -l UNIX command directly into a SAS DATA step using the FILENAME statement with the PIPE option. The DATA step looks similar to the above examples, except that instead of reading data from a physical file, we read the information into the DATA step from a data stream that never produces a file on disk. So there is no need to create the file, subset the output data set for the myfiles.txt record (as we did above) or remove any files from the UNIX environment.

    filename mylist pipe "ls -l . | tail +2";

    data myfiles;
      infile mylist lrecl=400;
      length permiss filelink owner group size month day time $20 filename $200;
      input permiss filelink owner group size month day time filename $;
      if not(index(filename,'readfiles'));
    run;

Solutions 1 through 3 all produce the same final working MYFILES data set using differing levels of complexity and having different degrees of flexibility. Each may be better suited to certain specific tasks than the others depending on your needs and preferences.
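Before wiring a command into a FILENAME PIPE statement, it can help to run the same pipeline at the shell prompt and confirm which records SAS would receive. The following is a minimal sketch, not part of the original paper: the helper name list_files and the throwaway demo directory are assumptions for illustration, and it uses tail -n +2, the POSIX spelling of the older tail +N form, which some current systems no longer accept.

```shell
#!/bin/sh
# Sketch: preview the records that a FILENAME PIPE fileref would hand to SAS.
# list_files is a hypothetical helper name, not something from the paper.
list_files() {
    dir="${1:-.}"
    # ls -l prints a "total N" block-count line first; tail -n +2 starts
    # output at line 2, which drops that line.
    ls -l "$dir" | tail -n +2
}

# Demo in a throwaway directory so the result does not depend on your files.
demo_dir=$(mktemp -d)
touch "$demo_dir/file1.lst" "$demo_dir/file2.lst"
list_files "$demo_dir"
rm -rf "$demo_dir"
```

Each line printed corresponds to one observation the DATA step would read, so a quick look at this output is a cheap check that the INPUT statement's variable order still matches the local ls -l layout.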

USING UNIX ENVIRONMENT VARIABLES WITHIN SAS PROGRAMS

In your UNIX production environment, you probably have many system environment variables that can be used to make your SAS code more efficient and flexible. The %SYSGET MACRO function helps you do this.

PROBLEM 1: You need to assign a SAS library reference to work with data in a directory with a long fully qualified path name.

SOLUTION: You can use SAS's ability to retrieve the values of environment variables to populate LIBREFs for use in data retrieval. For example, you may have data which reside in the following UNIX directory:

    /prod/projid/lots/of/directories/to/get/to/my/data

A UNIX environment variable may exist containing the name of this directory. For example, if you have an environment variable named DATAPATH that refers to the above directory, you can use the %SYSGET MACRO function to retrieve this information and assign it to a SAS LIBREF as shown below.

    libname mydata "%sysget(DATAPATH)";

    data work.mydataset;
      set mydata.mydataset;
    run;

This simple use of %SYSGET to retrieve environment variable values can help eliminate the need to create numerous libname assignments. SAS MACROs written in this fashion become functional for various projects with a simple reassignment of the UNIX environment variable (perhaps done automatically through a logon script), thus eliminating the need to reassign hard-coded LIBNAMEs for new projects.

PROBLEM 2: You need code to function differently in production than in the development environment (e.g., your code produces an errcheck dataset in development, but this is not wanted when the code is run in production).

SOLUTION: In your UNIX environment, a variable may exist called MODE, which contains the value dev when you're logged in as a user in development and the value prod when you're logged into the production environment. We can use %SYSGET to retrieve these values and use this information for conditional program execution.

One note here: since UNIX is case-sensitive, mode is not the same environment variable as MODE. Because we're referring to something in the UNIX environment, the name passed to %SYSGET must match the case of the environment variable exactly.

In macro code, %SYSGET can be used in a stand-alone fashion as in the example below:

    %if %sysget(MODE)=dev %then %do;

      proc print data=errcheck;
      run;
    %end;

%SYSGET can also be used in open code as in the following DATA step:

    data work.demo nobdt;
      set work.demo;
      runmode="%sysget(MODE)";
      if runmode='dev' and birthdt=. then output nobdt;
      else output demo;
    run;
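The case-sensitivity point can be checked from the shell before any SAS code is involved. This is a small sketch under stated assumptions: the values prod and dev are illustrative, and show_var is a hypothetical helper that prints what a child process, such as a SAS session calling %SYSGET, would see for a given name.

```shell
#!/bin/sh
# Sketch: environment variable names are case-sensitive, so MODE and mode
# are two unrelated variables. The values here are illustrative only.
MODE=prod
mode=dev
export MODE mode

# show_var is a hypothetical helper: it prints the value that a child
# process would inherit for the variable name given as its argument.
show_var() {
    printenv "$1"
}

show_var MODE   # prints: prod
show_var mode   # prints: dev
```

If a %SYSGET call returns an empty value, running printenv with the exact name used in the SAS code is a quick way to tell a case mismatch from a variable that was never exported.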

The uses of environment variables through SAS are far-reaching. I have shown how they can be used to populate SAS LIBNAMEs and to execute code conditionally. Often, UNIX system administrators set environment variables to hold individual user ids. Another use may be to aid in the creation of audit trails to bring your software shop one step closer to a validated programming environment. Becoming familiar with all the UNIX environment variables you have available will increase your programming flexibility.

CONCLUSION

With a little creativity and some basic knowledge of UNIX and SAS, you can develop some simple SAS MACROs to help eliminate, or at least minimize, time spent performing the more mundane tasks that are part of programming. By standardizing some of the techniques presented here in macro code libraries, small improvements in efficiency can multiply through use by many programmers over the course of large-scale projects to produce large-scale benefits. Even at the individual level, small incremental improvements multiplied, improved upon and expanded over the course of a programming career can have a significant impact on your ability to produce high quality code and contribute to team efforts. My hope is that, incorporated into larger programming projects, these ideas serve as a starting point for development of more elegant suites of macros that, together, function in an integrated system as a SAS application sitting on the UNIX platform.

ACKNOWLEDGMENTS

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

CONTACT INFORMATION

Joe Novotny
GlaxoSmithKline
150 South Collegeville Rd.
Collegeville, PA 19468
Phone: (610) 917 6939
Fax: (610) 917-4701
Email: joe..novotny@gsk.com

APPENDIX

PART A: Shell script to display cross-tabulation frequency counts of SAS dataset variables

    #! /bin/ksh

    if [ $# -gt 1 ]
    then
      echo $# $* | sas $HOME/code/_freq -log /tmp
      rm -f /tmp/_freq.log
    else
      echo
      echo Please enter the name of a single dataset from the current directory
      echo and the name of at least one variable for the proc freq\.
      echo
    fi

Example of use: at the UNIX command line type:

> freq demo sexo\*sex

Note that in the above line > represents the UNIX command line prompt.

PART B: SAS Macro called by the above shell script:

    %macro _freq;

    ****************************************************************
    * Read in stdin and create macro variable all with call symput.
    ****************************************************************;
    data _null_;
      infile stdin dlm=',';
      length all $200;
      input all;
      call symput("all",left(trim(all)));
    run;

    *************************************************************
    * Iteratively create macro variables to hold the values
    * of the dataset variables for which to display frequencies.
    * var1 holds the argument count, var2 the dataset name and
    * var3 onward the TABLES requests.
    *************************************************************;
    %let i=1;
    %do %while(%qscan(%quote(&all),&i,%str( )) ne );
      %let var&i=%qscan(%quote(&all),&i,%str( ));
      %let i=%eval(&i+1);
    %end;
    %let j=%eval(&i-1);

    ******************************************
    * Assign libname for current directory.
    ******************************************;
    libname tmpfreq '.';

    **********************************************************
    * If dataset exists in current directory, run proc freq
    * and send output to standard output (screen).
    **********************************************************;
    %if %sysfunc(exist(tmpfreq.&var2)) %then %do;
      filename term terminal;
      proc printto new print=term;
      run;
      options ls=95;
      title "Dataset %upcase(&var2)";
      proc freq data=tmpfreq.&var2;
        tables
        %let i=3;
        %do i=&i %to &j;
          &&var&i
        %end;
        / list missing nopercent nofreq;
      run;
      proc printto;
      run;
    %end;

    **************************************************
    * If dataset does not exist in current directory,
    * send message to standard output indicating so.
    **************************************************;
    %if %sysfunc(exist(tmpfreq.&var2))=0 %then %do;
      x echo You typed &var2. - This is not a dataset in the current directory.;
    %end;

    %mend _freq;
    %_freq;
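The argument handling in the Part A wrapper can be tried without SAS installed by printing the text that would otherwise be piped to the macro. This is only a sketch under assumptions: build_input is a hypothetical function name, the usage message is illustrative wording, and it is written for a POSIX sh rather than specifically for ksh.

```shell
#!/bin/sh
# Sketch of the Part A argument check, with the SAS call replaced by echo
# so it runs anywhere. build_input is a hypothetical name.
build_input() {
    if [ "$#" -gt 1 ]; then
        # Same text the wrapper pipes to SAS: the argument count first,
        # then the dataset name, then each TABLES request.
        echo "$# $*"
    else
        echo "usage: freq dataset variable [variable ...]" >&2
        return 1
    fi
}

# Single quotes keep the shell from expanding the asterisk, like \* does.
build_input demo 'sexo*sex'    # prints: 2 demo sexo*sex
```

Quoting "$*" inside the function also stops the asterisk from being expanded a second time inside the script, which an unquoted $* would allow if a file name in the current directory happened to match the pattern.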