The Power of CALL SYMPUT DATA Step Interface by Examples Yunchao (Susan) Tian, Social & Scientific Systems, Inc., Silver Spring, MD


Paper 052-29

ABSTRACT AND INTRODUCTION
CALL SYMPUT is a SAS language routine that assigns a value produced in a DATA step to a macro variable. It is one of the DATA step interface tools that provide a dynamic link for communication between the SAS language and the macro facility. This paper discusses the uses of the SYMPUT routine through real-world examples. Both beginning and experienced macro programmers will benefit from the examples by recognizing the wide application of this DATA step interface tool and by increasing their understanding of it. Besides CALL SYMPUT, other features of the macro facility that are demonstrated include the construction of macro variable names through concatenation, the double ampersand, the %DO loop, the %GOTO statement and statement label, and the implicit %EVAL function.

EXAMPLE 1: CREATE A SERIES OF VARIABLE NAMES FROM ANOTHER VARIABLE'S VALUES
When performing logistic regression, we often need to create dummy variables based on all possible values of another variable. For instance, we want to create dummy variables for the variable CON, which has over 400 different integer values from 1 to 506. Basically we need to do the following:

    IF CON = 1 THEN CON1 = 1; ELSE CON1 = 0;
    IF CON = 2 THEN CON2 = 1; ELSE CON2 = 0;
    ......
    IF CON = 506 THEN CON506 = 1; ELSE CON506 = 0;

It is not practical to write this many statements. Our goal is to use the SYMPUT routine to obtain this code automatically. In the following program, a sample data set TESTDATA with 12 observations and 1 variable is first created in step (1). Then in step (2), a data set UNIQUE is created containing 8 unique CON values. In step (3), the SYMPUT routine assigns the largest value of CON to the macro variable N. CALL SYMPUT is executed once, when the DATA step reaches the end of the data set.
In step (4), the macro variable N's value is retrieved and CALL SYMPUT is executed 506 times to create 506 macro variables M1-M506 with the initial value 0. The PUT function is used to eliminate a note that numeric values have been converted to character values. The LEFT function is used to left-align the value of the index variable, I, to avoid creating macro variable names with blanks. In step (5), CALL SYMPUT is executed 8 times and the values of the 8 macro variables created in step (4) are updated with the values of the corresponding CON. The 498 macro variables without corresponding CON values retain the initial value 0. Step (6) is a macro that generates dummy variables for all possible values of CON. By using the %GOTO statement and statement label, the dummy variables without corresponding CON values are not created. Note that the double ampersand is necessary to cause the macro processor to scan the text twice: first to generate the reference and then to resolve it. Step (7) invokes the macro GETCON to create the dummy variables for every observation in the data set TESTDATA. The last step prints the output data set with dummy variables shown in Table 1.

    /* (1) Create a sample data set TESTDATA. */
    DATA TESTDATA;
      INPUT CON;
      CARDS;
    34 5 48 34 506 5 43
    ;
    RUN;

    /* (2) Get the unique values of CON. */
    PROC SORT DATA=TESTDATA OUT=UNIQUE NODUPKEY;
      BY CON;
    RUN;

    /* (3) Assign the largest value of CON to the macro variable N. */
    DATA _NULL_;
      SET UNIQUE END=LAST;
      IF LAST THEN CALL SYMPUT('N', PUT(CON, 3.));
    RUN;

    /* (4) Assign the initial value 0 to all macro variables. */
    DATA _NULL_;
      DO I = 1 TO &N;
        CALL SYMPUT('M' || LEFT(PUT(I, 3.)), '0');
      END;
    RUN;

    /* (5) Assign the value of CON to the corresponding macro variable. */
    DATA _NULL_;
      SET UNIQUE;
      CALL SYMPUT('M' || LEFT(PUT(CON, 3.)), PUT(CON, 3.));
    RUN;

    /* (6) Macro to generate dummy variables. */
    %MACRO GETCON;
      %DO I = 1 %TO &N;
        /* Skip values of I that never occur in CON. */
        %IF &&M&I = 0 %THEN %GOTO OUT;
        IF CON = &&M&I THEN CON&I = 1;
        ELSE CON&I = 0;
        %OUT: %END;
    %MEND GETCON;

    /* (7) Create dummy variables. */
    DATA TESTDATA;
      SET TESTDATA;
      %GETCON
    RUN;

    /* (8) Print the result. */
    PROC PRINT DATA=TESTDATA;
      TITLE 'Table 1. List of CON with dummy variables';
    RUN;

Table 1. List of CON with dummy variables
Obs CON CON CON34 CON43 CON5 CON5 CON48 CON506 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3 34 0 0 0 0 0 0 0 4 5 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 48 0 0 0 0 0 0 0 8 34 0 0 0 0 0 0 0 9 506 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 43 0 0 0 0 0 0 0
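The two-pass resolution of the double ampersand used in step (6) can be watched in the log with a small sketch. The values assigned to I and M34 below are hypothetical and chosen only for illustration; they are not taken from the program above.

```sas
/* Hypothetical values, chosen only for illustration. */
%LET I = 34;
%LET M34 = 34;

/* First scan:  the double ampersand becomes a single one and */
/*              the reference to I becomes 34.                */
/* Second scan: the generated reference to M34 is resolved.   */
%PUT M&I resolves to &&M&I;

/* SYMBOLGEN writes each resolution step to the log. */
OPTIONS SYMBOLGEN;
%PUT &&M&I;
OPTIONS NOSYMBOLGEN;
```

The first %PUT should write "M34 resolves to 34" to the log. With a single ampersand, the macro processor would instead look for a macro variable named M and append the value of I to it.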

EXAMPLE 2: GENERATE LABELS FOR A SERIES OF VARIABLES USING EXISTING FORMATS
In this example, the problem is to label the variables FLAG1-FLAG200 using the existing formats in a format library. In the following program, a sample data set FLAGS with one observation and six variables is created. The task is to label each variable with the corresponding formatted value, i.e., label FLAG1 with Red, and so on. The CALL SYMPUT statement in the DATA _NULL_ step assigns the formatted values of the format FMTFLAG to a set of macro variables FMT1-FMT6. Then in the macro LABELS, the variables FLAG1-FLAG6 are associated with the labels using the values of the macro variables FMT1-FMT6. Invoking the macro LABELS in the LABEL statement in the subsequent DATA step generates labels for all variables FLAG1-FLAG6. PROC CONTENTS lists all variables with the assigned labels, as shown in Table 2.

    DATA FLAGS;
      FLAG1 = 1; FLAG2 = 1; FLAG3 = 0;
      FLAG4 = 1; FLAG5 = 0; FLAG6 = 1;
    RUN;

    PROC FORMAT;
      VALUE FMTFLAG 1='Red' 2='Purple' 3='Blue'
                    4='Yellow' 5='Orange' 6='Green';
    RUN;

    %LET N = 6;

    /* Store the formatted value of each index in FMT1-FMT6. */
    DATA _NULL_;
      DO I = 1 TO &N;
        CALL SYMPUT('FMT' || LEFT(PUT(I, 3.)), PUT(I, FMTFLAG.));
      END;
    RUN;

    /* Generate one variable="label" pair per flag variable. */
    %MACRO LABELS;
      %DO I = 1 %TO &N;
        FLAG&I = "&&FMT&I"
      %END;
    %MEND LABELS;

    DATA FLAGS;
      SET FLAGS;
      LABEL %LABELS;
    RUN;

    PROC CONTENTS DATA=FLAGS;
      TITLE 'Table 2. Contents of SAS data set FLAGS showing variable labels';
    RUN;

Table 2. Contents of SAS data set FLAGS showing variable labels

    #  Variable  Type  Len  Pos  Label
    ---------------------------------------------
    1  FLAG1     Num     8    0  Red
    2  FLAG2     Num     8    8  Purple
    3  FLAG3     Num     8   16  Blue
    4  FLAG4     Num     8   24  Yellow
    5  FLAG5     Num     8   32  Orange
    6  FLAG6     Num     8   40  Green
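As a variation on the DATA _NULL_ step above, the same macro variables could be built with CALL SYMPUTX and the CATS function, both available in SAS 9. This is only a sketch of an alternative, not the method used in this paper: SYMPUTX strips leading and trailing blanks from both arguments, and CATS converts the numeric index without a conversion note, so LEFT and PUT are not needed to build the names.

```sas
PROC FORMAT;
  VALUE FMTFLAG 1='Red' 2='Purple' 3='Blue'
                4='Yellow' 5='Orange' 6='Green';
RUN;

%LET N = 6;

DATA _NULL_;
  DO I = 1 TO &N;
    /* CATS builds the names FMT1-FMT6 without embedded blanks. */
    /* SYMPUTX also trims trailing blanks from the stored value. */
    CALL SYMPUTX(CATS('FMT', I), PUT(I, FMTFLAG.));
  END;
RUN;

/* List the user-defined macro variables to check the result. */
%PUT _USER_;
```

The %PUT _USER_ statement is a convenient check in either version, since it writes every user-defined macro variable and its value to the log.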

EXAMPLE 3: SYMPUT ROUTINE WITH BYTE FUNCTION
We can write the following macro to read a numbered series of raw data files with the same layout and create a series of SAS data sets with the same names as the raw data files.

    %MACRO READ(N);
      %DO I = 1 %TO &N;
        DATA CTY_&I;
          INFILE "C:\SUGI29\CTY_&I";
          INPUT ID F96 F97 F98;
        RUN;
      %END;
    %MEND READ;

In each iteration of the %DO loop, one DATA step is generated. Invoking this macro with the parameter 3 generates three DATA steps that create three SAS data sets, CTY_1, CTY_2, and CTY_3.

Now let's suppose we want all the SAS data sets created above to be saved in an alphabetically ordered series of names. In other words, the SAS data set created from the raw data file CTY_1 will be called CTY_A, the SAS data set from CTY_2 will be called CTY_B, and so on. We can accomplish this by using the following statement:

    CALL SYMPUT('L', BYTE(&I+64));

The macro processor resolves &I before the DATA step is compiled, so the BYTE function receives an ordinary arithmetic expression, such as 1+64, and returns the character that has that code. For example, the BYTE function returns the character A when the value of the index variable, I, is 1, because A has the code 65 in ASCII. Then the SYMPUT routine assigns the character value returned by the BYTE function to the macro variable L. If we apply this statement to the DATA steps above and retrieve the macro variable L's value in subsequent DATA steps, we can create a series of SAS data sets with alphabetically ordered names. So invoking the following macro with the parameter 3 creates three SAS data sets named CTY_A, CTY_B, and CTY_C.

    %MACRO READ(N);
      %DO I = 1 %TO &N;
        DATA CTY_&I;
          INFILE "C:\SUGI29\CTY_&I";
          INPUT ID F96 F97 F98;
          CALL SYMPUT('L', BYTE(&I+64));
        RUN;

        /* L is available only after the step above has run. */
        DATA CTY_&L;
          SET CTY_&I;
        RUN;
      %END;
    %MEND READ;

CONCLUSION
CALL SYMPUT provides a powerful vehicle to transfer information between program steps. The examples presented in this paper can be easily modified to fit the needs of different users. There are many opportunities to apply CALL SYMPUT in our day-to-day programming.
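The reason Example 3 needs a second DATA step before it can use &L comes down to timing, which a small sketch can make visible. The macro variable name LETTER and the variables X and Y are hypothetical, and the expected values assume an ASCII system.

```sas
/* Hypothetical macro variable, given a placeholder value first. */
%LET LETTER = not set yet;

DATA _NULL_;
  CALL SYMPUT('LETTER', BYTE(65));
  /* The reference below was resolved when the step was compiled, */
  /* so X holds the placeholder text, not the new value.          */
  X = "&LETTER";
  /* SYMGET reads the macro variable while the step is executing. */
  Y = SYMGET('LETTER');
  PUT X= Y=;
RUN;

/* After the step boundary, the new value is available. */
%PUT LETTER is now &LETTER;
```

On an ASCII system the PUT statement should write X=not set yet Y=A, and the %PUT statement should write "LETTER is now A". When a value is needed inside the same DATA step that assigns it, the SYMGET function is one way to retrieve it.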
A few important facts need to be remembered when using CALL SYMPUT. The SYMPUT routine assigns the value of the macro variable during DATA step execution, but macro variable references resolve during the compilation of a step or global statement. So you can't retrieve the macro variable's value in the same DATA step in which the SYMPUT routine assigns that value. The SYMPUT routine must be used as part of the CALL statement. Since CALL SYMPUT is a SAS language routine, character string arguments need to be enclosed in single or double quotation marks. In most cases, a macro variable created by the SYMPUT routine is global; when the routine runs inside a macro, the variable is placed in the most local symbol table that is not empty.

ACKNOWLEDGMENTS
I would like to thank my client Dr. Chunliu Zhan for providing me with challenges over the years, which contributed a lot to this paper. I also want to thank my colleagues Dr. Raymond Hu and Deborah Johnson for their helpful discussions.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at:

Yunchao (Susan) Tian
Social & Scientific Systems, Inc.
85 Georgia Avenue, 2 th Floor
Silver Spring, MD 2090
Work Phone: (30) 628-3285
Fax: (30) 628-320
Email: Stian@s-3.com

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.