SAS Logic Coding Made Easy Revisit User-defined Function Songtao Jiang, Boston Scientific Corporation, Marlborough, MA



Similar documents
Let SAS Modify Your Excel File Nelson Lee, Genentech, South San Francisco, CA

New Tricks for an Old Tool: Using Custom Formats for Data Validation and Program Efficiency

THE POWER OF PROC FORMAT

Table Lookups: From IF-THEN to Key-Indexing

It s not the Yellow Brick Road but the SAS PC FILES SERVER will take you Down the LIBNAME PATH= to Using the 64-Bit Excel Workbooks.

Importing Excel File using Microsoft Access in SAS Ajay Gupta, PPD Inc, Morrisville, NC

Essential Project Management Reports in Clinical Development Nalin Tikoo, BioMarin Pharmaceutical Inc., Novato, CA

The SET Statement and Beyond: Uses and Abuses of the SET Statement. S. David Riba, JADE Tech, Inc., Clearwater, FL

Counting the Ways to Count in SAS. Imelda C. Go, South Carolina Department of Education, Columbia, SC

PharmaSUG Paper IB05

Make Better Decisions with Optimization

How To Write A Clinical Trial In Sas

Need for Speed in Large Datasets The Trio of SAS INDICES, PROC SQL and WHERE CLAUSE is the Answer, continued

ABSTRACT THE ISSUE AT HAND THE RECIPE FOR BUILDING THE SYSTEM THE TEAM REQUIREMENTS. Paper DM

Building and Customizing a CDISC Compliance and Data Quality Application Wayne Zhong, Accretion Softworks, Chester Springs, PA

SAS Programming Tips, Tricks, and Techniques

One problem > Multiple solutions; various ways of removing duplicates from dataset using SAS Jaya Dhillon, Louisiana State University

Paper PO03. A Case of Online Data Processing and Statistical Analysis via SAS/IntrNet. Sijian Zhang University of Alabama at Birmingham


Creating Dynamic Reports Using Data Exchange to Excel

WHITE PAPER. Harnessing the Power of Advanced Analytics How an appliance approach simplifies the use of advanced analytics

Introduction to Criteria-based Deduplication of Records, continued SESUG 2012

VOXI Earth Modelling Data Security and Architecture

PharmaSUG 2014 Paper CC23. Need to Review or Deliver Outputs on a Rolling Basis? Just Apply the Filter! Tom Santopoli, Accenture, Berwyn, PA

Chapter 5. Selection 5-1

Graphing Made Easy for Project Management

SDTM, ADaM and define.xml with OpenCDISC Matt Becker, PharmaNet/i3, Cary, NC

Foundations & Fundamentals. A PROC SQL Primer. Matt Taylor, Carolina Analytical Consulting, LLC, Charlotte, NC

SAS/Data Integration Studio Creating and Using A Generated Transformation Jeff Dyson, Financial Risk Group, Cary, NC

WESTMORELAND COUNTY PUBLIC SCHOOLS Integrated Instructional Pacing Guide and Checklist Computer Math

Recommending News Articles using Cosine Similarity Function Rajendra LVN 1, Qing Wang 2 and John Dilip Raj 1

Automation of Large SAS Processes with and Text Message Notification Seva Kumar, JPMorgan Chase, Seattle, WA

Clinical Trial Data Integration: The Strategy, Benefits, and Logistics of Integrating Across a Compound

Sanofi-Aventis Experience Submitting SDTM & Janus Compliant Datasets* SDTM Validation Tools - Needs and Requirements

An Introduction to SAS/SHARE, By Example

Technical Paper. Migrating a SAS Deployment to Microsoft Windows x64

An macro: Exploring metadata EG and user credentials in Linux to automate notifications Jason Baucom, Ateb Inc.

From The Little SAS Book, Fifth Edition. Full book available for purchase here.

REx: An Automated System for Extracting Clinical Trial Data from Oracle to SAS

SAS 9.4 PC Files Server

Converting Electronic Medical Records Data into Practical Analysis Dataset

We begin by defining a few user-supplied parameters, to make the code transferable between various projects.

PharmaSUG Paper PO01

Remove Orphan Claims and Third party Claims for Insurance Data Qiling Shi, NCI Information Systems, Inc., Nashville, Tennessee

Paper DV KEYWORDS: SAS, R, Statistics, Data visualization, Monte Carlo simulation, Pseudo- random numbers

ABSTRACT INTRODUCTION THE MAPPING FILE GENERAL INFORMATION

Transferring vs. Transporting Between SAS Operating Environments Mimi Lou, Medical College of Georgia, Augusta, GA

Automated Program Behavior Analysis

Statistics and Analysis. Quality Control: How to Analyze and Verify Financial Data

Finding National Best Bid and Best Offer

Managing Tables in Microsoft SQL Server using SAS

PharmaSUG Paper PO24. Create Excel TFLs Using the SAS Add-in Jeffrey Tsao, Amgen, Thousand Oaks, CA Tony Chang, Amgen, Thousand Oaks, CA

Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ

Technical Paper. Defining an ODBC Library in SAS 9.2 Management Console Using Microsoft Windows NT Authentication

Problems and Measures Regarding Waste 1 Management and 3R Era of public health improvement Situation subsequent to the Meiji Restoration

PROC SQL for SQL Die-hards Jessica Bennett, Advance America, Spartanburg, SC Barbara Ross, Flexshopper LLC, Boca Raton, FL

AN INTRODUCTION TO THE SQL PROCEDURE Chris Yindra, C. Y. Associates

Identifying Invalid Social Security Numbers

PharmaSUG Paper AD08

Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports

SAS Rule-Based Codebook Generation for Exploratory Data Analysis Ross Bettinger, Senior Analytical Consultant, Seattle, WA

Eliminate Memory Errors and Improve Program Stability

Using Proc SQL and ODBC to Manage Data outside of SAS Jeff Magouirk, National Jewish Medical and Research Center, Denver, Colorado

Data Analysis with MATLAB The MathWorks, Inc. 1

SQL SUBQUERIES: Usage in Clinical Programming. Pavan Vemuri, PPD, Morrisville, NC

10 Steps To Getting Started With. Marketing Automation

Automating SAS Macros: Run SAS Code when the Data is Available and a Target Date Reached.

Section 1.4 Place Value Systems of Numeration in Other Bases

Optimizing System Performance by Monitoring UNIX Server with SAS

Debugging Complex Macros

UNIX Operating Environment

ing Automated Notification of Errors in a Batch SAS Program Julie Kilburn, City of Hope, Duarte, CA Rebecca Ottesen, City of Hope, Duarte, CA

Copyright 2013 wolfssl Inc. All rights reserved. 2

Number of bits needed to address hosts 8

Advanced Tutorials. Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD

Programme Specification (Postgraduate)

THE BASICS OF PING POST YOUR INTRODUCTORY GUIDE TO PING POST. boberdoo.com. Lead Distribution Software. boberdoo.com

Help Guide - CSR Portal

Managing Data Issues Identified During Programming

Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

SUGI 29 Posters. Mazen Abdellatif, M.S., Hines VA CSPCC, Hines IL, 60141, USA

Dup, Dedup, DUPOUT - New in PROC SORT Heidi Markovitz, Federal Reserve Board of Governors, Washington, DC

Problem Solving Basics and Computer Programming

C Programming. for Embedded Microcontrollers. Warwick A. Smith. Postbus 11. Elektor International Media BV. 6114ZG Susteren The Netherlands

Lost in Space? Methodology for a Guided Drill-Through Analysis Out of the Wormhole

Macros from Beginning to Mend A Simple and Practical Approach to the SAS Macro Facility

Figure 1. Example of an Excellent File Directory Structure for Storing SAS Code Which is Easy to Backup.

Best Practice in SAS programs validation. A Case Study

Data-driven Validation Rules: Custom Data Validation Without Custom Programming Don Hopkins, Ursa Logic Corporation, Durham, NC

Transcription:

ABSTRACT PharmaSUG 2013 - Paper CC04 SAS Logic Coding Made Easy Revisit User-defined Function Songtao Jiang, Boston Scientific Corporation, Marlborough, MA SAS programmers deal with programming logics on a daily basis. Using regular logical operators can solve a complex logic problem, but oftentimes it is not intuitive, error-prone, and hard to read and maintain. Implementation also heavily depends on how well the individual understands the logics and how good the individual programmer s logical thinking is. In this paper, by using SAS User-defined Function, we lay out a universal way of implementing SAS logics without the dependence of individual s ability and experience. The implementation is straight-forward, error-prove, and clear. Furthermore, these SAS user-defined functions for logical operators can be built into the standard library and utilized by all programmers within an organization. They are extremely useful to all levels of SAS programmers. To demonstrate the idea, the implementations of these functions are introduced. Two sets of the sample codes of a moderately complex logic are compared between regular logic implementation and User-defined Function. INTRODUCTION SAS programmers deal with programming logics on a daily basis. It is very important and sometimes critical that a programmer can effectively and correctly program the provided logics in clinical settings. In the traditional way, programmers use the if-then-else statement and the SAS logical operators and to implement these logics. For simple logics, it is effective and easy to use. However for more complex logics, things become more tricky and difficult to implement. And also the basic operators may generate unexpected results when they deal with variables with missing values. Let s review how the basic logical operators work. We have three basic logical operators: AND, OR, and NOT. The following table (Table 1) lists their operation characteristics: Variables Logical Operators* a b a AND b a OR b NOT a.. 0 0 1. 0 0 0 1. 1 0 1 1 0. 0 0 1 0 0 0 0 1 0 1 0 1 1 1. 0 1 0 1 0 0 1 0 1 1 1 1 0 * In logical expression, SAS treats missing or 0 as false and all other value as true. Table 1: Basic logical operators and results From Table 1, we can see that 0 or missing value is treated as false in a logic expression, and 1 or other value is treated as true. In this paper, we only discuss 0, 1, and missing values. Any other values should be converted to one of these three values before putting into a logic expression. LOGICAL EXPRESSION AND ISSUES In clinical settings, oftentimes missing value should be treated as missing instead of false. For example, we need to create a Group variable from 2 binary variables Male and Diabetes. We need to identify male subjects with diabetes as one group, the rest subjects as another group. For group 1, it is easy to determine. If a subject is a male with diabetes, then Group = 1. However for group 2, it is a little bit tricky given the presence of missing values of 2 binary variables - Male and Diabetes. In the following, I give 2 wrong solutions and one correct solution, and result comparison table (Table 2). 1

Solution 1: DATA Demo1; IF Male AND Diabetes THEN Group=1; ELSE IF Missing(Male) OR Missing(Diabetes) THEN Group=.; ELSE Group=2; Solution 2: DATA Demo2; IF Male AND Diabetes THEN Group=1; ELSE IF (NOT Male) OR (NOT Diabetes) THEN Group=2; Solution 3 (correct solution): DATA Demo3; IF Male AND Diabetes THEN Group=1; ELSE IF (Male EQ 0) OR (Diabetes EQ 0) THEN Group=2; Variables Solution 1 Solution 2 Solution 3 Male Diabetes Group Group Group... 2.. 0. 2 2. 1. 2. 0.. 2 2 0 0 2 2 2 0 1 2 2 2 1.. 2. 1 0 2 2 2 1 1 1 1 1 Table 2: Results Comparison between 3 Solutions For solution 1, a few cases are missed to be identified as group 2. This is because one of these two binary variables is missing. But in some cases even one variable is missing, we can still determine which group the subject should belong to. For the subject with Male=. and Diabetes=0, this subject is definitely not in Group 1 ( Male and Diabetes ) no matter what the missing value would be for variable Male. So this subject should belong to group 2. In another case for subject with Male=. and Diabetes=1, the Group should be missing because the results depend on value of the variable Male. For solution 2, all subjects are grouped into two groups. This is also not what we wanted. The issue is related to the NOT logical operator. As we mentioned, all logical operators treated missing value as false. Using NOT logical is not appropriate in this case even though the SAS code is very close to what the correct solution is. For solution 3, we are able to give the correct solution with the help of comparison operator EQ instead of using the logical operator NOT as in solution 2. From the example, we can see that a very simple logic task can get us into a tricky situation. If we are given more variables or more complex logics, things are getting a lot worse. Even a well experience programmer may feel that it is not easy to handle. To address this common issue among all levels of SAS programmers, a very simple implementation using SAS Userdefined Function is proposed. It is simple, flexible, error-prove, and very effective. 2

USER-DEFINED FUNCTION (UDF) FOR LOGIC EXPRESSION LOGICAL OPERATOR UDFS IMPLEMENTATION Use SAS Function Compiler procedure to create the User-defined functions for the logical operators. For the UDFs, we need to treat missing value as missing. And the results depend on the situations listing in the following table (Table 3): Parameters UDF a b a AND b a OR b..... 0 0.. 1. 1 0. 0. 0 0 0 0 0 1 0 1 1.. 1 1 0 0 1 1 1 1 1 Table 3: UDF definitions and Results The UDF implementations based on Table 3 are given below: PROC FCMP OUTLIB=sasuser.myUDF.logic; FUNCTION logicand(a,b); IF a=. AND b=. THEN logicand=.; IF a=. AND b=0 THEN logicand=0; IF a=. AND b=1 THEN logicand=.; IF a=0 AND b=. THEN logicand=0; IF a=0 AND b=0 THEN logicand=0; IF a=0 AND b=1 THEN logicand=0; IF a=1 AND b=. THEN logicand=.; IF a=1 AND b=0 THEN logicand=0; IF a=1 AND b=1 THEN logicand=1; RETURN(logicand); ENDSUB; FUNCTION logicor(a,b); IF a=. AND b=. THEN logicor=.; IF a=. AND b=0 THEN logicor=.; IF a=. AND b=1 THEN logicor=1; IF a=0 AND b=. THEN logicor=.; IF a=0 AND b=0 THEN logicor=0; IF a=0 AND b=1 THEN logicor=1; IF a=1 AND b=. THEN logicor=1; IF a=1 AND b=0 THEN logicor=1; IF a=1 AND b=1 THEN logicor=1; RETURN (logicor); ENDSUB; run; SOLUTION FOR THE EXAMPLE USING UDF Now we have defined the UDFs for the logical operators and are ready to use them to solve our earlier simple example. options cmplib=sasuser.myudf; /* Locate the User-define UDFs */ DATA DemoUDF; 3

SAS Logic Coding Made Easy Revisit User-defined Function, continued Group=2-logicand(Male,Diabetes); One may doubt that the UDF just lists all possible outcomes, and it is not very convincing for simplifying a logic expression. In the following section, I will give a more complex example. I will compare the implementations between using regular logical operators and using the UDFs. Then you will see more clearly the power of using UDFs to solve logic problems. COMPARISON BETWEEN TWO METHODS The following example is from a pacemaker device to determining the physiologic settings. The logic need to be implemented is as following: (Pacing Mode is AAIR) OR ((Pacing Mode is DDDR) AND ((AV Search = ON) OR (AV min > PR))) Pacing Mode AAIR, Pacing Mode DDDR, AV Search = ON, and AV min > PR are four binary variables with values (0, 1, or missing). SAS code used for the both methods: DATA DemoCMP; SET CRM; /* Use UDF */ UDF_Logic=logicor(AAIR, logicand(dddr, logicor(avsearch, AVGTPR))); /* Use regular logic expression */ if AAIR=1 or (DDDR=1 and (AVSearch=1 or AVGTPR=1)) then Reg_Logic=1; else if AAIR=0 and (DDDR=0 or (AVSearch=0 and AVGTPR=0)) then Reg_Logic=0; For the UDF method, it is very straight-forward. You can use the UDFs as regular functions, and just follow the given logic specifications to write the SAS code. This is can be easily implemented by all levels of programmers. For the regular logic expression method, depending on your experiences and logical thinking skill, different programmers may give different answers. The given SAS code above by far is the simplest solution that I can get with a paid price that I worked at background using the Set Operations. For Reg_logic = 1, it is straight-forward. The following are the derivations for Reg_logic = 0: and AND or OR and AND In summary of above derivations, the logic expression should be: AAIR=0 AND (DDDR=0 OR (AVSearch=0 AND AVGTPR=0)) 4

The following table (Display 1) lists partial dataset and output: Display 1: Partial dataset and output for the physiologic settings of the pacemaker device 5

CONCLUSION The SAS language has provided SAS programmers a way to define a User-defined Function. Using UDFs for SAS logic programming has demonstrated a lot of advantages over other methods. The solution is ideally simple, easy to implement, error-prove, and easy to maintain. REFERENCES Base SAS(R) 9.2 Procedures Guide. SAS 9.2 Documentation. Copyright 2013 SAS Institute Inc. Available at http://support.sas.com/documentation/cdl/en/proc/61895/html/default/viewer.htm#a002890483.htm CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Name: Songtao Jiang Enterprise: Boston Scientific Corporation Address: 50 Boston Scientific Way City, State ZIP: Marlborough, MA 01752 Work Phone: 508-683-4432 Fax: 508-683-5642 E-mail: JIANGS@BSCI.com Web: http://www.bostonscientific.com/us/index.html SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 6