Let SAS Modify Your Excel File Nelson Lee, Genentech, South San Francisco, CA



Similar documents
Preserving Line Breaks When Exporting to Excel Nelson Lee, Genentech, South San Francisco, CA

Choosing the Best Method to Create an Excel Report Romain Miralles, Clinovo, Sunnyvale, CA

Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ

Methodologies for Converting Microsoft Excel Spreadsheets to SAS datasets

It s not the Yellow Brick Road but the SAS PC FILES SERVER will take you Down the LIBNAME PATH= to Using the 64-Bit Excel Workbooks.

Managing Data Issues Identified During Programming

Paper Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation

Tips and Tricks for Creating Multi-Sheet Microsoft Excel Workbooks the Easy Way with SAS. Vincent DelGobbo, SAS Institute Inc.

Using Proc SQL and ODBC to Manage Data outside of SAS Jeff Magouirk, National Jewish Medical and Research Center, Denver, Colorado

ABSTRACT INTRODUCTION THE MAPPING FILE GENERAL INFORMATION

Clinical Data Management (Process and practical guide) Dr Nguyen Thi My Huong WHO/RHR/RCP/SIS

SAS and Electronic Mail: Send faster, and DEFINITELY more efficiently

ABSTRACT INTRODUCTION CLINICAL PROJECT TRACKER OF SAS TASKS. Paper PH

Creating a Participants Mailing and/or Contact List:

An Approach to Creating Archives That Minimizes Storage Requirements

TECHNIQUES FOR BUILDING A SUCCESSFUL WEB ENABLED APPLICATION USING SAS/INTRNET SOFTWARE

SAS Logic Coding Made Easy Revisit User-defined Function Songtao Jiang, Boston Scientific Corporation, Marlborough, MA

SAS to Excel with ExcelXP Tagset Mahipal Vanam, Kiran Karidi and Sridhar Dodlapati

Challenges in Moving to a Multi-tier Server Platform

Creating Dynamic Reports Using Data Exchange to Excel

Post Processing Macro in Clinical Data Reporting Niraj J. Pandya

Downloading, Configuring, and Using the Free SAS University Edition Software

Get in Control! Configuration Management for SAS Projects John Quarantillo, Westat, Rockville, MD

We begin by defining a few user-supplied parameters, to make the code transferable between various projects.

A Method for Cleaning Clinical Trial Analysis Data Sets

SDTM, ADaM and define.xml with OpenCDISC Matt Becker, PharmaNet/i3, Cary, NC

Introduction to SAS Business Intelligence/Enterprise Guide Alex Dmitrienko, Ph.D., Eli Lilly and Company, Indianapolis, IN

Search and Replace in SAS Data Sets thru GUI

UNIX Comes to the Rescue: A Comparison between UNIX SAS and PC SAS

Automated distribution of SAS results Jacques Pagé, Les Services Conseils HARDY, Quebec, Qc

Importing Excel Files Into SAS Using DDE Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA

Managing Tables in Microsoft SQL Server using SAS

Create an Excel report using SAS : A comparison of the different techniques

Customized Excel Output Using the Excel Libname Harry Droogendyk, Stratia Consulting Inc., Lynden, ON

Report Customization Using PROC REPORT Procedure Shruthi Amruthnath, EPITEC, INC., Southfield, MI

Writing Packages: A New Way to Distribute and Use SAS/IML Programs

Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

9.1 SAS/ACCESS. Interface to SAP BW. User s Guide

Creating HTML Output with Output Delivery System

Be a More Productive Cross-Platform SAS Programmer Using Enterprise Guide

Using SAS Enterprise Business Intelligence to Automate a Manual Process: A Case Study Erik S. Larsen, Independent Consultant, Charleston, SC

SQL SUBQUERIES: Usage in Clinical Programming. Pavan Vemuri, PPD, Morrisville, NC

Managing very large EXCEL files using the XLS engine John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT

THE HELLO WORLD PROJECT

REx: An Automated System for Extracting Clinical Trial Data from Oracle to SAS

PharmaSUG Paper AD11

Flat Pack Data: Converting and ZIPping SAS Data for Delivery

How to Use SDTM Definition and ADaM Specifications Documents. to Facilitate SAS Programming

Exporting Client Information

Same Data Different Attributes: Cloning Issues with Data Sets Brian Varney, Experis Business Analytics, Portage, MI

ABSTRACT THE ISSUE AT HAND THE RECIPE FOR BUILDING THE SYSTEM THE TEAM REQUIREMENTS. Paper DM

Paper PO03. A Case of Online Data Processing and Statistical Analysis via SAS/IntrNet. Sijian Zhang University of Alabama at Birmingham

Integrating SAS with JMP to Build an Interactive Application

ing Automated Notification of Errors in a Batch SAS Program Julie Kilburn, City of Hope, Duarte, CA Rebecca Ottesen, City of Hope, Duarte, CA

An Introduction to SAS/SHARE, By Example

Importing Excel File using Microsoft Access in SAS Ajay Gupta, PPD Inc, Morrisville, NC

THE POWER OF PROC FORMAT

Clinical Data Management (Process and practical guide) Nguyen Thi My Huong, MD. PhD WHO/RHR/SIS

Integrating SAS and Excel: an Overview and Comparison of Three Methods for Using SAS to Create and Access Data in Excel

A Recursive SAS Macro to Automate Importing Multiple Excel Worksheets into SAS Data Sets

Normalized EditChecks Automated Tracking (N.E.A.T.) A SAS solution to improve clinical data cleaning

From Database to your Desktop: How to almost completely automate reports in SAS, with the power of Proc SQL

Let SAS Write Your SAS/GRAPH Programs for You Max Cherny, GlaxoSmithKline, Collegeville, PA

ER/Studio 8.0 New Features Guide

DBF Chapter. Note to UNIX and OS/390 Users. Import/Export Facility CHAPTER 7

The Essentials of Finding the Distinct, Unique, and Duplicate Values in Your Data

Using SAS Output Delivery System (ODS) Markup to Generate Custom PivotTable and PivotChart Reports Chevell Parker, SAS Institute

Chapter 24: Creating Reports and Extracting Data

Preparing Real World Data in Excel Sheets for Statistical Analysis

Exporting Contact Information

Building and Customizing a CDISC Compliance and Data Quality Application Wayne Zhong, Accretion Softworks, Chester Springs, PA

PharmaSUG Paper AD08

Utilizing Clinical SAS Report Templates Sunil Kumar Gupta Gupta Programming, Thousand Oaks, CA

Tutorial 3. Maintaining and Querying a Database

Microsoft Office 2010

ETL-03 Using SAS Enterprise ETL Server to Build a Data Warehouse: Focus on Student Enrollment Data ABSTRACT WHO WE ARE MISSION PURPOSE BACKGROUND

Health Services Research Utilizing Electronic Health Record Data: A Grad Student How-To Paper

Downloading Your Financial Statements to Excel

Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports

SENDING S IN SAS TO FACILITATE CLINICAL TRIAL. Frank Fan, Clinovo, Sunnyvale CA

Table Lookups: From IF-THEN to Key-Indexing

SAS UNIX-Space Analyzer A handy tool for UNIX SAS Administrators Airaha Chelvakkanthan Manickam, Cognizant Technology Solutions, Teaneck, NJ

New Tricks for an Old Tool: Using Custom Formats for Data Validation and Program Efficiency

PROC SQL for SQL Die-hards Jessica Bennett, Advance America, Spartanburg, SC Barbara Ross, Flexshopper LLC, Boca Raton, FL

Paper Hot Links: Creating Embedded URLs using ODS Jonathan Squire, C 2 RA (Cambridge Clinical Research Associates), Andover, MA

Paper PS04_05 ing a SAS Report to Excel C. Royce Claytor, Dominion Resources Services, Richmond, Virginia

You have got SASMAIL!

Effective Use of SQL in SAS Programming

Advanced Tutorials. Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD

Bag it, Tag it & Put it: Project tracking one click away!

SAS ODS HTML + PROC Report = Fantastic Output Girish K. Narayandas, OptumInsight, Eden Prairie, MN

Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA.

USE CDISC SDTM AS A DATA MIDDLE-TIER TO STREAMLINE YOUR SAS INFRASTRUCTURE

Enhancing the SAS Enhanced Editor with Toolbar Customizations Lynn Mullins, PPD, Cincinnati, Ohio

Unicenter Patch Management

Intro to Longitudinal Data: A Grad Student How-To Paper Elisa L. Priest 1,2, Ashley W. Collinsworth 1,3 1

Search help. More on Office.com: images templates

PharmaSUG Paper QT26

SAS Data Views: A Virtual View of Data John C. Boling, SAS Institute Inc., Cary, NC

Transcription:

ABSTRACT PharmaSUG 2015 - Paper QT12 Let SAS Modify Your Excel File Nelson Lee, Genentech, South San Francisco, CA It is common to export SAS data to Excel by creating a new Excel file. However, there are times that we want to update the input Excel file instead of creating a new one so that we can preserve certain data attributes such as text highlight, font color, and etc. This paper shares a quick tip that utilizes the SAS MODIFY statement to update a column in an Excel file where the input file and the output file are the same. The advantage of this approach, documented in this paper, is that it reads data from an edit check specifications document (an Excel file) and it updates a targeted column in the same document. This paper is written for audiences with beginner skills; the code is written using SAS Version 9.2 on the Windows operating system. INTRODUCTION In Clinical Data Management (CDM), one of the major goals is to ensure the data quality in conducting clinical trials. A common practice in the industry is to implement edit checks. While there are different ways to document edit checks, edit checks documented in an Excel spreadsheet is one of the most popular options. This document is commonly referred to as the Edit Check Specifications (ECS) document. Once the ECS is developed, Clinical Programmers (CP) will start edit check programming based on the ECS and track programming progress on the same specification during the programming cycle. It is labor intensive to manually track if a check has been programmed or not, and thus SAS is used to facilitate this tracking process by comparing the ECS with other specifications. For example, one can generate a report from an Electronic Data Capture (EDC) system that holds the relevant edit check information such as check names. The comparison concept is fairly straightforward; it compares the data from the ECS with the data from the EDC system, identifies matching check names and then generates a report (a new document) to indicate if a check has been programmed or not. One challenge is that the ECS is a living document. Data Managers (DM) and Clinical Programmers will need to modify it from time to time. It is challenging to maintain track changes in the ECS because Excel does not have the same level of track changes functionality as in Microsoft Word. Color fonts, highlights, and strikethroughs are some Excel attributes that are utilized for the purpose of tracking or identifying changes. For example, a check description highlighted in yellow could mean this check has been modified or a strikethrough on the check message could mean that certain part of the message is no longer needed. Hence, there is a need to maintain these attributes. It is difficult to track progress and changes in two different documents when there is no automatic direct link between them. This paper shares a different approach by modifying/updating the ECS to maintain tracking progress while keeping certain ECS (Excel) attributes as described above and avoiding the creation of another report. In other words, the input file and the output are the same file (i.e., ECS). AN EXAMPLE Exhibit 1 below illustrates an edit check specification in Excel format. It has highlights, color fonts and strikethroughs for predefined reasons. Each of these attributes stand for something that is meaningful to the CP and DM. In this example, there are 2000 edit checks in the ECS. It is easy to imagine how labor intensive it will be to track programming progress by manually updating the Programmed column to indicate when a check is programmed. Needless to say, it is also prone to human error. Exhibit 2 is a Check Listing (CL), an extract from an EDC system that documents all checks that exist in the system. These checks have been programmed as specified in the ECS. In this example, the CL is also an Excel file and has 500 checks. Records from the ECS and the CL (input files) are read into SAS and compared by check name. The programming logic is to identify matching check names between the 2 documents and to populate the Programmed column in the ECS with Yes or No. Yes means that the same check name exists in the EDC system (i.e., the edit check has been programmed). This aims to eliminate the effort that manual tracking entails and to avoid human entry errors. 1

Exhibit 1 (Edit Check Specification) Exhibit 2 (Check Listing) CREATE A NEW REPORT The following code compares the two input files and assigns Yes to the variable Programmed if the check name matches. This indicates that the edit check is programmed as it exists in the EDC system. A new report named outfile.xls is generated (refer to Exhibit 3). Note that Exhibit 3 captures the correct information and resembles the original edit check specification with updated data in the Programmed column. However, the color fonts or highlighted attributes are no longer there. *prepare libname to read input files; libname ecs "C:\Users\nlee01\Desktop\MY SAS\editcheck.xls" ; libname cl "C:\Users\nlee01\Desktop\MY SAS\checkinedc.xls" ; *create data set for edit checks from edit check specification; data checkspec; set ecs.'editcheck$'n; *create data set for checks programmed in EDC system; data checklist; set cl.'checkinedc$'n; *prepare data sets for merging; proc sort data=checkspec; by check_name; proc sort data=checklist; by check_name; *merge data and identify matching check name; data matchcheck; merge checkspec(in=in1) checklist(in=in2); by check_name; if in1; if in1=in2 then programmed='yes'; else programmed ='No'; proc sort; by id; 2

*write output to Excel; ods listing close; ods tagsets.excelxp file="c:\users\nlee01\desktop\my SAS\outfile.xls"; ods tagsets.excelxp options(sheet_name='editcheck' Absolute_Column_Width="5,8,5,30,20,10,20" autofit_height = 'yes' autofilter ='all'); proc print data=match l noobs; var ID Programmed ecrf Check_Description Check_Message Action_Field Check_Name; label Check_Description = 'Check Description' Check_Message = 'Check Message' Action_Field = 'Action Field' Check_Name = 'CheckName' ; ods tagsets.excelxp close; ods listing; libname ecs clear; libname cl clear; Exhibit 3 MODIFY THE SAME SPECIFICATION DOCUMENT WITH MODIFY In order to maintain the attributes in the original ECS, the code in the previous section is modified. Specifically, the section write output to Excel is replaced with the following code. *update the edit check specification based on the results in matchcheck; data ecs.'editcheck$'n; modify ecs.'editcheck$'n matchcheck; by id; This above code updates the ECS with the data in data set MATCHCHECK by matching the value of the variable ID (primary key). Using SAS terminology 1, ECS is the master-data-set and MATCHCHECK is the transaction-data-set. ID is the byvariable and is the common variable in the master-data-set and transaction-data-set. MODIFY uses ID to match observations in MATCHCHECK to ECS. Furthermore, the libname statements also need modification by adding the option scan_text=no as follows libname ecs "C:\Users\nlee01\Desktop\MY SAS\editcheck.xls" scan_text=no ; This change enables the update of the ECS because the ECS is a Microsoft Excel file 2. If this option is not added, update to the ECS will fail and a SAS error in the SAS log will show the following error message. ERROR: No Update/Append allowed. Set libname option SCAN_TEXT=NO to enable Update and Append operation. 3

Exhibit 4 displays the ECS after it is modified by the code as stated in the update the edit check specification section. The contents are the same as in Exhibit 3 but the original formatting attributes are preserved. Exhibit 4 (ECS after MODIFY ) LIMITATION There are some limitations with the proposed approach that achieve the outcome in exhibit 4. One of the limitations is the bitness between Microsoft Office and SAS. The bitness must match or the code will not work. For example, if a laptop has SAS 9.2 32-bit for Windows and 32-bit Microsoft Office installed, then the bitness is a match. Although SAS PC files server is a solution to the bitness difference, MODIFY is not supported with PC files server based on my experience. Another limitation is that one cannot have mixed format in any column in the ECS. For example, the data in the ID column must have the exact same format (i.e., either all are numeric or all are character but not a mix of the two). MODIFY will stop once it encounters the first occurrence of a mixed format (MODIFY still updates the ECS before that point). The same also applies to non-printable characters. This approach is also limited to one unique record identifier. In other words, it allows one by-variable so the masterdata-set cannot have more than one unique record identifier. CONCLUSION This paper demonstrates a simple way to MODIFY an existing Excel file without the need of creating a new file. The MODIFY statement in SAS serves the purpose. This approach not only streamlines the process of writing SAS output to the same Excel file, it also preserves certain formatting attributes in Excel without additional programming effort. There are some limitations that have been pointed out in this paper. Nevertheless, the use of MODIFY simplifies the body of the code and provides a better option where updates need to be made directly to the original Excel spreadsheet. REFERENCES 1. SAS Statements: Reference, http://support.sas.com/documentation/cdl/en/lestmtsref/63323/html/default/viewer.htm#n0g9jfr4x5hgsfn17gtma5547lt1.htm 2. SAS/ACCESS(R) 9.2 Interface to PC Files: Reference, Second Edition http://support.sas.com/documentation/cdl/en/acpcref/63184/html/default/viewer.htm#a002143109.htm 4

ACKNOWLEDGMENTS I would like to thank my colleagues Katrina Paz and Henk Pechler for their support and invaluable input that further enhance the quality of this paper. CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Name: Nelson Lee, CCDM Enterprise: Genentech Address: 1 DNA Way City, State ZIP: South San Francisco, CA 94080 Work Phone: 6502252121 E-mail: lee.nelson@gene.com SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 5