Methodologies for Converting Microsoft Excel Spreadsheets to SAS datasets




Karin LaPann
ViroPharma Incorporated

ABSTRACT

Much functionality has been added to the SAS-to-Excel procedures in SAS version 9. SAS version 8.2 brought some improvements, but one still had to do one of the following to make an Excel sheet readable:

A) Save it as a Microsoft Excel 5.0/95 Workbook (*.xls), with only one sheet per workbook.
B) Save it as a comma-delimited file, CSV (Comma delimited) (*.csv), and read it in with an INPUT statement in a DATA step.
C) Save it as a text file, Text (Tab delimited) (*.txt), and read it in with INPUT statements.

The first form could be read directly by SAS using PROC IMPORT. However, the workbook could have only one tab, hence the need to save it as Excel 5.0. The second and third forms could be read in with DATA step INPUT statements. The third form was readable only if no column had missing data, because SAS reads the values in sequential rather than positional order.

EXCEL FORMATTING ISSUES

Other dilemmas caused by the Excel setup include the following:

1. Spacing used by Excel users to make spreadsheets readable
2. Titles and footnotes on Excel documents
3. Formulas or macros
4. Dates and times
5. Mixed character and numeric entries in one column
6. Awkward titles across the top, which become variable names
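As an illustration of option B in the abstract, the following is a minimal sketch of reading comma-delimited data in a DATA step. The data set name, variable names, and inlined records are hypothetical (the paper does not show its CSV code); in practice the INFILE statement would point at the saved .csv file instead of DATALINES. The DSD option treats two consecutive delimiters as a missing value, which avoids the sequential-order problem described for files with missing data.

```sas
/* Sketch only: data set, variable names, and records are hypothetical. */
data work.labs_csv;
   /* DSD: consecutive commas are read as missing values.      */
   /* FIRSTOBS=2 skips the header record.                       */
   infile datalines dsd dlm=',' truncover firstobs=2;
   length patient_id 8 lab_test $20 lab_date 8 value 8 unit $10;
   informat lab_date date9.;
   format lab_date date9.;
   input patient_id lab_test $ lab_date value unit $;
datalines;
patient_id,lab_test,lab_date,value,unit
19321,Cholesterol,25Jan2006,160,mg/dl
19321,Bun,25Jan2006,,
19321,Albumin,25Jan2006,4,g/dl
;
run;

proc print data=work.labs_csv;
run;
```

In this sketch the second record keeps its missing VALUE and UNIT in the correct columns rather than shifting later fields to the left.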

The following Excel sheet will cause some problems:

Labcorp.xls

     A           B            C          D         E         F       G                H
 1   Labcorp
 2   Austin, TX
 3
 4   Patient Id  lab test     lab date   lab time  value     unit    comments
 5
 6   19321       Cholesterol  25Jan2006  6:15      160       mg/dl   on statin drug
 7   19321       Calcium      25Jan2006  6:15      80        mg/dl
 8   19321       Bun          25Jan2006  6:15
 9   19321       Albumin      25Jan2006  6:15      4         g/dl
10   19321       Basos        25Jan2006  6:15      0.5       %
11   19321       abs basos    25Jan2006  6:15      0         th/mm3
12   19321       Color Urin   25Jan2006  6:15      Yellow
13   19321       Urin         25Jan2006  6:15      Negative
14   19321       Lymphs       25Jan2006  6:15      39.1      %

Problems flagged on the sheet:

Item 1: Blank row. It will cause problems identifying whether a column is character or numeric.
Item 2: Date field. Do you want a SAS date value or a character date?
Item 3: Comments are allowed up to 200 characters in SAS. How should the column be defined?
Item 4: Alphanumeric data in what originally appears to be a numeric column. SAS will have assigned this column as numeric, so these entries will be set to missing.

Some possible solutions:

Item 1:
   a. If you have access to the Excel spreadsheet and are allowed to manipulate it, simply delete the offending blank row.
   b. Otherwise, read the data in starting on line 6, using the option GETNAMES=NO;

Item 2: Dates read in rather nicely to the SAS dataset as dates. However, if the date has been entered on the Excel spreadsheet as a character string, you need to read it in as a character string and then convert it to a date as follows:

labdt = input(lab_date, date9.);
format labdt date9.;

Item 3: Use a LENGTH statement to define the variable as character with length 200 prior to reading in.

Item 4:
   a. If you have access to the Excel spreadsheet and are allowed to manipulate it, format the entire column as character prior to reading in both character and numeric entries. After creation of the SAS dataset, save them as separate variables if you need the numeric entries separated.

   b. Otherwise, assign the column as a character field programmatically prior to reading it in to a SAS dataset.

Below is a table showing Excel to SAS and SAS to Excel conversions *:

Default SAS Variable and Type Formats for Excel Formats

Excel Column Format                                 SAS Variable Format             SAS Variable Type
--------------------------------------------------  ------------------------------  -----------------
Text                                                $w.                             character
General, Number, Scientific, Percentage, Fraction   See Note 3                      numeric
Currency, Accounting                                DOLLAR21.2                      numeric
Date, Datetime, Time                                DATE9. (see Notes 1 and 2)      numeric

Note 1: The default format is DATE9. However, you can use the SASDATEFMT option to change the format to other date or datetime formats. The LIBNAME engine automatically converts the internal date value for you.

Note 2: If you have a time-only field in your Microsoft Excel range, you can use SASDATEFMT to assign it the SAS TIME. format. Note that the SAS date/time value uses 01Jan1960 as a cutoff line while the Jet provider date/time value uses 30Dec1899 as a cutoff line.

Note 3: To access Fraction or Percent format data in your Excel file, you can use the FORMAT statement to assign the FRACT. or PERCENT. format in your DATA step code.

* From the SAS V.9.1.3 online documentation.

Moving to SAS V.9 (specifically V.9.1.3), we have new and exciting ways to read Excel spreadsheets, and also to write back to Excel, in addition to the V.8.2 methods.

READING FROM AND WRITING TO MULTIPLE SHEET EXCEL WORKBOOKS

The most exciting new feature is the ability to use multiple spreadsheets within one Excel workbook. We no longer have to save each sheet as its own Microsoft Excel 5.0/95 Workbook. We can now import and export up to and including Excel 2000 spreadsheets.

Import using the SAS/ACCESS Interface to PC Files. (Requires an extra license.)

Import Excel spreadsheets (version 5.0 and later) by specifying DBMS=XLS. This enables access to Excel spreadsheets on UNIX directly, without going to a PC server.

The IMPORT Procedure

You can write code:

PROC IMPORT DATAFILE="c:\myfiles\testing.xls"
            OUT=data.project101;
   SHEET="Sheet1";   /* use 'Sheet 1$'n if the name contains spaces */
   GETNAMES=YES;     /* with NO, SAS assigns generic names (F1, F2, etc.) */
RUN;

You can also use the IMPORT wizard within a SAS interactive session, then save the generated code and reuse it.

The EXPORT Procedure

You can use translation engines (DBMS=XLS) and specify the Excel workbook version:

PROC EXPORT DATA=data.project101
            DBMS=EXCEL2000
            OUTFILE="c:\myfiles\testing.xls";
   SHEET="Sheet1";   /* use 'Sheet 1$'n if the name contains spaces */
RUN;

The Libname Statement with Excel

The SAS engine now recognizes Excel spreadsheets using the LIBNAME statement. For this example, the spreadsheet name is CDISCtabs.xls, and the sheets are: VS domain, DM domain, EX domain. The macro variable &sdtmxls refers to the spreadsheet. Following is sample code to access a spreadsheet to get metadata using Excel:

%let sdtmxls = CDISCtabs.xls;

libname sdtmxls Excel "H:\WORKAREA\&sdtmxls"
   access=readonly header=no mixed=yes dbgen_name=sas dbmax_text=32767
   DBSASLABEL=COMPAT SCAN_TEXTSIZE=YES scantext=no;

proc contents data=sdtmxls._all_ out=_sdtmall noprint;
run;

libname sdtmxls clear;

For the above example, we create a listing of the contents of the spreadsheet that can then be called in for additional manipulations with INPUT statements.

Now read in or print directly, both whole spreadsheets and individual worksheets, as follows:

libname sdtmxls odbc dsn=excel;

proc print data=sdtmxls.'dm domain$'n;
run;

data mylib.new;
   set sdtmxls.'dm domain$'n;
run;

source: http://support.sas.com/kb/12/628.html

Use ODS to HTML format and save as Excel spreadsheet

Here is a simple SAS dataset which we can convert to HTML format using SAS V.9.1.3. In the ODS HTML statement we assign an Excel file name and save. The file can now be opened using Excel and has descriptive labels and formatted $ amounts. The date is read with a date informat and the amount with a comma informat so that the formats in PROC REPORT apply.

data sales;
   input @1 date mmddyy10. salesp $ 12-32 prod $ 34-49 amt : comma9. region $ 60-61;
   cards;
02/14/2007 David Smith           Block            500,000  PA
02/15/2007 John Doe              Interlock Paver  6,500    NJ
02/16/2007 Jim Jones             Groundface Paver 72,000   NJ
;
run;

title1 "Report of Sales of Construction Materials";

ods html file="H:\WORKAREA\Sales_report.xls";

proc report nowindows data=sales;
   columns region salesp date prod amt;
   define region / display "Sales Region";
   define salesp / display "Sales Person";
   define date   / display "Date of Sale" format=yymmdd10.;
   define prod   / display "Product";
   define amt    / display "Amount" format=dollar11.2;
run;

ods html close;

In addition, you can create a spanned header across two or more cells above the table by substituting the following notation in the COLUMNS statement:

columns region salesp date ("Product Information" prod amt);

You can now control font and color in Excel cells using ODS tagsets as follows:

ods tagsets.excelxp file="&outdir/demo.xls" style=styles.xlstatistical;

proc report data=sashelp.class nowd
   style(header)={background=gray font_size=8pt font_weight=bold just=left};
   columns name sex age height weight;
   compute sex;
      /* SEX is stored in uppercase in SASHELP.CLASS */
      if sex='F' then call define(_col_,'style','style=[background=pink foreground=blue font_size=8pt]');
      if sex='M' then call define(_col_,'style','style=[background=blue foreground=yellow font_size=8pt]');
   endcomp;
run;

ods tagsets.excelxp close;

SOME CREATIVE USES FOR SAS TO EXCEL CONVERSIONS

Data Entry
Excel is often used by the layperson for rudimentary data entry. A workbook can also be quite complex, with the addition of formulas and other reference values. Each cell in Excel has properties which may or may not carry over to the SAS dataset, depending on the version of SAS you are using.

Lookup Tables
Data can be converted programmatically using lookup tables generated in Excel and then read into SAS. For example, recode values by using the lookup table:

     A           B          C            D                E
 1   LABPANEL    LABTEXT    LABUNIT      LABSTANDARDUNIT  FACTOR
 2   Chemistry   BUN        mg/dl        mg/dl            1
 3   Chemistry   BUN        MG/DL        mg/dl            1
 4   Hematology  Platelets  X 10-3/uL    10³/mm³          1
 5   Hematology  Platelets  K/UL         10³/mm³          1
 6   Hematology  Platelets  X 10^9TH/L   10³/mm³          1
 7   Hematology  Platelets  X10E + 09/L  10³/mm³          1

By creating a SAS dataset from the above lookup table maintained in Excel, then merging it into our lab dataset, we can standardize the units to one consistent unit across all labs used in the study.
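The lookup-table merge described above can be sketched as follows. This is an illustration under stated assumptions, not the author's program: the data set names (work.lookup, work.labs, work.labs_std), the lab records, and the derived variables std_value and std_unit are hypothetical, and the lookup rows are inlined here rather than read from the Excel workbook.

```sas
/* Hypothetical lookup table mirroring the Excel columns above */
data work.lookup;
   infile datalines dsd dlm='|' truncover;
   length labtext $20 labunit $20 labstandardunit $20 factor 8;
   input labtext $ labunit $ labstandardunit $ factor;
datalines;
BUN|mg/dl|mg/dl|1
BUN|MG/DL|mg/dl|1
Platelets|K/UL|10^3/mm^3|1
;
run;

/* Hypothetical lab results reported with inconsistent units */
data work.labs;
   infile datalines dsd dlm='|' truncover;
   length patid 8 labtext $20 labunit $20 value 8;
   input patid labtext $ labunit $ value;
datalines;
19321|BUN|MG/DL|14
19321|Platelets|K/UL|250
;
run;

/* Both inputs must be sorted by the merge keys */
proc sort data=work.lookup; by labtext labunit; run;
proc sort data=work.labs;   by labtext labunit; run;

data work.labs_std;
   merge work.labs(in=inlab) work.lookup(in=inlk);
   by labtext labunit;
   if inlab;                          /* keep lab records only */
   if inlk then do;                   /* unit found in the lookup table */
      std_value = value * factor;
      std_unit  = labstandardunit;
   end;
run;
```

Lab records whose test and unit combination is missing from the lookup table keep missing values in std_value and std_unit, which makes unmapped units easy to list and add to the Excel sheet.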

Tables of Contents
An Excel spreadsheet can be developed to drive titles and footnotes in reports, and even to serve as metadata for automatic creation of tables and listings for pharmaceutical studies.

Reporting for Upper Management
SAS data can ultimately be displayed in Excel spreadsheets that are then sent to end users, who might add graphs and formulas to them. In SAS V.9 we can even add color shading to the cells to make them visually appealing.

Maintaining SAS Format Catalogs
Catalogs for pharmaceutical studies can more easily be maintained in an Excel spreadsheet, and then read in for each study. By using the column names needed by PROC FORMAT, the catalog gets read in smoothly.

Sample Excel spreadsheet format.xls

     A         B      C    D       E
 1   FMTNAME   START  END  LABEL   TYPE
 2   YESNOFM   1      1    YES     N
 3   YESNOFM   2      2    NO      N
 4   YESNOFM   9      9    UNK     N
 5   GENDER    M      M    MALE    C
 6   GENDER    F      F    FEMALE  C
 7

Sample program to read in and convert to SAS Catalog

proc access dbms=excel;
   create work._imex_.access;
   path = &xlspath;
   scantype = yes;
   getnames = yes;
   mixed = Y;
   create work._imex_.view;
   select all;
run;

data _fmt;
   set work._imex_;
   if fmtname = ' ' then delete;
   if end = ' ' then end = start;
run;

proc format cntlin = _fmt library = library;
run;
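To show what the CNTLIN= step does with such a sheet, here is a minimal self-contained sketch. It assumes the same five columns as format.xls but inlines the rows with DATALINES instead of reading the workbook; the data set names work.fmt_ctl and work.demo, the variable lengths, and the use of the WORK library (rather than LIBRARY) are illustrative choices, not part of the paper.

```sas
/* Hypothetical control data set with the columns PROC FORMAT expects */
data work.fmt_ctl;
   infile datalines truncover;
   length fmtname $8 start $8 end $8 label $16 type $1;
   input fmtname $ start $ end $ label $ type $;
datalines;
YESNOFM 1 1 YES N
YESNOFM 2 2 NO N
YESNOFM 9 9 UNK N
GENDER M M MALE C
GENDER F F FEMALE C
;
run;

/* Build the formats from the control data set.               */
/* TYPE='N' creates a numeric format, TYPE='C' a character one. */
proc format cntlin=work.fmt_ctl library=work;
run;

/* Apply the new formats and write the decoded values to the log */
data work.demo;
   answer = 2;
   sex    = 'F';
   answer_txt = put(answer, yesnofm.);
   sex_txt    = put(sex, $gender.);
   put answer_txt= sex_txt=;
run;
```

Keeping the control rows in Excel means a study team can add or change codes without touching the SAS program; only the read-in step differs from this sketch.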

CONCLUSION

SAS version 9 offers many new features to handle the conversions from SAS to Excel and back again. Excel is often used as a data entry vehicle, thus making these conversions to SAS necessary. We can now control the cosmetic features of the output to Excel using the ODS tagsets, making the end product much more production-ready. There are many creative uses for converting to and from Excel. This paper was intended to inform the SAS user of many new features available to do these conversions. For a more comprehensive reference for ODS tagsets, please see the SAS online documentation. There are also many excellent papers on this subject.

ACKNOWLEDGMENTS

I would like to thank ViroPharma Incorporated for the use of the SAS software.

CONTACT INFORMATION

Questions and feedback are welcome. Send them to:

Karin LaPann
ViroPharma Incorporated
397 Eagleview Drive
Exton, PA 19341 USA
(610) 321-2329
Karin.lapann@viropharma.com or lapannk@comcast.net

COPYRIGHT INFORMATION

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.