Obtaining A List of Files In A Directory Using SAS Functions



Similar documents
SUGI 29 Applications Development

Combining External PDF Files by Integrating SAS and Adobe Acrobat Brandon Welch, Rho, Inc., Chapel Hill, NC Ryan Burns, Rho, Inc.

Getting Started with Command Prompts

With this document you will be able to download and install all of the components needed for a full installation of PCRASTER for windows.

Computer Programming In QBasic

A Recursive SAS Macro to Automate Importing Multiple Excel Worksheets into SAS Data Sets

Using FILEVAR= to read multiple external files in a DATA Step

An Introduction to Using the Command Line Interface (CLI) to Work with Files and Directories

Monitor file integrity using MultiHasher

Better Safe than Sorry: A SAS Macro to Selectively Back Up Files

Hands-On UNIX Exercise:

You have got SASMAIL!

Cross platform Migration of SAS BI Environment: Tips and Tricks

Introduction to the UNIX Operating System and Open Windows Desktop Environment

Preparing a Windows 7 Gold Image for Unidesk

ABSTRACT INTRODUCTION

Beginning to Program Python

Essential Project Management Reports in Clinical Development Nalin Tikoo, BioMarin Pharmaceutical Inc., Novato, CA

Chapter 5 Programming Statements. Chapter Table of Contents

CS 2112 Lab: Version Control

Bulk Downloader. Call Recording: Bulk Downloader

Capture Pro Software FTP Server System Output

PHP Integration Kit. Version User Guide

Automated distribution of SAS results Jacques Pagé, Les Services Conseils HARDY, Quebec, Qc

SAS University Edition: Installation Guide for Linux

Backing Up TestTrack Native Project Databases

EXTRACTING DATA FROM PDF FILES

Drupal Automated Testing: Using Behat and Gherkin

Scheduling in SAS 9.3

Instant Interactive SAS Log Window Analyzer

MATLAB Functions. function [Out_1,Out_2,,Out_N] = function_name(in_1,in_2,,in_m)

FireBLAST Marketing Solution v2

Tutorial Guide to the IS Unix Service

How To Write A Clinical Trial In Sas

Zenoss Core ZenUp Installation and Administration

Let SAS Do the Downloading: Using Macros to Generate FTP Script Files Arthur Furnia, Federal Aviation Administration, Washington DC

StreamServe Persuasion SP4 Service Broker

Installing and Running MOVES on Linux

SAS UNIX-Space Analyzer A handy tool for UNIX SAS Administrators Airaha Chelvakkanthan Manickam, Cognizant Technology Solutions, Teaneck, NJ

Installing the ASP.NET VETtrak APIs onto IIS 5 or 6

A Method for Cleaning Clinical Trial Analysis Data Sets

SAS Macros as File Management Utility Programs

Cloud Server powered by Mac OS X. Getting Started Guide. Cloud Server. powered by Mac OS X. AKJZNAzsqknsxxkjnsjx Getting Started Guide Page 1

SQLITE C/C++ TUTORIAL

Ingenious Testcraft Technical Documentation Installation Guide

Developing Applications Using BASE SAS and UNIX

Get Started MyLab and Mastering for Blackboard Learn Students

Paper FF-014. Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL

We begin by defining a few user-supplied parameters, to make the code transferable between various projects.

Automated Offsite Backup with rdiff-backup

PharmaSUG Paper AD11

AN INTRODUCTION TO UNIX

PCmover Enterprise User Guide

Winsock RCP/RSH/REXEC for Win32. DENICOMP SYSTEMS Copyright? 2002 Denicomp Systems All rights reserved. INTRODUCTION...1 REQUIREMENTS...

SAS ODS HTML + PROC Report = Fantastic Output Girish K. Narayandas, OptumInsight, Eden Prairie, MN

Capture Pro Software FTP Server Output Format

Flat Pack Data: Converting and ZIPping SAS Data for Delivery

Scheduling in SAS 9.4 Second Edition

Application Notes for Packaging and Deploying Avaya Communications Process Manager Sample SDK Web Application on a JBoss Application Server Issue 1.

Symantec Backup Exec Desktop Laptop Option ( DLO )

Practice Fusion API Client Installation Guide for Windows

Applications Development ABSTRACT PROGRAM DESIGN INTRODUCTION SAS FEATURES USED

An Introduction to the Linux Command Shell For Beginners

COMP 356 Programming Language Structures Notes for Chapter 4 of Concepts of Programming Languages Scanning and Parsing

Get Funky with %SYSFUNC

1. Installation Instructions - Unix & Linux

Oracle Endeca Commerce. Tools and Frameworks Deployment Template Usage Guide Version July 2012

UNIX Comes to the Rescue: A Comparison between UNIX SAS and PC SAS

Tracking Network Changes Using Change Audit

Automation of Large SAS Processes with and Text Message Notification Seva Kumar, JPMorgan Chase, Seattle, WA

Downloading Your Financial Statements to Excel

Real-Time Market Monitoring using SAS BI Tools

Unix Shell Scripts. Contents. 1 Introduction. Norman Matloff. July 30, Introduction 1. 2 Invoking Shell Scripts 2

Foundations and Fundamentals

1. Configuring the Dispatcher / Load Balancer

Paper Creating Variables: Traps and Pitfalls Olena Galligan, Clinops LLC, San Francisco, CA

SENDING S IN SAS TO FACILITATE CLINICAL TRIAL. Frank Fan, Clinovo, Sunnyvale CA

Elixir Schedule Designer User Manual

Command Line Interface Specification Linux Mac

Using SSH Secure FTP Client INFORMATION TECHNOLOGY SERVICES California State University, Los Angeles Version 2.0 Fall 2008.

Microsoft Networks/OpenNET FILE SHARING PROTOCOL. INTEL Part Number Document Version 2.0. November 7, 1988

Eclipse installation, configuration and operation

UNIX Operating Environment

An overview of FAT12

NewsletterAdmin 2.4 Setup Manual

Oracle Retail Item Planning Configured for COE Installation Guide Release December 2008

Microsoft. File Management. Windows Desktop. Microsoft File Management NCSEA

Multivendor Extension User Guide

Sophos for Microsoft SharePoint Help

JISIS and Web Technologies

Transcription:

Obtaining A List of Files In A Directory Using SAS Functions Jack Hamilton, Kaiser Permanente Division of Research, Oakland, California ABSTRACT This presentation describes how to use the SAS data information functions to obtain a list of the files in a directory under Windows or Unix. This method has the advantages of being written entirely in SAS and mostly platform independent. A common alternative method, "shelling out" to the host operating system, is also discussed. The data information functions are available in all currently supported versions of SAS. This paper is in two parts. The first describes obtaining the names of files in a single directory; the second part discusses how to obtain a recursive directory listing, i.e., the files in a directory and the directories underneath it. INTRODUCTION It is often desirable to obtain a list of the files in a directory. You may, for example, want to process all files matching a particular complicated pattern that can't be expressed with wildcards. Or you might want to put them all into a ZIP file using the ODS PACKAGE facility. Or you might just want to list them in a report. In the past, before the data information functions became available, the standard practice was to shell out to the operating system, issue a directory listing command, and capture and parse the output. This approach had two drawbacks: output from directory commands varies by system and can be hard to parse, and it is not always possible to run an operating system command. The data information functions, while complicated, avoid these problems. THE DATA INFORMATION FUNCTIONS Function Name Filename DOpen DNum DRead Function Purpose Assigns or deassigns a fileref to an external file, directory, or output device Opens a directory and returns a directory identifier value Returns the number of members in a directory Returns the name of a directory member MOpen Opens a file by directory id and member name, and returns the file identifier or a 0 DClose Closes a directory that was opened by the DOPEN function Table 1. Data Information Functions Source: http://support.sas.com/onlinedoc/913/getdoc/en/lrdict.hlp/a000245852.htm READING DIRECTORY ENTRY NAMES The basic algorithm is this: 1. Open the directory for processing using the Filename and DOpen functions. 2. Count the number of files in the directory using the DNum function 3. Iterate through the files (1 to number-of-files) and get the name of each entry using the DRead function. 4. Attempt to open each entry as a directory using the MOpen function. If it can't be opened as a directory, it's a file. If it's a file, output the name. 5. After looking at all entries, close the directory using the DClose function. 1

Here's an example that reads the names of the entries in Y:\WUSS2012: data yfiles; keep filename; length fref $8 filename $80; rc = filename(fref, 'Y:\wuss2012'); if rc = 0 did = dopen(fref); rc = filename(fref); else length msg $200.; msg = sysmsg(); put msg=; did =.; if did <= 0 putlog 'ERR' 'OR: Unable to open directory.'; dnum = dnum(did); do i = 1 to dnum; filename = dread(did, i); /* If this entry is a file, output. */ fid = mopen(did, filename); if fid > 0 output; rc = dclose(did); proc print data=yfiles; On my laptop, this prints: Obs filename 1 CGF_55.pdf 2 CGF_57.pdf 3 filenames.sas 4 filenames_recurse.sas 5 FP_57.docx 6 FP_57.pdf 7 pipe.sas 8 WUSS2012.zip 2

READING DIRECTORIES RECURSIVELY Reading directories and their subdirectories can be tricky. The example SAS Institute includes with PROC FCMP uses true recursion - it's a routine that calls itself multiple times. But recursion can be hard to understand, and can be slow. In general, anything that can be done with recursion can be done in another way, and this case it's possible to use a SAS data set as a stack. The dirs._found data set contains a list of the directories to search. It starts out containing only the name of the top level directory. As new directories are encountered, they are added to dirs._found, and processed as the data step steps though the dirs_found data set. /* Data set dirs_found starts out with the names of the root folders */ /* you want to analyze. After the second data step has finished, it */ /* will contain the names of all the directories that were found. */ /* The first root name must contain a slash or backslash. */ /* Make sure all directories exist and are readable. Use complete */ /* path names. */ data dirs_found (compress=no); length Root $120.; root = "y:\wuss2012"; output; data dirs_found /* Updated list of directories searched */ files_found (compress=no); /* Names of files found. */ keep Path FileName FileType; length fref $8 Filename $120 FileType $16; /* Read the name of a directory to search. */ modify dirs_found; /* Make a copy of the name, because we might reset root. */ Path = root; /* For the use and meaning of the FILENAME, DOPEN, DREAD, MOPEN, and */ /* DCLOSE functions, see the SAS OnlineDocs. */ rc = filename(fref, path); if rc = 0 did = dopen(fref); rc = filename(fref); else length msg $200.; msg = sysmsg(); putlog msg=; did =.; 3

if did <= 0 putlog 'ERR' 'OR: Unable to open ' Path=; return; dnum = dnum(did); do i = 1 to dnum; filename = dread(did, i); fid = mopen(did, filename); /* It's not explicitly documented, but the SAS online */ /* examples show that a return value of 0 from mopen */ /* means a directory name, and anything else means */ /* a file name. */ if fid > 0 /* FileType is everything after the last dot. If */ /* no dot, no extension. */ FileType = prxchange('s/.*\.{1,1}(.*)/$1/', 1, filename); if filename = filetype filetype = ' '; output files_found; else /* A directory name was found; calculate the complete */ /* path, and add it to the dirs_found data set, */ /* where it will be read in the next iteration of this */ /* data step. */ root = catt(path, "\", filename); output dirs_found; rc = dclose(did); proc print data=dirs_found; proc print data=files_found; On my laptop, this prints: Obs Root 1 y:\wuss2012 2 3 4

Obs Filename Type Path 1 CGF_55.pdf pdf y:\wuss2012 2 CGF_57.pdf pdf y:\wuss2012 3 filenames.sas sas y:\wuss2012 4 filenames_recurse.sas sas y:\wuss2012 5 FP_55.docx docx y:\wuss2012 6 FP_57.docx docx y:\wuss2012 7 FP_57.pdf pdf y:\wuss2012 8 pipe.sas sas y:\wuss2012 9 WUSS2012.zip zip y:\wuss2012 10 WUSS2012.zip zip 11 WUSS2012_AcceptanceLetter 48.pdf pdf 12 WUSS2012_AcceptanceLetter_Hamilton_DM.pdf pdf 13 WUSS2012_PresentersCopyrightFormDir.pdf pdf 14 WUSS2012_PresentersCopyrightFormZIP.pdf pdf 15 WritersGuidelines2012.pdf pdf 16 WUSS2012_PaperTemplate.doc doc 17 WUSS2012_PresentersCopyrightForm.pdf pdf 18 WUSS2012_PresentersFAQs.pdf pdf 5

REFERENCES "Functions and Call Routines Available at < http://support.sas.com/documentation/cdl/en/lrdict/64316/html/default/viewer.htm#a000245852.htm>. ACKNOWLEDGMENTS Jamila Gul, Kaiser Permanente Division of Research UPDATES Please check www.sascommunity.org for an updated version of this paper after the conference. CONTACT INFORMATION Your comments and questions are valued and encouraged. Contact the author at: Jack Hamilton jfh@acm.org SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. indicates USA registration. Other brand and product names are trademarks of their respective companies. 6