How to Reduce the Disk Space Required by a SAS Data Set

Selvaratnam Sridharma, U.S. Census Bureau, Washington, DC

ABSTRACT

SAS data sets can be large, and disk space is often at a premium. This paper discusses how SAS options such as COMPRESS=, SAS statements such as LENGTH and ATTRIB, SAS views, and macros can be used to reduce the size of a SAS data set. The existing %SQUEEZE macro finds the minimum lengths required for both numeric and character variables in a SAS data set and applies those lengths to reduce the size of the data set. Another macro, %DROPMISS, developed here, automatically identifies and drops variables that have only missing or null values.

INTRODUCTION

Storing a large data set can exhaust the available storage space. Compressing the data set with the SAS COMPRESS= option can save a large amount of space, as can using SAS views instead of SAS data sets. Another way to reduce the size of a SAS data set is to save only the needed variables, using the DROP= and/or KEEP= options. Using LENGTH or ATTRIB statements to assign each variable the minimum length it requires also reduces the size of a data set.

It is often difficult to find the minimum length required by a variable in a SAS data set. The existing macro %SQUEEZE finds the minimum lengths required by the variables in a SAS data set and assigns those lengths to the variables. Here, %SQUEEZE is modified slightly to make it more efficient.

Sometimes all the values of some variables in a SAS data set are missing, and we would like to drop these variables to save storage space. The %DROPMISS macro developed here does this.
SAS SYSTEM OPTIONS / STATEMENTS

Some SAS system options and statements, such as LENGTH, ATTRIB, KEEP, DROP, and COMPRESS, can be used to reduce the space required by a SAS data set.

LENGTH AND ATTRIB

Controlling the lengths of individual variables may greatly reduce the size of a SAS data set. A LENGTH or ATTRIB statement can be used to assign a length to a numeric or a character variable. For character variables, the statement must occur in the DATA step before the first reference to the variables named in the statement. For integers and for character variables with short values, this may dramatically decrease the size of the data set. For character variables, one byte corresponds to one character (in a single-byte encoding). Hence, to minimize storage space, set the length of each character variable to the number of characters in its longest value. The minimum length allowed for a numeric variable depends on the operating environment. Two examples are given below.

    data X;
      length a b c 3      /* numeric variables stored in 3 bytes */
             d e 5        /* numeric variables stored in 5 bytes */
             f g $4;      /* character variables of 4 characters */
      set Y;
    run;

    data X;
      attrib a b c length=3;
      attrib d e length=5;
      attrib f g length=$4;
      set Y;
    run;

The ATTRIB statement can also be used to change a variable's FORMAT, INFORMAT, and LABEL.
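The rule for character variables (use the length of the longest value) can be checked directly against the data. The following is a minimal sketch, assuming the SASHELP.CLASS sample data set is installed; the macro variable NAME_LEN and the output data set WORK.CLASS_SHORT are names chosen only for illustration.

```sas
/* Sketch: find the longest value of NAME in SASHELP.CLASS           */
/* (sample data set, assumed to be available in this installation).  */
proc sql noprint;
   select max(length(name)) into :name_len
      from sashelp.class;
quit;

/* Remove the leading blanks that INTO may leave in the macro variable */
%let name_len = &name_len;
%put Longest NAME value: &name_len characters;

/* Apply the computed length; LENGTH must come before SET to take effect */
data work.class_short;
   length name $&name_len;
   set sashelp.class;
run;
```

Recent SAS releases may write a warning about multiple lengths to the log for the last step. No value is truncated here, because the length was computed from the data itself.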

One needs to be careful when assigning the length of a numeric variable with the LENGTH statement. If the assigned length is not adequate, some values of that variable will be truncated in the output data set, and the statement will not generate an error. It is not advisable to shorten non-integer variables, because some non-integer values can lose precision. When the assigned length is not adequate for a character variable, the values are truncated as well; recent SAS releases typically write a warning about multiple lengths to the log rather than an error.

KEEP AND DROP

When a SAS data set is created, only the needed variables should be kept. This can save a large amount of the space required to store the data set. Use the KEEP= and/or DROP= data set options to remove the unnecessary variables. To save processing time, do this as early as logically possible, as in the following example.

    data A (keep=a b q r);
      set B (drop=h k);    /* h and k are never read into the step */
      a = l + p;
      b = r + q;
    run;

SAS COMPRESS

COMPRESS= is both a SAS system option and a data set option, and it can greatly reduce the disk space required to store a SAS data set. You can set the option to either YES or BINARY. In newer versions, CHAR can be used instead of YES. If there are more character variables than numeric variables, COMPRESS=YES is generally the better choice. If there are more numeric variables than character variables, COMPRESS=BINARY is generally better. Both settings should be tried to find out which one works better. The option is used as in the following examples.

    data A (compress=yes);
      set sashelp.eismsg;
    run;

    NOTE: There were 1470 observations read from the data set SASHELP.EISMSG.
    NOTE: The data set WORK.A has 1470 observations and 6 variables.
    NOTE: Compressing data set WORK.A decreased size by 57.14 percent.
          Compressed is 15 pages; un-compressed would require 35 pages.

    data A (compress=binary);
      set sashelp.eismsg;
    run;

    NOTE: There were 1470 observations read from the data set SASHELP.EISMSG.
    NOTE: The data set WORK.A has 1470 observations and 6 variables.
    NOTE: Compressing data set WORK.A decreased size by 54.29 percent.
          Compressed is 16 pages; un-compressed would require 35 pages.

When COMPRESS= is set as a SAS system option at the beginning of a program, all the SAS data sets created by the program are compressed. A companion to COMPRESS= is the REUSE= option. Specifying REUSE=YES allows SAS to reuse space within the compressed data set that has been freed by deleted observations.

SAS uncompresses the observations of a compressed data set before using them in computations in DATA or PROC steps. So, although compression saves disk space, it requires additional CPU time to compress and uncompress. For a small file, compression can produce a file larger than the uncompressed one. Beginning with Version 8, SAS will not compress a SAS data set when the result would be a larger file.

    data A (compress=yes);
      set sashelp.accpeo;
    run;

    NOTE: There were 20 observations read from the data set SASHELP.ACCPEO.
    NOTE: The data set WORK.A has 20 observations and 3 variables.
    NOTE: Compressing data set WORK.A increased size by 100.00 percent.
          Compressed is 2 pages; un-compressed would require 1 pages.

Here are some benchmark results for a large data set that is used at the Census Bureau. The compression ratio is the ratio of the size of the compressed data set to the size of the uncompressed data set.

    COMPRESS OPTION    SIZE (BYTES)    COMPRESSION RATIO
    None               80,159,
    Binary             21,517,         %
    Char               37,008,         %

SAS VIEW

As an alternative to a SAS data set, one can use a SAS view. Wherever data are only read, a SAS view can be used in place of a SAS data set. A SAS view contains only the instructions required to retrieve data values from other SAS data sets or files, so it occupies only a small fraction of the space required by the SAS data set. A SAS view can be created with a DATA step or with PROC SQL. The following is an example of a SAS view created with a DATA step.

    data B / view=B;
      set sashelp.eismsg;
    run;

    NOTE: DATA STEP view saved on file WORK.B.

A PROC SQL view can read data from DATA step views, SAS data sets, other PROC SQL views, and ORACLE or other DBMS data.

    proc sql;
      create view AB as
        select var1, var2, var3
        from A
        order by var3, var4;
    quit;

    NOTE: SQL view WORK.AB has been defined.

In the above example, A can be a SAS view, a SAS data set, an ORACLE table, or any other DBMS table.

Starting with Version 8, a DATA step view retains its source statements. One can retrieve these statements as in the following example.

    data view=b;
      describe;
    run;

    NOTE: DATA step view WORK.B is defined as:
    data B/view=B; set sashelp.eismsg; run;

To retrieve the source statements for an SQL view, one needs to use SQL as in the following example.

    proc sql;
      describe view AB;
    quit;

    NOTE: SQL view WORK.AB is defined as:
    select var1, var2, var3 from A order by var3 asc, var4 asc;

SQUEEZING A SAS DATA SET

When a large data set is created, it is usually difficult to find the minimum length required by each individual variable. The following macros find the minimum lengths required by the numeric and character variables of a SAS data set and use these lengths to reduce the size of the data set. These macros can greatly reduce the storage space required by a SAS data set.

%SQUEEZE

The %SQUEEZE macro, created by Ross Bettinger (see Reference 1), squeezes a data set by reducing the space required by its numeric and character variables. A parameter of the macro lets you exclude variables that you do not want to squeeze. If you do not include the highlighted part of the code in Appendix A, you have the code for the %SQUEEZE macro.

%SQUEEZE_1

The %SQUEEZE macro is modified slightly here to create the macro %SQUEEZE_1. %SQUEEZE finds the minimum length required by each numeric variable by repeatedly applying the TRUNC function to every value of the variable. There is no need to search for the minimum length of a numeric variable whose length is already three, and sometimes there is no need to apply the TRUNC function to every value of a numeric variable. %SQUEEZE_1 incorporates these improvements. It runs at least as fast as %SQUEEZE and generally faster.

The squeezing technique discussed here may be used on integer-valued numeric variables, but it should not be used on non-integer-valued numeric variables, which might lose some accuracy. The code for this macro is given in Appendix A.
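The TRUNC test at the heart of the squeezing technique can be illustrated on a single value. This is only a sketch, not the macro's own code: the value 123456 and the variable names X, LEN, and MIN_LEN are arbitrary choices, and the lower bound of 3 bytes is the portable minimum assumed by the macro in Appendix A.

```sas
/* Sketch: smallest length (3 to 8 bytes) that stores X without loss. */
/* The value 123456 is an arbitrary illustration.                      */
data _null_;
   x = 123456;
   min_len = 8;                  /* used when no shorter length is exact */
   do len = 3 to 7;
      /* TRUNC(x, len) returns x as it would be stored in LEN bytes */
      if trunc(x, len) = x then do;
         min_len = len;
         leave;
      end;
   end;
   put 'Minimum length for ' x= 'is ' min_len= 'bytes';
run;
```

The macro tests from 7 bytes downward and keeps the maximum over all observations; the sketch tests upward for one value, which gives the same answer because a value that is exact in a given number of bytes is also exact in any larger number.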
COMPARING %SQUEEZE AND %SQUEEZE_1

    SQUEEZED    SIZE (bytes)    RATIO
    No          24,969,
    Yes         21,515,         %

%SQUEEZE and %SQUEEZE_1 were each used to squeeze the same large SAS data set, with the results below.

    Macros        TIME (minutes)
    %SQUEEZE      126
    %SQUEEZE_1    108

DROPPING VARIABLES WITH ONLY MISSING VALUES

When all the values of some numeric or character variables are missing, deleting these variables can save a large amount of disk space. Sometimes, however, you want to keep certain variables even though all their values are missing. The macro %DROPMISS (see Appendix B), developed here, automatically drops the variables in a data set whose values are always missing. A parameter of the macro lets you name the variables that must not be dropped. This macro is more efficient than the program in Reference 2. The table below gives the results for both programs for a SAS data set.

    Programs             SIZE (bytes)    TIME (minutes)
    Program in Sample    ,
    %DROPMISS            567,

COMBINING %SQUEEZE_1 AND %DROPMISS

Combining %SQUEEZE_1 and %DROPMISS gives another macro, %SQ_DROPMISS (see Appendix C). To save processing time, it is better to use %SQ_DROPMISS than to run %SQUEEZE_1 and %DROPMISS separately on a SAS data set. The methods in %SQUEEZE squeeze the lengths of character variables that are always missing to 1 and the lengths of numeric variables that are always missing to 3. So, after squeezing a SAS data set, only the character variables of length 1 and the numeric variables of length 3 need to be checked for variables whose values are always missing. This saves a great amount of processing time.

CONCLUSIONS

There are many ways to reduce the size of a data set that you want to store. The SAS options and statements, SAS views, and macros discussed in this paper can all be used to reduce the space required by a SAS data set. Instead of using %SQUEEZE_1 and %DROPMISS separately on a data set, it is better to use the macro %SQ_DROPMISS to save processing time.

REFERENCES

1. Ross Bettinger, Sample 267: %SQUEEZE-ing before Compressing Data, Redux. 7 Jul <
2. Sample 53: Delete variables that have only missing data. 7 Jul <

ACKNOWLEDGMENTS

We would like to thank David Chapman for offering valuable suggestions and comments.

SAS is a Registered Trademark of the SAS Institute, Inc. of Cary, North Carolina.

DISCLAIMER

This paper reports the results of research and analysis undertaken by Census Bureau staff. It has undergone a more limited review by the Census Bureau than its official publications. This report is released to inform interested parties and to encourage discussion.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Selvaratnam Sridharma
Economic Planning and Coordination Division
U.S. Bureau of the Census
Washington, DC

APPENDIX A: %SQUEEZE_1

%macro SQUEEZE_1( DSNIN       /* name of input SAS dataset  */
                , DSNOUT      /* name of output SAS dataset */
                , NOCOMPRESS= /* [optional] variables to be omitted from the
                                 minimum-length computation process */
                ) ;

/* PURPOSE: create LENGTH statement for vars that minimizes the variable length
 * to:
 *   numeric vars:   the fewest # of bytes needed to exactly represent the
 *                   values contained in the variable
 *   character vars: the fewest # of bytes needed to contain the longest
 *                   character string
 *
 * macro variable SQZLENTH is created which is then invoked in a subsequent
 * data step
 *
 * NOTE: if no char vars in dataset, produce no char var processing code
 * NOTE: length of format for char vars is changed to match computed length
 *       of char var
 *       e.g., if length( CHAR_VAR ) = 10 after %SQUEEZE-ing, then
 *       FORMAT CHAR_VAR $10. ; is generated
 * NOTE: variables in &DSNOUT are maintained in same order as in &DSNIN
 * NOTE: variables named in &NOCOMPRESS are not included in the minimum-
 *       length computation process and keep their original lengths as
 *       specified in &DSNIN
 *
 * EXAMPLE OF USE:
 *   %SQUEEZE_1( DSNIN, DSNOUT )
 *   %SQUEEZE_1( DSNIN, DSNOUT, NOCOMPRESS=A B C D--H X1-X100 )
 *   %SQUEEZE_1( DSNIN, DSNOUT, NOCOMPRESS=_numeric_ )
 *   %SQUEEZE_1( DSNIN, DSNOUT, NOCOMPRESS=_character_ )
 */

%global SQUEEZE ;
%local I ;

%if "&DSNIN" = "&DSNOUT" %then %do ;
    %put / \ ;
    %put ERROR from SQUEEZE: ;
    %put Input Dataset has same name as Output Dataset. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/*##########################################################################*/
/* begin executable code                                                    */
/*##########################################################################*/

/* Find the first positive integer n such that n+1 needs more than 3 bytes. */
/* The negative of this number is the first negative integer n such that    */
/* n-1 needs more than 3 bytes.                                             */
data x ;
    do i = 1 to 10000 ;
        a = trunc( i, 3 ) ;
        if a ^= i then do ;
            call symputx( 'max_3', a ) ;  /* SYMPUTX stores the value without blanks */
            output ;
            stop ;
        end ;
    end ;
run ;

/* create dataset of variable names whose lengths are to be minimized */
/* exclude from the computation all names in &NOCOMPRESS              */
proc contents data=&DSNIN( drop=&NOCOMPRESS ) memtype=data noprint
              out=_cntnts_( keep= name type length ) ;
run ;

%let N_CHAR = 0 ;
%let N_NUM  = 0 ;

data _null_ ;
    set _cntnts_ end=lastobs nobs=nobs ;
    /* skip variables that already have the minimum length */
    where ( type = 1 and length ^= 3 ) or ( type = 2 and length ^= 1 ) ;
    if nobs = 0 then stop ;
    n_char + ( type = 2 ) ;
    n_num  + ( type = 1 ) ;
    /* create macro vars containing final # of char, numeric variables */
    if lastobs then do ;
        call symput( 'N_CHAR', left( put( n_char, 5. ))) ;
        call symput( 'N_NUM' , left( put( n_num , 5. ))) ;
    end ;
run ;

/* if there are NO numeric or character vars in dataset, stop further */
/* processing                                                         */
%if %eval( &N_NUM + &N_CHAR ) = 0 %then %do ;
    %put / \ ;
    %put ERROR from SQUEEZE: ;
    %put No variables in dataset. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/* put global macro names into global symbol table for later retrieval */
%do I = 1 %to &N_NUM ;
    %global NUM&I NUMLEN&I ;
%end ;
%do I = 1 %to &N_CHAR ;
    %global CHAR&I CHARLEN&I ;
%end ;

/* create macro vars containing variable names                            */
/* efficiency note: could compute n_char, n_num here, but must declare    */
/* macro names to be global b4 stuffing them                              */
/* note: if no char vars in data, do not create macro vars                */
proc sql noprint ;
    %if &N_CHAR > 0 %then %str( select name into :CHAR1 - :CHAR&N_CHAR from _cntnts_ where type = 2 and length ne 1 ; ) ;
    %if &N_NUM  > 0 %then %str( select name into :NUM1 - :NUM&N_NUM from _cntnts_ where type = 1 and length ne 3 ; ) ;
quit ;

/* compute min # bytes (3 = min length, for portability over platforms)   */
/* for numeric vars                                                       */
/* compute min # bytes to keep rightmost character for char vars          */
data _null_ ;
    set &DSNIN end=lastobs ;
    %if &N_NUM  > 0 %then %str( array _num_len_  ( &N_NUM  ) 3 _temporary_ ; ) ;
    %if &N_CHAR > 0 %then %str( array _char_len_ ( &N_CHAR ) _temporary_ ; ) ;

    if _n_ = 1 then do ;
        %if &N_CHAR > 0 %then %str( do i = 1 to &N_CHAR ; _char_len_( i ) = 0 ; end ; ) ;
        %if &N_NUM  > 0 %then %str( do i = 1 to &N_NUM  ; _num_len_ ( i ) = 3 ; end ; ) ;
    end ;

    %if &N_CHAR > 0 %then %do I = 1 %to &N_CHAR ;
        _char_len_( &I ) = max( _char_len_( &I ), length( &&CHAR&I )) ;
    %end ;

    %if &N_NUM > 0 %then %do I = 1 %to &N_NUM ;
        if &&NUM&I ne . then do ;
            /* values within +/- &max_3 need no TRUNC test */
            if ( &&NUM&I > &max_3 or &&NUM&I < -&max_3 ) then do ;
                if      &&NUM&I ne trunc( &&NUM&I, 7 ) then _num_len_( &I ) = max( _num_len_( &I ), 8 ) ;
                else if &&NUM&I ne trunc( &&NUM&I, 6 ) then _num_len_( &I ) = max( _num_len_( &I ), 7 ) ;
                else if &&NUM&I ne trunc( &&NUM&I, 5 ) then _num_len_( &I ) = max( _num_len_( &I ), 6 ) ;
                else if &&NUM&I ne trunc( &&NUM&I, 4 ) then _num_len_( &I ) = max( _num_len_( &I ), 5 ) ;
                else if &&NUM&I ne trunc( &&NUM&I, 3 ) then _num_len_( &I ) = max( _num_len_( &I ), 4 ) ;
            end ;
        end ;
    %end ;

    if lastobs then do ;
        %if &N_CHAR > 0 %then %do I = 1 %to &N_CHAR ;
            call symput( "CHARLEN&I", put( _char_len_( &I ), 5. )) ;
        %end ;
        %if &N_NUM > 0 %then %do I = 1 %to &N_NUM ;
            call symput( "NUMLEN&I", put( _num_len_( &I ), 1. )) ;
        %end ;
    end ;
run ;

proc datasets nolist ;
    delete _cntnts_ ;
run ;

/* initialize SQZ_NUM, SQZ_CHAR global macro vars */
%let SQZ_NUM      = LENGTH ;
%let SQZ_CHAR     = LENGTH ;
%let SQZ_CHAR_FMT = FORMAT ;

%if &N_CHAR > 0 %then %do I = 1 %to &N_CHAR ;
    %let SQZ_CHAR     = &SQZ_CHAR %qtrim( &&CHAR&I ) $%left( &&CHARLEN&I ) ;
    %let SQZ_CHAR_FMT = &SQZ_CHAR_FMT %qtrim( &&CHAR&I ) $%left( &&CHARLEN&I ). ;
%end ;

%if &N_NUM > 0 %then %do I = 1 %to &N_NUM ;
    %let SQZ_NUM = &SQZ_NUM %qtrim( &&NUM&I ) &&NUMLEN&I ;
%end ;

/* build macro var containing order of all variables */
data _null_ ;
    length retain $32767 ;
    retain retain 'retain ' ;
    dsid = open( "&DSNIN", 'I' ) ; /* open dataset for read access only */
    do _i_ = 1 to attrn( dsid, 'nvars' ) ;
        retain = trim( retain ) || ' ' || varname( dsid, _i_ ) ;
    end ;
    rc = close( dsid ) ;           /* release the dataset */
    call symput( 'RETAIN', retain ) ;
run ;

/* apply SQZ_* to incoming data, create output dataset */
data &DSNOUT ;
    &RETAIN ;
    %if &N_CHAR > 0 %then %str( &SQZ_CHAR ; ) ;     /* optimize char var lengths        */
    %if &N_NUM  > 0 %then %str( &SQZ_NUM ; ) ;      /* optimize numeric var lengths     */
    %if &N_CHAR > 0 %then %str( &SQZ_CHAR_FMT ; ) ; /* adjust char var format lengths   */
    set &DSNIN ;
run ;

%L9999:
%mend SQUEEZE_1 ;

APPENDIX B: %DROPMISS

%macro DROPMISS( DSNIN   /* name of input SAS dataset  */
               , DSNOUT  /* name of output SAS dataset */
               , NODROP= /* [optional] variables to be omitted from dropping
                            even if they have only missing values */
               ) ;

/* PURPOSE: To find both the character and the numeric variables that have
 * only missing values and drop them if they are not in &NODROP
 *
 * NOTE: if no char vars in dataset, produce no char var processing code
 *
 * EXAMPLE OF USE:
 *   %DROPMISS( DSNIN, DSNOUT )
 *   %DROPMISS( DSNIN, DSNOUT, NODROP=A B C D--H X1-X100 )
 *   %DROPMISS( DSNIN, DSNOUT, NODROP=_numeric_ )
 *   %DROPMISS( DSNIN, DSNOUT, NODROP=_character_ )
 */

%global DROP1 ;
%local I ;

%if "&DSNIN" = "&DSNOUT" %then %do ;
    %put / \ ;
    %put ERROR from DROPMISS: ;
    %put Input Dataset has same name as Output Dataset. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/*##########################################################################*/
/* begin executable code                                                    */
/*##########################################################################*/

/* create dataset of variable names that have only missing values */
/* exclude from the computation all names in &NODROP              */
proc contents data=&DSNIN( drop=&NODROP ) memtype=data noprint
              out=_cntnts_( keep= name type ) ;
run ;

%let N_CHAR = 0 ;
%let N_NUM  = 0 ;

data _null_ ;
    set _cntnts_ end=lastobs nobs=nobs ;
    if nobs = 0 then stop ;
    n_char + ( type = 2 ) ;
    n_num  + ( type = 1 ) ;
    /* create macro vars containing final # of char, numeric variables */
    if lastobs then do ;
        call symput( 'N_CHAR', left( put( n_char, 5. ))) ;
        call symput( 'N_NUM' , left( put( n_num , 5. ))) ;
    end ;
run ;

/* if there are NO numeric or character vars in dataset, stop further */
%if %eval( &N_NUM + &N_CHAR ) = 0 %then %do ;
    %put / \ ;
    %put ERROR from DROPMISS: ;
    %put No variables in dataset. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/* put global macro names into global symbol table for later retrieval */
%do I = 1 %to &N_NUM ;
    %global NUM&I ;
%end ;
%do I = 1 %to &N_CHAR ;
    %global CHAR&I ;
%end ;

/* create macro vars containing variable names                            */
/* efficiency note: could compute n_char, n_num here, but must declare    */
/* macro names to be global b4 stuffing them                              */
/* note: if no char vars in data, do not create macro vars                */
proc sql noprint ;
    %if &N_CHAR > 0 %then %str( select name into :CHAR1 - :CHAR&N_CHAR from _cntnts_ where type = 2 ; ) ;
    %if &N_NUM  > 0 %then %str( select name into :NUM1 - :NUM&N_NUM from _cntnts_ where type = 1 ; ) ;
quit ;

/* put MAXIMUM values of the variables into macro variables */
%if &N_CHAR > 1 %then %let N_CHAR_1 = %eval( &N_CHAR - 1 ) ;
%if &N_NUM  > 1 %then %let N_NUM_1  = %eval( &N_NUM  - 1 ) ;

proc sql ;
    select
    %if &N_NUM > 1 %then %do ;
        %do I = 1 %to &N_NUM_1 ;
            max( &&NUM&I ),
        %end ;
    %end ;
    %if &N_NUM > 0 %then %do ;
        max( &&NUM&N_NUM )
    %end ;
    %if &N_CHAR > 0 and &N_NUM > 0 %then %do ;
        ,
    %end ;
    %if &N_CHAR > 1 %then %do ;
        %do I = 1 %to &N_CHAR_1 ;
            max( &&CHAR&I ),
        %end ;
    %end ;
    %if &N_CHAR > 0 %then %do ;
        max( &&CHAR&N_CHAR )
    %end ;
    into
    %if &N_NUM > 1 %then %do ;
        %do I = 1 %to &N_NUM_1 ;
            :NUMMAX&I,
        %end ;
    %end ;
    %if &N_NUM > 0 %then %do ;
        :NUMMAX&N_NUM
    %end ;
    %if &N_CHAR > 0 and &N_NUM > 0 %then %do ;
        ,
    %end ;
    %if &N_CHAR > 1 %then %do ;
        %do I = 1 %to &N_CHAR_1 ;
            :CHARMAX&I,
        %end ;
    %end ;
    %if &N_CHAR > 0 %then %do ;
        :CHARMAX&N_CHAR
    %end ;
    from &DSNIN ;
quit ;

/* initialize DROP_NUM, DROP_CHAR global macro vars */
%let DROP_NUM  = ;
%let DROP_CHAR = ;

/* a variable whose maximum is missing has only missing values */
%if &N_CHAR > 0 %then %do ;
    %do I = 1 %to &N_CHAR ;
        %if &&CHARMAX&I = %then %do ;
            %let DROP_CHAR = &DROP_CHAR %qtrim( &&CHAR&I ) ;
        %end ;
    %end ;
%end ;

%if &N_NUM > 0 %then %do ;
    %do I = 1 %to &N_NUM ;
        %if &&NUMMAX&I = . %then %do ;
            %let DROP_NUM = &DROP_NUM %qtrim( &&NUM&I ) ;
        %end ;
    %end ;
%end ;

/* apply DROP_* to incoming data, create output dataset */
data &DSNOUT ;
    %if &DROP_CHAR ^= %then %str( DROP &DROP_CHAR ; ) ; /* drop char variables that have only missing values */
    %if &DROP_NUM  ^= %then %str( DROP &DROP_NUM ; ) ;  /* drop num variables that have only missing values  */
    set &DSNIN ;
run ;

%L9999:
%mend DROPMISS ;
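A call to %DROPMISS might look like the following sketch. It assumes the macro above has been compiled; the data set WORK.HAVE, the output name WORK.WANT, and all variable names are hypothetical and exist only for this illustration.

```sas
/* Hypothetical test data: EMPTY_NUM and EMPTY_CHAR are missing in every row */
data work.have;
   length id 8 name $8 empty_char $8;
   do id = 1 to 5;
      name       = cats('row', id);
      empty_num  = .;
      empty_char = ' ';
      output;
   end;
run;

/* Drop all-missing variables, but protect EMPTY_CHAR through NODROP= */
%DROPMISS( work.have, work.want, NODROP=empty_char )

/* List the variables that remain in the output data set */
proc contents data=work.want;
run;
```

Under these assumptions one would expect EMPTY_NUM to be dropped and EMPTY_CHAR to be retained, because variables named in NODROP= are excluded from the check.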

APPENDIX C: %SQ_DROPMISS

options mprint mlogic symbolgen ;

%macro SQ_DROPMISS( DSNIN       /* name of input SAS dataset  */
                  , DSNOUT      /* name of output SAS dataset */
                  , NOCOMPRESS= /* [optional] variables to be omitted from the
                                   minimum-length computation process */
                  , NODROP=     /* [optional] variables to be omitted from
                                   dropping even if they have only missing
                                   values */
                  ) ;

/* PURPOSE: Squeeze a data set to have minimum lengths required for the
 * variables excluding the variables in &NOCOMPRESS applying %SQUEEZE_1 and
 * then DROP the variables that have always missing values in a more
 * efficient way.
 *
 * EXAMPLE OF USE:
 *   %SQ_DROPMISS( DSNIN, DSNOUT, NOCOMPRESS= )
 *   %SQ_DROPMISS( DSNIN, DSNOUT, NOCOMPRESS=A B C D--H X1-X100 )
 *   %SQ_DROPMISS( DSNIN, DSNOUT, NOCOMPRESS=_numeric_ )
 *   %SQ_DROPMISS( DSNIN, DSNOUT, NOCOMPRESS=_character_ )
 *   %SQ_DROPMISS( DSNIN, DSNOUT, NOCOMPRESS=_character_, NODROP=A C D )
 */

/*##########################################################################*/
/* begin executable code                                                    */
/*##########################################################################*/

/* Squeezing part */
/* Include the code for the macro %SQUEEZE_1 here */
/* NOCOMPRESS is a keyword parameter of %SQUEEZE_1, so pass it by name */
%SQUEEZE_1( &DSNIN, DSNSQUEEZED, NOCOMPRESS=&NOCOMPRESS ) ;

/* Dropping part */
%global DROP1 ;
%local I ;

%if "&DSNIN" = "&DSNOUT" %then %do ;
    %put / \ ;
    %put ERROR from SQ_DROPMISS: ;
    %put Input Dataset has same name as Output Dataset. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/* create dataset of variable names that have only missing values */
/* exclude from the computation all names in &NODROP              */
proc contents data=dsnsqueezed( drop=&NODROP ) memtype=data noprint
              out=_cntnts_( keep= name type length ) ;
run ;

%let N_CHAR = 0 ;
%let N_NUM  = 0 ;

data _null_ ;
    set _cntnts_ end=lastobs nobs=nobs ;
    /* after squeezing, only numeric vars of length 3 and char vars of */
    /* length 1 can be entirely missing                                */
    where ( type = 1 and length = 3 ) or ( type = 2 and length = 1 ) ;
    if nobs = 0 then stop ;
    n_char + ( type = 2 ) ;
    n_num  + ( type = 1 ) ;
    /* create macro vars containing final # of char, numeric variables */
    if lastobs then do ;
        call symput( 'N_CHAR', left( put( n_char, 5. ))) ;
        call symput( 'N_NUM' , left( put( n_num , 5. ))) ;
    end ;
run ;

/* if there are NO numeric or character vars in dataset, stop further */
%if %eval( &N_NUM + &N_CHAR ) = 0 %then %do ;
    %put / \ ;
    %put ERROR from SQ_DROPMISS: ;
    %put No variables in dataset to drop. ;
    %put Execution terminating forthwith. ;
    %put \ / ;
    %goto L9999 ;
%end ;

/* put global macro names into global symbol table for later retrieval */
%do I = 1 %to &N_NUM ;
    %global NUM&I ;
%end ;
%do I = 1 %to &N_CHAR ;
    %global CHAR&I ;
%end ;

/* create macro vars containing variable names                            */
/* efficiency note: could compute n_char, n_num here, but must declare    */
/* macro names to be global b4 stuffing them                              */
/* note: if no char vars in data, do not create macro vars                */
proc sql noprint ;
    %if &N_CHAR > 0 %then %str( select name into :CHAR1 - :CHAR&N_CHAR from _cntnts_ where type = 2 ; ) ;
    %if &N_NUM  > 0 %then %str( select name into :NUM1 - :NUM&N_NUM from _cntnts_ where type = 1 ; ) ;
quit ;

/* put MAXIMUM values of the variables into macro variables */
%if &N_CHAR > 1 %then %let N_CHAR_1 = %eval( &N_CHAR - 1 ) ;
%if &N_NUM  > 1 %then %let N_NUM_1  = %eval( &N_NUM  - 1 ) ;

proc sql ;
    select
    %if &N_NUM > 1 %then %do ;
        %do I = 1 %to &N_NUM_1 ;
            max( &&NUM&I ),
        %end ;
    %end ;
    %if &N_NUM > 0 %then %do ;
        max( &&NUM&N_NUM )
    %end ;
    %if &N_CHAR > 0 and &N_NUM > 0 %then %do ;
        ,
    %end ;
    %if &N_CHAR > 1 %then %do ;
        %do I = 1 %to &N_CHAR_1 ;
            max( &&CHAR&I ),
        %end ;
    %end ;
    %if &N_CHAR > 0 %then %do ;
        max( &&CHAR&N_CHAR )
    %end ;
    into
    %if &N_NUM > 1 %then %do ;
        %do I = 1 %to &N_NUM_1 ;
            :NUMMAX&I,
        %end ;
    %end ;
    %if &N_NUM > 0 %then %do ;
        :NUMMAX&N_NUM
    %end ;
    %if &N_CHAR > 0 and &N_NUM > 0 %then %do ;
        ,
    %end ;
    %if &N_CHAR > 1 %then %do ;
        %do I = 1 %to &N_CHAR_1 ;
            :CHARMAX&I,
        %end ;
    %end ;
    %if &N_CHAR > 0 %then %do ;
        :CHARMAX&N_CHAR
    %end ;
    from &DSNIN ;
quit ;

/* initialize DROP_NUM, DROP_CHAR global macro vars */
%let DROP_NUM  = ;
%let DROP_CHAR = ;

%if &N_CHAR > 0 %then %do ;
    %do I = 1 %to &N_CHAR ;
        %if &&CHARMAX&I = %then %do ;
            %let DROP_CHAR = &DROP_CHAR %qtrim( &&CHAR&I ) ;
        %end ;
    %end ;
%end ;

%if &N_NUM > 0 %then %do ;
    %do I = 1 %to &N_NUM ;
        %if &&NUMMAX&I = . %then %do ;
            %let DROP_NUM = &DROP_NUM %qtrim( &&NUM&I ) ;
        %end ;
    %end ;
%end ;

/* apply DROP_* to incoming data, create output dataset */
data &DSNOUT ;
    %if &DROP_CHAR ^= %then %str( DROP &DROP_CHAR ; ) ; /* drop char variables that have only missing values */
    %if &DROP_NUM  ^= %then %str( DROP &DROP_NUM ; ) ;  /* drop num variables that have only missing values  */
    set DSNSQUEEZED ;
run ;

%L9999:
%mend SQ_DROPMISS ;


More information

Beginning Tutorials. bt009 A TUTORIAL ON THE SAS MACRO LANGUAGE John J. Cohen AstraZeneca LP

Beginning Tutorials. bt009 A TUTORIAL ON THE SAS MACRO LANGUAGE John J. Cohen AstraZeneca LP bt009 A TUTORIAL ON THE SAS MACRO LANGUAGE John J. Cohen AstraZeneca LP Abstract The SAS Macro language is another language that rests on top of regular SAS code. If used properly, it can make programming

More information

Paper 70-27 An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois

Paper 70-27 An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois Paper 70-27 An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois Abstract This paper introduces SAS users with at least a basic understanding of SAS data

More information

Coders' Corner. Paper 81-26

Coders' Corner. Paper 81-26 Paper 81-26 Automating the Process of Listing the Most Frequent Values of Thousands of Variables in Large Datasets Haiping Luo, Dept. of Veterans Affairs, Washington, DC Philip Friend, Dept. of Agriculture,

More information

A Faster Index for sorted SAS Datasets

A Faster Index for sorted SAS Datasets A Faster Index for sorted SAS Datasets Mark Keintz Wharton Research Data Services, Philadelphia PA ABSTRACT In a NESUG 2007 paper with Shuguang Zhang, I demonstrated a condensed index which provided significant

More information

Using Macros to Automate SAS Processing Kari Richardson, SAS Institute, Cary, NC Eric Rossland, SAS Institute, Dallas, TX

Using Macros to Automate SAS Processing Kari Richardson, SAS Institute, Cary, NC Eric Rossland, SAS Institute, Dallas, TX Paper 126-29 Using Macros to Automate SAS Processing Kari Richardson, SAS Institute, Cary, NC Eric Rossland, SAS Institute, Dallas, TX ABSTRACT This hands-on workshop shows how to use the SAS Macro Facility

More information

Remove Voided Claims for Insurance Data Qiling Shi

Remove Voided Claims for Insurance Data Qiling Shi Remove Voided Claims for Insurance Data Qiling Shi ABSTRACT The purpose of this study is to remove voided claims for insurance claim data using SAS. Suppose that for these voided claims, we don t have

More information

Storing and Using a List of Values in a Macro Variable

Storing and Using a List of Values in a Macro Variable Storing and Using a List of Values in a Macro Variable Arthur L. Carpenter California Occidental Consultants, Oceanside, California ABSTRACT When using the macro language it is not at all unusual to need

More information

Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board Tales from the Help Desk 3: More Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board INTRODUCTION In 20 years as a SAS consultant at the Federal Reserve Board, I have seen SAS users make

More information

That Mysterious Colon (:) Haiping Luo, Dept. of Veterans Affairs, Washington, DC

That Mysterious Colon (:) Haiping Luo, Dept. of Veterans Affairs, Washington, DC Paper 73-26 That Mysterious Colon (:) Haiping Luo, Dept. of Veterans Affairs, Washington, DC ABSTRACT The colon (:) plays certain roles in SAS coding. Its usage, however, is not well documented nor is

More information

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc

A single register, called the accumulator, stores the. operand before the operation, and stores the result. Add y # add y from memory to the acc Other architectures Example. Accumulator-based machines A single register, called the accumulator, stores the operand before the operation, and stores the result after the operation. Load x # into acc

More information

Managing Tables in Microsoft SQL Server using SAS

Managing Tables in Microsoft SQL Server using SAS Managing Tables in Microsoft SQL Server using SAS Jason Chen, Kaiser Permanente, San Diego, CA Jon Javines, Kaiser Permanente, San Diego, CA Alan L Schepps, M.S., Kaiser Permanente, San Diego, CA Yuexin

More information

Advanced Tutorials. Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD

Advanced Tutorials. Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD Numeric Data In SAS : Guidelines for Storage and Display Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD ABSTRACT Understanding how SAS stores and displays numeric data is essential

More information

Nine Steps to Get Started using SAS Macros

Nine Steps to Get Started using SAS Macros Paper 56-28 Nine Steps to Get Started using SAS Macros Jane Stroupe, SAS Institute, Chicago, IL ABSTRACT Have you ever heard your coworkers rave about macros? If so, you've probably wondered what all the

More information

The Power of CALL SYMPUT DATA Step Interface by Examples Yunchao (Susan) Tian, Social & Scientific Systems, Inc., Silver Spring, MD

The Power of CALL SYMPUT DATA Step Interface by Examples Yunchao (Susan) Tian, Social & Scientific Systems, Inc., Silver Spring, MD Paper 052-29 The Power of CALL SYMPUT DATA Step Interface by Examples Yunchao (Susan) Tian, Social & Scientific Systems, Inc., Silver Spring, MD ABSTRACT AND INTRODUCTION CALL SYMPUT is a SAS language

More information

Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation

Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation Abstract This paper discusses methods of joining SAS data sets. The different methods and the reasons for choosing a particular

More information

TECHNICAL UNIVERSITY OF CRETE DATA STRUCTURES FILE STRUCTURES

TECHNICAL UNIVERSITY OF CRETE DATA STRUCTURES FILE STRUCTURES TECHNICAL UNIVERSITY OF CRETE DEPT OF ELECTRONIC AND COMPUTER ENGINEERING DATA STRUCTURES AND FILE STRUCTURES Euripides G.M. Petrakis http://www.intelligence.tuc.gr/~petrakis Chania, 2007 E.G.M. Petrakis

More information

The entire SAS code for the %CHK_MISSING macro is in the Appendix. The full macro specification is listed as follows: %chk_missing(indsn=, outdsn= );

The entire SAS code for the %CHK_MISSING macro is in the Appendix. The full macro specification is listed as follows: %chk_missing(indsn=, outdsn= ); Macro Tabulating Missing Values, Leveraging SAS PROC CONTENTS Adam Chow, Health Economics Resource Center (HERC) VA Palo Alto Health Care System Department of Veterans Affairs (Menlo Park, CA) Abstract

More information

Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC

Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC Using SAS With a SQL Server Database M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC ABSTRACT Many operations now store data in relational databases. You may want to use SAS

More information

Paper CC-08 %RESTRUCT - SAS macro with Proc Univariate Milorad Stojanovic, RTI International, North Carolina

Paper CC-08 %RESTRUCT - SAS macro with Proc Univariate Milorad Stojanovic, RTI International, North Carolina Paper CC-08 %RESTRUCT - SAS macro with Proc Univariate Milorad Stojanovic, RTI International, North Carolina ABSTRACT Proc Univariate has a long history as a part of the SAS Base software package. It is

More information

Common Errors in C. David Chisnall. February 15, 2011

Common Errors in C. David Chisnall. February 15, 2011 Common Errors in C David Chisnall February 15, 2011 The C Preprocessor Runs before parsing Allows some metaprogramming Preprocessor Macros Are Not Functions The preprocessor performs token substitution

More information

Preparing your data for analysis using SAS. Landon Sego 24 April 2003 Department of Statistics UW-Madison

Preparing your data for analysis using SAS. Landon Sego 24 April 2003 Department of Statistics UW-Madison Preparing your data for analysis using SAS Landon Sego 24 April 2003 Department of Statistics UW-Madison Assumptions That you have used SAS at least a few times. It doesn t matter whether you run SAS in

More information

Chapter 2: Problem Solving Using C++

Chapter 2: Problem Solving Using C++ Chapter 2: Problem Solving Using C++ 1 Objectives In this chapter, you will learn about: Modular programs Programming style Data types Arithmetic operations Variables and declaration statements Common

More information

AN INTRODUCTION TO MACRO VARIABLES AND MACRO PROGRAMS Mike S. Zdeb, New York State Department of Health

AN INTRODUCTION TO MACRO VARIABLES AND MACRO PROGRAMS Mike S. Zdeb, New York State Department of Health AN INTRODUCTION TO MACRO VARIABLES AND MACRO PROGRAMS Mike S. Zdeb, New York State Department of Health INTRODUCTION There are a number of SAS tools that you may never have to use. Why? The main reason

More information

z = x + y * z / 4 % 2-1

z = x + y * z / 4 % 2-1 1.Which of the following statements should be used to obtain a remainder after dividing 3.14 by 2.1? A. rem = 3.14 % 2.1; B. rem = modf(3.14, 2.1); C. rem = fmod(3.14, 2.1); D. Remainder cannot be obtain

More information

Automating SAS Macros: Run SAS Code when the Data is Available and a Target Date Reached.

Automating SAS Macros: Run SAS Code when the Data is Available and a Target Date Reached. Automating SAS Macros: Run SAS Code when the Data is Available and a Target Date Reached. Nitin Gupta, Tailwind Associates, Schenectady, NY ABSTRACT This paper describes a method to run discreet macro(s)

More information

Subsetting Observations from Large SAS Data Sets

Subsetting Observations from Large SAS Data Sets Subsetting Observations from Large SAS Data Sets Christopher J. Bost, MDRC, New York, NY ABSTRACT This paper reviews four techniques to subset observations from large SAS data sets: MERGE, PROC SQL, user-defined

More information

The SAS Data step/macro Interface

The SAS Data step/macro Interface Paper TS09 The SAS Data step/macro Interface Lawrence Heaton-Wright, Quintiles, Bracknell, Berkshire, UK ABSTRACT The SAS macro facility is an extremely useful part of the SAS System. However, macro variables

More information

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C

Embedded Systems. Review of ANSI C Topics. A Review of ANSI C and Considerations for Embedded C Programming. Basic features of C Embedded Systems A Review of ANSI C and Considerations for Embedded C Programming Dr. Jeff Jackson Lecture 2-1 Review of ANSI C Topics Basic features of C C fundamentals Basic data types Expressions Selection

More information

Search and Replace in SAS Data Sets thru GUI

Search and Replace in SAS Data Sets thru GUI Search and Replace in SAS Data Sets thru GUI Edmond Cheng, Bureau of Labor Statistics, Washington, DC ABSTRACT In managing data with SAS /BASE software, performing a search and replace is not a straight

More information

Writing cleaner and more powerful SAS code using macros. Patrick Breheny

Writing cleaner and more powerful SAS code using macros. Patrick Breheny Writing cleaner and more powerful SAS code using macros Patrick Breheny Why Use Macros? Macros automatically generate SAS code Macros allow you to make more dynamic, complex, and generalizable SAS programs

More information

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T)

Bachelors of Computer Application Programming Principle & Algorithm (BCA-S102T) Unit- I Introduction to c Language: C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating

More information

Project Request and Tracking Using SAS/IntrNet Software Steven Beakley, LabOne, Inc., Lenexa, Kansas

Project Request and Tracking Using SAS/IntrNet Software Steven Beakley, LabOne, Inc., Lenexa, Kansas Paper 197 Project Request and Tracking Using SAS/IntrNet Software Steven Beakley, LabOne, Inc., Lenexa, Kansas ABSTRACT The following paper describes a project request and tracking system that has been

More information

Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports

Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports Using Pharmacovigilance Reporting System to Generate Ad-hoc Reports Jeff Cai, Amylin Pharmaceuticals, Inc., San Diego, CA Jay Zhou, Amylin Pharmaceuticals, Inc., San Diego, CA ABSTRACT To supplement Oracle

More information

Informatica e Sistemi in Tempo Reale

Informatica e Sistemi in Tempo Reale Informatica e Sistemi in Tempo Reale Introduction to C programming Giuseppe Lipari http://retis.sssup.it/~lipari Scuola Superiore Sant Anna Pisa October 25, 2010 G. Lipari (Scuola Superiore Sant Anna)

More information

Oracle Database 12c R2: SQL and PL/SQL Fundamentals Ed 2 NEW

Oracle Database 12c R2: SQL and PL/SQL Fundamentals Ed 2 NEW Oracle University Contact Us: 0800 891 6502 Oracle Database 12c R2: SQL and PL/SQL Fundamentals Ed 2 NEW Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training

More information

Eliminating Tedium by Building Applications that Use SQL Generated SAS Code Segments

Eliminating Tedium by Building Applications that Use SQL Generated SAS Code Segments Eliminating Tedium by Building Applications that Use SQL Generated SAS Code Segments David A. Mabey, Reader s Digest Association Inc., Pleasantville, NY ABSTRACT When SAS applications are driven by data-generated

More information

The programming language C. sws1 1

The programming language C. sws1 1 The programming language C sws1 1 The programming language C invented by Dennis Ritchie in early 1970s who used it to write the first Hello World program C was used to write UNIX Standardised as K&C (Kernighan

More information

How well can you speak SAS or is what you see what you get?

How well can you speak SAS or is what you see what you get? PhUSE 06 Paper S03 How well can you speak SAS or is what you see what you get? Yuliia Bahatska, inventiv Health Clinical, Berlin, Germany ABSTRACT We all spend a lot of effort trying to keep up with new

More information

Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA

Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA Indexing and Compressing SAS Data Sets: How, Why, and Why Not Andrew H. Karp Sierra Information Services, Inc. San Francisco, California USA Many users of SAS System software, especially those working

More information

A Method for Cleaning Clinical Trial Analysis Data Sets

A Method for Cleaning Clinical Trial Analysis Data Sets A Method for Cleaning Clinical Trial Analysis Data Sets Carol R. Vaughn, Bridgewater Crossings, NJ ABSTRACT This paper presents a method for using SAS software to search SAS programs in selected directories

More information

SAS Functions by Example Ron Cody

SAS Functions by Example Ron Cody Examples from SAS Functions by Example Ron Cody Herman Lo Technical Analyst, RBC Capital Markets Agenda Book Structure Examples from the Book Character Functions (CATS, CATX) Date and Time Functions (INTCK,

More information

Automated Data Converting of Character into Numeric Fields

Automated Data Converting of Character into Numeric Fields Paper CC-030 Automated Data Converting of Character into Numeric Fields Mila Chiflikyan, RTI International, Research Triangle Park, NC Nick L Kinsey, RTI International, Research Triangle Park, NC Ruben

More information

Managing very large EXCEL files using the XLS engine John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT

Managing very large EXCEL files using the XLS engine John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT Paper AD01 Managing very large EXCEL files using the XLS engine John H. Adams, Boehringer Ingelheim Pharmaceutical, Inc., Ridgefield, CT ABSTRACT The use of EXCEL spreadsheets is very common in SAS applications,

More information

Integrating Data and Business Rules with a Control Data Set in SAS

Integrating Data and Business Rules with a Control Data Set in SAS Paper 3461-2015 Integrating Data and Business Rules with a Data Set in SAS Edmond Cheng, CACI International Inc. ABSTRACT In SAS software development, data specifications and process requirements can be

More information

Applications Development

Applications Development Paper 45-25 Building an Audit and Tracking System Using SAS/AF and SCL Hung X Phan, U.S. Census Bureau ABSTRACT This paper describes how to build an audit and tracking system using SAS/AF and SCL. There

More information

Notes on Shell Programming

Notes on Shell Programming Bourne shell (sh): #!/bin/sh Prompt: $ ($PS1) Variables Notes on Shell Programming Individual line arguments $0, $1,, $9 All line arguments $*, $@ Command line argument count $# Shell process ID $$ Last

More information

198:211 Computer Architecture

198:211 Computer Architecture 198:211 Computer Architecture Topics: Lecture 8 (W5) Fall 2012 Data representation 2.1 and 2.2 of the book Floating point 2.4 of the book 1 Computer Architecture What do computers do? Manipulate stored

More information

Oracle Database: SQL and PL/SQL Fundamentals NEW

Oracle Database: SQL and PL/SQL Fundamentals NEW Oracle University Contact Us: + 38516306373 Oracle Database: SQL and PL/SQL Fundamentals NEW Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training delivers the

More information

Oracle Database: SQL and PL/SQL Fundamentals

Oracle Database: SQL and PL/SQL Fundamentals Oracle University Contact Us: +966 12 739 894 Oracle Database: SQL and PL/SQL Fundamentals Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals training is designed to

More information

Final Exam Review. CS 1428 Fall Jill Seaman. Final Exam

Final Exam Review. CS 1428 Fall Jill Seaman. Final Exam Final Exam Review CS 1428 Fall 2011 Jill Seaman 1 Final Exam Friday, December 9, 11:00am to 1:30pm Derr 241 (here) Closed book, closed notes, clean desk Comprehensive (covers entire course) 25% of your

More information

TAKING ADVANTAGE OF THE PROC SQL PASS-THROUGH FACILITY

TAKING ADVANTAGE OF THE PROC SQL PASS-THROUGH FACILITY 328 Host Systems and Environments TAKING ADVANTAGE OF THE PROC SQL PASS-THROUGH FACILITY Timothy Pruitt Corporate Cost Management Inc. INTRODUCTION The SQL procedure in SAS Release 6.07 provides a useful

More information

STREAMLINING COMMAND-LINE DMS COMMANDS WITH A MACRO

STREAMLINING COMMAND-LINE DMS COMMANDS WITH A MACRO Assigning a User-defined Macro to a Function Key Mary Rosenbloom, Edwards Lifesciences, LLC, Irvine, California Kirk Paul Lafler, Software Intelligence Corporation, Spring Valley, California ABSTRACT Are

More information

Array: Construction and Usage of Arrays of Macro Variables Ronald Fehd Centers for Disease Control, and Prevention, Atlanta GA USA

Array: Construction and Usage of Arrays of Macro Variables Ronald Fehd Centers for Disease Control, and Prevention, Atlanta GA USA ABSTRACT Paper 070-29 Array: Construction and Usage of Arrays of Macro Variables Ronald Fehd Centers for Disease Control, and Prevention, Atlanta GA USA The SAS R software data step statement array V (3)

More information

From The Little SAS Book, Fifth Edition. Full book available for purchase here.

From The Little SAS Book, Fifth Edition. Full book available for purchase here. From The Little SAS Book, Fifth Edition. Full book available for purchase here. Acknowledgments ix Introducing SAS Software About This Book xi What s New xiv x Chapter 1 Getting Started Using SAS Software

More information

Chapter 6 Working with SAS Data Sets

Chapter 6 Working with SAS Data Sets Chapter 6 Working with SAS Data Sets Chapter Table of Contents OVERVIEW... 79 OPENING A SAS DATA SET... 80 MAKING A SAS DATA SET CURRENT... 81 DISPLAYING SAS DATA SET INFORMATION... 82 REFERRING TO A SAS

More information

Finally An Easy Way To Compare Two SAS Files! Doug Zirbel, Pinnacle Solutions Inc., Indianapolis, IN

Finally An Easy Way To Compare Two SAS Files! Doug Zirbel, Pinnacle Solutions Inc., Indianapolis, IN Finally An Easy Way To Compare Two SAS Files! Doug Zirbel, Pinnacle Solutions Inc., Indianapolis, IN ABSTRACT This SAS macro compares two SAS datasets. It produces a differences report which is easy to

More information

Radix Number Systems. Number Systems. Number Systems 4/26/2010. basic idea of a radix number system how do we count:

Radix Number Systems. Number Systems. Number Systems 4/26/2010. basic idea of a radix number system how do we count: Number Systems binary, octal, and hexadecimal numbers why used conversions, including to/from decimal negative binary numbers floating point numbers character codes basic idea of a radix number system

More information

Oracle Database: SQL and PL/SQL Fundamentals

Oracle Database: SQL and PL/SQL Fundamentals Oracle University Contact Us: 1.800.529.0165 Oracle Database: SQL and PL/SQL Fundamentals Duration: 5 Days What you will learn This course is designed to deliver the fundamentals of SQL and PL/SQL along

More information

NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT

NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT 157 CHAPTER 8 Enhancements for SAS Users under Windows NT 157 NT Event Log 157 Sending Messages to the NT Event Log using SAS Code 158 NT Performance Monitor 159 Examples of Monitoring SAS Performance

More information

Overview. NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT

Overview. NT Event Log. CHAPTER 8 Enhancements for SAS Users under Windows NT 177 CHAPTER 8 Enhancements for SAS Users under Windows NT Overview 177 NT Event Log 177 Sending Messages to the NT Event Log Using a User-Written Function 178 Examples of Using the User-Written Function

More information

Financial Data Access with SQL, Excel & VBA

Financial Data Access with SQL, Excel & VBA Computational Finance and Risk Management Financial Data Access with SQL, Excel & VBA Guy Yollin Instructor, Applied Mathematics University of Washington Guy Yollin (Copyright 2012) Data Access with SQL,

More information

Transferring vs. Transporting Between SAS Operating Environments Mimi Lou, Medical College of Georgia, Augusta, GA

Transferring vs. Transporting Between SAS Operating Environments Mimi Lou, Medical College of Georgia, Augusta, GA CC13 Transferring vs. Transporting Between SAS Operating Environments Mimi Lou, Medical College of Georgia, Augusta, GA ABSTRACT Prior to SAS version 8, permanent SAS data sets cannot be moved directly

More information

Dept. of CSE, IIT KGP

Dept. of CSE, IIT KGP Programming in C: Basics CS10001: Programming & Data Structures Pallab Dasgupta Professor, Dept. of Computer Sc. & Engg., Indian Institute of Technology Kharagpur Types of variable We must declare the

More information

Tips, Tricks, and Techniques from the Experts

Tips, Tricks, and Techniques from the Experts Tips, Tricks, and Techniques from the Experts Presented by Katie Ronk 2997 Yarmouth Greenway Drive, Madison, WI 53711 Phone: (608) 278-9964 Web: www.sys-seminar.com Systems Seminar Consultants, Inc www.sys-seminar.com

More information

LITEN INTRODUKTION TILL DS2

LITEN INTRODUKTION TILL DS2 LITEN INTRODUKTION TILL DS2 Onsdag 19 februari 2014 Lena Norberg 1 22 November 2012 VAD ÄR DS2? (DATA STEP 2) DS2 is a new SAS programming language. is included with Base SAS has syntax that is similar

More information

10/24/16. Journey of Byte. BBM 371 Data Management. Disk Space Management. Buffer Management. All Data Pages must be in memory in order to be accessed

10/24/16. Journey of Byte. BBM 371 Data Management. Disk Space Management. Buffer Management. All Data Pages must be in memory in order to be accessed Journey of Byte BBM 371 Management Lecture 4: Basic Concepts of DBMS 25.10.2016 Application byte/ record File management page, page num Buffer management physical adr. block Disk management Request a record/byte

More information

How to Use the Data Step Debugger

How to Use the Data Step Debugger How to Use the Data Step Debugger S. David Riba, JADE Tech, Inc., Clearwater, FL ABSTRACT All SAS programmers, no matter if they are beginners or experienced gurus, share the need to debug their programs.

More information

A Technique for Storing and Manipulating Incomplete Dates in a Single SAS Date Value

A Technique for Storing and Manipulating Incomplete Dates in a Single SAS Date Value A Technique for Storing and Manipulating Incomplete Dates in a Single SAS Date Value John Ingersoll Introduction: This paper presents a technique for storing incomplete date values in a single variable

More information

Applications Development ABSTRACT PROGRAM DESIGN INTRODUCTION SAS FEATURES USED

Applications Development ABSTRACT PROGRAM DESIGN INTRODUCTION SAS FEATURES USED Checking and Tracking SAS Programs Using SAS Software Keith M. Gregg, Ph.D., SCIREX Corporation, Chicago, IL Yefim Gershteyn, Ph.D., SCIREX Corporation, Chicago, IL ABSTRACT Various checks on consistency

More information

Innovative Techniques and Tools to Detect Data Quality Problems

Innovative Techniques and Tools to Detect Data Quality Problems Paper DM05 Innovative Techniques and Tools to Detect Data Quality Problems Hong Qi and Allan Glaser Merck & Co., Inc., Upper Gwynnedd, PA ABSTRACT High quality data are essential for accurate and meaningful

More information

Alternative Methods for Sorting Large Files without leaving a Big Disk Space Footprint

Alternative Methods for Sorting Large Files without leaving a Big Disk Space Footprint Alternative Methods for Sorting Large Files without leaving a Big Disk Space Footprint Rita Volya, Harvard Medical School, Boston, MA ABSTRACT Working with very large data is not only a question of efficiency

More information

PL/SQL MOCK TEST PL/SQL MOCK TEST I

PL/SQL MOCK TEST PL/SQL MOCK TEST I http://www.tutorialspoint.com PL/SQL MOCK TEST Copyright tutorialspoint.com This section presents you various set of Mock Tests related to PL/SQL. You can download these sample mock tests at your local

More information

SAS and UNIX: Techniques for Developing Your Toolbox Joe Novotny, GlaxoSmithKline Pharmaceuticals, Inc., Collegeville, PA

SAS and UNIX: Techniques for Developing Your Toolbox Joe Novotny, GlaxoSmithKline Pharmaceuticals, Inc., Collegeville, PA Paper AA600 SAS and UNIX: Techniques for Developing Your Toolbox Joe Novotny, GlaxoSmithKline Pharmaceuticals, Inc., Collegeville, PA ABSTRACT How many times have you had to write and run short SAS programs

More information

Introduction to Market Basket Analysis Bill Qualls, First Analytics, Raleigh, NC

Introduction to Market Basket Analysis Bill Qualls, First Analytics, Raleigh, NC Paper AA07-2013 Introduction to Market Basket Analysis Bill Qualls, First Analytics, Raleigh, NC ABSTRACT Market Basket Analysis (MBA) is a data mining technique which is widely used in the consumer package

More information

Stacks. Linear data structures

Stacks. Linear data structures Stacks Linear data structures Collection of components that can be arranged as a straight line Data structure grows or shrinks as we add or remove objects ADTs provide an abstract layer for various operations

More information

Attribute Data Input and Management

Attribute Data Input and Management Attribute Data Input and Management Attribute data describe the characteristics of the map feature. Attribute data are stored in tables Each row of a table represents a map feature. Each column represents

More information

strsep exercises Introduction C strings Arrays of char

strsep exercises Introduction C strings Arrays of char strsep exercises Introduction The standard library function strsep enables a C programmer to parse or decompose a string into substrings, each terminated by a specified character. The goals of this document

More information

Keywords are identifiers having predefined meanings in C programming language. The list of keywords used in standard C are : unsigned void

Keywords are identifiers having predefined meanings in C programming language. The list of keywords used in standard C are : unsigned void 1. Explain C tokens Tokens are basic building blocks of a C program. A token is the smallest element of a C program that is meaningful to the compiler. The C compiler recognizes the following kinds of

More information

Tips for Constructing a Data Warehouse Part 2 Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA

Tips for Constructing a Data Warehouse Part 2 Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA Tips for Constructing a Data Warehouse Part 2 Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA ABSTRACT Ah, yes, data warehousing. The subject of much discussion and excitement. Within the

More information

1. Which of the following Boolean operations produces the output 1 for the fewest number of input patterns?

1. Which of the following Boolean operations produces the output 1 for the fewest number of input patterns? Test Bank Chapter One (Data Representation) Multiple Choice Questions 1. Which of the following Boolean operations produces the output 1 for the fewest number of input patterns? ANSWER: A A. AND B. OR

More information

MS ACCESS DATABASE DATA TYPES

MS ACCESS DATABASE DATA TYPES MS ACCESS DATABASE DATA TYPES Data Type Use For Size Text Memo Number Text or combinations of text and numbers, such as addresses. Also numbers that do not require calculations, such as phone numbers,

More information

Oracle SQL. Course Summary. Duration. Objectives

Oracle SQL. Course Summary. Duration. Objectives Oracle SQL Course Summary Identify the major structural components of the Oracle Database 11g Create reports of aggregated data Write SELECT statements that include queries Retrieve row and column data

More information