
SAS INTERVIEW QUESTIONS

1. What SAS statements would you code to read an external raw data file into a DATA step?
Ans: The INFILE and INPUT statements read an external raw data file into a DATA step.

2. How do you read in the variables that you need?
Ans: List only the variables you need on the INPUT statement. When reading a SAS data set rather than raw data, the KEEP= data set option does the same job.

3. Are you familiar with special input delimiters? How are they used?
Ans: Yes, SAS has the DLM= and DSD options, both coded on the INFILE statement. DLM= names the character that separates data values, such as a comma or a space. You may choose any delimiter you wish, and you can list more than one character, for example DLM='XX'. The DSD option treats two consecutive delimiters as containing a missing value.

4. If reading a variable-length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?
Ans: Use the MISSOVER option on the INFILE statement. TRUNCOVER is a related option that also keeps a value cut short by the end of the record.

5. What is the difference between an informat and a format? Name three informats or formats.
Ans: An informat is an instruction that SAS uses to read data values into a variable. A format is an instruction that SAS uses to write data values.
The three kinds of informat are: A) Date informat B) Character informat C) Numeric informat
The three kinds of format are: A) Date format B) Character format C) Numeric format

6. Name and describe three SAS functions that you have used, if any.
Ans:
A) SUM function: adds its arguments together, ignoring any missing values.
    Var = SUM(var1, var2, ..., varn);
    Var1 = SUM(1, ., 3);    gives 4
B) MEAN function: returns the arithmetic mean (average), ignoring missing values.
    Var = MEAN(var1, var2, var3, ..., varn);
C) SUBSTR function: extracts a portion of a character value, starting at a given position and running for a given number of characters.
    Var = SUBSTR(var, start<, number-of-characters>);
    Var1 = SUBSTR('ASHOK', 1, 3);
In the example above, SUBSTR takes the string ASHOK, starts at position 1, takes 3 characters, and stores ASH in Var1.

7. How would you code the criteria to restrict the output to be produced?
Ans: With ODS statements. ods output close; closes the ODS OUTPUT destination, and ODS SELECT or ODS EXCLUDE restricts which output objects are produced.

8. What is the purpose of the trailing @? The @@? How would you use them?
Ans: The trailing @ is a line-hold specifier. Coding a trailing @ on the INPUT statement lets you read part of a raw data line, test it, and then decide how to read additional data from the same record. The single trailing @ tells SAS to hold the line for the next INPUT statement in the same iteration of the DATA step. The double trailing @@ tells SAS to hold the line more strongly.
NOTE: An INPUT statement ending with @@ releases the current raw data line only when no data values are left to be read from it. The @@ therefore holds the input record even across multiple iterations of the DATA step.

9. Under what circumstances would you code a SELECT construct instead of IF statements?
Ans: Especially when you are recoding a variable into a large number of categories.

10. What statements do you code to tell SAS that it is to write to an external file?
Ans:
    filename fileref 'path';
    file fileref;
    put _all_;    /* writes all the variables */
Or list on the PUT statement only the variables you require.

11. If reading an external file to produce an external file, what is the shortcut to write a record without coding every single variable on the record?
Ans: put _infile_; writes the current input record exactly as it was read. (put _all_; writes every variable in name=value form instead.)

12. If you do not want any SAS output from a DATA step, how would you code the DATA statement to prevent SAS from producing a data set?
Ans: Use DATA _NULL_. The desired output is then a file or report and not a SAS data set.

13. What is the one statement to set the criteria of data that can be coded in any step?
Ans: The WHERE statement, which selects observations in both DATA and PROC steps. (The OPTIONS statement can also be coded anywhere, but it sets system options rather than criteria on the data.)

14. Have you ever linked SAS code? If so, describe the link and any required statements used to either process the code or the step itself.
Ans: The LINK statement tells SAS to jump immediately to the statement label named in the LINK statement and to continue executing statements from that point until a RETURN statement is executed. The RETURN statement sends program control to the statement immediately following the LINK statement.

Note: The LINK statement and its destination must be in the same DATA step. The destination is identified by the statement label named in the LINK statement.

15. How would you include common or reusable code to be processed along with your statements?
Ans: By using %INCLUDE.

16. When looking for data contained in a character string of 150 bytes, which function is the best to locate that data: SCAN, INDEX or INDEXC?
Ans: INDEX returns the position of a substring and INDEXC returns the position of the first occurrence of any of the listed characters, so these two locate data. SCAN extracts the nth word from the string, so it is the choice when the data can be picked out by word position.

17. If you have a data set that contains 100 variables, but you need only five of those, what is the code to force SAS to use only those variables?
Ans: Use the KEEP= data set option.

18. Code a PROC SORT on a data set containing state, district and country as the primary variables, along with several numeric variables.
Ans:
    proc sort data=data-set-name;
        by state district country;
    run;

19. How would you delete duplicate observations?
Ans: There are three ways to delete duplicate observations in a data set.
1) PROC SORT with the NODUPS option (or NODUPKEY to keep one observation per BY value):
    proc sort data=sas-data-set nodups;
        by var;
    run;
2) PROC SQL with SELECT DISTINCT:
    proc sql;
        create table new_sas_data_set as
        select distinct * from old_sas_data_set;
    quit;
3) A DATA step with FIRST. processing (TEMP must already be sorted by GROUP):
    data clean;
        set temp;
        by group;
        if first.group;
    run;
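The three approaches to question 19 can be tried side by side. The following is a minimal sketch, not a definitive implementation: the data set TEMP, its variables GROUP and VALUE, and all output data set names are hypothetical and chosen only for illustration.

```sas
/* Hypothetical sample data with repeated GROUP values */
data temp;
    input group $ value;
    datalines;
A 1
A 1
B 2
C 3
C 4
;
run;

/* 1) PROC SORT: NODUPKEY keeps the first observation for each BY value */
proc sort data=temp out=sorted nodupkey;
    by group;
run;

/* 2) PROC SQL: DISTINCT drops rows that are identical in every column */
proc sql;
    create table distinct_rows as
    select distinct * from temp;
quit;

/* 3) DATA step: keep the first observation of each BY group.
      The input must be sorted by GROUP first. */
proc sort data=temp out=temp_sorted;
    by group;
run;

data clean;
    set temp_sorted;
    by group;
    if first.group;
run;
```

With this sample, methods 1 and 3 leave one row per group, whereas method 2 removes only the exact repeat of "A 1". Which behavior counts as "deleting duplicates" depends on whether duplicates are defined by the key or by the whole record.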

20. How would you code a merge that will keep only the observations that have matches from both sets?
Ans: By using the IN= data set option on the MERGE statement.
    DATA NEW;
        MERGE ONE_TEMP (IN=ONE) TWO_TEMP (IN=TWO);
        BY NAME;
        IF ONE=1 AND TWO=1;
    RUN;

21. What is the Program Data Vector (PDV)? What are its functions?
Ans: The Program Data Vector is the temporary holding area in memory where SAS builds each observation. It explains, for example, why the WHERE statement may be more efficient than the subsetting IF (especially if you are taking a very small subset from a large file): WHERE checks the condition before the observation is brought into the PDV, whereas the subsetting IF checks it after the observation has been loaded to see whether it is to be kept.

22. Does SAS translate (compile) or does it interpret? Explain.
Ans: When you submit a DATA step for execution, SAS checks the syntax of the SAS statements and compiles them, that is, automatically translates the statements into machine code. In this phase, SAS identifies the type and length of each new variable, and determines whether a type conversion is necessary for each subsequent reference to a variable.

23. At compile time when a SAS data set is read, what items are created?
Ans: At compile time SAS creates the following:
A) Input buffer (only when raw data is read)
B) Program Data Vector (PDV)
C) Descriptor information

24. Name statements that are recognized at compile time only.
Ans: DROP, KEEP, and similar declarative statements such as RENAME, LABEL, FORMAT and LENGTH.

25. Identify statements whose placement in the DATA step is critical.
Ans: The INPUT statement.

26. Name statements that function at both compile and execution time.

27. Name statements that are execution only.

28. In the flow of DATA step processing, what is the first action in a typical DATA step?
Ans: SAS first performs a syntax check.

29. What is _n_?
Ans: _N_ is an automatic variable created by SAS during DATA step processing. It counts the number of times the DATA step has iterated. It is available only in the DATA step and not in procedures, and it is not written to the output data set.
E.g. to select every third record in a data set, use _N_ as follows:
    data new;
        set old;
        if mod(_n_, 3) = 1;
    run;
Note: If a WHERE clause is used to subset the data, _N_ will not yield the required result, because it then counts only the observations that pass the WHERE condition.

BASE SAS:

30. What is the effect of the OPTIONS statement ERRORS=1?
Ans: ERRORS= sets the maximum number of observations for which SAS prints complete error messages. With ERRORS=1, full error messages are printed for the first observation in error only.

31. What's the difference between VAR A1-A4 and VAR A1--A4?
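Question 31 can be explored with a small experiment. This sketch assumes the first form in the question was meant as the numbered range A1-A4 (the hyphen appears to have been lost); the data set DEMO and the extra variable X are hypothetical.

```sas
/* Variables are created in the order a1, x, a2, a3, a4 */
data demo;
    a1 = 1;
    x  = 9;
    a2 = 2;
    a3 = 3;
    a4 = 4;
run;

/* Numbered range list: the variables named a1, a2, a3, a4 */
proc print data=demo;
    var a1-a4;
run;

/* Name range list: every variable positioned from a1 through a4
   in the data set, which here also includes x */
proc print data=demo;
    var a1--a4;
run;
```

The single hyphen selects by name suffix, while the double hyphen selects by position in the data set, so the second report carries the extra column X.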

32. What does the SAS log message "numeric values have been converted to character" mean?
Ans: If a character function or operation is applied to numeric values, SAS automatically converts the numeric values to character and writes this note to the log.

33. Why is a STOP statement needed for the POINT= option on a SET statement?
Ans: Because POINT= reads only the specified observations, SAS cannot detect an end-of-file condition as it would if the file were being read sequentially. Because detecting an end-of-file condition terminates a DATA step automatically, failure to substitute another means of terminating the DATA step when you use POINT= can cause the DATA step to go into a continuous loop.
NOTE: You cannot use the POINT= option with any of the following:
- BY statement
- WHERE statement
- WHERE= data set option
- transport format data sets
- sequential data sets (on tape or disk)
- a table from another vendor's relational database management system

34. How do you control the number of observations and/or variables read or written?
Ans: By specifying the OBS= option (with FIRSTOBS= for the starting point) for observations, and the KEEP= or DROP= option for variables.

35. Approximately what date is represented by the SAS date value of 730?
Ans: SAS dates count days from 1 January 1960, so 730 is about two years later: 31 December 1961, the day before 1 January 1962 (1960 was a leap year).

36. How would you remove a format that has been permanently associated with a variable?
Ans: By using PROC DATASETS with a FORMAT statement that names the variable without a format:
    proc datasets library=somelibrary;
        modify sasdataset;
        format variable;
    run;
    quit;

37. What does the RUN statement do?
Ans: The RUN statement marks the end of the step and tells SAS to execute the preceding statements.

38. Why is SAS considered self-documenting?
Ans: When a SAS data set is created, SAS creates the descriptor portion and the data portion of the data set. The descriptor portion contains details such as when the data set was created, the number of observations, the number of variables, etc. Hence SAS is considered self-documenting.

39. Briefly describe 5 ways to do a table lookup in SAS.
Ans:
1) Simple table lookup by merging (MERGE, including the IN= option, and a subsetting IF statement).
2) Simple table lookup with formats (PROC FORMAT and the PUT function).
3) Lookup on two variables by merging (MERGE, including the IN= option, and a subsetting IF statement).
4) Lookup on two variables with formats (PROC FORMAT and the PUT and INPUT functions).
5) A two-way lookup table (MERGE statement using two variables).

40. What are some good SAS programming practices for processing very large data sets?
Ans: For very large data sets with many variables we can make use of arrays in the SAS system.

41. How would you create a data set with 1 observation and 30 variables from a data set with 30 observations and 1 variable?
Ans: Use PROC TRANSPOSE. It can also be done with SAS arrays.

44. What are _numeric_ and _character_ and what do they do?
Ans: They are special variable lists. _NUMERIC_ refers to all the numeric variables, so a task can be applied to every numeric variable at once, and _CHARACTER_ does the same for all the character variables.

46. What is the order of application for output data set options, input data set options and SAS statements?
Ans: Input data set options first, then SAS statements, and then output data set options.

47. What is the order of evaluation of the comparison operators: + - * / ** ( )?

MISSING VALUES:

56. How many missing values are available? When might you use them?
Ans: SAS has two types of missing value, numeric and character. Character missing is a blank. Numeric missing has 28 forms: the ordinary period (.) plus the special missing values ._ and .A through .Z, which can be used to record different reasons why a value is missing.

57. How do you test for missing values?
Ans: Compare the variable with a missing value (x = . for numeric, c = ' ' for character) or use the MISSING function. The NMISS function counts the missing values among its numeric arguments.

58. How are numeric and character missing values represented internally?
Ans: Numeric missing values are represented as a period (.) and character missing values are represented as a blank.

FUNCTIONS:

59. What is the significance of the OF in X=SUM(OF a1-a4, a6, a9);?

60. What do the PUT and INPUT functions do?
Ans: The PUT function converts a value to a character string by applying a format. (The PUT statement, by contrast, writes text and variable values to the log or a file, which is useful for identifying a logic problem: which piece of code is executed and which is not, and what the current value of a particular variable or of all variables is.)

INPUT function: The traditional use is to reread a character variable with a numeric informat, that is, to perform a character-to-numeric conversion.
The character-to-numeric conversion function:
    INPUT(variable, informat-name)
The INPUT function converts the character variable to numeric:
    Salary = input(EMP_SALARY, dollar7.);

    Character value     Numeric value
    EMP_SALARY          SALARY
    $85,000             85000

The result must be assigned to a new variable name. We cannot reuse the same name, as in EMP_SALARY = input(EMP_SALARY, dollar7.); because the type of an existing variable cannot be changed.
The numeric-to-character conversion function:
    PUT(variable, format-name)
    newphone = put(phone, 7.);

    Numeric value       Character value
    PHONE               NEWPHONE
    6778000             6778000

61. Which date function advances a date, time or date/time value by a given interval?

62. What do the MOD and INT functions do?
Ans: The MOD function returns the remainder of a division. It is very useful if, say, you want to select every third observation from a SAS data set. Example:
    data third;
        set old;
        if mod(_n_, 3) = 1;
    run;
The INT function returns the integer portion of an argument. To truncate a number (drop off the fractional part), you use the INT function.
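The two conversions from question 60 and the functions from question 62 can be checked in one short step. This is a minimal sketch using the values from the tables above; the variable names REMAINDER and WHOLE and the values passed to MOD and INT are hypothetical.

```sas
data _null_;
    /* Character-to-numeric: INPUT reads the string with an informat */
    emp_salary = '$85,000';
    salary = input(emp_salary, dollar7.);

    /* Numeric-to-character: PUT writes the number with a format */
    phone = 6778000;
    newphone = put(phone, 7.);

    /* MOD returns the remainder; INT drops the fractional part */
    remainder = mod(10, 3);
    whole = int(12.75);

    /* Write the results to the log for inspection */
    put salary= newphone= remainder= whole=;
run;
```

Note that both conversions write to a new variable (SALARY, NEWPHONE), in line with the rule that an existing variable cannot change type.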

63. In ARRAY processing, what does the DIM function do?
Ans: DIM is the dimension function. It returns the number of elements in the array (i.e. the number of variables in the list).

64. How would you determine the number of missing or nonmissing values in computations?
Ans: Use the N function for the number of non-missing values and the NMISS function for the number of missing values.

65. What is the difference between X=a+b+c+d; and X=SUM(a, b, c, d);?
Ans: SUM(a, b, c, d) ignores any missing values and computes the sum of the rest. The + operator returns missing if any operand is missing. E.g.
    SUM(1, ., 2, 3) = 6
    X = 1 + . + 2 + 3 = missing

66. There is a field containing a date. It needs to be displayed in the format "ddmonyy" if it's before 1975, "dd mon ccyy" if it's after 1985, and as "disco years" if it's between 1975 and 1985. How would you accomplish this in DATA step code? Using only PROC FORMAT?

67. In the following DATA step, what is needed for 'fraction' to print to the log?
    data _null_;
        X = 1/3;
        if X = .333 then put 'fraction';
    run;
Ans: The stored value of 1/3 is not exactly .333, so the comparison is never true. Round X before comparing, for example if round(X, .001) = .333 then put 'fraction';

68. What is the difference between calculating the mean using the MEAN function and PROC MEANS?
Ans: The MEAN function returns the mean of the non-missing values in its variable list, working across variables within one observation, whereas PROC MEANS summarizes a variable across observations. The way the MEAN function deals with missing values is quite important. If you calculate SCORE by simply adding up all the items and dividing by 50, as follows:
    SCORE = (item1 + item2 + item3 + ... + item50) / 50;
you would be in big trouble if any of the items had missing values. When a SAS statement performs an arithmetic operation on a missing value, the result is always missing.

PROCs:

69. If you were given several SAS data sets you were unfamiliar with, how would you find out the variable names and formats of each data set?
Ans: Use the CONTENTS procedure with _ALL_ for the library to see the variable names and formats of every data set in it. E.g.
    PROC CONTENTS DATA=LIBREF._ALL_;
    RUN;

70. How would you keep SAS from overlaying the SAS data set with its sorted version?
Ans: By writing the sorted data to a new data set, specifying OUT=new-sas-dataset on the PROC SORT statement.

71. In PROC PRINT, can you print only variables that begin with the letter A?
Ans: Yes. To print only the variables whose names begin with A, use a name prefix list on the VAR statement:
    VAR A:;
To print only the observations in which a variable's value begins with A, use a WHERE statement in PROC PRINT:
    WHERE variable-name LIKE 'A%';
or
    WHERE variable-name =: 'A';

72. What are some differences between PROC SUMMARY and PROC MEANS?
Ans:
1) In current SAS releases the two procedures accept the same statements and compute the same statistics. Both produce subgroup statistics either with a BY statement, which requires input previously sorted (use PROC SORT) by the BY variables, or with a CLASS statement, which gives statistics for all subgroups in one run without repeatedly sorting the data set.
2) PROC MEANS prints its results by default. PROC SUMMARY does not produce any printed output by default, so you need the PRINT option, or an OUTPUT statement to create a new data set and PROC PRINT to see the computed statistics.

PROC FREQ:

73. Code the TABLE statement for a single-level (most common) frequency.
Ans: The statement for single-level:
    DATA MAR.FREQTEST;
        SET BAS.AMPERS;
    PROC FREQ DATA=MAR.FREQTEST;
        TABLE AGE;
    RUN;

74. Code the TABLE statement to produce a multi-level frequency.
Ans: The statement for multi-level:
    DATA MAR.FREQTEST;
        SET BAS.AMPERS;
    PROC FREQ DATA=MAR.FREQTEST;
        TABLE AGE * GENDER;
    RUN;

75. Name the option to produce a frequency as line items rather than a table.

76. Produce output from a frequency. Restrict the printing of the table.
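Questions 75 and 76 are left unanswered above; one possible way to approach them is sketched here. It uses the SASHELP.CLASS sample data set that ships with SAS, with its AGE and SEX variables standing in for AGE and GENDER, and the output data set name AGE_COUNTS is hypothetical.

```sas
/* LIST prints one line per combination of values instead of a crosstab */
proc freq data=sashelp.class;
    tables age*sex / list;
run;

/* NOPRINT suppresses the printed table;
   OUT= saves the counts and percentages to a data set */
proc freq data=sashelp.class;
    tables age / noprint out=age_counts;
run;

/* Inspect the saved frequencies */
proc print data=age_counts;
run;
```

LIST, NOPRINT and OUT= are all options of the TABLES statement, so they follow the slash after the table request.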

PROC MEANS:

77. Code a PROC MEANS that shows both summed and averaged output of the data.

78. Code the option that will allow MEANS to include missing numeric data in the report.

79. Code the MEANS to produce output to be used later.

80. Do you use PROC REPORT or PROC TABULATE? Which do you prefer? Explain.

MERGING/UPDATING:

81. What happens in a one-on-one merge? When would you use one?
Ans: A one-to-one merge is a MERGE without a BY statement: SAS combines the first observation of each data set, then the second, and so on, by position. It suits data sets that hold different variables for the same items in the same order. If the data sets share a common identifying variable, a match-merge with a BY statement on that variable is the safer choice.

82. How would you combine 3 or more tables with different structures?

83. What is the problem with merging two data sets that have variables with the same name but different data?
Ans: The value from the second data set will overwrite the value from the first data set.

84. When would you choose to MERGE two data sets together and when would you SET two data sets?
Ans: Use the SET statement to copy a data set or to concatenate data sets, stacking their observations one after another without concern for which data set contributes to each observation. Use the MERGE statement to join observations side by side and to control the contribution of each old data set to the new data set.

85. Which data set is the controlling data set in the MERGE statement?
Ans: The last data set listed on the MERGE statement: for variables with the same name (other than BY variables), its values overwrite those read from the data sets before it.
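The difference between a one-to-one merge and a match-merge (questions 81, 83 and 85) is easiest to see on a tiny example. The sketch below uses hypothetical data sets ONE and TWO with made-up names and values.

```sas
/* Two small hypothetical data sets, both already in NAME order */
data one;
    input name $ score;
    datalines;
Ann 90
Bob 85
;
run;

data two;
    input name $ grade $;
    datalines;
Bob B
Cat C
;
run;

/* One-to-one merge: no BY statement, observations paired by position */
data by_position;
    merge one two;
run;

/* Match-merge: observations paired on the value of NAME */
data by_name;
    merge one two;
    by name;
run;

proc print data=by_position; run;
proc print data=by_name; run;
```

In BY_POSITION the NAME values come from TWO, the last data set listed, so Ann's score ends up beside Bob's name. In BY_NAME each score and grade stays with the right person, and unmatched names receive missing values.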

86. How does the IN= variable improve the capability of a MERGE?
Ans: IN= creates a temporary variable that is 1 when the data set contributed to the current observation and 0 otherwise. It helps in controlling which data sets need to contribute to the new data set.

87. Explain the message "MERGE HAS ONE OR MORE DATASETS WITH REPEATS OF BY VARIABLE".

CUSTOMIZED REPORT WRITING:

88. What is the purpose of the statement DATA _NULL_?
Ans: The keyword _NULL_ gives you the power of the DATA step without creating a data set.

89. What is the pound sign used for in the DATA _NULL_?

90. What is the purpose of using the N=PS option?
Ans: Specifying N=PS in the FILE statement allows the output pointer to write on any line of the current output page.

MACRO:

91. What system options would you use to help debug a macro?
Ans: SYMBOLGEN, MLOGIC and MPRINT.

92. Describe how you would create a macro variable.
Ans: %let var=value;

93. How do you identify a macro variable?

94. How do you define the end of a macro?
Ans: %MEND

95. How do you assign a macro variable to a SAS variable?
Ans: CALL SYMPUT assigns a value from the DATA step to a macro variable. In the other direction, the SYMGET function (or a plain &name reference) brings a macro variable's value into a DATA step variable.
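The macro answers above (questions 91, 92 and 95) can be combined in one short program. This is a sketch only: the macro variable names CUTOFF and N_OLDER and the DATA step variable LIMIT are hypothetical, and SASHELP.CLASS is the sample data set shipped with SAS.

```sas
/* Debugging options from question 91 */
options symbolgen mlogic mprint;

/* Create a macro variable in open code (question 92) */
%let cutoff = 13;

data _null_;
    set sashelp.class end=last;

    /* Count students older than the cutoff; the sum statement retains N_OLDER */
    if age > &cutoff then n_older + 1;

    /* Macro variable -> DATA step variable (returns a character value) */
    limit = symget('cutoff');

    /* DATA step value -> macro variable, written on the last observation */
    if last then call symput('n_older', strip(put(n_older, 8.)));
run;

/* The new macro variable is usable once the DATA step has finished */
%put &n_older students are older than &cutoff;
```

The %PUT statement is placed after the RUN statement on purpose, since a macro variable created by CALL SYMPUT does not exist until the DATA step has executed.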

96. What is the difference between %LOCAL and %GLOBAL?
Ans: A variable declared with %LOCAL exists only within the macro in which it is declared, whereas a variable declared with %GLOBAL is available until the end of the SAS session.

97. How long can a macro variable be? A token?
Ans: A macro variable name can be up to 32 characters long, and in SAS 9 its value can be up to 65,534 characters.

98. If you use a SYMPUT in a DATA step, when and where can you use the macro variable?
Ans: It can be used only after the DATA step has finished executing, that is, outside that DATA step. When the DATA step runs in open code, the macro variable is globally available.

100. How would you code a macro statement to produce information on the SAS log?
Ans: With the %PUT statement.
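The scope rules from question 96 and the %PUT statement from question 100 can be seen together in a few lines. The macro name SHOW_SCOPE and the macro variables SITE and COUNTER are hypothetical names used only for this sketch.

```sas
/* SITE is declared global, so it lasts for the whole session */
%global site;
%let site = Cary;

%macro show_scope;
    /* COUNTER is local: it exists only while SHOW_SCOPE is running */
    %local counter;
    %let counter = 1;
    %put Inside macro: site=&site counter=&counter;
%mend show_scope;

%show_scope

/* SITE is still available here; COUNTER no longer exists */
%put Outside macro: site=&site;

/* List all user-defined macro variables and their scopes in the log */
%put _user_;
```

Declaring working variables with %LOCAL keeps a macro from accidentally overwriting a global variable of the same name.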