Transforming SAS Data Sets Using Arrays. Introduction
|
|
|
- Gwendoline Young
- 9 years ago
- Views:
Transcription
1 Transforming SAS Data Sets Using Arrays Ronald Cody, Ed.D., Robert Wood Johnson Medical School, Piscataway, NJ Introduction This paper describes how to efficiently transform SAS data sets using arrays. First, let us discuss what we mean by the term "transforming." You may want to create multiple observations from a single observation or vice versa. There are several possible reasons why you may want to do this. You may want to create multiple observations from a single observation to count frequencies or to allow for BY variable processing. You may also want to restructure SAS data sets for certain statistical analyses. Creating a single observation from multiple observations may make it easier for you to compute differences between variables without resorting to LAG functions or to use the REPEATED statement in PROC GLM. PROC TRANSFORM may come to mind as a solution to these transforming problems, but using arrays in a Data Step can be more flexible and allow you to have full control over the transformation process. Example 1: Creating a New Data Set with Several Observations per Subject from a Data Set with One Observation per Subject Suppose you have a data set called DIAGNOSE, with the variables ID, DX1, DX2, and DX3. The DXn variables represent three diagnosis codes. The observations in data set DIAGNOSE are: DATA SET DIAGNOSE ID DX1 DX2 DX As you can see, some subjects have only one diagnosis code, some two, and some, all three. Suppose you want to count how many subjects have diagnosis 1, how many have diagnosis 2, and so on. You don't care if the diagnosis code is listed as DX1, DX2, or DX3. In the example here, you would have a frequency of one for diagnosis codes 1, 2, 5, and 7 and a frequency of two for diagnosis codes 3 and 4. One way to accomplish this task is to transform the data set DIAGNOSE which has one observation per subject and three diagnosis variables, to a data set that has a single diagnosis variable and as many observations per subject as there are diagnoses for that subject. This new data set (call it NEW_DX) would look like the one shown next: TRANSFORMED DATA SET (NEW_DX) ID DX Using data set NEW_DX, it is now a simple job to count diagnosis codes using PROC FREQ on the single variable DX. Let us first write a SAS DATA step that creates data set NEW_DX from data set DIAGNOSE but does not use arrays. Here is the code: EXAMPLE 1A: CREATING MULTIPLE OBSERVATIONS FROM A SINGLE OBSERVATION WITHOUT USING AN ARRAY DATA NEW_DX; SET DIAGNOSE; DX = DX1; DX = DX2; DX = DX3; KEEP ID DX; Let's see how this program works. As you read in each observation from data set DIAGNOSE, you create from one to three observations in the new
2 data set NEW_DX. The SET statement brings in each observation from the original data set (DIAGNOSE), one at a time. In the first iteration of this Data Step, the values of ID, DX1, DX2, and DX3 are 01, 3, 4, and missing, respectively. Next, a new variable, DX, is set equal to DX1 (which is a 3). Since this is not a missing value, the OUTPUT statement in the next line is executed and the first observation in data set NEW_DX is formed. The values of all the variables in the PDV at this point are: ID=01 DX1=3 DX2=4 DX3=. DX=3 But, since we have a KEEP statement in the Data Step, only the values for ID and DX are written out to the new data set. Next, the value of DX is set to DX2 (which is a 4). Again, since the value of DX is not a missing value, another observation is written to the data set NEW_DX. Finally, DX is set equal to DX3 (which is a missing value). Since the following IF statement is not true, the third OUTPUT statement of the Data Step does not execute and execution returns to the top of the Data Step and another observation from data set DIAGNOSE is read. As you can see, the program will create as many observations per subject as there are nonmissing DX codes for that subject. Notice the repetitive nature of the program and your array light bulb should turn on. Here is the program rewritten using arrays: EXAMPLE 1B: CREATING MULTIPLE OBSERVATIONS FROM A SINGLE OBSERVATION USING AN ARRAY DATA NEW_DX; SET DIAGNOSE; ARRAY DXARRAY[3] DX1 - DX3; DO I = 1 TO 3; DX = DXARRAY[I]; KEEP ID DX; In this program, you first create an array called DXARRAY which contains the three numeric variables DX1, DX2, and DX3. Remember that this array allows you to refer to any of the variables associated with it by listing the array name, subscripted with the appropriate index (subscript). Also remember that array elements exist only in the Data Step in which they are created and they are not part of the SAS data set being created. Finally, array names follow the same rules as SAS variable names. (Note: Do not use the same name for an array as a variable in your data set.) Now, back to the program. The two lines of code inside the DO loop are similar to the repeated lines in the non-array example with the variable names DX1, DX2, and DX3 replaced by the array elements. To count the number of subjects with each diagnosis code, you can now use PROC FREQ like this: PROC FREQ DATA=NEW_DX; TABLES DX / NOCUM; In this example, by using an array, you only saved one line of SAS code. However, if there were more variables, DX1 to DX50 for example, the savings would be substantial. Example 2: Another Example of Creating Multiple Observations from a Single Observation Here is an example that is similar to Example 1. You start with a data set that contains an ID variable and three variables S1, S2, and S3 which represent a score at times 1, 2, and 3 respectively. The original data set called ONEPER, looks as follows: DATA SET ONEPER ID S1 S2 S You want to create a new data set called MANYPER, which looks like this: DATA SET MANYPER ID TIME SCORE
3 The program to transform data set ONEPER to data set MANYPER is similar to the program in Example 1 except that you need to create the TIME variable in the transformed data set. This is easily accomplished by naming the DO loop counter TIME as follows: EXAMPLE 2: CREATING MULTIPLE OBSERVATIONS FROM A SINGLE OBSERVATION USING AN ARRAY DATA MANYPER; SET ONEPER; ARRAY S[3]; DO TIME = 1 TO 3; SCORE = S[TIME]; OUTPUT; KEEP ID TIME SCORE; We first notice that the ARRAY statement in this Data Step does not have a variable list. This was done to demonstrate another way of writing an ARRAY statement. When the variable list is omitted, the variable names default to the array name followed by the numbers from the lower bound to the upper bound. In this case, the statement ARRAY S[3]; is equivalent to ARRAY S[3] S1-S3; The SET statement brings in observations from the data set ONEPER, which contain the variables ID, S1, S2, and S3. This program is similar to the previous program except for the fact that we call the DO loop variable TIME and we always output three observations for every observation in data set ONEPER. (If there are any missing values for the variables S1, S2, or S3, there will be an observation in the new data set with a missing value for the variable called SCORE. Still going in the direction of creating multiple observations from a single observation, let us extend this program to include an additional dimension. Example 3: Going from One Observation per Subject to Many Observations per Subject Using Multidimensional Arrays Suppose you have a SAS data set (call it WT_ONE) that contains an ID and six weights for each subject in an experiment. The first three values represents weights at times 1,2, and 3 under condition 1; the next three values represent weights at times 1,2, and 3 under condition 2 (see the diagram below): CONDITION TIME TIME WT1 WT2 WT3 WT4 WT5 WT6 To clarify this, suppose that data set WT_ONE contains two observations: DATA SET WT_ONE ID WT1 WT2 WT3 WT4 WT5 WT Here, weights 1, 2, and 3 correspond to measurements made under condition 1at times 1 to 3 and weights 4, 5, and 6 correspond to measurements made under condition 2 at times 1 to 3. You want a new data set called WT_MANY to look like this: DATA SET WT_MANY ID COND TIME WEIGHT A convenient way to make this conversion would be to create a two-dimensional array with the first dimension representing condition and the second
4 representing time. So, instead of having a onedimensional array like this: ARRAY WEIGHT[6] WT1-WT6; you could create a two-dimensional array like this: ARRAY WEIGHT[2,3] WT1-WT6; The comma between the 2 and 3 separates the dimensions of the array. This is a 2 by 3 array. Array element WEIGHT[2,3], for example, would represent a subject's weight under condition 2 at time 3. Let us use this array structure to create the new data set which contains 6 observations for each ID. Each observation is to contain the ID and one of the 6 weights, along with two new variables, COND and TIME which represent the condition and the time at which the weight was recorded. Here is the restructuring program: EXAMPLE 3: USING A MULTIDIMEN- SIONAL ARRAY TO RESTRUCTURE A DATA SET DATA WT_MANY; SET WT_ONE; ARRAY WTS [2,3] WT1-WT6; DO COND = 1 TO 2; DO TIME = 1 TO 3; WEIGHT = WTS[COND,TIME]; OUTPUT; DROP WT1-WT6; To cover all combinations of condition and time, you use "nested" DO loops, that is, a DO loop within a DO loop. Here's how it works: COND is first set to 1 by the outer loop. Next, TIME is set to 1,2, and 3 in the inner loop while COND remains at 1. Once the inner loop is completed, (TIME has reached 3), the control returns to the outer loop where condition (COND) is new set to 2 and the inner loop cycles through the three times again. Each time a COND and TIME combination is selected, a WEIGHT is set equal to the appropriate array element and the observation is written out to the new data set. Example 4: Creating a Data Set with One Observation per Subject from a Data Set with Multiple Observations per Subject It's now time to reverse the transformation process. We will do the reverse of Example 2 to demonstrate how to create a single observation from multiple observations. This time, we will start with data set MANYPER and create data set ONEPER. First the program, then the explanation: EXAMPLE 4A: CREATING A DATA SET WITH ONE OBSERVATION PER SUBJECT FROM A DATA SET WITH MULTIPLE OBSERVATIONS PER SUBJECT. (CAUTION, THIS PRO- GRAM WILL NOT WORK IF THERE ARE ANY MISSING TIME VALUES.) PROC SORT DATA=MANYPER; BY ID TIME; DATA ONEPER; ARRAY S[3] S1-S3; RETAIN S1-S3; SET MANYPER; BY ID; S[TIME] = SCORE; IF LAST.ID THEN OUTPUT; KEEP ID S1-S3; First you sort data set MANYPER to be sure that the observations are in ID and TIME order. In this example, data set MANYPER is already in the correct order but the SORT procedure makes the program more general. Next, you need to create an array containing the variables you want in the ONEPER data set, namely, S1, S2, and S3. You can "play computer" to see how this program works. The first observation in data set MANYPER is: ID = 01 TIME = 1 SCORE = 4 Since TIME is equal to a 1, S[TIME] is equivalent to S[1], which represents the variable S1 and is set to the value of SCORE which is 4. Since LAST.ID is false, the OUTPUT statement does not execute and control returns to the top of the Data Step. The
5 value of S1 is still equal to 4 since it is one of the retained variables (as are the variables S2 and S3, because of the RETAIN statement which follows the ARRAY statement). The values of retained variables are not set to a missing value for each iteration of the Data Step as are the values of nonretained variables. In the next observation time is 2 and SCORE is 5, so the variable S2 is assigned a value of 5. Finally, the third and last observation is read for ID 01. S3 is set to the value of SCORE which is 6 and since LAST.ID is true, the first observation in data set ONEPER is written. Everything seems fine. Almost. What if data set MANYPER did not have an observation at all three values of time for each ID? To see what happens when there is a missing value, we have created a new data set (MANYPER2) shown below: DATA SET MANYPER2 ID TIME SCORE Notice that ID number 02 does not have an observation with TIME = 2. What will happen if you run program Example 4A? Since you retained the values of S1, S2, and S3 and never replaced the value of S2 for ID number 02, ID number 2 will be given the value of S2 from the previous subject! Not what you want. You must always be careful when you retain variables. To be sure that this will not happen, you need to set the values of S1, S2, and S3 to missing each time you encounter a new subject. This is easily accomplished by initializing each of the new variables to a missing value each time a new subject is encountered (FIRST.ID is true). The corrected program is shown next: EXAMPLE 4B: CREATING A DATA SET WITH ONE OBSERVATION PER SUBJECT FROM A DATA SET WITH MULTIPLE OBSERVATIONS PER SUBJECT (CORRECTED VERSION) PROC SORT DATA=MANYPER2; BY ID TIME; DATA ONEPER; ARRAY S[3] S1-S3; RETAIN S1-S3; SET MANYPER2; BY ID; IF FIRST.ID THEN DO I = 1 TO 3; S[I] =.; S[TIME] = SCORE; IF LAST.ID THEN OUTPUT; KEEP ID S1-S3; This program will now work correctly whether or not there are missing TIME values. Example 5: Creating a Data Set with One Observation per Subject from a Data Set with Multiple Observations per Subject Using a Multidimensional Array This example will be the reverse of Example 3. That is, you want to start from data set WT_MANY and wind up with data set WT_ONE. The solution to this problem is similar to Example 4 except we will use a multidimensional array. Instead of writing the program in two steps as we did in Examples 4A and 4B, we will present the general solution that will work whether or not there are any missing observations in the data set. Here is the program: EXAMPLE 5: CREATING A DATA SET WITH ONE OBSERVATION PER SUBJECT FROM A DATA SET WITH MULTIPLE OBSERVATIONS PER SUBJECT USING A MULTDIMEN- SIONAL ARRAY PROC SORT DATA=WT_MANY; BY ID COND TIME; DATA WT_ONE; ARRAY WT[2,3] WT1-WT6;
6 RETAIN WT1-WT6; SET WT_MANY; BY ID; IF FIRST.ID THEN DO I = 1 TO 2; DO J = 1 TO 3; WT[I,J] =.; WT[COND,TIME] = WEIGHT; IF LAST.ID THEN OUTPUT; KEEP ID WT1-WT6; PROC PRINT DATA=WT_ONE; TITLE 'WT_ONE AGAIN'; Again, notice the similarities between this program and Program 4B. First, each time we process a new subject, we initialize all of the WT variables to a missing value (using nested DO loops). Next, depending on the value of COND and TIME, we set the appropriate WT to the value of the variable WEIGHT. Finally, when we reach the last observation for a subject, we output the single observation, keeping only the variables ID and WT1- WT6. Conclusion You have seen how to restructure SAS data sets, going from one to many or from many to one observation per subject using arrays. You may want to keep these examples handy for the next time you have a restructuring job to be done. SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries, indicated USA registration. Ronald P. Cody, Ed.D. Robert Wood Johnson Medical School Department of Environmental and Community Medicine 675 Hoes Lane Piscataway, NJ (732) cody@umdnj,edu
Demonstrating a DATA Step with and without a RETAIN Statement
1 The RETAIN Statement Introduction 1 Demonstrating a DATA Step with and without a RETAIN Statement 1 Generating Sequential SUBJECT Numbers Using a Retained Variable 7 Using a SUM Statement to Create SUBJECT
Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation
Paper 109-25 Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation Abstract This paper discusses methods of joining SAS data sets. The different methods and the reasons for choosing a particular
Taming the PROC TRANSPOSE
Taming the PROC TRANSPOSE Matt Taylor, Carolina Analytical Consulting, LLC ABSTRACT The PROC TRANSPOSE is often misunderstood and seldom used. SAS users are unsure of the results it will give and curious
Data Cleaning 101. Ronald Cody, Ed.D., Robert Wood Johnson Medical School, Piscataway, NJ. Variable Name. Valid Values. Type
Data Cleaning 101 Ronald Cody, Ed.D., Robert Wood Johnson Medical School, Piscataway, NJ INTRODUCTION One of the first and most important steps in any data processing task is to verify that your data values
1 Checking Values of Character Variables
1 Checking Values of Character Variables Introduction 1 Using PROC FREQ to List Values 1 Description of the Raw Data File PATIENTS.TXT 2 Using a DATA Step to Check for Invalid Values 7 Describing the VERIFY,
Introduction to Criteria-based Deduplication of Records, continued SESUG 2012
SESUG 2012 Paper CT-11 An Introduction to Criteria-based Deduplication of Records Elizabeth Heath RTI International, RTP, NC Priya Suresh RTI International, RTP, NC ABSTRACT When survey respondents are
Paper 2917. Creating Variables: Traps and Pitfalls Olena Galligan, Clinops LLC, San Francisco, CA
Paper 2917 Creating Variables: Traps and Pitfalls Olena Galligan, Clinops LLC, San Francisco, CA ABSTRACT Creation of variables is one of the most common SAS programming tasks. However, sometimes it produces
Counting the Ways to Count in SAS. Imelda C. Go, South Carolina Department of Education, Columbia, SC
Paper CC 14 Counting the Ways to Count in SAS Imelda C. Go, South Carolina Department of Education, Columbia, SC ABSTRACT This paper first takes the reader through a progression of ways to count in SAS.
What You re Missing About Missing Values
Paper 1440-2014 What You re Missing About Missing Values Christopher J. Bost, MDRC, New York, NY ABSTRACT Do you know everything you need to know about missing values? Do you know how to assign a missing
Multiplying Fractions
. Multiplying Fractions. OBJECTIVES 1. Multiply two fractions. Multiply two mixed numbers. Simplify before multiplying fractions 4. Estimate products by rounding Multiplication is the easiest of the four
Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY
Guido s Guide to PROC FREQ A Tutorial for Beginners Using the SAS System Joseph J. Guido, University of Rochester Medical Center, Rochester, NY ABSTRACT PROC FREQ is an essential procedure within BASE
Anyone Can Learn PROC TABULATE
Paper 60-27 Anyone Can Learn PROC TABULATE Lauren Haworth, Genentech, Inc., South San Francisco, CA ABSTRACT SAS Software provides hundreds of ways you can analyze your data. You can use the DATA step
In this Chapter you ll learn:
Now go, write it before them in a table, and note it in a book. Isaiah 30:8 To go beyond is as wrong as to fall short. Confucius Begin at the beginning, and go on till you come to the end: then stop. Lewis
VB.NET Programming Fundamentals
Chapter 3 Objectives Programming Fundamentals In this chapter, you will: Learn about the programming language Write a module definition Use variables and data types Compute with Write decision-making statements
USING PROCEDURES TO CREATE SAS DATA SETS... ILLUSTRATED WITH AGE ADJUSTING OF DEATH RATES 1
USING PROCEDURES TO CREATE SAS DATA SETS... ILLUSTRATED WITH AGE ADJUSTING OF DEATH RATES 1 There may be times when you run a SAS procedure when you would like to direct the procedure output to a SAS data
Top Ten Reasons to Use PROC SQL
Paper 042-29 Top Ten Reasons to Use PROC SQL Weiming Hu, Center for Health Research Kaiser Permanente, Portland, Oregon, USA ABSTRACT Among SAS users, it seems there are two groups of people, those who
What is a Loop? Pretest Loops in C++ Types of Loop Testing. Count-controlled loops. Loops can be...
What is a Loop? CSC Intermediate Programming Looping A loop is a repetition control structure It causes a single statement or a group of statements to be executed repeatedly It uses a condition to control
Writing Control Structures
Writing Control Structures Copyright 2006, Oracle. All rights reserved. Oracle Database 10g: PL/SQL Fundamentals 5-1 Objectives After completing this lesson, you should be able to do the following: Identify
Labels, Labels, and More Labels Stephanie R. Thompson, Rochester Institute of Technology, Rochester, NY
Paper FF-007 Labels, Labels, and More Labels Stephanie R. Thompson, Rochester Institute of Technology, Rochester, NY ABSTRACT SAS datasets include labels as optional variable attributes in the descriptor
Study Guide for the Physical Education: Content and Design Test
Study Guide for the Physical Education: Content and Design Test A PUBLICATION OF ETS Copyright 2011 by Educational Testing Service. All rights reserved. ETS, the ETS logo, GRE, and LISTENING. LEARNING.
Chapter 5 Programming Statements. Chapter Table of Contents
Chapter 5 Programming Statements Chapter Table of Contents OVERVIEW... 57 IF-THEN/ELSE STATEMENTS... 57 DO GROUPS... 58 IterativeExecution... 59 JUMPING... 61 MODULES... 62 Defining and Executing a Module....
C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods
C H A P T E R 1 Introducing Data Relationships, Techniques for Data Manipulation, and Access Methods Overview 1 Determining Data Relationships 1 Understanding the Methods for Combining SAS Data Sets 3
Foundations & Fundamentals. A PROC SQL Primer. Matt Taylor, Carolina Analytical Consulting, LLC, Charlotte, NC
A PROC SQL Primer Matt Taylor, Carolina Analytical Consulting, LLC, Charlotte, NC ABSTRACT Most SAS programmers utilize the power of the DATA step to manipulate their datasets. However, unless they pull
Salary. Cumulative Frequency
HW01 Answering the Right Question with the Right PROC Carrie Mariner, Afton-Royal Training & Consulting, Richmond, VA ABSTRACT When your boss comes to you and says "I need this report by tomorrow!" do
Intro to Longitudinal Data: A Grad Student How-To Paper Elisa L. Priest 1,2, Ashley W. Collinsworth 1,3 1
Intro to Longitudinal Data: A Grad Student How-To Paper Elisa L. Priest 1,2, Ashley W. Collinsworth 1,3 1 Institute for Health Care Research and Improvement, Baylor Health Care System 2 University of North
SAS Certified Base Programmer for SAS 9 A00-211. SAS Certification Questions and Answers with explanation
SAS BASE Certification 490 Questions + 50 Page Revision Notes (Free update to new questions for the Same Product) By SAS Certified Base Programmer for SAS 9 A00-211 SAS Certification Questions and Answers
Alternatives to Merging SAS Data Sets But Be Careful
lternatives to Merging SS Data Sets ut e Careful Michael J. Wieczkowski, IMS HELTH, Plymouth Meeting, P bstract The MERGE statement in the SS programming language is a very useful tool in combining or
LOCF-Different Approaches, Same Results Using LAG Function, RETAIN Statement, and ARRAY Facility Iuliana Barbalau, ClinOps LLC. San Francisco, CA.
LOCF-Different Approaches, Same Results Using LAG Function, RETAIN Statement, and ARRAY Facility Iuliana Barbalau, ClinOps LLC. San Francisco, CA. ABSTRACT LOCF stands for Last Observation Carried Forward
Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA.
Paper 23-27 Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA. ABSTRACT Have you ever had trouble getting a SAS job to complete, although
6. Control Structures
- 35 - Control Structures: 6. Control Structures A program is usually not limited to a linear sequence of instructions. During its process it may bifurcate, repeat code or take decisions. For that purpose,
Random effects and nested models with SAS
Random effects and nested models with SAS /************* classical2.sas ********************* Three levels of factor A, four levels of B Both fixed Both random A fixed, B random B nested within A ***************************************************/
Effective Use of SQL in SAS Programming
INTRODUCTION Effective Use of SQL in SAS Programming Yi Zhao Merck & Co. Inc., Upper Gwynedd, Pennsylvania Structured Query Language (SQL) is a data manipulation tool of which many SAS programmers are
Defining a Validation Process for End-user (Data Manager / Statisticians) SAS Programs
Defining a Validation Process for End-user (Data Manager / Statisticians) SAS Programs Andy Lawton, Boehringer Ingelheim UK Ltd., Berkshire, England INTRODUCTION The requirements for validating end-user
Changing the Shape of Your Data: PROC TRANSPOSE vs. Arrays
Changing the Shape of Your Data: PROC TRANSPOSE vs. Arrays Bob Virgile Robert Virgile Associates, Inc. Overview To transpose your data (turning variables into observations or turning observations into
PharmaSUG 2013 - Paper MS05
PharmaSUG 2013 - Paper MS05 Be a Dead Cert for a SAS Cert How to prepare for the most important SAS Certifications in the Pharmaceutical Industry Hannes Engberg Raeder, inventiv Health Clinical, Germany
recursion, O(n), linked lists 6/14
recursion, O(n), linked lists 6/14 recursion reducing the amount of data to process and processing a smaller amount of data example: process one item in a list, recursively process the rest of the list
Home Health Aide Track
Best Practice: Transitional Care Coordination Home Health Aide Track HHA This material was prepared by Quality Insights of Pennsylvania, the Medicare Quality Improvement Organization Support Center for
PROC SUMMARY Options Beyond the Basics Susmita Pattnaik, PPD Inc, Morrisville, NC
Paper BB-12 PROC SUMMARY Options Beyond the Basics Susmita Pattnaik, PPD Inc, Morrisville, NC ABSTRACT PROC SUMMARY is used for summarizing the data across all observations and is familiar to most SAS
Text Analytics Illustrated with a Simple Data Set
CSC 594 Text Mining More on SAS Enterprise Miner Text Analytics Illustrated with a Simple Data Set This demonstration illustrates some text analytic results using a simple data set that is designed to
DATA CLEANING: LONGITUDINAL STUDY CROSS-VISIT CHECKS
Paper 1314-2014 ABSTRACT DATA CLEANING: LONGITUDINAL STUDY CROSS-VISIT CHECKS Lauren Parlett,PhD Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD Cross-visit checks are a vital
EXST SAS Lab Lab #4: Data input and dataset modifications
EXST SAS Lab Lab #4: Data input and dataset modifications Objectives 1. Import an EXCEL dataset. 2. Infile an external dataset (CSV file) 3. Concatenate two datasets into one 4. The PLOT statement will
One Dimension Array: Declaring a fixed-array, if array-name is the name of an array
Arrays in Visual Basic 6 An array is a collection of simple variables of the same type to which the computer can efficiently assign a list of values. Array variables have the same kinds of names as simple
Representing Vector Fields Using Field Line Diagrams
Minds On Physics Activity FFá2 5 Representing Vector Fields Using Field Line Diagrams Purpose and Expected Outcome One way of representing vector fields is using arrows to indicate the strength and direction
containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.
Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment
West Virginia University College of Engineering and Mineral Resources. Computer Engineering 313 Spring 2010
College of Engineering and Mineral Resources Computer Engineering 313 Spring 2010 Laboratory #4-A (Micromouse Algorithms) Goals This lab introduces the modified flood fill algorithm and teaches how to
Keywords are identifiers having predefined meanings in C programming language. The list of keywords used in standard C are : unsigned void
1. Explain C tokens Tokens are basic building blocks of a C program. A token is the smallest element of a C program that is meaningful to the compiler. The C compiler recognizes the following kinds of
Nine Steps to Get Started using SAS Macros
Paper 56-28 Nine Steps to Get Started using SAS Macros Jane Stroupe, SAS Institute, Chicago, IL ABSTRACT Have you ever heard your coworkers rave about macros? If so, you've probably wondered what all the
A GUIDE TO PROCESS MAPPING AND IMPROVEMENT
A GUIDE TO PROCESS MAPPING AND IMPROVEMENT Prepared by the CPS Activity Based Costing Team December 2012 CONTENTS 1. Introduction Page 3 2. What is process mapping? Page 4 3. Why process map? Page 4 4.
Object-Oriented Design Lecture 4 CSU 370 Fall 2007 (Pucella) Tuesday, Sep 18, 2007
Object-Oriented Design Lecture 4 CSU 370 Fall 2007 (Pucella) Tuesday, Sep 18, 2007 The Java Type System By now, you have seen a fair amount of Java. Time to study in more depth the foundations of the language,
Order of Operations More Essential Practice
Order of Operations More Essential Practice We will be simplifying expressions using the order of operations in this section. Automatic Skill: Order of operations needs to become an automatic skill. Failure
14:440:127 Introduction to Computers for Engineers. Notes for Lecture 06
14:440:127 Introduction to Computers for Engineers Notes for Lecture 06 Rutgers University, Spring 2010 Instructor- Blase E. Ur 1 Loop Examples 1.1 Example- Sum Primes Let s say we wanted to sum all 1,
Let the CAT Out of the Bag: String Concatenation in SAS 9 Joshua Horstman, Nested Loop Consulting, Indianapolis, IN
Paper S1-08-2013 Let the CAT Out of the Bag: String Concatenation in SAS 9 Joshua Horstman, Nested Loop Consulting, Indianapolis, IN ABSTRACT Are you still using TRIM, LEFT, and vertical bar operators
Client Marketing: Sets
Client Marketing Client Marketing: Sets Purpose Client Marketing Sets are used for selecting clients from the client records based on certain criteria you designate. Once the clients are selected, you
Greatest Common Factor and Least Common Multiple
Greatest Common Factor and Least Common Multiple Intro In order to understand the concepts of Greatest Common Factor (GCF) and Least Common Multiple (LCM), we need to define two key terms: Multiple: Multiples
Paper TU_09. Proc SQL Tips and Techniques - How to get the most out of your queries
Paper TU_09 Proc SQL Tips and Techniques - How to get the most out of your queries Kevin McGowan, Constella Group, Durham, NC Brian Spruell, Constella Group, Durham, NC Abstract: Proc SQL is a powerful
B) Mean Function: This function returns the arithmetic mean (average) and ignores the missing value. E.G: Var=MEAN (var1, var2, var3 varn);
SAS-INTERVIEW QUESTIONS 1. What SAS statements would you code to read an external raw data file to a DATA step? Ans: Infile and Input statements are used to read external raw data file to a Data Step.
The Query Builder: The Swiss Army Knife of SAS Enterprise Guide
Paper 1557-2014 The Query Builder: The Swiss Army Knife of SAS Enterprise Guide ABSTRACT Jennifer First-Kluge and Steven First, Systems Seminar Consultants, Inc. The SAS Enterprise Guide Query Builder
PROBLEMS (Cap. 4 - Istruzioni macchina)
98 CHAPTER 2 MACHINE INSTRUCTIONS AND PROGRAMS PROBLEMS (Cap. 4 - Istruzioni macchina) 2.1 Represent the decimal values 5, 2, 14, 10, 26, 19, 51, and 43, as signed, 7-bit numbers in the following binary
Introduction to SAS Mike Zdeb (402-6479, [email protected]) #122
Mike Zdeb (402-6479, [email protected]) #121 (11) COMBINING SAS DATA SETS There are a number of ways to combine SAS data sets: # concatenate - stack data sets (place one after another) # interleave - stack
LabVIEW Day 6: Saving Files and Making Sub vis
LabVIEW Day 6: Saving Files and Making Sub vis Vern Lindberg You have written various vis that do computations, make 1D and 2D arrays, and plot graphs. In practice we also want to save that data. We will
Scan Physical Inventory
Scan Physical Inventory There are 2 ways to do Inventory: #1 Count everything in inventory, usually done once a quarter #2 Count in cycles per area or category. This is a little easier and usually takes
You can probably work with decimal. binary numbers needed by the. Working with binary numbers is time- consuming & error-prone.
IP Addressing & Subnetting Made Easy Working with IP Addresses Introduction You can probably work with decimal numbers much easier than with the binary numbers needed by the computer. Working with binary
Using SAS to Perform a Table Lookup Last revised: 05/22/96
SS Pub.# 4-1 Using SAS to Perform a Table Lookup Last revised: 05/22/96 Table lookup is the technique you use to acquire additional information from an auxiliary source to supplement or replace information
How to Benchmark Your Building. Instructions for Using ENERGY STAR Portfolio Manager and Southern California Gas Company s Web Services
How to Benchmark Your Building Instructions for Using ENERGY STAR Portfolio Manager and Southern California Gas Company s Web Services This document is a quick-start guide for entering your property into
Chapter 8: Bags and Sets
Chapter 8: Bags and Sets In the stack and the queue abstractions, the order that elements are placed into the container is important, because the order elements are removed is related to the order in which
INSTRUCTIONS FOR USING HMS 2014 Scoring (v2.0)
INSTRUCTIONS FOR USING HMS 2014 Scoring (v2.0) HMS 2014 Scoring (v2.0).xls is an Excel Workbook saved as a Read Only file. Intended for scoring multi-heat events run under HMS 2014, it can also be used
Five Steps Towards Effective Fraud Management
Five Steps Towards Effective Fraud Management Merchants doing business in a card-not-present environment are exposed to significantly higher fraud risk, costly chargebacks and the challenge of securing
Access Part 2 - Design
Access Part 2 - Design The Database Design Process It is important to remember that creating a database is an iterative process. After the database is created and you and others begin to use it there will
Mathematical goals. Starting points. Materials required. Time needed
Level S6 of challenge: B/C S6 Interpreting frequency graphs, cumulative cumulative frequency frequency graphs, graphs, box and box whisker and plots whisker plots Mathematical goals Starting points Materials
An Introduction to Excel Pivot Tables
An Introduction to Excel Pivot Tables EXCEL REVIEW 2001-2002 This brief introduction to Excel Pivot Tables addresses the English version of MS Excel 2000. Microsoft revised the Pivot Tables feature with
Paper FF-014. Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL
Paper FF-014 Tips for Moving to SAS Enterprise Guide on Unix Patricia Hettinger, Consultant, Oak Brook, IL ABSTRACT Many companies are moving to SAS Enterprise Guide, often with just a Unix server. A surprising
User guide for the Error & Warning LabVIEW toolset
User guide for the Error & Warning LabVIEW toolset Rev. 2014 December 2 nd 2014 1 INTRODUCTION... 1 2 THE LABVIEW ERROR CLUSTER... 2 2.1 The error description... 3 2.2 Custom error descriptions... 4 3
Building Concepts: Dividing a Fraction by a Whole Number
Lesson Overview This TI-Nspire lesson uses a unit square to explore division of a unit fraction and a fraction in general by a whole number. The concept of dividing a quantity by a whole number, n, can
What Is Recursion? Recursion. Binary search example postponed to end of lecture
Recursion Binary search example postponed to end of lecture What Is Recursion? Recursive call A method call in which the method being called is the same as the one making the call Direct recursion Recursion
1974 Rubik. Rubik and Rubik's are trademarks of Seven Towns ltd., used under license. All rights reserved. Solution Hints booklet
# # R 1974 Rubik. Rubik and Rubik's are trademarks of Seven Towns ltd., used under license. All rights reserved. Solution Hints booklet The Professor s Cube Solution Hints Booklet The Greatest Challenge
Data Flow Diagrams. Outline. Some Rules for External Entities 1/25/2010. Mechanics
Data Flow Diagrams Mechanics Outline DFD symbols External entities (sources and sinks) Data Stores Data Flows Processes Types of diagrams Step by step approach Rules Some Rules for External Entities External
but never to show their absence Edsger W. Dijkstra
Fortran 90 Arrays Program testing can be used to show the presence of bugs, but never to show their absence Edsger W. Dijkstra Fall 2009 1 The DIMENSION Attribute: 1/6 A Fortran 90 program uses the DIMENSION
Lecture 2 Notes: Flow of Control
6.096 Introduction to C++ January, 2011 Massachusetts Institute of Technology John Marrero Lecture 2 Notes: Flow of Control 1 Motivation Normally, a program executes statements from first to last. The
Paper PO06. Randomization in Clinical Trial Studies
Paper PO06 Randomization in Clinical Trial Studies David Shen, WCI, Inc. Zaizai Lu, AstraZeneca Pharmaceuticals ABSTRACT Randomization is of central importance in clinical trials. It prevents selection
How To Proofread
GRADE 8 English Language Arts Proofreading: Lesson 6 Read aloud to the students the material that is printed in boldface type inside the boxes. Information in regular type inside the boxes and all information
SQL Server Database Coding Standards and Guidelines
SQL Server Database Coding Standards and Guidelines http://www.sqlauthority.com Naming Tables: Stored Procs: Triggers: Indexes: Primary Keys: Foreign Keys: Defaults: Columns: General Rules: Rules: Pascal
MASSACHUSETTS INSTITUTE OF TECHNOLOGY. 6.033 Computer Systems Engineering: Spring 2013. Quiz I Solutions
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.033 Computer Systems Engineering: Spring 2013 Quiz I Solutions There are 11 questions and 8 pages in this
Preparing a Windows 7 Gold Image for Unidesk
Preparing a Windows 7 Gold Image for Unidesk What is a Unidesk gold image? In Unidesk, a gold image is, essentially, a virtual machine that contains the base operating system and usually, not much more
"Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1
BASIC STATISTICAL THEORY / 3 CHAPTER ONE BASIC STATISTICAL THEORY "Statistical methods are objective methods by which group trends are abstracted from observations on many separate individuals." 1 Medicine
Query Processing C H A P T E R12. Practice Exercises
C H A P T E R12 Query Processing Practice Exercises 12.1 Assume (for simplicity in this exercise) that only one tuple fits in a block and memory holds at most 3 blocks. Show the runs created on each pass
Metacognition. Complete the Metacognitive Awareness Inventory for a quick assessment to:
Metacognition Metacognition is essential to successful learning because it enables individuals to better manage their cognitive skills and to determine weaknesses that can be corrected by constructing
To add a data form to excel - you need to have the insert form table active - to make it active and add it to excel do the following:
Excel Forms A data form provides a convenient way to enter or display one complete row of information in a range or table without scrolling horizontally. You may find that using a data form can make data
Survey Analysis: Options for Missing Data
Survey Analysis: Options for Missing Data Paul Gorrell, Social & Scientific Systems, Inc., Silver Spring, MD Abstract A common situation researchers working with survey data face is the analysis of missing
Zero and Negative Exponents. Section 7-1
Zero and Negative Exponents Section 7-1 Goals Goal To simplify expressions involving zero and negative exponents. Rubric Level 1 Know the goals. Level 2 Fully understand the goals. Level 3 Use the goals
Elena Baralis, Silvia Chiusano Politecnico di Torino. Pag. 1. Physical Design. Phases of database design. Physical design: Inputs.
Phases of database design Application requirements Conceptual design Database Management Systems Conceptual schema Logical design ER or UML Physical Design Relational tables Logical schema Physical design
Imputing Missing Data using SAS
ABSTRACT Paper 3295-2015 Imputing Missing Data using SAS Christopher Yim, California Polytechnic State University, San Luis Obispo Missing data is an unfortunate reality of statistics. However, there are
Two-Dimensional Conduction: Shape Factors and Dimensionless Conduction Heat Rates
Two-Dimensional Conduction: Shape Factors and Dimensionless Conduction Heat Rates Chapter 4 Sections 4.1 and 4.3 make use of commercial FEA program to look at this. D Conduction- General Considerations
