LEADS AND LAGS: HANDLING QUEUES IN THE SAS DATA STEP



Similar documents
Leads and Lags: Static and Dynamic Queues in the SAS DATA STEP

LEADS AND LAGS IN SAS

Paper Merges and Joins Timothy J Harrington, Trilogy Consulting Corporation

The SET Statement and Beyond: Uses and Abuses of the SET Statement. S. David Riba, JADE Tech, Inc., Clearwater, FL

PO-18 Array, Hurray, Array; Consolidate or Expand Your Input Data Stream Using Arrays

Finding National Best Bid and Best Offer

Statistics and Analysis. Quality Control: How to Analyze and Verify Financial Data

The Essentials of Finding the Distinct, Unique, and Duplicate Values in Your Data

Paper An Introduction to SAS PROC SQL Timothy J Harrington, Venturi Partners Consulting, Waukegan, Illinois

Using DATA Step MERGE and PROC SQL JOIN to Combine SAS Datasets Dalia C. Kahane, Westat, Rockville, MD

Subsetting Observations from Large SAS Data Sets

SUGI 29 Coders' Corner

Everything you wanted to know about MERGE but were afraid to ask

Procurement Planning

ABSTRACT INTRODUCTION FILE IMPORT WIZARD

Sensex Realized Volatility Index

Applications Development

Salary. Cumulative Frequency

Microfinance Credit Risk Dashboard User Guide

Designing and Implementing Forms 34

Chapter 25 Specifying Forecasting Models

User Guide for TASKE Desktop

Effective Use of SQL in SAS Programming

KEY FEATURES OF SOURCE CONTROL UTILITIES

AT&T Global Network Client for Windows Product Support Matrix January 29, 2015

Intro to Longitudinal Data: A Grad Student How-To Paper Elisa L. Priest 1,2, Ashley W. Collinsworth 1,3 1

Teamstudio USER GUIDE

AN ANIMATED GUIDE: SENDING SAS FILE TO EXCEL

Chapter 1 Overview of the SQL Procedure

Tips to Use Character String Functions in Record Lookup

Using Edit-Distance Functions to Identify Similar Addresses Howard Schreier, U.S. Dept. of Commerce, Washington DC

Temporally Allocating Emissions with CEM Data. Scott Edick Michigan Department of Environmental Quality P.O. Box Lansing MI 48909

SAS Task Manager 2.2. User s Guide. SAS Documentation

How to Use SDTM Definition and ADaM Specifications Documents. to Facilitate SAS Programming

Switching from PC SAS to SAS Enterprise Guide Zhengxin (Cindy) Yang, inventiv Health Clinical, Princeton, NJ

Innovative Techniques and Tools to Detect Data Quality Problems

PROJECTS SCHEDULING AND COST CONTROLS

Grain Stocks Estimates: Can Anything Explain the Market Surprises of Recent Years? Scott H. Irwin

Accounts Receivable System Administration Manual

Paper Hot Links: Creating Embedded URLs using ODS Jonathan Squire, C 2 RA (Cambridge Clinical Research Associates), Andover, MA

COGNOS Query Studio Ad Hoc Reporting

Directions for the Well Allocation Deck Upload spreadsheet

Do It Yourself (DIY) Data: Creating a Searchable Data Set of Available Classrooms using SAS Enterprise BI Server

Managing Data Issues Identified During Programming

ing Automated Notification of Errors in a Batch SAS Program Julie Kilburn, City of Hope, Duarte, CA Rebecca Ottesen, City of Hope, Duarte, CA

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

COMPARISON OF FIXED & VARIABLE RATES (25 YEARS) CHARTERED BANK ADMINISTERED INTEREST RATES - PRIME BUSINESS*

BAT Smart View for Budget Users. Miami-Dade County. BAT Smart View Training Activity Guide

An Approach to Creating Archives That Minimizes Storage Requirements

Software License Registration Guide

Release 2.1 of SAS Add-In for Microsoft Office Bringing Microsoft PowerPoint into the Mix ABSTRACT INTRODUCTION Data Access

Scatter Chart. Segmented Bar Chart. Overlay Chart

Banner Employee Self-Service Web Time Entry. Student Workers User s Guide

Database Web Development ITP 300 (3 Units)

Traditional Conjoint Analysis with Excel

SAS. Cloud. Account Administrator s Guide. SAS Documentation

DBF Chapter. Note to UNIX and OS/390 Users. Import/Export Facility CHAPTER 7

SAS Add-In 2.1 for Microsoft Office: Getting Started with Data Analysis

Sage 200 v5.10 What s New At a Glance

Comparing share-price performance of a stock

Using simulation to calculate the NPV of a project

More Tales from the Help Desk: Solutions for Simple SAS Mistakes Bruce Gilsen, Federal Reserve Board

Alternatives to Merging SAS Data Sets But Be Careful

SAS Comments How Straightforward Are They? Jacksen Lou, Merck & Co.,, Blue Bell, PA 19422

Data-driven Validation Rules: Custom Data Validation Without Custom Programming Don Hopkins, Ursa Logic Corporation, Durham, NC

Managing User Accounts and User Groups

Using SAS With a SQL Server Database. M. Rita Thissen, Yan Chen Tang, Elizabeth Heath RTI International, RTP, NC

Managing Communications using InTouch. applicable to onwards

Financial Accounting. Course Syllabus

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Management Information Systems 260 Web Programming Fall 2006 (CRN: 42459)

TECHNIQUES FOR BUILDING A SUCCESSFUL WEB ENABLED APPLICATION USING SAS/INTRNET SOFTWARE

Programming Tricks For Reducing Storage And Work Space Curtis A. Smith, Defense Contract Audit Agency, La Mirada, CA.

Excel Database Management Microsoft Excel 2003

DataPA OpenAnalytics End User Training

ROYAL REHAB COLLEGE AND THE ENTOURAGE EDUCATION GROUP. UPDATED SCHEDULE OF VET UNITS OF STUDY AND VET TUITION FEES Course Aug 1/2015

SonicWALL GMS Custom Reports

Let SAS Modify Your Excel File Nelson Lee, Genentech, South San Francisco, CA

Management Reporter Integration Guide for Microsoft Dynamics AX

Document Management System (DMS) Release 4.5 User Guide

This SAS Program Says to Google, "What's Up Doc?" Scott Davis, COMSYS, Portage, MI

Business Process Mining for Internal Fraud Risk Reduction: Results of a Case Study

Post Processing Macro in Clinical Data Reporting Niraj J. Pandya

Excel for Data Cleaning and Management

Return of the Codes: SAS, Windows, and Your s Mark Tabladillo, Ph.D., MarkTab Consulting, Atlanta, GA Associate Faculty, University of Phoenix

Tips, Tricks, and Techniques from the Experts

Designing Adhoc Reports

Release Notes DAISY 4.0

Active Directory Compatibility with ExtremeZ-IP. A Technical Best Practices Whitepaper

Introduction to Business Course Syllabus. Dr. Michelle Choate Office # C221 Phone: Mobile Office:

All of your product data. Just one click away. More than just a pleasant dream. Intelligent Master Data Management.

Chapter 23 File Management (FM)

Active Directory Comapatibility with ExtremeZ-IP A Technical Best Practices Whitepaper

How to Create a Custom TracDat Report With the Ad Hoc Reporting Tool

An Oracle White Paper November Leveraging Massively Parallel Processing in an Oracle Environment for Big Data Analytics

Generalized Web Based Data Analysis Tool for Policy Agendas Data

Scheduling Guide Revised August 30, 2010

PROC LOGISTIC: Traps for the unwary Peter L. Flom, Independent statistical consultant, New York, NY

Sharperlight 3.0 Sample Dashboard

Transcription:

LEADS AND LAGS: HANDLING QUEUES IN THE SAS DATA STEP Mark Keintz, Wharton Research Data Services, University of Pennsylvania ASTRACT From stock price histories to hospital stay records, analysis of time series data often requires use of lagged (and occasionally lead) values of one or more analysis variable For the SAS user, the central operational task is typically getting lagged (lead) values for each time point in the data set While SAS has long provided a LAG function, it has no analogous lead function an especially significant problem in the case of large data series This paper will () review the lag function, in particular the powerful, but non-intuitive implications of its queueoriented basis and () demonstrate efficient ways to generate leads with the same flexibility as the lag function, but without the common and expensive recourse to data re-sorting INTRODUCTION SAS has a LAG function (and a related DIF function) intended to provide data values (differences) from preceding records in a data set If may be something as simplistic as last month s price in a monthly stock price data set (a regular time series), or as variable as the most recent sales volume of a product which sold only occasionally (irregular time series) The presentation will show the benefit of the queue-management orientation of the LAG function in addressing time series that are simple, grouped, or irregular In the absence of a lead function, this presentation will show simple SAS scripts to produce lead values under similar conditions THE LAG FUNCTION RETRIEVING HISTORY The term lag function suggests retrieval of data via looking back by some user-specified number of periods or observations, and that is what most simple applications of the LAG function seem to do For instance, consider the task of producing -month and -month price returns for this monthly stock price file (created from the sashelpstocks data see appendix for creation of data set SAMPLE): Table Five Rows From SAMPLE (re-sorted from sashelpstocks) Obs DATE STOCK CLS 5 0AUG86 0SEP86 0OCT86 0NOV86 0DEC86 $875 $50 $6 $7 $000 This program below uses the LAG and LAG ( record lag) functions to compare the current closing price (CLS) to its immediate and it third prior predecessors, with the goal of generating one-month and three-month returns (RTN and RTN):

Example : Simple Creation of Lagged Values data simple_lags; set sample; CLS=lag(CLS); CLS=lag(CLS); if CLS ^= then RETN = CLS/CLS - ; if CLS ^= then RETN = CLS/CLS - ; Example yields the following data in the first 5 rows: Table Example Results from using LAG and LAG Obs DATE STOCK CLS CLS CLS RETN RETN 5 0AUG86 0SEP86 0OCT86 0NOV86 0DEC86 $875 $50 $6 $7 $000 875 50 6 7 875 50-00 -008 008-0056 -008-008 LAGS ARE QUEUES NOT LOOK ACKS At this point LAG functions have all the appearance of simply looking back by one (or ) observations, with the additional feature of imputing missing values when looking back beyond the beginning of the data set ut actually the lag function instructs SAS to construct a fifo (first-in/first-out) queue with () as many entries as the specified length of the lag, and () the queue elements initialized to missing values Every time the lag function is executed, the oldest entry is retrieved (and removed) from the queue and a new entry (ie the current value) is added The significance of the queue-based construction (vs simple look backs ) becomes evident when the LAG function is executed conditionally, as in the treatment of Y groups below As an illustration, consider observations through 7 generated by Example program, and presented in Table This shows the last two cases for and the first four for For the first observation (obs ), the lagged value of the closing stock price (CLS=80) is taken from the series Of course, it should be a missing value, as should all the shaded cells Table The Problem of Y Groups for Lags Obs DATE STOCK CLS CLS CLS RETN RETN 0NOV05 $8890 888 806 0086 00 0DEC05 $80 8890 80-0075 005 0AUG86 $00 80 888-070 -079

Table The Problem of Y Groups for Lags Obs DATE STOCK CLS CLS CLS RETN RETN 5 0SEP86 $950 00 8890-05 -078 6 0OCT86 $05 950 80 008-075 7 0NOV86 $00 05 00 06 0000 The natural way to address this problem is to use a Y statement in SAS and (for the case of the single month return) avoid executing lags when the observation in hand is the first for a given stock Example is such a program (dealing with CLS only for illustration purposes), and its results are in Table Example : A naïve implementation of lags for Y groups data naive_lags; set sample; by stock; if firststock=0 then CLS=lag(CLS); else CLS=; if CLS ^= Then RETN= CLS/CLS - ; format ret: 6; Table Example Results of "if firststock then CLS=lag(CLS)" Obs STOCK DATE CLS CLS RETN 5 6 7 0NOV05 0DEC05 0AUG86 0SEP86 0OCT86 0NOV86 $8890 $80 $00 $950 $05 $00 888 8890 80 950 05 0086-0075 -076 008 06 This fixes the first record, setting both CLS and RETN to missing values ut look at the second record (Obs 5) CLS has a value of 80, taken not from the first record, but rather from the last record, generating an erroneous value for RETN as well In other words, CLS did not come from the prior record, but rather it came from the queue, whose contents were most recently updated prior to the first record LAGS FOR Y GROUPS The usual fix is to unconditionally execute a lag, and then reset the result when necessary Example shows just such a solution (described by Howard Schrier see References) It uses the IFN function instead of an IF statement - because IFN executes the lag function embedded in its second argument regardless of the status of

firststock (the condition being tested) Of course, IFN returns the lagged value only when the tested condition is true Accommodating Y groups for lags longer than one period simply requires comparing lagged values of the Y-variable to the current values ( lag(stock)=stock ), as in the if CLS statement below Example : A robust implementation of lags for Y groups data bygroup_lags; set sample; by stock; CLS = ifn(firststock=0,lag(cls),); CLS = ifn(lag(stock)=stock,lag(cls),) ; if CLS ^= then RETN = CLS/CLS - ; if CLS ^= then RETN = CLS/CLS - ; format ret: 6; The data set YGROUP_LAGS now has missing values for the appropriate records at the start of the monthly records Table 5 Result of Robust Lag Implementation for Y Groups Obs STOCK DATE CLS CLS CLS RETN RETN 5 6 7 0NOV05 0DEC05 0AUG86 0SEP86 0OCT86 0NOV86 $8890 $80 $00 $950 $05 $00 888 8890 00 950 05 806 80 00 0086-0075 -05 008 06 00 005 0000 MULTIPLE LAGS MEANS MULTIPLE QUEUES A WAY TO MANAGE IRREGULAR SERIES While Y-groups benefit from avoiding conditional execution of lags, there are times when conditional lags are the best solution In the data set SALES below (see Appendix for generation of SALES) are monthly sales by product ut a given product month combination is only present when sales are reported As a result each month has a varying number of records, depending on which product sales were reported Table 6 shows 5 records for month, but only records for month Unlike the data in Sample, this time series is not regular, so comparing (say) sales of product in January (observation ) to February (7) would imply a LAG5, while comparing March (0) to February would need a LAG

Table 6 Data Set SALES (first obs) OS MONTH PROD SALES 5 6 7 8 9 0 A C D X A D X C D X 0 7 0 6 8 7 6 The solution to this task is to use conditional lags, with one queue for each product The program to do so is surprisingly simple: Example : LAGS for Irregular Time Series data irregular_lags; set sales; select (product); when ('A') change_rate=(sales-lag(sales))/(month-lag(month)); when ('') change_rate=(sales-lag(sales))/(month-lag(month)); when ('C') change_rate=(sales-lag(sales))/(month-lag(month)); when ('D') change_rate=(sales-lag(sales))/(month-lag(month)); otherwise; Example generates four pair of lag queues, one for each of the products A through D ecause a given queue is updated only when the specified product is in hand (the when clauses), the output of the lag function must come from an observation having the same PRODUCT as the current observation, no matter how far back it may be in the data set Note that, for most functions, a clean program would collapse the four when conditions into one, such as when( A,, C, D ) change_rate= function(sales)/function(month); ut this temptation of apparent program simplicity would misuse the queue-management features of LAG (and DIF) A more compact expression would have used DIF instead of LAG, as in change_rate=dif(sales)/dif(month) 5

LEADS HOW TO LOOK AHEAD SAS does not offer a lead function As a result many SAS programmers sort a data set in descending order and then apply lag functions to create lead values Often the data are sorted a second time, back to original order, before any analysis is done This is a costly technique for large data sets There is a much simpler way, through the use of extra SET statements in combination with the FIRSTOS parameter and other elements of the SET statement The following program generates both a one-month and threemonth lead of the data from Sample Example 5: Simple generation of one-record and three-record leads data simple_leads; set sample; if eof=0 then set sample (firstobs= keep=cls rename=(cls=lead)) end=eof; else lead=; if eof=0 then set sample (firstobs= keep=cls rename=(cls=lead)) end=eof; else lead=; The use of multiple SET statements to generate leads makes use of two features of the SET statement and two data set name parameters: SET Feature : Multiple SET statements reading the same data set do not read from a single series of records (unlike a collection of INPUT statements) Instead they produce separate streams of data (much like multiple LAG functions generate multiple queues) It is as if the three SET statements above were reading from three different data sets In fact the log from Example 5 displays these notes reporting three incoming streams of data: NOTE: There were 699 observations read from the data set WORKSAMPLE NOTE: There were 698 observations read from the data set WORKSAMPLE NOTE: There were 696 observations read from the data set WORKSAMPLE Data Set Name Parameter (FIRSTOS): Using the FIRSTOS= parameter provides a way to look ahead in a data set For instance the second SET has FIRSTOS= (the third has FIRSTOS= ), so that it starts reading from record () while the first SET statement is reading from record This provides a way to synchronize leads with any given current record Data Set Name Parameter (RENAME, and KEEP): All three SET statements read in the same variable (CLS), yet a single variable can t have more than one value at a time In this case the original value would be overwritten by the subsequent SETs To avoid this problem CLS is renamed (to LEAD and LEAD) in the additional SET statements, resulting in three variables: CLS (from the current record, LEAD (one period lead) and LEAD (three period lead) To avoid overwriting any other variables, only variables to be renamed should be in the KEEP= parameter SET Feature (EOF=): Using the IF EOFx=0 then set in tandem with the END=EOFx parameter avoids a premature end of the data step Ordinarily a data step with three SET statements stops when any one of the SETs attempts to go beyond the end of input In this case, the third SET ( FIRSTOS= ) would stop the DATA step even though the first would have unread records remaining The way to work around this is to prevent unwanted attempts at reading beyond the end of data The end= parameter generates a 6

dummy variable indicating whether the record in hand is the last incoming record The program can test its value and stop reading each data input stream when it is exhausted That s why the log notes above report differing numbers of observations read The last records in the resulting data set are below, with lead values as expected: Table 7: Simple Leads Obs STOCK DATE CLS LEAD LEAD 696 697 698 699 Microsoft Microsoft Microsoft Microsoft 0SEP05 0OCT05 0NOV05 0DEC05 $57 $570 $768 $65 $570 $768 $65 $65 > LEADS FOR Y GROUPS Just as in the case of lags, generating lead in the presence of Y group requires a little extra code A singleperiod lead is relatively easy if the current record is the last in a Y group, reset the lead to missing That test is shown after the second SET statement below ut in the case of leads beyond one period, a little extra is needed namely a test comparing the current value of the Y-variable (STOCK) to its value in the lead period That s done in the code below for the three-period lead by reading in (and renaming) the STOCK variable in the third SET statement, comparing it to the current STOCK value, and resetting LEAD to missing when needed Example 6: Generating Leads for y Groups data bygroup_leads; set sample; by stock; if eof=0 then set sample (firstobs= keep=cls rename=(cls=lead)) end=eof; if laststock then lead=; if eof=0 then set sample (firstobs= keep=stock CLS rename=(stock=stock CLS=LEAD)) end=eof; if stock ^= stock the lead=; drop stock; The result for the last four observations and the first three observations are below, with LEAD set to missing for the final, and LEAD for the last observations Table 8 Leads With y Groups 7

Obs STOCK DATE CLS LEAD LEAD 0 5 0SEP05 0OCT05 0NOV05 0DEC05 0AUG86 0SEP86 $80 $888 $8890 $80 $00 $950 $888 $8890 $80 $950 $05 $80 $00 $00 GENERATING IRREGULAR LEAD QUEUES ECAUSE WHERE PRECEDES FIRSTOS Generating lags for the SALES data above required the utilization of multiple queues a LAG function for each product, resolving the problem of varying distances between successive records for a given product Developing leads for such irregular time series requires the same approach one queue for each product Similar to the use of several LAG functions to manage separate queues, generating leads for irregular times series requires a collection SET statements each filtered by an appropriate WHERE condition The relatively simple program below demonstrates: Example 7: Generating Leads for Irregular Time Series data irregular_leads; set sales; select (product); when ('A') if eofa=0 then set sales (where=(product='a') firstobs= keep=product sales rename=(sales=lead)) end=eofa; when ('') if eofb=0 then set sales (where=(product='') firstobs= keep=product sales rename=(sales=lead)) end=eofb; when ('C') if eofc=0 then set sales (where=(product='c') firstobs= keep=product sales rename=(sales=lead)) end=eofc; when ('D') if eofd=0 then set sales (where=(product='d') firstobs= keep=product sales rename=(sales=lead)) end=eofd; otherwise lead=; The logic of Example 7 is straightforward If the current product is A [WHEN( A )] and the stream of PRODUCT A records is not exhausted (if eofa=0), then read the next product A record, renaming its SALES variable to LEAD The technique for reading only product A records is to use the WHERE= data set name parameter Most important to this technique is the fact that the WHERE parameter is honored prior to the FIRSTOS parameter So FIRSTOS= means the second product A record The first 8 product A and records are as follows, with the LEAD value always the same as the SALES values for the next identical product 8

Table 9 Lead produced by Independent Queues Obs MONTH PRODUCT SALES LEAD 6 7 0 5 9 5 A A A A 0 6 5 5 6 5 5 6 0 Of course this approach could become burdensome if the intent is to produce not only a period lead for each product, but also a period and period lead Although this might at first suggest issue SET statements ( products and differing lead levels), one can still work with one SET per product A complete program to address that issue is in Appendix An abridged version follows data irregular_leads (drop=i); set sales; select (product); when ('A') do i= to ifn(lag(product) =' ',,); if eofa=0 then set sales (where=(product='a') firstobs= keep=product sales rename=(sales=lead)) end=eofa; else lead= ; lead=lag(lead); lead=lag(lead); when ('') when ('C') when ('D') otherwise lead= ; The approach above gets the -period lead (LEAD), and then simply applies lag functions to it to produce leads for fewer periods To do this of course, the program must initially read, for each PRODUCT, observations,, and to a establish a complete complement of leads for observation That is, a SET statement must be executed times initially, and subsequently once for each reference record This is done with the do i= to ifn(lag(product) =' ',,); which merely utilizes the fact the lag(product) queue is initialized to a missing value Consequently the ifn condition is true (and observations are read) only for the first instance of each SET statement 9

CONCLUSIONS While at first glance the queue management character of the LAG function may seem counterintuitive, this property offers robust techniques to deal with a variety of situations, including Y groups and irregularly spaced time series The technique for accommodating those structures is relatively simple In addition, the use of multiple SET statements produces the equivalent capability in generating leads, all without the need for extra sorting of the data set REFERENCES: Matlapudi, Anjan, and J Daniel Knapp (00) Please Don t Lag ehind LAG In the Proceedings of the North East SAS Users Group Schreier, Howard (undated) Conditional Lags Don t Have to be Treacherous URL as or 7//0: http://wwwhowlescom/saspapers/ccpdf ACKNOWLEDGMENTS SAS and all other SAS Institute Inc product or service names are registered trademarks or trademarks of SAS Institute Inc in the USA and other countries indicates USA registration CONTACT INFORMATION This is a work in progress Your comments and questions are valued and encouraged Please contact the author at: Author Name: Mark Keintz Company: Wharton Research Data Services Address: 05 St Leonard s Court 89 Chestnut St Philadelphia, PA 90 Work Phone: 5898-60 Fax: 557607 Email: mkeintz@whartonupenedu 0

APPENDIX: CREATION OF SAMPLE DATA SETS FROM THE SASHELP LIRARY /*Sample : Monthly Stock Data for,, Microsoft for Aug 986 - Dec 005 */ proc sort data=sashelpstocks out=sample; by stock date; /*Sample : Irregular Series: Monthly Sales by Product, */ data SALES; do MONTH= to ; do PRODUCT='A','','C','D','X'; if ranuni(098098)< 09 then do; SALES =ceil(0*ranuni(0598067)); output;

APPENDIX : EXAMPLE PROGRAM OF MULTI-PERIOD LEADS FOR IRREGULAR TIME SERIES /*To get leads for each product of interest ( A,, C, D ) for three different periods (,, and ), you might issue SET statements, as per the section on leads for irregular series However, only SET statements are needed when combined with lag functions, as below */ data irregular_leads (label=',, & period leads for products A,, C, and D'); set sales; select (product); when ('A') do i= to ifn(lag(product) =' ',,); if eofa=0 then set sales (where=(product='a') firstobs= keep=product sales rename=(sales=lead)) end=eofa; else lead= ; lead=lag(lead); lead=lag(lead); when ( ) do i= to ifn(lag(product) =' ',,); if eofb=0 then set sales (where=(product= ) firstobs= keep=product sales rename=(sales=lead)) end=eofb; else lead= ; lead=lag(lead); lead=lag(lead); when ( C ) do i= to ifn(lag(product) =' ',,); if eofc=0 then set sales (where=(product= C ) firstobs= keep=product sales rename=(sales=lead)) end=eofc; else lead= ; lead=lag(lead); lead=lag(lead); when ('D') do i= to ifn(lag(product) =' ',,); if eofd=0 then set sales (where=(product= D ) firstobs= keep=product sales rename=(sales=lead)) end=eofd; else lead= ; lead=lag(lead); lead=lag(lead); otherwise lead= ;