(Last updated: 01 April 2004 - LOCF macro bug fixed)
I have seen code to do this processing using arrays. In one sense this is a good idea since it is the most efficient way of handling it. However, the solution using this method never struck me as being elegant. I feel the array method springs from an incorrect assumption of what can be achieved using merges.
Values to carry forward (values dataset)
date value
01jan02 1
01mar02 2
01may02 3
Dates to merge with (dates dataset)
date
01jan02
01feb02
01mar02
01apr02
01may02
01jun02
Resulting dataset (final dataset)
date value
01jan02 1
01feb02 1
01mar02 2
01apr02 2
01may02 3
01jun02 3
Well, it can't be done using only the variables shown above. But what we can do is add a sequence number to the "dates" dataset and increment it when we get a match on date with the "values" dataset like this:
data dates;
retain _seq 0;
merge dates values(in=_val);
by date;
if _val then _seq=_seq+1;
run;
And now we have a sequence number that retains its value corresponding to each observation in the values dataset. So if we simply add the same sequence number to the "values" dataset like this:
data values;
set values;
_seq=_n_;
run;
.... we can merge like this:
data final;
merge dates values;
by _seq;
drop _seq;
run;
And this gives us the type of merge we need for LOCF merging. If you need further convincing, then here is some code you can copy and paste and run under sas.
data values;
date='01jan2002'd;value=1;output;
date='01mar2002'd;value=2;output;
date='01may2002'd;value=3;output;
format date date7.;
run;
data dates;
date='01jan2002'd;output;
date='01feb2002'd;output;
date='01mar2002'd;output;
date='01apr2002'd;output;
date='01may2002'd;output;
date='01jun2002'd;output;
format date date7.;
run;
data dates;
retain _seq 0;
merge dates values(in=_val keep=date);
by date;
if _val then _seq=_seq+1;
run;
data values;
set values;
_seq=_n_;
run;
data final;
merge dates values;
by _seq;
drop _seq;
run;
options nocenter;
proc print data=final;
run;
And if you go to the trouble to run it under sas then this is what you get for a print of the "final" dataset:
Obs date value
1 01JAN02
1
2 01FEB02
1
3 01MAR02
2
4 01APR02
2
5 01MAY02
3
6 01JUN02
3
The above is what we want, I am sure you will agree. So at least, in principle, it is possible to merge values in with a time sequence using a sequence.variable we have created. The above is very simple test data, of course, but we can use the same principle in more complex cases. In reality we will have an owner variable such as a subject-id that changes. There may be subcategories like a lab parameter. We would probably use visit numbers instead of dates. But the same principle of adding a sequence number to both datasets and merging using that sequence number could still apply. And it is this principle I have used in my macro solution for this sort of processing.
(values dataset)
subj param visit value
AA RBC 2
22
AA RBC 4
44
AA WBC 3
33
BB ALB 3
55
(visits dataset)
subj param visit
AA RBC 2
AA RBC 3
AA RBC 4
AA RBC 5
AA WBC 2
AA WBC 3
AA WBC 4
AA WBC 5
BB ALB 2
BB ALB 3
BB ALB 4
BB ALB 5
Now before we decide on how to merge these datasets, take a look at the "visits" dataset. Visits 2-5 are the "on-treatment" visits right up to the planned end point which is visit 5. It is for these visits we want to carry forward our last observation. Visit 1 is not shown because it is likely a baseline value and because it was not a measure taken while on and as a result of treatment we would probably not want to carry forward these baseline values. So Visits 2-5 are the "eligible" visit values for carrying forward. Now look at the values dataset. Again these only have the "eligible" visits and their values and these will be the values we will be carrying forward where required. Note that the subj/param pairings in the "visits" dataset match those in the values dataset. It makes sense to only have subj/param pairings to match values otherwise you would have nothing to carry forward. But this is an assumption. You should understand the assumptions being made. Sometimes the assumptions will not be approppriate for a study you are working on and you might have to change the logic.
Now consider how we are going to approach the merge of the two datasets. Look at the data for subject AA. We would not want to carry forward values of RBC into the following WBC so obviously we will be merging using matching on the subj/param pairing. We will have to reset our sequence count at the start of each subj/param grouping and increment from there. And when we add the sequence number to the "values" dataset we will have to do the same so that when we merge using subj/param/_seq we will get the correct matching.
First read through the following code until you understand what it is doing. The first two data steps generate the observations as shown above. The next two data steps add the sequence number and the final data step merges the data before printing it. Then copy and paste the code, run it under sas and convince yourself that it is producing the correct output:
data values;
subj='AA';
param='RBC';
visit=2;value=22;output;
visit=4;value=44;output;
param='WBC';
visit=3;value=33;output;
subj='BB';
param='ALB';
visit=3;value=55;output;
run;
data visits;
subj='AA';
do param='RBC', 'WBC';
do visit=2 to 5;
output;
end;
end;
subj='BB';
param='ALB';
do visit=2 to 5;
output;
end;
run;
data visits;
retain _seq 0;
merge visits values(in=_val keep=subj param visit);
by subj param visit;
if first.param then _seq=0;
if _val then _seq=_seq+1;
run;
data values;
retain _seq 0;
set values;
by subj param;
if first.param then _seq=0;
_seq=_seq+1;
run;
data final;
merge values visits;
by subj param _seq;
drop _seq;
run;
options nocenter;
proc print data=final;
run;
You should regard this macro as a solution rather than
the
solution. You need to understand what it is doing and what assumptions
are being made. It may or may not match with your study requirements. You
might end up writing a new macro that fits your site standards better,
based on this macro. I can only go so far with it and keep it as generic
and as simple as possible. If you want to develop it further then feel
free. To view the latest version, click here. To
view its test pack click here testpack.
Go back to the home page.
E-mail the macro and web site author.