"stacked" column reports using data _null_

[This site is not connected with the SAS Institute]

[last updated - 08 September 2007]

Introduction

In late 2002 I shifted my focus away from "proc report" back to "data _null_" reporting in preparation for something that eventually didn't happen. The reason at the time was that the FDA were insisting that text tables had to use a 12 point font. Now they say that a 10 point font is OK for tables, which makes life easier. With a 12 point font I was seeing shell tables that typically stack age/race/sex/weight underneath each other. If you do that with "proc report", and there is more than one data line per subject, then the second data line will follow in the line after the line age/race/sex/weight has flowed down to. This leaves a gap of three lines (with this example) between the first and second data line for a subject. This gap looks bad on a report but the shell tables did not have a gap between the data lines and so it could not be coded with "proc report" in any normal way. But everything is possible with "data _null_" so I worked on how best to overcome this limitation in "proc repüort". What you are seeing on this page is a reslt of the work I put into getting round this layout problem. It is mainly historical now.

What "proc report" won't do for you

You could define every field in proc report as "display", then so long as you had arranged your input dataset to give you what you want then it would display it for you. But if you expect proc report to do what it normally does but slightly differently, then you are going to be disappointed. You might want proc report to flow a string of values into a column for you. That's no problem. You set the option to "flow" in your column definition, you define the split character as one of the procedure options and you have these split characters in all the right places in your input field. Proc report will flow it for you. But when it flows a column it makes the whole row "fatter". It will not flow into space that other data lines don't use. So the following data line will occur after where the column has flowed to. If you are flowing four items like age/race/sex/weight then that will leave a gap of three lines before the next data item is displayed. It's no good hoping you can set some option so that it will intelligently use space and avoid leaving gaps. It won't. It never will. And because you have a four-high stacked identity variable then proc report wouldn't start that on the last line on a page, would it? That would look stupid after all. Would it do such a thing? Yes, it will. Proc report cares nothing about the problems the FDA have visited upon you. It will just carry on working in the same old way. So if you want your reports to look "just so" you have a choice between doing massive manipulation of your dataset going into proc report and just using proc report as a proc print to display the values - or - you go back to using data _null_ reports. And what might have taken you five minutes to do on a wide page with proc report now takes you maybe five hours to code. Maybe much, much longer because your data _null_ reporting skills have gone rusty. And there are a lot of young programmers in the industry. Maybe you have never produced a data _null_ report in your life.

The worst reporting scenario

I am going to show you how to code a data _null_ report covering the worst scenario I can think of. It will have age/race/sex/weight stacked in a single column. And since this is an identifying "order" variable then the whole column will have to be repeated on each new page. And if we are near the foot of the page then it would be silly to start the group so it will be put on the following page instead. And on the right of age/race/sex/weight we will have a long parameter description grouping one of more values. And this long parameter will sometimes not fit into the column and so it will have to be "flowed" manually into the line below it. It will be the worst reporting scenario you are ever likely to be faced with. But seeing the code and having it as reference should allow you to tackle any report of any degree of complexity in the future. What I won't cover in this section is how you get data _null_ reports to interface with #byval and #byvar entries in the title lines. I have already written this up fully on this page. I want to keep the code free of that so you have more chance of following it. But if you have never come across a data _null_ report then maybe that page could act as a gentle introduction to the subject.

The code

This is my latest attempt at the code to do a stacked-column report. Horrible, isn't it?
 
options ls=90 ps=40;
title1 'First title';
title3 'Third title';
title5 'Fifth title';
footnote1 'First footnote';
footnote3 'Third footnote';
 

         /*--------------------------------------------------*
                        Generate the dummy data
          *--------------------------------------------------*/

data test;
  length sex $ 6 param $ 80 race $ 10;
  subj=1001;invid=10001;age=21;race='Asian';sex='Male';weight=60;
  param='AA This is a very long parameter and you will have to flow it';
  value=11;output;
  param='BB This is a short parameter';
  value=21;output;
  value=22;output;
  param='CC This is a very long parameter and you will have to flow it';
  do value=30 to 38;
    output;
  end;

  subj=2001;invid=20001;age=32;race='White';sex='Female';weight=55;
  param='AA This is a very long parameter and you will have to flow it';
  value=51;output;
  param='BB This is a short parameter';
  value=61;output;
  value=62;output;
  param='CC This is a very long parameter and you will have to flow it';
  do value=70 to 78;
    output;
  end;

  subj=3001;invid=30001;age=42;race='Black';sex='Female';weight=65;
  param='AA This is a very long parameter and you will have to flow it';
  value=51;output;
  param='BB This is a short parameter';
  value=61;output;
run;

         /*--------------------------------------------------*
                          Produce the report
          *--------------------------------------------------*/

data _null_;
  length tempstr $ 200;
  retain ls 0 startcol 0 repwidth 60 count 0;
  file print titles footnotes header=header linesleft=ll;
  set test end=last;
  by subj param;
  if _n_=1 then do;
    ls=getoption('ls');
    startcol=floor((ls-repwidth)/2)+1;
  end;
  if ll<2 then put _page_;
  if first.subj then do;
    count=0;
    if ll<5 then put _page_;
  end;
  count=count+1;
  %splitvar(param,38,split='*');
  link flow;
  if first.param or count=1 then do;
    tempstr=scan(param,1,'*');
    put @startcol+18 tempstr @startcol+57 value 4.;
    i=2;
    do while(scan(param,i,'*') NE ' ');
      count=count+1;
      link flow;
      tempstr=scan(param,i,'*');
      put @startcol+18 tempstr;
      i=i+1;
    end;
  end;
  else put @startcol+57 value 4.;
  if last.param then do;
    count=count+1;
    link flow;
    if not last or count<5 then put;
  end;
  if last.subj then link lastsubj;
return;

header:
  put;
  put @startcol '           age/';
  put @startcol '          race/';
  put @startcol 'subject/   sex/';
  put @startcol 'invid.   weight   Lab parameter                          value';
  put @startcol '--------------------------------------------------------------';
  if _n_ ne 1 then count=0;
return;

flow:
  if count=1 then put @startcol subj 6. '/' @startcol+8 age 2. ' yrs/' @;
  else if count=2 then put @startcol invid 6. @startcol+8 race $char6. +(-1) '/' @;
  else if count=3 then put @startcol+8 sex $char6.  '/' @;
  else if count=4 then put @startcol+8 weight 3. ' kg' @;
return;

lastsubj:
  if count<4 then do;
    do count=(count+1) to 4;
      link flow;
      put;
    end;
    if not last then put;
  end;
return;

run;

There is a link to the output file here.

If you have looked at the above code and it all makes sense, then maybe you are ready to look at the code with full titles handling added. To do this click here.
 


 


Go back to the home page.

E-mail the macro and web site author.