Production Scripts

(Author: Roland Rashleigh-Berry Date: 04 Apr 2011)

Introduction

The Spectre "production" scripts are the scripts involved in the normal process of running a program suite, from producing output through to creating the PDFs. These are the most complicated scripts in Spectre, and you can view them from this section. If you are thinking of learning how to write shell scripts, do not be put off by the complexity of the examples you will find here. The scripts on the "Utility scripts" page are more typical.

To help you understand the purposes of the scripts, they are grouped into sections. That way you will get an idea of why they were written and, hopefully, of when a script might be useful in a new setting.

(When you use your browser to view the scripts, you will probably see a first line like this: "#!/bin/bash #". If you view the same script as "Source", from the "View" pull-down menu, you will see two lines instead. The first is "#!/bin/bash" and the second is "#<pre><b>". I have inserted this second line so that your browser displays the script correctly from that point on. The "<pre>" tells the browser that what follows is "pre-formatted" and the "<b>" tells it to display it in bold. This second line does not affect the operation of the script because any line (except the first) that starts with a "#" is treated as a comment.)

Titles dataset scripts

Some scripts write titles to a "titles" sas dataset and other scripts read them back from it. A "protocol" dataset gets written at the same time as the "titles" dataset, so it is covered in this section as well. Any script that writes or reads these datasets uses sas to do some of the work. Because the titles dataset is placed in a directory chosen by the Spectre administrator, such a script will most likely call a site-specific macro to do the library allocations. I don't know what your site will call this allocation macro, so I have used the entry "%alloc" in some of these scripts to stand for it. "%alloc" is not a real macro. If you have Spectre running at your site then this entry will have been changed or removed.

Before you start programming on a study, the Spectre administrator uses a script to put a template file in the programs directory that they will copy and amend to become "protocol.txt". A link to it is below.
crprottmpl

There is another script that will put the template for titles members in your program directory and name it "titles.template". If you work in batch then you can use the script "titles" to automatically create your titles member, but the "titles" script calls the script below in any case.
crtitlestmpl

Once you have finished creating the titles for your program, your titles member must be read in to create the "titles" dataset. Actually, not just your titles member gets read in: all titles members are read in, along with the protocol information, and a "protocol" dataset gets created as well. This is done by the following script. Note that its header lists the scripts it calls as well as the sas macros.
crtitlesds

The "crtitlesds" script calls the scripts "vtitles", "alltitles", "checktitles" and "intnop". These are below.
vtitles
alltitles
checktitles
intnop

The "intnop" script (short for "in titles no programs") calls the script "nofile" and also the script "intitlesds", which finds out what programs are in the titles dataset. There is another script similar to "intitlesds", named "intitlabels", which lists all the programs and their report labels. These are below.
nofile
intitlesds
intitlabels

sas batch script

When you use Spectre you should run all your sas programs in batch using the "sasb" script. This script checks for errors, warnings and other important messages and, if it finds any, tells you at the end. You then have to check them. To get more details about these messages, use the "scanlogs" script.
sasb
scanlogs
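
To give an idea of the kind of check that "sasb" performs, here is a minimal sketch. It is not the real script. The function name "scan_sas_log" and the message patterns it searches for are assumptions made for illustration, and the real "sasb" and "scanlogs" scripts look for other important messages as well.

```shell
# Hypothetical sketch only: this is not the real "sasb" or "scanlogs" code.
# List the ERROR and WARNING lines of a sas log with their line numbers.
# Return codes: 0 = nothing found, 1 = messages found, 2 = log not readable.
scan_sas_log() {
    local log="$1"
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # Assumed message layout: lines that start with "ERROR:" or "WARNING:",
    # optionally with a message number such as "ERROR 22-322:".
    if grep -n -E '^(ERROR|WARNING)( [0-9]+-[0-9]+)?:' "$log"; then
        return 1    # something was found, so the log needs checking
    fi
    return 0
}
```

You could call it as "scan_sas_log myprog.log" (a made-up log name) and test the return code. A return code of 1 means that messages were listed and need checking.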

printing output

To print output created using Spectre you should use "lis2ps", which is short for ".lis to PostScript". This script converts your output file (which will have the extension ".lis" or ".lis*") to PostScript. It reads information from the titles dataset and the protocol dataset to ensure it applies the correct margins and font size. You may think you want to print your output, not convert it to PostScript, but nearly all printers are PostScript printers, so your output must be converted to PostScript before it can be printed, and "lis2ps" does this. To send the result to a printer, use the "-p" option and specify the printer name after the option. This is explained in the script header.
lis2ps

"lis2ps" calls other scripts. To see the dependencies, click here.

running all the programs

The output will certainly be created using Spectre, but the datasets the reports are produced from may come from elsewhere. If both the reporting datasets and the report output need to be created in one go, then the script "fullrunsuite" is used. Once you launch that script it is time to go for a coffee, and lunch as well. You will get an email from the script to tell you when it has finished. More likely, you will be running the reporting programs only. The scripts that run the programs are automatically generated by a script called "makerun". This generates scripts named "runsuite", "runderived" (for building the datasets) and "runreports" (for creating the output). You can run "runderived" and "runreports" separately and you will get an email from each when it has completed. To get the full list of programs that create the reporting datasets, in the correct order, the script "derorder" is used. These scripts are below.
fullrunsuite
makerun
derorder

"fullrunsuite" calls other scripts, which in turn call other scripts. To see the dependencies, click here.

pagexofy

When you produce output, it gets written to a temporary file. Before you have your final output, another process has to run to add the "Page X of Y" labels. That is done by the script "pagexofy", called from the sas macro %closerep. It first counts the pages so that it knows the "Y" part of "Page X of Y", and then it reads the temporary file again and adds the page labels. It assumes that a page marker character of "FF"x will be there. If you look at the temporary file, this "FF"x character appears as a "y" with two dots above it. But adding page labels is not all it does. If it encounters the non-breaking space character "A0"x it will replace it with a space. For column alignment purposes using "proc report", it is sometimes necessary to add such a character to stop the values in columns being shifted when you use "right" and "center" alignment. These characters must be removed at the end, and pagexofy does this. There are some more characters it will substitute: the titles system changes "&" and "%" to different characters, and pagexofy changes them back.

Remember this character "A0"x that gets turned into a space by the macro %closerep calling pagexofy. You will sometimes want to use this character as well. It is useful for forcing a variable label to be spaces. If you set a variable label to spaces then sas will use the name of the variable for the label. But if you set it to this "A0"x character then sas will accept it and so it will get changed to a space later.

The script is written in "gawk". I found it quite easy to learn gawk because, to me, it is similar to SAS/AF code, which I have written in the past. It can have a "begin" block, a main block and an "end" block, and this is similar to SAS/AF code. The functions are similar to those in sas as well. Not only is gawk a much more powerful language than sas for handling text files, it also runs much faster, so that is another reason to use it. You might find the code a little difficult to follow, but have a try. See if you can find the other characters it converts into "&" and "%".
pagexofy
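
The two-pass idea described in this section can be sketched in a few lines of awk wrapped in a shell function. This is only an illustration and not the real "pagexofy" script. The function name "add_page_labels" is made up, and the sketch assumes that the "FF"x marker appears once per page at the place where the label belongs. It leaves out the "&" and "%" substitutions.

```shell
# Hypothetical sketch only: this is not the real "pagexofy" code.
# Assumption: the "FF"x marker occurs once per page, where the label belongs.
add_page_labels() {
    local infile="$1"
    # LC_ALL=C makes awk work on plain bytes, so "\377" is the "FF"x marker
    # and "\240" is the "A0"x non-breaking space. The file is named twice so
    # that awk reads it twice.
    LC_ALL=C awk '
        # First read: count the page markers to get the "Y" value.
        NR == FNR { if (index($0, "\377") > 0) total++; next }
        # Second read: write the labels and tidy up.
        {
            i = index($0, "\377")
            if (i > 0) {
                page++
                $0 = substr($0, 1, i - 1) "Page " page " of " total substr($0, i + 1)
            }
            # Replace every "A0"x byte with an ordinary space.
            while ((j = index($0, "\240")) > 0)
                $0 = substr($0, 1, j - 1) " " substr($0, j + 1)
            print
        }
    ' "$infile" "$infile"
}
```

The labelled output is written to standard output, so the temporary file itself is left unchanged in this sketch.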

creating PDFs

PDFs are created by converting PostScript files to PDF format using system utilities. Once you have a PostScript file, the rest is easy. The utility that converts your output to PostScript is called "lptops", and this is called from the "lis2ps" script mentioned above. The script "lis2ps" and the scripts it calls comprise the bulk of the script code written for Spectre. The process is complicated and some of the scripts are complicated as well.

To create bookmarked PDFs, the bookmarks have to be inserted into the PostScript file. The bookmarks contain the titles in your output, and these titles are obtained by reading the first ten lines of your output file. They cannot be taken from the titles dataset (which would be more convenient) because only changeable titles are held there. The reporting macro might add other titles, so the only way to be sure of having all the titles is to read the first ten lines of the output file. The script that does this is named "getitles". It is written in "gawk" and has to be tailored for each company using Spectre to ensure it can find the titles for the different client layout styles. Because this script may be difficult to understand, a simple sas version of it is supplied as well, so that the maintenance programmer can gain an understanding of what the script is trying to do. These are below.
getitles
getitles_sas
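
The basic idea of reading the titles from the first ten lines of the output can be sketched as follows. This is not the real "getitles" script, which is tailored to each client layout. The function name "first_titles" is made up, and treating every non-blank line among the first ten as a title is a simplifying assumption.

```shell
# Hypothetical sketch only: this is not the real "getitles" code.
# Assumption: every non-blank line in the first ten lines is a title.
first_titles() {
    local lisfile="$1"
    head -n 10 "$lisfile" | awk '
        {
            # Strip the leading and trailing blanks used to position titles.
            gsub(/^[ \t]+|[ \t]+$/, "")
            if ($0 != "") print
        }
    '
}
```

A real version would also have to tell titles apart from anything else in those lines, such as column headers, which is why the script has to be adapted to each layout style.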

In case "lptops" makes a mistake when it converts the output to PostScript, there is a utility named "pstolp" that recreates the output from the PostScript file and compares it to the original. This is a complex script written in gawk. If it finds a mismatch between the input file and what is extracted, then the scripts "badlines" and "badchars" are called to tell you exactly where the problem occurred. If need be, you can then edit the PostScript file to fix the problem, as it will tell you on what line the problem was found. These scripts are below.
pstolp
badlines
badchars

Other scripts called by "lis2ps" are below.
getlayout
pages
filesize

When a full run of the reporting programs is done there is a file created named "donelist.txt" with a list of all the output in it. The number of pages of each output is shown at the end of each entry. Often there is too much output to put into a single PDF so "donelist.txt" has to be split into "donesect(n).txt" files where "n" is a number. The script "bigps" can be run on "donelist.txt" or individual "donesect(n).txt" files to create the collected PostScript files. This is below.
bigps
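
One way to split "donelist.txt" is by a page budget per section, and that can be sketched as below. This is an illustration only and is not part of Spectre. The function name "split_donelist", the rule of starting a new section when a page limit would be exceeded, and the file names "donesect1.txt", "donesect2.txt" and so on are all assumptions. The sketch relies only on the page count being the last item of each entry.

```shell
# Hypothetical sketch only: Spectre does not necessarily split the list this way.
# Usage: split_donelist donelist.txt maxpages [outdir]
split_donelist() {
    local donelist="$1" maxpages="$2" outdir="${3:-.}"
    awk -v max="$maxpages" -v dir="$outdir" '
        BEGIN { sect = 1; used = 0 }
        NF > 0 {
            pages = $NF + 0     # page count is the last field of the entry
            # Start a new section when this entry would overflow the current one.
            if (used > 0 && used + pages > max) {
                close(file)
                sect++
                used = 0
            }
            file = dir "/donesect" sect ".txt"
            print > file
            used += pages
        }
    ' "$donelist"
}
```

An entry that is larger than the limit on its own still gets a section to itself, so no output is ever dropped.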

Once the PostScript file has been generated, it gets converted to PDF using the script "a4ps2pdf" or "usps2pdf", depending on whether it uses the A4 layout or the US Letter layout. Your PostScript file is the input to these utilities and the output is the bookmarked PDF. By default, compression is used so that the PDF file takes up less room. You can override this using the -u option to give you an uncompressed PDF. When you ftp the PDF onto a Windows platform, you should choose binary mode for the default compressed PDF and text mode for the uncompressed PDF. The -u (uncompressed) option is provided in case the PDF behaves differently when ftp'ed to Windows, so that you can try the uncompressed version to see if it solves the problem. Somebody once noticed that, on Windows, a search went through the pages in the correct order but backwards within each page, whereas the search worked correctly using "acroread" on Linux. If you encounter this anomaly, try the uncompressed version.
a4ps2pdf
usps2pdf

Finally -- bookmarked PDFs !!

We are not done with PDFs quite yet. Once you have the PDF you need a script to extract a complete list of bookmarks. This list of bookmarks can then form the basis for a table of contents.
printpdfbookmarks

biglis

There is a script named "biglis" which works in the same way as "bigps" except that it creates a listing file containing all the output. This can then be imported into a word processor such as Word. There is an option to create a table of contents, using the "toclis" script. If you can write a clever Word macro to change the layout, then there is the option to add orientation and font size information to the list produced.
biglis
toclis

Conclusion

You have been introduced to all of the production scripts that Spectre uses. Many of these are complicated due to the complex work they do. They are not typical of shell scripts that you might write in the future, so do not be put off by the script code you can link to from this page. If you are not the Spectre administrator, then you will not use most of these scripts but the ones you will use in your daily work of running programs and printing output are "sasb", "sc" and "lis2ps" so you should remember these. You will be reminded of them on the utility scripts page.
