Unix tips for Clinical SAS Programmers

[last updated - 30 September 2007]

Introduction

Note the title of this page. These are Unix tips especially for Clinical SAS programmers. If you are one, like me, and all you want is some useful shell scripts to use when working on clinical trials, then you need look no further than this page. I'll also tell you when you might need to use them. Sometimes there will be a shell script that you might like to put in your shell script library. Sometimes it will just be a command, or a few combined commands, that does not warrant writing a shell script. Where there is a shell script, I will briefly describe what it is doing in an understandable way. At some time in the future, I will add a link at the bottom of this page where you can find out more about writing shell scripts. But you will learn a lot from studying the examples here, and I suggest you go through all of them before getting into the more complicated stuff. You are better off learning by example than making an academic exercise out of it. More than anything else, it is the way of thinking you need to learn: getting a feel for what should be possible and how simple commands, combined, make it possible.

Cygwin

Cygwin is free software that gives you a Linux-like environment on a Windows PC (Linux is the same as Unix as far as we SAS programmers are concerned). I highly recommend downloading Cygwin and installing it on your personal PC so that you can play around with Unix and learn it. Learning Unix should be a very practical activity or you will never be good at it. You can run SAS from Cygwin as well, so long as you have a version of SAS on your PC. There are a number of steps to getting Cygwin set up correctly and calling SAS, so it has its own page here.

"Unix in a Nutshell" book and "man"

If you work on a Unix platform and want to "do" anything useful, you need a good book on Unix to hand. I highly recommend the O'Reilly book "Unix in a Nutshell" as a concise source of Unix information. The book cannot cover everything, so you need extra help as well, and you will find you have a facility called "man" at your Unix installation (mine is "info" instead on Cygwin). Suppose you want to find out more about the "tr" command (mentioned in the first tip). At the prompt, type in "man tr" and you will see more. In the following tips and scripts, I am going to assume you know a little something about commonly-used Unix utilities. If you don't, a book would help, but "man" might be enough.

Learning Documents

The following learning documents are fairly complete. They are copies of the documents in the Spectre (Clinical) e-book. The links will all work if you use the e-book, but not always on the pages below. I recommend that you download and install the Spectre (Clinical) e-book and study them there. You will find much more there than on this page.

The first document, "Common Unix commands", needs to be carefully read and understood, as it lays the foundations for all the other material. It has taken me four years to get it into its current form, so it is carefully thought out and thorough. Keep re-reading that document until you understand it and can apply everything in it, because it will be a waste of time going further if you do not. It does not cover all Unix commands, but it covers enough for you as an end-user, especially in the field of clinical reporting on a Unix platform. Once you know everything in that single document, you can consider yourself to be "good" with Unix as an end-user. It might be the only document you will ever have to read, depending on how far you want to take things.

Common Unix commands
Writing bash shell scripts
Writing sas shell scripts
Writing gawk programs

Shell scripting tips and scripts

I have ordered the following tips so that they tend to build upon each other. For that reason I recommend you go through them sequentially. If I have marked any as IMPORTANT then you must not skip them or you might not be able to understand the following ones.

piping - Tutorial on "piping" and "redirection" (IMPORTANT)
getname - Script to match a userid to a person's name
whosgot - Script to find out who's got a lock on SAS datasets or other files
fsv - fsview a SAS dataset
listempty - Script to list out all "empty" files
scanlogs - Script to scan all the lines in a SAS log looking for important messages
rescue - Script to "rescue" sas programs from their logs
antigrep - Script to tell you what files DO NOT contain a character string
pages - Script to select a specified page range from a list of files
pagexofy - Script to add "Page x of Y" labels in output tables and listings
delsome - Tutorial/Script on optionally deleting a list of files (IMPORTANT)
delall - Script to delete a list of files
dirtidy - Tutorial/Script on spotting widowed .log and .lst files in a directory (IMPORTANT)
killjobs - Script to optionally kill Unix processes that you own
sed & awk - Tutorial on sed, awk and the do-it-now quotes (IMPORTANT)
ddiff - Script to compare an old set of outputs with new ones
hdr - Script to create a standard SAS program header with details filled in
xargs - Tutorial on passing arguments to Unix utilities
basename - Tutorial on when and when not to use basename
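
As a rough illustration of the kind of thing the listempty and antigrep tips do, here is a minimal sketch. It is not the script behind either link: the function names listempty_sketch and antigrep_sketch are hypothetical, and it assumes that "empty" simply means a file of zero bytes, which may be narrower than what the real listempty script checks for.

```shell
#!/bin/sh
# Hypothetical sketches, not the scripts linked above.

# List the regular files among the arguments that are zero bytes long.
# Assumption: "empty" means zero bytes.
listempty_sketch() {
    for f in "$@"; do
        # -f: is a regular file; ! -s: size is not greater than zero
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# List the files that do NOT contain the given string.
# Usage: antigrep_sketch string file...
antigrep_sketch() {
    string=$1
    shift
    for f in "$@"; do
        # Skip anything that is not a regular file
        [ -f "$f" ] || continue
        # -q: no output, just an exit status; -F: match the string literally
        grep -qF -- "$string" "$f" || printf '%s\n' "$f"
    done
}
```

For example, `antigrep_sketch ERROR *.log` would list the logs in the current directory that contain no "ERROR" string at all, and `listempty_sketch *.lst` would list any zero-byte .lst files.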

SAS/Unix scripts

These are scripts that invoke SAS. They have been developed under Cygwin running on a PC and then converted to run under Unix. As a consequence, there could be obvious errors in them that will cause them to fail under Unix. Please help the author by emailing him if you have run any of these on a full-blown Unix platform and found errors that stop them from running.

sasunixskeleton - the utility that writes utilities that call SAS
contents - List the contents in short form of one or more datasets or a whole library
contentsl - Long form of contents
clash - Identify where differences exist in identically-named variables between datasets in a library
printalln - Print all occurrences of a numeric variable meeting the specified condition
printallc - Same as printalln but for character variables
allmiss - Notify for all-missing variables in a dataset, list of datasets or a whole library
misscnt - Notify number of missing variable observations in a dataset, list of datasets or a whole library
titleprogs - YOUR CALL. You need to write this yourself because I don't know your setup (IMPORTANT)
intitlesnoprogs - YOUR CALL AGAIN. Searches program directories assuming you wrote titleprogs but will need more work
intitles - YOUR CALL AGAIN. List all programs in your current directory that match with those in your titles dataset
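
To give a feel for what "a script that invokes SAS" means, here is a minimal sketch of a wrapper that runs one SAS program in batch. It is not sasunixskeleton or any of the scripts above. The function name run_sas_sketch and the SAS_CMD variable are hypothetical, and it assumes your SAS command is called sas and accepts the -sysin, -log and -print options. Check how SAS is launched at your site before relying on it.

```shell
#!/bin/sh
# Hypothetical sketch, not one of the scripts listed above.

# Assumption: SAS is started with a command called "sas".
# Set SAS_CMD beforehand if your site uses a different command.
SAS_CMD=${SAS_CMD:-sas}

# Run one SAS program in batch, putting the log and the listing
# next to the program under the same base name.
# Usage: run_sas_sketch myprog.sas
run_sas_sketch() {
    prog=$1
    # Strip a trailing ".sas" to get the base name
    base=${prog%.sas}
    "$SAS_CMD" -sysin "$prog" -log "$base.log" -print "$base.lst"
}
```

Called as `run_sas_sketch demog.sas` (a made-up program name), this would ask SAS to write demog.log and demog.lst alongside the program.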


Other Unix tips

File permissions (umask and chmod) (IMPORTANT)
Using tr to remove Windows carriage-returns (IMPORTANT)
Symbolic links
Makefiles vs. Scripts for running SAS program suites
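
The tr tip in the list above comes down to deleting the carriage-return character that Windows puts before each line feed. A minimal sketch, using a hypothetical function name strip_cr, might look like this. It is an illustration only and not necessarily how the linked page does it.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of a file with carriage returns removed.
# Usage: strip_cr infile outfile
strip_cr() {
    # tr reads standard input and writes standard output.
    # -d deletes every occurrence of the listed character; '\r' is carriage return.
    tr -d '\r' < "$1" > "$2"
}
```

Always write to a different file name from the one you are reading, because redirecting the output onto the input file would empty it before tr has read it. Note also that this removes every carriage return in the file, not only those at the ends of lines.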
 

Go back to the home page.

E-mail the macro and web site author.



