This document will teach you the common Unix commands, but it will do more than that. As stated, the power of Unix comes from the ability of simple utilities to pass information to each other for processing, so to harness that power as a Unix user you need to learn how to combine Unix commands so that they can pass information to each other. To that end, you will be introduced to the terms "piping" and "substitution" and to the utility "xargs", which are all ways of passing information between Unix utilities.
This document is for learning Unix commands; it is not well suited as a reference source. I wrote it, and I hardly ever use it as a reference document myself. You should read it through from start to finish the first time, because later sections build on ideas introduced in earlier sections, so you can develop your understanding of "piping" and "substitution" gradually and on a firm footing. If you use it as a reference document without having gone through it from start to finish then it will not be very useful to you, even though you might think otherwise. What I am hoping is that you have quiet times, like train journeys or flights, or even a quiet day at the office, when you can carefully work through this entire document from start to finish. But take your time. It is not a "quick read". It takes a lot of concentrated effort. You are not going to know Unix commands in less than an hour unless you are a genius. A week of evening study would be more realistic; that is how long it took me. If this were a course then I guess it would take a day and a half to complete, with trying out the examples.
Once you have been through this document carefully then you can jump to any topic you like, from the links below, to refresh your memory. If you have come back to this page after a while, updates or new sections will be marked as such after the links below.
"man pages"
cd
ls
wc
grep
fgrep and egrep
pattern matching vs. regular expressions
beyond "regular expressions"
more
mkdir
rmdir
cp
mv
rm
create a file
cat
cntl-c and cntl-d
diff
cmp
xargs
tee
awk
cut
ln -s
find
&
nice
nohup
ps (updated)
kill
special characters
chmod
umask
file
"command not found" messages
user-defined system environment variables
"set" command
alias
"sourcing" a list of commands
which
clear
exit
xterm
/usr/sbin/fuser
finger
sed
generating a command file
vi
less
conclusion
You will be introduced to a number of Unix commands. I will not be able to cover any in much detail, so you will need an extra reference source for further information. Don't worry though, you will not have to spend any money on books.
man ls
If you were to type in the command “cd /” then you would go to the top directory on your Unix machine, which is called the “root” directory. Once you are in a directory you can go “up” and “down” directories (but there is no “up” from the root directory). If you were to go to the /usr/bin/ directory by typing in “cd /usr/bin” (or “cd /usr/bin/”) and then type in “cd ..” (or “cd ../”) you would find yourself in the upper or “parent” directory. If the current directory is not automatically displayed then you can display it using the pwd command (short for “print working directory”).

Once you are in the /usr directory you can go “down” again to the /usr/bin directory by typing in “cd bin” or “cd bin/”, or by getting the bash shell to complete it for you: type “cd b”, press the TAB key to auto-complete to “cd bin/”, then press ENTER (if you are using the Korn shell then you should be able to auto-complete file names by pressing ESC twice). If there were two or more directories or files that started with a “b” then the bash shell would auto-complete as many letters as were common to the start of these files or directories and then make a bleep sound to let you know there was a problem. On the other hand, if there were no directories starting with a “b” it would also bleep.

To illustrate this, make /data your current directory and type in “cd d” followed by a TAB. It will auto-complete to /data/db and bleep to let you know there is more than one file starting with “db”. To list these multiple files or directories you have to use the ls command in a certain way. The “ls” command is described next.
Quite often, when you change directories, you will want to return to the original directory after finishing your work there. This can be done using “cd -”.
"cd" is known as a "shell built-in". It is a command but it is a "built-in command". If you type in "man cd", hoping to see the "man" pages for "cd", then instead you will see a page for all the shell "built-ins". You might want to try this out of interest.
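The directory moves described above can be tried out directly. Here is a minimal sketch, assuming /usr/bin exists on your machine (it does on virtually every Unix system):

```shell
cd /usr/bin   # absolute path: go straight there from anywhere
pwd           # prints /usr/bin
cd ..         # relative path: move up to the parent directory
pwd           # prints /usr
cd -          # return to the previous directory ("cd -" also echoes its name)
pwd           # prints /usr/bin again
```

Note that "cd -" writes the directory it is returning to on the terminal, which is a handy confirmation of where you have landed.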
The “ls” command is usually used with a file name pattern to limit the number of files listed. To list out all the programs in a directory you would normally type in the command “ls *.sas”. It would then only list the files that have a “.sas” extension.
“ls” can be used on other directories apart from the current directory. If you wanted to list all the files in the directory /usr and still stay in your home directory then type in the command “ls /usr” or “ls /usr/”.
“ls” gives you a different result from “ls *”. In the second case, if some of the files are directories, then they get expanded out – the directory name being shown with a “:” at the end, followed by the files in that directory. Try this out using the command “ls /usr/local” and then “ls /usr/local/*”. You may think that both should be the same, since in a sense in both cases you are asking it to list all files. But it is not the same, and the reason will be explained here.

Before a utility is called, there is something called a “command parser” that works out what command you want to call. If it sees a file pattern such as “*” or “*.sas” or “db*” then the command parser will expand this out into a list of all files that match that pattern. You could use “echo *” to get a list of files or “echo *.sas” to get a list of programs (“echo” does no work except write to the terminal). So when you enter the command “ls *” the command parser takes the “*” and expands it out into all files, and this list gets passed to the “ls” command. When “ls” receives a pure file name, there is nothing for it to do except list it, so you just get the file name. But in the case of a directory, “ls” expands it out into the files it contains. If you call “ls” on its own, however, the command parser has nothing to expand into a list of files, so “ls” just acts on the current directory and gives you a list of the files and directories in it. That is why “ls *” acts differently to “ls”. It is all to do with the command parser acting on file patterns.
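You can watch the command parser at work yourself. The following sketch uses a scratch directory under /tmp so nothing of yours gets touched, and shows that the expansion happens before “echo” ever runs:

```shell
mkdir /tmp/parser_demo && cd /tmp/parser_demo
touch a.sas b.sas notes.txt
echo *.sas    # the shell expands the pattern first, so this prints: a.sas b.sas
echo *        # prints all three file names: a.sas b.sas notes.txt
cd / && rm -r /tmp/parser_demo   # tidy up
```

“echo” knows nothing about files; everything it prints was put on its command line by the parser before it started.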
You can use many options with “ls”. If used with the “-l” (“long”) option
then you get out a lot more information about the files and their read/write/execute
status. Here is the start of a sample list when you use “ls” with the “-l”
options. The entered command is shown, directly followed by the start of
the list. Note that the first line gives a total. This is a total number
of blocks of 512 bytes storage (each file will use up one or a multiple
of these) used by the listed files. This total is not the number of files
but will equal the number of files if every file is 512 bytes or less in
size. You should not rely on this number for anything. Note that the date
and time is shown for when the file was last updated. The owner or user
is shown as well as the size of the file in bytes. The first column will
be explained in a later section. “ls –l” gives you very useful information
on the files. Of particular importance are the size of the file and the
date of the file for where you have copies of these files of the same name
in other directories. A difference of size in two files of the same name
in different locations will indicate that the file has been changed. A
difference of date will indicate that one file is newer than another. Note
that in the command used below the output from the "ls" command is being
passed to the "more" command by using the symbol "|". This is known as
"piping" and "|" is known as the "pipe" character. This is the
main way Unix utilities pass information to each other.
$ ls -l | more
total 2068
-rw-rw-r--   1 rrash    rpt         2243 Dec  4 13:42 1
-rwxrw-r--   1 rrash    rpt         6332 Oct 24  2001 SAS
-rwxrwxr-x   1 rrash    rpt         1259 Oct  8 14:56 addcr
-rwxrwxr-x   1 rrash    rpt           37 Jan 16 09:42 al
-rwxrwxr-x   1 rrash    rpt         2482 Dec  1 11:53 allfmts
-rw-rw-r--   1 rrash    rpt         6052 Dec 15 11:26 allfmts.log
-rwxrwxr-x   1 rrash    rpt         2732 Dec 15 11:29 allfmtsl
You can add an option to show “hidden” files. These are files that start
with a period but it also includes the current directory as just one “.”
and the parent directory as “..” as shown below.
$ ls -al | more
total 2112
drwxr-xr-x   8 rrash    rpt         3072 Jan 19 15:52 .
drwxrwxrwx+ 98 root     other       1536 Dec 19 16:49 ..
-rw-------   1 rrash    rpt          146 Jan 19 10:05 .TTauthority
-rw-------   1 rrash    rpt         7064 Jan 19 16:09 .bash_history
-rw-rw-r--   1 rrash    rpt         3334 Jan 19 14:42 .bashrc
-rw-rw-r--   1 rrash    rpt         2441 Nov 21 15:17 .gv
-rwxr--r--   1 rrash    rpt          729 Dec  8 15:48 .profile
-rw-------   1 rrash    rpt           60 Oct  9 08:53 .sh_history
-rw-rw-r--   1 rrash    rpt         2243 Dec  4 13:42 1
-rwxrw-r--   1 rrash    rpt         6332 Oct 24  2001 SAS
-rwxrwxr-x   1 rrash    rpt         1259 Oct  8 14:56 addcr
-rwxrwxr-x   1 rrash    rpt           37 Jan 16 09:42 al
-rwxrwxr-x   1 rrash    rpt         2482 Dec  1 11:53 allfmts
-rw-rw-r--   1 rrash    rpt         6052 Dec 15 11:26 allfmts.log
-rwxrwxr-x   1 rrash    rpt         2732 Dec 15 11:29 allfmtsl
Before we leave "ls" I thought I would introduce you to the "-t" option
which will order the files in descending datetime order. This is a very
useful option to find out what files have been recently updated in a directory.
Again, "piping" is your friend, since the most recent files will
be shown first so these will be the first to scroll off your screen. Suppose
you wanted to see the most recent 10 files only then you could pipe
to the "head" utility like this.
ls -lt | head -10
The "head" utility has an opposite counterpart named "tail". Consult the "man" pages if you are interested.
ls | wc -w
If you count files by piping “ls” output to “wc” then be aware that you will get a different count depending on the options you use with “ls”. The “-a” option gives you “hidden” files as well and this includes the current directory as “.” and the parent directory as “..”. The “-l” option will give you a line at the start giving you the total number of files. In this case, if you were counting lines of output and assumed this was the number of files then you would get one more due to this total line shown at the start.
Note that if some of your file names contain a space then counting the words will not give you the number of files. It is best to never have file names on Unix that contain a space. It can cause problems where you least expect them to occur, since many user-written utilities were not designed to deal with this situation.
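Here is a small sketch of that pitfall, using a scratch directory so you can see the two counts disagree:

```shell
mkdir /tmp/wc_demo && cd /tmp/wc_demo
touch one.txt two.txt 'bad name.txt'   # the third file name contains a space
ls | wc -w    # prints 4: "bad name.txt" is counted as two words
ls | wc -l    # prints 3: one line per file, so this count is correct
cd / && rm -r /tmp/wc_demo
```

Counting lines with "wc -l" survives spaces in file names; counting words with "wc -w" does not.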
Again, you can use options with this command. A very useful option is “-c” which, instead of giving you the matching lines, gives you the count of matching lines in each file. The same command as above to give just a count would be “grep -c 'der\.' *.sas”. This will also give a zero count for those files that did not contain your regular expression in any line. You will see that the file names end in a colon followed by the count.
Taking the previous example a little further, suppose you wanted to find out which programs did not contain the regular expression then you could select only those output lines that ended in “:0” by using this command “grep -c 'der\.' *.sas | grep ':0$'”. What this does is to pipe it to grep again and to select only those lines that end in “:0” (the “$” sign means “ends with”).
Taking the previous example one extra stage, suppose you only wanted
a list of the files that had a zero count as in the above but you did not
want to see the “:0” at the end. There is no option in “grep” to switch
this off but you could pipe the list to a utility called sed (stands
for “stream editor”) to replace the “:0” ending with nothing. The command
then becomes:
grep -c 'der\.' *.sas | grep ':0$' | sed 's/:0$//'
A useful option is “-v”. This selects lines that do not contain the
regular expression you supply. And another option you will use is “-i”
which tells it to “ignore” case. You can use these options together. For
example, the following command will return all lines in the file date.txt
that do not contain the word “date” in upper or lower case or any combination
of upper or lower case:
grep -iv 'date' date.txt
If you just want a list of files that contain the string or regular expression you are looking for then use the -l option ("l" for "list").
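A quick sketch of the difference between “-c” and “-l”, using two scratch files (the pattern 'der\.' matches the literal string “der.” as in the earlier examples):

```shell
mkdir /tmp/grep_demo && cd /tmp/grep_demo
printf 'x=der.y;\nrun;\n' > a.sas
printf 'run;\n'           > b.sas
grep -c 'der\.' *.sas   # prints a.sas:1 and b.sas:0 -- a count for every file
grep -l 'der\.' *.sas   # prints a.sas only -- just the names of matching files
cd / && rm -r /tmp/grep_demo
```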
fgrep -f $HOME/search.lst *.log
The important difference between “egrep” and “fgrep” is that “fgrep”
uses literal matching of strings and not regular expression matching. “egrep”,
on the other hand, uses regular expressions, but slightly extended regular
expressions such that “|” means “or” (it does not act as a "pipe" character
when used like this). Also “?” and “+” take on a special meaning. For example,
the following will pick out any line with “one” or “two” or “three” in
it:
egrep 'one|two|three' *.sas
For more information on “fgrep” and “egrep” (and its extended regular expressions), consult the "man pages".
Although the “grep” you have access to does not allow for your patterns to be in a file, there is another version of “grep” that does. This lives at /usr/xpg4/bin/grep. If you use this then your matching patterns will be treated as ordinary regular expressions and not extended ones as for egrep. You can always use it, like any other utility, if you specify its full path name.
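As a sketch of search strings held in a file, here is “fgrep -f”, which does literal matching and which any POSIX system provides (substitute /usr/xpg4/bin/grep -f if you want the patterns treated as regular expressions):

```shell
mkdir /tmp/pat_demo && cd /tmp/pat_demo
printf 'ERROR\nWARNING\n' > search.lst            # one search string per line
printf 'NOTE: all fine\nERROR: bad value\n' > run.log
fgrep -f search.lst run.log   # prints the line containing ERROR
cd / && rm -r /tmp/pat_demo
```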
Suppose you wanted to list all the log files that started with a “c” then you would use “ls c*.log”. If you wanted to select them using “grep”, following the same rule exactly, then you would have to do it with “ls | grep '^c.*\.log$' ”. So you see there is a big difference between pattern matching and regular expressions.
cp -pi ../program.sas .
Using “rm” with the “-r” option is a very powerful form of the delete
command that will delete all files and subdirectories. You must take care
that you are in the correct directory and that you really intend to delete
everything in a subdirectory and all subdirectories below that. You use
it like this:
rm -r directory
echo -n > newfile
echo 2>newfile > newfile
: > newfile
touch newfile
“>” can also be written as “1>”. The “1” is implied when you use “>”. The “1” refers to “standard output”. It is where all normal output goes. There is a “2” as well that refers to “standard error”. It is where all the error messages are (supposed) to go. You can redirect error messages to a file using “2>”. Sometimes you will want to discard some error messages. For example, suppose you had a list of Unix commands and some of these just deleted certain types of files which may or may not exist. Assuming you did not want to see any error messages for the files that you tried to delete but did not exist, then you can redirect “standard error” to the Unix trash can like this “2> /dev/null”.
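A small sketch of the two streams, assuming the file "nosuchfile" does not exist in the scratch directory:

```shell
mkdir /tmp/redir_demo && cd /tmp/redir_demo
ls nosuchfile > out.txt 2> err.txt   # the complaint goes to err.txt; out.txt stays empty
cat err.txt                          # shows the "cannot access" / "No such file" message
rm nosuchfile 2> /dev/null           # the same sort of complaint, silently discarded
cd / && rm -r /tmp/redir_demo
```

The exact wording of the error message varies between systems, but it always arrives on standard error, never on standard output.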
In some cases you might want to redirect standard output to standard
error. For example, suppose you had detected an error condition and you
wanted to write out an error message, then you could write out a message
using “echo” like this…
echo "Error detected"
…but the output would go to standard output. Standard output is where
all the correct information should be written to. Error, warning or other
diagnostic information should be written to standard error, but then somebody
might have redirected standard error using “2>”. You would need to send
it to the same place that the error messages are supposed to go to. You
can do this by redirecting standard output to the standard error location
like this:
echo "Error detected" 1>&2
Another use of “cat” is to write a file to the terminal and use the “-v” option to show up non-printable characters which would otherwise be invisible. If a file on Unix started off on a Windows platform then there could be carriage return characters at the end of some or all of the lines. “cat –v filename” would reveal these characters as a “^M” at the end of the line. If these characters exist then it is wise to delete them. How to do this will be described in the section on “special characters”.
Do not use Cntl-Z to stop a command. This only interrupts a command and leaves it as a suspended process. These suspended processes can slow the Unix system down and you will have to “kill” these processes at some stage. Some Unix systems are set up so that you can not log off if you have suspended processes so you have to find out what they are and “kill” them. Listing processes and “killing” them is covered later in this document.
Options can be used with "diff" that affect how the comparison is done. The "-b" option will treat multiple blanks as single blanks, for example. For a full list of options, consult the "man pages".
For standard output listings you will have lines that are always
different because they have the date and time in them. The “idiff”
utility is an in-house written extended version of “diff” where you can
specify a start pattern for lines you want ignored. All it does is use
sed to blank out these lines and write the two files to temporary files
and then does the comparison between the two temporary files using "diff"
with the "-b" option set. For comparing the contents of two directories
there is "ddiff" as well ("directory diff"), another
in-house written utility. You can link to these two utilities below to
read about how to use them.
idiff
ddiff
$ cmp roland.txt roland2.txt ; echo $?   # I know these files are different
roland.txt roland2.txt differ: char 1, line 1
1
$ cmp roland.txt roland3.txt ; echo $?   # These files are the same
0
$ cmp roland.txt roland4.txt ; echo $?   # The second file does not exist
cmp: cannot open roland4.txt
2
Now here is something that might worry you a little bit, depending on what job you do. I work as a clinical trials SAS programmer and it worries me a lot, so I am telling you about it here. In that job there is no way you can allow the results of a clinical trial to be stored in an altered form, but that is exactly what might be happening on very rare occasions. The copy command, "cp", on Unix platforms has no "verify" option that you can use to tell you whether the copy you are making is a perfect one (on Windows platforms you could specify a "verify" option). If "cp" encounters a problem that it knows about during its execution then it will give out a message and return a non-zero return code that you can test, just as you can see I have tested "$?" for "cmp" above. But who checks this? Worse still, it can return a zero return code while the copy is corrupted in a way that nobody knows about. The only way you can be sure this very valuable data has been copied correctly is to run the "cmp" command against the two files (the original and the copy) to make sure they are the same after the copy has been made. But who does this? Nobody, I guess. And there is another problem lurking here. If you run "cmp" on a copy you have just created, part of that file, or even all of it, will be sitting in the disk cache, so "cmp" uses the part of the file in the disk cache and not the file that has been written to disk. Doing the comparison immediately after creating a file is therefore a waste of time. You have to do it in a way such that you are sure the copy has been committed to disk and is no longer in the disk cache.
If these files are extremely important and an organization makes copies of them many times using "cp", then some day, somewhere, somebody is going to discover that one or more of these files has been corrupted. Worse than that, your files may get copied around by the systems administrator without your knowing about it, during server changeovers or routine overnight maintenance, and if they do not know which of your files are very important, and are not validating the copies they are making, then your files might become corrupted in very rare circumstances without your having any indication that anything has changed.
But all is not lost. If you are copying a batch of large files then when all the copies are made, you can be fairly sure that the first file you copied has been flushed from the cache so you are ready to do the compares. How to do a bulk "cmp" is covered in the "xargs" section next.
cp r*.txt txtdir/
...but once you had done the copy and these were important files and
the copy had to be validated as correct using "cmp" then how would you
run the "cmp" command on these files? The "cmp" command will only accept
the two file names it is going to compare as arguments so how can you feed
it "r*.txt" and tell it about the other directory and have it compare just
those files you copied across? The answer is that this is not possible
unless you use "xargs". It so happens that I have set this up on my PC
and there are indeed some files of the form "r*.txt" that I have copied
to a directory "txtdir" so here is how I run the "cmp" on each of them.
Unfortunately there is no way to get it to show the return code from the
"cmp" command (unless I write a script for "xargs" to call) but you can
see it really is running "cmp" properly because it spotted the deliberate
change I made to roland2.txt after I did the bulk copy. I have used the
"-t" option (trace mode) with "xargs" so that it echoes each command to
standard error before actually running the command.
$ ls -1 r*.txt | xargs -t -I {} cmp {} txtdir/{}
cmp roland.txt txtdir/roland.txt
cmp roland2.txt txtdir/roland2.txt
roland2.txt txtdir/roland2.txt differ: char 1, line 1
cmp roland3.txt txtdir/roland3.txt
I would dearly like the return code to be echoed after each "cmp" had
been run so it would give me positive feedback when "cmp" found no differences
so I have written a script named cmprc
("cmp" plus the return code) for this. It echoes each "cmp" command it
will run so I don't need the "-t" option with xargs so here is "xargs"
calling my "cmprc" script and the output produced.
$ ls -1 r*.txt | xargs -I {} cmprc {} txtdir/{}
cmp roland.txt txtdir/roland.txt
rc=0
cmp roland2.txt txtdir/roland2.txt
roland2.txt txtdir/roland2.txt differ: char 1, line 1
rc=1
cmp roland3.txt txtdir/roland3.txt
rc=0
Quite apart from detecting corrupted copies (which will be extremely rare), you can use the above method to "make sure nothing has changed" in a directory compared to another directory. An awareness of "xargs" for passing arguments from one utility to another is extremely useful.
There is another command you can use that is similar to "xargs" in some ways and that is "apply" (see "man" pages).
$ ls -1 r*.txt | xargs -I {} cmprc {} txtdir/{} 2>&1 | tee cmp.log
cmp roland.txt txtdir/roland.txt
rc=0
cmp roland2.txt txtdir/roland2.txt
roland2.txt txtdir/roland2.txt differ: char 1, line 1
rc=1
cmp roland3.txt txtdir/roland3.txt
rc=0
$ cat cmp.log
The following command was used earlier, where the last “sed” step was
to drop the “:0” from the ends of a list of files and just leave the file
name. “awk” can be used instead in the last step. Here is the original
command again, using “sed” in the last step:
grep -c 'der\.' *.sas | grep ':0$' | sed 's/:0$//'
If there is only one “:” in each line then we can tell “awk” that the
delimiter is “:” using the option “-F:” and tell it we want the first field.
So we can use the following command in place of the above command:
grep -c 'der\.' *.sas | grep ':0$' | awk -F: '{print $1}'
Here is another example using “awk” run against the Unix password file
to match the first field with the user name and print out the fifth field,
which is the full name of the person, held in the password file:
awk -F: '{if ($1=="'$USER'") {print $5}}' /etc/passwd
There are newer versions of “awk” that have extra capabilities. There
is “nawk” (“new awk”) and “gawk” (“GNU awk”). Both “nawk” and “gawk” have
a global substitution function that you can define hexadecimal characters
to. Try typing in this command, for example:
echo * | gawk '{gsub(" ","\x0a")} {print $0}'
…and compare it with the output from the “ls” command used with the
“-1” option:
ls -1
If the above example with “gsub” works with “awk” then you will probably find that the systems administrator has replaced the original “awk” utility with a symbolic link to “gawk” or “nawk”. If you can locate where it is stored then using “ls –l” will show you the long form of file listing and if it is a symbolic link then the first character of the “-rwxr-xr-x” will not be a “-“ but will be an “l” and the very last column will show the link as “awk -> gawk” or something similar.
For more information on “awk” see “man awk” and for “nawk” see “man nawk”. This will give you a lot of information and you should realize that these are very powerful utilities, but you will mainly make use of them to identify fields.
cut -d: -f5 /etc/passwd
You can also "cut" based on column positions. You can learn more about it using “man cut”.
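Here is a sketch of both forms of “cut”, run on literal text so you can check the output for yourself:

```shell
# field form, as used on the password file above: take field 5, ":" delimited
echo 'root:x:0:0:Super-User:/:/sbin/sh' | cut -d: -f5   # prints Super-User
# column form: take character positions 1 to 10 (the permissions column of "ls -l")
echo 'drwxr-xr-x   8 rrash rpt' | cut -c1-10            # prints drwxr-xr-x
```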
When you set up a symbolic link you normally keep the name of the file
the same, except have it in a different directory. So to create a symbolic
link to a macro in a project library and have it point to a file of the same
name in a central library then you would make your project directory your
current directory and enter the command:
ln -s /central/library/macro.sas macro.sas
If you edit a file that is a symbolic link then you will edit the original file. If you delete a file that is a symbolic link then you will not delete the original file. You will just delete the link to it.
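Both behaviours can be seen in a scratch directory:

```shell
mkdir /tmp/ln_demo && cd /tmp/ln_demo
echo 'original' > real.txt
ln -s real.txt link.txt
echo 'changed' > link.txt   # writing through the link...
cat real.txt                # ...changes the original: prints "changed"
rm link.txt                 # deletes only the link
ls                          # real.txt is still here
cd / && rm -r /tmp/ln_demo
```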
find . -name 'di*.sas'
If you wanted to “find” what subdirectories you had from the current
location downwards then you could do it like this:
find . -type d
Supposing you wanted to search for the string “labs” in all programs
of the form di*.sas from the current directory downwards, the following
would not work:
find . -name 'di*.sas' | grep 'labs' # this does NOT work
The reason the above does not work is because it searches for the string
“labs” in the list of files that comes out of the “find” command rather
than in the files themselves. But the following would work:
grep 'labs' $(find . -name 'di*.sas')
The reason the above works is that if $(....) is encountered in a Unix command, then what is enclosed in those brackets gets performed first. This is called substitution and is an important way that Unix commands pass information to each other. The $( ) contains the “find” command and so this results in a list of files. “grep” then understands this list of files as the list of files it has to search in and so will search them and return the results.
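Here is the same idea in miniature, in a scratch directory, so you can see the substitution step happening:

```shell
mkdir -p /tmp/sub_demo/lower && cd /tmp/sub_demo
printf 'labs data\n'  > lower/di1.sas
printf 'other data\n' > lower/di2.sas
echo $(find . -name 'di*.sas')             # shows the substituted file list itself
grep -l 'labs' $(find . -name 'di*.sas')   # prints ./lower/di1.sas
cd / && rm -r /tmp/sub_demo
```

The "echo" line is a useful trick in general: if you are ever unsure what a $( ) substitution will hand to the outer command, echo it first.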
nice sas sasjob &
nohup nice sas sasjob &
ps -fu rrash
To list another user’s processes then put a different userid at the end.
The "f" is an option that means "full". It goes beyond the "l" option
which means "long". The "u" option means "user" or in other words the "user-id".
Of course, just as there is an option for the user, there is an option "p" for
the "process-id" instead, and this will look like the following:
ps -fp 12345
You should remember "f" for "full" and "u" for "user-id" and "p" for "process-id" and sometimes look at the "man pages" for the many options you can specify with this command. It is a very important command. But to help you for the processes owned by you, a utility named myps has been written to list out your own processes. You can also list out another person's processes very easily.
kill 12345
...then you could list your processes again in a few seconds to make
sure it is gone. You should always try to “kill” processes in this way,
rather than the way described next, because it will hopefully allow resources
used by the process to be released back to the Unix system in a controlled
way. If your process does not go away then you can force it to go away
using the “-9” option with the “kill” command like this:
kill -9 12345
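The whole sequence can be tried safely on a throwaway background process of your own. A sketch, assuming a Bourne-family shell where "$!" holds the process id of the last background job:

```shell
sleep 300 &   # a harmless long-running process, started in the background
pid=$!        # remember its process id
ps -p "$pid"  # confirm it is running
kill "$pid"   # ask it to terminate in a controlled way
sleep 1       # give it a moment to go away
ps -p "$pid" || echo "process $pid has gone"
```

Only if the process refused to go away would you then reach for "kill -9".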
grep -c '^M' *
… but remember that the “^M” was entered as Cntl-V M (still holding
down the Cntl key). There is another way you can search for this carriage
return character using grep and that is using $'\r'
grep -c $'\r' *
...and to show just those file names without the count on the end we could use “sed” to replace a regular expression at the end with nothing like this:
Once we know what files contain these carriage return characters then
it is easy to delete them using “sed” as follows. A new file test3.txt
is created from the old file test2.txt with these characters removed:
sed 's/^M//g' test2.txt > test3.txt
...again remember that “^M” was entered using Cntl-V M. You can also
use "\r" to represent the carriage return in the sed expression as follows
but you should check that it is doing what you expect (and not just removing
all the "r" characters):
sed 's/\r//g' test2.txt > test3.txt
These “^M” characters often result from not setting the right options when transferring files from a Windows platform to a Unix platform using “ftp”. It is better if these characters can be avoided in the first place, since the date information for these files is often important, and running utilities against them at a later date to strip out the “^M” will change the date of these files.
chmod 664 *.sas
You will get error messages for the programs that are not yours, since
you are not allowed to change file permissions for those files. But no
harm will be done and all your own programs will have been changed. There
is a script called myfiles
that has been written to only list out files owned by the invoker so you
can also do it using that command and it will only act on your own .sas
files. Note that we are using the $( ... ) substitution technique
again. This is an important way that Unix utilities can pass information
to each other.
chmod 664 $(myfiles *.sas)
You have to be very careful using the chmod command. Suppose you wanted
to remove write access for all the files in a directory. You could cause
problems if you used this command:
chmod 444 * # NEVER use this command
The reason you should never use the command in the above way is because
it implies that you want to set permission to 444 for every type of file,
except you maybe do not know what all the files are. If there were any
scripts then they would no longer work as you will have removed the execute
permission. And, more importantly, directories are also files that would
be affected by this command. Directories need execute permission for them
to work as directories so if you used “chmod 444 *” then people would no
longer be able to read any directory that you created until you switched
the execute permission back on. And only the owner can do this or the Unix
systems administrator. Fortunately there is another way of using “chmod”
that is much safer for removing write access as follows:
chmod -w *
What the above command does is to remove write access for everybody
for all files. It will not affect the execute permission of any scripts
or subdirectories. You can also use this form of the command on single
files. Suppose you had created a file that you wanted to run as a script
(it needs execute permission for this) then you could set the permission
this way:
chmod 775 myscript
Or you could do it like this:
chmod +x myscript
The second way is probably easier to remember. To find out more about the “chmod” command then use “man chmod”.
You should note that using the “chmod” command like “chmod -w” or “chmod
+w” uses your “umask” value to mask permissions (this will be explained
in the following section). If you wanted to switch on write permission
for all files for all users then you would find that “chmod +w *” would
probably switch on write permission only for yourself (the owner) and your
group because for “others” the write permission might be masked. You can
still use this method to switch on write for everybody, though, by putting
a letter before the “+” or “-” to say which users it refers to: “a” (“all
users”), “u” (“user/owner”), “g” (“group”) or “o” (“others”). So to switch
on “write” for everybody, including “others”, you could do it like this:
chmod a+w *
…or like this:
chmod ugo+w *
If you have a “umask” of 022 then the “w” permission of the group is masked as well as the “w” permission of outsiders and nobody in your group will be able to edit the file you created. So if you created a program then it would mean that nobody else in your group could edit it and make changes. This is not a good idea if you are all working together on a project so you should, instead, make sure “umask” is set to 002.
You can set “umask” during your Unix session, if you like, but it is normal to have it set to 002 for you by default or to have it in your login member as “umask 002”.
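Here is a small sketch, in a throwaway directory, of how a “umask” of 022 masks “chmod +w” and how the “a” prefix overrides the mask (the file name is made up):

```shell
cd "$(mktemp -d)"
umask 022                   # "w" is masked for group and others
touch shared.txt
chmod a-w shared.txt        # start with no write access at all
chmod +w shared.txt         # only the owner's "w" comes back
ls -l shared.txt            # -rw-r--r--
chmod a+w shared.txt        # "a" overrides the umask
ls -l shared.txt            # -rw-rw-rw-
```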
If you try to browse or list binary executable files then it can mess
up your terminal window with strange flashing characters and from then
on you have to close the window down. If you are unsure about a file and
you want to know more about it, then the “file” command is the best and
safest way to find out. To list all the file types in a directory then
you could use “file *”. You can pipe the output to “grep” and then it becomes
more useful. Suppose you have a mix of scripts and other files in a directory
and you just want a list of the scripts then you could pipe the output
to “grep” like this:
file * | grep ' script'
This will give you a list with the file name followed by a colon plus
the description mentioning the word “script”. If you just wanted the file
name and nothing else then you could drop the colon and everything after
it. There are various ways of doing this and you should know more
than one way already after reading this document. Here is a solution using
“sed”:
file * | grep ' script' | sed 's/:.*$//'
To list all your system environment variables then enter the command
“env”. You may see a lot of these when you do this. They are shown as the
name followed by “=” followed by what they are set to. Suppose you just
wanted to see what PATH was set to. You could pipe it to grep like this:
env | grep '^PATH'
(the “^” sign above is a regular expression special symbol meaning “line
beginning with”). That will show you just that line but the more common
way to do it would be like this:
echo $PATH
Putting the “$” in front of the variable is to refer to its contents
and “echo” just displays it on the screen. The “$” for a system environment
variable is rather like a “&” for a macro variable. You put the sign
in front to refer to its contents. You could also do the same with this:
echo ${PATH}
The above is a longer form of the same thing and you would use it if you were following this variable with characters that it would confuse as part of the name of the variable. For example, $PATHXX would not resolve to anything as the variable PATHXX had not been set up but ${PATH}XX would work. Note that these brackets must be “curly” brackets.
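You can try this out with a made-up variable (“PET” here is purely for illustration):

```shell
PET=cat
echo "$PETXX"     # prints an empty line: no variable PETXX exists
echo "${PET}XX"   # prints "catXX": the curly brackets mark where the name ends
```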
Suppose your PATH variable was set differently to other people in your group, so you could not run commands that your group members were running, and you wanted it changed so that you could. PATH values are usually set for you by default but you can set your own values as well. It is likely that your group members have reset PATH to a different value in their Unix login member, but you should check whether they have done this. The login member resides in your home directory. If you go there and list out the files, including the “hidden” ones, using “ls” with the “-a” option, you will see a member called “.bashrc”. This is the login member you can edit (at most sites you are not allowed to edit this member and are only allowed to edit one called .bashrc_own). You would then include the line that sets PATH the same as your other group members and remove any other lines that do the same. This member only takes effect at login time, so next time you logged on, PATH would have the same setting as your group members and you should be able to access the same Unix commands as they do.
Note that the setting of PATH in your login member is not recommended and is to be avoided if at all possible. It is better if everybody in your group gets a suitable setting for PATH by default.
pk04p=/data/sas/xxx/pr0g/drug/study/inc/dev
pk04d=/data/sas/xxx/data/drug/study/inc/der
pk04s=/data/sas/xxx/data/drug/study/inc/stat
export pk04p pk04d pk04s
Note that in the above, not only should the variables be set but they
should be “exported” as well. This is so that any background sub-sessions
are also aware of their settings. You only need to export the variables
once after they have been created. If you change their values after that
then so long as they have already been exported they will also change for
any new sub-processes. Once these are set then you can use them as short-cuts
to directories like this:
cd $pk04p
You could also use these variables in any other command. For example,
you could copy all the programs in one directory to the current directory
like this:
cp -p $pk04p/*.sas .
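Here is a quick sketch of why the “export” matters (the variable name is made up): an unexported variable is invisible to a sub-process, while an exported one is passed down.

```shell
mydir=/tmp                          # a shell variable, not yet exported
sh -c 'echo "child sees: $mydir"'   # prints "child sees: " - the child cannot see it
export mydir
sh -c 'echo "child sees: $mydir"'   # prints "child sees: /tmp"
```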
$ set -o
allexport       off
braceexpand     on
emacs           on
errexit         off
errtrace        off
functrace       off
hashall         on
histexpand      on
history         on
igncr           off
ignoreeof       off
interactive-comments    on
keyword         off
monitor         on
noclobber       off
noexec          off
noglob          off
nolog           off
notify          off
nounset         off
onecmd          off
physical        off
pipefail        off
posix           off
privileged      off
verbose         off
vi              off
xtrace          off
What "noclobber" does, if set to "on", is to prevent redirection using
">" from overwriting a file if it already exists. I do not want that feature
so I use the default which is to have "noclobber" inactive. There are two
ways I can activate it. Both are equivalent.
$ set -o noclobber
$ set -C
Now if I check on this setting I will see that it has changed.
$ set -o | grep noclobber
noclobber       on
If it is active and I want to deactivate it then there are two ways
to do it which are the same as the ways I used to activate it except the
"-" is changed to "+". I will make the change and check on the "noclobber"
setting again.
$ set +o noclobber
$ set +C
$ set -o | grep noclobber
noclobber       off
If you have "noclobber" active then you might find that some utilities written for you do not work properly as they are trying to overwrite existing files using redirection using ">". If it is a problem then the script programmer might change the script to ensure this option is not in effect using the commands above or they may force the overwrite using ">|" (or ">!" for C shell scripts) instead of ">" but it might be easier for you to reset this option in your own session. You could do this in your login member.
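You can try “noclobber” out safely in a throwaway directory like this (bash syntax, file name made up):

```shell
cd "$(mktemp -d)"
set -o noclobber
echo one > f.txt
echo two > f.txt      # fails: ">" will not overwrite an existing file
echo two >| f.txt     # ">|" forces the overwrite
cat f.txt             # two
set +o noclobber
```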
alias sas6='/data/app/sas/sas612/sas'
Once this is done, then you can use the command like this to run a program
under version 6:
sas6 progname
To list all the aliases for your session then enter the command “alias”.
The use of aliases can cause some confusion. For example, at some Unix sites, “cp” is an alias of “cp -i” and “rm” is an alias of “rm -i”. You might like to set these up yourself. The effect in these two cases is to set the “-i” (“interactive”) option so that you are prompted for confirmation before a file is deleted or overwritten by mistake. But suppose you were used to a site where “cp” is an alias of “cp -i” and then moved to a site where that alias did not exist: you might expect to be prompted when a copy is about to overwrite another file, and instead the file would be overwritten without any prompting.
You can easily escape an alias definition and use the raw command by
preceding it with an escape character like this “\rm”. Another thing you
can do is to specifically cancel an alias using the “unalias” command like
in the following command. However these aliases must have been defined
otherwise you will get an error message.
unalias rm cp
You can “unalias” all aliases set up by using the “-a” option like this:
unalias -a
Aliases have a very limited scope. You can not “export” an alias like you can a system environment variable.
. /path-name/command_file
. ./command_file # for current directory
$ which contents
/cygdrive/c/spectre/scripts/contents |
Because scripts with the same name in different directories can cause
problems sometimes, I wrote a script that you can use named pathscr
that will list out all scripts defined to your PATH system environment
variable in the order in which they will be accessed. This is what I get
when I use that script and grep for "contents". The number you see is the
order number of the directory defined to PATH.
$ pathscr | grep contents
contents           8  /cygdrive/c/spectre/scripts
contents_cygwin    8  /cygdrive/c/spectre/scripts
contentsl          8  /cygdrive/c/spectre/scripts
contentsl_cygwin   8  /cygdrive/c/spectre/scripts
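If you do not have “pathscr”, the idea behind it can be sketched in one line: split PATH on its colons and number the directories in the order the shell searches them.

```shell
# One numbered line per PATH directory, in search order
echo "$PATH" | tr ':' '\n' | nl
```

This only lists the directories, not the scripts inside them, but it shows which directory wins when two scripts share a name.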
xterm*background: AntiqueWhite
xterm*geometry: 81x24
sas*background: wheat
!xterm*foreground: black
ghostscript*background: white
ghostscript*foreground: black
!ghostscript*useXPutImage: false
!ghostscript*useXSetTile: false
!ghostscript*useBackingPixmap: false
To find out more about customizing xterm windows, do an Internet search on ".Xdefaults".
/usr/sbin/fuser -u *.sas7bdat
If there were any users holding locks then the userids would be listed alongside the file names. To find out the full name of a person from just their userid then use the “finger” command described next.
Note that in the case of “fuser” the list of files and the users holding locks on them are written to standard error and not standard output. It is a simple matter to redirect this to standard output if you want to by following the command with “ 2>&1”.
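The effect of “ 2>&1” can be demonstrated with any command that writes to standard error, such as “ls” on a file that does not exist (the file name here is made up):

```shell
ls nosuchfile.xyz | wc -l        # 0 - the error message bypasses the pipe
ls nosuchfile.xyz 2>&1 | wc -l   # 1 - "2>&1" sends it through the pipe
```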
finger rrash
You usually “pipe” a stream of information to “sed” and “sed”
sends its output to “standard output” which is your terminal. You can,
of course, “pipe” its output to another utility if you want to.
The general form of using “sed” is like this:
previous command | sed 's/RE/replacement/' | next command
The above is for a single substitution (hence the “s” for “substitution”
at the start) of a regular expression “RE” with a replacement string “replacement”.
To replace a regular expression with nothing then it would take the form:
previous command | sed 's/RE//' | next command
If you wanted to apply the substitution “globally”, meaning to every
match in the line rather than just the first, then you put a “g” at the
end like this:
previous command | sed 's/RE//g' | next command
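You can see the difference the “g” makes with a quick test:

```shell
echo 'banana' | sed 's/a/o/'    # bonana - only the first "a" is replaced
echo 'banana' | sed 's/a/o/g'   # bonono - every "a" on the line is replaced
```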
The “/” used to separate the RE from the replacement could be another
character instead. Suppose your RE contained a “/” then you could use this
instead:
previous command | sed 's%RE%%g' | next command
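For example, using “%” as the delimiter means the slashes in a path need no special treatment:

```shell
echo '/usr/local/bin' | sed 's%/usr/local%/opt%'   # gives "/opt/bin"
```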
As mentioned previously, a “regular expression” or “RE” is more than just a string. Some characters have a special meaning. To give some examples, “^” at the beginning means “begins with”, “$” at the end means “ends with”, “.” means “any character”, “*” means “zero or more of the previous character”. There is more to this that you can read about. If you really did want to search for one of these characters then you have to put an “escape” character “\” in front of it. Note that this slash leans the opposite way to a Unix directory slash.
Here is a “sed” step for removing the full path name complete with slashes
from a fully specified file name to just leave the ending such that “/xx/yy/file.ext”
will end up as “file.ext”:
sed 's%^.*/%%'
Here is a “sed” step for removing a colon, together with anything that
follows it (which could be nothing), such that “aa:0 and more things”
will end up as “aa”:
sed 's/:.*$//'
If you see examples of “sed” being used, then very often they will be doing one of the two examples above.
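You can try both of them directly by feeding “echo” output into “sed”:

```shell
echo '/xx/yy/file.ext' | sed 's%^.*/%%'        # gives "file.ext"
echo 'aa:0 and more things' | sed 's/:.*$//'   # gives "aa"
```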
You can do a lot more with sed. For example, you can have multiple edits,
each preceded by the "-e" option, and the pattern you are searching for
can be substituted into the replacement string using the "&" symbol.
Sometimes you will want to substitute different sections of the pattern
you are searching for into the replacement string. You can do this by splitting
your RE into bracketed portions using "\(" and "\)" to define each section
and then you can substitute each section using "\1", "\2" etc. in the replacement
string. This technique was used to add the links in the "updates" page
on this web site. You can view the raw "updates" page where you can see
this sed command listed in the instructions by clicking on the link below.
updates.html
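Here is a small sketch of both techniques on made-up input: the “&” symbol substitutes the whole matched pattern into the replacement, and the bracketed portions defined with “\(” and “\)” are substituted with “\1”, “\2” and so on:

```shell
echo 'page 42' | sed 's/42/[&]/'                    # gives "page [42]"
echo 'John Smith' | sed 's/\(.*\) \(.*\)/\2, \1/'   # gives "Smith, John"
```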
In this example, I have some programs I want to run that have the extension
".sas". I want a command that will create a command list that will take
each program in turn, will generate the commands to delete its output files
and then on the next line run the program using the command "sasb". First
of all, let us see what programs are there.
$ ls t_*.sas
t_adv.sas t_adv2.sas t_demog.sas
Now I see what is there I want to drop the ".sas" extension and add
the file extensions of the output files I want to delete, then on a new
line, run the program with the "sasb" utility. Here goes:
$ ls t_*.sas | sed 's%\..*%%' | gawk '{print "rm -f " $0 ".log " $0 ".lst\n" "sasb " $0}'
rm -f t_adv.log t_adv.lst
sasb t_adv
rm -f t_adv2.log t_adv2.lst
sasb t_adv2
rm -f t_demog.log t_demog.lst
sasb t_demog
You can see from above that it has generated the commands that I want. I can redirect it to a file and then "source" it to run it. Do you see how easy it is? There is no need to "maintain" a list of commands like this if you know how to extract the list of programs and the ordering of it does not matter. It is easier and better to generate the commands as that way there will be no syntax errors, no spelling mistakes and no missing programs in the list.
I have copied a number of scripts on this web site and put them in my AIX "bin" directory at work using "vi". First, I highlight and copy the script code I am looking at in Internet Explorer so it is in the paste buffer. Next, I invoke "vi" in the directory I want to store the script in, using a command of the form "vi scriptname". Next I press "i" to put "vi" into insert mode. Next I click the right mouse button to paste in the code from the buffer. Next I press "Esc" (actually twice, to be sure) so I am out of insert mode. Next, I type Shift-q to get the ":" prompt and press "w" to save the code. I then press "q" to quit and close "vi". Lastly, I make the script executable using a command of the form "chmod +x scriptname" and then it is done. It only takes a few seconds.
"Less" can do more for you. It is very useful for searching for words, for example. To learn about its capabilities, press the "h" for "help" key.
If you come to the end of this document hungry for more knowledge then I have written a document called “Writing bash Shell Scripts” that may interest you. But you should only start reading that document if you know the common Unix/Linux commands well and have assimilated all the material in this document. It would prove to be a waste of time unless you have done this.
Use the "Back" button of your browser to return to the previous page
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.