We are now familiar with the shell and a large number of commands.
In this lecture, we will discuss shell programming using bash. The main goal is to be able to write your own scripts. But what are scripts? And what are they useful for? We will answer these questions.
We plan to learn the following from today’s lecture:
OK. Let’s get started.
Linux has the ability to let you put commands together in a single file known as a shell script that it can execute. Shell scripts can include any of the shell commands; for example, the set we discussed in class. These top twenty commands (well, I lied, top 44 commands - I used wc word count with the “-w” switch to get the actual number of commands we have discussed in class) are sorted (I used the sort command) in alphabetical order below:
Important note: If you know our top 44 you are all set for the remainder of the course. You may only use half of them most of the time and you are likely to learn others, but this is a good set to know.
We also used a number of programs and tools:
as, emacs, vim, gcc, objdump
We will learn more tools for sure but emacs and gcc will become familiar friends.
In addition to calling Linux commands (e.g., grep, cd, rm) shell scripts can also call compiled programs (e.g., C programs) and other shell scripts. Shell programming also includes control flow commands to test conditional code (if..then) or to do a task repeatedly (for..in). These control structure commands found in many other languages (such as C, or other scripting languages like perl) allow the programmer to quickly write fairly sophisticated shell programs to do a number of different tasks.
Importantly, shell scripts are not compiled, rather, they are interpreted and executed by the shell itself.
Shell scripts are used for a variety of reasons ranging from building and configuring systems or environments, prototyping code, or in support of an array of repetitive tasks that programmers do. Shell programming is mainly built on the Linux shell commands and utilities and therefore reuse of existing programs enables programmers to simply build new programs to tackle fairly complex jobs.
The shell can be used in two different ways:
The interactive mode is fine for entering a handful of commands but it becomes cumbersome for the user to keep re-entering these commands interactively. It is better to store the commands in a text file called a shell script, or script for short, and execute the script when needed. In this way, the script is preserved and other users can also use it - code reuse, again. In fact, scripts can invoke others scripts and programs so scripting makes good sense.
Let’s start to build up our knowledge of how scripts work by first looking at some basic operations of the shell. The Linux shell allows for the unconditional execution of commands and allows for related commands to be kept adjacent as a command sequence using the semicolon character as shown below:
The tty command prints the file name of the terminal connected to standard input.
When using the shell interactively it is often clear when we have made a mistake - the shell warns about incorrect syntax, and commands complain about invalid switches or missing files. Here there is a separation of concerns between the parser (the shell) and the program (the command, /bin/ls for example).
Error messages provide visual clues that something is wrong allowing us to adjust the command to get it right.
Commands also inform the shell explicitly whether the command has terminated successfully or not due to an error. Commands do this by returning an exit status, which represents an integer value that is made available to the shell and other commands, programs, scripts.
By convention an exit status of (0) indicates the successful execution and any other value (always positive) indicates failure of some sort.
The shell environment value $? is updated each time a command exits. What do we mean by that?
The special parameter $? holds the exit status. There are a number of other special parameters, most like $? can be only read from and not written to.
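To make the exit status concrete, here is a minimal sketch (not taken from the course scripts). It runs two commands in sequence with a semicolon, then saves $? after a command that fails and after one that succeeds. The directory name and the variable names status_fail and status_ok are made up for illustration.

```bash
# Commands separated by a semicolon run one after another, unconditionally.
echo "first"; echo "second"

# A command that fails: the directory name is a made-up example.
ls /no/such/directory 2>/dev/null
status_fail=$?          # save $? immediately; the next command overwrites it

# A command that succeeds.
true
status_ok=$?

echo "ls returned $status_fail, true returned $status_ok"
```

Note that $? has to be saved straight away, because every command that runs (including echo) replaces it.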
Why do we need to use the exit status?
Often we only want to execute a command based on the success or failure of an earlier command; for example, we may only wish to remove files if we are in the correct directory, or we can only append info to a file if we know it exists.
The shell provides both conjunction (“and-ing”) and disjunction (“or-ing”) based on previous commands. These are very useful constructs for writing decision-making scripts - take the example below:
In the first example, && (without any spaces) requests that the second command is only executed if the first succeeds (with an exit status of 0) - i.e., we only delete the files if we have been able to change to the required directory. The w or who command shows who is logged on and what they are doing.
In the second example, || (without any spaces) requests that the second command is only executed if the first command failed (with an exit status > 0) - i.e., if campbell is not logged on. The grep command used with the “-q” switch (for quiet) suppresses grep's normal output, so only its exit status tells us whether campbell was found.
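The two constructs described above can be sketched as follows. This is an illustrative version only: it works in a scratch directory created with mktemp, and it uses a fixed string (logged_in, a made-up list of users) in place of the output of w so that it behaves the same on any machine.

```bash
demo_dir=$(mktemp -d)
touch "$demo_dir/junk1.tmp" "$demo_dir/junk2.tmp"

# && : remove the files only if the cd succeeded.
# The parentheses run the pair in a subshell so our own directory is unchanged.
( cd "$demo_dir" && rm -f junk1.tmp junk2.tmp )

# || : the echo runs only if grep failed to find campbell.
logged_in="alice bob"        # hypothetical stand-in for the output of w
msg=$(echo "$logged_in" | grep -q campbell || echo "campbell is not logged on")
echo "$msg"

rmdir "$demo_dir"            # succeeds only because the directory is now empty
```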
The shell syntax borrows heavily from C (and now C++) . Since Unix is written in C it is not surprising that the shell syntax is similar - more reuse!
There are many situations when we need to execute commands based on the outcome of an earlier command. We now introduce some of the simple programming features of the shell.
Here command1 and command2 (and any other commands that might be programmed) will be executed if and only if command-sequence returns a successful or true value (i.e., its exit status is 0). The fact that command1 and command2 are executed when true equals 0 and not 1 is confusing for some people. In many high level languages conditionals are executed where true equals 1. But do not get hung-up on that for now.
The > character is the secondary prompt issued by the shell indicating that more input is expected.
Similarly, we may have commands to execute if the conditional fails.
Here command3 and command4 (and others) will be executed iff command-sequence fails.
OK let’s write a simple interactive script. Entering interactive scripts such as the one below is a quick and easy way to test the structure of the scripting language or try out a set of commands. During an interactive session the shell simply allows you to enter an interactive program at the command line and then executes it.
The exit status of the command-sequence (i.e., w | grep -q campbell) (note w and who are very similar commands that can be used interchangeably) is provided by the exit status of the last command executed in the sequence (grep in this case) before the conditional test. So if grep finds campbell in the piped data from w it returns a 0 (success) and the “campbell is working online” is displayed to the screen, else if grep does not find campbell then it returns a failed status (1). Note, that if you look at the status after the echo statements are executed the status is always 0 no matter which branch was taken. Check it out and see if you can work out why that is the case.
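Here is a hedged sketch of such an if/then/else test. To keep it reproducible it does not call the real w command; instead a small function, fake_w (an invented name), prints two made-up lines of the kind w would produce. On a real system you would replace fake_w with w or who.

```bash
# Stand-in for the real w command so the sketch gives the same result anywhere.
fake_w() {
    printf 'campbell pts/0 bash\nalice pts/1 vim\n'
}

if fake_w | grep -q campbell
then
    result="campbell is working online"
else
    result="campbell is not online"
fi

echo "$result"
echo "exit status after echo: $?"   # reports the status of the echo just above
```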
We can also use the exclamation character ! which represents logical “not” - this is the same as the C language again.
Typically, the command-sequence providing the exit status need not be an external command but can be a result that the shell itself has determined - does a certain file exist? Is a certain file actually in the directory (recall that we looked at similar code in the .bash_profile file)? And, if so, is it executable, etc.
We can test for a number of conditions using the test or (interchangeably) [ ] command. We use both below but recommend you use [ ] for no other reason than that it’s more standard.
So replacing test with “[ ]” we get the equivalent: This time we set the variable ASSIGN1
Note, that if you use the “up arrow” on your keyboard or history command to display the last set of commands executed then the interactive program is formatted as the following:
Note, it’s important that you leave spaces between [ $ASSIGN1 ] else you will get syntax errors. There are a number of other options that can be used with the test [ ] command.
Recall the statement “if [ -f /.bashrc ]; then” in the .bash_profile file. We could interactively enter the script, with a small variation to what was in the .bash_profile file.
Checkout. You can read the man pages to read up about the test command (the condition evaluation utility); just type the following at the command line: man test
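As a rough illustration of test and [ ], the sketch below checks a string variable with both forms and then applies two file tests, including the ! (not) operator. The value given to ASSIGN1 and the scratch file are assumptions made for the example.

```bash
ASSIGN1="lab1"              # hypothetical value

if test "$ASSIGN1"          # true when the string is not empty
then
    with_test="set"
fi

if [ "$ASSIGN1" ]           # same test in bracket form; the spaces are required
then
    with_brackets="set"
fi

scratch=$(mktemp)           # mktemp creates a regular, non-executable file
if [ -f "$scratch" ] && [ ! -x "$scratch" ]
then
    file_check="regular file, not executable"
fi
rm -f "$scratch"

echo "$with_test / $with_brackets / $file_check"
```

Quoting the variable inside the brackets is a defensive habit: it keeps the test well formed even when the variable is empty or contains spaces.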
Many commands accept a list of files on the command line and perform actions on each file in turn. However, what if we need to perform a sequence of commands on a list of files? Some commands can only handle one file (or argument) per invocation so we need to invoke the command many times.
The shell supports a simple iteration over lists of values - typically over lists of files. In the following example, we make a “back up” copy of each of our C files by adding the .bak extension.
Try it. You could do this with a single cp command.
As expected we may place as many commands, pipelines, etc. inside the body of a loop. We can use any combination of other if/else tests and nested loops just like in traditional languages such as C. We haven’t covered C yet so you have to believe me for now.
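A minimal sketch of this kind of loop is shown here. So that it can be run safely anywhere, it creates two empty C files (main.c and util.c, invented names) in a scratch directory rather than touching your own files.

```bash
workdir=$(mktemp -d)
touch "$workdir/main.c" "$workdir/util.c"

# Filename expansion turns *.c into the list of values for the loop.
for f in "$workdir"/*.c
do
    cp "$f" "$f.bak"
    echo "backed up $f"
done

ls "$workdir"
```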
We are not forced to use names of files (as generated by filename expansion) in our list:
Imagine that we wanted to send an email to everyone currently logged in:
This example highlights the use of quoting using backquotes ` ` allowing the output of the who command to be used by another command, in this case the for loop. The variable person is assigned to each person logged on. The cut command is used to find the person’s username (e.g., campbell). Sort is used to sort the list and remove any duplicates (-u switch for unique). The message in saved.messages is used as standard input to the mail command. This simple but fairly sophisticated script will send an email to all people logged in to wildcat.cs.dartmouth.edu. If you want to just send an email to yourself to test the script out before blasting email to everyone logged in try adding a final stage after sort to pick out your username. What command would you use? Quite simple really.
How about the alternative. We don’t want to mail ourself. Then:
Note, that backquotes are used at either end of the command line `who .... sort -u` and that standard single quotes are used around the space given to the cut switch -d ' '. This is not always clear from the printed page of the notes.
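The loop over command output can be sketched as below. Two things are deliberately different from a real run, and both are assumptions for the sake of a safe example: fake_who (an invented function) prints made-up who-style lines, and the mail command is only shown in a comment so that nothing is actually sent. The grep -v -x stage drops our own username (me, here set to campbell) from the list.

```bash
# Stand-in for who: made-up login lines, with campbell logged in twice.
fake_who() {
    printf 'campbell pts/0 Jan 10 09:00\nalice pts/1 Jan 10 09:05\ncampbell pts/2 Jan 10 09:30\n'
}

me="campbell"
recipients=""

# Backquotes substitute the output of the pipeline into the for list.
for person in `fake_who | cut -d ' ' -f 1 | sort -u | grep -v -x "$me"`
do
    # A real script would send the message here, for example:
    #   mail "$person" < saved.messages
    recipients="$recipients $person"
done

echo "would mail:$recipients"
```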
Fun Challenge. How about you send me an email using a bash script? To do that I would have to be logged on to a machine, say wildcat. But how would you know what time I’m logged on? This is one of the questions on Lab2 called “gotcha”. When Andrew and the TA are logged on send email to them: “Subject: Gotcha Campbell you sneak!” with information about when we logged on and off; that is, the exact times. Your job is to send the exact time we sneaked on and off the machine some time this week - checkout Lab2 for details. Of course I’m sneaky so you better be vigilant.
Here is another example of the use of the for-loop command. Let’s maintain 10 rolling backups of an important file. Start by copying an important file to backup.1. As you run the script backups are rolled back i.e., backup.1 becomes backup.2, etc.
The script above also shows how numbers can be used directly in scripts.
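One way the rolling backup idea might look is sketched here; it is an illustration under stated assumptions rather than the course's own script. It works in a scratch directory, uses a made-up file called important.txt, and wraps the loop in a function (roll_backups, an invented name) so it can be run twice to show the roll.

```bash
workdir=$(mktemp -d)
echo "version 1" > "$workdir/important.txt"

roll_backups() {
    # Work from the oldest slot down so nothing is overwritten too early:
    # backup.9 becomes backup.10, ..., backup.1 becomes backup.2.
    for i in 9 8 7 6 5 4 3 2 1
    do
        if [ -f "$workdir/backup.$i" ]
        then
            mv "$workdir/backup.$i" "$workdir/backup.$((i + 1))"
        fi
    done
    cp "$workdir/important.txt" "$workdir/backup.1"
}

roll_backups                                  # backup.1 holds version 1
echo "version 2" > "$workdir/important.txt"
roll_backups                                  # version 1 rolls to backup.2
ls "$workdir"
```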
Up until now we have entered scripts interactively into the shell. It is a pain to have to keep re-entering scripts interactively. It is better to store the script commands in a text file and then execute the script when we need it. So how do we do that?
Simple. We first use emacs, vim, or your favorite text editor, to create a text file, enter the script, and then make the file executable (with chmod).
Let’s take the script above as an example and do just that. We create a file called backup.sh (the .sh extension is for shell script). We do not have to use the .sh extension but it tells us immediately that the file is a shell script rather than say a C program which by convention would be backup.c. So get into the habit of doing that.
Note, that our first shell script below - backup.sh - is a plain-text file that contains shell commands and constructs that we have previously entered interactively.
Script source: backup.sh
The contents of backup.sh looks like this:
There are a couple of things to note about the file. First there is the #!/bin/bash line. What does this mean? Typically, the # in the first column of a file denotes the start of a comment until the end of the line. Not in this case.
There is an exception in the case of #!/bin/bash which is a special form of comment. The special character sequence #! tells the shell that the argument that follows i.e., /bin/bash is the path to the program that will execute this file. In our case this is the bash shell /bin/bash. The bash shell /bin/bash takes the remaining commands in the file as standard input just like any other command line. The #! must be the first line of the file, no blank lines are allowed.
The script returns the exit status 0 to the command line which can be viewed using the echo $? command, as discussed earlier. The return status is typically not checked when scripts are run from the command line, as discussed below. However, when a script is called by another script the return status is typically checked so it is important to return a meaningful return code.
Note, that if we look at the file in its local directory we see it is not an executable file. That is a problem because it needs to be an executable for the shell to execute it - makes sense right. We use the chmod command with the symbolic switch +x to make the file backup.sh an executable.
The script can be executed by simply typing any of the following.
The first invocation simply executes the script without an explicit path (Question: what does this imply about the setup of the $PATH environment variable?).
Answer: The $PATH environment variable set up in .bash_profile needs to include the current working directory “.”. If you type the script name without the path being set up (e.g., PATH=$PATH:.) you will get a “command not found” error message.
If this is the case you can type ./backup.sh which tells the shell that the file is in the current working directory - this gives the full relative pathname of the file. Using ./ has the advantage that you are sure you are executing the file you think you are rather than accidentally running another script with the same name on the system (e.g., imagine if the name of the script were cd).
The final example uses the source command. We haven’t come across source before. Source is a way of getting commands to execute in your current shell.
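The steps above (create the file, make it executable, run it) can be tried end to end with a small sketch like this one. The script name hello.sh and its contents are invented for the example, and it is written into a scratch directory.

```bash
workdir=$(mktemp -d)

# Write a tiny script; the quoted 'EOF' stops the shell expanding anything in it.
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/bash
echo "hello from hello.sh"
exit 0
EOF

[ -x "$workdir/hello.sh" ] || echo "not executable yet"
chmod +x "$workdir/hello.sh"

"$workdir/hello.sh"                  # run it by giving its path
echo "exit status: $?"

out=$(bash "$workdir/hello.sh")      # or hand the file to bash explicitly
echo "captured: $out"
```

Handing the file to bash explicitly works even without the execute permission, which is a handy check when a freshly written script refuses to run.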
Variables and arrays are typically not declared before they are used in scripts.
Variables and arrays are stored as strings.
Examples of variables include:
Like in any programming language, arrays are very useful to store and retrieve data from.
Creating and using arrays is straightforward.
For example: array=(red green blue yellow black white)
Importantly, the array can be used in combination with back quotes and commands in a powerful manner; for example, consider:
array=(`cat array.sh`)
array=(`find .`)
array=(`ls`)
Note, that the above commands use backquotes. Sometimes it is not clear from the printed notes.
Try substituting these three definitions in the script below and observe the results - isn't that cool!
Here is a simple example:
Script source: array.sh
The contents of array.sh looks like this:
Make sure you re-run the script with the other array definitions - this will help with the assignment.
Another example. You can execute this interactively and play with arrays. If you need to store information and update it while your script is running then arrays are good for this:
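A short sketch of creating, reading, and updating an array follows. It reuses the colour list from the example definition above; the replacement values purple and orange are arbitrary.

```bash
array=(red green blue yellow black white)

count=${#array[@]}                   # number of elements
echo "the array has $count elements; the first is ${array[0]}"

# "${array[@]}" expands to one word per element.
for colour in "${array[@]}"
do
    echo "colour: $colour"
done

array[2]="purple"                    # overwrite an element (indices start at 0)
array[$count]="orange"               # store a new element at the next free index
echo "now: ${array[*]}"
```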
The “for-loop” construct is good for looping through a series of strings but not that useful when you do not know how many times the loop needs to run. The while do command is perfect for this. For example, in the Gotcha assignment you will need this.
The contents of guessprime.sh using the “while-do” construct. The script allows the user to guess a prime between 1-100.
Script source: guessprime.sh
This script uses user defined variables and echo -n (-n suppresses the newline usually added by echo).
Checkout. The script allows you to enter data into the script using the read utility to read in user input. Why don’t you run it. But make the file executable this time using absolute octal values.
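For a feel of how while, read, and echo -n fit together, here is a minimal sketch in the spirit of the guessing game; it is not the guessprime.sh from the course. The answer PRIME=31 is an assumption, and the three guesses are fed in through a here-document so the sketch runs without anyone typing. Remove the here-document (everything from <<EOF to the closing EOF) to play it interactively.

```bash
PRIME=31                     # hypothetical answer
tries=0
guess=0

while [ "$guess" -ne "$PRIME" ]
do
    echo -n "Guess a prime between 1 and 100: "
    read guess || break      # stop if the input runs out
    tries=$((tries + 1))
done <<EOF
7
13
31
EOF

echo
echo "Finished after $tries guesses"
```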
The shell maintains a number of important environment variables that are useful in writing scripts. We have come across some of them already.
Sometimes it’s good to know the process ID (PID) of the shell that is executing. You can do this using the special parameter “$$”.
Here is a little script that shows the difference between using some of the environmental variables and using the command “shift”, as well as initialising a variable and using “let” to increment the variable in a for .. do loop.
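The ideas just listed can be sketched in a few lines; this is an illustration and not the course's shift.sh. The function name show_args and the three arguments are invented, and a function is used only so the positional parameters can be set inside the example. In a script file the same lines would act on the script's own arguments.

```bash
show_args() {
    count=0
    echo "shell PID: $$, number of arguments: $#"

    for arg in "$@"
    do
        let count=count+1            # arithmetic on a variable with let
        echo "argument $count: $arg"
    done

    shift                            # drop $1; every other argument moves down one
    first_after_shift=$1
    remaining=$#
    echo "after shift: first argument is $first_after_shift, $remaining left"
}

show_args alpha beta gamma
```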
Script source: shift.sh
Note, that two processes are running. One is the bash shell that you interact with when you log in and the other is the ps command. If we use “ps -l” we can see that the bash shell process (PID 7996) is in a wait state. The Linux scheduler puts processes into various states depending on what they are doing. They can be executing, ready to execute but waiting on the scheduler queue or blocked or waiting on some signal or event. Here the bash shell is waiting for the ps command process to complete. Operating systems are very neat in the way they make it look like multiple processes or tasks are executing simultaneously when they are not!
Checkout. Let’s look at how the special character $@ can be used with the for..in construct. The description in the script file “whos.sh” describes what the script does. It introduces a number of new ideas. Take a look at the script and see if you can work out what is going on. We will discuss how it works below. First read the script and see if you can work it out without reading the next section first.
Script source: whos.sh
Note, we use -eq above to test if two integers are equal. In addition, -ne means not equal to, -lt means less than, and -gt means greater than. To compare strings rather than numbers we can use == and != instead.
Here is how you interact with the script once you have made it executable.
How it Works. The first thing to notice is that this script expects arguments. We have not dealt with that before. The command-line arguments to the script are accessed using the special parameter $@, which expands to the list of all the input arguments. The script first checks to see if there are any arguments entered on the command line ($whos.sh argument1 ..).
The first part of the script checks that there are in fact arguments passed to the script. The code “if [ $# -eq 0 ]” does just that. The special parameter $# reflects the number of arguments entered. If it equals 0 then an error message is sent to the display. The output of the echo command is redirected from standard output to standard error using the cryptic 1>&2 statement. This means the message is written to standard error, which is where error messages belong.
The next part is executed if there are input arguments. Let’s discuss the for loop statement and in particular the “for arg” which translates to “for arg in "$@"”. Here the bash shell expands “$@” into a list of quoted command line arguments “$1”, “$2”, “$3”, “$4”, etc. The variable “arg” is user defined. The script uses the for statement to work through the command-line arguments entered by the user. Using “$@” allows the for loop to treat an argument that contains a space as a single argument. In the input example given above, the double quotes around “Andrew Campbell” cause the bash shell to pass it to the script whos.sh as a single argument.
The awk command is used to do the heavy lifting in terms of pattern matching against the contents of the /etc/passwd file. Take a look at the awk command which is a pattern scanning and processing language - checkout man awk for the details of how it works and what the switches are. The awk utility extracts the first $1 and fifth $5 fields from the /etc/passwd file as discussed above. The -F: switch causes awk to use the colon as a field separator when it parses the file. What does awk stand for? You have to give it to the early Unix developers of these gems of command lines. The awk command (it’s really like a programming language in its own right) was developed at Bell Labs in the 1970s by, wait for it, Al Aho, Peter Weinberger and Brian Kernighan, yes, Aho Weinberger Kernighan ;-)
Let’s take a look at a snippet of the /etc/passwd file to see why that makes sense.
Clearly you can see that the first field is the username and the fifth the full name and the separation or delimiter is a colon “:”. The $1 and $5 variables are quoted so that the shell does not interpret them but passes them to the command so that only the awk utility will interpret their meaning. Do not confuse these field arguments used for parsing the /etc/passwd file with the positional parameters that map to the parameters given to the whos.sh script. The first and fifth fields of the /etc/passwd file are piped to the grep command which prints lines matching a pattern. The grep command - grep -i "$arg" - searches for the $arg (which represents one of the command-line arguments) in its input (which is the output piped from awk). The grep command uses the “-i” switch to force grep to ignore upper or lower case as it searches; grep displays any line piped to it from awk that matches an $arg (e.g., "Andrew Campbell") to the display.
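The pieces discussed above (the $# check, 1>&2, the for loop over the arguments, and the awk and grep pipeline) can be put together in a hedged sketch. It is not the course's whos.sh: it is written as a function named whos, and instead of reading the real /etc/passwd it reads a scratch file holding two made-up entries in the same colon-separated layout.

```bash
# Two invented entries laid out like /etc/passwd lines.
passwd_file=$(mktemp)
cat > "$passwd_file" <<'EOF'
campbell:x:529:100:Andrew Campbell:/home/campbell:/bin/bash
alice:x:530:100:Alice Example:/home/alice:/bin/bash
EOF

whos() {
    if [ $# -eq 0 ]
    then
        echo "Usage: whos id..." 1>&2    # send the message to standard error
        return 1
    fi

    for arg                              # same as: for arg in "$@"
    do
        # Print username and full name, then keep lines matching the argument.
        awk -F: '{print $1, $5}' "$passwd_file" | grep -i "$arg"
    done
}

match=$(whos "andrew campbell")
echo "$match"

whos 2>/dev/null                         # no arguments: usage error
no_args_status=$?
```

The single quotes around the awk program are what stop the shell from replacing $1 and $5 with the function's own positional parameters.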
Comment. The script is quite simple, yes? But it’s also powerful, right? Beauty of scripting.
When writing scripts it is important to write defensive code that checks that the input arguments are correct. In whos.sh the program checks if there are no arguments and then prints a usage message. In this example, the program checks for a specific number using the not equal to operator.
Let command. The let command carries out arithmetic operations on variables. It functions in a similar manner to the expr command. We give a number of examples below where variables are assigned values.
Examples of operators below include “+, *, %”. The modulus operator may be new to you. Definition from wikipedia: Given two numbers, a (the dividend) and n (the divisor), a modulo n (abbreviated as a mod n) is the remainder, on division of a by n. For instance, the expression ”7 mod 3” would evaluate to 1, while ”9 mod 3” would evaluate to 0.
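A few worked cases may help here. The sketch below uses let with +, * and %, and repeats one calculation with expr for comparison. The variable names are arbitrary. Quoting the expression lets us put spaces around the operators and stops the shell treating * as a filename wildcard.

```bash
let a=7+3               # a is 10
let "b = a * 2"         # b is 20
let "r = 7 % 3"         # r is 1, the remainder of 7 divided by 3
let "z = 9 % 3"         # z is 0, since 3 divides 9 exactly

e=`expr 7 % 3`          # the same remainder computed with expr

echo "a=$a b=$b r=$r z=$z e=$e"
```

One quirk worth knowing: let returns a failing exit status when the value it computes is 0 (as in the z line), so do not use its exit status as an error check.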
Just like most procedural languages, shell scripts have structure and function support. Typically, it is a good idea to use functions for common script code and to make scripts more readable and structured. The shell supports functions. In what follows, we simply add a function to guessprime as an illustration rather than good motivation for creating a function out of this code.
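To show the shape of a shell function, here is a small sketch; the function is_prime is invented for this example and is not claimed to be what guessfunction.sh contains. It takes its argument as $1 and reports its answer through its exit status, so it can be used directly in an if statement.

```bash
# Succeeds (status 0) if $1 is prime, fails (status 1) otherwise.
is_prime() {
    local n=$1
    local i=2

    [ "$n" -lt 2 ] && return 1

    # Trial division: try every divisor i while i*i <= n.
    while [ $((i * i)) -le "$n" ]
    do
        [ $((n % i)) -eq 0 ] && return 1
        i=$((i + 1))
    done
    return 0
}

for candidate in 31 33
do
    if is_prime "$candidate"
    then
        echo "$candidate is prime"
    else
        echo "$candidate is not prime"
    fi
done
```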
Script source: guessfunction.sh
The goal of looking at these example scripts is to learn more about the syntax and structure of the bash shell programming language. Each script introduces more concepts and complexity and gives a feel for the types of operations scripts can support.
The following script is based on using the trap command which is used to specify actions to take on receipt of signals. A common use is to clean up a script when it is interrupted. Linux supports a number of signal numbers and associated names. Type “trap -l” to see them.
Signals are asynchronous events raised by programs, Linux OS, or users. Typically they would terminate the program.
In the following script, the trap command (in this case the SIGINT which is the interrupt sent by pressing control-c) will delete a temp file set up in the tmp directory. The script sits in a while loop while the file exists. When the user enters a control-c the file is deleted and the script drops out of the while loop. Note, that we use the $$ that appends the process ID to the file for identification, if needed. Note also in the script we use the printf command instead of the echo command. The printf is preferable these days over echo. It also is very similar to the printf statement used in the C language. Look at man printf and use that in your scripts from now on. The script also uses the sleep command which suspends execution of the script and therefore the process for an interval of time (in this case 1 second).
The next part of the script resets the trap to take no action. In this case when the control-c is asserted the default behavior occurs which is to terminate the script. Because the script terminates in the while loop the final print command is not executed.
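The trap mechanism described above can be sketched as follows. This is an illustration under two assumptions, not the course's trap.sh: the script sends SIGINT to itself with kill -INT $$ in place of a user pressing control-c, and the wait loop is capped at three passes so the example cannot hang. The file name trap_demo and the variable handled are invented.

```bash
tmpfile="${TMPDIR:-/tmp}/trap_demo.$$"     # $$ makes the name unique to this shell
touch "$tmpfile"
handled="no"

# On SIGINT: remove the temp file and record that the handler ran.
trap 'rm -f "$tmpfile"; handled="yes"' INT

printf 'created %s\n' "$tmpfile"
kill -INT $$                               # stand-in for pressing control-c

# Wait while the file exists (bounded here so the sketch always finishes).
tries=0
while [ -f "$tmpfile" ] && [ "$tries" -lt 3 ]
do
    sleep 1
    tries=$((tries + 1))
done
printf 'handler ran: %s\n' "$handled"

trap - INT                                 # restore the default action for SIGINT
rm -f "$tmpfile"                           # tidy up in case the handler never ran
```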
Let’s look at the script and see if we can work it out. We will explain how it works below.
Script source: trap.sh
The shift command is used for doing command and text processing in scripts. Please check that out.
The second laboratory assignment is: Lab2 - Shell Programming.
All assignments and due dates can be found on the lab assignments.
In what follows, we provide some useful tips for making a script executable, starting and killing a background process (in this case the spy script discussed in the lab2 assignment) and debugging.
How do I run a script?
You need to make sure the script which you create using an editor (e.g., vim or emacs) is executable; by default it’s a plain text file. Assume I’ve just written spy.sh and saved it. Now let’s make it executable:
Now we can run spy.sh.
How do I run my dastardly spy.sh daemon process (in background)?
First our spy.sh process must be written in a “continuous loop” with a delay at the end of the loop. The loop in the process means it will run forever. We want spy to sleep for 60 seconds and then wake up and start spying. The delay is important. Just think of it as my job security ;-). OK, so we have written spy as a “while [ true ] loop” with a sleep 60 and now we want to launch it in the background so that when we log out of wildcat the daemon spy process remains running, forever.. so it thinks. Our spy program wouldn’t be much good if it was killed when we log off the machine, right?
Here is how we start our spy.sh in background.
From the above command line you can see we use “&” to push the spy.sh into background. OK, let’s see if it’s running using ps:
Note, that we can not only see the spy.sh process but also the sleep command executed as part of the spy.sh script. Why is this? Recall I mentioned in class that the shell creates a new process for every command we execute unless it’s a built-in. Most of the time our spy.sh will be sleeping. It checks things then sleeps .. ad nauseam. So it’s likely that the next time you do a ps we will see two processes running again: the spy.sh and its associated “sleep 60” command. Let’s check if that is the case:
Yes.
How do I kill my spy daemon?
If you start your spy script and then log off wildcat it will still be running. You want that to be the case because you do not know when I or the TA will log on and off of wildcat. You can assume we will stay on longer than 60 seconds (why is that important) so your spy will always catch us.
During the debug phase you will be revising and testing your spy script. So you will want to “kill” your previously started spy before launching your new one. Note, last year some students, unbeknownst to themselves, started 10s of spys and never killed them! Not good. Note, that your “while [ true ] loop” has a mail command in it - this is particularly dangerous. Imagine you launch a daemon that is in a tight loop (e.g., your sleep is missing from the code) mailing everyone on the machine (because your who awk filter is incorrect). Multiply that by 5 because you didn’t kill your previous buggy daemons. The result is that you tie up the machine’s processor (problematic, but no big deal), bring down the department email server and fill up people’s inboxes with zillions of gotcha mail. The result: Campbell is shipped back to the UK. So please take care with this spy program when you are debugging, I like living here.
Here is how we kill your daemon spy. Assume we start the spy in background on wildcat and then log off. When we log back on again to wildcat to carry on the assignment and do a ps we see:
There is no spy running!
Well there is but it is not associated with the shell that created it (or the parent process ID to be accurate) so we do not see it. But we need to be able to see it. Here is one way to find it. If we do a “ps -el” we see all the processes running on wildcat. And, if we use grep we can see all the spys running.
When you do this on Sunday you will see all the students’ spy scripts running. So if you want to just see the spy daemon you created (and importantly, all the spys you created and not killed, if that is the case) then type:
By first getting your user ID (mine is 529) and then using ps and grepping the output for all processes created by user 529 (me in this case) we see the spy daemon and a bunch of other processes that belong to me not visible when just using ps by itself.
Now that I can see my spy daemon I’m going to “kill it”. I’m not going to worry about the sleep process. Once its parent has been killed (its parent process is spy which executed the command sleep 60) the sleep process is an orphan, to use the Unix vernacular: it is adopted by the init process and simply exits when its 60 seconds are up. I hate all these negative vibes that Unix gives but that is how it is; the 70s were clearly a brutal time for OS designers.
To kill the spy daemon we need the PID (process ID). From the above pipeline output we can see the UID (USER ID) followed by the PID. The PID of my spy is 9070.
The final Shakespearean act now falls on us:
Let’s make sure it’s killed:
No spy.
And if I wait for another 60 seconds and do a ps
No sleep process. All cleaned up. Now I can safely start my new spy again.
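The start, check, and kill cycle walked through above can be rehearsed safely with a harmless background job. In this sketch a plain sleep 60 stands in for spy.sh. Two things here are not from the walkthrough and are standard bash and Unix features used for convenience: $! holds the PID of the most recent background job, and kill -0 sends no signal at all but reports through its exit status whether the process exists.

```bash
sleep 60 &                   # stand-in for: ./spy.sh &
pid=$!                       # PID of the background job we just started

if kill -0 "$pid" 2>/dev/null
then
    before="running"
fi

kill "$pid"                  # send the default signal (SIGTERM) to terminate it
wait "$pid" 2>/dev/null      # collect the finished job so no trace is left

if kill -0 "$pid" 2>/dev/null
then
    after="running"
else
    after="gone"
fi

echo "before kill: $before, after kill: $after"
```

Because the job was started from this shell we already know its PID. After logging out and back in, as in the walkthrough, you have to look the PID up with ps and grep first.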
Simple Debugging Tips.
When you run a script you can use printf or echo to print out places in your script where the execution reaches e.g., “echo Got here” or print out the contents of script variables as a sanity check: for example, in spy you need to maintain a number of arrays one of which is needed to determine if users are logged in - echo ${USERS_LOGGEDIN[$USERCOUNT]}.
Another tip: if the script gives you a syntax error; for example:
The error is on or around line 13. In emacs edit the file ./count.sh again and then “goto line 13” using the sequence of key strokes “ESC g g” that is hit the ESC key and hit g twice. Then, enter the line number 13 and you will be brought to that line. Now fix the bug.
Here are some good links to bash scripting information.