In this lecture, we continue our discussion of the Linux shell and its commands. The shell is a command line interpreter and invokes kernel level commands. It also can be used as a scripting language to design your own utilities. We will discuss scripting as part of a future lecture on shell programming.
We plan to learn the following from today’s lecture:
There are many hidden files as you can see from below. Perhaps the ones that interest us right now are:
The .bash_profile is executed each time you log in, the .bashrc each time an interactive non-login shell starts (for example, when you open a new terminal window), the .bash_history records the last N commands executed by the shell, and .bash_logout is executed when you log out. Let’s look at snippets from the .bash_profile and .bashrc. We will use the “cat” command (concatenates or displays files) to view a file and print it to the standard output, aka the display.
Here is a snippet of .bash_profile
There are a number of interesting things to discuss here. First, we can see that the PATH environment
variable is set up and exported to the shell environment. Prior to that the script checks to see if there is a
.bashrc file - if [ -f .bashrc ] - and if so it executes it - source .bashrc. This is useful because .bashrc
has a number of aliases in it that need to be set up. Finally, in the snippet above we set up our cvs
environment.
Here is a snippet of .bashrc
You can look at your aliases, as follows:
Note the double quotes “ ” around the aliases below. They are needed because single quotes are already used for '^d'. The caret ^ is a special character used by grep (take a look at man grep) and denotes the beginning of a line. The two commands below look for lines with d as the first character or, when using the -v switch, for lines that do not start with d.
alias lf="ls -l | grep -v '^d'"
alias ldir="ls -l | grep '^d'"
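To see what the grep patterns in these aliases do, here is a minimal sketch that runs the same pipelines in a throwaway directory. The names subdir and notes.txt are made up for illustration, and the extra filter that drops the "total" line is an addition that is not part of the aliases.

```shell
# Work in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/subdir"
touch "$demo/notes.txt"

# In "ls -l" output, lines for directories start with the letter d.
dirs_only=$(ls -l "$demo" | grep '^d')

# -v inverts the match: keep every line that does NOT start with d.
# The "total" summary line is also dropped here to keep the output tidy.
files_only=$(ls -l "$demo" | grep -v '^d' | grep -v '^total')

echo "directories:"
echo "$dirs_only"
echo "everything else:"
echo "$files_only"
```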
Feel free to copy my .bashrc (my aliases) and .bash_profile over to your home directory:
$ cp ~campbell/.bashrc .
And you can show all the environment variables associated with your shell, as follows:
You might start to recognize some of these environment variables, e.g., SHELL, PATH, HOSTNAME, TERM, HISTSIZE, SSH_CLIENT, USER, MAIL, PWD, PS1, HOME, LOGNAME, DISPLAY. Some of them you haven’t seen before but they are intuitive.
The thing to note about the snippet above is that we set up a number of aliases. In addition, we set up the prompt (called PS1). Instead of using the standard “gcc filename” we have an alias called mygcc which is much more rigorous in compilation and generates an a.out.
Try this. Edit your .bashrc file and add the alias for mygcc (above) and then recompile hello.c using the “-o” switch. You must give a name for the executable. For example, mygcc -o hello hello.c. In this case, the executable will be called hello rather than a.out (which is the default if there is no -o option).
Word of warning. Whether using gcc -o hello hello.c or mygcc -o hello hello.c, you must take care not to get the order of the files wrong in relation to the -o switch, which tells the compiler that the name of the file following the -o switch will be the name of the executable. One student compiled the correct way, mygcc -o hello hello.c (producing an executable hello correctly), and then recompiled but got the order wrong: mygcc -o hello.c hello. What the gcc compiler did wasn’t pleasant. It took the executable hello as the source file and hello.c as the name of the executable to be created. The result was that the real source file hello.c disappeared! So please be careful: the -o tells the compiler that the executable it creates should be given the name that follows the -o. You can always use the default a.out and not use the -o option if concerned.
The other gcc switches are important to use because they make sure we write good clean code. Let’s discuss what these gcc options are (-Wall -pedantic -std=c99). -Wall turns on all optional warnings which are desirable for normal code. -pedantic checks programs for strict ISO C conformance and issues all the warnings demanded by strict ISO C. -std indicates the language standard; here we use c99. By using these options the compiler forces us to resolve various warnings that we would not see if we just used gcc without any switches. Always use these options (-Wall -pedantic -std=c99) from now on. You can do a man gcc and look at the option meanings for the nitty gritty details on these settings.
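The original mygcc alias is not reproduced in these notes. Assuming it simply bundles the three switches discussed above, a line like the following in .bashrc would define it; treat this as a sketch rather than the exact alias from the lecture.

```shell
# Assumed definition: gcc plus the three switches described in the text.
alias mygcc='gcc -Wall -pedantic -std=c99'

# Print the definition back to confirm the shell has recorded it.
alias mygcc
```

With this in place, mygcc -o hello hello.c would be run by the shell as gcc -Wall -pedantic -std=c99 -o hello hello.c.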
Another good reason to use aliases: The Linux command “rm *” entered at the terminal would delete all files in your directory. But with the alias rm=’rm -i’ set, the user entering “rm *” is prompted interactively to confirm that each file should be deleted. Believe me, this is a life saver! As the Nike saying goes: Just do it.
To date we have seen Linux programs using default input and output - called standard input and output - where the keyboard is the standard input and the display the standard output. The Linux shell is able to redirect both the input and output of programs. As an example of output redirection consider the following. We want to store who is logged on to the spruce computer in a file called “loggedin”. Then we want to append the date to that file. We can use the who and date commands with output redirection to achieve this. We will use cat to look at the file’s evolution.
The output redirection writes the output to the file in this case and not to the standard output, which is the display. Note that the > operation created the file which did not exist before the output redirection command was executed. OK, now we will append the date to the file. Note that the contents of the file are not overwritten but appended by using the >> double > output redirection character.
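The difference between > and >> can be sketched in a few lines. This version uses echo in place of who so that the result is predictable on any machine, and it works in a temporary directory; both choices are assumptions made for the illustration.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# > creates the file if it does not exist (and overwrites it if it does).
echo "first line" > loggedin

# >> appends to the end of the file instead of overwriting it.
date >> loggedin

# The file now holds two lines: the echoed text, then the date.
cat loggedin
lines=$(wc -l < loggedin | tr -d ' ')
```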
The shell also supports input redirection. This provides input to a program from a file (rather than the keyboard). First we create a file using output redirection. The input to the cat command comes from the standard input (i.e., the keyboard). The shell redirects (>) the cat command’s output to the primes file. The primes are input at the keyboard and control-d is used to signal the end of the file (EOF). Control characters given at the standard input can interact with the process control. We will discuss this in the section on process control.
Input redirection "<" tells the shell to use a file (in the example below) as input to the command rather than the keyboard.
In the input redirection example below primes is used as input to cat which sends its standard output to the screen.
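As a small sketch of the same idea in script form, the file is created here with printf instead of typing at the keyboard, and the sample numbers are made up. The point is that cat produces the same output whether the file is named as an argument or supplied on standard input with <.

```shell
demo=$(mktemp -d)

# Create the primes file non-interactively (stands in for typing + control-d).
printf '%s\n' 2 3 5 7 11 > "$demo/primes"

# cat opens the file itself when given a filename argument...
from_arg=$(cat "$demo/primes")

# ...whereas with < the shell opens the file and cat just reads standard input.
from_stdin=$(cat < "$demo/primes")

echo "$from_stdin"
```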
Note, that there are a number of Linux commands (e.g., cat, sort) that allow you to provide standard input if you do not specify a file on the command line. For example, if you type cat (cr) (carriage return) then the command expects input from the standard input.
Linux (really developed under Unix) also supports a powerful operator for passing data between commands: the pipe operator "|". Pipes connect commands that run as separate processes; as data becomes available, the processes are scheduled. Pipes are a clever invention since separate temporary files between processes are not required. Because commands are implemented as processes, a program reading an empty pipe will be “suspended” until there is data ready for it to read. There is no limit to the number of programs (used interchangeably here with the term command) in the pipeline. In our example below, there are four programs in the pipeline, all running simultaneously and waiting for input to operate on:
What is the difference between pipes and redirection? Basically, redirection (’>’,’>>’,’<’,’<<’) is used to direct the output/input from/to a command to a file. Pipes (’|’) are used to redirect the output to another command. This allows us to “glue” together programs or filters to process the plain text sent between them (it is worth restating plain text between the processes - nice design decision). This supports the notion of reuse and allows us to build sophisticated programs quickly and simply. It’s a cool feature of Unix/Linux.
There are a number of commands above we haven’t come across: sort, uniq, grep and more.
Let’s look at what seems a complex set of commands that takes our primes file, sorts it, and then passes the output to the input of uniq (which removes duplicate lines from a sorted file). Hey, 18 is not a prime! So let’s remove it with the handy Linux command grep, which prints lines matching a pattern. We can use grep to remove 18 from the output of uniq and then pipe the result to the more command, which displays it in a paginated format on the screen. Phew, all that in one command line. BTW, looks like there are gaps in the primes - which ones are missing? You should look at man (i.e., look at the manual, e.g., man sort) to determine the specific meaning of the switch for sort and grep used below.
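A minimal sketch of such a pipeline follows. The contents of primes are invented for the illustration, the switches -n (numeric sort) and -v (invert match) are assumed to be the ones the lecture used, and more is left off so the result can be captured in a variable.

```shell
demo=$(mktemp -d)

# Made-up sample data: unsorted, with duplicates and the intruder 18.
printf '%s\n' 7 2 18 3 5 3 11 2 > "$demo/primes"

# sort -n : sort numerically
# uniq    : drop adjacent duplicate lines (which is why sort comes first)
# grep -v : print only lines that do NOT match the pattern
# Caveat: the pattern 18 would also remove a line such as 181.
result=$(sort -n "$demo/primes" | uniq | grep -v 18)

echo "$result"
```

The primes file itself is left untouched; only the data flowing through the pipeline is transformed.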
Note, the original file - primes - is not changed by executing the command line above. Rather, the file is read in by the sort command and the data is manipulated as it is processed by each stage of the command pipe line. Would the following command achieve the same result?
Here we combine input redirection from the file primes with the pipeline.
As we discussed above, Linux treats reading and writing to and from a file the same as to and from a device, such as the terminal. In essence, devices such as the terminal you use (use the command who -m to find the terminal) are represented as files by Linux; they have a filename and a path. The device name is pts/1 and its pathname is /dev/pts/1. We can see from the permission information that this is a “c” for character special file and that the user (i.e., the owner) has read and write access to the terminal.
In the example above we simply redirect the standard output to write directly to the device and the data is displayed on the screen! That is cool. We will discuss this in the next section. What is important here is that devices are treated as files, albeit special files. While the above example clearly illustrates this concept, it would be unusual to use such a command. But writing to /dev/pts/1 (in this example) writes the output to the screen. And similarly, by reading from /dev/pts/1 we could read whatever is entered at the keyboard.
The standard IO provides a set of defaults that include standard input (abbreviated to stdin), which is set to the keyboard input, standard output (stdout), which writes output to the display, and standard error (stderr), which is used to write any error messages to the display. These defaults are set up when you first log onto the system.
The cat command is useful to understand how standard input and output are handled by the shell. The cat command such as “cat loggedin” below copies the file to the standard output because by default the shell directs standard output to the display. If there is no file associated with cat as an argument then cat takes its input from the standard input (which is the keyboard by default) and writes it to the standard output (which is the screen by default). The cat command keeps copying from the standard input (one line at a time) to the standard output until it finds the end of file (EOF), which is control-d.
Standard error is used in the case a command does not want to mix up any error messages from a program with information sent to the standard output. By default error messages are sent to the screen unless redirected.
Every time a command is executed, it runs as a process with three open file descriptors for standard input (0), output (1) and error (2). So when you use the symbol for redirecting output > it is shorthand for ‘‘1>’’. Similarly, ‘‘<’’ is shorthand for ‘‘0<’’ to redirect the standard input. In the case of standard error the symbol ‘‘2>’’ redirects the standard error. In the example below, the file fred does not exist.
When the first cat command is executed the standard error is written to the screen, and in the second example the standard error is written to the errorfile.
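The second case can be sketched as follows, run in an empty temporary directory so that fred is guaranteed not to exist. Capturing the exit status in a variable is an addition for the illustration.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# fred does not exist, so cat fails and writes a message to descriptor 2.
# 2> sends that message to errorfile instead of the screen.
status=0
cat fred 2> errorfile || status=$?

# Nothing appeared on the screen; the complaint is in the file.
cat errorfile
echo "cat exited with status $status"
```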
There are a number of special characters supported by the shell - spaces, tabs, filename expansions characters, redirection symbols, etc. Special characters have special meaning and cannot be used as regular characters because the shell interprets them in a special manner. These special characters include:
We have already used a number of these special characters. No need to try and memorize them at this stage. Through use they will become second nature. We will just give some examples of the ones we have not discussed so far.
The ? special character matches any single character in a name of an existing file; for example,
The * special character performs a similar function to ? but matches any number of characters. The position of the * in the filename makes a difference. See the two examples below.
The [ ] special character gets the shell to match filenames containing individual characters; for example:
The examples above also include other special characters such as ! and ^. These are used to exclude files from the command; for example data[^12]. Or use “-”, which allows the inclusion of a range such as data[1-3].
If you need to use one of these special characters as a regular character you can quote or escape it. When you quote a special character in this manner you tell the shell not to interpret it. To quote a special character, precede it with a backslash \. If there are multiple special characters such as ** then both must be quoted: \*\*. You can also quote using single quotation marks, such as '**'.
There will always be times when we need to pass these characters to other programs, for example, when you write scripts and don’t want the shell to interpret options or commands.
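Since the shell expands these patterns before the command runs, echo is a safe way to experiment with them. The sketch below uses invented file names in a temporary directory, and it writes the exclusion as [!12], which is the portable spelling; bash also accepts the caret form.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch data1 data2 data3 data10 notes

one_char=$(echo data?)      # exactly one character after "data": data1 data2 data3
any_len=$(echo data*)       # any number of characters: also picks up data10
ends_in_1=$(echo *1)        # position matters: names ENDING in 1, here only data1
range=$(echo data[1-2])     # one character from the range 1 to 2: data1 data2
negated=$(echo data[!12])   # one character that is neither 1 nor 2: data3

echo "$one_char"
echo "$any_len"
echo "$ends_in_1"
echo "$range"
echo "$negated"
```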
In most cases pairs of double quotes or the backslash character override any special significance of a character:
Here the expr is a command that evaluates an expression.
The use of a pair of single quotes is similar though the environment variables are not expanded inside single quotes:
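The three quoting styles can be compared side by side. In this sketch the variable name and its value are made up; the expr line shows the backslash protecting * from filename expansion.

```shell
name=spruce

double="host is $name"     # double quotes: the variable is expanded
single='host is $name'     # single quotes: everything is kept literally
escaped="host is \$name"   # backslash: protects just the one character

# Without the backslash the shell would expand * to the files in the directory.
product=$(expr 6 \* 7)

echo "$double"
echo "$single"
echo "$escaped"
echo "$product"
```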
Another powerful quotation mechanism uses backquotes ` `. These enable the output of one command to be used as arguments to other commands. For example:
[campbell@spruce ]$ echo the date is now `date` and it is a New Year plus 5 mins
the date is now Tue Jan 1 00:05:00 EST 2008 and it is a New Year plus 5 mins
The output of a backquoted command can be assigned to a variable and then used in other commands. For example, you want to capture all the files in a directory on a monthly basis.
[campbell@spruce assign-1]$ MONTH=`date | cut -d ' ' -f2`
You can use -d ' ' (with a space between the quotes) or '-d ' (space after the d). The -d switch sets the delimiter character, which is a space in this example. The default is the tab character.
Notice the standard single quotes are used around the -d switch but back quotes are used to make the assignment to the variable MONTH.
The cut command is very useful. It removes sections from each line of files. The date command displays Wed Jan 2 18:56:19 EST 2008. The options select the “Jan” item from the standard output of the date command. Jan is the second field (hence the -f2 switch) in the stream. The “-d” switch is used by cut as the delimiter and tells it to use a single space as the delimiter rather than the default TAB.
If you wanted to append the month to a file backup you can use quoting again:
cp hash.c hash.c.bac.`date | cut '-d ' -f2`
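Because the output of date changes from run to run (and its layout depends on the locale), the sketch below feeds cut a fixed sample line copied from the text instead of calling date. The variable names sample and backup are made up for the illustration.

```shell
# Fixed stand-in for the output of date, taken from the example above.
sample='Wed Jan 2 18:56:19 EST 2008'

# Backquotes capture the output of the pipeline; cut splits on single
# spaces (-d ' ') and keeps the second field (-f2).
MONTH=`echo "$sample" | cut -d ' ' -f2`

# The captured value can then be used to build a file name.
backup="hash.c.bac.$MONTH"

echo "$MONTH"
echo "$backup"
```

One thing to watch with real date output: some systems pad a single-digit day with an extra space, which shifts the fields after the month, though the month itself remains field 2.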
Linux is a multi-user, multi-tasking OS. Multi-tasking allows multiple processes to execute at the same time (or to be scheduled in a manner that gives the feel of executing at the same time). We have seen this already where each xterm window runs its own bash process. Many times we want to start running a command and, while it is executing, do other work.
The shell allows you to put the command in background mode using the & symbol as above. If you type “fg” you can bring the background process back into the foreground.
Processes can also be suspended while running in the foreground by using “control-z”. The process is now suspended. Below we first run the command line in background mode, bring it back into the foreground, suspend it with control-z, check out the process status.
Here we see one process for each of the following: the shell, grep, wc, and ps itself. A total of 4 processes are running (including the shell that was started when we logged in). Finally, we bring the command line back into the foreground and it completes its job. The command ‘‘grep campbell@cs.dartmouth.edu Sent | wc’’ simply computes the number of emails campbell@cs.dartmouth.edu has sent - quite a few.
The wc command (short for word count) prints the number of newlines, words, and bytes in files.
Also, you can directly use the job number if you have commands in the background and you want to bring them to the foreground.
Here we bring sleep 100 back into the foreground. The sleep command delays a process for a specified amount of time. Also, note that emacs fred is in a suspended state - “control-z” is used once emacs fred is executed on the command line.
You can also kill a process for whatever reason. This is done using the kill command, which terminates a process. Here is an example of kill. First we run the man ps command and suspend it using control-z. We do a ps to find the process ID (PID) and then use kill to terminate the process and command.
The general form is kill -9 PID. The “-9” option is interesting: 9 means issue the KILL signal, which is a non-catchable, non-ignorable kill. It is deadly! The other argument is the PID that we get from the ps command.
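Job control with fg and control-z needs an interactive terminal, but the background and kill steps can be sketched in a script. Here the shell variable $! (the PID of the most recent background command) stands in for reading the PID off the ps listing, and wait is used to collect the exit status; both are substitutions made for this illustration.

```shell
# Start a long-running command in the background with &.
sleep 100 &
pid=$!          # PID of the background process just started

# Signal 9 (KILL) cannot be caught or ignored by the process.
kill -9 "$pid"

# wait returns the exit status of the process; a process terminated
# by a signal reports 128 plus the signal number.
status=0
wait "$pid" || status=$?

echo "process $pid ended with status $status"
```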
You will need to archive complete directories and compress them - for example, homework assignments will be submitted this way.
First you use the tar utility (short for tape archive - when we used tapes, probably before you were born!) to create an archive of all the files in the directory (using the “-c” switch) and name it (using the “-f” switch) to something appropriate.
Essentially, tar packs and unpacks files.
We also use the “-z” switch, which uses the gzip command to compress the archive once it is created. We call the file directoryname.tar.gz, where the directory name in this example is assign-1.
The command leaves the original directory and files intact. Assume you emailed the assign-1.tar.gz to the instructor or TA. They could reverse the process and create a new directory of the same name with all the files in. But before we do that we check what is in the archive by using the “-t” switch which simply lists the files in the assign-1.tar.gz.
Note that the “-x” switch with tar eXtracts the files after the archive is uncompressed (“-z” switch).
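The full round trip (create, list, extract) can be sketched as follows. The file hello.c and the directory unpacked are invented for the illustration, and the -C switch, which tells tar where to extract, is an addition so that the original directory is not overwritten.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# A stand-in assignment directory with one file in it.
mkdir assign-1
echo 'int main(void) { return 0; }' > assign-1/hello.c

# -c create, -z compress with gzip, -f name of the archive file.
tar -czf assign-1.tar.gz assign-1

# -t lists the contents without extracting anything.
listing=$(tar -tzf assign-1.tar.gz)
echo "$listing"

# -x extracts; -C unpacked puts the result in a separate directory.
mkdir unpacked
tar -xzf assign-1.tar.gz -C unpacked
```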
We will use this methodology for submitting code. Along with the code is a README file that would explain how to make and run the software.
Each laboratory in Sudikoff has a laser printer. The printer is named after the lab: s001, s003, etc. To print a file to the printer you have to direct the print command to the specific printer; we can also check the status of our print job, for example,
Now the default printer has been changed from s2s_cs_dartmouth_edu to s2add_cs_dartmouth_edu. If you want to print to the default printer just type lpr filename.
The queue is empty so we assume all went well and the printout is ready to pick up. If your job is queued to the printer behind a long line of other jobs you might want to print it on another, less busy printer. So, as a good citizen and to save paper, you’ll need to remove your job from the queue. To do that use lprm (line printer remove). Here we assume that the print job for our document is 666 (you can find the job number by using the lpq -Ps001 command used above).
If you do not know which printers are connected to the computer do the following:
This command lists the printers (-p) and the default printer (-d). You can change the default using
Less Is More: The less and more commands are handy for quickly looking at files. The syntax is less filename and more filename. Take a look at the man pages to get the details of each. Similarly, head and tail display the beginning and end of a file, respectively.
Many times you want to find a file but do not know where it is in the directory tree (Linux directory structure is a tree - rooted at the “/” root) . The find command - walk a file hierarchy:
The second find example searches for a directory (-type d) called metrosense and prints it. The search starts from the current working directory “.”. The third example used “iname” (case insensitive search) instead of name (which is case sensitive) to find a file called map - note the file found is capitalized.
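A small sketch of the two searches described above, run against a made-up directory tree (projects, metrosense and Map are stand-ins chosen to mirror the names in the text):

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p projects/metrosense
touch projects/Map

# Start at the current directory, match directories only, by exact name.
found_dir=$(find . -type d -name metrosense -print)

# -iname ignores case, so the pattern "map" also matches "Map".
found_file=$(find . -iname map -print)

echo "$found_dir"
echo "$found_file"
```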
If you want to quickly determine the differences between two files (e.g., between versions of a file that has been updated), the diff command will compare the two files and show the differences on the display.
Many times you are not sure what the contents of a file are: it could be text, binary, compressed, or in a format specific to a certain application. In this case the file command is very useful. Below we can decode what the files are - some are obvious from their names, others are not. Sometimes there is a disconnect between a file name (e.g., trash.tar.gz) and its contents, and an application fails only because the name does not match the contents. In this case it is better to check.
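As a minimal sketch with two invented files, v1 and v2, that differ in their second line: diff prints the changed lines and also reports the result through its exit status (0 for identical files, 1 when they differ).

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
printf 'a\nb\n' > v1
printf 'a\nc\n' > v2

# Lines starting with < come from the first file, > from the second.
status=0
changes=$(diff v1 v2) || status=$?

echo "$changes"
echo "diff exit status: $status"
```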
Forgot who you are? Type:
Can’t remember where a program is located? Or which program (e.g., compiler) you are actually using.
The which command locates other commands and displays their full pathname. Note that the system may
have a number of different versions of a utility. Recall that when you type the command the shell searches
your path and runs the first instance of the utility it finds. This may not be the one you want; for example,
you might be running a local version of a program in, say, “bin” rather than /bin/, depending on your
path. You can find out which version or copy of the program will run by using the which
command.
The whereis command searches for files related to the utility by looking in standard locations rather than your search path.
The slocate command (a secure version of locate) searches for files on the local system.