In the last lecture we learned to write our first C program and looked at the compiler (gcc) tool chain. In this lecture, we will discuss the Linux shell and its commands. The shell is a command line interpreter: it reads the commands you type and asks the kernel to run the corresponding programs. It can also be used as a scripting language to design your own utilities. We will discuss scripting as part of a future lecture on shell programming.
Since we do not recommend that you buy a Linux book, here are some very good references and free online books - see resources.
If you need help on the meaning or syntax of any Unix shell command, you can use the manual (man) pages or an online Unix command reference.
We plan to learn the following from today’s lecture:
OK. Let’s get started.
The shell is the Linux command line interpreter. It provides an interface between the user and the kernel and executes programs called commands. For example, if a user enters ls then the shell executes the ls command. The shell can also execute other programs such as applications, scripts, and user programs (e.g., written in C or the shell programming language).
You will get by in the course by becoming familiar with a subset of the Linux commands.
Linux has often been criticized for being very terse (it’s rumored that its designers were bad typists). Many commands have short, cryptic names - vowels are a rarity:
We will learn to use all of these commands and more.
Linux command output is also very terse - the default action on success is silence. Only errors are reported. Linux strongly supports the philosophy that each tool should do one job and do it well. Linux commands are often termed tools or utilities - familiarity with the “top 20” will be a great help.
Instructions entered in response to the shell prompt have the following syntax:
The brackets [] indicate that the arguments are optional. Many commands can be executed with or without arguments. Others require arguments to work correctly (e.g., cp sort.c anothersort.c); if the arguments are missing, they report an error. To avoid an explosion in the number of commands, most commands support switches (i.e., options) that modify their actions.
For example, let’s use the ls command with the -l option switch to list the file c.tex in long format. Switches are often single characters preceded by a hyphen (e.g., -l). Most commands accept switches in any order, though they must appear before all “true” arguments (usually filenames). In the case of the ls example below, the command arguments take the form [options] filename[s]. Options therefore modify the operation of the command and are interpreted by the program called by the shell, not by the shell itself.
In fact, the command name (ls) is argument 0, the -l switch is argument 1, and the filename is argument 2. Some commands also accept their switches grouped together. For example, the following switches to ls are identical:
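As a minimal sketch of grouped switches, the commands below run ls with -l and -a given separately and then grouped, and compare the results. The scratch directory and the file names in it are made up for the illustration.

```bash
# Work in a scratch directory so the listing is predictable.
demo_dir=$(mktemp -d)
touch "$demo_dir/c.tex" "$demo_dir/.hidden"

# The same two switches, first given separately and then grouped.
separate=$(ls -l -a "$demo_dir")
grouped=$(ls -la "$demo_dir")

if [ "$separate" = "$grouped" ]; then
    echo "ls -l -a and ls -la produce identical output"
else
    echo "the two listings differ"
fi

# Clean up the scratch directory.
rm -r "$demo_dir"
```

Grouping is a convenience offered by the ls program itself; the shell simply passes "-la" along as a single argument.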
The shell parses the words or tokens (command name, options, filename[s]) and, assuming the syntax is good, gets the kernel to execute the command.
Typically, the shell processes the complete line after a carriage return (cr) is entered and finds the program that the command line asks it to execute. The shell looks for the command either in the specified directory, if one is given (e.g., ./mycommand), or in the list of directories held in your $PATH variable.
You will need to look at your $PATH variable and update it from time to time to make sure the path to the code you want to execute is there. Typically, the startup files that execute when you log in (.bash_profile) or each time a new interactive shell starts (.bashrc) are the place to set up environment variables such as $PATH.
So where does the ls command executed above reside in the Linux directory hierarchy? Let’s use another command to find out.
The whereis or which commands are a useful sanity check if you want to know for sure which ls command is executed. For example, I could have written a program called ls and placed it in my working directory - probably not a good idea but could happen. In that case, if I entered ls - which command would the shell execute?
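The question above can be explored with a small, hedged experiment. The sketch below creates a hypothetical home-made ls script in a scratch directory, then uses command -v (a portable relative of which) to see which ls the shell finds with the normal $PATH and with the scratch directory placed first.

```bash
# Hypothetical setup: a scratch directory holding our own "ls" script.
my_bin=$(mktemp -d)
printf '#!/bin/sh\necho "this is my own ls"\n' > "$my_bin/ls"
chmod 755 "$my_bin/ls"

# Which ls does the shell find with the normal search path?
system_ls=$(command -v ls)
echo "normal PATH finds:   $system_ls"

# Put our directory first. The $( ) runs in a subshell, so the
# modified PATH does not leak into the rest of this session.
shadow_ls=$(PATH="$my_bin:$PATH"; command -v ls)
shadow_out=$(PATH="$my_bin:$PATH"; ls)
echo "modified PATH finds: $shadow_ls"
echo "running it prints:   $shadow_out"

# Clean up the scratch directory.
rm -r "$my_bin"
```

The shell takes the first match as it walks the directories in $PATH from left to right, which is why the order of entries matters.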
So we can see from the $PATH variable that /bin is in the path. Hence the shell can track down the ls binary to execute. Running ls from /bin also assumes that the file /bin/ls has its permissions set so that it is executable by all. We can check this.
Indeed it is. The file is owned by “root” and is executable by all.
If I set my $PATH variable to contain only “.” (the current working directory) and then execute ls, which is akin to not having my path set up correctly, the shell would not be able to find the correct program to execute (assuming I don’t have an ls binary with the correct permissions in my current directory). Here is what would happen.
You may come across this error (“-bash: ls: command not found” or variants thereof) when you install code or try and execute new programs you want to use or have written.
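A minimal sketch of this failure, assuming an empty scratch directory stands in for a badly configured path. The broken $PATH is confined to a subshell so the rest of the session is unaffected.

```bash
# An empty scratch directory stands in for a badly configured PATH.
empty_dir=$(mktemp -d)

# Run ls in a subshell whose PATH contains only that directory.
# The error message is discarded; we only keep the exit status.
( PATH="$empty_dir"; ls ) 2>/dev/null
status=$?

# Shells report exit status 127 when a command cannot be found.
echo "exit status with broken PATH: $status"

# With the normal PATH the same command succeeds.
ls "$empty_dir" >/dev/null
normal_status=$?
echo "exit status with normal PATH: $normal_status"

rmdir "$empty_dir"
```

When you see "command not found" for a program you know is installed, checking the contents of $PATH with echo is usually the first step.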
There are a number of shells available to a Linux user - so which one do you select? These are:
We will be using the bash shell in this course since it is standard fare with Linux machines. Let’s check what shell has been configured for your login account. If it’s not the bash shell then let’s change the shell using the change shell chsh command. To find out what shell is running, log in and look at the $SHELL environment variable. We will use the echo command which is akin to the print (e.g., printf in C) statement in many languages.
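As a small sketch of echo and its C-like relative printf: the shell expands variables such as $SHELL before the command runs. The fallback text "not set" is an addition for environments where SHELL is undefined.

```bash
# echo prints its arguments followed by a newline.
echo "hello from the shell"

# Variables are expanded by the shell before echo runs.
# SHELL may be unset in some non-login environments, hence the fallback.
echo "login shell: ${SHELL:-not set}"

# printf is the closer analogue of C's printf: it takes a format string.
greeting=$(printf '%s has %d letters' "bash" 4)
echo "$greeting"
```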
The tcsh shell is set. So we need to change the shell to bash. Note that galehead is the only machine on which you can change user configuration information such as passwords and shells. So we ssh to galehead and list all permissible shells supported by our local Linux machines using the “chsh -l” switch.
Now, let’s change the shell to bash. Note that our first attempt fails because chsh wants the full path name. We will discuss full and relative path names later in this lecture. Our second attempt using the full path name from above is successful.
Another way to see what shell is running is to use the process status (ps) command. We can see that the bash shell is running. The process ID and other status information of the process is displayed.
Note that most commands executed by the shell start a new “process”. (The exception is what are called builtins.) We will discuss processes in a future lecture.
The basic shell operation is as follows. The shell parses the command line and finds the program to execute. It passes any options and arguments to the program as part of a new process for the command, such as ps above. While the process (ps in this case) is running, the shell waits for it to complete; the shell is in a sleep state. When the program completes, it passes its exit status back to the shell. At this point the shell wakes up and prompts the user for another command. Note that it is the program that is executed (ps in this case) that checks whether the arguments passed to it by the shell are correct. It is not the job of the shell to do that level of parsing or error checking. If there are problems with the syntax (e.g., a wrong switch), it is the program itself that informs the user via an error message.
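The run, wait, and exit status cycle described above can be observed directly. This is a minimal sketch: $? holds the exit status of the last command, and the path /no/such/directory is a made-up name assumed not to exist.

```bash
# A command that succeeds returns exit status 0 to the shell.
ls / >/dev/null
ok_status=$?

# A command that fails returns a non-zero status. Here ls itself,
# not the shell, notices the bad argument and reports the error.
ls /no/such/directory 2>/dev/null
bad_status=$?

echo "successful ls returned $ok_status"
echo "failed ls returned $bad_status"

# Starting a command in the background with & makes the separate
# process visible; wait then sleeps until that process exits.
sleep 1 &
child_pid=$!
wait "$child_pid"
wait_status=$?
echo "process $child_pid finished with status $wait_status"
```

For a foreground command the shell performs the equivalent of the wait step automatically before printing the next prompt.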
The terms command and utility are used synonymously in these notes. The shell has a number of utilities built in, called builtins. When a builtin runs the shell does not fork a process; that is, it does not create a process specifically to execute the command. Therefore, builtins run more efficiently in the context of the existing process rather than incurring the cost of creating a new process to run the command. Typically, users are not aware whether a command runs as a builtin or as a standard forked command. The echo command exists both as a builtin to the shell and as a separate utility in /bin/echo. As a rule the shell will always execute a builtin before trying to find a command of the same name to fork. Bash supports a number of builtins including bg, fg, cd, kill, pwd, and read, among others.
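One way to check whether a name is a builtin is the type command, sketched below. The exact wording of the type output varies between shells, so the sketch only looks for the word "builtin" in it.

```bash
# type reports how the shell would interpret a name.
type cd
type echo

# Capture the description so a script can act on it.
cd_kind=$(type cd)
case "$cd_kind" in
    *builtin*) echo "cd runs inside the shell, no new process" ;;
    *)         echo "cd would be run as a separate program" ;;
esac

# command -v prints a path for external programs and the bare
# name for builtins.
echo_resolved=$(command -v echo)
ls_resolved=$(command -v ls)
echo "echo resolves to: $echo_resolved"
echo "ls resolves to:   $ls_resolved"
```

To bypass the builtin and run the separate utility, give its full path, for example /bin/echo hello.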
The Linux file system is a hierarchical file system. The file system consists of a very small number of different file types. These include text files, directories, character special files (e.g., terminals) and block special files (e.g., disks and tapes).
A directory is just a special type of file. A directory (akin to a Macintosh folder) contains the names and locations of all files and directories below it. A directory always contains two special files ’.’ (dot) and ’..’ (dot dot). Every file has a filename, typically limited to 255 characters on Linux file systems and usually drawn from ’A-Z a-z 0-9 _ .’, and an inode which uniquely identifies the file in the file system.
Directory names are separated by a slash ’/’, forming pathnames.
Files are accessed by referring to their relative or absolute pathnames.
Each account has a home directory. After you have logged in your shell will be executing in your home directory. So let’s log in and use the pwd command to find out where we are - we will be in the home directory of course.
Let’s list the contents of the home directory.
Recall the “d” in “drwx------” indicates that this file is in fact a directory. So we can move to that directory assuming we have the relevant permission - which we do in all cases. So let’s move around.
All files and directories have certain access permissions which constrain access to only those users having the correct permission. Let’s consider a couple of typical examples from above:
The first character of the access permissions indicates what “type of file” is being displayed.
Following the file type character, the next three triples (i.e., groups of three characters) from left to right represent the read, write, and execute permissions for, respectively, the owner (campbell in this case), the file’s group (faculty in this case), and the “rest-of-the-world”. To determine which group particular files are in, enter the “ls -lg” command.
What do these permissions mean?
After the file permissions comes the number of links to the file (e.g., 5), followed by the owner (campbell), the group (faculty), the size of the file in bytes (e.g., 128), the date and time of modification (e.g., Dec 24 14:33), and the filename (e.g., cs50).
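The fields of a long listing can be pulled apart with cut, as in this hedged sketch. It creates a scratch directory named cs50 with mode 750 purely for illustration; the owner, group, size, and date shown will be those of whoever runs it.

```bash
# Create a scratch directory and give it a known mode.
scratch=$(mktemp -d)
mkdir "$scratch/cs50"
chmod 750 "$scratch/cs50"

# -d lists the directory itself rather than its contents.
listing=$(ls -ld "$scratch/cs50")
echo "$listing"

# The first 10 characters are the file type plus three permission triples.
mode=$(printf '%s\n' "$listing" | cut -c1-10)
file_type=$(printf '%s\n' "$mode" | cut -c1)
owner_perms=$(printf '%s\n' "$mode" | cut -c2-4)
group_perms=$(printf '%s\n' "$mode" | cut -c5-7)
other_perms=$(printf '%s\n' "$mode" | cut -c8-10)

echo "type:  $file_type"
echo "owner: $owner_perms"
echo "group: $group_perms"
echo "world: $other_perms"

rm -r "$scratch"
```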
Note that shellscripts (which we will discuss in a future lecture) must have both read and execute permission - bash or any of the shells must be able to both read the shellscript and execute it. Program binaries, on the other hand, only need execute permission since they are loaded and run rather than read as text (recall that when we tried to view a.out with more we could not read it because it was an executable in machine code).
The permissions on files and directories may be changed with the chmod (change mode) command. When you own a file or directory you can use chmod to change the access permissions to the file or directory.
Only the three permission triplets may be changed - a directory cannot be changed into a plain file nor vice-versa.
Permissions given to chmod are either absolute or relative (i.e., symbolic).
Each triplet, read from left to right, is represented by the sum of the octal digits 4 (read), 2 (write), and 1 (execute). For example, rwx is represented by 7, rw- by 6, r-- by 4, and so on. The absolute octal values used with chmod are as follows:
The complete permission on any file is the sum of the values. For example, home directories are typically 700 which provides the owner with read, write, and execute permission but denies all access to others.
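The arithmetic can be sketched as two small shell functions. The names triplet_to_octal and mode_to_octal are hypothetical helpers written for this illustration, not standard commands.

```bash
# Convert one rwx triplet into its octal digit: r=4, w=2, x=1.
triplet_to_octal() {
    triplet=$1
    digit=0
    case "$triplet" in r??) digit=$((digit + 4)) ;; esac
    case "$triplet" in ?w?) digit=$((digit + 2)) ;; esac
    case "$triplet" in ??x) digit=$((digit + 1)) ;; esac
    echo "$digit"
}

# Convert nine permission characters (e.g. rwx------) into three digits.
mode_to_octal() {
    perms=$1
    owner=$(printf '%s\n' "$perms" | cut -c1-3)
    group=$(printf '%s\n' "$perms" | cut -c4-6)
    other=$(printf '%s\n' "$perms" | cut -c7-9)
    echo "$(triplet_to_octal "$owner")$(triplet_to_octal "$group")$(triplet_to_octal "$other")"
}

echo "rwx------ is $(mode_to_octal rwx------)"
echo "rw-r--r-- is $(mode_to_octal rw-r--r--)"
echo "rwxr-xr-x is $(mode_to_octal rwxr-xr-x)"
```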
If you wish others to be able to read one of your files - in this case a file named funny - then the command would be:
Because the file is only to be read, not written to, and because it is a plain file that is never executed (it is not a binary or shellscript), 644 makes good sense.
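A minimal sketch of the absolute form, run on a throwaway copy of funny in a scratch directory so that nothing real is changed:

```bash
# Create a throwaway file named funny in a scratch directory.
scratch=$(mktemp -d)
touch "$scratch/funny"

# 6 = rw- for the owner, 4 = r-- for the group, 4 = r-- for others.
chmod 644 "$scratch/funny"
funny_mode=$(ls -l "$scratch/funny" | cut -c1-10)
echo "funny:   $funny_mode"

# 700 on the directory itself: full access for the owner, none for others.
chmod 700 "$scratch"
dir_mode=$(ls -ld "$scratch" | cut -c1-10)
echo "scratch: $dir_mode"

rm -r "$scratch"
```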
Use the manual pages to read how chmod can be used in a relative or symbolic mode; for example, what would “chmod o+wrx cs50” do?
These symbolic arguments need to be used carefully. Here “o” means others and “u” means owner or user.
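One way to answer the question is to try it on a scratch directory, as sketched here. The directory named cs50 below is a throwaway stand-in, not the real course directory.

```bash
# A throwaway cs50 directory that starts out private to the owner.
scratch=$(mktemp -d)
mkdir "$scratch/cs50"
chmod 700 "$scratch/cs50"
before=$(ls -ld "$scratch/cs50" | cut -c1-10)

# o+wrx adds write, read, and execute for "others" only;
# the owner and group triples are left untouched.
chmod o+wrx "$scratch/cs50"
after=$(ls -ld "$scratch/cs50" | cut -c1-10)

# o-rwx takes those permissions away again.
chmod o-rwx "$scratch/cs50"
restored=$(ls -ld "$scratch/cs50" | cut -c1-10)

echo "before:   $before"
echo "after:    $after"
echo "restored: $restored"

rm -r "$scratch"
```

Giving the rest of the world write access to a directory lets anyone create and delete files in it, which is why the symbolic form deserves care.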
The change directory command (cd) allows us to move around the Linux directory hierarchy. Let’s combine pwd, ls, and cd to move around my local directories, which are rooted at /net/nusers/campbell.
There are also a number of special characters that can be used with cd as shorthand.
Moving to the parent directory:
Moving back to where we came from:
Moving to our home directory:
and back:
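The four moves listed above can be sketched in one short session. To keep the example self-contained it works in a scratch directory tree and, as an assumption for the demonstration only, temporarily points HOME at that scratch directory.

```bash
# Resolve the scratch path physically so pwd comparisons are exact.
scratch=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$scratch/cs50/lectures"

# Assumption for the demo only: treat the scratch directory as "home".
saved_home=${HOME:-}
HOME=$scratch
orig_dir=$(pwd)

cd "$scratch/cs50/lectures"
start=$(pwd)

cd ..                 # move to the parent directory
parent=$(pwd)

cd - >/dev/null       # move back to where we came from (cd - prints the path)
back=$(pwd)

cd ~                  # move to the home directory (plain cd does the same)
home_dir=$(pwd)

cd - >/dev/null       # and back again
back_again=$(pwd)

echo "start:  $start"
echo "parent: $parent"
echo "back:   $back"
echo "home:   $home_dir"
echo "again:  $back_again"

# Restore the original directory and HOME, then clean up.
cd "$orig_dir"
if [ -n "$saved_home" ]; then HOME=$saved_home; else unset HOME; fi
rm -r "$scratch"
```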
Here is a popular set of switches you can use with ls:
We can use a number of special characters to look at the files in a smart and efficient manner:
If I wanted to list just the directories or just plain files (i.e., non directory files) in a directory how would I do that? Use ls, right. Sorry, ls does not have an option to list only directories or just plain files. But we can use a combination of commands to do this!
We can write our own commands to do this job - we can use a combination of ls and grep to list directory names only or plain file names only.
First let’s just list all the files in the home directory - it includes one plain file and the rest are directories.
Now we use a combination of ls and grep joined by the pipe operator (|). More on this in a later lecture, but now we begin to see the power of the shell.
Let’s just list plain files:
Now, let’s use a modification on the above to just list directories:
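One possible form of these two pipelines is sketched below; it relies on the first character of each "ls -l" line being "-" for a plain file and "d" for a directory. The file and directory names are invented for the example.

```bash
# A scratch directory with two subdirectories and one plain file.
scratch=$(mktemp -d)
mkdir "$scratch/cs50" "$scratch/public_html"
touch "$scratch/notes.txt"

# Plain files: keep only the lines that start with "-".
echo "Plain files:"
ls -l "$scratch" | grep '^-'

# Directories: keep only the lines that start with "d".
echo "Directories:"
ls -l "$scratch" | grep '^d'

# grep -c counts the matching lines instead of printing them.
file_count=$(ls -l "$scratch" | grep -c '^-')
dir_count=$(ls -l "$scratch" | grep -c '^d')
echo "$file_count plain file(s), $dir_count directories"

rm -r "$scratch"
```

The pipe connects the output of ls to the input of grep, so neither program needs to know about the other.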
If you don’t know any of the above switches then use the man command. We can also use the -F switch to show which files are directories. Check it out.
It is handy to be able to list just the directories when moving around the file system. So we’ll add these commands to our bash files in the next lecture - we’ll create aliases of these commands so we can use them any time.
Well, there is an ls option to list directories, and indeed there are many ways to do this; for example:
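One commonly used approach, offered here as an assumption about what the original example showed, combines the -d switch with the shell pattern */. The pattern matches only directories, and -d tells ls to list the directories themselves rather than their contents.

```bash
# A scratch directory with two subdirectories and one plain file.
scratch=$(mktemp -d)
mkdir "$scratch/cs50" "$scratch/public_html"
touch "$scratch/notes.txt"

# The shell expands */ to the directory names before ls runs.
# The $( ) subshell keeps the cd from affecting this session.
dirs_only=$(cd "$scratch" && ls -d */)
echo "$dirs_only"

rm -r "$scratch"
```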
In the following sequence we will create a new directory, create two new files (using touch), move one file to another directory, delete the other file and remove the directory.
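A sketch of such a sequence, using made-up directory and file names inside a scratch area:

```bash
scratch=$(mktemp -d)
mkdir "$scratch/archive"                 # an existing directory to move a file into

mkdir "$scratch/lecture2"                # create a new directory
touch "$scratch/lecture2/keep.txt"       # create two new (empty) files
touch "$scratch/lecture2/junk.txt"

mv "$scratch/lecture2/keep.txt" "$scratch/archive/"   # move one file elsewhere
rm "$scratch/lecture2/junk.txt"                       # delete the other file
rmdir "$scratch/lecture2"                             # remove the now-empty directory

# Record what is left so the result can be inspected.
if [ -f "$scratch/archive/keep.txt" ]; then moved=yes; else moved=no; fi
if [ -e "$scratch/lecture2" ]; then dir_gone=no; else dir_gone=yes; fi
echo "file moved: $moved, directory removed: $dir_gone"

rm -r "$scratch"
```

Note that rmdir only removes empty directories, which is why the files are moved or deleted first.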
In the sequence above we reset the alias for rm which is set up in .bashrc. When you use the “rm -i” option, rm will ask you to confirm that you really want to delete each file. It is worth setting up this alias in your .bashrc file. It is easy to type “rm” and accidentally delete files. Therefore, the “-i” (interactive) option is a life saver. For example,
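The effect of -i can be sketched non-interactively by piping the answer to rm. In everyday use you would type y or n at the prompt, and the alias line in .bashrc would typically read alias rm='rm -i'. The file name below is invented for the example.

```bash
scratch=$(mktemp -d)
touch "$scratch/important.txt"

# Answer "n" to the confirmation prompt: the file is kept.
printf 'n\n' | rm -i "$scratch/important.txt" 2>/dev/null
if [ -e "$scratch/important.txt" ]; then kept=yes; else kept=no; fi
echo "after answering n, file still exists: $kept"

# Answer "y": the file is removed.
printf 'y\n' | rm -i "$scratch/important.txt" 2>/dev/null
if [ -e "$scratch/important.txt" ]; then removed=no; else removed=yes; fi
echo "after answering y, file removed: $removed"

rm -r "$scratch"
```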
In the home directory there are a number of interesting “hidden” files. Using the “-a” switch with ls lists all files, including those that begin with a dot (aka the hidden files).
But a simple ls will only show:
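The difference can be sketched in a scratch directory populated with two empty dot files and one ordinary file (the names are chosen for illustration):

```bash
scratch=$(mktemp -d)
touch "$scratch/.bashrc" "$scratch/.bash_profile" "$scratch/notes.txt"

# A simple ls skips names that begin with a dot.
plain=$(ls "$scratch")
echo "ls shows:"
echo "$plain"

# ls -a includes them, along with the special entries . and ..
all=$(ls -a "$scratch")
echo "ls -a shows:"
echo "$all"

# Count how many of the listed names start with a dot followed by a letter.
hidden_count=$(printf '%s\n' "$all" | grep -c '^\.[a-z]')

rm -r "$scratch"
```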
Make sure you do the reading for the next class. Typically we have reading for Wednesday and Friday classes.