CS 10: Winter 2016

Lecture 13, January 31

Code discussed in lecture

Traversing a rooted tree

When we visit the nodes in a rooted tree, we are traversing the tree. Of the many possible orders for visiting nodes, three are most common. We think of them recursively, in terms of when to visit a node and when to visit its subtrees. We'll show examples in terms of this binary search tree, where we imagine visiting a node as printing out its key:

Preorder: Visit the root, then recursively visit each of the subtrees, from left to right. For our example, a preorder search would print keys in the order 15, 6, 3, 2, 4, 7, 13, 9, 18, 17, 20.
Postorder: Recursively visit each of the subtrees, from left to right, then visit the root. For our example: 2, 4, 3, 9, 13, 7, 6, 17, 20, 18, 15.
Inorder: An inorder traversal applies only to binary trees. Recursively visit the left subtree, then visit the root, then visit the right subtree. For our example: 2, 3, 4, 6, 7, 9, 13, 15, 17, 18, 20. It is no coincidence that if you visit the nodes of a binary search tree using an inorder traversal, you visit the nodes in nondecreasing order of their keys. Think about why that should be.

These are basic approaches to designing tree algorithms. You are not limited to them. The book talks about an Eulerian tour of a binary tree. It even implements a template that visits each node three times (before its children, between its children, and after its children). The code supplies three methods for you to fill in called left, below, and right. I would have called them before, between, and after. But the idea is that you can be very flexible in how you traverse a tree. (I sometimes find it easier to write the whole traversal rather than fill in the template methods, but it is a matter of taste.) Eulerian traversals definitely abstract the idea of a traversal!

Going back to our binary tree code in BinaryTree.java, we see methods preorder, inorder, and postorder. Each appends data from a subtree to a List. But List is an interface, and in the driver, you can see that it's actually implemented by a Java LinkedList. Here's the preorder method:

  public void preorder(List<E> dataList) {
    dataList.add(data);
    if (this.hasLeft())
      left.preorder(dataList);  // recurse on left child
    if (this.hasRight())
      right.preorder(dataList); // recurse on right child
  }

The other two methods just move the call to add to different locations in the code:

  public void inorder(List<E> dataList) {
    if (this.hasLeft())
      left.inorder(dataList);   // recurse on left child
    dataList.add(data);
    if (this.hasRight())
      right.inorder(dataList);  // recurse on right child
  }

  public void postorder(List<E> dataList) {
    if (this.hasLeft())
      left.postorder(dataList);   // recurse on left child
    if (this.hasRight())
      right.postorder(dataList);  // recurse on right child
    dataList.add(data);
  }

Traversing rooted trees comes up quite often. For example, the TreeSet implementation of the Set interface stores information in a binary search tree, and the iterator returned by its iterator method is guaranteed to iterate over the set elements in increasing order. Not surprisingly, this iterator traverses the binary search tree using an inorder traversal.

There's another interesting use of traversals in binary trees. If all keys are distinct and you have both a preorder and an inorder traversal, you can reconstruct the entire binary tree. The method reconstructTree in the BinaryTree class does so. How does this method work? The first node in a preorder traversal is the root. The root is somewhere in the middle of an inorder traversal, with everything in its left subtree before it and everything in its right subtree after it. In the preorder traversal, all of the nodes in the left subtree are visited before anything in the right subtree. So we can build the tree recursively as follows:

Save the root's key.
Build lists of the keys in the left subtree, one list from the preorder traversal (leftPre in the code) and one list from the inorder traversal (leftIn in the code). Do so by iterating through the preorder and inorder lists until getting to the root in the inorder list.
Having constructed the leftPre and leftIn lists, use them to recursively build the left subtree.
Build lists of the keys in the right subtree, one list from the preorder traversal (rightPre in the code) and one list from the inorder traversal (rightIn in the code). Do so by iterating through the remainder of the preorder and inorder lists, starting from where we left off in step 1 and jumping over the root in the inorder list.
Having constructed the rightPre and rightIn lists, use them to recursively build the right subtree.
Create a new node for the root with the root's key, and link the reconstructed left and right subtrees to the root. Return the root.

How long does it take to traverse a rooted tree with n nodes? The answer is Θ(n) time. Here's an easy way to see it. We have to consider each edge twice. The first time is when we traverse the edge from the parent down to a child. The second time is when we traverse the edge from the child back to the parent. Conceptually, there is also an edge down into the root, representing the work performed in the initial call of the method. How many edges are there? It is a fact that in any tree—whether or not we think of it as a rooted tree—the number of edges is n − 1. Add in one more for the edge down to the root, and conceptually, there are n edges, each traversed twice. That makes the total time Θ(n).

Expression trees

We can think of expressions as being represented by trees. To keep things simple so that we don't have to think about operator precedence, let's consider only fully parenthesized expressions. Take the expression ((17.5 + (5.0 / x)) * (y - 4.0)). We can represent it by this binary tree:

Although we could build an expression tree on top of the binary tree code, we will write a new set of classes to represent expression trees. Each node will be its own type, so that we'll have nodes for variables, constants, and each of the binary operators. We won't include unary operators (such as unary minus) or ternary operators (such as the ? : operator of Java), but we could—and we could not include unary operators if all expression trees had to be binary trees.

The code for expression trees is in this zip file. (I will show you individual classes, but this is the easy way to download the whole thing.) First, look at Expression.java. It is an interface that says that an Expression must provide two methods: eval and deriv. The eval method evaluates an expression and returns its value in a double. The deriv method returns an expression that is the symbolic version of the derivative with respect to a particular variable. An Expression should also override toString.

Next, look at Constant.java. To use a constant, you call its define method, just giving a double with the value of the constant. The define method is static and calls the constructor, which is private, so that the only way to define a constant is by calling define. The constructor just saves the value of the constant in the instance variable myValue, and then the define method returns a reference to the Constant object. Because Constant implements Expression, anywhere you have declared a reference to an Expression, the reference may actually be to a Constant. The eval method just returns the value in myValue, and the deriv method always returns a constant whose value is 0.0, since the derivative of a constant with respect to any variable is 0. The toString method just returns the String representation of myValue.

This scheme, where the constructor is private and there is a static method that calls the constructor is a variant of a design pattern called the factory pattern. Instead of constructing objects directly, users of the class have to go through a method to create an object. One use is to allow the factory to decide to create a subclass object when asked to create a superclass object. Another is to give the user options on what parameters to pass to create the object. For example, complex numbers can be represented using polar or rectangular coordinates. Either way you present two double values, and so you cannot overload the constructor to accept either representation. You can create two factories, however, one of which expects polar coordinates and the other of which expects rectangular coordinates. For example:

class Complex {
  public static Complex fromCartesian(double real, double imag) {
    return new Complex(real, imag);
  }

  public static Complex fromPolar(double mag, double phase) {
    return new Complex(mag * cos(phase), mag * sin(phase));
  }

  private Complex(double a, double b) {
    // Code to store into private instance variables goes here.
  }
}

Complex c = Complex.fromPolar(1, pi);

It is less clear why we would want to use this pattern in the Constant class, but it will become clearer later.

The Variable class in Variable.java class is similar, but it has a symbol table to save variable values. These are saved in a Map that is implemented using a HashMap. We will look at maps in detail later, but for now all that you need to know is that put(key, value) adds the (key, value) pair into the map, get(key) returns the value corresponding to key, and containsKey(key) returns true if the key is in the map. Note that the symbolTable is static, so it is a class variable. We have only one symbol table shared by all of the variables.

As we saw in Constant the constructor is private and there's a define method, but this time it is overloaded. The two-parameter form takes the name of a variable as a String and a value for the variable, storing the name and value in the symbol table. The one-parameter form takes just the name of the variable, and it stores the name in the symbol table with null for the value. A variable whose value is null is defined but uninitialized. The constructor, called by both define methods, stores the variable's name in the instance variable myName. There is an assign method that stores the value in the symbol table entry for the variable. The eval method looks up the variable in the symbol table, using myName as the key, and it returns the value found in the symbol table for the key given by myName. The deriv method determines whether the variable v given as the parameter matches myName. If it does, then the derivative is 1; otherwise, the derivative is 0. The correct Constant is defined and returned in either case. The toString method just returns myName.

Some errors can occur when using variables, and instead of just printing error messages, we throw exceptions. We have exceptions for three types of errors:

If you try to define a variable more than once, the define method throws a MultiplyDefinedVariableException, defined in MultiplyDefinedVariableException.java.
If you try to assign to a variable that has not been defined via the define method, the assign method throws an UndefinedVariableException, defined in UndefinedVariableException.java.
If you try to evaluate a variable that does not have a value, the eval method throws an UnassignedVariableException, defined in UnassignedVariableException.java.

The classes in Sum.java, Difference.java, Product.java, and Quotient.java all perform eval by evaluating their operands and performing an operation on them. The only difference is the operator. Therefore, we have an abstract class in BinaryOp.java that has the template for evaluating a binary expression and another template for toString. These templates call abstract functions doOperation and getOperation, repectively. (The latter returns a String representation of the operator). It also has accessor methods to get the first or second expression.

Notice that evaluating an expression is really a postorder traversal of the expression tree. To evaluate a BinaryOp, first evaluate its two subtrees, and then apply the appropriate binary operator to the results of evaluating the subtrees.

Let's look at the Sum class more carefully. It provides the necessary two methods, which are fairly trivial. It supplies its deriv method, which adds the derivatives of its operands. But the interesting thing is Sum.make. Here we see the power of a factory method. It tries to simplify the expression. If the two expressions it is adding are constants, it adds them to get a new constant. If either operand is 0 it returns the other. So in three of the four cases it does not even create a Sum object! The other three operations are similar in how they try to simplify the resulting expression in their make methods.

Finally, look at the driver in ExpressionDriver.java. After the main method, we have several static wrapper methods. They alleviate the need to call the static methods in the above classes with the name of the class. We also use method names plus, minus, times, and over instead of the class names Sum, Difference, Product, and Quotient. In that way, we can construct the expression tree above using lines of code such as

Variable xVar = define("x", 2.0);
Variable yVar = define("y", 6.0);

Expression first = plus(constant(17.5), over(constant(5.0), xVar));
Expression second = minus(yVar, constant(4.0));
Expression third = times(first, second);

Or we can replace the last three lines by

Expression fourth = times(plus(constant(17.5), over(constant(5.0), xVar)),
    minus(yVar, constant(4.0)));

The driver also verifies that the three exceptions work correctly.

The Composite Pattern

This is an example of another design pattern, the composite pattern. Here we have individual things (constants and variables) and compositions of things (sums, differences, etc.), but they all are expressions and can be treated in the same manner. We perform the same operations on all of them, so we need not distinguish between individual things and groups of things. Any of the expression types can be used wherever an expression is expected, and everything works. For example, we take the derivative of the product of two huge, complicated expressions the same way that we take the derivative of 2*x.

The same pattern arises in some graphical drawing programs, where there are individual shapes (circles, rectangles, etc.) and groups of shapes (shapes are laid out and then combined into a single group). But all are shapes, and groups of shapes can be used wherever individual shapes can be used and vise versa, and all are subject to the same scaling, translation, and other operations. So you compose a bunch of shapes and get another shape.

Exceptions

So far we have handled errors by printing error messages to System.err. This is not a great solution to the problem of handling errors. Java has a much better one - exceptions. Exceptions are objects that are thrown when an error occurs. When an exception is thrown the program stops what it was doing and goes into error-handling mode. Exceptions are either caught and handled in the method that throws them or passed on the the method that called it. If that method does not catch the exception it is passed to the method that calls it, etc. If the main program does not catch the exception it kills the program and prints an error message. You probably have seen these messages for null pointers or subscript out of bounds.

Exceptions are an excellent way of passing an error on to a method that knows a reasonable way to handle them. Exceptions usually occur in a low-level method. Consider trying to open a file and failing. The file-opening method has no idea what the programmer wants to do if the file can't be opened. Maybe the method that calls it has no idea, either. If you don't know a good way to handle the exception it is best to pass it on. Eventually you will get to a high-level method that does know what to do about the situation. If the file name was just entered by the user it might be good to give her a second chance. Or it may be that the programmer has a backup file to use if the first cannot be opened. Or it could be that the right thing to do is to print an error message and quit the program. Exceptions give a way to let the low-level method that discovers the problem pass it on to a method that knows what to do about it.

There is a whole heirarchy of exceptions. The top level class is Exception. The heirarchy looks like:

Exception

IOException

EOFException
FileNotFoundException
MalformedURLException
UnknownHostException

ClassNotFoundException
CloneNotSupportedException
RuntimeException

ArithmeticException
ClassCastException
IllegalArgumentException

NumberFormatException

IllegalStateException
IndexOutOfBoundsException

ArrayIndexOutOfBoundsException

NoSuchElementException

InputMismatchException

NullPointerException

I should also note that there are two types of exceptions, checked and unchecked. When you call a method that could throw a checked exception the compiler checks to make sure that you handle it. You can handle it in either of two ways. The first one is to catch it. The second is to pass the buck and add a throws clause at the end of the method header. For example:


public String readString(String fileName) throws IOException, CloneNotSupportedException {

Which are the checked and unchecked exceptions? RunTimeException and all of its subclasses are unchecked. The rest are checked. The checked exceptions tend to be things that are outside of the programmers's control. The IOException exceptions happen when the file you are reading is corrupted or there is a hardware glitch or somebody moved the file from where it was supposed to be or gave it a different name. Since you can't control whether these things happen you should make provisions to handle them when they do.

RunTimeException and its subclasses are all thing that happened because you made a programming error or failed to check something that you could have checked. The thought was that checking for all of these things is a waste of time, and if one of them goes wrong you probably want to kill the program anyway. (NumberFormatException is something of an exception to the rule - it often occurs when you are trying to convert strings typed in by the user into numbers. But being unchecked doesn't mean that you can't check for them - the compiler just doesn't check to make sure that you do.)

So how do we catch an exception? I have modified Fahrenheit.java to demonstrate catching a NumberFormatException. The relevant part is in actionPerformed:


    // Get the text entered in the text field.
    String text = fahrenheit.getText();

    try {
      fahrenheitTemp = Integer.parseInt(text); // convert the text to an int
      celsiusTemp = (fahrenheitTemp - 32) * 5 / 9; // do the actual conversion

      // Convert the int celsius temperature to a String, and display it in
      // the result label.
      resultLabel.setText(Integer.toString(celsiusTemp));
    }
    catch(NumberFormatException ex) {
      resultLabel.setText("*****");
    }

The user supplies text via a fahrenheit.getText() call. However, what the user entered may not be a valid int. Therefore we put the conversion and the statements that deal with the result of the conversion into a try block. If the conversion works successfully the result is displayed and the catch block is ignored. If a NumberFormatException is thrown then catch is called with an argument of type NumberFormatException. There can be several catch blocks, one after another, and the first one whose parameter is of the type of the exception (including a supertype) gets executed. Here what happens is that the output field is filled with asteriks to show that an error occurred.

There is also another possibility - a finally block. Code in the final block is run whether the code in the try block succeeds or fails. It is used to clean up. A typical example is an open file that should be closed. To extend our earlier example:


	public String readString(String fileName) throws IOException {
	  BufferedReader input =  new BufferedReader(new FileReader(pathName));
	  String line = input.readLine();
	  input.close();
	  return line;
	}

The first line opens a file. (We will see more about file reading and writing soon.) If an IOException is thrown by the readLine call the file will never be closed. By rewriting it:


	public String readString(String fileName) throws IOException {
	  BufferedReader input =  new BufferedReader(new FileReader(pathName));
	  
	  try {
	    String line = input.readLine();
	  }
	  finally {
	    input.close();
	  }
	  
	  return line;
	}

the file will get closed whether an exception is thrown or not. It is possible to include both catch and finally blocks in the same try block, but it is not recommended because of confusion over what is happening in what order. Instead have a throws clause (as above) or nest two try blocks with the finally on the inside and catch on the outside.)

Finally, you can write your own exceptions, as was done in the expression tree code. An example is UnassignedVariableException thrown by Variable. No exception really described the problem, so I extended RuntimeException to create one whose name was more descriptive.