CS 10: Winter 2016

Lecture 3, January 7

Code discussed in lecture

From last time

Please read the class notes from last time on "Printing to Console" and "Java's increment and decrement operators." The ideas are fairly straightforward and I want to use class time for other things today. Also, look at the descriptions of primitive types. Some details are in the notes that I did not get to in class.

Some tricky details

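As in C, when you divide two integer types in Java, you get integer division, which truncates the result, so 11/3 = 3. (Python 2's / behaves similarly on ints; Python 3 spells integer division as //.) % is the remainder operation: 11%3 = 2. Truncation happens only when both operands are integers; if either operand is a double, / performs ordinary floating-point division. You can try this out in DrJava.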
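
A small sketch of these rules may help. The class and method names below are made up for illustration; the negative cases are included only to show that Java truncates toward zero.

```java
class DivisionDemo {
    // Integer operands: the quotient is truncated toward zero.
    static int quotient(int a, int b) {
        return a / b;
    }

    // The remainder left over after the truncated division.
    static int remainder(int a, int b) {
        return a % b;
    }

    public static void main(String[] args) {
        System.out.println(quotient(11, 3));    // prints 3
        System.out.println(remainder(11, 3));   // prints 2
        System.out.println(quotient(-11, 3));   // prints -3 (truncation, not rounding down)
        System.out.println(remainder(-11, 3));  // prints -2 (the sign follows the dividend)
        System.out.println(11 / 3.0);           // a double operand gives floating-point division
    }
}
```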

Java has type conversions. Some happen automatically. Others must be specifically requested via a process called casting. The distinction is whether the conversion can result in the loss of significant information. If an int is used where a double is expected, it is safe to do the conversion and it happens automatically. (Every int can also be represented as a double.) If a double is used where an int is expected, we have a problem. There is no good integer representation for 3.3, for instance. So we have to make a specific cast, which truncates the fractional part. You cast by putting the desired type name in parentheses to the left of the expression being cast:
```
double d = 3.3;
int n = (int) d;
```
Going in the "safe" direction is called widening; going the other way is called narrowing. The order of primitive types from narrowest to widest is
```
byte => short => int => long => float => double
```
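It is also widening to go from a char to an int, a long, a float, or a double, because the Unicode value in a char is stored as an unsigned 16-bit integer. (Going from a char to a short or a byte is narrowing and needs a cast, because the char's value may not fit.)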
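
Here is a short sketch that exercises both directions. The names ConversionDemo and narrow are hypothetical, chosen only for this example.

```java
class ConversionDemo {
    // Narrowing: the cast is required, and the fractional part is dropped.
    static int narrow(double d) {
        return (int) d;
    }

    public static void main(String[] args) {
        int n = 7;
        double widened = n;              // widening: automatic, no cast needed
        int narrowed = narrow(3.9);      // narrowing: 3.9 becomes 3

        char c = 'A';
        int code = c;                    // widening: char to int gives the Unicode value 65
        char next = (char) (code + 1);   // narrowing: int to char needs a cast

        System.out.println(widened + " " + narrowed + " " + code + " " + next);   // prints 7.0 3 65 B
    }
}
```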

Calling a method

Just as in C and Python, when you call a method, parameters get their values via call by value. Some terminology:

A parameter declared in a method header is a formal parameter. Remember that a formal parameter is also a local variable.
The value of a parameter supplied at the point of call is an actual parameter, or an argument.

In call by value, the value of each actual parameter is copied into the corresponding formal parameter. Actual parameters match up with formal parameters one by one, left to right. A formal parameter must be a single variable. An actual parameter can be any expression of the correct type.
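
As a minimal sketch of call by value (the method addOne is invented for this example), changing a formal parameter inside a method leaves the caller's variable untouched:

```java
class CallByValueDemo {
    // x is a formal parameter, and also a local variable of addOne.
    static int addOne(int x) {
        x = x + 1;      // changes only the local copy
        return x;
    }

    public static void main(String[] args) {
        int a = 5;
        int b = addOne(a);        // the value 5 is copied into x
        int c = addOne(a * 2);    // an actual parameter can be any int expression
        System.out.println(a + " " + b + " " + c);   // prints 5 6 11
    }
}
```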

How an object references itself

Suppose that an object needs a reference to itself. Java supplies the keyword this for just such a purpose. In Python, self accomplishes the same purpose. (Note that whereas Python requires the first parameter of a method to be self, you don't make a parameter for this in Java.)

Here's one simple reason you'd want to use this. Suppose that we wrote our Counter class as follows:

public class Counter {
  private int limit;    // upper limit on the counter
  private int value;    // current value
  ...

  public Counter(int limit) {
    // How to assign the parameter limit to the instance variable limit?
  }

  ...
}

So how could we assign the parameter limit to the instance variable limit? We can't say limit = limit; because that would just assign a variable's value to the same variable. The problem here is that we have an instance variable and a local variable (the formal parameter) with the same name. Java's rule is that when such a conflict occurs, the local variable wins. So the line limit = limit; assigns the value of the formal parameter limit to the formal parameter limit.

But we can use this to say that we mean the instance variable:

  public Counter(int limit) {
    this.limit = limit;
  }

When we write this.limit, we're saying "the instance variable limit in the object that the constructor is running on." Similarly, we could write the set method as

  public void set(int value) {
    if (value >= 0 && value < limit)
        this.value = value;
    else
        this.value = 0;
  }

The tests use the formal parameter, and the assignment is to the instance variable of the object that set is called on.

Now, you might wonder whether we really need this. After all, if we had the sequence

Counter c = new Counter();
c.set(4);

couldn't we just refer to c in the set method? No way! First, c is probably local to some other piece of code, and it's not known within the set method. Second, what if we had two Counter objects:

Counter c1 = new Counter();
Counter c2 = new Counter();
c1.set(4);
c2.set(7);

Now how do you know which Counter you want to use in set? Sometimes it's the one that c1 references, and sometimes it's the one that c2 references. The pronoun this always references the object that the method was called on. Problem solved.
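
Putting these pieces together, here is one runnable sketch. It is not the course's Counter class: the name CounterSketch, the accessor get, and the limits 12 and 60 are assumptions made so that the example is self-contained.

```java
class CounterSketch {
    private int limit;    // upper limit on the counter
    private int value;    // current value

    CounterSketch(int limit) {
        this.limit = limit;   // instance variable on the left, parameter on the right
        this.value = 0;
    }

    void set(int value) {
        if (value >= 0 && value < limit)
            this.value = value;
        else
            this.value = 0;
    }

    // Hypothetical accessor, added so the demo can show the result.
    int get() {
        return value;
    }

    public static void main(String[] args) {
        CounterSketch c1 = new CounterSketch(12);
        CounterSketch c2 = new CounterSketch(60);
        c1.set(4);    // inside set, this refers to the object c1 references
        c2.set(7);    // inside set, this refers to the object c2 references
        System.out.println(c1.get() + " " + c2.get());   // prints 4 7
    }
}
```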

Public and private variables and methods

When you declare an instance variable, static variable, or method, you should declare it as either public or private.

If a variable is public, then it can be read or written from anywhere in the program. If a variable is private, then only methods in its class can read or write it. Most of the time, instance and class variables should be private. That way, only the methods in the class can see them. That's important from the point of view of abstraction: code from outside the class should interact with objects in the class only through the methods. An exception to this is that we sometimes make final static constants public, so that they can be seen (but not changed) from outside of the class.

If a method is public, then it can be called from anywhere in the program. If a method is private, then only methods in its class can call it. Most of the time, methods are public. Private methods can be useful, however, especially as "helper" methods within a class that code from outside the class need not know about.

Later in the course, we'll see another way to declare variables and methods: protected.

Note: things are private to the class, not to the object. So one object can see another object's private variables, if they are in the same class.

Therefore if we added a method copyFrom to the Counter class which sets the value of this counter to the value of a counter passed as a parameter, the following would be legal:

public void copyFrom(Counter c) {
  myValue = c.myValue;
}
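
The access rules above can be sketched in one small example. Everything here (the class Tally, the constant MAX, the helper clamp) is hypothetical and only illustrates the rules; it is not code from the course.

```java
class AccessDemo {
    public static void main(String[] args) {
        Tally a = new Tally(3);
        Tally b = new Tally(8);
        a.copyFrom(b);
        // a.value would not compile here: value is private to Tally.
        System.out.println(a.get() + " of at most " + Tally.MAX);   // prints 8 of at most 100
    }
}

class Tally {
    public static final int MAX = 100;   // public constant: visible everywhere, never changeable
    private int value;                   // private: only code inside Tally can read or write it

    Tally(int value) {
        this.value = clamp(value);
    }

    // Private helper: code outside Tally need not know it exists.
    private static int clamp(int v) {
        return (v < 0 || v > MAX) ? 0 : v;
    }

    // Legal: other is a different object, but it belongs to the same class.
    public void copyFrom(Tally other) {
        value = other.value;
    }

    public int get() {
        return value;
    }
}
```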

Arrays

You have seen arrays in C and lists in Python. You give an index and get or store an element. As in C and Python, array indices in Java always start at 0. So array indices always go as 0, 1, 2, 3, 4, … . (Java is different from languages such as Pascal, where you can select a lower bound other than 0.)

Then again, just because indices start at 0, that doesn't mean you always have to use all the entries starting from 0. If you'd rather start from 1, go ahead. The cost is just the wasted space of the 0-indexed entry.

We can store any type in an array, as long as all entries are declared as the same type. So for starters, our array entries will be something simple, such as integers.

To declare an array of 10 ints you have to do two steps:

Create a variable that can hold a reference to an array of the proper type.
Create the actual array by using new.

int [] bozo;
bozo = new int[10];

This style is reminiscent of objects, and arrays are a special kind of object. As with objects, we can combine the two lines above into one:

int [] bozo = new int[10];

Our picture is the following:

[Figure: the variable bozo holds a reference to an array of 10 ints, indexed 0 through 9.]

Here,

int gives the type of each array element. The empty square brackets say that we have a reference to an array. Think of the whole notation int [] as saying "array of ints." (By the way, it doesn't matter how much space you leave between int and [], so you could write int[] bozo if you wanted to.)
bozo gives the name of the variable that stores a reference to the array.
new int[10] creates the array, and the 10 says how many elements are in it.

The 10 elements of this array are bozo[0], bozo[1], bozo[2], …, bozo[9]. Note that

An array of n elements has elements indexed from 0 to n - 1, only. An array of n elements does not have an element with index n.

Computer scientists like to start counting from 0.

You can think of array indices as being like subscripts. Our array bozo is like having bozo₀, bozo₁, bozo₂, …, bozo₉.

Technically, since an array is an object, we should not say that bozo is an array; we should say that bozo is a reference to an array. That would become incredibly tedious, however, and so we shall sometimes just say that bozo is an array. But you should understand that arrays are always objects, and that when we declare a variable to be an array, it's really a reference to an array.
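
One way to see that bozo really is a reference: assign it to a second variable and write through that one. This is a sketch only; the names alias and writeThroughAlias are invented for the example.

```java
class ArrayReferenceDemo {
    // Writes through a second reference and reads back through the first.
    static int writeThroughAlias() {
        int[] bozo = new int[10];   // a new int array starts out filled with 0
        int[] alias = bozo;         // copies the reference, not the 10 elements
        alias[3] = 42;
        return bozo[3];             // the same array, so this is 42
    }

    public static void main(String[] args) {
        System.out.println(writeThroughAlias());   // prints 42
    }
}
```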

You can make arrays of other sizes and of other types. To make an array blap of 30 doubles:

double [] blap = new double[30];

Array elements can be of any type. Even objects. Even other arrays!

Our first example of code that uses an array is in ShortList.java. We have a 10-element array named list. The first for-loop assigns the value i+1 to list[i] for i = 0, 1, …, 9. For example, list[5] is assigned the value 6.

Note how we set up the first for-loop:

for (int i = 0; i < N; i++)

As C programmers know, this scheme will be a very common paradigm for stepping through an N-element array. We start at 0 and continue as long as the index is strictly less than N. As soon as it hits N, we are out of bounds of the legal array indices, so we must stop. In other words, the loop test i <= N would be wrong, because within the loop body, we would be trying to access list[N], and no such array entry exists. An attempt to access list[N] (or any other array element out of range) would cause an ArrayIndexOutOfBoundsException at run time. (This situation shows another advantage of Java over C/C++, which would happily make the reference, returning garbage or overwriting other data or code.)

Remember:

The legal indices into an array of n elements are integers in the range 0, 1, …, n - 1. There is no element with index n. No indices are allowed to be negative, and no indices are allowed to be greater than or equal to n.

The most common paradigm for processing each element of an n-element array has a for-loop with the header

for (int i = 0; i < n; i++)

and the body uses the variable i to index into the array.
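
The following sketch checks these index rules directly. The helper readThrows is hypothetical; it catches the exception only so that the demo can report what happened instead of crashing.

```java
class BoundsDemo {
    // Returns true if reading a[i] throws ArrayIndexOutOfBoundsException.
    static boolean readThrows(int[] a, int i) {
        try {
            int ignored = a[i];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int n = 10;
        int[] list = new int[n];
        for (int i = 0; i < n; i++) {   // strictly less than n
            list[i] = i + 1;
        }
        System.out.println(readThrows(list, n - 1));   // prints false: n - 1 is the last legal index
        System.out.println(readThrows(list, n));       // prints true: there is no element n
        System.out.println(readThrows(list, -1));      // prints true: negative indices are illegal
    }
}
```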

The second for-loop in ShortList.java prints out the values in the array in increasing order of the index. That is, it prints in order list[0], list[1], list[2], up through list[9].

Java 5.0 added another way to access every element of an array. The third loop in ShortList.java shows it:

  for (int element : list) {
    // Print it out in increasing order using a foreach loop
    System.out.print(element + "  ");
  }

This style of loop is called a foreach-loop or foreach-statement. The idea is that the loop variable, which is element in this example, takes on each value in the array (list in this example). In each iteration of the foreach-loop, the loop variable takes on the next value in the array. Python programmers will recognize this style of loop as similar to for-loops in Python.

Foreach-loops have the advantage that you cannot make an indexing error. You don't have to worry about indices being too large or too small. There are a few disadvantages to foreach-loops, however:

You don't have access to the index within an iteration, so if you need to do something that depends on the index, you don't have it.
Foreach-loops access the array elements in only one order: at index 0, index 1, index 2, and so on. If you want to access the array elements in any other order, you cannot use a foreach-loop.
If you are using only some, but not all, of the array elements, a foreach-loop will access array elements that you're not using. Depending on what you do with the array elements, severe problems could result.

The bottom line on foreach-loops is that for the restricted cases in which they apply, they are wonderful. For all other cases (such as those listed above), you are best off avoiding them and instead using for-loops with explicit index variables (such as the first for-loop in ShortList.java).
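
To make the last disadvantage concrete, here is a sketch of a partially filled array. The scenario (an array of names with only the first two entries in use) and all identifiers are assumptions for illustration.

```java
class PartialArrayDemo {
    // Indexed loop: visits only the first count entries, the ones in use.
    static int totalLength(String[] names, int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += names[i].length();
        }
        return total;
    }

    // Foreach loop: visits every entry, including the unused null ones.
    static boolean foreachFails(String[] names) {
        try {
            int total = 0;
            for (String name : names) {
                total += name.length();   // throws when name is null
            }
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] names = new String[5];   // room for 5, but only 2 in use
        names[0] = "Ada";
        names[1] = "Grace";
        System.out.println(totalLength(names, 2));   // prints 8
        System.out.println(foreachFails(names));     // prints true
    }
}
```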

The fourth for-loop prints out the values in decreasing order of the index: list[9], list[8], list[7], down through list[0]. Note how the loop index i decreases in this loop. We could not have used a foreach-loop for this job.

The fifth for-loop prints out the array in decreasing order again, but using a slightly different method. The loop index i increases, but we use a more complex expression, N-i-1, in the array index. When i is 0, N-i-1 is 9. When i is 1, N-i-1 is 8, and so forth. Finally, when i is 9, N-i-1 is 0. This for-loop points out that the array index in a program can be any expression that evaluates to an integer in the correct range of indices.
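
For readers without the file at hand, the five loops can be sketched as follows. This is a reconstruction from the description above, not the actual ShortList.java; the class name ShortListSketch and the helper makeList are assumptions.

```java
class ShortListSketch {
    static final int N = 10;

    // First loop: list[i] gets the value i + 1.
    static int[] makeList() {
        int[] list = new int[N];
        for (int i = 0; i < N; i++) {
            list[i] = i + 1;
        }
        return list;
    }

    public static void main(String[] args) {
        int[] list = makeList();

        // Second loop: increasing order of the index.
        for (int i = 0; i < N; i++) {
            System.out.print(list[i] + "  ");
        }
        System.out.println();

        // Third loop: increasing order again, using a foreach loop.
        for (int element : list) {
            System.out.print(element + "  ");
        }
        System.out.println();

        // Fourth loop: decreasing order, with the index counting down.
        for (int i = N - 1; i >= 0; i--) {
            System.out.print(list[i] + "  ");
        }
        System.out.println();

        // Fifth loop: decreasing order, with the index counting up.
        for (int i = 0; i < N; i++) {
            System.out.print(list[N - i - 1] + "  ");
        }
        System.out.println();
    }
}
```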

There is an alternate way to create an array, using an initializer list:

  int [] list2 = {1, 3, 5, 9};

Instead of calling new, we put a list of numbers in curly braces on the right side of the assignment statement. The Java compiler counts the number of items in the list, verifies that they are all of type int, and creates an array of that length with the first item assigned to list2[0], the second to list2[1], etc. Initializer lists may be used only when you declare the array.

Note that when you do this, you don't necessarily know the length of the array. But the program shows that you can still determine the length of any array: list2.length gives the length of the array list2. Think of length as a final instance variable of an array object.
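
A brief sketch of an initializer list together with length (the method sum is a made-up helper):

```java
class InitializerDemo {
    // Uses a.length, so it works for an array of any size.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] list2 = {1, 3, 5, 9};           // the compiler counts the items for us
        System.out.println(list2.length);     // prints 4
        System.out.println(sum(list2));       // prints 18
        // list2.length = 5;  would not compile: length is final
    }
}
```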

Multidimensional arrays

In a multidimensional array, you can think of the array entries as themselves being arrays. For example, suppose we have the declaration

Rect [][] box = new Rect[6][8];

where Rect is a class. The usual way to think of box is as a 2-dimensional array with 6 rows and 8 columns:

[Figure: box drawn as a grid with 6 rows and 8 columns.]

Note that the entry in row i and column j is denoted box[i][j]. In some other languages, you might write box[i,j], but not in Java (also not in C/C++ or Python). You must always use a separate set of brackets for each index: box[i][j].

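A more accurate picture notes that we really have an array of 6 entries, each of which is an array of 8 entries. Therefore box[i] is a 1D array of references to Rect. The picture is:

[Figure: box references an array of 6 references, each of which references its own array of 8 Rect references.]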

A consequence of the fact that 2D arrays are really arrays of arrays is that the number of rows is box.length and the number of columns is box[0].length. (Any valid subscript could be substituted for 0.)

Note that so far all of the references in box are null. After we assign a Rect object to each reference the picture would look like:

[Figure: each of the 48 entries of box references its own Rect object.]

You can also use initializers with 2D arrays. For example, here is an initialized 3 × 4 matrix:

int [][] matrix = {{5, 3, 11, 2}, {7, 13, 3, 6}, {10, 0, -5, 9}};

Here, we have an array with three arrays of ints (one per row), each of which contains four ints.
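
The row and column counts, and the array-of-arrays view, can be checked with a short sketch. The method total is hypothetical, and String stands in for Rect so that the example needs no other class.

```java
class MatrixDemo {
    // Adds up every entry, visiting the matrix row by row.
    static int total(int[][] m) {
        int sum = 0;
        for (int i = 0; i < m.length; i++) {          // m.length is the number of rows
            for (int j = 0; j < m[i].length; j++) {   // m[i].length is the number of columns in row i
                sum += m[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] matrix = {{5, 3, 11, 2}, {7, 13, 3, 6}, {10, 0, -5, 9}};
        System.out.println(matrix.length + " x " + matrix[0].length);   // prints 3 x 4
        System.out.println(total(matrix));                               // prints 64

        String[][] box = new String[6][8];   // String stands in for Rect here
        System.out.println(box[2].length);   // prints 8: box[2] is itself an array
        System.out.println(box[2][5]);       // prints null: no object assigned yet
    }
}
```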

Polymorphism and Interfaces

Polymorphism comes from Greek words for "many" and "shape." So a polymorphic variable is one that can refer to objects of multiple types. It is easy to see why Simula, an early OO language designed for simulation, found this idea helpful. In simulation you may have many objects, perhaps representing animals. It would be useful to have an array that could hold any animal type. The first object in the array might be a fish, the second a frog, the third a wildebeest, etc. For the simulation, it would be useful to be able to go through the array and tell each animal to move. The fish would swim, the frog would hop, the wildebeest would run, etc. Therefore, we would like to have an array in which each element is of type Animal. Fish, Frog, Wildebeest, etc. would be subtypes of Animal. They should be able to do anything that a generic Animal can do, and perhaps other things as well.

Java gives us two ways to achieve this type of polymorphism. The first is to create an interface. That is what we will consider for the rest of this lecture. The second is inheritance, which we will see soon.

An interface is like a class where the method headers (name, return type, parameters) are given, but there are no bodies that implement these methods. We will see a large number of interfaces this term, because interfaces are ideal for specifying Abstract Data Types (ADTs). An ADT consists of some data and a collection of operations that can be performed on that data. "A collection of operations" is exactly what an interface defines. We will see interfaces List, Set, Map, and many others.

Let's look at a simple interface and how it can be used. The file GeomShape.java contains such an interface. It looks at first like a class definition, but instead of using the word class in the header, it uses the word interface. The other difference is that the methods are not implemented. Instead of a method body, each one has a semicolon after the method header, which makes them abstract methods. Traditionally, an interface has only abstract methods and constants. (Since Java 8, an interface may also contain default and static methods, which do have bodies.)

The classes Circle in Circle.java and Rectangle in Rectangle.java both implement the GeomShape interface. Note that the first line of the definition of both classes ends with implements GeomShape. (There could be a list of interface names, separated by commas.) The implements tag is a promise to include a definition for every abstract method in the interface. We say that the implementing class implements every method in the interface. It is an error to fail to implement any method from the interface, and the compiler will complain. (This is a slight lie; we will see later that classes can be abstract. But it is true about any class that can create objects.)

An implements tag is a promise to implement all the methods of the interface, but the class may include even more. For example, both the Circle and Rectangle classes contain constructors and toString. Moreover, the Rectangle class has a flip method; the Circle class does not, but that's just to show you that the two classes need not implement the same exact set of methods, as long as each implements at least the methods in the interface (which are the methods it has promised to implement).

The advantage is that a variable (including a formal parameter) can be declared using the interface name as if it were a class name. As long as the object actually referenced by the variable is from a class that implements the interface, it's OK. For example, we can declare a variable to be a reference to a GeomShape and have it actually be a reference to a Circle, a reference to a Rectangle, or, for that matter, a reference to any class that implements GeomShape. We can call any of the methods specified in the interface on the object. We know that the methods specified in the interface are defined for the object, because that's exactly what the implements tag tells us.

The program GeomShapeDriver.java demonstrates our interface in action. The main method creates an array shapes that contains references to GeomShape objects. shapes[0] refers to a Circle and shapes[1] refers to a Rectangle. In the for-loop, the methods areaOf, move, and scale are called on shapes[0] the first time through the loop and on shapes[1] the second time through the loop. Java uses the actual type of the object referenced by shapes[i] to decide at run time which areaOf, move, and scale methods to call. Deciding at run time which method is actually called is known as dynamic binding. It is a feature of polymorphism: the ability of a single variable to reference more than one type of object. It can be quite powerful.

Let's revisit what is happening here, because it is important and also very cool. We have declared shapes as an array of type reference to GeomShape. Except that there's no such thing as a GeomShape object. GeomShape is a placeholder for any class that implements the scale, move, and areaOf methods that appear in the GeomShape interface declaration. When we actually create an object that implements GeomShape, say a Circle, we may assign the reference to that object to a place in shapes.

The truly remarkable part happens when we call a method with shapes[i] as the reference to the left of the dot, e.g., shapes[i].areaOf(). At compile time, we have no idea from which class the object will actually be. Perhaps shapes[i] references a Circle, but perhaps shapes[i] references a Rectangle. We don't know at compile time. And it doesn't matter! The decision as to which areaOf method gets called—the one in Circle or the one in Rectangle—is based entirely on what kind of object shapes[i] references at the time the call is made. That's dynamic binding.
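
Here is a compact sketch of the whole arrangement. It is not the course's GeomShape.java, Circle.java, Rectangle.java, or GeomShapeDriver.java: the method signatures, the fields, and the constructor parameters are all assumptions, chosen only so that the example runs on its own. The driver class comes first so that the file can be run directly.

```java
class GeomShapeSketch {
    public static void main(String[] args) {
        GeomShape[] shapes = new GeomShape[2];
        shapes[0] = new Circle(0, 0, 1.0);
        shapes[1] = new Rectangle(0, 0, 2.0, 3.0);
        for (int i = 0; i < shapes.length; i++) {
            shapes[i].move(5, 5);
            shapes[i].scale(2.0);
            // Which areaOf runs depends on the object, not on the declared type.
            System.out.println(shapes[i].areaOf());
        }
    }
}

// Assumed signatures; the real interface may differ.
interface GeomShape {
    double areaOf();
    void scale(double factor);
    void move(int deltaX, int deltaY);
}

class Circle implements GeomShape {
    private int x, y;
    private double radius;

    Circle(int x, int y, double radius) {
        this.x = x;
        this.y = y;
        this.radius = radius;
    }

    public double areaOf() { return Math.PI * radius * radius; }
    public void scale(double factor) { radius *= factor; }
    public void move(int deltaX, int deltaY) { x += deltaX; y += deltaY; }
}

class Rectangle implements GeomShape {
    private int x, y;
    private double width, height;

    Rectangle(int x, int y, double width, double height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    public double areaOf() { return width * height; }
    public void scale(double factor) { width *= factor; height *= factor; }
    public void move(int deltaX, int deltaY) { x += deltaX; y += deltaY; }

    // Extra method that is not part of the interface: swaps width and height.
    public void flip() {
        double temp = width;
        width = height;
        height = temp;
    }
}
```

Note that the interface methods must be declared public in the implementing classes, because interface methods are implicitly public.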

An additional observation: there is an if-statement that tests if shapes[i] is of type Rectangle. That is what the operator instanceof tests. There is a call to flip inside this if-statement. To make this call, we have to cast the variable shapes[i] to be a reference to a Rectangle object, because GeomShape does not include a flip method. Therefore, there is no reason for the compiler to treat as legal a call of flip on an object that implements GeomShape. We know that shapes[i] references a Rectangle at that moment, and so we are willing to assume responsibility for the call to flip being legal. In order to convince the compiler that the call is legal, however, we cast shapes[i] to a reference to Rectangle. The compiler then follows President Reagan's maxim: "Trust but verify." Because it is doing something that may be unsafe, it puts in a runtime test to make sure that shapes[i] really refers to a Rectangle before it allows the call to flip.

What would have happened if we had tried to cast shapes[i] to a reference to Rectangle when it actually was a reference to some other kind of object (such as a Circle)? We would get a run-time error known as a ClassCastException, and our program would crash at that point. You can see this behavior yourself: eliminate the if-statement that tests if shapes[i] is an instance of Rectangle.
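
Both behaviors can be seen in a pared-down sketch. The types below are simplified stand-ins for the course's classes (a one-method interface, no coordinates), and the helper methods are hypothetical; blindCastFails catches the exception only so that the demo can report it rather than crash.

```java
class CastDemo {
    // Tests with instanceof first, so the cast is guaranteed to succeed.
    static boolean flipIfRectangle(GeomShape s) {
        if (s instanceof Rectangle) {
            ((Rectangle) s).flip();   // the cast satisfies the compiler; the run-time check still happens
            return true;
        }
        return false;
    }

    // Casts without testing first; reports whether the cast failed.
    static boolean blindCastFails(GeomShape s) {
        try {
            ((Rectangle) s).flip();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        GeomShape[] shapes = { new Circle(1.0), new Rectangle(2.0, 3.0) };
        for (int i = 0; i < shapes.length; i++) {
            System.out.println(flipIfRectangle(shapes[i]) + " " + blindCastFails(shapes[i]));
        }
        // prints "false true" for the Circle, then "true false" for the Rectangle
    }
}

interface GeomShape {
    double areaOf();
}

class Circle implements GeomShape {
    private double radius;
    Circle(double radius) { this.radius = radius; }
    public double areaOf() { return Math.PI * radius * radius; }
}

class Rectangle implements GeomShape {
    private double width, height;
    Rectangle(double width, double height) { this.width = width; this.height = height; }
    public double areaOf() { return width * height; }
    public void flip() { double temp = width; width = height; height = temp; }
}
```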