CS 10: Winter 2016

Lecture 2, January 6

Code discussed in lecture

Counter.java

Short Assignment 1

A quick look at a Java class, continued

Last class we did a quick overview - class declaration, variable declarations, method declarations, return types, parameters. Now we will finish up the notes from last lecture and then will go through some things about Java in more detail.

Java types

One of the biggest changes for Python programmers in moving to Java is that you have to declare your variables and say what type each will hold. It isn't that Python doesn't have types; it's that Python checks them at run time. Java checks them statically at compile time instead. By saving all of those run-time checks, Java code is able to run significantly faster than Python code.

So what is a type, and what is it for? At the basic level computer memory is a long sequence of 1's and 0's, each called a bit (BInary digiT). It is broken into eight-bit chunks (called bytes), each with its own address. There are many things that you might want to store in a computer: numbers, characters, colors, sounds, wildebeest objects, etc. Even computer instructions are stored in memory. Each "type" of thing has its own encoding as bits. When I was teaching the computer architecture course one of my favorite test questions was to give a 32-bit sequence of 0's and 1's and ask how it would be interpreted if it were a computer instruction, an integer, a string of characters, a color, etc.

When programming in machine language the programmer has to remember what each byte is supposed to represent. If he or she starts treating computer instructions as if they were integers, or integers as if they were computer instructions, bad things happen. (That is what many computer viruses do. They store a bit pattern as data, and then trick the program into running that pattern as code.)

The two things that a type tells us are

How to interpret the bits as a representation of the particular data.
Which operations are valid on that data.
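To make the first point concrete, here is a minimal sketch that takes one 32-bit pattern and reads it three different ways. The class name BitPatternDemo, its helper methods, and the particular bit pattern are made up for illustration; Float.intBitsToFloat and Integer.toBinaryString are standard library methods.

```java
class BitPatternDemo {
    // Read the 32 bits as a single-precision floating-point number.
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    // Read the 32 bits as four 8-bit character codes, high byte first.
    static String asChars(int bits) {
        StringBuilder sb = new StringBuilder();
        for (int shift = 24; shift >= 0; shift -= 8) {
            sb.append((char) ((bits >>> shift) & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int bits = 0x4A617661;  // an arbitrary 32-bit pattern, written in hexadecimal

        // toBinaryString does not print leading zeros.
        System.out.println("the bits:      " + Integer.toBinaryString(bits));
        System.out.println("as an int:     " + bits);
        System.out.println("as a float:    " + asFloat(bits));
        System.out.println("as characters: " + asChars(bits));
    }
}
```

The bits never change; only the type used to interpret them does. Read as four characters, this particular pattern spells "Java".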

Smalltalk, an early OO language, has a very elegant solution to the type problem. Everything is an object, and the operations that are valid on the object are precisely those defined by its methods. The methods understand how to interpret the bit patterns. It is elegant, but having to ask a number to add itself to another number via a method call is a lot slower than using the hardware "add" instruction. As a result, Smalltalk is s-l-o-w.

Java compromises. Most things are objects whose types are defined by the methods that they implement and the data that they store. The object data is stored somewhere in memory, and variables hold references to the object that tell how to find it. The variables c1 and c2 in the Counter class are examples. In practice the reference is a memory address, but Java treats the reference abstractly and uses it to communicate with the object. (Unlike C and C++, Java has no way to manipulate the memory addresses.) All references in Java are the same size, no matter how many instance variables and methods the object has.

For speed's sake, Java has eight primitive types that are not objects:

Four types for storing integers in different numbers of bytes: byte (1), short (2), int (4), and long (8). We will almost always use int, which has a range from just under negative 2 billion to just over 2 billion. (That's because an int has 32 bits. In 32 bits, we can store all integers in the range -2³¹ to 2³¹ - 1, and 2³¹ = 2,147,483,648.) byte is -128 to 127. short is basically plus or minus 32,000 (-2¹⁵ to 2¹⁵ - 1, and 2¹⁵ = 32,768). long is huge: -2⁶³ to 2⁶³ - 1, and 2⁶³ = 9,223,372,036,854,775,808.

Integer literals (e.g., 123) are assumed to be of type int, unless followed by a L or l, in which case they are long.
Two types for storing numbers with fractions in scientific notation: float (4 bytes) and double (8 bytes). float has about 7 significant decimal digits, double has about 15 or 16.

Floating-point literals (e.g., 3.14159) contain a decimal point and are assumed to be double, unless followed by an F or f, in which case they are float. The scientific notation form that you've seen in Python or C works in Java, too, e.g., 6.02e23. E or e means "10 to the".
char, a type for storing characters in a system called Unicode (2 bytes). As in C, character literals are written between single quotes, e.g., 'a'. Pythonistas: Python does not have single-character variables or literals. In Python, 'a' is a string consisting of one character, but in Java it's a character literal, which is a different type from a string. As in C and Python, '2' means the character 2, not the integer 2.

As in C and Python, use the backslash as an escape character. For a character literal that's a single quote: '\''. For a backslash: '\\'. And '\n' is the newline character, with '\t' being the tab character.

In Java, a String is an object rather than a primitive type. String literals are surrounded by double quotes: "This is a string". Strings can be concatenated using the + operator. To include a double-quote in a string, escape it with backslash: "Strings are enclosed by \" characters". String literals cannot extend over multiple lines. If your String literal is that long, break it into pieces on separate lines and use the + operator to concatenate them.
boolean, which stores true or false.
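Here is a short sketch that exercises the literal forms described in the list above. The class name PrimitiveLiterals, its two helper methods, and the variable names are invented for this example; the MIN_VALUE and MAX_VALUE constants come from Java's standard wrapper classes.

```java
class PrimitiveLiterals {
    // A char is a number underneath: this returns its Unicode code.
    static int codeOf(char c) {
        return c;
    }

    // A long String built from pieces with +, since one literal cannot span lines.
    static String longMessage() {
        return "Strings are enclosed by \" characters, "
             + "and long ones are glued together with +.";
    }

    public static void main(String[] args) {
        // The ranges of the four integer types.
        System.out.println("byte:  " + Byte.MIN_VALUE + " to " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " to " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " to " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);

        long threeBillion = 3000000000L;  // too big for an int, so the L is required
        float pi = 3.14159f;              // the f makes this a float, not a double
        double avogadro = 6.02e23;        // 6.02 times 10 to the 23
        char singleQuote = '\'';          // an escaped single quote
        boolean done = false;

        System.out.println(threeBillion + " " + pi + " " + avogadro);
        System.out.println(singleQuote + " " + done);
        System.out.println("The character '2' has code " + codeOf('2'));  // 50, not 2
        System.out.println(longMessage());
    }
}
```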

Values of a primitive type are actually stored in the variable itself, and so these primitives have different lengths in memory. Therefore, Java has two kinds of variables. If a variable is of a primitive type, the variable contains the actual data itself (the bit pattern representing the data). If a variable is of type reference to an object, then it contains a reference, and the data itself is stored elsewhere.

Why is this distinction important? My wife and I have a joint checking account. We each have an ATM card. The cards are different and have different names on them, but they refer to the same checking account. If I withdraw money with my ATM card, there is less money in the account, and if my wife then looks at the balance it will be smaller even though she did nothing with her ATM card.

In this analogy, the account is the object (and bank account objects are a common example in textbooks). The ATM cards are the variables, each holding a reference to the same account. Any changes made by either of us to the account via our ATM card are seen by both. On the other hand, if my wife has her ATM card re-programmed so that it refers to her personal account (changes the reference stored in the variable), that won't affect my card or the account. She just will no longer be able to use that ATM card to withdraw money from our joint account, because it no longer refers to our joint account.

Demo in DrJava:

int x = 4;
int y = x;
x++;
x
y
Counter c1 = new Counter();
Counter c2 = c1;
c1.tick();
c1
c2
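If you would rather run the demo as a program than type it into DrJava, here is a self-contained sketch. TickCounter is a hypothetical stand-in written just for this example; it is not the course's Counter class, and the only thing assumed about it is a tick method that adds one to a stored value.

```java
class ReferenceDemo {
    // Hypothetical stand-in for a counter object; not the course's Counter.java.
    static class TickCounter {
        private int value = 0;

        void tick() {
            value++;
        }

        int getValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        // Primitive type: y gets its own copy of the bits stored in x.
        int x = 4;
        int y = x;
        x++;
        System.out.println("x = " + x + ", y = " + y);  // prints x = 5, y = 4

        // Reference type: c1 and c2 are two ATM cards for the same account.
        TickCounter c1 = new TickCounter();
        TickCounter c2 = c1;
        c1.tick();
        System.out.println(c1.getValue() + " " + c2.getValue());  // prints 1 1

        // Re-programming one card: c2 now refers to a different object.
        c2 = new TickCounter();
        c2.tick();
        c2.tick();
        System.out.println(c1.getValue() + " " + c2.getValue());  // prints 1 2
    }
}
```

Changing x did not change y, but ticking through c1 was visible through c2, because both variables held a reference to the same object until c2 was reassigned.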

Java classes

As I said before, think of a class like a blueprint, and of an object like a house. Defining a class does not create an object of that class, just as drawing a blueprint does not cause a house to be built. A blueprint says that when we build a house, here's what it will be like. And a class says that when we create an object, here are the instance variables and methods it will have.

Although Java does not explicitly require us to, we observe various conventions for capitalization styles of various types of identifiers:

Class names conventionally use title case: the first letter of each word in the name is capitalized. Example: BarrelOfMonkeys
Variables and method names conventionally start with a lowercase letter and then capitalize successive words. Example: poundsOfBarbecue
Constants (final variables) are all uppercase, with words separated by underscores. Example: AVOGADROS_NUMBER.
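As a quick illustration, here is a tiny class that follows all three conventions. It reuses the example names from the list; the methods and what they do are invented purely to have something to name.

```java
class BarrelOfMonkeys {                               // class name: title case
    static final double AVOGADROS_NUMBER = 6.02e23;   // constant: uppercase, underscores

    private double poundsOfBarbecue = 0.0;            // variable: lowercase first word

    void addBarbecue(double extraPounds) {            // method and parameter: same style
        poundsOfBarbecue = poundsOfBarbecue + extraPounds;
    }

    double getPoundsOfBarbecue() {
        return poundsOfBarbecue;
    }

    public static void main(String[] args) {
        BarrelOfMonkeys barrel = new BarrelOfMonkeys();
        barrel.addBarbecue(2.5);
        barrel.addBarbecue(1.5);
        System.out.println(barrel.getPoundsOfBarbecue());  // prints 4.0
    }
}
```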

Kinds of variables

Java has three distinct kinds of variables:

Instance variables. The way that an object stores its "personal" data. One copy of each instance variable is created each time an object is created. The object's instance variables continue to exist as long as the object exists.
Class variables (also known as static variables). These belong to the class, not to any particular object. So there is exactly one per class, no matter how many objects of the class there are. These variables exist before any objects are created. They last as long as the program is running. If any object changes the value of a class variable the value is changed for all objects of that class, because there is only one, and it's shared among all the objects in the class.
Local variables. These are "scratch paper" created within a method. A local variable is created when execution reaches the variable's declaration, and it goes away no later than when the method returns. (Actually, when the program reaches the end of the "block" where the variable is declared. Blocks are delimited by curly braces. If you re-enter a block, a new copy of the variable gets created. Any previous value is gone.)

Any variable declared within a method is a local variable. Parameters are always local variables for their methods.

If a variable is declared outside a method, then it's either an instance variable (if the declaration does not include static) or a class variable (if the declaration includes static).

You might expect there to be declarations like "instance variable" or "local variable". Unfortunately, that is not the way it works. Show examples from Counter.
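For readers without Counter.java at hand, here is a small sketch with one variable of each kind. The class VariableKinds and everything in it are hypothetical, chosen only to show where each declaration goes.

```java
class VariableKinds {
    static int objectsCreated = 0;  // class variable: declared outside any method, with static
    int ticks = 0;                  // instance variable: outside any method, no static

    VariableKinds() {
        objectsCreated++;           // every object updates the one shared copy
    }

    // The parameter howMany is a local variable of this method.
    int tickSeveral(int howMany) {
        int before = ticks;         // local variable: scratch paper, gone when the method returns
        ticks = ticks + howMany;
        return ticks - before;
    }

    public static void main(String[] args) {
        VariableKinds first = new VariableKinds();
        VariableKinds second = new VariableKinds();
        first.tickSeveral(3);

        // Each object has its own ticks, but there is only one objectsCreated.
        System.out.println(first.ticks + " " + second.ticks);  // prints 3 0
        System.out.println(VariableKinds.objectsCreated);      // prints 2
    }
}
```

Nothing in a declaration says "instance" or "local"; the kind is determined only by where the declaration sits and whether it includes static.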

Note that the three variable types are used for very different things. A common error is to be writing a method, realize that you need a temporary local variable, and declare it outside of the method. This makes it an instance variable, which remains in existence as long as the object exists. It runs correctly, but it is wrong to do this. It is like realizing you need a piece of scratch paper, and then going to a file cabinet and creating a new file folder and permanently filing the piece of scratch paper, which you will never use again.

One other note about variables - any of the three categories can be declared final. That means that the variable cannot be changed after it is assigned. It is a constant. This can be a very useful thing to do, and allows you to give meaningful names to "magic values".

While it is legal to declare a final instance variable, does it make sense to do so? Not if it is going to be the same constant in every object. If you have a thousand objects, does each need its own copy of the constant? One for the whole class is enough, because it will be the same for every object in the class. So if you want a constant that is shared by every object of the class, make it final static, so it becomes a class variable.
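Here is a sketch of the difference, using a made-up Ticket class. MAX_TICKETS is the same for every ticket, so it is a class constant. serialNumber is set once and never changes, but it differs from object to object, which is the case where a final instance variable does make sense. (The code writes static final, the modifier order most Java code uses; final static means the same thing.)

```java
class Ticket {
    static final int MAX_TICKETS = 100;  // one copy, shared by the whole class

    final int serialNumber;              // one per object, fixed once assigned

    Ticket(int number) {
        serialNumber = number;           // a final instance variable is assigned in the constructor
    }

    // A named constant reads better than a bare "magic value" of 100.
    boolean isValid() {
        return serialNumber >= 1 && serialNumber <= MAX_TICKETS;
    }

    public static void main(String[] args) {
        Ticket good = new Ticket(7);
        Ticket bad = new Ticket(101);
        System.out.println(good.isValid() + " " + bad.isValid());  // prints true false
    }
}
```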

Printing to the console

As you've seen, you call System.out.println to print to the console. The String that you supply as a parameter is what's printed. If you don't supply a parameter, then a blank line prints to the console.

If you want to print to the console but not end with a newline, then call System.out.print. For example, if you were printing a prompt and wanted the typed input to appear on the same line as the prompt, you could write

System.out.print("Enter a prime number: ");
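The sketch below puts print and println side by side. The class PromptDemo and its showPrompt method are invented for this example, and the "17" is printed by the program as a stand-in for what a user would type; no input is actually read.

```java
import java.io.PrintStream;

class PromptDemo {
    // Writes to whatever stream it is given; main passes System.out.
    static void showPrompt(PrintStream out) {
        out.print("Enter a prime number: ");  // no newline, so the cursor stays on this line
        out.println("17");                    // stand-in for typed input; ends the line
        out.println();                        // no parameter: prints a blank line
        out.println("Thanks.");
    }

    public static void main(String[] args) {
        showPrompt(System.out);
    }
}
```

Running it shows the prompt and the 17 on the same line, then a blank line, then "Thanks." on a line of its own.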

Java's increment and decrement operators

C programmers are familiar with the ++ and -- operators for incrementing and decrementing. (It might be hard to see but that's two minus signs in a row, with no space between them.) When you use these operators, they have to apply to variables.

int x = 7;
x++;     // legal; x gets the value 8
7++;     // not legal; 7 is not a variable
int y = 10;
(x+y)++; // not legal; (x+y) is not a variable
y--;     // legal; y gets the value 9

You can use the ++ and -- operators in expressions. If you put the ++ or -- before the variable, then the increment or decrement happens before the variable's value is used in the expression. If you put the ++ or -- after the variable, then the increment or decrement happens after the variable's value is used in the expression.

int x = 7, y = 10;
int z = ++x * y--;

Here, x is incremented before it's used in the expression, and so x gets the value 8, and the value of x used in the expression is 8. But y is decremented after it's used in the expression, and so although y will get the value 9, the value of y used in the expression is 10. Hence, z gets the value 80.

As you can see, you can make some complicated expressions using the ++ and -- operators. Just because you can doesn't mean you should! Using them can make extremely compact but almost unreadable code. My preference is to only use these operators as stand-alone statements, rather than embedding them in expressions.
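To see what that preference looks like in practice, here is a sketch that computes the example both ways: once with the operators embedded in the expression, and once with each increment and decrement as its own statement. The class and method names are invented for this example; each method returns the final x, y, and z in an array so the results can be compared.

```java
import java.util.Arrays;

class IncrementDemo {
    // The compact form: ++ and -- embedded in the expression.
    static int[] compact() {
        int x = 7, y = 10;
        int z = ++x * y--;
        return new int[] {x, y, z};
    }

    // The same computation with the operators as stand-alone statements.
    static int[] spelledOut() {
        int x = 7, y = 10;
        x++;            // the pre-increment takes effect before x is used
        int z = x * y;  // uses the new x (8) and the old y (10)
        y--;            // the post-decrement takes effect after y is used
        return new int[] {x, y, z};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compact()));     // prints [8, 9, 80]
        System.out.println(Arrays.toString(spelledOut()));  // prints [8, 9, 80]
    }
}
```

The second version is two lines longer, but the order of events is visible at a glance.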