Encapsulation

Today we discuss one of the four pillars of object-oriented programming: encapsulation. Encapsulation describes binding code (called methods in OOP) and data together into one thing called an object.

Instance variables
Getters and setters
Constructors
Other methods
toString
Encapsulation
Objects vs. primitives
Applications vs. classes
Java notes

All the code files for today: Student0.java; Student01.java; Student02.java; Student03.java; Student04.java; Student05.java; Student06.java; Student07.java; StudentTrackerApp.java; MemoryAllocationPrimitives.java; MemoryAllocationObjects.java.

Slides from class

Instance variables

In Java a class gives what is essentially a "blueprint" or a template for an entity in a program; the entity is called an object. In the blueprint analogy, the object is like a house specified by the blueprint. Defining a class does not create an object of that class, just as drawing a blueprint does not cause a house to be built. A blueprint says that when we build a house, here's what it will be like. Similarly, a class says that when we create an object, here's what it has and what it can do.

Today we'll build a simple Student class to represent students at a college. We will progressively add more functionality to the class as we go, resulting in classes Student0 through Student07. From those classes we will instantiate (create) objects, where each object represents one student. For now, a very simple student has a name and a graduation year.

[Student0.java]

public class Student0 {
    String name;
    int graduationYear;

    public static void main(String[] args) {
        Student0 alice = new Student0();
        alice.name = "Alice";
        alice.graduationYear = 2027;
        System.out.println("Name: " + alice.name +
                ", Year: " + alice.graduationYear);
    }
}

OUTPUT
Name: Alice, Year: 2027

Here, name and graduationYear are instance variables for objects of class Student0; i.e., each student object (that is, each instance) has its own name and year. Python programmers are used to objects having instance variables (though not declaring them up front like this); C programmers can think of them kind of like structs (though as we'll see, they're much more powerful). One of the biggest changes for Python programmers in moving to Java is that we have to declare variables and say what type each will hold. It isn't that Python doesn't have types. It's that Python checks them at run time (an approach often associated with duck typing: if it looks like a duck and quacks like a duck, it's probably a duck; similarly, if it looks like an int and acts like an int, it's probably an int). Java checks data types at compile time rather than at run time (Java is said to be statically typed). By saving many of those run-time checks, Java code can often run significantly faster than Python code. It's also safer, in that once the compiler is happy, you know that you have the type of data at run time that you expect to have. You'll get used to having IntelliJ tell you what you need to fix (code highlighted in red); while annoying, it's for your own good!

In the main method of the code above, we first instantiate (create) an object called alice from class Student0. We use the keyword new to tell Java to allocate memory for object alice. alice gets memory allocated for instance variables representing her name and graduation year. These instance variables are set using the dot operator (e.g., alice.name). Once the instance variables have been initialized, alice's name and graduation year are printed (remember, extra white space between tokens doesn't matter in Java; the compiler ignores it, so we can stretch one statement over multiple lines for readability). Soon we will instantiate more objects, other than alice, and each one (each instance of Student0) will get its own name and graduation year instance variables. We can track multiple students by instantiating multiple objects.
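
To make the "each instance gets its own instance variables" point concrete, here is a minimal sketch. The class names SimpleStudent and TwoStudentsDemo are made up for this illustration (SimpleStudent simply mirrors the fields of Student0); they are not among the course files.

```java
// Hypothetical demo: two objects created from the same class.
class TwoStudentsDemo {
    public static void main(String[] args) {
        SimpleStudent alice = new SimpleStudent();
        alice.name = "Alice";
        alice.graduationYear = 2027;

        SimpleStudent bob = new SimpleStudent();
        bob.name = "Bob";
        bob.graduationYear = 2025;

        // Changing bob's year does not touch alice: each object has its own copy
        bob.graduationYear = 2026;

        System.out.println(alice.name + ": " + alice.graduationYear); // Alice: 2027
        System.out.println(bob.name + ": " + bob.graduationYear);     // Bob: 2026
    }
}

// Hypothetical stand-in with the same instance variables as Student0
class SimpleStudent {
    String name;
    int graduationYear;
}
```

One class (the blueprint) produced two separate objects (the houses), and an update to one leaves the other alone.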

Getters and setters

While setting instance variables using the dot operator (e.g., alice.graduationYear = 2027;) works as shown above, that approach is frowned upon in Java. Instead we use methods (code) to allow outsiders to get or set instance variable values.

Note: Student01.java and Student02.java progressively add functionality to Student0.java.

[Student03.java]

public class Student03 {
    protected String name;
    protected int graduationYear;

    /**
     * Setters for instance variables
     */
    public void setName(String name) { this.name = name; }
    public void setYear(int year) {
        //only accept valid years
        if (year > 1769 && year < 2100) {
            graduationYear = year;
        }
    }

    /**
     * Getters for instance variables
     */
    public String getName() { return name; }
    public int getGraduationYear() { return graduationYear; }




    public static void main(String[] args) {
        Student03 alice = new Student03();
        alice.setName("Alice");
        alice.setYear(2027);
        System.out.println("Name: " + alice.name +
                ", Year: " + alice.graduationYear);
    }
}

Student03 provides getter and setter methods for our basic Student class. Getters allow outside code to retrieve the value of an instance variable. Each getter method returns an instance variable's value. By convention, getters are named get<VariableName>. We see getName returns the student's name. Java does not enforce this naming convention; you can call your method anything you'd like. Other programmers will generally assume a getter is named after the variable it returns, but Java doesn't care what you call your methods. In fact, Java doesn't know getName is a getter. Java simply knows getName is a method. Note that a method that returns a value declares the type of that value in the method declaration (String and int in the code above).

Setters generally do not return a value (use data type void if a method does not return a value) but allow outside code to request a change to an instance variable's value. Setters follow the same naming convention as getters but use set instead of get. They typically take a parameter with the same data type as the instance variable (e.g., String for name and int for graduationYear). Notice we called our setter setYear, not setGraduationYear. Java doesn't know or care that this method is a setter, so it doesn't require the method to have the same name as the instance variable. Setters allow an object to provide error checking and refuse an update if the values provided are invalid. For example, the setYear method only accepts years after Dartmouth's founding in 1769 and before 2100. For now it ignores other years (we will soon throw an exception for invalid parameters to tell the caller that their input was invalid).
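
Here is a small sketch of a setter refusing an invalid value, with outside code reading the result through the getters. ValidatedStudent and SetterDemo are hypothetical names used only for this illustration; the setter logic is copied from Student03.

```java
// Hypothetical demo: the setter silently ignores out-of-range years.
class SetterDemo {
    public static void main(String[] args) {
        ValidatedStudent alice = new ValidatedStudent();
        alice.setName("Alice");
        alice.setYear(2027);   // accepted: after 1769 and before 2100
        alice.setYear(1500);   // ignored: out of range, value stays 2027
        System.out.println(alice.getName() + ", " + alice.getGraduationYear()); // Alice, 2027
    }
}

// Hypothetical stand-in with the same getters and setters as Student03
class ValidatedStudent {
    protected String name;
    protected int graduationYear;

    public void setName(String name) { this.name = name; }

    public void setYear(int year) {
        // only accept valid years
        if (year > 1769 && year < 2100) {
            graduationYear = year;
        }
    }

    public String getName() { return name; }
    public int getGraduationYear() { return graduationYear; }
}
```

Because the only way in is through setYear, the object never ends up holding a nonsensical graduation year.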

In Student03's main method we instantiate a student named alice, and then set each instance variable one by one using the setters. This initializes alice, getting that object ready for use. It turns out there is an easier way to initialize objects: using constructors.

Constructors

When an object is first instantiated (created using the keyword new), Java immediately runs a special method called a constructor. Constructors have the same name as the class and may take zero or more parameters.

[Student04.java]

public class Student04 {
    protected String name;
    protected int graduationYear;

    public Student04() {
        //default constructor: you get this by default
    }

    public Student04(String name, int year) {
        this.name = name;
        graduationYear = year;
    }

    

    public static void main(String[] args) {
        Student04 abby = new Student04(); //calls first constructor
        Student04 alice = new Student04("Alice", 2027); //calls second constructor
        System.out.println("Name: " + abby.name +
                ", Year: " + abby.graduationYear);
        System.out.println("Name: " + alice.name +
                ", Year: " + alice.graduationYear);
    }
}

OUTPUT
Name: null, Year: 0
Name: Alice, Year: 2027

Class Student04 demonstrates constructors. Constructors are a way to initialize an object's instance variables. For example, an object named abby is instantiated with Student04 abby = new Student04();. Java will first look to see if there is a constructor (a method with the same name as the class) that takes no parameters. In this case it finds the constructor and runs it. This constructor does nothing. If you do not write any constructor of your own, Java creates one like this by default. Note that once you write any constructor yourself, Java no longer supplies the default one, which is why Student04 spells out the no-parameter constructor explicitly. Whether or not a constructor sets them, instance variables start out as 0 for numeric types, false for boolean types, and null for objects (objects are the topic of the next class). abby's instance variables are therefore null and 0. We see the result in the first line of the output.

Next, an object named alice is instantiated with Student04 alice = new Student04("Alice", 2027);. Here Java looks for a constructor (method with same name as the class) that takes two parameters, a String followed by an integer. It finds that method and immediately runs it after the keyword new, setting alice's name and graduationYear to the parameters provided. On the second line of output, we see alice's instance variables are set to the parameters passed to the constructor.

One thing to note about the second constructor: it uses the keyword this. Because the parameter name has the same name as the corresponding instance variable, we need a way to say which one we mean when we write name (the instance variable or the parameter?). To specify the instance variable, use the keyword this; to specify the parameter, just use the parameter name. Java will choose the most local variable if you do not use this. For example, this.name means the instance variable name, whereas just name means the most local variable, the parameter.
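
A quick sketch of what goes wrong when this is forgotten. ShadowStudent and ShadowDemo are hypothetical names, and the bug is deliberate: the constructor assigns the parameter to itself, so the instance variable keeps its default value.

```java
// Hypothetical demo of a parameter hiding an instance variable.
class ShadowDemo {
    public static void main(String[] args) {
        ShadowStudent s = new ShadowStudent("Alice", 2027);
        System.out.println(s.name + ", " + s.graduationYear); // null, 2027
    }
}

class ShadowStudent {
    String name;
    int graduationYear;

    ShadowStudent(String name, int year) {
        name = name;            // BUG on purpose: both sides are the parameter
        graduationYear = year;  // fine: no parameter is named graduationYear
    }
}
```

Changing the first assignment to this.name = name; fixes it, because this.name can only mean the instance variable.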

There are two constructors in Student04, but Java knows which constructor to run based on the signature: the number and types of the parameters provided. Because abby provided no parameters, Java knows to run the constructor that takes no parameters. Because alice provided a String followed by an integer, Java finds the constructor with that signature and runs that code for alice. Having two or more methods with the same name (here Student04) but different signatures is called overloading.
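
Overloading is not limited to constructors. As a rough sketch, here is an ordinary method overloaded on its parameter list. OverloadStudent, OverloadDemo, and the two-parameter study(int, int) are assumptions invented for this example; they do not appear in the course files.

```java
// Hypothetical demo: Java picks the method whose signature matches the call.
class OverloadDemo {
    public static void main(String[] args) {
        OverloadStudent alice = new OverloadStudent();
        alice.study(1.5);     // one double      -> study(double)
        alice.study(1, 30);   // two ints        -> study(int, int)
        System.out.println(alice.studyHours);   // 3.0
    }
}

class OverloadStudent {
    double studyHours;

    // takes hours as a single double
    void study(double hours) {
        studyHours += hours;
    }

    // hypothetical overload: hours and minutes given separately
    void study(int hours, int minutes) {
        studyHours += hours + minutes / 60.0;
    }
}
```

Both methods are named study; only the number and types of the arguments decide which one runs.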

Other methods

Code that operates on an object's instance variables is called a method in Java. Getters, setters, and constructors are examples of methods, but objects can have other methods. For example, suppose we want to track how many hours each student spends in class and studying. We can add instance variables for studyHours and classHours and provide methods, study and attendClass, to track the number of hours spent studying and in class.

[Student05.java]

public class Student05 {
    protected String name;
    protected int graduationYear;
    double studyHours;
    double classHours;

    // constructors (as in Student04) omitted here

    public double study(double hoursSpent) {
        System.out.println("Hi Mom! It's " + name + ". I'm studying!");
        studyHours += hoursSpent;
        return studyHours;
    }

    public double attendClass(double hoursSpent) {
        System.out.println("Hi Dad! It's " + name +". I'm in class!");
        classHours += hoursSpent;
        return classHours;
    }

    public static void main(String[] args) {
        Student05 abby = new Student05(); //calls first constructor, default instance variables
        Student05 alice = new Student05("Alice", 2027); //calls second constructor
        alice.study(1.5);
        alice.attendClass(1.1);
    }


OUTPUT
Hi Mom! It's Alice. I'm studying!
Hi Dad! It's Alice. I'm in class!
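
Note that study and attendClass return the running total, which the main method above ignores. A minimal sketch of capturing that return value follows; HoursStudent and ReturnValueDemo are hypothetical names, and the printing to Mom and Dad is left out to keep the focus on the return value.

```java
// Hypothetical demo: using the value a method returns.
class ReturnValueDemo {
    public static void main(String[] args) {
        HoursStudent alice = new HoursStudent();
        alice.study(1.5);                  // return value ignored
        double total = alice.study(2.0);   // return value captured
        System.out.println("Total study hours: " + total); // Total study hours: 3.5
    }
}

class HoursStudent {
    double studyHours;

    // adds to the running total and hands the new total back to the caller
    double study(double hoursSpent) {
        studyHours += hoursSpent;
        return studyHours;
    }
}
```

The caller is free to use or ignore a returned value; either way the instance variable has been updated.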

toString

When we define a class, Java does not know its semantic meaning. Our Student classes above may make sense to a human, but Java doesn't know what a "student" is. If we print an object that was instantiated from one of our classes, because Java doesn't know what the class represents, by default it simply prints the class name followed by a value derived from the object's hash code (which looks much like a memory address). We can, however, provide a toString method that returns a String representation of the object that makes sense to a human. For example:

[Student06.java]

public class Student06 {
    protected String name;
    protected int graduationYear;
    double studyHours;
    double classHours;

    // constructors (as in Student04) and study/attendClass (as in Student05) omitted here

    @Override
    public String toString() {
        String s = "Name: " + name + ", graduation year: " + graduationYear + "\n";
        s += "\tHours studying: " + studyHours + "\n";
        s += "\tHours in class: " + classHours;
        return s;
    }

    public static void main(String[] args) {
        Student06 abby = new Student06(); //calls first constructor, default instance variables
        Student06 alice = new Student06("Alice", 2027); //calls second constructor
        System.out.println(abby);
        alice.study(1.5);
        alice.attendClass(1.1);
        System.out.println(alice);
    }
}

OUTPUT
Name: null, graduation year: 0
  Hours studying: 0.0
  Hours in class: 0.0
Hi Mom! It's Alice. I'm studying!
Hi Dad! It's Alice. I'm in class!
Name: Alice, graduation year: 2027
  Hours studying: 1.5
  Hours in class: 1.1

Now when an object of type Student06 is printed, behind the scenes Java calls the toString method, which returns a String representing the object. Note: '\n' adds a newline character and '\t' adds a tab character to the String.
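
To see the difference a toString method makes, here is a small sketch comparing a class without one to a class with one. PlainStudent, PrintableStudent, and ToStringDemo are hypothetical names for this illustration. The first printed line varies from run to run, so no exact output is shown for it.

```java
// Hypothetical demo: default printing vs. a custom toString.
class ToStringDemo {
    public static void main(String[] args) {
        PlainStudent p = new PlainStudent();
        PrintableStudent q = new PrintableStudent();
        q.name = "Alice";

        System.out.println(p);  // class name, '@', then a hex hash code (varies)
        System.out.println(q);  // Name: Alice

        // String concatenation also calls toString behind the scenes
        String label = "Student -> " + q;
        System.out.println(label); // Student -> Name: Alice
    }
}

// No toString of its own, so it gets Java's default
class PlainStudent {
    String name;
}

// Supplies a human-readable representation
class PrintableStudent {
    String name;

    @Override
    public String toString() {
        return "Name: " + name;
    }
}
```

The @Override annotation is optional, but it asks the compiler to confirm that the method really replaces the default toString (for example, that its name is not misspelled).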

Encapsulation

Encapsulation in OOP refers to bringing code and data together into one thing called an object. In the code samples above we create Student classes that have both data (name, graduation year, hours) and code (getters/setters, study, attendClass, toString). In OOP, this code is called methods; in other languages it might be called functions or subroutines.

Objects vs. primitives

Java keeps track of a method's local variables in an area of memory called the stack. The values of local variables declared as primitive data types are stored directly on the stack. Java knows how many bytes each primitive data type uses, so it knows how much memory to allocate for them on the stack. For example, all variables of type double are 8 bytes, so Java allocates 8 bytes on the stack for each double.

Java stores objects in another area of memory called the heap. For each object variable that is declared, Java makes an entry on the stack that stores a reference (in effect, the memory address) to where the object can be found on the heap.

An important note about primitive vs. object types: double is a "primitive" type, as are int, char, and boolean (lower-case type names). Variables of those types do not refer to an object but rather hold just a simple number (or character) stored directly in memory. If a variable is of a primitive type, the stack contains the actual data itself (the bit pattern representing the data). If a variable is a reference to an object, then it contains a reference, and the object's data is stored on the heap.
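
For primitives, assigning one variable to another copies the value itself, so the two variables are independent afterward. A minimal sketch, where PrimitiveCopyDemo and its addOne helper are hypothetical names and not the course's MemoryAllocationPrimitives.java:

```java
// Hypothetical demo: primitives are copied by value.
class PrimitiveCopyDemo {
    // Hypothetical helper: x is a copy of whatever the caller passed in
    static int addOne(int x) {
        x = x + 1;   // changes only the local copy
        return x;
    }

    public static void main(String[] args) {
        double gpa = 3.5;
        double copy = gpa;   // copies the 8-byte value itself
        copy = 4.0;          // gpa is unaffected
        System.out.println(gpa + " " + copy);   // 3.5 4.0

        int year = 2027;
        int next = addOne(year);                // year itself is not changed
        System.out.println(year + " " + next);  // 2027 2028
    }
}
```

Contrast this with object variables, where a copy of the variable is a copy of the reference, not of the object.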

Why is this distinction important? My wife and I have a joint checking account. We each have an ATM card. The cards are different and have different names on them, but they refer to the same checking account. If I withdraw money with my ATM card, there is less money in the account, and if my wife then looks at the balance it will be smaller even though she did nothing with her ATM card. In this analogy, the account is the object (and bank account objects are a common example in textbooks). The ATM cards are the variables, each holding a reference to the same account. Any changes made by either of us to the account via our ATM card are seen by both. On the other hand, if my wife has her ATM card re-programmed so that it refers to her personal account (changes the reference stored in the variable), that won't affect my card or the account. She just will no longer be able to use that ATM card to withdraw money from our account, because it no longer refers to our account.

Consider this code:

Student06 alice = new Student06("Alice", 2027);
Student06 bob = new Student06("Bob", 2025);
Student06 abby = alice;

There are two different students here, created with the two new statements. One of these students has two "names" (called "aliasing"), with variables abby and alice referring to the same object. So:

abby.setName("Ainsley");
System.out.println(alice.getName()); // => prints Ainsley

It doesn't matter whether we refer to the object by its abby name or its alice name; it's the same memory location on the heap. Because they both reference the same memory location on the heap, changing abby's name also changes alice's name. Now both have the name "Ainsley". The student that bob refers to is an entirely separate object in a different piece of memory on the heap. bob's name is not changed in this example.
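
The aliasing example can be turned into a complete runnable sketch, extended with the "re-programmed ATM card" case from the analogy. NamedStudent and AliasDemo are hypothetical names standing in for Student06; this is not the course's MemoryAllocationObjects.java.

```java
// Hypothetical demo: two variables referring to one object, then re-pointing one.
class AliasDemo {
    public static void main(String[] args) {
        NamedStudent alice = new NamedStudent("Alice");
        NamedStudent bob = new NamedStudent("Bob");
        NamedStudent abby = alice;            // copies the reference, not the object

        abby.setName("Ainsley");
        System.out.println(alice.getName());  // Ainsley
        System.out.println(abby == alice);    // true: same object on the heap

        abby = bob;                           // re-point abby, like re-programming the card
        abby.setName("Robert");
        System.out.println(alice.getName());  // Ainsley (untouched)
        System.out.println(bob.getName());    // Robert
    }
}

// Hypothetical stand-in for the Student classes in these notes
class NamedStudent {
    private String name;

    NamedStudent(String name) { this.name = name; }

    void setName(String name) { this.name = name; }
    String getName() { return name; }
}
```

For object variables, == compares references, so it reports whether two variables refer to the very same object.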

Applications vs. classes

Often we will create classes to model an entity such as a student, but will create a separate program that uses them. These programs are often called applications or driver programs, and they provide the business logic to accomplish a useful task. For example, an application might be called StudentTrackerApp and might track several Student objects. In that case, StudentTrackerApp will have a main method and other associated methods to track students at a college, but the Student class will not have a main method. Student doesn't need a main method, because it is not meant to run on its own. Instead Student is meant to be used by StudentTrackerApp. In Java, we create one class to represent students (stored on disk as Student.java) and another class called StudentTrackerApp (stored on disk as StudentTrackerApp.java) to provide the application's logic. We provide Student07.java as a Student class that does not have a main method. Other application programs (that do have a main method) can use this Student class.
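
As a rough sketch of this split, the application class below owns main and the tracking logic, while the student class has no main at all. TrackerAppSketch, TrackedStudent, enroll, and countGraduatingIn are all hypothetical names invented here; this is not the course's StudentTrackerApp.java. Both classes are shown in one listing for convenience, though they would normally live in separate files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical application class: has main and the "business logic"
class TrackerAppSketch {
    private final List<TrackedStudent> students = new ArrayList<>();

    void enroll(TrackedStudent s) {
        students.add(s);
    }

    // counts enrolled students with the given graduation year
    int countGraduatingIn(int year) {
        int count = 0;
        for (TrackedStudent s : students) {
            if (s.getGraduationYear() == year) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        TrackerAppSketch app = new TrackerAppSketch();
        app.enroll(new TrackedStudent("Alice", 2027));
        app.enroll(new TrackedStudent("Bob", 2025));
        app.enroll(new TrackedStudent("Carol", 2027));
        System.out.println("Class of 2027: " + app.countGraduatingIn(2027)); // Class of 2027: 2
    }
}

// Hypothetical model class: no main method, meant to be used by an application
class TrackedStudent {
    private final String name;
    private final int graduationYear;

    TrackedStudent(String name, int graduationYear) {
        this.name = name;
        this.graduationYear = graduationYear;
    }

    String getName() { return name; }
    int getGraduationYear() { return graduationYear; }
}
```

The application only talks to students through their constructor and getters, so the student class can change its internals without the application noticing.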

Java notes

Again, this isn't a comprehensive reference to the language; the textbook and on-line references provide much more detail. But hopefully it gives sufficient intuition and an organizational structure. Give a yell if I've missed something important.

class: A class is like a blueprint; a set of instructions that describe how to build an object. It can contain instance variables and methods (code) that operate on the instance variables.
object: An instance of a class. Objects are instantiated (created) by using the keyword new. Multiple objects can be instantiated from the same class, just like multiple houses can be constructed from the same blueprint.
instantiation: An object is created by the new operator, with the name of the class and any parameters needed to initialize the object as defined by the class. Instantiation allocates memory for the object and "brings it into being."
instance variables / fields: These store data specific to an object; they are declared outside any method.
local variables: These hold values temporarily in a method. They are declared inside methods.
parameters: These carry values into a method.
constructor: Creating an object creates its instance variables. Java will initialize these (to 0 for numbers, false for boolean, null for references to objects). However, these are seldom what we want. Java provides a special type of method called a constructor, which has the same name as the class. It is called via new, and is responsible for giving all of the instance variables appropriate values. It can take parameters, too.
method: Code that operates on an object's instance variables. A method performs an operation for an object. It is defined inside the class, and includes a method name, list of parameters (parameter names and types), and return type (before the name), followed by a body, which is a code block to be executed. Other languages might call methods functions or subroutines.
method invocation: To ask "object" to perform "method" with "parameters" (comma separated values), we write object.method(parameters);
dot operator: A way to access an object's instance variables. For example, alice.name. In Java we typically use getters and setters to get or update instance variable values.
getter: A method that returns an instance variable's value.
setter: A method that updates an instance variable to a value passed in as a parameter. Setters may perform error checking and may refuse to update an instance variable if an invalid value is passed as a parameter.
encapsulation: Bringing code (methods) and data (instance variables) into one thing called an object. C programmers, think structs but also with functions.
return: This exits a method immediately, and if the method returns a value, specifies what to pass back. Methods that don't return values ("void" type) just have "return" all by itself to return early, or just naturally return at the end of the code block.
return type: If a method is to return a value, the type of that value is specified. If it doesn't return a value, "void" is indicated as the return type.
this: Refers to the object itself. In constructors and setters, it can be helpful to distinguish a parameter that has the same name as the instance variable (e.g., this.x = x where this.x refers to the instance variable and x is the parameter).
signature: The signature of a method is the name of the method plus the number and types of the parameters it takes. A class may have multiple methods with the same name, but with different numbers and types of parameters.
overloading: When a class has multiple methods with the same name, but with different parameter lists, the method is said to be overloaded.
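
A few of these terms can be seen together in one short sketch: an instance variable, a parameter, a local variable, and a bare return that leaves a void method early. YearStudent, EarlyReturnDemo, and the local variable tooEarly are hypothetical names for this illustration; the range check mirrors setYear from Student03.

```java
// Hypothetical demo tying together several glossary terms.
class EarlyReturnDemo {
    public static void main(String[] args) {
        YearStudent s = new YearStudent();
        s.setYear(1500);                      // rejected by the early return
        System.out.println(s.graduationYear); // 0 (the default value)
        s.setYear(2027);                      // accepted
        System.out.println(s.graduationYear); // 2027
    }
}

class YearStudent {
    int graduationYear;                   // instance variable (field)

    void setYear(int year) {              // year is a parameter
        boolean tooEarly = year <= 1769;  // tooEarly is a local variable
        if (tooEarly || year >= 2100) {
            return;                       // bare return: exit a void method early
        }
        graduationYear = year;            // reached only for valid years
    }
}
```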