Finish up ArrayList from last lecture.
Let's revisit linked lists. I say "revisit" because you've seen linked lists before in either CS 1, ENGS 20, or your AP course. But you haven't seen them like you're going to see them here.
Some reminders about linked lists. A linked list is in some ways like an array. In particular, it defines a linear ordering of elements. Unlike an array, we can insert an element into any position in the linear order in constant time (given a reference to that position), and we can also delete an element from any position in the linear order in constant time, with all other elements maintaining their relative positions for both insertion and deletion. With an array, we have to shift elements when inserting or deleting if we want them to maintain their positions relative to each other, and so it can take a while to insert into or delete from an array (or an ArrayList).
Another advantage of linked lists is that they grow and shrink dynamically. That is, the space occupied by an n-element linked list, in which each element occupies b bytes, is about bn bytes. Although you could implement an ArrayList to grow and shrink dynamically, you cannot do so while maintaining constant time per insertion or deletion from the middle of the ArrayList.
So what's the downside of linked lists? They're a little more complicated to work with than arrays. They also take a bit more space than arrays (they need space for the links). Like an ArrayList, a linked list can store only objects, not primitive types. But the biggest disadvantage is that we cannot index into them in constant time. In other words, we can find the ith element of an array or an ArrayList in constant time, but finding the ith element of a linked list takes time proportional to i. For applications where we need to index quickly, linked lists are not a great choice.
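To make the indexing cost concrete, here is a minimal sketch. The IndexDemoNode and IndexDemo classes and the getIth method are hypothetical names made up for this illustration; they are not part of the course code. The point is only that reaching position i means following i links, whereas an array lookup is a single step.

```java
// Hypothetical minimal node type, used only for this illustration.
class IndexDemoNode {
    String data;
    IndexDemoNode next;

    IndexDemoNode(String data, IndexDemoNode next) {
        this.data = data;
        this.next = next;
    }
}

class IndexDemo {
    // Returns the data of the ith element (counting from 0), or null if the
    // list has fewer than i + 1 elements. The loop follows i links, so the
    // running time is proportional to i.
    static String getIth(IndexDemoNode head, int i) {
        IndexDemoNode x = head;
        for (int k = 0; k < i && x != null; k++)
            x = x.next;
        return (x == null) ? null : x.data;
    }

    public static void main(String[] args) {
        IndexDemoNode list = new IndexDemoNode("Maine",
                new IndexDemoNode("Idaho", new IndexDemoNode("Utah", null)));
        String[] array = {"Maine", "Idaho", "Utah"};

        System.out.println(array[2]);         // one step, whatever the index
        System.out.println(getIth(list, 2));  // must follow two links first
    }
}
```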
In a linked list, the elements are arranged in a linear order that is determined by following a chain of links. For example, here is a conceptual picture of a linked list with three elements, each of which contains a reference to the name of a state:
As you can see, we think of the list as having a head and a tail element, and we can think of references head and tail to these elements.
The link part of each element tells us the next or previous element on the list. These will be implemented as references to entire elements.
We will see a couple of ways to implement linked lists. Both ways will support a basic set of operations. When we have a set of operations to be supported but without a specific implementation for them, we make an interface. We will use the CS10LinkedList interface in CS10LinkedList.java. Like an ArrayList, this interface uses a generic type T.
The specifications of the methods of the CS10LinkedList interface assume that the linked list maintains a notion of a "current" element of the list. The interface has the following methods:
add inserts a new element after the current element, making the current element be this new element. If there is no current element, it inserts at the front of the list.
remove removes the current element, designating the successor of the removed element as the new current element. It is an error to call remove if there is no current element.
contains determines whether an object is in the linked list. It returns a boolean that says whether the object was found. If the object was found, then the current element is set to be this object. (You can then insert after it, set it, or remove it.) The contains method uses the equals method to determine whether an object is in the linked list. As always, the default version of equals determines whether two objects are the same object, i.e., whether two references are to the same address, but a class can override the default version of equals.
isEmpty returns a boolean indicating whether the linked list has no elements.
hasCurrent returns a boolean indicating whether the linked list has a current element.
hasNext returns true if there is a current element and it has an element that follows it.
getFirst sets the current element to be the first one in the linked list and returns it. If the list is empty, getFirst returns null.
getLast sets the current element to be the last one in the linked list and returns it. If the list is empty, getLast returns null.
addFirst inserts an element at the head of the linked list and makes it the current element.
addLast inserts an element at the tail of the linked list and makes it the current element.
clear removes all elements from the list.
next advances the notion of the current element to the element following the current element and returns it. If there is no element following the current element, then next returns null.
get returns the data of the current element, or null if there is no current element.
set sets the data field of the current element. It is an error to call set if there is no current element.
You can traverse a linked list to get a reference to each element of the list. Here, each element is an Integer object, just to keep things simple. (We rely on unboxing in the loop body, where we add i, which is really a reference to an Integer object, into sum.) Notice the header of the for-loop to traverse the list:
for (myList.getFirst(); myList.hasCurrent(); myList.next()) {
    Integer i = myList.get();
    System.out.println("List element with value " + i);
    sum += i;
}
The for-loop calls the hasCurrent method to determine when we have reached the end of the list. The loop has to explicitly call the get method within the body of the loop to get the value of the current element.
If you took CS 1, then you know that the simplest, cleanest way to implement a linked list is with a circular, doubly linked list with a sentinel. The implementation is in SentinelDLL.java. The class is generic for a type T, declared with the line
public class SentinelDLL<T> implements CS10LinkedList<T>
To start, each list element is an object of the class Element and has three instance variables:
data is a reference to the object being stored in that list element. This object must be of the type T. For the above example with state names, when we create a SentinelDLL object, T will be a String, so that data is a reference to a String.
next is a reference to the Element after this one in the list.
previous is a reference to the Element before this one in the list.
The Element class is a private inner class. It has the following methods:
A constructor, which takes a reference to an object of type T. It stores this reference in the instance variable data.
toString returns the String representation of this element's data object.
Because each Element stores a reference to an object, strange things can happen if we store a reference to an object and then the object is changed. Therefore, we require that once a reference to an object is stored in an Element, the object itself should not change.
The class SentinelDLL implements the linked list. In fact, it implements the CS10LinkedList interface. The methods of SentinelDLL will need to access the data, next, and previous instance variables of each Element object. Because Element is a private inner class, the methods of SentinelDLL can access its instance variables, even though they are declared as private. No methods outside of SentinelDLL can access the instance variables of Element, and so no methods outside of SentinelDLL can refer to data, next, or previous.
Next we examine the declaration for the class SentinelDLL. It contains several methods, but first let's look at the instance variables.
current references the "current" list element, which we will need for several of the linked-list operations.
sentinel references a special list element, which we call the sentinel.
The scheme is that a linked list has exactly one sentinel, along with zero or more "real" elements. For example, the list above, with the names of three states, would contain four Element objects: the sentinel, and objects for Maine, Idaho, and Utah. The picture looks like the following, where a slash indicates a null reference:
Here, I omitted showing which Element object is pointed to by current. Despite how I had to draw the figure, each of these references points not to individual instance data, but rather to an entire Element object. The sentinel's data is a null reference.
Notice how the list is circular, in that you can start at the sentinel and follow either forward (next) or backward (previous) references and eventually get back to the sentinel.
In this scheme, every linked list, even an empty one, has a sentinel. In an empty list, both references in the sentinel point to the only Element available, namely the sentinel:
It may seem strange to have an "empty" list actually have an Element object in it, but it turns out to really simplify some of the code. You may appreciate this simplicity later on when we examine other ways to implement linked lists.
Having seen how we intend circular, doubly linked lists with a sentinel to be represented, now we examine the methods of the Element and SentinelDLL classes in SentinelDLL.java. The methods for Element are straightforward, so we won't go over them here.
The SentinelDLL constructor makes an empty list with only the sentinel, as the diagram above shows. It also sets the instance variable current to point to the only Element in town, namely the sentinel. Setting the next and previous fields of the sentinel and setting current are done by a call to clear, which makes any list empty (leaving any contents for garbage collection).
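A minimal sketch of this setup follows. It is not the code in SentinelDLL.java; the class name EmptyListSketch is made up, and the nested Element class is a stand-in that assumes the three instance variables described earlier. It shows the constructor creating the sentinel and delegating to clear, which points everything back at the sentinel.

```java
class EmptyListSketch<T> {
    // Stand-in for the private inner Element class (an assumption about its shape).
    private static class Element<T> {
        T data;
        Element<T> next, previous;

        Element(T data) {
            this.data = data;
        }
    }

    private Element<T> sentinel;
    private Element<T> current;

    EmptyListSketch() {
        sentinel = new Element<T>(null);  // the sentinel's data is a null reference
        clear();
    }

    // Makes the list empty: the sentinel references itself in both directions,
    // and current references the sentinel. Any old elements become garbage.
    void clear() {
        sentinel.next = sentinel;
        sentinel.previous = sentinel;
        current = sentinel;
    }

    boolean isEmpty() {
        return sentinel.next == sentinel;
    }

    boolean hasCurrent() {
        return current != sentinel;
    }
}
```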
The toString method for a SentinelDLL is fairly straightforward. It uses a common style of traversing a linked list by a clever for-loop header:
String result = "";
for (Element<T> x = sentinel.next; x != sentinel; x = x.next)
    result += x.toString() + "\n";
return result;
The for-loop iterates through the list, starting from the first non-sentinel on the list (sentinel.next), following next references, and stopping when it gets back to the sentinel. It concatenates the string representation of each element in the list onto a String named result, returning result at the end. Of course, this style of traversing the linked list works only within methods of the SentinelDLL class, since the instance variables sentinel and next are private to their respective classes.
The add method for a list takes an object reference obj, and it inserts it after the Element object referenced by the instance variable current. Notice that we restrict obj to be of type T. The code manipulates references to "splice in" the new element. For example, if we start from an empty list, where current = sentinel, and insert an element with the string Maine, we have the following situation:
The add method makes current reference the new Element object.
The splicing works the same when inserting into any position of the list. For example, starting from the 3-element list from before, we insert Ohio after Idaho as follows:
Let's take a careful look at how add works. First, it makes a new Element that references the given object, and x references this new Element. It is this new Element that we will add to the list. We need to do four things:
1. Make x's next reference the element following the one that current references. The assignment x.next = current.next does so.
2. Make x's previous reference current. The assignment x.previous = current does so.
3. The element following current will have a new predecessor, namely the element that x references, so we need to set the previous instance variable of this element to reference x's element. The assignment current.next.previous = x does so. The expression current.next.previous can be a bit confusing, so let's examine it carefully. current references the current element. current.next references the element following the one that current references. This element has an instance variable previous that references its predecessor (which is current at the time that the add method is called, but it's about to be updated). Since we want to assign to the previous instance variable of the Element object referenced by current.next, we put current.next.previous on the left-hand side of the assignment statement.
4. current will have a new successor, namely the element that x references, so we set the next instance variable of current's element to reference x's element. The assignment current.next = x does so.
As you can easily see from the add code, it takes constant time to insert an element into a circular, doubly linked list with a sentinel. You can also see, by the absence of if-statements, that there are no special cases.
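The four assignments can be put together as in the following sketch. This is not the code from SentinelDLL.java: the class name AddSketch is made up, and the nested Element class is an assumed stand-in for the private inner class. The order of the assignments matters, because current.next must still reference the old successor when it is used.

```java
class AddSketch<T> {
    // Stand-in for the private inner Element class (an assumption about its shape).
    private static class Element<T> {
        T data;
        Element<T> next, previous;

        Element(T data) {
            this.data = data;
        }
    }

    private Element<T> sentinel = new Element<T>(null);
    private Element<T> current;

    AddSketch() {
        sentinel.next = sentinel;
        sentinel.previous = sentinel;
        current = sentinel;
    }

    // Inserts obj after the current element and makes it the current element.
    // There are no if-statements: the sentinel removes every special case.
    void add(T obj) {
        Element<T> x = new Element<T>(obj);
        x.next = current.next;      // x's successor is current's old successor
        x.previous = current;       // x's predecessor is current
        current.next.previous = x;  // the old successor now has x as predecessor
        current.next = x;           // current now has x as successor
        current = x;                // the new element becomes the current one
    }

    @Override
    public String toString() {
        String result = "";
        for (Element<T> x = sentinel.next; x != sentinel; x = x.next)
            result += x.data + "\n";
        return result;
    }

    public static void main(String[] args) {
        AddSketch<String> list = new AddSketch<String>();
        list.add("Maine");
        list.add("Idaho");
        list.add("Utah");
        System.out.print(list);  // elements appear in insertion order
    }
}
```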
The remove method for a list removes the Element object that current references. You never ever remove the sentinel, so the first thing we do is check whether current references the sentinel by calling the hasCurrent method. If current references the sentinel (indicated by hasCurrent returning false), then we print an error message to System.err, rather than to System.out. On some systems, you can suppress regular output printed to System.out, but you have to go to extra lengths to suppress error messages printed to System.err. In Eclipse, when you print to System.err, the message appears in red in the console. We want to make error messages likely to be seen.
Normally, the remove method is not trying to remove the sentinel. We splice the current element out of the list and make current reference its successor in the list.
For example, to remove Idaho from the previous list:
and to remove the only element from a list:
The time to remove an element is constant. As we will see when we examine "simpler" lists, this running time is quite good; with linked lists whose representation appears simpler than that of a circular, doubly linked list with a sentinel, the time to remove an element at the ith position in the list is proportional to i.
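Here is a sketch of the splice, under the same assumptions as before: RemoveSketch is a made-up class name, Element is a stand-in for the private inner class, and the error message text is invented. Only two references in the neighboring elements change, which is why the time is constant.

```java
class RemoveSketch<T> {
    // Stand-in for the private inner Element class (an assumption about its shape).
    private static class Element<T> {
        T data;
        Element<T> next, previous;

        Element(T data) {
            this.data = data;
        }
    }

    private Element<T> sentinel = new Element<T>(null);
    private Element<T> current;

    RemoveSketch() {
        sentinel.next = sentinel;
        sentinel.previous = sentinel;
        current = sentinel;
    }

    boolean hasCurrent() {
        return current != sentinel;
    }

    T get() {
        return hasCurrent() ? current.data : null;
    }

    T getFirst() {
        current = sentinel.next;
        return current.data;  // null when the list is empty
    }

    void add(T obj) {
        Element<T> x = new Element<T>(obj);
        x.next = current.next;
        x.previous = current;
        current.next.previous = x;
        current.next = x;
        current = x;
    }

    // Splices out the current element; its successor becomes current.
    void remove() {
        if (!hasCurrent()) {  // never remove the sentinel
            System.err.println("No current item");
            return;
        }
        current.previous.next = current.next;      // predecessor skips over current
        current.next.previous = current.previous;  // successor skips back over current
        current = current.next;
    }

    @Override
    public String toString() {
        String result = "";
        for (Element<T> x = sentinel.next; x != sentinel; x = x.next)
            result += x.data + "\n";
        return result;
    }

    public static void main(String[] args) {
        RemoveSketch<String> list = new RemoveSketch<String>();
        list.add("Maine");
        list.add("Utah");
        list.getFirst();    // current is Maine
        list.add("Idaho");  // list is Maine, Idaho, Utah; current is Idaho
        list.remove();      // splice out Idaho
        System.out.print(list);
    }
}
```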
The contains method for a SentinelDLL takes a reference obj to an object of the generic type T and looks for an element that equals obj, according to the equals method on the data field of each Element. We traverse the list, calling equals on each element's data, until a match is found. If the contains method finds such an element, it sets current to reference it, so that we can next either add a new element after it or remove it.
We could check to make sure that we haven't returned to the sentinel, along with checking whether we have a match, but we use a clever way to avoid having to check that we haven't returned to the sentinel in each iteration of the loop. We put the value we're looking for in the sentinel. That way, we're guaranteed of finding it. If where we found it was the sentinel, it wasn't there in the first place. If where we found it was not the sentinel, then it really was there. We set sentinel.data to be the same reference as obj before traversing the list, and we make sure to put a null back into sentinel.data after the traversal is done, no matter where in the list the traversal stopped.
When we use the sentinel trick, the for-loop needs no body:
for (x = sentinel.next; !x.data.equals(obj); x = x.next)
;
This process is really linear search. The time to perform it depends on the time to compare two elements. If we denote this comparison time by t, and we say that the list has n elements, then the time to find a list element is proportional to tn in the worst case (when the element is not in the list). If t is a constant that can be ignored, then the worst-case time is proportional to n.
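The sentinel trick can be sketched as follows. ContainsSketch is a made-up class name and Element is an assumed stand-in for the private inner class. Two further assumptions in this sketch: obj is non-null, and current is left unchanged when the search fails (the notes only say what happens on success).

```java
class ContainsSketch<T> {
    // Stand-in for the private inner Element class (an assumption about its shape).
    private static class Element<T> {
        T data;
        Element<T> next, previous;

        Element(T data) {
            this.data = data;
        }
    }

    private Element<T> sentinel = new Element<T>(null);
    private Element<T> current;

    ContainsSketch() {
        sentinel.next = sentinel;
        sentinel.previous = sentinel;
        current = sentinel;
    }

    void add(T obj) {
        Element<T> x = new Element<T>(obj);
        x.next = current.next;
        x.previous = current;
        current.next.previous = x;
        current.next = x;
        current = x;
    }

    T get() {
        return (current != sentinel) ? current.data : null;
    }

    // Linear search using the sentinel trick. Assumes obj is not null.
    boolean contains(T obj) {
        Element<T> x;
        sentinel.data = obj;   // guarantees that the loop finds a match
        for (x = sentinel.next; !x.data.equals(obj); x = x.next)
            ;
        sentinel.data = null;  // restore the sentinel, wherever we stopped
        if (x == sentinel)
            return false;      // the only match was the planted one
        current = x;           // found for real: make it the current element
        return true;
    }

    public static void main(String[] args) {
        ContainsSketch<String> list = new ContainsSketch<String>();
        list.add("Maine");
        list.add("Idaho");
        list.add("Utah");
        System.out.println(list.contains("Idaho"));
        System.out.println(list.contains("Ohio"));
    }
}
```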
The remaining list functions are really easy. Note that the later functions use the isEmpty, hasCurrent, and hasNext predicates rather than just doing the tests directly. Accessing the linked list through these methods makes changing the representation easier.
isEmpty returns true if and only if the only list element is the sentinel. That is the case precisely when the sentinel references itself.
hasCurrent returns true if and only if there is a current element. That is the case precisely when current does not reference the sentinel.
hasNext returns true if there are both a current element and another element after the current element.
getFirst sets the current reference to the first element in the list and returns the data in the first element. If the list is empty, then current must reference the sentinel, and its data must be null, and so getFirst returns null when the list is empty.
getLast is like getFirst except that it sets current to reference the last element in the list and returns its data.
addFirst adds a new element at the head of the list and makes it the current element.
addLast adds a new element at the tail of the list and makes it the current element.
next moves current to current.next and returns the data in that next element. It returns null if there is no next element.
get returns the data in the current element, or null if there is no current element.
set assigns to the current element, printing an error message to System.err if there is no current element.
All of the above methods are in the CS10LinkedList interface. In addition, the SentinelDLL class contains one method (other than the constructor) that is not in the CS10LinkedList interface:
previous moves current to current.previous and returns the data in that previous element. It returns null if there is no previous element.
SentinelDLL class
We can use the ListTest.java program to test the SentinelDLL class. You can use the debugger to examine the linked list if you like.
Notice that to declare and create the linked list, we specify the type that will be stored in the list. Here, it's a String:
CS10LinkedList<String> theList = new SentinelDLL<String>();
Because theList is declared as a reference to the interface CS10LinkedList, we cannot call the previous or hasPrevious methods in this driver.
This test program reads one-letter commands (or at least looks only at the first letter of each command). It also reads names (or any strings) to insert into the list or search for in the list. To do these things it uses the Scanner class (which must be imported).
System.in is a static variable of the System class, and it refers to standard input, which is normally the terminal. Calling new Scanner(System.in) creates a scanner that gets its input from System.in.
The Scanner class is fairly complicated, but we will use only one method from it for now. nextLine() reads everything up to the next newline character and returns it as a String (discarding the newline character). It is also possible to read integers, doubles, bytes, booleans, etc. We will learn more about this later.
The test program is really very rudimentary. Once it has read a command line, it uses the String method charAt(0) to get the character in position 0 of the string (which is the first character). It then uses a switch statement to choose between the possible cases. Note that you have to put a break at the end of every case; if you don't, the program will "fall through" to the next case. The default case at the end is executed if no other case is chosen.
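The pattern just described might look like the following sketch. It is not ListTest.java: the class name CommandLoopSketch, the command letters, and the messages are all made up for illustration, and the loop records what it would do instead of operating on a list. The Scanner is passed in as a parameter so the same method works with System.in or with a String of canned input.

```java
import java.util.Scanner;

class CommandLoopSketch {
    // Reads one command per line until 'q' or end of input, and returns a log
    // of the actions chosen. The command letters here are hypothetical.
    static String run(Scanner input) {
        StringBuilder log = new StringBuilder();
        while (input.hasNextLine()) {
            String line = input.nextLine();  // the newline itself is discarded
            if (line.isEmpty())
                continue;                    // charAt(0) would throw on an empty line
            char command = line.charAt(0);   // look only at the first character
            switch (command) {
                case 'a':
                    log.append("add\n");
                    break;                   // without break, we would fall through
                case 'r':
                    log.append("remove\n");
                    break;
                case 'q':
                    log.append("quit\n");
                    return log.toString();
                default:                     // no other case matched
                    log.append("unknown command: " + command + "\n");
            }
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.print(run(new Scanner(System.in)));
    }
}
```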
Although doubly linked circular linked lists with sentinels are the easiest linked lists to implement, they can take a lot of space. There are two references (next and previous) in each element, plus the sentinel node. Some applications create a huge number of very short linked lists. (One is hashing, which we'll see later in this course.) In such situations, the extra reference in each node and the extra node for the sentinel can take substantial space.
The code for singly linked lists has more special cases than the code for circular, doubly linked lists with a sentinel, and the time to remove an element in a singly linked list is proportional to the length of the list in the worst case rather than the constant time it takes in a circular, doubly linked list with a sentinel. (Your second lab assignment will show how to avoid this by keeping track of the element before the current item instead of the current item.)
The SLL class in SLL.java implements the CS10LinkedList interface with a generic type T, just as the SentinelDLL class does. A singly linked list, as implemented in the SLL class, has two structural differences from a circular, doubly linked list with a sentinel:
Each Element object in a singly linked list has no backward (previous) reference; the only navigational aid is a forward (next) reference.
There is no sentinel, nor does the list have a circular structure. Instead, the SLL class maintains references head to the first element on the list and tail to the last element on the list.
A singly linked list with Maine, Idaho, and Utah would look like
A singly linked list with only one element would look like
And an empty singly linked list looks like
The file SLL.java contains the class definitions for Element and SLL for a singly linked list. These declarations are similar to those for circular, doubly linked lists with a sentinel. As before, the Element class is a private inner class, and all method declarations are the same. The only difference is in the instance data. We can use the same ListTest.java driver to test the singly linked list class, as long as we change the line creating the list to read
CS10LinkedList<String> theList = new SLL<String>();
Let's examine the list methods in SLL.java for singly linked lists. We will highlight those that differ from those for circular, doubly linked lists with a sentinel.
The clear method, which is called by the SLL constructor as well as being publicly available, makes an empty list by setting all instance variables (head, tail, and current) to null.
As before, the add method places a new Element object after the one that current references. Without a special case, however, there would be no way to add an element as the new head of the list, since there is no sentinel to put a new element after. Therefore, if current is null, then we add the new element as the new list head.
The code, therefore, has two cases, depending on whether current is null. If it is, we have to make the new element reference what head was referencing and then make head reference the new element. Otherwise, we make the new element reference what the current element is referencing and then make current reference the new element. If the new element is added after the last element on the list, we also have to update tail to reference the new element.
Compare this code to the add code for a circular, doubly linked list with a sentinel. Although there is only one directional link to maintain for a singly linked list, the code has more cases and is more complex. For either implementation, however, adding an element takes constant time.
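For comparison, here is a sketch of the singly linked add. It is not the code in SLL.java: SLLAddSketch is a made-up class name, and Element is an assumed stand-in with only data and next. The sketch also assumes, as in the interface description, that the new element becomes the current element.

```java
class SLLAddSketch<T> {
    // Stand-in for the private inner Element class: no previous reference.
    private static class Element<T> {
        T data;
        Element<T> next;

        Element(T data) {
            this.data = data;
        }
    }

    // All three are null in an empty list.
    private Element<T> head, tail, current;

    T getFirst() {
        current = head;
        return (head == null) ? null : head.data;
    }

    T getLast() {
        current = tail;
        return (tail == null) ? null : tail.data;
    }

    // Inserts obj after the current element, or at the head if there is none.
    void add(T obj) {
        Element<T> x = new Element<T>(obj);
        if (current == null) {  // special case: x becomes the new head
            x.next = head;
            head = x;
        } else {                // usual case: splice x in after current
            x.next = current.next;
            current.next = x;
        }
        if (x.next == null)     // x has no successor, so it is the new tail
            tail = x;
        current = x;            // the new element becomes the current one
    }

    @Override
    public String toString() {
        String result = "";
        for (Element<T> x = head; x != null; x = x.next)
            result += x.data + "\n";
        return result;
    }

    public static void main(String[] args) {
        SLLAddSketch<String> list = new SLLAddSketch<String>();
        list.add("Maine");
        list.add("Utah");
        list.getFirst();    // current is Maine
        list.add("Idaho");  // goes between Maine and Utah
        System.out.print(list);
    }
}
```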
As mentioned, removing an element from a singly linked list takes time proportional to the length of the list in the worst case (in other words, time that is linear in the length of the list), which is worse than the constant time required for a circular, doubly linked list with a sentinel. Why does it take linear time, rather than constant time? The reason is that the previous reference in a doubly linked list really helps. In order to splice out the current element, we need to know its predecessor in the list, because we have to set the next instance variable of the predecessor to the value of current.next. With the previous reference, we can easily find the predecessor in constant time. With only next references available, the only way we have to determine an element's predecessor is to traverse the list from the beginning until we find an element whose next value references the element we want to splice out. And that traversal takes linear time in the worst case, which is when the element to be removed is at or near the end of the list.
The remove method first checks that current, which references the Element object to be removed, is non-null. If current is null, we print an error message and return. Normally, current is non-null, and the remove method finds the predecessor pred of the element that current references. Even this search for the predecessor has two cases, depending on whether the element to be removed is the first one in the list. If we are removing the first element, then we set pred to null and update head. Otherwise, we have to perform a linear search, stopping when pred.next references the same element as current; once this happens, we know that pred is indeed the predecessor of the current element. (There is also some "defensive coding," just in case we simply do not find an element pred such that pred.next references the same element as current. We do not expect this to ever happen, but if it does, we have found a grave error and so we print an error message and return.) Assuming that we find a correct predecessor, we splice out the current element. We also have to update tail if we are removing the last element of the list.
The bottom line is that, compared with the remove code for a circular, doubly linked list with a sentinel, the remove code for a singly linked list is more complex, has more possibilities for error, and can take longer.
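The cases described above can be seen in the following sketch. It is not the code in SLL.java: SLLRemoveSketch is a made-up class name, Element is an assumed stand-in, and the error message text is invented. The linear search for pred is what makes removal take time proportional to the length of the list in the worst case.

```java
class SLLRemoveSketch<T> {
    // Stand-in for the private inner Element class: no previous reference.
    private static class Element<T> {
        T data;
        Element<T> next;

        Element(T data) {
            this.data = data;
        }
    }

    private Element<T> head, tail, current;

    T get() { return (current == null) ? null : current.data; }

    T getFirst() { current = head; return get(); }

    T getLast() { current = tail; return get(); }

    void add(T obj) {
        Element<T> x = new Element<T>(obj);
        if (current == null) {
            x.next = head;
            head = x;
        } else {
            x.next = current.next;
            current.next = x;
        }
        if (x.next == null)
            tail = x;
        current = x;
    }

    // Removes the current element; its successor (possibly null) becomes current.
    void remove() {
        if (current == null) {
            System.err.println("No current item");
            return;
        }
        Element<T> pred;
        if (current == head) {  // removing the first element: no predecessor
            pred = null;
            head = current.next;
        } else {                // linear search for the predecessor
            for (pred = head; pred != null && pred.next != current; pred = pred.next)
                ;
            if (pred == null) { // defensive: current was not on the list
                System.err.println("Current item is not in the list");
                return;
            }
            pred.next = current.next;  // splice out the current element
        }
        if (current == tail)    // removing the last element: tail moves back
            tail = pred;
        current = current.next;
    }

    @Override
    public String toString() {
        String result = "";
        for (Element<T> x = head; x != null; x = x.next)
            result += x.data + "\n";
        return result;
    }
}
```

With a list of Maine, Idaho, and Utah and Idaho as the current element, calling remove leaves Maine and Utah, with Utah as the current element.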
toString for a list
The toString for a singly linked list is similar to how we print a circular, doubly linked list with a sentinel, except that now we start from head rather than sentinel.next and that the termination condition is not whether we come back to the sentinel but rather whether the reference we have is null. The for-loop header, therefore, is
for (x = head; x != null; x = x.next)
The contains method for a singly linked list is perhaps a little shorter than for a circular, doubly linked list with a sentinel, because now we do not replace the object reference in the sentinel. The for-loop header, therefore, becomes a little more complicated. We have to check whether we have run off the end of the list (which we did not have to do when we stored a reference to the object being searched for in the sentinel) and then, once we know we have not run off the end, whether the element we are looking at equals the object we want. The bodyless for-loop is
for (x = head; x != null && !x.data.equals(obj); x = x.next)
;
Although the code surrounding the for-loop simplifies with a singly linked list, the loop itself is cleaner for the circular, doubly linked list with a sentinel. Either way, it takes linear time in the worst case.
isEmpty is easy, but slightly different from the version for a circular, doubly linked list with a sentinel. We simply return a boolean that indicates whether head is null.
hasCurrent returns true if and only if there is a current element. We simply return a boolean indicating whether current is not null.
hasNext checks whether there is a current element and compares the next field of the current element against null, rather than against the sentinel.
getFirst is different, as it sets current to head.
getLast changes, too, setting current to tail.
addFirst and addLast are similar to a circular, doubly linked list with a sentinel. However, addLast has to deal with an empty list separately.
get is unchanged.
next is identical to the version in the doubly linked list. (This is an advantage of calling hasNext rather than doing the test directly in this method.)
We do not implement the previous and hasPrevious methods. We are not required to, because they're not in the CS10LinkedList interface.
It is also possible to have a dummy list head, even if the list is not circular. If we do so, we can eliminate some special cases, because adding at the head becomes more similar to adding anywhere else. (Instead of changing the head you update a next field.) It is also possible to have a currentPred that references the element before the element that we think of as the current element, so that removal can be done in constant time. Using a currentPred reference instead of current to indicate the item that is conceptually the "current" requires a number of changes in the code. Lab 2 will ask you to figure out what these changes are.
It is also possible to have a circular singly linked list, either with or without a sentinel.