There will be no class this Friday (April 18). Also, office hours for Thursday, April 17 are canceled.
Short Assignment 8 is due Friday.
A stack is a LIFO (Last In, First Out) data structure. The book compares a stack to a spring-loaded stack of cafeteria trays or a PEZ dispenser. (Though I have to question how well the authors remember PEZ. They are candies, but definitely not mint.) The abstract data type (ADT) Stack has at least the following operations (in addition to a constructor to create an empty stack):
push: add to the top of the stack
pop: remove the top element from the stack and return it
peek: return the top element but do not remove it
isEmpty: return a boolean indicating whether the stack is empty
So what good is a stack? It has many, many applications. You already know that the run-time stack handles allocating memory in method calls and freeing memory on returns. (It also allows recursion.) A stack provides an easy way to reverse a list or string: push each element on the stack then pop them all off. They come off in the opposite order that they went on. Stacks are good for matching parentheses or braces or square brackets or open and close tags in HTML.
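Two of these applications, reversing a string and matching brackets, can be sketched in a few lines of Java. This is only an illustration: the class name StackDemo and its method names are made up here, and it uses java.util.ArrayDeque as the stack rather than any course class.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackDemo {
    // Reverse a string: push every character, then pop them all off.
    static String reverse(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            stack.push(c);
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());
        }
        return out.toString();
    }

    // Check that (), [] and {} are properly nested; other characters are ignored.
    static boolean isBalanced(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);
            } else if (c == ')' || c == ']' || c == '}') {
                if (stack.isEmpty()) {
                    return false;           // a closer with nothing open
                }
                char open = stack.pop();    // most recently opened bracket
                boolean matches = (open == '(' && c == ')')
                        || (open == '[' && c == ']')
                        || (open == '{' && c == '}');
                if (!matches) {
                    return false;
                }
            }
        }
        return stack.isEmpty();             // leftovers mean something was never closed
    }
}
```

The bracket matcher works because the most recently opened bracket is always the one that has to close first, which is exactly the LIFO order.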
A stack is also how HP calculators handle reverse Polish notation. In this notation the operator follows both operands, and no parentheses are needed. So the expression (3 + 5) * (4 - 6) becomes 3 5 + 4 6 - *. To evaluate it, push operands onto the stack when you encounter them. When you reach an operator, pop the top two values from the stack and apply the operator to them. (The first value popped becomes the second operand in the operation.) Push the result of the operation back on the stack. At the end there is a single value on the stack, and that is the value of the expression.
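The evaluation procedure just described might look like this in Java. It is a minimal sketch: the class name RpnEvaluator is hypothetical, tokens are assumed to be separated by whitespace, and only the four binary operators +, -, *, / are handled.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RpnEvaluator {
    // Evaluate a whitespace-separated reverse Polish expression.
    static double evaluate(String expression) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            if (token.length() == 1 && "+-*/".contains(token)) {
                if (stack.size() < 2) {
                    throw new IllegalArgumentException("missing operand before " + token);
                }
                double second = stack.pop();   // first popped is the second operand
                double first = stack.pop();
                stack.push(apply(token.charAt(0), first, second));
            } else {
                stack.push(Double.parseDouble(token));   // operand: just push it
            }
        }
        if (stack.size() != 1) {
            throw new IllegalArgumentException("malformed expression");
        }
        return stack.pop();
    }

    private static double apply(char op, double a, double b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b;
        }
    }
}
```

For the expression above, RpnEvaluator.evaluate("3 5 + 4 6 - *") pushes 8 and then -2 before the final multiplication, giving -16.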
A stack is also how you can do depth-first search of a maze or a graph. Let's consider a maze. Start by pushing the start square onto the stack. Then repeatedly do the following:
Quit when you reach the goal square.
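The individual steps are not spelled out here, so the following is only one common way to fill them in, not necessarily the procedure from lecture. It assumes the maze is a rectangular grid of characters in which '#' marks a wall, and it marks squares as visited when they are pushed. The class name MazeDfs and its method are made up for this sketch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MazeDfs {
    // Returns true if the goal square can be reached from the start square.
    // Assumption: maze is rectangular and '#' is a wall; anything else is open.
    static boolean reachable(char[][] maze, int startRow, int startCol,
                             int goalRow, int goalCol) {
        boolean[][] visited = new boolean[maze.length][maze[0].length];
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] {startRow, startCol});
        visited[startRow][startCol] = true;
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

        while (!stack.isEmpty()) {
            int[] square = stack.pop();
            if (square[0] == goalRow && square[1] == goalCol) {
                return true;                      // quit when we reach the goal
            }
            for (int[] move : moves) {
                int r = square[0] + move[0];
                int c = square[1] + move[1];
                boolean inside = r >= 0 && r < maze.length
                        && c >= 0 && c < maze[r].length;
                if (inside && maze[r][c] != '#' && !visited[r][c]) {
                    visited[r][c] = true;         // never push a square twice
                    stack.push(new int[] {r, c});
                }
            }
        }
        return false;                             // stack ran dry: no path exists
    }
}
```

Because the most recently discovered square is explored first, the search runs deep down one corridor before backing up to try another.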
Because Stack is an ADT, an interface should specify its operations. The CS10Stack interface in CS10Stack.java contains the operations given above.
The class java.util.Stack also has these operations, but instead of the name isEmpty they use empty. It also has an additional operation, size.
The book has its own version of the ADT, but instead of the name peek they use top, and they also add size. You would think that computer scientists could agree on a standard set of names. Yeah, not so much. At least we all agree on push and pop.
One question is how to handle pop or peek on an empty stack. Both Java and the book throw an exception. That seems a bit harsh, and so CS10Stack is more forgiving: it returns null.
How do we implement a stack? One simple option is to use an array. The implementation has two instance variables: an array called stack and an int called top that keeps track of the position of the top of the stack. In an empty stack top equals −1. To push, add 1 to top and save the value pushed in stack[top]. To peek just return stack[top] (after checking that top is nonnegative). pop is peek but with top--.
This implementation is fast (all operations take O(1) time), and it is space efficient (except for the unused part of the array). The drawback is that the array can fill up, and when it does, you get an exception on push.
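A rough sketch of the array implementation follows. The class name ArrayStack, the capacity passed to the constructor, and the explicit IllegalStateException on a full stack are assumptions made for illustration; returning null from pop and peek on an empty stack mirrors what CS10Stack is described as doing.

```java
class ArrayStack<T> {
    private final T[] stack;   // holds the elements, bottom of the stack at index 0
    private int top = -1;      // index of the top element; -1 means the stack is empty

    @SuppressWarnings("unchecked")
    ArrayStack(int capacity) {
        stack = (T[]) new Object[capacity];   // Java cannot create a generic array directly
    }

    void push(T value) {
        if (top == stack.length - 1) {
            throw new IllegalStateException("stack is full");
        }
        top++;
        stack[top] = value;
    }

    T peek() {
        return top >= 0 ? stack[top] : null;
    }

    T pop() {
        if (top < 0) {
            return null;
        }
        T value = stack[top];
        stack[top] = null;     // drop the reference so it can be garbage collected
        top--;
        return value;
    }

    boolean isEmpty() {
        return top == -1;
    }
}
```

Every method does a constant amount of work, which is where the O(1) bounds come from.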
An alternative that avoids that problem uses a linked list. A singly linked list suffices. The top of the stack is the head of the list. The push operation adds to the front of the list, and the pop operation removes from the front of the list. All operations take O(1) time in this implementation as well. You need to have space for the links in the linked list, but you never have empty space as you do in the array implementation.
Another way that avoids the problem of the array being full is to use an ArrayList. To push, you add to the end of the ArrayList, and to pop you remove the last element. The ArrayList can grow, so it never becomes full. The code for this implementation is in ArrayListStack.java. Note that you don't even need to keep track of the top. The ArrayList does it for you.
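The idea fits in a handful of lines. The sketch below is not the contents of ArrayListStack.java; the class name ListBackedStack is made up, and returning null on an empty stack is assumed from the description of CS10Stack.

```java
import java.util.ArrayList;

class ListBackedStack<T> {
    private final ArrayList<T> items = new ArrayList<>();

    void push(T value) {
        items.add(value);                        // the end of the list is the top
    }

    T pop() {
        return items.isEmpty() ? null : items.remove(items.size() - 1);
    }

    T peek() {
        return items.isEmpty() ? null : items.get(items.size() - 1);
    }

    boolean isEmpty() {
        return items.isEmpty();
    }
}
```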
Do these operations all take O(1) time? It looks like it, as long as add and remove at the end of the ArrayList take O(1) time. The remove operation certainly takes O(1) time. The add operation usually does too, but sometimes it can take longer.
To understand why an add operation can take more than constant time, we need to look at how an ArrayList is implemented. The underlying data structure is an array. There is also a variable to keep track of the size, from which we can easily get the last occupied position in the array. Adding to the end just increases the size by 1 and assigns the object added to the next position. However, what happens when the array is full? A new array is allocated and everything is copied to the new array. Doing so takes time Θ(n), where n is the number of elements in the ArrayList.
If we had to copy the entire ArrayList upon each add operation, the process would be very slow. It would in fact take time O(n²) to add n elements to the end of the ArrayList. That would be too slow. So instead, when the ArrayList is full, the new array allocated is not just one position bigger than the old one, but much bigger. One option is to double the size of the array. Then a lot of add operations can happen before the array needs to be copied again.
With this approach, n add operations will take O(n) time. In other words, the average time per operation is only O(1). We call this the amortized time. Amortization is what accountants do when saving up to buy an expensive item like a computer. Suppose that you want to buy a computer every 3 years and it costs 1500 dollars. One way to think about this is to have no cost the first two years and 1500 dollars the third year. An alternative is to set aside 500 dollars each year. In the third year you can take the accumulated 1500 dollars and spend it on the computer. So the computer costs 500 dollars a year, amortized. (In tax law it goes the other direction: you spend the 1500 dollars, but instead of being able to deduct the whole thing the first year you have to spread it over the 3-year life of the computer, and you deduct 500 dollars a year.)
For the ArrayList case, we can think in terms of tokens that can pay for copying something into the array. Suppose that we have just doubled the array size from n to 2n, which means that we have just copied n elements; let's call these n elements that were copied the "old elements." We spend all our tokens copying the old elements, so that immediately after copying, we have no tokens available. By the time the array has 2n elements and the array size doubles again, we must have one token for each of the 2n elements to pay for copying 2n elements.
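Here's how we do it. We charge three tokens for each add:
One token pays for storing the newly added element in the array.
One token stays with the newly added element, to pay for copying it at the next doubling.
One token goes to one of the n old elements that does not yet have a token, to pay for copying it at the next doubling.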
Therefore, by the time the array has 2n elements, every element has a token, which pays for copying it.
By thinking about it in this way, we see that the cost per add operation is a constant (three tokens).
The array size does not have to double when the array fills. For example, it could increase by a factor of 3/2, in which case you can modify the argument to show that charging four tokens per add operation works.
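To see the difference between the two growth rules in numbers, here is a small simulation that counts how many element copies are made while appending n items. It does not touch a real ArrayList; it only tracks a size and a capacity, starting from a capacity of 1 (an assumption for the sketch). The class and method names are made up.

```java
class GrowthDemo {
    // Copies made when the capacity doubles each time the array is full.
    static long copiesWhenDoubling(int n) {
        long copies = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;      // every existing element moves to the new array
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    // Copies made when the capacity grows by just one slot each time.
    static long copiesWhenGrowingByOne(int n) {
        long copies = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;
                capacity += 1;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("doubling:      " + copiesWhenDoubling(n));
        System.out.println("growing by one: " + copiesWhenGrowingByOne(n));
    }
}
```

For n = 1024, doubling copies 1 + 2 + 4 + ... + 512 = 1023 elements in total, fewer than 2n, while growing by one slot copies 1 + 2 + ... + 1023 = 523,776 elements, which is the quadratic behavior we wanted to avoid.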
A Queue is a FIFO (First In, First Out) data structure. "Queueing up" is a mostly British way of saying standing in line. And a Queue data structure mimics a line at a grocery store or bank where people join at the back, are served when they get to the front, and nobody is allowed to cut into the line. The ADT Queue has at least the following operations (in addition to a constructor to create an empty Queue):
enqueue: add an element at the rear of the queue
dequeue: remove and return the first element in the queue
front: return the first element but don't remove it
isEmpty: return a boolean indicating whether the queue is empty
What do we use a Queue for? An obvious answer is that it is useful in simulations of lines at banks, toll booths, etc. But more important are the queues within computer systems for things like printers. When you submit a print job you are enqueued. When the print job gets to the front of the queue, it is dequeued and printed. Time-sharing systems use round-robin scheduling. The first job is dequeued and run for a fixed period of time or until it blocks (i.e., has to wait) for I/O or some other reason. Then it is enqueued. This process repeats as long as there are jobs in the queue. New jobs are enqueued. Jobs that finish leave the system instead of being enqueued. In this way, every job gets a fair share of the CPU. The book shows how to solve the Josephus problem using a queue. It is basically a round-robin scheduler, where every kth job is killed instead of being enqueued again.
A queue can also be used to search a maze. The same process is used as for the stack, but with a queue as the ADT. This leads to breadth-first search, and will find the shortest path through the maze.
An obvious way to implement a queue is to use a linked list. A singly linked list suffices, if it includes a tail pointer. Enqueue at the tail and dequeue from the head. All operations take Θ(1) time.
If you use a circular, doubly linked list with a sentinel, you can organize the list the opposite way: enqueue at the head and dequeue from the tail. If you were to try it this way for a singly linked list, you would keep having to run down the entire list to find the predecessor to the tail when dequeuing, and so this operation would take Θ(n) time.
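The first arrangement, a singly linked list with a tail pointer, enqueuing at the tail and dequeuing from the head, can be sketched as follows. The class name LinkedQueue is hypothetical, and returning null from dequeue and front on an empty queue is an assumption borrowed from how CS10Stack treats an empty stack.

```java
class LinkedQueue<T> {
    private static class Node<T> {
        final T data;
        Node<T> next;
        Node(T data) { this.data = data; }
    }

    private Node<T> head;   // front of the queue, where we dequeue
    private Node<T> tail;   // rear of the queue, where we enqueue

    void enqueue(T value) {
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = node;            // queue was empty
        } else {
            tail.next = node;
        }
        tail = node;
    }

    T dequeue() {
        if (head == null) {
            return null;
        }
        T value = head.data;
        head = head.next;
        if (head == null) {
            tail = null;            // removed the only element
        }
        return value;
    }

    T front() {
        return head == null ? null : head.data;
    }

    boolean isEmpty() {
        return head == null;
    }
}
```

Neither operation ever walks the list, so both run in constant time. Dequeuing at the tail instead would require finding the tail's predecessor, which is the Θ(n) problem described above.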
The textbook presents a Queue interface and part of an implementation using a singly linked list. They also include a size method. The interface CS10Queue in CS10Queue.java has the methods given above, and LinkedListQueue.java is an implementation that uses a SentinelDLL. It could be changed to use an SLL by changing one declaration and one constructor call. All operations would still take Θ(1) time.
Java also has a Queue interface. It does not use the conventional names. Instead of enqueue it has add and offer. Instead of front it has element and peek. Instead of dequeue it has remove and poll. Why two choices for each? The first choice in each pair throws an exception if it fails. The second fails more gracefully, returning false if offer fails (because the queue is full) and null if peek or poll is called on an empty queue. At least isEmpty and size keep their conventional names.
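A short demonstration of the two flavors, using standard library classes. ArrayDeque serves as an ordinary unbounded queue, and ArrayBlockingQueue with capacity 1 is used only because it is a convenient queue that can actually be full. The class name QueueMethodsDemo and the log messages are made up for this sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

class QueueMethodsDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Queue<String> empty = new ArrayDeque<>();
        log.add("peek on empty: " + empty.peek());      // graceful: null
        log.add("poll on empty: " + empty.poll());      // graceful: null
        try {
            empty.element();
        } catch (NoSuchElementException e) {
            log.add("element on empty threw");
        }
        try {
            empty.remove();
        } catch (NoSuchElementException e) {
            log.add("remove on empty threw");
        }

        Queue<String> full = new ArrayBlockingQueue<>(1);   // room for one element
        full.add("a");
        log.add("offer on full: " + full.offer("b"));       // graceful: false
        try {
            full.add("b");
        } catch (IllegalStateException e) {
            log.add("add on full threw");
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```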
A deque (pronounced "deck") is a double-ended queue. You can add or delete from either end. A minimal set of operations is
addFirst: add to the head
addLast: add to the tail
removeFirst: remove and return the first element
removeLast: remove and return the last element
isEmpty: return a boolean indicating whether the deque is empty
Additional operations include
getFirst: return the first element but do not remove it
getLast: return the last element but do not remove it
size: return the number of elements in the deque
A deque can be used as a stack, as a queue, or as something more complex. In fact, the Java documentation recommends using a Deque instead of the legacy Stack class when you need a stack. This is because the Stack class, which extends the Vector class, includes non-stack operations (e.g., searching through the stack for an item). (Vector has largely been superseded by ArrayList and is now considered a legacy class, although it has not been formally deprecated.)
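Here is a small illustration of one deque playing both roles, using java.util.ArrayDeque. The class name DequeDemo is made up; the Deque methods are the standard ones, where push and pop work at the head of the deque.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DequeDemo {
    // Used as a stack: push and pop both work at the head.
    static String asStack() {
        Deque<String> deque = new ArrayDeque<>();
        deque.push("a");
        deque.push("b");
        deque.push("c");
        return deque.pop() + deque.pop() + deque.pop();
    }

    // Used as a queue: add at the tail, remove from the head.
    static String asQueue() {
        Deque<String> deque = new ArrayDeque<>();
        deque.addLast("a");
        deque.addLast("b");
        deque.addLast("c");
        return deque.removeFirst() + deque.removeFirst() + deque.removeFirst();
    }

    public static void main(String[] args) {
        System.out.println(asStack());   // cba
        System.out.println(asQueue());   // abc
    }
}
```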
One way to implement a deque is with a SentinelDLL. If you look at the methods you will see that all of these operations are already there except for the two remove operations and size(). The remove operations can be implemented by calling either getFirst or getLast and then calling remove. The size operation can be left out or can be implemented by adding a count instance variable to keep track of the number of items in the deque. Each of these operations requires Θ(1) time.
Once again, Java provides two versions of each deque operation. The two "add" operations have corresponding "offer" operations (offerFirst and offerLast). The two "remove" operations have corresponding "poll" operations, and the two "get" operations have corresponding "peek" operations. These alternate operations do not throw exceptions when they fail; instead they return false or null.
In recent lectures we saw linked lists in which the notion of the current element was part of the SentinelDLL or SLL object. Although in some ways this is convenient, in others it is not. For example, if some method is going through the list and passes the linked list to another method, that method can change the current element. It would be nice if there were some way that each method could have its own independent concept of the element in the list that it is currently dealing with.
In fact, we don't have to incorporate current as an instance variable of SentinelDLL or SLL. We'll focus on modifying the SentinelDLL class today, and we'll see how to make a separate object that knows how to traverse and modify a given list. By making it a separate object, we can have any number of them active at any time. In other words, we could have 0, 1, 2, 3, or any other number of such objects around, and each could have its own notion of the current element of the list. Our modification of the SentinelDLL class will not have the instance variable current, nor will it have anything like current. Therefore it will not have get, remove, next, hasNext, previous, hasPrevious, add, or set methods.
This style of going through a data structure is so common that there's a name for it: an iterator. In fact, it's the basis of one of the standard interfaces in Java: the Iterator interface.
Iterator interface
The Iterator interface consists of three methods:
hasNext returns a boolean indicating whether there is a next element in the iteration through the data structure.
next returns a reference to the next object in the data structure, and it advances the iteration by one place.
remove deletes the object returned by the most recent call to next. It might happen that there is no such object to delete, in which case remove throws an IllegalStateException. This method is optional: an iterator that does not support removal has remove throw an UnsupportedOperationException instead (since Java 8 the interface supplies that behavior as a default, so a class need not define remove at all). The remove method should be called at most once each time next is called.
Iterators apply to lots of different data structures, not just linked lists. There is a general style of using the Iterator interface. To demonstrate it, we need a class that allows us to get an Iterator for the contents of the collection. The ArrayList class is one such class. The driver in IteratedArrayList.java shows how.
If the Iterator interface is implemented properly, then creating an object that implements the Iterator interface starts an iteration. In IteratedArrayList.java, the iterator for an ArrayList needs a reference to the ArrayList, and it starts the iteration. Then we typically have a while-loop, whose header calls the hasNext method. Within the body of the while-loop, a call to next fetches the next element in the data structure, and the call to next may be followed by a call to remove. IteratedArrayList.java has two iterations through the ArrayList: one to print out all elements and remove every other one, and one to show that the first iteration removed every other element.
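The paradigm looks roughly like this. The sketch is not IteratedArrayList.java; the class name IteratorRemoveDemo is made up, and which of each pair gets removed (here the second, fourth, and so on) is an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class IteratorRemoveDemo {
    // Returns a copy of items with every other element (2nd, 4th, ...) removed.
    static List<String> removeEveryOther(List<String> items) {
        List<String> list = new ArrayList<>(items);
        Iterator<String> iter = list.iterator();   // creating the iterator starts the iteration
        boolean removeThisOne = false;
        while (iter.hasNext()) {
            String item = iter.next();             // fetch the element and advance
            System.out.println(item);
            if (removeThisOne) {
                iter.remove();                     // deletes the element just returned by next
            }
            removeThisOne = !removeThisOne;
        }
        return list;
    }
}
```

Called on the list a, b, c, d, e, this prints all five elements and returns a list holding a, c, e.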
We will sometimes see iterators used in for-loops rather than while-loops, but that's OK. After all, a for-loop is just a while-loop in disguise.
You might see similarities between the foreach-loops that we used to run through arrays and iterators. In fact, a foreach-loop for a collection of objects translates into code that uses iterators! Foreach-loops work for arrays, ArrayLists, and anything else that is Iterable. However, an iterator gives us one power that foreach-loops do not: it allows us to remove items.
ArrayList
How might we implement an iterator for an ArrayList? Here is one way to do it.
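As a preview, the design that the following description spells out can be sketched in Java like this. The class name SimpleArrayListIterator is made up, and the NoSuchElementException check in next is an addition that the description does not mention.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.NoSuchElementException;

class SimpleArrayListIterator<T> implements Iterator<T> {
    private final ArrayList<T> list;         // the list being iterated over
    private int position = -1;               // index just before the next element to return
    private boolean nextWasCalled = false;   // true if next was called since the last remove

    SimpleArrayListIterator(ArrayList<T> list) {
        this.list = list;
    }

    @Override
    public boolean hasNext() {
        return position < list.size() - 1;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();   // assumption: not part of the description
        }
        nextWasCalled = true;
        position++;
        return list.get(position);
    }

    @Override
    public void remove() {
        if (!nextWasCalled) {
            throw new IllegalStateException("next has not been called since the last remove");
        }
        list.remove(position);    // everything after position shifts left by one
        nextWasCalled = false;
        position--;               // so step back to stay just before the next element
    }
}
```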
Create an object that has three instance variables:
int position: the index of the position before the one that we will return when next is called. It should be initialized to −1.
boolean nextWasCalled: true if next was called since the most recent remove. It should be initialized to false.
list: a reference to the ArrayList that created the iterator.
We can then implement the Iterator methods as follows:
hasNext tests if position < list.size()-1. (Note that when position == list.size()-1 you are saying that a call of next should return list.get(list.size()), which does not exist.)
next sets nextWasCalled to true, increments position, and returns list.get(position).
remove throws an exception if nextWasCalled is false. Otherwise it calls list.remove(position) and sets nextWasCalled to false. The next item (and the rest of the ArrayList) will be moved left one position, and so remove also decrements position.
When we use an iterator in a linked list, we often want more functionality than the standard Iterator interface provides. In fact, Java supplies a standard ListIterator interface. Its concept of "current" is different from the one we have seen. It has a "cursor position" between two elements in the list. A call to next returns the item after the cursor and moves the cursor forward. A call to previous returns the item before the cursor and moves the cursor backwards. Because of the way this works, alternating calls to next and previous will keep returning the same element. In addition to the methods in Iterator, the ListIterator interface requires the following methods:
previous: return the previous element in the list and move the cursor backward.
hasPrevious: return a boolean indicating whether there is a previous element.
add: add an element at the current position, just before the cursor (so that a call to previous would return that element and a call to next would be unaffected).
set: replace the element last returned by next or previous with obj.
remove: in this interface, remove the element most recently returned by a call to next or previous.
nextIndex: return the index of the element that would be returned by a call to next.
previousIndex: return the index of the element that would be returned by a call to previous.
Calls to the remove and set methods are invalid if there has never been a call to next or previous or if remove or add has been called since the most recent call to next or previous.
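The cursor behavior is easiest to see by running a few calls against the standard library. The sketch below uses java.util.ListIterator on an ArrayList and records what each call returns; the class name ListIteratorDemo and the log format are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    static List<String> run() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        List<String> log = new ArrayList<>();
        ListIterator<String> iter = list.listIterator();   // cursor: | a b c

        log.add(iter.next());        // returns a, cursor: a | b c
        log.add(iter.previous());    // returns a again, cursor: | a b c
        log.add(iter.next());        // returns a, cursor: a | b c

        iter.add("x");               // inserted before the cursor: a x | b c
        log.add(iter.next());        // next is unaffected by the add: returns b
        iter.set("B");               // replaces the element last returned: a x B | c

        log.add(String.valueOf(iter.nextIndex()));   // index of c
        log.add(list.toString());
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());   // [a, a, a, b, 3, [a, x, B, c]]
    }
}
```

Note that set is legal here because the most recent call before it was next, not add or remove.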
The ArrayList class has a method that returns a ListIterator, also. There is a separate class LinkedList, which behaves like our circular doubly-linked list with a sentinel. Both implement the interface List, which requires a number of methods, including all that we saw for ArrayList plus iterator and listIterator. They differ in the amount of time operations take. For instance, a get, set, or add on a LinkedList requires time proportional to the distance that the index is from the nearest end of the list. That means that an add to either the front or end of a LinkedList takes constant time, whereas an add to the front of an ArrayList takes time proportional to its length. If a ListIterator is used on a LinkedList, the time required for any method in the interface is constant. For an ArrayList, the time for an add or remove is proportional to the number of items after the item added or removed, even if using a ListIterator.
Because the conventions and operations are different from what we have implemented in SentinelDLL, we will show how to implement a ListIterator using this new concept of the current element. We extend the Iterator interface by declaring the CS10ListIterator interface in CS10ListIterator.java.
Because we have removed some of the methods from the SentinelDLL class, we need a new interface for the list class to implement. This new interface, CS10IteratedList in CS10IteratedList.java, is similar to the LinkedList interface in CS10LinkedList.java. The methods add, remove, get, next, and hasNext, all of which require access to the current instance variable, are gone.
There is one new method: listIterator. This method returns an object, whose class implements CS10ListIterator, that can iterate through the list. Creating this returned object starts an iteration.
SentinelDLLIterator class
SentinelDLLIterator.java is a modified version of the circular, doubly linked list with a sentinel that includes an iterator. The first thing to notice is that the SentinelDLLIterator class implements the CS10IteratedList interface, and so the methods that were in LinkedList but not in CS10IteratedList are missing from SentinelDLLIterator.
The second thing to notice is that the SentinelDLLIterator class has just sentinel as an instance variable; there is no current instance variable, as there was in SentinelDLL.
But the most salient feature of our SentinelDLLIterator class implementation is the inner class DLLIterator, which implements the CS10ListIterator interface. The DLLIterator class is private. Users of the SentinelDLLIterator can still get a DLLIterator by calling the listIterator method. Moreover, because DLLIterator implements the public CS10ListIterator interface, once any part of any program has a reference to a DLLIterator, it can call the public methods in CS10ListIterator on it. The constructor is private, however, so that the only way to create a DLLIterator object is to call the method listIterator on a SentinelDLLIterator object.
And, perhaps most importantly, by making DLLIterator an inner class of SentinelDLLIterator, the methods of DLLIterator can access anything that the methods of SentinelDLLIterator can access. That would include the instance variable sentinel, as well as anything that is public in the Element class (such as data, next, and previous).
The DLLIterator class has two instance variables:
current is chosen so that the implicit cursor is between current and current.next. This may seem a strange thing to do, but it allows us to go through a list, removing elements either forward or backward by alternately calling next and remove, or previous and remove.
lastReturned is a reference to the Element whose data was returned by the most recent call to next or previous. This information is needed by remove and set. If next or previous was never called, or if a call to remove or add has changed the list since the last call to next or previous, this instance variable has the value null.
From how we've defined current, it needs to be advanced in next before we return an object when moving forward, and it needs to be moved back in previous only after determining the object to return when moving backward. In order for everything to work, current initially references the sentinel (rather than, say, sentinel.next).
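To make the roles of current and lastReturned concrete, here is a stripped-down sketch of a sentinel-based list with an inner iterator class. It is not SentinelDLLIterator.java: the names SentinelList and Cursor are made up, the inner class is not private and implements no interface, and set, nextIndex, and previousIndex are left out to keep it short.

```java
import java.util.NoSuchElementException;

class SentinelList<T> {
    private static class Element<T> {
        T data;
        Element<T> next, previous;
        Element(T data) { this.data = data; }
    }

    private final Element<T> sentinel = new Element<>(null);

    SentinelList() {
        sentinel.next = sentinel;       // an empty list is the sentinel linked to itself
        sentinel.previous = sentinel;
    }

    Cursor listIterator() { return new Cursor(); }

    void addLast(T obj) {
        Cursor iter = listIterator();   // a private iterator, independent of any other
        iter.current = sentinel.previous;
        iter.add(obj);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        Cursor iter = listIterator();
        while (iter.hasNext()) {
            sb.append(iter.next());
            if (iter.hasNext()) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    class Cursor {
        private Element<T> current = sentinel;   // cursor sits between current and current.next
        private Element<T> lastReturned = null;  // null until next or previous is called

        boolean hasNext() { return current.next != sentinel; }
        boolean hasPrevious() { return current != sentinel; }

        T next() {
            if (!hasNext()) throw new NoSuchElementException();
            current = current.next;              // advance first, then return
            lastReturned = current;
            return lastReturned.data;
        }

        T previous() {
            if (!hasPrevious()) throw new NoSuchElementException();
            lastReturned = current;              // pick the element first, then move back
            current = current.previous;
            return lastReturned.data;
        }

        void add(T obj) {
            Element<T> x = new Element<>(obj);   // splice x in right after current
            x.previous = current;
            x.next = current.next;
            current.next.previous = x;
            current.next = x;
            current = x;                         // cursor ends up just after the new element
            lastReturned = null;
        }

        void remove() {
            if (lastReturned == null) throw new IllegalStateException("nothing to remove");
            if (lastReturned == current) current = current.previous;  // we got here via next
            lastReturned.previous.next = lastReturned.next;
            lastReturned.next.previous = lastReturned.previous;
            lastReturned = null;
        }
    }
}
```

Because Cursor is an inner class, it can reach sentinel and the fields of Element directly, and each call to listIterator hands back a cursor whose current is independent of every other cursor's.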
I have included an equals method in DLLIterator, and it is set so that two DLLIterator objects are considered equal if they are currently referencing the same Element. The code checks to ensure that both objects involved are DLLIterator objects, and it returns false if they're not.
Returning to the SentinelDLLIterator class, there is a new method listIterator. It creates a new DLLIterator for the SentinelDLLIterator object and returns a reference to it. This listIterator method is made to be called from outside the SentinelDLLIterator class, and because it returns a reference to a DLLIterator, its return value may be assigned to a CS10ListIterator or even an Iterator variable (since CS10ListIterator extends Iterator).
In the SentinelDLLIterator class, the toString method now uses the iterator. Notice how toString uses the iteration paradigm from before, with a while-loop whose test includes the call iter.hasNext and whose body includes the call iter.next.
The DLLIterator created in toString is independent of any other DLLIterator in existence. Where one DLLIterator's current is has no effect at all on where another DLLIterator's current is.
We can really see this independence in ListTestIterator.java. Here, our test driver creates a DLLIterator by the line
CS10ListIterator<String> iter = theList.listIterator();
The current instance variable of this DLLIterator is moved by next and previous and used by add. But when we call theList.toString, the DLLIterator created and used by toString does not affect the DLLIterator in main.
Similarly, the DLLIterators created and used in calls to addFirst and addLast are independent of all others. Therefore adding to the front or back of a list does not change the current item in iter.
I have also added a "clear" option that iterates through the list, removing all objects. (I could have used the clear method, but chose not to.) I have added a "print reversed" option that runs through the list backwards, after advancing to the end.
The "nested print" option really shows the power of separate iterators. Here, we have two DLLIterators, outer and inner. For each list object traversed by outer, we perform a full traversal of the list with inner. This task would be impossible if we were limited only to the methods we had in our original linked list implementations.
Having multiple iterators on the same object can be very useful, as we just saw. As long as none of them modifies the list, everything is fine. Problems may arise, however, if any of the iterators modifies the list. In particular, if one iterator removes an element that is the current element of another iterator, things can get very messy. Even changing the list by using addFirst and addLast can change how things work, and calling clear is definitely a problem!
Multiple threads (streams of control) can really cause problems. Suppose that you are on the second-to-last item in the list, you call hasNext and true is returned, and then you call next. Should be safe, right? Well, not if somebody else in another thread removed the last item between the two calls. (Maybe somebody clicked on a button or a Timer went off between the calls, and the method registered with the listener changed the list.)
Because of this potential, a bulletproof iterator should throw an exception if the list has been modified in any way except via the iterator's own operations. We won't worry about these situations for now.
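For the curious, the iterators of the standard java.util collections behave this way on a best-effort basis (the documentation calls them "fail-fast"). The sketch below shows the effect with an ArrayList in a single thread; the class name FailFastDemo is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Returns true if the iterator notices that the list changed behind its back.
    static boolean modificationDetected() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> iter = list.iterator();
        iter.next();                 // the iterator has returned "a"
        list.remove("c");            // modify the list directly, not through the iterator
        try {
            iter.next();             // the iterator detects the change here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(modificationDetected());   // true
    }
}
```

Had the removal gone through iter.remove() instead, the iterator would have stayed in sync with the list and no exception would be thrown.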