Java has a group of interfaces for holding collections of objects, along with classes that implement them. We have briefly touched upon List, which is an interface with two Java-provided implementations: ArrayList and LinkedList. Today we will look at two other interfaces for holding collections of objects: Set and Map. Each has two Java-provided implementations. Set is implemented by HashSet and TreeSet. Map is implemented by HashMap and TreeMap. We will be looking at their underlying data structures, hash tables and binary search trees, in the next few lectures.
List interface
Among other methods, the List<E> interface provides the following:
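boolean add(E o)
    Appends o to the end of this list. Returns true.
void add(int index, E o)
    Inserts o at position index of this list.
void clear()
    Removes all of the elements from this list.
boolean contains(Object o)
    Returns true if this list contains the specified element, false otherwise.
E get(int index)
    Returns the element at position index.
boolean isEmpty()
    Returns true if this list contains no elements, false otherwise.
int indexOf(Object o)
    Returns the index of the first occurrence of o in this list, or -1 if o is not in the list.
Iterator<E> iterator()
    Returns an iterator over the elements of this list, in order.
ListIterator<E> listIterator()
    Returns a list iterator over the elements of this list, in order.
E remove(int index)
    Removes the element at position index and returns it.
boolean remove(Object o)
    Removes the first occurrence of o from this list. Returns true if the element is present, false otherwise.
E set(int index, E element)
    Replaces the element at position index with element and returns the element that was previously there.
int size()
    Returns the number of elements in this list.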
If both ArrayList and LinkedList implement this set of operations, why have both? Efficiencies differ. Access operations (set and get) take constant time in an ArrayList, but require time proportional to the distance to the nearest end for a LinkedList. (The LinkedList is a doubly linked list, and it's smart enough to start at the closest end.) On the other hand, modification operations (add and remove at a given index) require time proportional to the number of elements after the index in an ArrayList, because all of these elements must be shifted over. But for a LinkedList, they take constant time after the time to access the index (distance from nearest end). Therefore, when you modify a list through an Iterator or ListIterator, which is already positioned at the element, add and remove take constant time for a LinkedList, but take time proportional to the number of remaining elements for an ArrayList.
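To make the difference concrete, here is a minimal sketch that removes elements through an iterator. The class and method names (ListRemovalDemo, removeEvens) are made up for illustration. The code is identical for both list types; only the cost of each iter.remove() call differs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class ListRemovalDemo {
    // Removes every even number using the iterator's own remove().
    // On a LinkedList each removal takes constant time; on an ArrayList
    // each removal shifts all of the elements that follow.
    static void removeEvens(List<Integer> numbers) {
        Iterator<Integer> iter = numbers.iterator();
        while (iter.hasNext()) {
            if (iter.next() % 2 == 0) {
                iter.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> linked = new LinkedList<>(List.of(1, 2, 3, 4, 5, 6));
        List<Integer> array = new ArrayList<>(List.of(1, 2, 3, 4, 5, 6));
        removeEvens(linked);
        removeEvens(array);
        System.out.println(linked); // [1, 3, 5]
        System.out.println(array);  // [1, 3, 5]
    }
}
```

Because removeEvens is written against the List interface, switching implementations changes the running time but not the result.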
Set interface
A Set differs from a List in that a List has a linear order, whereas a Set does not. Furthermore, an element can appear multiple times in a List but only once in a Set.
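Here are the primary operations on a Set<E>:
boolean add(E o)
    Adds o to this set if it is not already present. Returns true if o was not in the set, false if o was already in the set.
void clear()
    Removes all of the elements from this set.
boolean contains(Object o)
    Returns true if this set contains the specified element, false otherwise.
boolean isEmpty()
    Returns true if this set contains no elements, false otherwise.
Iterator<E> iterator()
    Returns an iterator over the elements of this set.
boolean remove(Object o)
    Removes o from this set if it is present. Returns true if o was present, false otherwise.
int size()
    Returns the number of elements in this set.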
All of these methods are also part of the List interface. So why have a separate interface?
The main reason is implementation efficiency. The contains operation on either an ArrayList or a LinkedList with n elements takes O(n) time, and the remove operation can take O(n) time as well. For applications such as a dictionary for a spell checker, these running times are too slow.
There are two implementations of Set in the Java class library. Both implement the contains operation more efficiently than it can be implemented for a List.
The first implementation is TreeSet, which uses a data structure called a balanced binary search tree to store the data. You can think of it as a little like a linked list on which you can do binary search. We will talk about this data structure soon. The important point is that the add, remove, and contains methods all take O(lg n) time for a set with n elements. It works only on Comparable objects (or on objects for which you supply a Comparator when you construct the set). The iterator is guaranteed to return the elements in increasing order by compareTo and takes O(n) time to iterate through the entire set. Getting the first element from the iterator takes O(lg n) time.
The second is HashSet, which uses a data structure called a hash table. We will talk about hash tables next time.
If the hash table is used properly, then the add, remove, and contains operations all take O(1) time on average (although it is possible that they could take Θ(n) time if you were extremely unlucky). The iterator returns the elements in a somewhat arbitrary order.
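The following sketch shows both implementations side by side. The class name SetOrderDemo and the helper sortedDistinct are invented for this example. It relies only on the behavior described above: add returns false for a duplicate, and a TreeSet iterates in increasing compareTo order.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // Returns the distinct words in increasing compareTo order by
    // passing them through a TreeSet.
    static List<String> sortedDistinct(Collection<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    public static void main(String[] args) {
        List<String> words = List.of("while", "class", "int", "class", "for");

        Set<String> hash = new HashSet<>();
        for (String w : words) {
            // add returns false when the word is already in the set.
            if (!hash.add(w)) {
                System.out.println(w + " was already in the set");
            }
        }
        System.out.println(hash.size());           // 4
        System.out.println(hash.contains("int"));  // true
        System.out.println(sortedDistinct(words)); // [class, for, int, while]
    }
}
```

Printing the HashSet itself would show the same four words, but in an order that depends on their hash codes.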
As an example of the use of sets, consider the program SetDemo.java. It creates a set consisting of all of the keywords in Java. It then uses an iterator to go through the set and print each of the words. (Note that an iterator on a Set is used in exactly the same way as an iterator on a List.) Finally, it lets the user type words and determines if they are keywords by using contains to see if they are in the set.
Map interface
The Map interface describes a data structure that can be thought of as a set where each element has associated data. Each data element is associated with a key. By looking up the key, you can get the associated data, just like a dictionary in Python. A key is typically something like your student ID number, and the associated data might be your student record. A Map can be implemented using a balanced binary tree or a hash table, just like a Set.
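The primary operations in a Map<K,V> are the following (where K is the generic type for the key and V is the generic type for the associated data):
void clear()
    Removes all of the mappings from this map.
boolean containsValue(Object value)
    Returns true if this map maps one or more keys to the specified value, false otherwise.
V get(Object key)
    Returns the value associated with key, or null if key is not in the map.
boolean isEmpty()
    Returns true if this map contains no key-value mappings.
Set<K> keySet()
    Returns a Set containing the keys contained in this map.
V put(K key, V value)
    Associates value with key. Returns the value previously associated with key, or null if key was not in the map.
V remove(Object key)
    Removes the mapping for key if it is present. Returns the value previously associated with key (or null if key is not in the map).
int size()
    Returns the number of key-value mappings in this map.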
For an example of the use of a map, consider AnimalSounds.java. This program allows the user to insert animal names as keys and the sounds that they make as the associated data. The user can then ask for the sound that a given animal makes, or to remove an animal from the map.
Note the way the print operation works. The code for this is
if (animalMap.isEmpty())
    System.out.println("The map is empty");
else {
    System.out.println("Here are the animals and their sounds:");
    Set<String> animalNames = animalMap.keySet();
    Iterator<String> iter = animalNames.iterator();
    while (iter.hasNext()) {
        animal = iter.next();
        System.out.println(toTitleCase(getArticle(animal)) + " "
            + animal + " says " + animalMap.get(animal) + ".");
    }
}
Note that the first step is to call keySet to get all of the keys in the map. Then we create an iterator for the set, and we use it to iterate through the set, printing each key and the value returned by get for that key.
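The same pattern can be tried in isolation. This is a small sketch, not the code from AnimalSounds.java: the class MapDemo, the method buildSounds, and the sample animals are all invented. It uses a TreeMap so that the keys come back in sorted order, and a for-each loop, which uses the key set's iterator behind the scenes.

```java
import java.util.Map;
import java.util.TreeMap;

class MapDemo {
    // Builds a small map and exercises put and remove.
    static Map<String, String> buildSounds() {
        Map<String, String> sounds = new TreeMap<>();
        sounds.put("cow", "moo");
        sounds.put("dog", "woof");
        sounds.put("cat", "meow");
        // A second put with the same key replaces the value
        // and returns the old one ("woof").
        sounds.put("dog", "arf");
        sounds.remove("cat");
        return sounds;
    }

    public static void main(String[] args) {
        Map<String, String> sounds = buildSounds();
        // Iterate over the keys and look up each value with get.
        for (String animal : sounds.keySet()) {
            System.out.println(animal + " says " + sounds.get(animal));
        }
        // Prints:
        // cow says moo
        // dog says arf
    }
}
```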
The method of voting in which the candidate with the most votes wins the election has some drawbacks. If two conservatives get in a race against a liberal in a conservative district, they could split the conservative vote and the liberal could be elected, even though he is the third choice of the majority of the voters in the election. Also, third parties have a hard time getting established, because voting for a third-party candidate can be throwing away your vote. If about a third of the 22,000 New Hampshire voters who voted for Nader in the Bush-Gore election had voted for Gore instead, he would have won the state and the presidency. Florida, and its hanging chads, would not have mattered.
Some states solve these problems by having a runoff election between the top two candidates if nobody gets a majority of the votes. But a runoff election costs time and money. A popular alternative suggestion is the instant runoff election.
In an instant runoff election, the voters fill out a ballot with an ordered list of candidates, from most favorable to least favorable. The election takes place in rounds. In the first round, each ballot awards a vote to the first candidate on the ballot. If nobody has a majority, then the candidate with the fewest votes is dropped from the election. (In case of ties we will choose one at random.) Then another round is run. This time, each ballot's vote is awarded to the first candidate in its list who has not been eliminated. The bottom candidate is dropped, and the process repeats until one candidate has a majority. (In fact, it can repeat until there is just one candidate left and get the same result. Once someone has a majority they will never be eliminated.)
How could we write a program to determine the winner of an instant runoff election? The first step is to determine what objects appear in the problem and how they interact with one another. One obvious choice is a ballot. We could say, "Oh, that is just a list" and not create an object for it. But let's take an object-oriented approach and say that there should be a Ballot class.
Another object would be the set of all the ballots in the election. We could just say, "Make a set of lists," but let's make Election a class, also.
A final object that might be less obvious is one that represents the results of the voting. Let's create a VoteTally class. The alternative is to use a map from candidate names to the number of votes that they received.
We could have a class to keep track of the current set of candidates, but the Set class seems to do everything that we are likely to need. Unless we discover an action that we need to do that the Set class doesn't handle, we will just use a Set.
What actions do we need to perform? We first need to get our set of candidates. Note that we can limit this set to candidates who get at least one first-round vote. Others will have zero votes and will be dropped before any of the candidates who got first-round votes. It sounds like Election is the class that has access to the data to perform this, with help from the Ballot class to get the first element on each ballot.
Next, we have to run a round of the election. This task requires going through all of the ballots, determining to whom each vote should go, and increasing that candidate's tally by 1. The Ballot class has the data to determine who should get the vote. The Election class has the ballots. The VoteTally class should update itself by adding a vote for the candidate.
After running a round, we have to find the candidate with the fewest votes. The VoteTally class has the information to do so. But what if there is a tie? Maybe we should return a list of candidates who share the lowest vote total. In this program we will pick one at random to eliminate from the current candidate set, but there could be other choices. Returning a list makes it easier to implement another choice mechanism if we change our mind.
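One way such a tally could look is sketched below. This is not the code in VoteTally.java: the class name VoteTallySketch and the methods addVote and lowestCandidates are hypothetical. It uses the alternative mentioned earlier, a map from candidate names to vote counts, and returns every candidate who shares the lowest total so that the caller decides how to break a tie.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class VoteTallySketch {
    // Candidate name -> votes received this round. A TreeMap keeps the
    // names sorted, so the results come out in a predictable order.
    private final Map<String, Integer> votes = new TreeMap<>();

    // Every remaining candidate starts the round with zero votes.
    VoteTallySketch(Iterable<String> candidates) {
        for (String candidate : candidates) {
            votes.put(candidate, 0);
        }
    }

    void addVote(String candidate) {
        votes.merge(candidate, 1, Integer::sum);
    }

    // Returns all candidates tied for the fewest votes.
    List<String> lowestCandidates() {
        List<String> lowest = new ArrayList<>();
        int min = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (entry.getValue() < min) {
                min = entry.getValue();
                lowest.clear();
                lowest.add(entry.getKey());
            } else if (entry.getValue() == min) {
                lowest.add(entry.getKey());
            }
        }
        return lowest;
    }

    public static void main(String[] args) {
        VoteTallySketch tally = new VoteTallySketch(List.of("Ann", "Bob", "Cy"));
        tally.addVote("Ann");
        tally.addVote("Ann");
        tally.addVote("Bob");
        tally.addVote("Cy");
        System.out.println(tally.lowestCandidates()); // [Bob, Cy]
    }
}
```

The caller can then pick one entry of the returned list at random, or apply some other rule, without any change to the tally class.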
We have to repeat running a round of the election and eliminating the candidate with fewest votes until we have only one candidate left. This procedure does not seem to be appropriate for any class. A method in a new class, InstantRunoff, can do this.
So what sorts of things do we want to be able to do with a Ballot object?
An addCandidate method to add candidates to the ballot.
There are many other possible things we could do with a ballot. Getting all of the candidates in order is one possibility, and so we could supply an iterator. A toString method could be useful. A way of getting the number of candidates on the ballot could be useful. But for now we will do the minimum. We can always come back later to add new methods.
What should we do with an Election object? We need to create it, plus perform the jobs mentioned above.
Create an Election. A constructor to create an empty Election and an addBallot method could take care of this.
What about the VoteTally object?
The code in Ballot.java, Election.java, and VoteTally.java performs these operations. The class in InstantRunoffOO.java supplies the method to loop through the rounds and the main method for testing.
You can run this code using ballots.txt as input, a file we made with 200 randomly created ballots using candidate names from the 2016 NH Republican primary, but according to a probability distribution determined by the political website fivethirtyeight.com. You'll need to modify the string in ballotFileName for your own computer.
An alternate approach is InstantRunoffProc.java. This code does the same thing as InstantRunoffOO.java, but through fixed data structures and static methods. It has less code, which is a plus. There are longer lists of parameters, as all of the data must be passed around "bare." We see data declarations such as List<List<String>> ballots. These declarations are not easy to read and take getting used to. In short, there is no data encapsulation, which is a minus.
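To give a feel for the procedural style, here is a compact sketch of the whole algorithm on bare data. It is not the code in InstantRunoffProc.java: the class InstantRunoffSketch and the method winner are invented. It also makes one simplifying assumption that differs from the description above: ties for last place are broken by dropping the alphabetically first candidate rather than a random one, so that the result is repeatable.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class InstantRunoffSketch {
    // Each ballot is a list of names, most favored first.
    // Returns the winner, or null if no ballot names a candidate.
    static String winner(List<List<String>> ballots) {
        // Start with the candidates who received a first-round vote.
        Set<String> candidates = new TreeSet<>();
        for (List<String> ballot : ballots) {
            if (!ballot.isEmpty()) {
                candidates.add(ballot.get(0));
            }
        }

        while (candidates.size() > 1) {
            // Run one round: every remaining candidate starts at zero.
            Map<String, Integer> tally = new TreeMap<>();
            for (String candidate : candidates) {
                tally.put(candidate, 0);
            }
            for (List<String> ballot : ballots) {
                // The vote goes to the first name not yet eliminated.
                for (String name : ballot) {
                    if (candidates.contains(name)) {
                        tally.merge(name, 1, Integer::sum);
                        break;
                    }
                }
            }

            // Drop the candidate with the fewest votes. On a tie the
            // alphabetically first one is dropped (an assumption made here).
            String loser = null;
            for (String candidate : tally.keySet()) {
                if (loser == null || tally.get(candidate) < tally.get(loser)) {
                    loser = candidate;
                }
            }
            candidates.remove(loser);
        }
        return candidates.isEmpty() ? null : candidates.iterator().next();
    }

    public static void main(String[] args) {
        List<List<String>> ballots = List.of(
            List.of("Ann", "Bob"),
            List.of("Ann", "Cy"),
            List.of("Bob", "Ann"),
            List.of("Cy", "Bob"),
            List.of("Cy", "Bob"));
        // Round 1: Ann 2, Bob 1, Cy 2, so Bob is dropped.
        // Round 2: Ann 3, Cy 2, so Cy is dropped.
        System.out.println(winner(ballots)); // Ann
    }
}
```

Notice that the type List<List<String>> appears in the method signature and would have to appear in every helper that touches the ballots.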
In a program this short, encapsulation and data hiding aren't that important. On the other hand, I originally had a Set of ballots instead of a List. The Set of Ballot objects in InstantRunoffOO.java wasn't a problem, because Ballot did not override equals. Therefore the Set did not consider two Ballot objects with the same names in the same order as duplicates. But in InstantRunoffProc.java, it was a problem, because two ballots with choices "Romney Huntsman" were entered into an election but Romney got only one vote. (I originally wrote the programs after the 2012 election.) The two ArrayList objects ended up being equal, so only one was kept in the set. Changing from Set<List<String>> to List<List<String>> required five changes spread out over four methods. Finding all of the appropriate changes in a much bigger program (and avoiding changes where the Set<List<String>> wasn't dealing with ballots and may have been correct as it was) would have been tedious and error-prone. In contrast, making the same change in InstantRunoffOO.java required two changes in Ballot.java: declaring the instance variable and initializing it in the constructor. Even if the program had millions of lines, I still would have only needed to make those two changes.
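The duplicate problem is easy to reproduce. In the sketch below, the class BallotEqualityDemo and its nested Ballot class are invented for illustration and are not the course's Ballot.java. Two ArrayList objects with the same contents are equal, so a set keeps only one of them, while two objects of a class that does not override equals are distinct.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BallotEqualityDemo {
    // Hypothetical wrapper. It deliberately does not override equals
    // or hashCode, so two ballots are equal only if they are the same object.
    static class Ballot {
        private final List<String> choices;

        Ballot(List<String> choices) {
            this.choices = new ArrayList<>(choices);
        }

        String firstChoice() {
            return choices.get(0);
        }
    }

    // Adds two identical bare lists to a set and reports how many survive.
    static int bareCount() {
        Set<List<String>> ballots = new HashSet<>();
        ballots.add(new ArrayList<>(List.of("Romney", "Huntsman")));
        ballots.add(new ArrayList<>(List.of("Romney", "Huntsman")));
        return ballots.size();
    }

    // Does the same with wrapped ballots.
    static int wrappedCount() {
        Set<Ballot> ballots = new HashSet<>();
        ballots.add(new Ballot(List.of("Romney", "Huntsman")));
        ballots.add(new Ballot(List.of("Romney", "Huntsman")));
        return ballots.size();
    }

    public static void main(String[] args) {
        System.out.println(bareCount());    // 1
        System.out.println(wrappedCount()); // 2
    }
}
```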
File Reading and the Scanner class
We open a file to read in the ballots. This is done by the call:
    ballotFile = new Scanner(new File(ballotFileName));
The way to open a file is to create a new File object, giving it the full path name of the file. But what do we do with the file after it is open? We pass it to a Scanner object. We saw the Scanner class before, but will summarize here. You can also open it on an input stream (usually System.in) or even on a String. Then you can read any type of data. The next method reads the next token as a String. (Recall that tokens are like words, separated by white space. But you can also change the separator. The class is very flexible.) You can also call nextLine, nextInt, nextLong, nextDouble, nextFloat, nextBoolean, nextBigInteger, nextBigDecimal, nextShort, and nextByte. Each one reads characters from the input and converts them to the corresponding type. There is also a "has" version of each of these that returns true if the next thing in the input can be converted to the corresponding type (hasNext, hasNextLine, hasNextInt, etc.).
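As a quick illustration of the "has" and "next" pairs, the sketch below opens a Scanner on a String instead of a file. The class ScannerDemo and the method sumInts are made up for this example.

```java
import java.util.Scanner;

class ScannerDemo {
    // Adds up every token that can be read as an int and skips the rest.
    static int sumInts(String text) {
        Scanner in = new Scanner(text);
        int sum = 0;
        while (in.hasNext()) {
            if (in.hasNextInt()) {
                sum += in.nextInt();
            } else {
                in.next(); // Not an int: read the token and ignore it.
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("3 apples and 4 pears")); // 7
    }
}
```

The same loop works unchanged if the Scanner is opened on a File or on System.in.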
Also, note the use of a try-catch block to test for an invalid file name. If we were asking the user for the name we might prompt the user to enter a different name. Because the name is hard-wired into the program we instead print an error message to System.err and exit the program.
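The pattern might look like the following sketch. The class OpenFileDemo and the helper openOrNull are hypothetical names, and the default file name ballots.txt is only a stand-in. The fact it relies on is that the Scanner constructor that takes a File throws FileNotFoundException when the file cannot be opened.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class OpenFileDemo {
    // Tries to open a Scanner on the named file.
    // Reports the problem on System.err and returns null if that fails.
    static Scanner openOrNull(String fileName) {
        try {
            return new Scanner(new File(fileName));
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open file: " + fileName);
            return null;
        }
    }

    public static void main(String[] args) {
        String fileName = args.length > 0 ? args[0] : "ballots.txt";
        Scanner in = openOrNull(fileName);
        if (in == null) {
            System.exit(1); // The name is fixed, so there is nothing to retry.
        }
        if (in.hasNextLine()) {
            System.out.println("First line: " + in.nextLine());
        }
        in.close();
    }
}
```

A program that asks the user for the name could call openOrNull in a loop until it gets a Scanner back.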