Lists Implementation 2: Growing Array
Last time we looked at the specification of a generic List ADT, and one implementation that supports those operations. Now we'll see that the same ADT can be implemented completely differently; i.e., it provides the same interface for working with data, but with different "guts" to make that happen. Why would we do that? There are efficiency trade-offs between different implementations for different usage patterns. So we'll review how we talk about efficiency in CS.
All the code files for today: GrowingArray.java; ListTestArray.java; SimpleIterator.java; TimeTest.java.
Growing array list implementation
For a completely different way to implement the List interface, let's use an array. An array is a more primitive representation of an ordered set of elements, in that it has a fixed length that we have to specify when we create it, and we can't add/remove elements. (C and Matlab programmers are used to such arrays, though as with lists, Java indexes arrays starting at 0.) While linked lists require us to start at the beginning and proceed sequentially from an element to its next, arrays are "random access" — we can instantly get/set any element. (They are basically just one big chunk of memory, into which we index a given element by knowing the fixed size of each element.) Array object types are specified with square brackets, and to get/set elements, we give the index in square brackets. For example:
int[] numbers = new int[10];
for (int i = 0; i < 10; i++) {
    numbers[i] = i * 2;
}
The basic picture is:
numbers[0] | 0 |
numbers[1] | 2 |
numbers[2] | 4 |
numbers[3] | 6 |
numbers[4] | 8 |
numbers[5] | 10 |
numbers[6] | 12 |
numbers[7] | 14 |
numbers[8] | 16 |
numbers[9] | 18 |
To belabor the point, an array of n elements has elements indexed from 0 to n–1, only. An array of n elements does not have an element with index n. Thus the 10 elements of our array are numbers[0], numbers[1], numbers[2], ..., numbers[9].
The code:
- We declare an array of ints named "numbers".
- We initialize it to actually be able to hold 10 ints. (Again, if we don't new it, there will be a run-time error — try it so you'll recognize it.)
- We have a loop over each element in the array (0..9) and set its value. What's in the array now? (Again, if we go <= 10 rather than < 10, there will be a run-time error — try it so you'll recognize it.)
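You can see that run-time error without waiting to stumble into it. Here's a small illustrative sketch (the class name OffByOne is just made up for this example) that deliberately loops one index too far and catches the resulting exception:

```java
public class OffByOne {
    // Returns a message describing what happens when the loop goes one too far.
    static String demo() {
        int[] numbers = new int[10];
        try {
            // Bug on purpose: <= 10 walks off the end of a 10-element array
            for (int i = 0; i <= 10; i++) {
                numbers[i] = i * 2;
            }
            return "no error";
        } catch (ArrayIndexOutOfBoundsException e) {
            // Valid indices are 0..9; index 10 is out of bounds
            return "ArrayIndexOutOfBoundsException at index 10";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the loop completes indices 0 through 9 just fine; it's only the attempt to touch numbers[10] that blows up.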
So how can an array be used to implement a list, which can grow? The trick is that when we run out of room, since we can't grow the array, we instead create a new one (say of double the size) and copy over the elements.
Our implementation: GrowingArray.java. We simply use the underlying array for get and set, and a loop for toString. In addition to the array, we keep a variable size that says how many elements are actually filled in. We can then compare that to the actual length of the array (array.length, a variable, not a method) and allocate and copy over if needed.
Add and remove may also require some element shifting. If we are adding in the middle of the array, we need to shift things after it to the right by one index, in order to make room. Likewise, if we are removing from the middle of the array, we shift things after it to the left by one index, in order to fill in the hole it would leave behind. Note the order of the loops: from right to left for add vs. from left to right for remove. Step through it.
List analysis
Both the linked list and array implementations support the same interface. When would we choose one over the other?
Linked list: efficient if we're adding / removing / getting / setting at the front of the list. Constant time, in fact. But in the worst case we might have to advance all the way through the list — Θ(n).
Growing array: efficient if we're getting / setting an index already within the size of the filled-in elements. Adding / removing at the end is efficient (except when we have to grow), but in the worst case we might have to shift over the entire array — Θ(n). What about that growing bit? While any individual growing step will be slow, it doubles the length of the array, so the new capacity will last a while. Thus we won't have to grow again until we add as many elements to the array as we already have. Thus on average, it's not so expensive.
We can make this argument tighter, and argue that n add operations will never take more than O(n) time. This is O(1) amortized time. Amortization is what accountants do when saving up to buy an expensive item like a computer. Suppose that you want to buy a computer every 3 years and it costs $1500. One way to think about this is to have no cost the first two years and $1500 the third year. An alternative is to set aside $500 each year. In the third year you can take the accumulated $1500 and spend it on the computer. So the computer costs $500 a year, amortized. (In tax law it goes the other direction — you spend the $1500, but instead of being able to deduct the whole thing the first year you have to spread it over the 3-year life of the computer, and you deduct $500 a year.)
For the growing array case, we can think in terms of tokens that can pay for copying an element. We charge 3 tokens for each add. One pays for adding the item to the array and two are set aside for later. Suppose that we have just expanded the array from size n to size 2n and copied the items in the old array to the first n positions of the new array. We have no accumulated tokens. The last n positions of the array are empty, so we will be able to do n add operations before the new array is full with 2n items in it. Each add sets aside 2 tokens. After n add operations we will have accumulated 2n tokens. That is enough to pay for copying all 2n things in the array when we do the next add and have to double the array size again. Therefore the cost is O(1) per add operation, amortized. (The current implementation of ArrayList in Java makes the new array 3/2 times the size of the old array to waste less space, but the argument can be modified to show that a charge of 4 tokens works.)
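We can also check the token argument empirically. This little experiment (names are mine) counts every element copy incurred by n adds to a doubling array; with copies at sizes 1, 2, 4, ..., the total work (adds plus copies) stays under 3n, matching the 3-token charge:

```java
public class AmortizedDemo {
    // Count every element copy incurred by n adds to a doubling array
    static long copiesFor(int n) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;  // grow: copy all current elements to the new array
                capacity *= 2;
            }
            size++;              // the add itself
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1000;
        long copies = copiesFor(n);
        // total work = n adds + all copies; the claim is that this is < 3n
        System.out.println(n + " adds cost " + (n + copies) + " total steps");
    }
}
```

For n = 1000 the doublings happen at sizes 1, 2, 4, ..., 512, so the copies sum to 1023 and the total work is 2023 steps — comfortably under 3n = 3000.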