Menu
Home
Log in / Register
 
Home arrow Computer Science arrow Data Structures and Algorithms with Python
< Prev   CONTENTS   Next >

2.10 Amortized Complexity

Sometimes it is not possible to find a tight upper bound on an algorithm. For instance, most operations may be bounded by some function c*g(n) but every once in a while there may be an operation that takes longer. In these cases it may be helpful to employ something called Amortized Complexity. Amortization is a term used by accountants when spreading the cost of some business transaction over a number of years rather than applying the whole expense to the books in one fiscal year. This same idea is employed in Computer Science when the cost of an operation is averaged. The key idea behind all amortization methods is to get as tight an upper bound as we can for the worst case running time of any sequence of n operations on a data structure (which usually starts out empty). By dividing by n we get the average or amortized running time of each operation in the sequence.

Fig. 2.6 A Lower and Upper Bound

Consider the PyList append operation discussed earlier in this chapter. The latest version of the PyList append method simply calls the Python append operation on lists. Python is implemented in C. It turns out that while Python supports an append operation for lists, lists are implemented as arrays in C and it is not possible to add to an array in C. An array can be allocated with a fixed size, but cannot have its size increased once created.

Pretend for a moment that Python lists, like C arrays, did not support the append method on lists and that the only way to create a list was to write something like [None]*n where n was a fixed value. Writing [None]*n creates a fixed size list of n elements each referencing the value None. This is the way C and C++ arrays are allocated. In our example, since we are pretending that Python does not support append we must implement our PyList append method differently. We can't use the append method and earlier in this chapter we saw that that adding on item at a time with the + operator was a bad idea. We'll do something a little different. Our PyList append operation, when it runs out of space in the fixed size list, will double the size of the list copying all items from the old list to the new list as shown in the code in Sect. 2.10.1.

2.10.1 A PyList Class

1 class PyList:

2 # The size below is an initial number of locations for the list object. The

3 # numItems instance variable keeps track of how many elements are currently stored

4 # in the list since self.items may have empty locations at the end.

5 def init (self,size=1):

6 self.items = [None] * size

7 self.numItems = 0

8

9 def append(self,item):

10 if self.numItems == len(self.items):

11 # We must make the list bigger by allocating a new list and copying

12 # all the elements over to the new list.

13 newlst = [None] * self.numItems * 2

14 for k in range(len(self.items)):

15 newlst[k] = self.items[k]

16

17 self.items = newlst

18

19 self.items[self.numItems] = item

20 self.numItems += 1

21

22 def main():

23 p = PyList()

24

25 for k in range(100):

26 p.append(k)

27

28 print(p.items)

29 print(p.numItems)

30 print(len(p.items))

31

32 if name == " main ":

33 main()

The claim is that, using this new PyList append method, a sequence of n append operations on a PyList object, starting with an empty list, takes O(n) time meaning that individual operations must not take longer than O(1) time. How can this be true? Whenever the list runs out of space a new list is allocated and all the old elements are copied to the new list. Clearly, copying n elements from one list to another takes longer than O(1) time. Understanding how append could exhibit O(1) complexity relies on computing the amortized complexity of the append operation. Technically, when the list size is doubled the complexity of append is O(n). But how often does that happen? The answer is not that often.

2.10.2 Proof of Append Complexity

The proof that the append method has O(1) complexity uses what is called the accounting method to find the amortized complexity of append. The accounting method stores up cyber dollars to pay for expensive operations later. The idea is that there must be enough cyber dollars to pay for any operation that is more expensive than the desired complexity.

Consider a sequence of n append operations on an initially empty list. Appending the first element to the list is done in O(1) time since there is space for the first item added to the list because one slot was initially allocated in the list. Storing a value in an already allocated slot takes O(1) time. However, according to the accounting method, we'll claim that the cost of doing the append operation requires an additional two cyber dollars. This is still O(1) complexity. Each time we run out of space we'll double the number of slots in the fixed size list. Allocating a fixed size list is a O(1) operation regardless of the list size. The extra work comes when copying the elements from the old list to the new list.

Fig. 2.7 Append Cyber Dollars

The first time we need to double the size is when the second append is called. There are two cyber dollars stored up at this point in time. One of them is needed when copying the one element stored in the old list to the new fixed size list capable of holding two elements. Transition one in Fig. 2.7 shows the two stored cyber dollars and the result after copying to the new list when moving from step A to step B.

When append is called on version B of the list the result is version C. At this point, three cyber dollars are stored to be used when doubling the list size to four locations. The first two are filled with the old contents of the list. Two of the three stored cyber dollars are used while copying these values to the new list. When the list of size four fills, two additional append operations have occurred, storing five cyber dollars. Four of these cyber dollars are used in the copy from step E to step F. Again, when the list of size eight fills in step G there are nine stored cyber dollars to be used in doubling the list size and copying the elements over.

But, what if we didn't double the size of the list each time. If we increased the list size by one half its previous size each time, we could still make this argument work if we stored four cyber dollars for each append operation. In fact, as long as the size of the list grows proportionally to its current size each time it is expanded this argument still works to prove that appending to a list is a O(1) operation when lists must be allocated with a fixed size.

As mentioned earlier, the Python list object is implemented in C. While Python provides an append operation, the C language can only allocate fixed size lists, called arrays in C. And yet, Python list objects can append objects in O(1) time as can be observed by experimentation or by analyzing the C code that implements Python list objects. The Python list append implementation achieves this by increasing the list size as described in this section when the fixed size array runs out of space to achieve an amortized complexity of O(1).

 
Found a mistake? Please highlight the word and press Shift + Enter  
< Prev   CONTENTS   Next >
 
Subjects
Accounting
Business & Finance
Communication
Computer Science
Economics
Education
Engineering
Environment
Geography
Health
History
Language & Literature
Law
Management
Marketing
Philosophy
Political science
Psychology
Religion
Sociology
Travel