Menu
Home
Log in / Register
 
Home arrow Computer Science arrow Data Structures and Algorithms with Python
< Prev   CONTENTS   Next >

2.7 Making the PyList Append Efficient

Now, going back to our original problem, we wanted to find out how much time it takes to append n items to a PyList. It turns out, using the append method in Sect. 2.5.1, it will perform in O(n2) time. This is because the first time we called append we had to copy one element of the list. The second time we needed to copy two elements. The third time append was called we needed to copy three elements. Our proof in Sect. 2.6 is that 1 + 2 + 3 + ••• + n equals n*(n + 1)/2. The highest powered term in

this formula is the n2 term. Therefore, the append method in Sect. 2.5.1 exhibits

O(n2) complexity. This is not really a good result. The red curve in the graph of Fig. 2.4 shows the actual results of how much time it takes to append 200,000 elements to a PyList. The line looks somewhat like the graph of f(n) = n2. What this tells us is that if we were to draw a complex program with say 100,000 graphics commands in it, to add one more command to the sequence it would take around 27 s.

This is unacceptable! We may never draw anything that complex, but a computer should be able to add one more graphic command quicker than that!

In terms of big-Oh notation we say that the append method is O(n2). When n gets large, programs or functions with O(n2) complexity are not very good. You typically

want to stay away from writing code that has this kind of computational complexity associated with it unless you are absolutely sure it will never be called on large data sizes.

One real-world example of this occurred a few years ago. A tester was testing some code and placed a CD in a CD drive. On this computer all the directories and file names on the CD were read into memory and sorted alphabetically. The sorting algorithm that was used in that case had O(n2) complexity. This was OK because most CDs put in this computer had a relatively small number of directories and files on them. However, along came one CD with literally hundreds of thousands of files on it. The computer did nothing but sort those file names alphabetically for around 12 h. When this was discovered, the programmer rewrote the sorting code to be more efficient and reduced the sorting time to around 15 s. That's a BIG difference! It also illustrates just how important this idea of computational complexity is.

If we take another look at our PyList append method we might be able to make it more efficient if we didn't have to access each element of the first list when concatenating the two lists. The use of the + operator is what causes Python to

access each element of that first list. When + is used a new list is created with space

for one more element. Then all the elements from the old list must be copied to the

new list and the new element is added at the end of this list.

Using the append method on lists changes the code to use a mutator method to alter the list by adding just one more element. It turns out that adding one more element to an already existing list is very efficient in Python. In fact, appending an item to a list is a O(1) operation as we'll see later in this chapter. This means to append n items to a list we have gone from O(n2) to O(n). Later in this chapter we'll learn just how Python can insure that we get O(1) complexity for the append operation. The blue line in Fig. 2.4 shows how the PyList append method works when the

+ operator is replaced by calling the list append method instead. At 100,000 elements

Fig. 2.4 The Complexity of Appending to a Pylist

in the PyList we go from 27 s to add another element to maybe a second, but probably less than that. That's a nice speedup in our program. After making this change, the PyList append method is given in Sect. 2.7.1.

2.7.1 Efficient Append

1 class PyList:

2 def init (self):

3 self.items = []

4

5 # The append method is used to add commands to the sequence.

6 def append(self,item):

7 self.items.append(item)

8

9 ...

 
Found a mistake? Please highlight the word and press Shift + Enter  
< Prev   CONTENTS   Next >
 
Subjects
Accounting
Business & Finance
Communication
Computer Science
Economics
Education
Engineering
Environment
Geography
Health
History
Language & Literature
Law
Management
Marketing
Philosophy
Political science
Psychology
Religion
Sociology
Travel