Menu
Home
Log in / Register
 
Home arrow Computer Science arrow Data Structures and Algorithms with Python
< Prev   CONTENTS   Next >

2.3 Accessing Elements in a Python List

With experimentation we can verify that all locations within the RAM of a computer can be accessed in the same amount of time. A Python list is a collection of contiguous memory locations. The word contiguous means that the memory locations of a list are grouped together consecutively in RAM. If we want to verify that the RAM of a computer behaves like a group of people all remembering their names and their values, we can run some tests with Python lists of different sizes to find the average time to retrieve from or store a value into a random element of the list.

To test the behavior of Python lists we can write a program that randomly stores and retrieves values in a list. We can test two different theories in this program.

1. The size of a list does not affect the average access time in the list.

2. The average access time at any location within a list is the same, regardless of its location within the list.

To test these two theories, we'll need to time retrieval and storing of values within a list. Thankfully, Python includes a datetime module that can be used to record the current time. By subtracting two datetime objects we can compute the number of microseconds (i.e. millionths of a second) for any operation within a program. The program in Sect. 2.3.1 was written to test list access and record the access time for retrieving values and storing values in a Python list.

2.3.1 List Access Timing

1 import datetime

2 import random

3 import time

4

5 def main():

6

7 # Write an XML file with the results

8 file = open("ListAccessTiming.xml","w")

9

10 file.write('<?xml version="1.0" encoding="UTF-8" standalone="no" ?> ')

11

12 file.write('<Plot title="Average List Element Access Time"> ')

13

14 # Test lists of size 1000 to 200000.

15 xmin = 1000

16 xmax = 200000

17

18 # Record the list sizes in xList and the average access time within

19 # a list that size in yList for 1000 retrievals.

20 xList = []

21 yList = []

22

23 for x in range(xmin, xmax+1, 1000):

24

25 xList.append(x)

26

27 prod = 0

28

29 # Creates a list of size x with all 0's

30 lst = [0] * x

31

32 # let any garbage collection/memory allocation complete or at least

33 # settle down

34 time.sleep(1)

35

36 # Time before the 1000 test retrievals

37 starttime = datetime.datetime.now()

38

39 for v in range(1000):

40 # Find a random location within the list

41 # and retrieve a value. Do a dummy operation

42 # with that value to ensure it is really retrieved.

43 index = random.randint(0,x-1)

44 val = lst[index]

45 prod = prod * val

46 # Time after the 1000 test retrievals

47 endtime = datetime.datetime.now()

48

49 # The difference in time between start and end.

50 deltaT = endtime starttime

51

52 # Divide by 1000 for the average access time

53 # But also multiply by 1000000 for microseconds.

54 accessTime = deltaT.total_seconds() * 1000

55

56 yList.append(accessTime)

57

58 file.write(' <Axes> ')

59 file.write(' <XAxis min="'+str(xmin)+'" max="'+str(xmax)+'">List Size</XAxis> ')

60 file.write(' <YAxis min="'+str(min(yList))+'" max="'+str(60)+'">Microseconds</YAxis> ')

61 file.write(' </Axes> ')

62

63 file.write(' <Sequence title="Average Access Time vs List Size" color="red"> ')

64

65 for i in range(len(xList)):

66 file.write(' <DataPoint x="'+str(xList[i])+'" y="'+str(yList[i])+'"/> ')

67

68 file.write(' </Sequence> ')

69

70 # This part of the program tests access at 100 random locations within a list

71 # of 200,000 elements to see that all the locations can be accessed in

72 # about the same amount of time.

73 xList = lst

74 yList = [0] * 200000

75

76 time.sleep(2)

77

78 for i in range(100):

79 starttime = datetime.datetime.now()

80 index = random.randint(0,200000-1)

81 xList[index] = xList[index] + 1

82 endtime = datetime.datetime.now()

83 deltaT = endtime starttime

84 yList[index] = yList[index] + deltaT.total_seconds() * 1000000

85

86 file.write(' <Sequence title="Access Time Distribution" color="blue"> ')

87

88 for i in range(len(xList)):

89 if xList[i] > 0:

90 file.write(' <DataPoint x="'+str(i)+'" y="'+str(yList[i]/xList[i])+'"/> ')

91

92 file.write(' </Sequence> ')

93 file.write('</Plot> ')

94 file.close()

1 if name == " main ":

2 main()

When running a program like this the times that you get will depend not only on the actual operations being performed, but the times will also depend on what other activity is occurring on the computer where the test is being run. All modern operating systems, like Mac OS X, Linux, or Microsoft Windows, are multi-tasking. This means the operating system can switch between tasks so that we can get email while writing a computer program, for instance. When we time something we will not only see the effects of our own program running, but all programs that are currently running on the computer. It is nearly impossible to completely isolate one program in a multi-tasking system. However, most of the time a short program will run without too much interruption.

The program in Sect. 2.3.1 writes an XML file with its results. The XML file format supports the description of experimentally collected data for a two dimensional plot of one or more sequences of data. One sample of the data that this program generates looks like Sect. 2.3.2. The data is abbreviated, but the format is as shown in Sect. 2.3.2.

 
Found a mistake? Please highlight the word and press Shift + Enter  
< Prev   CONTENTS   Next >
 
Subjects
Accounting
Business & Finance
Communication
Computer Science
Economics
Education
Engineering
Environment
Geography
Health
History
Language & Literature
Law
Management
Marketing
Philosophy
Political science
Psychology
Religion
Sociology
Travel