The algorithms we will study in this text will have one of the complexities O(1), O(log n), O(n), O(n log n), O(n²), or O(cⁿ). A graph of the shapes of these functions appears in Fig. 2.5. Most algorithms have one of these complexities, each corresponding to some function of n. Constants added to or multiplied into the terms of a formula for the time needed to complete a computation do not affect the overall complexity of that operation. Computational complexity is determined only by the highest-order term of the equation. The complexities graphed in Fig. 2.5 are each some power of n or the log of n, except for the truly awful exponential complexity O(cⁿ), where c is some constant value.

As you read the text and encounter algorithms with differing complexities, they will have one of the complexities shown in Fig. 2.5. As always, the variable n represents the size of the data provided as input to the algorithm. The time taken to process that data is the vertical axis in the graph. While we don't care about the exact numbers in this graph, we do care about the overall shape of these functions. The flatter the line, the lower the slope, and the better the algorithm performs. Clearly an algorithm with exponential complexity (i.e. O(cⁿ)) or n-squared complexity (i.e. O(n²)) will not perform very well except for very small values of n. If you know your algorithm will never be called with large values of n, then an inefficient algorithm might be acceptable, but you would have to be very sure that your data size would always be small. Typically we want to design algorithms that are as efficient as possible.
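To get a feel for how quickly these functions diverge, here is a small illustrative Python script (a sketch, not from the text) that tabulates each complexity class for a few input sizes; c = 2 is an assumed constant for the exponential class:

```python
import math

# A rough comparison of the complexity classes from Fig. 2.5.
# c = 2 is chosen as the constant in the exponential class O(c^n).
sizes = [1, 2, 4, 8, 16, 32]
print(f"{'n':>4} {'log n':>8} {'n log n':>8} {'n^2':>8} {'2^n':>12}")
for n in sizes:
    print(f"{n:>4} {math.log2(n):>8.1f} {n * math.log2(n):>8.1f} {n**2:>8} {2**n:>12}")
```

Even at n = 32 the exponential column has reached the billions while n log n is still under 200, which is why the shape of the curve matters far more than any constant factor.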

In subsequent chapters you will encounter sorting algorithms that are O(n²) and then you'll learn that we can do better and achieve O(n log n) complexity. You'll see search algorithms that are O(n) and then learn how to achieve O(log n) complexity. You'll also learn a technique called hashing that will search in O(1) time. The techniques you learn will help you deal with large amounts of data as efficiently as possible. As each of these techniques is explored, you'll also have the opportunity to write some fun programs and you'll learn a good deal about object-oriented programming.

Fig. 2.5 Common Big-Oh Complexities
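As a preview of that progression from O(n) to O(log n) searching, the sketch below contrasts a linear search with a binary search on a sorted list. Both functions are illustrative stand-ins, not the versions developed later in the text:

```python
def linear_search(seq, target):
    # O(n): examine each element in turn until the target is found.
    for i, item in enumerate(seq):
        if item == target:
            return i
    return -1

def binary_search(seq, target):
    # O(log n): halve the sorted search space on each iteration.
    low, high = 0, len(seq) - 1
    while low <= high:
        mid = (low + high) // 2
        if seq[mid] == target:
            return mid
        elif seq[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1

data = list(range(0, 100, 2))   # the sorted even numbers 0..98
print(linear_search(data, 42))  # 21
print(binary_search(data, 42))  # 21
```

On this 50-element list the linear search may examine up to 50 elements, while the binary search needs at most six comparisons, and the gap widens rapidly as n grows.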

2.9 More Asymptotic Notation

Earlier in this chapter we developed Big-Oh notation for describing an upper bound on the complexity of an algorithm. There we began with an intuitive understanding of efficiency, saying that a function exhibits a complexity if it is bounded above by a function of n, where n represents the size of the data given to the algorithm. In this section we further develop these concepts to bound the efficiency of an algorithm from both above and below.

We begin with an in-depth discussion of efficiency and the measurement of it in Computer Science. When concerning ourselves with algorithm efficiency there are two issues that must be considered.

• The amount of time an algorithm takes to run.

• And, related to that, the amount of space an algorithm uses while running.

Typically, computer scientists will talk about a space/time tradeoff in algorithms. Sometimes we can achieve a faster running time by using more memory. But, if we use too much memory we can slow down the computer and other running programs. The space that is referred to is the amount of RAM needed to solve a problem. The time we are concerned with is a measure of how the number of operations grows as the size of the data grows.
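One concrete illustration of the space/time tradeoff (a sketch, not an example from the text) is memoizing the Fibonacci function: spending O(n) memory on a cache of results eliminates the exponential blowup of the naive recursion.

```python
def fib_naive(n):
    # Uses no extra space, but takes exponential time because
    # the same subproblems are recomputed over and over.
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

def fib_memo(n, cache=None):
    # O(n) space for the cache buys an O(n) running time:
    # each subproblem is computed once and then looked up.
    if cache is None:
        cache = {}
    if n < 2:
        return n
    if n not in cache:
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]

print(fib_memo(30))  # 832040, computed with only 29 cached entries
```

The naive version makes over a million recursive calls for n = 30; the memoized version trades a small dictionary for a dramatic reduction in time.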

Consider a function T(n) that describes the running time of an algorithm, where n is the size of the data given to the algorithm. As computer scientists we want to study the asymptotic behavior of this function. In other words, we want to study how T(n) increases as n → ∞. The value of n is a natural number representing possible sizes of input data. The natural numbers are the set of non-negative integers. The definition in Sect. 2.9.1 is a re-statement of the Big-Oh notation definition presented earlier in this chapter.

2.9.1 Big-Oh Asymptotic Upper Bound

O(g(n)) = { f(n) | ∃ d > 0 and n₀ > 0 ∋ 0 ≤ f(n) ≤ d·g(n) ∀ n ≥ n₀ }

We write that

f(n) is O(g(n)) ⇔ f(n) ∈ O(g(n))

and we say that f is big-oh g of n. The definition of Big-Oh says that we can find an upper bound for the time it will take for an algorithm to run. Consider the plot of time versus data size given in Fig. 2.3. Data size, or n, is the x axis, while time is the y axis. Imagine that the green line represents the observed behavior of some algorithm.

The blue line clearly is an upper bound to the green line after about n = 4. This is what the definition of big-Oh means. For a while, the upper bounding function may not be an upper bound, but eventually it becomes an upper bound and stays that way all the way to the limit as n approaches infinity.
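This definition can be checked numerically over a finite range. The hypothetical helper below (not part of the text) tests whether d·g(n) dominates f(n) for every sampled n ≥ n₀, mirroring the picture of the bounding line eventually overtaking the observed one:

```python
def is_upper_bound(f, g, d, n0, limit=1000):
    # True if d*g(n) >= f(n) for every integer n0 <= n <= limit.
    # A finite sample, so it only illustrates the definition.
    return all(d * g(n) >= f(n) for n in range(n0, limit + 1))

# f(n) = 3n + 10 is O(n): with d = 4, 4n >= 3n + 10 once n >= 10,
# but the bound fails for smaller n, just as the definition allows.
print(is_upper_bound(lambda n: 3 * n + 10, lambda n: n, d=4, n0=10))  # True
print(is_upper_bound(lambda n: 3 * n + 10, lambda n: n, d=4, n0=5))   # False
```

The second call fails precisely because the crossover has not happened yet at n = 5, which is the role n₀ plays in the definition.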

But, does the blue line represent a tight bound on the complexity of the algorithm whose running time is depicted by the green line? We'd like to know that when we describe the complexity of an algorithm it is truly representative of the actual running time. Saying that the algorithm runs in O(n²) is accurate even if the algorithm runs in time proportional to n, because Big-Oh notation only describes an upper bound. If we truly want to say what the algorithm's running time is proportional to, then we need a little more power. This leads us to our next definition in Sect. 2.9.2.

2.9.2 Asymptotic Lower Bound

Ω(g(n)) = { f(n) | ∃ c > 0 and n₀ > 0 ∋ 0 ≤ c·g(n) ≤ f(n) ∀ n ≥ n₀ }

Omega notation serves as a way to describe a lower bound of a function. In this case the lower bound definition says that for a while c·g(n) might be greater, but eventually there is some n₀ beyond which T(n) dominates c·g(n) for all larger values of n. In that case, we can write that the algorithm is Ω(g(n)). Considering our graph once again, we see that the purple line is dominated by the observed behavior sometime after n = 75. As with the upper bound, for a while the lower bound may be greater than the observed behavior, but after a while, the lower bound stays below the observed behavior for all bigger values of n.
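The lower bound can be probed numerically the same way as the upper bound. This hypothetical helper (an illustration, not from the text) checks that c·g(n) stays at or below f(n) for every sampled n ≥ n₀:

```python
def is_lower_bound(f, g, c, n0, limit=1000):
    # True if 0 <= c*g(n) <= f(n) for every integer n0 <= n <= limit.
    # A finite sample, so it only illustrates the definition.
    return all(0 <= c * g(n) <= f(n) for n in range(n0, limit + 1))

# f(n) = n^2 // 2 is Omega(n^2): with c = 1/4, n^2/4 <= n^2//2 for n >= 2.
print(is_lower_bound(lambda n: n * n // 2, lambda n: n * n, c=0.25, n0=2))  # True

# But a linear f(n) = n is NOT Omega(n^2): n^2/4 eventually exceeds n.
print(is_lower_bound(lambda n: n, lambda n: n * n, c=0.25, n0=2))           # False
```

The second example shows why Omega rules out overly optimistic claims, just as Big-Oh rules out overly pessimistic ones.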

With both a lower bound and an upper bound definition, we now have the notation to define an asymptotically tight bound. This is called Theta notation.

2.9.3 Theta Asymptotic Tight Bound

Θ(g(n)) = { f(n) | ∃ c > 0, d > 0 and n₀ > 0 ∋ 0 ≤ c·g(n) ≤ f(n) ≤ d·g(n) ∀ n ≥ n₀ }

If we can find such a function g, then we can declare that Θ(g(n)) is an asymptotically tight bound for T(n), the observed behavior of an algorithm. In Fig. 2.6 the upper bound blue line is g(n) = n² and the lower bound purple line is a plot of g(n)/110. If we let c = 1/110 and d = 1, we have the asymptotically tight bound of T(n) as Θ(n²). Now, instead of saying that n-squared is an upper bound on the algorithm's behavior, we can proclaim that the algorithm truly runs in time proportional to n-squared. The behavior is bounded above and below by functions of n-squared, proving the claim that the algorithm is an n-squared algorithm.
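Combining the two inequalities, a Theta bound can also be checked numerically over a finite range. In this sketch (not from the text) the running time T(n) = n²/50 + n is an assumed, illustrative function, squeezed between g(n)/110 and g(n) with g(n) = n²:

```python
def is_theta(f, g, c, d, n0, limit=1000):
    # True if c*g(n) <= f(n) <= d*g(n) for every integer n0 <= n <= limit.
    # A finite sample, so it only illustrates the definition.
    return all(c * g(n) <= f(n) <= d * g(n) for n in range(n0, limit + 1))

# An illustrative running time squeezed between g(n)/110 and g(n),
# where g(n) = n^2, using the coefficients c = 1/110 and d = 1.
T = lambda n: n * n / 50 + n
g = lambda n: n * n
print(is_theta(T, g, c=1/110, d=1, n0=2))  # True
```

Because both inequalities hold for every sampled n ≥ 2, this T(n) behaves like a Θ(n²) function over the tested range, which is exactly the claim a tight bound makes.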
