5.1 Upper bounds: the big-O notation

Several terms are used to describe the running-time equation for an algorithm. These terms – and their associated symbols – indicate precisely what aspect of the algorithm’s behaviour is being described. One of them is the upper bound on the growth of the algorithm’s running time: the highest growth rate that the algorithm can have.

Because the phrase “has an upper bound to its growth rate of $f(n)$” is long and often used when discussing algorithms, we adopt a special notation, called big-$O$ notation. If the upper bound for an algorithm’s growth rate (for, say, the worst case) is $f(n)$, then we would write that this algorithm is “in the set $O(f(n))$ in the worst case” (or just “in $O(f(n))$ in the worst case”). For example, if $n^2$ grows as fast as $T(n)$ (the running time of our algorithm) for the worst-case input, we would say the algorithm is “in $O(n^2)$ in the worst case”.

5.1.1 Formal definition

So, how do we define the upper bound? First, if $g$ is an upper bound of $f$, then this should mean something like $f(n) \leq g(n)$ in the long run. What we mean by this is that whenever $n$ becomes sufficiently large, $f(n)$ should not outgrow $g(n)$.

But this is not all there is to it. We have already mentioned that we want to abstract away from constant factors – if algorithm $\mathbf{A}$ is twice as fast as algorithm $\mathbf{B}$, then they grow at the same rate, and we want our notation to capture that. So what we want to say is that $f(n) \leq k\cdot g(n)$, for some arbitrary constant $k$. This gives us the following formal definition:

Upper bound

$f\in O(g)$ iff there exist positive numbers $k$ and $n_0$ such that $f(n) \leq k\cdot g(n)$ for all $n > n_0$

The constant $n_0$ is a threshold: the claim of an upper bound only needs to hold once $n$ is larger than $n_0$. Usually $n_0$ is small, such as 1, but it does not need to be. You must also be able to pick some constant $k$, but the actual value of $k$ is irrelevant. In other words, the definition says that for all inputs of the type in question (such as the worst case for all inputs of size $n$) that are large enough (i.e., $n > n_0$), the algorithm always executes in at most $k\cdot g(n)$ steps for some constant $k$.

Note that this definition is somewhat simplified: it only works if $f$ and $g$ are monotonically increasing. This means that if $x \leq y$ then $f(x) \leq f(y)$, so the value never decreases when $x$ increases. This is not a real restriction for our purposes, because there are no algorithms that become faster when the input size grows. In the very best case, the runtime of an algorithm is independent of the input size, but a constant function is also monotonically increasing. If we were to allow arbitrary non-monotonic functions, the definition of upper bound would become slightly more complicated; you can look up the general definition in mathematical textbooks or on Wikipedia.
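The definition quantifies over all sufficiently large $n$, but before attempting a proof it can be useful to test a candidate pair $(k, n_0)$ numerically. Below is a minimal Python sketch of such a sanity check; the function name `appears_bounded` and the example functions are hypothetical, and checking a finite range can of course never replace a proof.

```python
import math

def appears_bounded(f, g, k, n0, n_max=10**5):
    """Check numerically that f(n) <= k*g(n) for all n0 < n <= n_max.

    Only a sanity check over a finite range -- the definition of an
    upper bound quantifies over *all* n > n0.
    """
    return all(f(n) <= k * g(n) for n in range(n0 + 1, n_max + 1))

# Hypothetical example: is f(n) = 5n + 7 in O(n)?  Try k = 6 and n0 = 7:
print(appears_bounded(lambda n: 5 * n + 7, lambda n: n, k=6, n0=7))  # True
```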

Example: Comparing two functions

Assume $f(n) = n(\log n)^2$ and $g(n) = 0.001\cdot n^2$. How can we use the definition above to prove that $f \in O(g)$?

We have to find positive numbers $k$ and $n_0$ such that $f(n) \leq k\cdot g(n)$ for all $n > n_0$. Since $g$ has a constant factor of 0.001, we can try with $k = 1000$:

$$ k\cdot g(n) = 1000 \cdot 0.001 \cdot n^2 = n^2 $$

Now, $(\log n)^2 \leq n$ holds for all $n \geq 16$ (regardless of whether the logarithm is base 2, base 10 or natural), so $f(n) = n(\log n)^2 \leq n^2 = k\cdot g(n)$, and we can set $n_0 = 16$.

Note that there are plenty of other possible choices: for example, $k = 1$ also works if we pick a large enough threshold, such as $n_0 = 10^6$. We can even use astronomically large values such as $k = n_0 = 10^{99}$, since we are only interested in what happens when $n$ grows infinitely large.
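If we want to convince ourselves numerically, a short Python check might look like the following (a hypothetical snippet, assuming base-2 logarithms, which only makes the inequality harder to satisfy). It also shows why a threshold $n_0$ is needed at all: the inequality fails for some small $n$.

```python
import math

f = lambda n: n * math.log2(n) ** 2    # assuming base-2 logarithms
g = lambda n: 0.001 * n ** 2
k, n0 = 1000, 16

# For n = 8 we get f(8) = 8 * 3^2 = 72 > 64 = k*g(8), so n0 = 1 is too small:
print(f(8) <= k * g(8))                                       # False
# But from n0 = 16 onward the bound holds (checked here up to 10^5):
print(all(f(n) <= k * g(n) for n in range(n0 + 1, 10**5)))    # True
```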

Example: Quadratic algorithm

For a particular algorithm, $T(n) = c_1 n^2 + c_2 n$ in the average case, where $c_1$ and $c_2$ are positive numbers. Then,

$$ c_1 n^2 + c_2 n \leq c_1 n^2 + c_2 n^2 = (c_1 + c_2)n^2 $$

for all $n > 1$. So, $T(n) \leq c\cdot n^2$ for $c = c_1 + c_2$ and $n_0 = 1$. Therefore, $T(n)$ is in $O(n^2)$ by the definition.
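As an illustration, here is a minimal Python sketch of an algorithm whose running time has exactly this shape; the function `count_equal_pairs` is a hypothetical example, not an algorithm referred to elsewhere in the text.

```python
def count_equal_pairs(arr):
    """Count pairs (i, j) with i < j and arr[i] == arr[j].

    The outer loop runs n times (its per-iteration bookkeeping gives
    the c2*n term), and the inner loop body runs about n^2/2 times in
    total (the c1*n^2 term), so T(n) = c1*n^2 + c2*n overall.
    """
    n = len(arr)
    count = 0
    for i in range(n):              # n iterations
        for j in range(i + 1, n):   # about n/2 iterations on average
            if arr[i] == arr[j]:
                count += 1
    return count
```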

If someone asked you out of the blue “Who is the best?”, your natural reaction would be to reply “Best at what?” In the same way, if you are asked “What is the growth rate of this algorithm?”, you need to ask “When? Best case? Average case? Or worst case?” Some algorithms have the same behaviour no matter which input instance of a given size they receive. An example is finding the maximum in an array of integers. But for many algorithms, it makes a big difference which particular input of a given size is involved, such as when searching an unsorted array for a particular value. So any statement about the upper bound of an algorithm must be in the context of some specific class of inputs of size $n$. We nearly always measure this upper bound on the best-case, average-case, or worst-case inputs. Thus, we cannot say, “this algorithm has an upper bound to its growth rate of $n^2$”, because that is an incomplete statement. We must say something like, “this algorithm has an upper bound to its growth rate of $n^2$ in the average case”.

Knowing that something is in $O(f(n))$ says only how bad things can be. Perhaps things are not nearly so bad. Because sequential search is in $O(n)$ in the worst case, it is also true to say that sequential search is in $O(n^2)$. But sequential search is practical for large $n$ in a way that is not true for some other algorithms in $O(n^2)$. We always seek to define the running time of an algorithm with the tightest (lowest) possible upper bound. Thus, we prefer to say that sequential search is in $O(n)$. This also explains why the phrase “is in $O(f(n))$” or the notation “$\in O(f(n))$” is used instead of “is $O(f(n))$” or “$= O(f(n))$”. There is no strict equality in the use of big-$O$ notation: $O(n)$ is in $O(n^2)$, but $O(n^2)$ is not in $O(n)$.
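For concreteness, here is a minimal Python sketch of sequential search (assuming the input is a plain list of values); in the worst case the loop inspects all $n$ elements, which is why $O(n)$ is the tight upper bound.

```python
def sequential_search(arr, target):
    """Return the index of target in arr, or -1 if it is absent.

    Worst case (target absent or in the last position): the loop runs
    n times, so the running time is in O(n) -- and therefore also in
    O(n^2), O(2^n), ..., but O(n) is the tightest of these bounds.
    """
    for i, x in enumerate(arr):
        if x == target:
            return i
    return -1
```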

5.1.2 Simplifying rules

We introduced simplifying rules in Section 2.7.1, but repeat them here in compact form:

(1) Transitivity: if $f\in O(g)$ and $g\in O(h)$, then $f\in O(h)$
(2) Constant factor $k>0$: if $f\in O(g)$, then $k\cdot f\in O(g)$
(3) Addition: if $f\in O(g)$ and $f'\in O(g')$, then $f+f'\in O(\max(g,g'))$
(4) Multiplication: if $f\in O(g)$ and $f'\in O(g')$, then $f\cdot f'\in O(g\cdot g')$

Using these rules we can easily determine the asymptotic growth rate for many algorithms.

  • Rule (2) says that any constant-time operation is in $O(1)$.
  • Rule (3) says that if you have a sequence of statements, you only need to consider the most expensive one: the total cost is $O(\max(f_1,\ldots,f_k))$.
  • Rule (4) says that if you have a loop that repeats a statement $p$ a number of times, the total cost is the cost of $p$ times the number of iterations: $O(n\cdot f)$, where $f$ is the cost of $p$ and $n$ the number of iterations (see the sketch after this list).
  • Rule (2) also says that if you repeat a statement $p$ a constant number of times, you can treat it as if you only executed $p$ once.
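As a concrete illustration, the following hypothetical Python function is annotated with how the rules apply to each part; it is only a sketch of the bookkeeping, not an algorithm from the text.

```python
def example(arr):
    n = len(arr)
    total = 0                      # constant-time statement: O(1), rule (2)

    for x in arr:                  # n iterations of an O(1) body:
        total += x                 #   O(n * 1) = O(n) by rule (4)

    for i in range(n):             # n * n iterations of an O(1) body:
        for j in range(n):         #   O(n * n) = O(n^2) by rule (4)
            total += arr[i] * arr[j]

    return total                   # the whole sequence of statements:
                                   #   O(max(1, n, n^2)) = O(n^2) by rule (3)
```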

5.1.3 Big-O and logarithms

One interesting consequence of asymptotic complexity is that the base of a logarithm becomes irrelevant:

$$ O(\log_2(n)) = O(\ln(n)) = O(\log_{10}(n)) $$

The reason for this is that, according to the logarithm laws, $\log_b(n) = \log_a(n) \cdot 1/\log_a(b)$. But $1/\log_a(b)$ is a constant, which we can ignore, so $O(\log_b(n)) = O(\log_a(n))$. Therefore we can just ignore the base and write $O(\log n)$. (Note that this does not hold for exponential growth – e.g., $2^n \in O(10^n)$, but $10^n \notin O(2^n)$.)
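A quick numeric illustration (a hypothetical snippet, not from the text): the ratio between logarithms of two different bases is the same constant no matter how large $n$ gets.

```python
import math

# log2(n) / log10(n) is always log2(10), about 3.3219, independent of n:
for n in (10, 1000, 10**6, 10**12):
    print(math.log2(n) / math.log10(n))
```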

Another consequence of the logarithm laws is that it doesn’t really matter whether you take the logarithm of a linear, quadratic, cubic, or any other power function:

$$ O(\log n) = O(\log n^2) = O(\log n^3) = O(\log n^k) $$

The reason for this is that $\log n^k = k\cdot\log n$ according to the logarithm laws, so the exponent $k$ becomes a multiplicative constant and can be ignored. However, raising a logarithm to a power cannot be ignored, so $O(\log n)$ and $O(\log^2 n)$ are different complexity classes.
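The distinction is easy to see numerically (a hypothetical snippet, assuming natural logarithms): $\log(n^3)/\log(n)$ stays fixed at 3, while $\log^2(n)/\log(n) = \log(n)$ keeps growing.

```python
import math

for n in (10, 10**3, 10**6, 10**9):
    print(math.log(n**3) / math.log(n))   # always 3.0: just a constant factor
    print(math.log(n)**2 / math.log(n))   # grows without bound: ~2.3, 6.9, 13.8, 20.7
```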