What are the characteristics of a $\Theta(n \log n)$ time complexity algorithm?

Question Detail: 

Sometimes it's easy to identify the time complexity of an algorithm by examining it carefully. Algorithms with two nested loops over $N$ elements are obviously $N^2$. Algorithms that explore all the possible combinations of $N$ two-valued choices are obviously $2^N$.

However, I don't know how to "spot" an algorithm with $\Theta(N \log N)$ complexity. A recursive mergesort implementation, for example, is one. What are the common characteristics of mergesort or other $\Theta(N \log N)$ algorithms that would give me a clue if I were analyzing one?

I'm sure there is more than one way an algorithm can be of $\Theta(N \log N)$ complexity, so any and all answers are appreciated. BTW I'm seeking general characteristics and tips, not rigorous proofs.

Asked By : Barry Fruitman
Best Answer from Stack Exchange

Question Source : http://cs.stackexchange.com/questions/10091

Answered By : Gilles

Your archetypal $\Theta(n \log n)$ algorithm is divide-and-conquer: it divides (and recombines) the work in linear time and recurses over the pieces. Merge sort works that way: spend $O(n)$ time splitting the input into two roughly equal pieces, recursively sort each piece, and spend $\Theta(n)$ time combining the two sorted halves.
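
For concreteness, here is a minimal merge sort sketch in Python following that outline (the helper names merge_sort and merge are my own, not from any particular library):

```python
def merge_sort(xs):
    """Sort a list by splitting it, recursing on each half, and merging in linear time."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    left = merge_sort(xs[:mid])    # recurse on the first half
    right = merge_sort(xs[mid:])   # recurse on the second half
    return merge(left, right)      # Theta(n) recombination

def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```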

Intuitively, continuing the divide-and-conquer idea, each division stage takes linear time in total, because the increase in the number of pieces exactly compensates for the decrease in their size: dividing a piece takes time proportional to its size, so twice as many pieces of half the size cost the same. The total running time is the cost of one division stage multiplied by the number of division stages. Since the size of the pieces is halved at each stage, there are $\log_2(n)$ division stages, so the total running time is $n \cdot \log(n)$. (Up to a multiplicative constant, the base of the logarithm is irrelevant.)
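
As a small worked example, take $n = 8$: there are $\log_2 8 = 3$ division stages, and each stage does $8$ units of work in total, $$ \underbrace{1 \cdot 8}_{\text{stage 1}} + \underbrace{2 \cdot 4}_{\text{stage 2}} + \underbrace{4 \cdot 2}_{\text{stage 3}} = 3 \cdot 8 = 8 \log_2 8. $$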

Putting it in equations, one way to estimate the running time $T(n)$ of such an algorithm is to express it recursively: $T(n) = 2 T(n/2) + \Theta(n)$. It's clear that this algorithm takes more than linear time, and we can see how much more by dividing by $n$: $$ \frac{T(n)}{n} = \frac{T(n/2)}{n/2} + \Theta(1) $$ When $n$ doubles, $T(n)/n$ increases by a constant amount: $T(n)/n$ grows logarithmically, or in other words, $T(n) = \Theta(n \log n)$.
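
If you want to see this numerically, here is a short Python sketch (my own illustration, assuming $T(1) = 1$ and restricting $n$ to powers of two) that evaluates the recurrence and prints the ratio $T(n)/(n \log_2 n)$, which settles toward a constant:

```python
from math import log2

def T(n):
    """Cost given by the recurrence T(n) = 2*T(n/2) + n, with T(1) = 1 (an assumed base case)."""
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n

for n in [2 ** k for k in (4, 7, 10, 13)]:
    print(n, T(n) / (n * log2(n)))   # the ratio approaches a constant as n grows
```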

This is an instance of a more general pattern: the master theorem. For any recursive algorithm that divides its input of size $n$ into $a$ pieces of size $n/b$ and takes a time $f(n)$ to perform the division and recombination, the running time satisfies $T(n) = a \cdot T(n/b) + f(n)$. This leads to a closed form that depends on the values of $a$ and $b$ and the shape of $f$. If $a = b$ and $f(n) = \Theta(n)$, the master theorem states that $T(n) = \Theta(n \log n)$.
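
As an illustration, the sketch below encodes the simplified (polynomial) form of the master theorem, assuming $f(n) = \Theta(n^k)$; the function name and the restriction to polynomial $f$ are my own simplifications:

```python
from math import isclose, log

def master_theorem(a, b, k):
    """Asymptotic solution of T(n) = a*T(n/b) + Theta(n^k), by the simplified master theorem."""
    c = log(a, b)                      # critical exponent log_b(a)
    if isclose(k, c):
        return f"Theta(n^{k:g} log n)"
    if k < c:
        return f"Theta(n^{c:g})"
    return f"Theta(n^{k:g})"

print(master_theorem(2, 2, 1))   # merge sort: Theta(n^1 log n)
print(master_theorem(1, 2, 0))   # binary search: Theta(n^0 log n) = Theta(log n)
```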
