What are the reasons to learn different algorithms / data structures serving the same purpose?
I have been wondering about this question since I was an undergraduate student. It is a general question but I will elaborate with examples below.
I have seen a lot of algorithms. For example, for maximum flow problems, I know of around three algorithms that can solve them: Ford-Fulkerson, Edmonds-Karp & Dinic's, with Dinic's having the best complexity.
The same goes for data structures. For example, among heaps there are binary heaps, binomial heaps & Fibonacci heaps, with the Fibonacci heap having the best overall (amortized) complexity.
What keeps confusing me is: are there any reasons why we need to know them all? Why not just learn and get familiar with the one that has the best complexity?
I know it is best if we know them all; I just want to know whether there are any "more valid" reasons, such as some problems being solvable only by using A but not B, etc.
Asked By : shole
Best Answer from Stack Exchange
Question Source : http://cs.stackexchange.com/questions/53225
Answered By : Pseudonym
There's a textbook waiting to be written at some point, with the working title Data Structures, Algorithms, and Tradeoffs. Almost every algorithm or data structure which you're likely to learn at the undergraduate level has some feature which makes it better for some applications than others.
Let's take sorting as an example, since everyone is familiar with the standard sort algorithms.
First off, complexity isn't the only concern. In practice, constant factors matter, which is why (say) quick sort tends to be used more than heap sort even though quick sort has terrible worst-case complexity ($O(n^2)$, versus heap sort's $O(n \log n)$).
Secondly, there's always the chance that you find yourself in a situation where you're programming under strange constraints. I once had to do quantile extraction from a modest-sized (1000 or so) collection of samples as fast as possible, but it was on a small microcontroller which had very little spare read-write memory, so that ruled out most $O(n \log n)$ sort algorithms. Shell sort was the best tradeoff, since it was sub-quadratic and didn't require additional memory.
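A minimal sketch of what such an in-place Shell sort might look like in C (the gap sequence, types, and quantile step are illustrative, not the original code from that project):

```c
#include <stddef.h>

/* In-place Shell sort: no recursion and no auxiliary arrays, so it fits
 * a RAM-constrained microcontroller.  This sketch simply halves the gap
 * each round; better gap sequences (e.g. Ciura's) give sub-quadratic
 * behaviour, but all of them keep the extra memory at O(1). */
void shell_sort(int *a, size_t n)
{
    for (size_t gap = n / 2; gap > 0; gap /= 2) {
        for (size_t i = gap; i < n; i++) {
            int tmp = a[i];
            size_t j = i;
            /* Insertion-sort the elements that are `gap` apart. */
            while (j >= gap && a[j - gap] > tmp) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = tmp;
        }
    }
}

/* Quantile extraction afterwards, e.g. the 95th percentile:
 *   shell_sort(samples, n);
 *   int q95 = samples[(n * 95) / 100];
 */
```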
In other cases, ideas from an algorithm or data structure might be applicable to a special-purpose problem. Bubble sort always seems to be slower than insertion sort on real hardware, but the idea of performing a single bubble pass is sometimes exactly what you need.
Consider, for example, some kind of 3D visualisation or video game on a modern video card, where you'd like to draw objects in order from closest-to-the-camera to furthest-from-the-camera for performance reasons, but if you don't get the order exact, the hardware will take care of it. If you're moving around the 3D environment, the relative order of objects won't change very much between frames, so performing one bubble pass every frame might be a reasonable tradeoff. (The Source engine by Valve does this for particle effects.)
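A rough sketch of that idea in C (the DrawItem struct, the depth field, and running exactly one pass per frame are illustrative assumptions, not the Source engine's actual code):

```c
#include <stddef.h>

/* Hypothetical per-object draw record; `depth` is the distance from
 * the camera, recomputed each frame. */
typedef struct {
    float depth;
    int   id;   /* stand-in for whatever the renderer actually needs */
} DrawItem;

/* One bubble pass per frame: O(n), swaps only adjacent out-of-order
 * pairs to move the list toward near-to-far order.  Because the
 * relative order of objects barely changes between frames, the list
 * stays nearly sorted, and the hardware tolerates the small errors. */
void bubble_pass(DrawItem *items, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (items[i].depth > items[i + 1].depth) {
            DrawItem tmp = items[i];
            items[i] = items[i + 1];
            items[i + 1] = tmp;
        }
    }
}
```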
There's persistence, concurrency, cache locality, scalability onto a cluster/cloud, and a host of other possible reasons why one data structure or algorithm may be more appropriate than another even given the same computational complexity for the operations that you care about.
Having said that, it doesn't mean you should memorise a bunch of algorithms and data structures just in case. Most of the battle is realising that there is a tradeoff to be exploited in the first place, and knowing where to look if you think there might be something appropriate.