Quicksort vs. insertion sort on linked list: performance

Question Detail: 

I have written a program to sort linked lists and I noticed that my insertion sort works much better than my quicksort algorithm. Does anyone have any idea why this is? Insertion sort has a complexity of $\Theta(n^2)$ and quicksort $O(n\log n)$, so quicksort should be faster. I tried it with random inputs of various sizes and it shows me the contrary. Strange...

Here is the code in Java:

public static LinkedList qSort(LinkedList list) {
    LinkedList x, y;
    Node currentNode;
    int size = list.getSize();

    //Create new lists: x for smaller-or-equal elements, y for greater elements
    x = new LinkedList();
    y = new LinkedList();

    if (size <= 1)
        return list;
    else {
        Node pivot = getPivot(list);
        // System.out.println("Pivot: " + pivot.value);

        //We start from the head
        currentNode = list.head;

        for (int i = 0; i <= size - 1; i++) {
            //Check that the currentNode is not our pivot
            if (currentNode != pivot) {
                //Nodes with values smaller than or equal to the pivot go in x
                if (currentNode.value <= pivot.value) {
                    x.addNode(currentNode.value);
                    // System.out.print("Elements in x:");
                    // x.printList();
                }
                //Nodes with values greater than the pivot go in y
                else if (currentNode.value > pivot.value) {
                    y.addNode(currentNode.value);
                    // System.out.print("Elements in y:");
                    // y.printList();
                }
            }
            //Set the pointer to the next node
            currentNode = currentNode.next;
        }

        //Recursive calls and concatenation of the lists and the pivot
        return concatenateList(qSort(x), pivot, qSort(y));
    }
}
Asked By : forrestGump
Best Answer from CS Stack Exchange

Question Source : http://cs.stackexchange.com/questions/1354

Answered By : Louis

Pedro's answer covers what is going wrong with your first attempt. But I want to talk about a few things when it comes to analyzing algorithms that may be a bit subtle.

What is your algorithm?

Before you can start analyzing an algorithm, you need to say exactly what you're doing. "Quicksort", for example, can mean a number of different things. The basic pattern is:

  • Pick a pivot element
  • Partition the input around the pivot
  • Recursively sort the pieces
  • Combine the sub-problems

Even at this level, there is some ambiguity, since I haven't actually described how to pick the pivot or how to partition. In the modern era, which probably means the last 20 years or so, the usual understanding is that you:

  • Pick the pivot uniformly at random
  • Partition into "less than", "equal", and "greater than" pieces

In which case the number of comparisons is $O(n\log n)$ in expectation and close to the information-theoretic bound.
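
To make that concrete, here is a minimal sketch of the random-pivot, three-way-partition version on a singly linked list. It follows the four-step pattern above; the Node class and the qSort/concat helpers are placeholders of my own, not the asker's LinkedList, getPivot, or concatenateList.

import java.util.Random;

class ThreeWayQuicksort {
    // Minimal singly linked list node (a stand-in for the asker's Node class).
    static class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    static final Random RNG = new Random();

    // Sorts the list headed by 'head' and returns the new head.
    static Node qSort(Node head) {
        if (head == null || head.next == null)
            return head;                       // lists of size 0 or 1 are already sorted

        // 1. Pick a pivot uniformly at random: one O(n) pass to count, one to fetch.
        int size = 0;
        for (Node n = head; n != null; n = n.next) size++;
        Node pivot = head;
        for (int skip = RNG.nextInt(size); skip > 0; skip--) pivot = pivot.next;
        int pivotValue = pivot.value;

        // 2. Partition into "less", "equal", "greater" in a single O(n) pass,
        //    keeping a tail pointer for each piece so every append is O(1).
        Node lessHead = null, lessTail = null;
        Node equalHead = null, equalTail = null;
        Node greaterHead = null, greaterTail = null;
        Node current = head;
        while (current != null) {
            Node next = current.next;
            current.next = null;               // detach and move the node, no copying
            if (current.value < pivotValue) {
                if (lessTail == null) { lessHead = lessTail = current; }
                else { lessTail.next = current; lessTail = current; }
            } else if (current.value == pivotValue) {
                if (equalTail == null) { equalHead = equalTail = current; }
                else { equalTail.next = current; equalTail = current; }
            } else {
                if (greaterTail == null) { greaterHead = greaterTail = current; }
                else { greaterTail.next = current; greaterTail = current; }
            }
            current = next;
        }

        // 3. Recursively sort the strictly-smaller and strictly-greater pieces.
        Node sortedLess = qSort(lessHead);
        Node sortedGreater = qSort(greaterHead);

        // 4. Combine: less ++ equal ++ greater.
        return concat(sortedLess, concat(equalHead, sortedGreater));
    }

    // Appends list 'b' to the end of list 'a' and returns the combined head.
    static Node concat(Node a, Node b) {
        if (a == null) return b;
        Node tail = a;
        while (tail.next != null) tail = tail.next;
        tail.next = b;
        return a;
    }

    public static void main(String[] args) {
        Node head = null, tail = null;
        for (int v : new int[]{5, 1, 4, 1, 5, 9, 2, 6}) {
            Node n = new Node(v);
            if (tail == null) { head = tail = n; } else { tail.next = n; tail = n; }
        }
        for (Node n = qSort(head); n != null; n = n.next)
            System.out.print(n.value + " ");   // prints: 1 1 2 4 5 5 6 9
    }
}

Because the pivot is chosen uniformly at random, the $O(n\log n)$ expected bound holds for every input, not only for random inputs.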

But! Notice that you and some commenters like to pick the pivot in other ways, so it is still important to say what you really mean. Picking the pivot deterministically means that there are inputs (depending on your rule) that will be much, much worse: for example, always choosing the first element makes quicksort take $\Theta(n^2)$ comparisons on already-sorted input.

What are your primitives?

Even once you've decided what you really want your algorithm to do, you need to think about the "basic steps". For quicksort we need:

  • A way to pick the pivot
  • A way to compare elements
  • A way to partition
  • A way to combine

Notice that most analyses of sorting boil down to counting comparisons. We'll talk about why that is below.
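
One way to make those primitives explicit is to treat them as an interface; this is my own framing, not something from the original answer, but it shows that a "quicksort" is only fully specified once all four operations are fixed.

import java.util.List;

// Hypothetical interface: L is the list type, E the element type.
interface QuicksortPrimitives<L, E> {
    E pickPivot(L list);                        // a way to pick the pivot
    int compare(E a, E b);                      // a way to compare elements
    List<L> partition(L list, E pivot);         // a way to partition (e.g. less / equal / greater)
    L combine(List<L> sortedPieces, E pivot);   // a way to combine the sorted pieces
}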

How much do your primitives cost?

Finally, in any detailed running time analysis, you need some model of computation in which you can count "operations". The most popular model for algorithms is the "RAM" model, which has arrays with constant-time reads/writes, the usual arithmetic operations, and integer comparisons. (But there are lots of others.)

For the usual analysis of quicksort, on an input of length $n$, you'll want:

  • Picking the pivot to be $O(n)$
  • Partitioning to be $O(n)$
  • Combining sub-arrays to be $O(n)$

You can do this with linked lists (as in your second try) or (more usually) arrays, but the actual implementations will be very different. The issue is that in your first try (according to other answers), you were spending $\Omega(n^2)$ on partitioning, which changes the complexity a lot.
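
Here is a generic illustration of how an innocent-looking helper can make the partition step quadratic; the class and method names are hypothetical, not the asker's actual code.

class IntList {
    static class Node {
        int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    Node head, tail;

    // If append has to walk to the tail every time, the k-th append costs about
    // k steps, so the n appends done while partitioning cost Theta(n^2) in total.
    void addNodeSlow(int value) {
        Node n = new Node(value);
        if (head == null) { head = n; return; }
        Node cur = head;
        while (cur.next != null)       // O(current length) walk on every single append
            cur = cur.next;
        cur.next = n;
    }

    // Keeping a tail pointer makes each append O(1), so partitioning n elements
    // costs O(n) total, which is what the analysis above assumes.
    // (Use one variant or the other on a given list, not both.)
    void addNodeFast(int value) {
        Node n = new Node(value);
        if (head == null) { head = tail = n; return; }
        tail.next = n;
        tail = n;
    }
}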

Why sorting analyses usually just count comparisons

The reason I wrote all of this down is that, even in your bad implementation, you weren't doing more comparisons. Analyses of comparison-based sorting algorithms usually count only comparisons, for a few reasons:

  • It is very general (machine independent)
  • For most algorithms in that class, a reasonable implementation can "charge" every other operation to a comparison in such a way that each comparison is charged only $O(1)$ operations

This second part is where your first try fell down. Hope this helps.
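
If you want to see this empirically, one simple experiment (my own suggestion, not part of the original answer) is to count comparisons directly with a wrapping Comparator and check that the count grows like $n\log n$ even when the wall-clock time misbehaves:

import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

class CountingComparator implements Comparator<Integer> {
    long comparisons = 0;

    @Override
    public int compare(Integer a, Integer b) {
        comparisons++;                    // every comparison passes through here
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        for (int n : new int[]{1_000, 10_000, 100_000}) {
            Integer[] data = new Integer[n];
            for (int i = 0; i < n; i++) data[i] = rng.nextInt();

            CountingComparator cmp = new CountingComparator();
            Arrays.sort(data, cmp);       // library sort, used only to drive the counter
            System.out.printf("n = %7d  comparisons = %d  (n log2 n ~ %.0f)%n",
                    n, cmp.comparisons, n * (Math.log(n) / Math.log(2)));
        }
    }
}

The same wrapper can be dropped into your own qSort or insertion sort; the interesting contrast is how the comparison counts scale versus how the running times scale.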

Even more

Everything so far was still theory. Practical sorting algorithms actually need to be tried on real machines. This is a whole area of its own, and my understanding is that the current champ is not one of the textbook algorithms but something called "timsort", which you can read about here: http://svn.python.org/projects/python/trunk/Objects/listsort.txt
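
For what it's worth, you don't have to implement TimSort yourself to benchmark against it in Java: Collections.sort on object lists uses a TimSort-derived merge sort, so a quick baseline (a rough sketch, not a rigorous benchmark) looks like this:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class BaselineSort {
    public static void main(String[] args) {
        Random rng = new Random(1);
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) data.add(rng.nextInt());

        long start = System.nanoTime();
        Collections.sort(data);           // TimSort-based sort in the JDK
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("Sorted " + data.size() + " ints in " + elapsedMs + " ms");
    }
}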
