Why is Quicksort described as "in-place" if the sublists take up quite a bit of memory? Surely only something like bubble sort is in-place?

Question Detail: 

Quicksort is described as "in-place" but using an implementation such as:

def sort(array):
    less = []
    equal = []
    greater = []
    if len(array) > 1:
        pivot = array[0]
        for x in array:
            if x < pivot:
                less.append(x)
            if x == pivot:
                equal.append(x)
            if x > pivot:
                greater.append(x)
        return sort(less) + equal + sort(greater)
    else:
        return array

You have to create a copy of the list for each recursion. By the first return, in memory we have:

  • array
  • less + equal + greater

Then by the second recursion across all sub-lists we have:

  • array
  • less, equal, greater from the first recursion
  • less + equal + greater from less1, and less + equal + greater from greater1

etc...

Is this just badly written code, or am I correct in thinking that for a big list you actually need a proportionally large amount of extra space to store these?
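A rough way to check the numbers: the sketch below instruments the same code with live/peak counters (illustrative additions of mine, not part of any library) that tally how many sub-list cells are alive at once. Along the active recursion path that is roughly n + n/2 + n/4 + ... for random pivots, so the peak should come out around 2n on top of the input itself.

    import random

    live = 0   # sub-list cells currently allocated by sort()
    peak = 0   # highest value live has reached

    def sort(array):
        global live, peak
        less, equal, greater = [], [], []
        if len(array) > 1:
            pivot = array[0]
            for x in array:
                if x < pivot:
                    less.append(x)
                elif x == pivot:
                    equal.append(x)
                else:
                    greater.append(x)
            live += len(array)        # the three new sub-lists together
            peak = max(peak, live)
            result = sort(less) + equal + sort(greater)
            live -= len(array)        # sub-lists become garbage on return
            return result
        return array

    sort([random.randrange(1000) for _ in range(10000)])
    print(peak)   # typically around 20000 for 10000 random elements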

When I think of something that is "in-place", I think of bubble sort, which simply swaps elements within the list, as in this animation: http://en.wikipedia.org/wiki/File:Bubble-sort-example-300px.gif

Bubble sort only requires one extra variable, to store a potentially swapped element.
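For comparison, a minimal bubble sort sketch (a standard textbook version, assuming a plain Python list):

    def bubble_sort(array):
        # Sorts array in place; besides the loop indices, the single
        # temporary tmp is the only extra storage used.
        n = len(array)
        for i in range(n):
            for j in range(n - 1 - i):
                if array[j] > array[j + 1]:
                    tmp = array[j]          # the one extra variable
                    array[j] = array[j + 1]
                    array[j + 1] = tmp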

Asked By : LittleBobbyTables
Best Answer from Stack Exchange

Question Source : http://cs.stackexchange.com/questions/22516

Answered By : Gilles

This particular implementation of quicksort is not in-place. It treats the data structure as a list that can only grow in one direction (in which case a merge sort would be simpler and faster). However, it is possible to write an in-place implementation of quicksort, and this is the way it is usually presented.

In an in-place implementation, the sort function is not called recursively on newly constructed arrays that become smaller and smaller, but on index bounds that move closer and closer together.

def sort(array, start, end):
    # Sort array[start:end] in place (end is exclusive).
    if end - start <= 1:
        return
    pivot = array[start]
    middle = start + 1   # array[start+1:middle] holds the elements < pivot
    for i in range(start + 1, end):
        if array[i] < pivot:
            tmp = array[i]            # swap array[i] into the "less" region
            array[i] = array[middle]
            array[middle] = tmp
            middle += 1
    # put the pivot between the two regions
    tmp = array[start]
    array[start] = array[middle - 1]
    array[middle - 1] = tmp
    sort(array, start, middle - 1)
    sort(array, middle, end)

(Beware of this code; I have only typed it, not proved it. Any off-by-one errors are yours to fix. Real-world implementations would use a different algorithm for small sizes, but this doesn't affect the asymptotic behavior. Real-world implementations would also choose a better pivot, but I won't get into that here as it doesn't really matter for this question.)
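For example, sorting a whole list in place (remember that end is exclusive):

    data = [3, 1, 4, 1, 5, 9, 2, 6]
    sort(data, 0, len(data))
    print(data)   # [1, 1, 2, 3, 4, 5, 6, 9]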

The Wikipedia page presents both a non-in-place and an in-place version of the algorithm.

Quicksort as written here requires $O(d) + s$ extra storage, where $d$ is the depth of the recursion (which depends on the pivot quality) and $s$ is the element size. The stack size requirement can be improved: there are two recursive calls, and we can make the second one a tail call (which consumes no stack). If we always make the recursive call on the smaller half first, then the maximum stack size $\hat S(n)$ for array lengths up to $n$ satisfies $\hat S(n) \le \hat S(m) + O(1)$ for some $m \le n/2$, i.e. $\hat S(n) \le \hat S(n/2) + O(1)$, so $\hat S(n) = O(\log_2 n)$. Thus we can achieve an $O(\log n) + s$ extra storage requirement, regardless of the pivot choice.
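One way to realize this, sketched with the same partition scheme as the sort above (partition and sort_small_stack are illustrative names, not standard functions): recurse only into the smaller half, and replace the second recursive call with a loop.

    def partition(array, start, end):
        # Rearrange array[start:end] around the pivot array[start];
        # return the pivot's final index.
        pivot = array[start]
        middle = start + 1
        for i in range(start + 1, end):
            if array[i] < pivot:
                array[i], array[middle] = array[middle], array[i]
                middle += 1
        array[start], array[middle - 1] = array[middle - 1], array[start]
        return middle - 1

    def sort_small_stack(array, start, end):
        # Recurse into the smaller half; handle the larger half by
        # looping (the manual tail call), so the recursion depth, and
        # hence the stack, is O(log n) even with the worst pivots.
        while end - start > 1:
            p = partition(array, start, end)
            if p - start < end - (p + 1):
                sort_small_stack(array, start, p)   # smaller half: recurse
                start = p + 1                       # larger half: iterate
            else:
                sort_small_stack(array, p + 1, end)
                end = p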

There are plenty of other sorting algorithms that can be implemented in-place, including insertion sort, selection sort and heap sort. The simple form of merge sort isn't in-place, but there is a more sophisticated variant which is.

Thanks to Aryabhata for pointing out that quicksort can always be done with $O(\log n)$ stack, and that there is a variant of merge sort with both $O(1)$ additional storage and $O(n \log n)$ runtime.
