Adding elements to a sorted array

Question Detail: 

What would be the fastest way of doing this (from an algorithmic perspective, as well as a practical matter)?

I was thinking something along the following lines.

I could add to the end of an array and then use bubblesort as it has a best case (totally sorted array at start) that is close to this, and has linear running time (in the best case).

On the other hand, if I know that I start out with a sorted array, I can use a binary search to find out the insertion point for a given element.

My hunch is that the second way is nearly optimal, but curious to see what is out there.

How can this best be done?

Asked By : soandos
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/930

Answered By : Patrick87

We count the number of array element reads and writes. To do bubble sort, you need $1 + 4n$ accesses (the initial write to the end, then, in the worst case, two reads and two writes to do $n$ swaps). To do the binary search, we need $2\log n + 2n + 1$ ($2\log n$ for binary search, then, in the worst case, $2n$ to shift the array elements to the right, then 1 to write the array element to its proper position).

So both methods have the same complexity for array implementations, but the binary search method requires fewer array accesses in the long run... asymptotically, half as many. There are other factors at play, naturally.

Actually, you could use better implementations and only count actual array accesses (not accesses to the element to be inserted). You could do $2n + 1$ for bubble sort, and $\log n + 2n + 1$ for binary search... so if register/cache access is cheap and array access is expensive, searching from the end and shifting along the way (smarter bubble sort for insertion) could be better, though not asymptotically so.

A better solution might involve using a different data structure. Arrays give you O(1) accesses (random access), but insertions and deletions might cost. A hash table could have O(1) insertions & deletions, accesses would cost. Other options include BSTs and heaps, etc. It could be worth considering your application's usage needs for insertion, deletion and access, and choose a more specialized structure.

Note also that if you want to add $m$ elements to a sorted array of $n$ elements, a good idea might be to efficiently sort the $m$ items, then merge the two arrays; also, sorted arrays can be built efficiently using e.g. heaps (heap sort).

No comments

Powered by Blogger.