A d-ary heap problem from CLRS

Question Detail: 

I got confused while solving the following problem (questions 1–3).

Question

A d-ary heap is like a binary heap, but(with one possible exception) non-leaf nodes have d children instead of 2 children.

  1. How would you represent a d-ary heap in an array?

  2. What is the height of a d-ary heap of n elements in terms of n and d?

  3. Give an efficient implementation of EXTRACT-MAX in a d-ary max-heap. Analyze its running time in terms of d and n.

  4. Give an efficient implementation of INSERT in a d-ary max-heap. Analyze its running time in terms of d and n.

  5. Give an efficient implementation of INCREASE-KEY(A, i, k), which flags an error if k < A[i] = k and then updates the d-ary matrix heap structure appropriately. Analyze its running time in terms of d and n.

My Solution

  1. Give an array $A[a_1 .. a_n]$

    $\qquad \begin{align} \text{root} &: a_1\\ \text{level 1} &: a_{2} \dots a_{2+d-1}\\ \text{level 2} &: a_{2+d} \dots a_{2+d+d^2-1}\\ &\vdots\\ \text{level k} &: a_{2+\sum\limits_{i=1}^{k-1}d^i} \dots a_{2+\sum\limits_{i=1}^{k}d^i-1} \end{align}$

    My notation seems a bit sophisticated. Is there any other simpler one?

  2. Let h denotes the height of the d-ary heap.

    Suppose that the heap is a complete d-ary tree $$ 1+d+d^2+..+d^h=n\\ \dfrac{d^{h+1}-1}{d-1}=n\\ h=log_d[n{d-1}+1] - 1 $$

  3. This is my implementation:

    EXTRACT-MAX(A) 1  if A.heapsize < 1 2      error "heap underflow" 3  max = A[1] 4  A[1] = A[A.heapsize] 5  A.heap-size = A.heap-size - 1 6  MAX-HEAPIFY(A, 1) 7  return max  MAX-HEAPIFY(A, i) 1  assign depthk-children to AUX[1..d] 2  for k=1 to d 3      compare A[i] with AUX[k] 4      if A[i] <= AUX[k] 5          exchange A[i] with AUX[k] 6          k = largest 7  assign AUX[1..d] back to A[depthk-children] 8  if largest != i 9      MAX-HEAPIFY(A, (2+(1+d+d^2+..+d^{k-1})+(largest-1) ) 
    • The running time of MAX-HEAPIFY:

      $$T_M = d(c_8*d + (c_9+..+c_13)*d +c_14*d)$$ where $c_i$ denotes the cost of i-th line above.

    • EXTRACT-MAX: $$ T_E = (c_1+..+c_7) + T_M \leq C*d*h\\ = C*d*(log_d[n(d-1)+1] - 1)\\ = O(dlog_d[n(d-1)]) $$

    Is this an efficient solution? Or there is something wrong within my solution?

Asked By : lucasKoFromTW
Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/6078

Answered By : Bartosz Przybylski

  1. Your solution is valid and follows definition of d-ary heap. But as you pointed out your notation is a bit sophisticated.

    You might use those two following functions to retrieve parent of i-th element and j-th child of i-th element.

    $\text{d-ary-parent}({\it i}) \\ \ \ \ \ {\bf return}\ \lfloor (i-2)/d + 1 \rfloor$

    $\text{d-ary-child}(i, j) \\ \ \ \ \ {\bf return}\ d(i-1)+j+1$

    Obviously $1 \le j \le d$. You can verify those functions checking that $\text{d-ary-parent}(\text{d-ary-child}(i,j)) = i$

    Also easy to see is that binary heap is special type of $d$-ary heap where $d=2$, if you substitute $d$ with $2$, then you will see that they match functions PARENT, LEFT and RIGHT mentioned in book.

  2. If I understand your answer correctly you use a geometric progression. In your case you get go $h = log_d(n\,d-1+1) - 1$, which is obviously $\log_d(n\,d) - 1 = \log_d(n) + \log_d(d) - 1 = \log_d(n) + 1 - 1 = \log_d(n)$, which in fact is a valid and correct solution. But just for sake of handling constant fluctuations you might wanna write $\Theta(\log_d(n))$.

    The reason for this is that some heaps might not be balanced, so their longest path and shorthest path migt vary by some constant $c$, by using $\Theta$ notation you eliminate this problem.

  3. You don't need to re-implement procedure given in textbook, but you must alter it a bit, eg assigning all children to $AUX$ table using given $\text{d-ary-parent}$ and $\text{d-ary-child}$ functions.

    Because $\text{EXTRACT-MAX}$ was not altered, it depends on running time of $\text{MAX-HEAPIFY}$. In your analysis you now you must use worst-case time proportional to hight and number of children which each node must examine (which is at most d). Once again your analysis is very precise, in the end you got $O(d\ \log_d(n(d-1)))$, which can be transformed to:

    $O(d\ \log_d(n(d-1))) = O(d(\log_d(n) + \log(d-1))) = O(d\ log_d(n) + d\ \log_d(d-1))$

    For practical reasons we can always assume that $d \ll n$, so we can lose the $d \log_d(d-1)$ part of O notation, then we will get $O(d\log_d(n))$. Which is also valid solution. But not surprisingly you can also analyse function run time using Master theorem, which will show that $\text{MAX-HEAPIFY}$ is not only $O$ but even $\Theta$.

  4. CLRS book already provides INSERT procedure. Which looks like this:

    $\text{INSERT}(A, key)\\ \ \ \ \ A.heap\_size = A.heap\_size + 1 \\ \ \ \ \ A[A.heap\_size] = -\infty\\ \ \ \ \ \text{INCREASE-KEY}(A, A.heap\_size, key)$

    It can be easy proven but common sense dictates that it's time complexity is $O(\log_d(n))$. It's because heap might be traversed all the way to the root.

  5. Just like INSERT, INCREASE-KEY is also defined in textbook as:

    $\text{INCREASE-KEY}(A, i, key)\\ \ \ \ \ {\bf if}\ key < A[i]\\ \ \ \ \ \ \ \ \ {\bf error} \text{"new key is smaller then current"}\\ \ \ \ \ A[i] = key\\ \ \ \ \ {\bf while}\ i > 1\ and\ A[i] > A[\text{d-ary-parent}(i)]\\ \ \ \ \ \ \ \ \ A[i] \leftrightarrow A[\text{d-ary-parent(i)}]\\ \ \ \ \ \ \ \ \ i = \text{d-ary-parent(i)}$

    Complexity is obviously $O(\log_d(n))$ (see previous point).

No comments

Powered by Blogger.