Insertion and Deletion in AVL Trees

Submitted in Partial Fulfillment of the Requirements for
Dr. Erich Kaltofen's 66621: Analysis of Algorithms

by Robert McCloskey

December 14, 1984¹

Background

According to Knuth [Knuth73], the AVL version of the binary search tree represents a nice compromise between the optimum binary tree, whose height is minimal but for which maintenance is seemingly difficult, and the arbitrary binary tree, which requires no special maintenance but whose height could possibly grow to be much greater than minimal. It accomplishes this by employing a balanced (or admissible [AVL62]) binary tree, which is defined as one in which each node meets the requirement that the respective heights of its left and right subtrees differ by no more than one.

The height of an admissible tree with $n > 0$ nodes is at least $\lfloor \lg n \rfloor$ (as it is for any binary tree) and, as Adel'son-Vel'skii and Landis [AVL62] show, less than $\log_\varphi (n + 1)$ (approximately $1.44 \lg(n + 1)$), where $\varphi$ is the golden ratio $(1 + \sqrt{5})/2$. (For more on the golden ratio, see http://en.wikipedia.org/wiki/golden_ratio.) To obtain this result, first they computed the minimum number of nodes $M(h)$ in an admissible tree of height $h$. For $h = 0$ and $h = 1$, these values are 1 and 2, respectively. For $h > 1$, an admissible tree having a minimal number of nodes consists of a root node and minimal subtrees of heights $h - 1$ and $h - 2$. Hence, the desired minimum is given by the recurrence

$$M(0) = 1$$
$$M(1) = 2$$
$$M(h) = M(h - 1) + M(h - 2) + 1 \quad (h > 1)$$

which closely resembles the Fibonacci recurrence.² The result then follows by induction.

The height of an AVL Tree, then, can be no worse than 50% greater than optimal³, and so the number of steps required for a search is still proportional to $\lg n$, even in the worst case.

In order to easily maintain the tree's admissibility during the performance of a sequence of insertions and/or deletions, we associate with each node a balance factor, which is the difference between the heights of its left and right subtrees, respectively.
In an admissible tree, then, each node has a balance factor of either −1, 0, or +1.

¹ Revised January 21 and again in January 2012.

² Indeed, taking $F_0 = 0$ and $F_1 = 1$ as the initial values in the Fibonacci sequence and $F_k = F_{k-1} + F_{k-2}$ ($k > 1$) as the recurrence, we get $M(h) = F_{h+3} - 1$ for all $h \ge 0$.

³ The largest ratio between actual height and optimum height is realized by a 7-node AVL tree of height three.
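The recurrence for the minimum node count, the closed form in footnote 2, and the height bound can all be checked numerically. The following is a minimal sketch for illustration only: the names `min_nodes`, `fib`, and `PHI` are hypothetical and do not come from the paper, and the check covers only heights 0 through 29.

```python
import math

def min_nodes(h):
    """M(h): fewest nodes in an admissible tree of height h (a lone node has height 0)."""
    m_prev, m_cur = 1, 2          # M(0), M(1)
    if h == 0:
        return m_prev
    for _ in range(h - 1):        # apply M(h) = M(h-1) + M(h-2) + 1
        m_prev, m_cur = m_cur, m_cur + m_prev + 1
    return m_cur

def fib(k):
    """F_k with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a

PHI = (1 + math.sqrt(5)) / 2      # the golden ratio

for h in range(30):
    n = min_nodes(h)
    assert n == fib(h + 3) - 1        # closed form from footnote 2
    assert h < math.log(n + 1, PHI)   # height bound, at the worst-case node count

# Ratio of worst-case height to optimal height floor(lg n), taken at n = M(h).
worst = max(range(1, 30), key=lambda h: h / (min_nodes(h).bit_length() - 1))
```

Within the range tested, `worst` is the height at which the ratio peaks, which can be compared with the 7-node, height-three tree named in footnote 3.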
The beauty of the AVL method is that it provides a way to easily rebalance a subtree that has become inadmissible as the result of an insertion or deletion of a node within it in order to make it admissible again, without sacrificing more than a constant factor in run-time complexity. Consequently, both insertion and deletion require O(lg n) time.

Node Insertion

Insertion of a node into an AVL Tree proceeds in exactly the same manner as in an arbitrary binary search tree. Once the new node has been put in place, though, additional steps must be taken to update balance factors and to ensure the tree's admissibility. Specifically, the portion of the tree that is affected by an insertion is the subtree whose root A is the last node with a non-zero balance factor lying on what Adel'son-Vel'skii and Landis refer to as the recorded chain, which is the path from the root of the tree to the parent of the newly-inserted node. (If all nodes on the recorded chain have a balance factor of zero, A is taken to be the root of the tree.)

The newly-inserted node's balance factor should be set to zero, of course. Meanwhile, each node below A on the recorded chain has a subtree whose height has increased by one due to the insertion. (A proof is left to the reader.) Thus, each such node's balance factor should be changed from zero to either +1 or −1, respectively, according to whether the new node was put into its left subtree or right subtree.

As for A, it, too, has a subtree whose height has increased. If A's balance factor was zero (implying that it is the root of the tree), its new balance factor is determined in the same way as those of its proper descendants in the recorded chain (as described in the preceding paragraph). In this case, the tree is still admissible. If, on the other hand, A's balance factor was +1 or −1, there are two possibilities:

1. The recorded chain leads into what had been A's shorter subtree, in which case A's balance factor should be set to zero. The height of the subtree rooted at A has not been altered, and so the balance factors of A's proper ancestors need not be modified.
The tree remains admissible.

2. The recorded chain leads into A's taller subtree, in which case A's balance factor should be set to either +2 or −2 (according to whether, respectively, the newly-inserted node is in A's left subtree or right subtree), implying inadmissibility. There are two possible cases, each having a mirror image. Let B be the child of A that is on the recorded chain.

In the first case, A's and B's balance factors have the same sign. To make the subtree rooted at A admissible, a single rotation is performed, as illustrated in Figure 1.

In the second case, A's and B's balance factors have opposite signs. To make the subtree rooted at A admissible, a double rotation is performed, as illustrated in Figure 2. (That figure depicts node C as having balance factor ±1. However, the same remedy applies if C is itself the newly-inserted node (and hence has balance factor zero), in which case the subtrees labeled α, β, γ, and δ in the figure are empty.)

Note that, in both cases, the resulting subtree's height is the same as what it had been prior to the insertion. Thus, the balance factors of the proper ancestors of the subtree need not be modified.
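The claim that a rotation restores the original height can be verified for the first case with a short height calculation. The labeling here is an assumption about the figure: A is taken to be left-heavy, α and β are the left and right subtrees of B, γ is the right subtree of A, and $h \ge -1$ is the common height of all three before the insertion (with −1 denoting an empty subtree). The new node is assumed to land in α.

```latex
\begin{align*}
\text{before the insertion:} &\quad
  \operatorname{ht}(\alpha) = \operatorname{ht}(\beta) = \operatorname{ht}(\gamma) = h,\quad
  \operatorname{ht}(B) = h + 1,\quad \operatorname{ht}(A) = h + 2 \\
\text{after the insertion:} &\quad
  \operatorname{ht}(\alpha) = h + 1,\quad \operatorname{ht}(B) = h + 2,\quad
  \operatorname{ht}(A) = h + 3,\quad \operatorname{bf}(A) = (h + 2) - h = +2 \\
\text{after the single rotation:} &\quad
  \operatorname{ht}(A) = 1 + \max(\operatorname{ht}(\beta), \operatorname{ht}(\gamma)) = h + 1,\quad
  \operatorname{ht}(B) = 1 + \max(\operatorname{ht}(\alpha), \operatorname{ht}(A)) = h + 2
\end{align*}
```

After the rotation B is the root of the subtree, with α on its left and A (holding β and γ) on its right, so both A and B end with balance factor zero and the subtree's height is again $h + 2$, as it was before the insertion.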
[Figure 1: Case I: Rebalancing Using a Single Rotation]

[Figure 2: Case II: Rebalancing Using a Double Rotation]
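The insertion procedure, including Cases I and II and their mirror images, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the paper's own formulation: the names `Node`, `rotate_right`, `rotate_left`, `rebalance`, `insert`, and `check` are hypothetical, duplicate keys are ignored, and instead of locating node A explicitly the recursion reports whether each subtree's height grew, which yields the same balance-factor updates.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.bf = 0  # balance factor: height(left) - height(right)

def rotate_right(a):
    """Single rotation promoting a's left child b; returns the new subtree root."""
    b = a.left
    a.left, b.right = b.right, a
    return b

def rotate_left(a):
    """Mirror image of rotate_right."""
    b = a.right
    a.right, b.left = b.left, a
    return b

def rebalance(a):
    """Repair a node whose balance factor became +2 or -2 after an insertion."""
    if a.bf == +2:
        b = a.left
        if b.bf == +1:                  # Case I: same sign, single rotation
            a.bf = b.bf = 0
            return rotate_right(a)
        c = b.right                     # Case II: opposite signs, double rotation
        a.bf, b.bf = (-1 if c.bf == +1 else 0), (+1 if c.bf == -1 else 0)
        c.bf = 0
        a.left = rotate_left(b)
        return rotate_right(a)
    b = a.right                         # mirror images (a.bf == -2)
    if b.bf == -1:
        a.bf = b.bf = 0
        return rotate_left(a)
    c = b.left
    a.bf, b.bf = (+1 if c.bf == -1 else 0), (-1 if c.bf == +1 else 0)
    c.bf = 0
    a.right = rotate_right(b)
    return rotate_left(a)

def insert(node, key):
    """Insert key; return (subtree root, True if the subtree's height grew)."""
    if node is None:
        return Node(key), True
    if key == node.key:                 # assumption: duplicates are ignored
        return node, False
    if key < node.key:
        node.left, grew = insert(node.left, key)
        delta = +1
    else:
        node.right, grew = insert(node.right, key)
        delta = -1
    if not grew:
        return node, False
    node.bf += delta
    if node.bf == 0:                    # went into the shorter subtree
        return node, False
    if abs(node.bf) == 1:               # was zero: this subtree is now taller
        return node, True
    return rebalance(node), False       # rotation restores the old height

def check(node):
    """Return the subtree's height, asserting every stored balance factor."""
    if node is None:
        return -1
    hl, hr = check(node.left), check(node.right)
    assert node.bf == hl - hr and abs(node.bf) <= 1
    return 1 + max(hl, hr)

root = None
for key in [3, 2, 1, 4, 5, 6, 7]:
    root, _ = insert(root, key)
check(root)  # raises AssertionError if any balance factor is wrong
```

Because a rotation returns the subtree to its pre-insertion height, `insert` reports `False` after calling `rebalance`, so no ancestor's balance factor is touched and at most one rotation (single or double) occurs per insertion.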
Node Deletion

Deletion of a node from an AVL Tree proceeds in exactly the same manner as in an arbitrary binary search tree. (The task of node deletion can always be reduced to that of deleting a node that has at most one child.) As with insertion, additional steps must be taken to maintain balance factors and tree admissibility.

With respect to deletion, the recorded chain is the path from the root to the parent of the newly-deleted node. The portion of the tree that may be affected by a node deletion is the subtree whose root D is the last node on the recorded chain having a zero balance factor. (If all nodes on the recorded chain have a non-zero balance factor, D is taken to be the root of the tree.)

The nodes on the recorded chain are considered in order from the bottom (i.e., the parent of the deleted node) up to D. Referring to the currently considered node as A, there are three possibilities:

1. A has a balance factor of zero, which is to say that A is D. In this case, the subtree of A in which the deletion occurred has decreased in height by one; hence, A's balance factor should be set to either −1 or +1, respectively, according to whether the deletion occurred in the left subtree or the right subtree. Because the height of the subtree rooted at A (i.e., D) is unchanged, the balance factors of A's proper ancestors need not be modified.

2. The recorded chain leads into what had been A's taller subtree, the height of which decreased by one as a result of the deletion. Hence, A's balance factor should be set to zero. Because the height of the tree rooted at A has decreased by one, its parent should be the node to be considered next. (That is, A's parent will play the role of A during the next iteration.)

3. The recorded chain leads into A's shorter subtree, the height of which decreased by one as a result of the deletion. Hence, A's balance factor should be set to −2 or +2, respectively, according to whether the deletion occurred in its left subtree or right subtree. This makes the subtree rooted at A inadmissible.
There are three cases (each having a mirror image), two of which are identical to those that can be produced by an insertion (and therefore are remedied by the methods illustrated in Figures 1 and 2).⁴ The third case, shown in Figure 3, is the same as that shown in Figure 1, except that the balance factor of node B is initially zero instead of ±1.⁵

What is most interesting is that, in each of the first two cases, the performance of the appropriate rotation(s) results in a decrease in the height of the rebalanced subtree, whereas, in the third case, that subtree's height is unchanged. As a consequence, after doing the appropriate rotation to remedy the third case, nothing else needs to be done. On the other hand, in each of the first two cases, the parent node of the root of the rebalanced subtree must be considered next, as its balance factor needs adjustment. Indeed, it is possible, in the worst case, for the deletion of a node to result in a rotation (single or double) occurring at each node on the recorded chain, of which there are O(lg n).

[Figure 3: Case III: Rebalancing After a Deletion Using a Single Rotation]

⁴ Regarding Case II (Figure 2): In the context of a deletion there is the additional possibility for both of subtrees β and γ to have height h (and thus for node C to have a zero balance factor). In that case, both nodes A and B end up with a zero balance factor after the double rotation.

⁵ In the context of a deletion, node B (in all three cases) is either the child of A opposite to that on the recorded chain or else the sibling of the deleted node. That is, in Figures 1 and 3 (resp., Figure 2) the deleted node had been in subtree γ (resp., δ).

Run-time complexity

The major difference between insertion and deletion is that deletion can require up to O(lg n) rotations, whereas insertion requires at most one. Both operations entail an initial search (O(lg n) time), a physical insertion or deletion of a node (O(1) time), and a modification of the balance factors of at most all the nodes on the recorded chain (O(lg n) time). Because a rotation can be done in constant time (via a few pointer assignments), the time required by rotations is O(1) during an insertion and O(lg n) during a deletion. In summary, then, both insertion and deletion take O(lg n) time.

Original Paper

Interestingly, (the English translation of) the paper in which AVL Trees were first introduced [AVL62] said nothing about node deletion. Moreover, it erroneously omitted the rebalancing case illustrated here in Figure 1 and instead included the case illustrated in Figure 3, which cannot possibly occur during an insertion!

References

[AVL62] Adel'son-Vel'skii, G.M. and Y.M. Landis, An Algorithm for the Organization of Information, Soviet Math. Dokl. 3 (1962) (English translation), pp. 1259-1262.

[Knuth73] Knuth, Donald E., Art of Computer Programming, Volume 3: Sorting and Searching, Addison-Wesley, 1973.