Method for efficiently balancing binary search trees
The Day–Stout–Warren (DSW) algorithm is a method for efficiently balancing binary search trees – that is, decreasing their height to O(log n) nodes, where n is the total number of nodes. Unlike a self-balancing binary search tree, it does not rebalance incrementally during each operation, but periodically, so that its cost can be amortized over many operations. The algorithm was designed by Quentin F. Stout and Bette Warren in a 1986 Communications of the ACM (CACM) paper,[1] based on work done by Colin Day in 1976.[2]
The algorithm requires linear (O(n)) time and is in-place. The original algorithm by Day generates as compact a tree as possible: all levels of the tree are completely full except possibly the bottom-most. It operates in two phases. First, the tree is turned into a linked list by means of an in-order traversal, reusing the pointers in the (threaded) tree's nodes. A series of left-rotations forms the second phase.[3]
The Stout–Warren modification generates a complete binary tree, namely one in which the bottom-most level is filled strictly from left to right. This is a useful transformation to perform if it is known that no more inserts will be done. It does not require the tree to be threaded, nor does it require more than constant space to operate.[1] Like the original algorithm, Day–Stout–Warren operates in two phases, the first entirely new, the second a modification of Day's rotation phase.[1][3]
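The level structure described above can be checked with a little integer arithmetic. The following is a minimal sketch, not taken from the cited papers: `level_shape` is a hypothetical helper that, for a tree of n nodes, returns how many levels are completely full and how many nodes are left over for the bottom-most level.

```python
def level_shape(n):
    """Return (full_levels, bottom_count) for the most compact tree on n nodes.

    Illustrative helper; the name and return shape are assumptions.
    """
    # floor(log2(n + 1)), computed on integers to avoid floating-point error
    full_levels = (n + 1).bit_length() - 1
    # nodes that do not fit into the completely full levels
    bottom_count = n + 1 - 2 ** full_levels
    return full_levels, bottom_count


for n in (1, 7, 10, 1000):
    print(n, level_shape(n))
```

For n = 10, for instance, three levels are full (7 nodes) and 3 nodes remain for the bottom level. A bottom count of 0 means the last level is itself completely full.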
A 2002 article by Timothy J. Rolfe brought attention back to the DSW algorithm;[3] the naming is from the section title "6.7.1: The DSW Algorithm" in Adam Drozdek's textbook.[4] Rolfe identifies two settings in which the algorithm is advantageous: "in circumstances in which one generates an entire binary search tree at the beginning of processing, followed by item look-up access for the rest of processing" and "pedagogically within a course on data structures where one progresses from the binary search tree into self-adjusting trees, since it gives a first exposure to doing rotations within a binary search tree."
Pseudocode
The following is a presentation of the basic DSW algorithm in pseudocode, after the Stout–Warren paper.[1][note 1] It consists of a main routine with three subroutines. The main routine is given by:
1. Allocate a node, the "pseudo-root", and make the tree's actual root the right child of the pseudo-root.
2. Call tree-to-vine with the pseudo-root as its argument.
3. Call vine-to-tree on the pseudo-root and the size (number of elements) of the tree.[note 2]
4. Make the tree's actual root equal to the pseudo-root's right child.
routine tree-to-vine(root)
    // Convert tree to a "vine", i.e., a sorted linked list,
    // using the right pointers to point to the next node in the list
    tail ← root
    rest ← tail.right
    while rest ≠ nil
        if rest.left = nil
            tail ← rest
            rest ← rest.right
        else
            temp ← rest.left
            rest.left ← temp.right
            temp.right ← rest
            rest ← temp
            tail.right ← temp
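The tree-to-vine routine translates almost line for line into Python. The sketch below is illustrative only: the `Node` class, its field names, and the `vine_keys` helper are assumptions introduced for the example and do not appear in the Stout–Warren paper.

```python
class Node:
    """Minimal BST node; class and field names are illustrative."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def tree_to_vine(root):
    """Flatten the tree hanging off root.right into a sorted right-linked list.

    root is the pseudo-root; its own key is never read.
    """
    tail = root
    rest = tail.right
    while rest is not None:
        if rest.left is None:
            # rest is already in vine position: move down the spine
            tail = rest
            rest = rest.right
        else:
            # right rotation: lift rest's left child above rest
            temp = rest.left
            rest.left = temp.right
            temp.right = rest
            rest = temp
            tail.right = temp


def vine_keys(root):
    """Collect keys along the right spine, checking that no left links remain."""
    keys = []
    node = root.right
    while node is not None:
        assert node.left is None
        keys.append(node.key)
        node = node.right
    return keys


# Example: a small tree holding the keys 1..6
tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5)))
pseudo_root = Node(None, right=tree)
tree_to_vine(pseudo_root)
print(vine_keys(pseudo_root))  # prints [1, 2, 3, 4, 5, 6]
```

Each pass through the loop either advances along the spine or moves one node onto it, which is why the phase runs in linear time.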
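routine vine-to-tree(root, size)
    leaves ← size + 1 − 2^⌊log2(size + 1)⌋
    compress(root, leaves)
    size ← size − leaves
    while size > 1
        compress(root, ⌊size / 2⌋)
        size ← ⌊size / 2⌋
routine compress(root, count)
    // Perform count left rotations along the right spine
    scanner ← root
    for i ← 1 to count
        child ← scanner.right
        scanner.right ← child.right
        scanner ← scanner.right
        child.right ← scanner.left
        scanner.left ← child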
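The second phase can be sketched in Python as follows. `compress` mirrors the pseudocode above; `vine_to_tree` follows the basic Stout–Warren scheme of one initial pass that sets aside the bottom-level nodes, followed by passes of repeatedly halved length. The `Node` class and the helpers `make_vine`, `height`, and `in_order` are assumptions added so the example can run on its own.

```python
class Node:
    """Minimal BST node; class and field names are illustrative."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def compress(root, count):
    """Perform `count` left rotations down the right spine below root."""
    scanner = root
    for _ in range(count):
        child = scanner.right
        scanner.right = child.right
        scanner = scanner.right
        child.right = scanner.left
        scanner.left = child


def vine_to_tree(root, size):
    """Fold a vine of `size` nodes hanging off root.right into a balanced tree."""
    # nodes destined for the bottom-most level: size + 1 - 2^floor(log2(size + 1))
    leaves = size + 1 - 2 ** ((size + 1).bit_length() - 1)
    compress(root, leaves)
    size -= leaves
    while size > 1:
        compress(root, size // 2)
        size //= 2


def make_vine(keys):
    """Hypothetical helper: pseudo-root whose right spine holds `keys` in order."""
    pseudo_root = Node(None)
    tail = pseudo_root
    for key in keys:
        tail.right = Node(key)
        tail = tail.right
    return pseudo_root


def height(node):
    """Number of levels in the subtree rooted at node (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def in_order(node):
    """Keys of the subtree in sorted (in-order) sequence."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


pseudo_root = make_vine(range(1, 11))  # vine holding the keys 1..10
vine_to_tree(pseudo_root, 10)
tree = pseudo_root.right
print(tree.key, height(tree))  # prints 7 4
```

In this run the ten-node vine becomes a four-level tree rooted at 7, with the three bottom-level nodes (1, 3 and 5) at the left, while the in-order sequence of keys is unchanged.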
Notes
1. This version does not produce perfectly balanced nodes; Stout and Warren present a modification that does, in which the first call to compress is replaced by a different subroutine.
2. In the original presentation, tree-to-vine computed the tree's size as it went. For the sake of brevity, we assume this number to be known in advance.