# Functional Sets, Part 3: Balancing

Elm 0.18 · Updated February 15, 2018 · 8 Minute Read · ∞ Permalink

Last time, we covered how rotation could keep our trees balanced. But how do we automatically apply our rotations?

## Rotation Recap

When we rotate a tree, we move the subtrees and heads around so that the tree has a different root, but the same order. It looks like this:

Last time we created two functions, `rotateLeft` and `rotateRight`, to do this rotation for us.

## AVL Trees

So, OK, we can rotate our trees. That’s a nice first step! But we don’t want to have to rotate left or right manually, we just want `insert` to take care of it for us. So how do we get from here to there?

Fortunately, tons of smart people have thought about this problem, and we have tons of options! Red-black trees, scapegoat trees, splay trees, treaps… but we’re going to use AVL trees (which take their name from their inventors: Adelson-Velsky and Landis.)

AVL trees balance by making sure the height of each subtree is about the same. In our example from before, the balanced tree had a height of 2, while the unbalanced tree had a height of 5. A balanced tree of 1 through 5 has a height of 2.

Take note that we’re using the number of levels of nodes to define the height, not how many nodes there are (this is called the weight.) This means that:

• An empty set has a height of 0.
• A singleton set has a height of 1.
• Anything else is the height of it’s tallest subtree, plus 1.

When we `insert`, we’ll look at the new heights of the trees after insertion. If the difference between them is greater than 1 (so if the left is taller than the right, or the reverse), we’ll rotate!

We want to keep track of these heights without recomputing them every time, so we’ll change our implementation of `Set` to look like this:

``````type Set comparable
= Tree Int comparable (Set comparable) (Set comparable)
| Empty
``````

The `Int` as the new first field of `Tree` will hold the tree’s height. We’ll need to refer to this field often, let’s write a quick function to get the height of any tree:

``````height : Set comparable -> Int
height set =
case set of
Empty ->
0

Tree height _ _ _ ->
height
``````

Now we change our constructor functions. `empty` can stay the same, but singleton will change to add a height:

``````singleton : comparable -> Set comparable
singleton item =
Tree 1 item empty empty
``````

…and we’ll need an entirely new constructor, `tree`:

``````tree : comparable -> Set comparable -> Set comparable -> Set comparable
Tree
(max (height left) (height right) + 1)
left
right
``````

We’ll use this in `insert` in place of direct calls to `Tree`:

``````insert : comparable -> Set comparable -> Set comparable
insert item set =
case set of
Empty ->
singleton item

Tree _ head left right ->
tree head (insert item left) right
else if item > head then
tree head left (insert item right)
else
set
``````

And finally, we need to change `rotateLeft` and `rotateRight`:

``````rotateLeft : Set comparable -> Set comparable
rotateLeft set =
case set of

_ ->
set

rotateRight : Set comparable -> Set comparable
rotateRight set =
case set of

_ ->
set
``````

That’s all we need to do to set the stage! Now that we track of balance, we can rotate during insertion to keep the tree balanced. This takes less work that you might think!

## Our Trigger: Height Difference

Remember that we measure the balance of a tree as the height of the right subtree minus the height of the left subtree. A tree is a valid AVL tree if the height is -1, 0, or 1. So, to make our trees valid AVL trees, we’ll rotate if the balance is outside that range.

We can get the height difference for a set with a new function:

``````diff : Set comparable -> Int
diff set =
case set of
Empty ->
0

Tree _ _ left right ->
height right - height left
``````

This means that if the tree leans to the right the value will be positive, and to the left the value will be negative.

### The Simplest Case: 2 or More

To balance our tree, we’ll first need to handle if the height difference is greater than one.

Say we insert the letters A through C, in order.

This is not a valid AVL tree, since the balance is 2 (a height of two on the right minus 0 on the left.) Since the tree is leaning right we’ll rotate left, so that B is at the root.

Now the height of the left and the right subtrees is the same, so the difference is 0. This will make our tree valid again.

We’ll implement our initial balancing function like this:

``````balance : Set comparable -> Set comparable
balance set =
case set of
Empty ->
set

Tree _ head left right ->
if diff set < -1 then
rotateRight set
else if diff set > 1 then
rotateLeft set
else
set
``````

An empty set is already balanced, so we return it immediately. Same for a set whose difference falls between the valid range of -1 and 1. Otherwise, we rotate!

We’ll also change `insert` to call `balance` after inserting:

``````insert : comparable -> Set comparable -> Set comparable
insert item set =
case set of
Empty ->
singleton item

Tree _ head left right ->
tree head (insert item left) right |> balance
else if item > head then
tree head left (insert item right) |> balance
else
set
``````

Notice that we’re only balancing the tree after insertion. This will make sure that each level of our tree is balanced from the bottom up, and only if any levels need it.

### Left- and Right-Leaning Sets

This seems great so far, but there’s one more tree shape we need to consider. What if we inserted A, C, then B?

The difference would be 2, but if we rotated right it would swap to -2!

In this case, we need to rotate the tree at C right, then the tree at A left. The right rotation at C will create our unbalanced A-B-C set from above, then the left rotation at A will move B to the root, balancing the tree. If the tree were leaning the other way, we’d just reverse the operation.

The code looks like this:

``````balance : Set comparable -> Set comparable
balance set =
case set of
Empty ->
set

Tree _ head left right ->
if diff set == -2 && diff left == 1 then
-- left leaning tree with right-leaning left subtree.
tree head (rotateLeft left) right |> rotateRight
else if diff set < -1 then
rotateRight set
else if diff set == 2 && diff right == -1 then
-- right leaning tree with left-leaning right subtree.
tree head left (rotateRight right) |> rotateLeft
else if diff set > 1 then
rotateLeft set
else
set
``````

Whew, that function has quite a few conditionals! Generally, we’re examining the difference in each tree, and the difference in the relevant subtrees. If we find that the difference is 2 (or -2) and the difference in the subtree is -1 (or 1) we’ve hit our problem. We send the subtree to `rotateLeft` or `rotateRight` before rotating the whole tree in question.

It looks sort of messy, but these 9 lines of code will maintain a valid AVL tree (and fast insert/lookup operations) in every situation. We’re done with balancing and `insert`!

## Make Your Own Trees, Again

Now you can try this out for yourself. This is the same visualization from last time, but with balancing turned on. Insert values to see how the balancing code reacts and keeps everything tuned up.

Things to play around with:

• Insert three values in order.
• Insert three values so that you would get a leaning tree (like A C B.) Check out how the tree balances itself.
• Create some trees with the same values in different insertion order. Notice how the composition of the tree will change with different insertion orders.