8.1 Binary trees

A binary tree is made up of a finite set of elements called nodes. This set either is empty or consists of a node called the root together with two binary trees, called the left and right subtrees, which are disjoint from each other and from the root. (Disjoint means that they have no nodes in common.) The roots of these subtrees are children of the root. There is an edge from a node to each of its children, and a node is said to be the parent of its children.

If $n_1, n_2, \ldots, n_k$ is a sequence of nodes in the tree such that $n_i$ is the parent of $n_{i+1}$ for $1 \leq i < k$, then this sequence is called a path from $n_1$ to $n_k$. The length of the path is $k-1$. If there is a path from node $R$ to node $M$, then $R$ is an ancestor of $M$, and $M$ is a descendant of $R$. Thus, all nodes in the tree are descendants of the root of the tree, while the root is the ancestor of all nodes. The depth of a node $M$ in the tree is the length of the path from the root of the tree to $M$. The height of a tree is the depth of the deepest node in the tree. All nodes of depth $d$ are at level $d$ in the tree. The root is the only node at level 0, and its depth is 0. A leaf node is any node that has two empty children. An internal node is any node that has at least one non-empty child.
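The definitions above translate fairly directly into code. The following is a minimal sketch, not a definitive implementation: the `Node` class and the helper names `is_leaf`, `is_internal`, `height`, and `depth` are hypothetical, and `None` is assumed to represent the empty tree. Returning -1 for the height of the empty tree is a convention chosen here so that a single-node tree has height 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type; None stands for the empty tree."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_leaf(node: Node) -> bool:
    # A leaf has two empty children.
    return node.left is None and node.right is None


def is_internal(node: Node) -> bool:
    # An internal node has at least one non-empty child.
    return not is_leaf(node)


def height(root: Optional[Node]) -> int:
    # Depth of the deepest node. The value -1 for the empty tree is an
    # assumption made here so that a single node has height 0.
    if root is None:
        return -1
    return 1 + max(height(root.left), height(root.right))


def depth(root: Optional[Node], target: Node) -> Optional[int]:
    # Length of the path from root to target (compared by identity),
    # or None if target is not in the tree.
    if root is None:
        return None
    if root is target:
        return 0
    for child in (root.left, root.right):
        d = depth(child, target)
        if d is not None:
            return d + 1
    return None


# A three-node chain: A has left child B, and B has right child D.
d_node = Node("D")
b_node = Node("B", right=d_node)
a_node = Node("A", left=b_node)

print(depth(a_node, d_node))  # 2
print(height(a_node))         # 2
print(is_leaf(d_node))        # True
print(is_internal(b_node))    # True
```

Note that `depth` counts edges, which matches the definition of path length as $k-1$ for a path of $k$ nodes.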

Figure 8.1: An example binary tree.

Figure 8.1 above illustrates the various terms used to identify parts of a binary tree. Node $A$ is the root, and nodes $B$ and $C$ are $A$’s children. Nodes $B$ and $D$ together form a subtree. Node $B$ has two children: its left child is the empty tree and its right child is $D$. Nodes $A$, $C$, and $E$ are ancestors of $G$. Nodes $D$, $E$, and $F$ make up level 2 of the tree; node $A$ is at level 0. The edges from $A$ to $C$ to $E$ to $G$ form a path of length 3. Nodes $D$, $G$, $H$, and $I$ are leaves. Nodes $A$, $B$, $C$, $E$, and $F$ are internal nodes. The depth of $I$ is 3. The height of this tree is 3.

Figure 8.2 below illustrates an important point regarding the structure of binary trees. Because all binary tree nodes have two children (one or both of which might be empty), the two binary trees (a) and (b) in the figure are not the same.

Figure 8.2: Two different binary trees: (a) the root has a non-empty left child; (b) the root has a non-empty right child; (c) the same tree as (a), with the missing right child made explicit; and (d) the same tree as (b), with the missing left child made explicit.
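The distinction shown in Figure 8.2 can also be checked in code. The sketch below assumes a hypothetical `Node` dataclass in which `None` represents an empty child. The equality operator generated by `@dataclass` compares the `value`, `left`, and `right` fields in order, so a tree with only a left child is not equal to a tree with only a right child, even though both contain the same two values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type; None stands for an empty child."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Shape (a): the root has a non-empty left child only.
tree_a = Node("A", left=Node("B"))

# Shape (b): the root has a non-empty right child only.
tree_b = Node("A", right=Node("B"))

# Same values, different positions, so the trees differ.
print(tree_a == tree_b)                       # False

# Two separately built trees of shape (a) are equal.
print(tree_a == Node("A", left=Node("B")))    # True
```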

Two restricted forms of binary tree are sufficiently important to warrant special names. Each node in a full binary tree is either (1) an internal node with exactly two non-empty children or (2) a leaf. A complete binary tree has a restricted shape obtained by starting at the root and filling the tree by levels from left to right. In the complete binary tree of height $d$, all levels except possibly level $d$ are completely full. The bottom level has its nodes filled in from the left side.
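Both properties can be tested mechanically, which may help in keeping the two definitions apart. The following is a sketch under stated assumptions: `Node`, `is_full`, and `is_complete` are hypothetical names, `None` represents the empty tree, and the empty tree is treated as both full and complete. The completeness check uses one common approach, a level-order traversal that fails if a non-empty node appears after an empty position. The two example trees are illustrative shapes chosen here and are not necessarily the ones drawn in the figure.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type; None stands for the empty tree."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_full(root: Optional[Node]) -> bool:
    # Every node must have either zero or two non-empty children.
    if root is None:
        return True
    if (root.left is None) != (root.right is None):
        return False  # exactly one non-empty child
    return is_full(root.left) and is_full(root.right)


def is_complete(root: Optional[Node]) -> bool:
    # Visit positions level by level, left to right. Once an empty
    # position has been seen, no non-empty node may follow it.
    queue = deque([root])
    seen_empty = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_empty = True
        elif seen_empty:
            return False
        else:
            queue.append(node.left)
            queue.append(node.right)
    return True


# Full but not complete: the bottom level is not filled from the left.
full_only = Node("A", Node("B"), Node("C", Node("D"), Node("E")))

# Complete but not full: node B has exactly one child.
complete_only = Node("A", Node("B", Node("D")), Node("C"))

print(is_full(full_only), is_complete(full_only))          # True False
print(is_full(complete_only), is_complete(complete_only))  # False True
```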

Figure 8.3 below illustrates the differences between full and complete binary trees. There is no particular relationship between these two tree shapes; that is, the tree (a) is full but not complete while the tree (b) is complete but not full. The binary heap (Section 9.2) is an example of a complete binary tree. The Huffman coding tree (Section 9.4) is an example of a full binary tree.

Figure 8.3: Examples of full and complete binary trees: (a) is full but not complete; (b) is complete but not full.

Note: While these definitions for full and complete binary tree are the ones most commonly used, they are not universal. Because the common meanings of the words “full” and “complete” are quite similar, there is little that you can do to distinguish between them other than to memorise the definitions. Here is a memory aid that you might find useful: “Complete” is a wider word than “full”, and complete binary trees tend to be wider than full binary trees because each level of a complete binary tree is as wide as possible.

8.1.1 Binary trees are recursive data structures

A recursive data structure is a data structure that is partially composed of smaller or simpler instances of the same data structure. For example, linked lists and binary trees can be viewed as recursive data structures. A list is a recursive data structure because a list can be defined as either (1) an empty list or (2) a node followed by a list. A binary tree is typically defined as (1) an empty tree or (2) a node pointing to two binary trees, one its left child and the other one its right child.

The recursive relationships used to define a structure provide a natural model for any recursive algorithm on the structure.
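To illustrate how an algorithm can follow the shape of a recursive definition, here is a minimal sketch that counts the nodes in a list and in a binary tree. The class names `ListNode` and `TreeNode` and the functions `list_length` and `tree_size` are hypothetical, and `None` is assumed to represent the empty list and the empty tree. Each function has one case per clause of the corresponding definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListNode:
    """A list is either empty (None) or a node followed by a list."""
    value: int
    rest: Optional["ListNode"] = None


@dataclass
class TreeNode:
    """A binary tree is either empty (None) or a node with two subtrees."""
    value: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def list_length(lst: Optional[ListNode]) -> int:
    if lst is None:                     # case (1): the empty list
        return 0
    return 1 + list_length(lst.rest)    # case (2): a node, then a list


def tree_size(tree: Optional[TreeNode]) -> int:
    if tree is None:                    # case (1): the empty tree
        return 0
    # case (2): a node plus its left and right subtrees
    return 1 + tree_size(tree.left) + tree_size(tree.right)


numbers = ListNode(1, ListNode(2, ListNode(3)))
tree = TreeNode(2, TreeNode(1), TreeNode(3))

print(list_length(numbers))  # 3
print(tree_size(tree))       # 3
```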

One way to think about recursion is to see it as delegation. Suppose you want to compute the sum of the values stored in a binary tree. Being lazy, you don’t want to do most of the work yourself, so you ask two friends to help you.

  • The first friend will take the left subtree to sum it.
  • The second friend will take the right subtree to sum it.
  • The only thing you have to do is add the two sums you got from your friends to the value stored in the root.

You don’t need to think about how your friends (the recursive calls) calculated their sums, as long as you accept that they are correct.
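The delegation idea can be sketched as a short recursive function. This is an illustration under assumptions, not a definitive implementation: `Node` and `tree_sum` are hypothetical names, and `None` represents the empty tree, whose sum is taken to be 0. Each recursive call plays the role of one friend, and the only work done at the current node is adding its own value to the two results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node type; None stands for the empty tree."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def tree_sum(root: Optional[Node]) -> int:
    # Base case: an empty tree contributes nothing.
    if root is None:
        return 0
    left_total = tree_sum(root.left)     # the first friend's answer
    right_total = tree_sum(root.right)   # the second friend's answer
    # Your own share of the work: combine the answers with the root value.
    return root.value + left_total + right_total


tree = Node(5, Node(3, Node(1)), Node(8))
print(tree_sum(tree))  # 17
```

The base case matters: without it, the friends would have nobody to stop delegating to, and the recursion would fail on an empty child.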

Here is a visual explanation of the same idea.