Technical Articles by Christopher Punt

Trees

What is a Tree?

A Tree is a type of graph. In order for a graph to be considered a Tree it needs to meet these constraints:

  1. There is a single root node (where a root node has no parents, but can have any number of children).
  2. The graph should be acyclic. This means when navigating the tree it should never lead to a repeated cycle of nodes.

If these conditions are met then the graph would be considered a Tree. A Tree has a clear hierarchical structure with the top level of the Tree being the root, the second level being the root’s children, and so on. There are two broad ways to traverse a Tree: depth first search (DFS) and breadth first search (BFS).

Tree Terminology

Here’s a list of common Tree terminology:

  • Root- A root is the node at the top of the Tree. This node will not have a parent but may have multiple children
  • Branch- A branch is a node with a parent node and children nodes
  • Leaf- A leaf is a node which has no children
  • Edge- An edge connects two nodes
  • Parent- A parent is a node with 1 or more child nodes
  • Child- A child is a node with a parent node
  • Height- The height of a Tree is the longest path
  • Depth- The depth of a Tree is the length between the root and a specific node

Why are Trees Important?

Trees are used throughout the programming world, from compilers to databases, and even in HTML. Trees can be the ideal data structure when the data naturally has a hierarchical formation like a file system on a computer. Many common Trees have an average of O(log(n)) runtime for access, search, insertion, and deletion. This Big O cheat sheet displays how Trees compare against other data structures performance times.

Types of Trees

There are many different types of Trees which each have their own use case. Below is a list of 6 popular Trees:

  • General Tree
    • Properties
      • Follows properties of a generic Tree
      • There is no limit on how many children one node can have
    • Use Case
      • Folder Hierarchy
  • N-Ary Tree
    • Properties
      • Follows properties of a generic Tree
      • Can have at most n children per node
    • Use Case
      • XML documents stored in user a DOM Tree
  • Binary Tree
    • Properties
      • Follows properties of a generic Tree
      • Can have at most 2 children per node
    • Use Case
      • Database indices
  • Binary Search Tree
    • Properties
      • Follows properties of a Binary Tree
      • Ordered by size, leftNode <= currentNode < rightNode
    • Use Case
      • Storing sorted data
  • Binary Heaps (Min & Max)
    • Properties
      • Follows properties of a Binary Tree
      • Min binary heap has the smallest number as the root and ordered in ascending order
      • Max heap elements have the largest number as the root and ordered in descending order
      • A min / max heap is a complete binary Tree
      • Self balancing
    • Use Case
      • Implementing priority queues
  • Tries
    • Properties
      • Variant of N-Ary Tree
      • Each path down the Tree may represent a word
    • Use Case
      • Word autocompletion

As stated at the start of this section, this is only a small list of some of the most popular Trees. There are still many more Trees which may be more appropriate in different scenarios.

Balanced Binary Search Trees

A balanced Binary Search Tree (BST) does not mean a BST which is perfectly balanced. It just means the BST is balanced enough to perform O(log n) insert and find times. Below I will cover three types of balanced Trees:

  • Complete BST- This is a Tree where every level is filled, except for potentially the last level. The last level needs to be filled left to right to be a complete Tree.
  • Full BST- This is a Tree where every node has 0 or 2 children. So no node can have a single child node.
  • Perfect BST- This is a Tree which is full, complete and all leaf nodes will be at the same level. Perfect binary Trees are rare in real life and in interviews.

Binary Search Tree

In this section I will be implementing a simple BST. As covered earlier, the properties of a BST are that they follow the properties of a Binary Tree and right nodes will be higher and left nodes will be lower.

The node structure for my BST implementation will consist of three properties: data, left and right. The data property will store the data, where in this case, the data will be an integer. The left property will either be None or a node containing an integer less than or equal to its parent. The right property will be None or a node containing an integer greater than its parent. Example of Node class:

class Node:
    def __init__ (self, data):
        self.data = data
        self.left = None
        self.right = None

The BST itself will contain three methods. The first is __init__() which is invoked on initialization, this method sets the root of the Tree equal to None. The second method is add(data), this takes an integer as an argument and the method checks to see if the root is None. If this is true, then the root is set to the new node. If it is false, then addNode(node, data) is invoked.

The final method, addNode(node, data), takes two arguments: the first is the current node and the second is the data to be inserted. This method is called recursively to traverse the Tree. On each traversal, there are a couple of checks made. The first if statement checks if the current value should be added to the left or right side of the tree. Once this is decided another check is made to see if this spot is empty. If it is, the node is added. If it is not empty then the function is called recursively until an empty slot is found.

If a BST is self balancing then the average runtime will be O(log n). Since this tree is not self balancing the average runtime will be O(n). Example of BST class:

class BST:
    def __init__ (self):
        self.root = None

    def add (self, data):
        if self.root is None:
            self.root = Node(data)
        else:
            self.addNode(self.root, data)

    def addNode (self, node, data):
        if data > node.data:
            if node.right is None:
                node.right = Node(data)
            else:
                self.addNode(node.right, data)
        else:
            if node.left is None:
                node.left = Node(data)
            else:
                self.addNode(node.left, data)

        return True

Traversing Binary Trees

In this section, I will cover four methods to traverse a Binary Tree. Three methods will use a recursive DFS traversal, whilst the other method will use an iterative BFS traversal. It’s important to understand different ways to traverse a Tree so it’s easy to apply the best traversal method.

In-order traversal visits the left branch first, then the current node and finally the right branch. When in-order traversal is used on a BST the output will be the values in ascending order. Example:

def inOrder (node):
		if (node is not None):
				inOrder(node.left)
				print(node.data)
				inOrder(node.right)

Pre-order traversal visits the current node before its children. In pre-order traversal the root will always be the first node visited. Example:

def preOrder (node):
		if (node is not None):
				print(node.data)
				preOrder(node.left)
				preOrder(node.right)

Post-order traversal visits the current node after its child nodes. In post-order traversal the root will always be the last node visited. Example:

def postOrder (node):
		if (node is not None):
				postOrder(node.left)
				postOrder(node.right)
				print(node.data)

Breadth first search traverses layer-by-layer of the tree. This function uses Python’s built in Queue data structure to keep track of nodes to search. Example:

def bfs (root):
		queue = Queue()
		queue.put(root)
		node = queue.get()

		while node is not None:
				print(node.data)
				if node.left: queue.put(node.left)
						if node.right: queue.put(node.right)
								node = None if queue.empty() else queue.get()

Below is an example of each traversal method in action with their outputs:

bst = BST()
bst.add(20)
bst.add(14)
bst.add(24)
bst.add(15)
bst.add(40)
bst.add(7)

inOrder(bst.root) # 7, 14, 15, 20, 24, 40
preOrder(bst.root) # 20, 14, 7, 15, 24, 40
postOrder(bst.root) # 7, 15, 14, 40, 24, 20
bfs(bst.root) # 20, 14, 24, 7, 15, 40

Summary

In this article, I covered the very basics of Trees. Trees are far too big of a topic to fully cover in one article. But if you are interested in putting your Tree knowledge to the test then hackerrank and leetcode have some good questions to do so. If you want to view the source code for the examples in this article click here.