2025-05-21 21:59:16 +02:00

Tree

WIP

Tree is an abstract mathematical structure, adopted and frequently used as a data type and data structure in programming, which in simplified terms consists of nodes that form a loopless graph resembling an upside-down tree when drawn on paper. Slightly more precisely, a tree can be defined as a set of nodes in which each node is assigned exactly one other node as its parent, except for the root node, which has no parent, in such a way that there are no cycles (i.e. between any two nodes there always exists exactly one path). The definitions may vary slightly, for example in mathematics a tree is defined as an undirected graph whereas in computer science it may be seen as directed (because parents "point" to their children), but the same idea generally underlies all of them. In the context of programming it's important to note that a tree is a hierarchical structure, i.e. one consisting of "levels": the first level is the root node, the second its children, the third their children etc. A close to real life example of a tree might be the taxonomy tree used in biology to classify living organisms by dividing them into big groups and subsequent subgroups such as kingdom, family and species. As for their significance, trees are among the most essential structures in both programming and mathematics, and they belong more or less to intermediate programming. The importance of trees can hardly be overstated; they see frequent use for example as an indexing structure that greatly accelerates searching databases. A set of several disconnected trees is called a forest.
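The parent-based definition above can be sketched in code. The following is only a minimal illustration, assuming the tree is stored as a dictionary that maps each node name to the name of its parent (with the root mapped to `None`); the function name `is_tree` and the sample data are made up for this example.

```python
def is_tree(parent):
    """Check that a {node: parent} mapping describes a single tree."""
    # Exactly one node may lack a parent: the root.
    roots = [node for node, par in parent.items() if par is None]
    if len(roots) != 1:
        return False
    # Walking up from any node must reach the root without revisiting a node.
    for start in parent:
        seen = set()
        node = start
        while node is not None:
            if node in seen or node not in parent:
                return False  # a cycle, or a parent that is not a known node
            seen.add(node)
            node = parent[node]
    return True

valid = {"root": None, "a": "root", "b": "root", "c": "a"}
cyclic = {"root": None, "a": "b", "b": "a"}  # a and b form a loop
forest = {"r1": None, "r2": None, "a": "r1"}  # two roots: a forest, not one tree
```

Here `is_tree(valid)` is true while the other two mappings are rejected: the first because of a cycle, the second because it has two roots.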

Tree is also a kind of very big plant that has trunk and branches and this kind of stuff. It is no coincidence the programming structure is also called a tree -- it's so because the structure is similar to the physical, real life tree and we conveniently borrow more terms with real life analogies (root, branches, leaves, pruning, forest, ...).

It's also possible to give a beautiful, recursive definition of a tree: a tree is a node N0 that has some number (possibly zero) of children, each of which is itself the root of a tree, such that no two of these subtrees share any node and none of them contains N0. In fact recursion is something inherently associated with trees: algorithms for traversing trees, for instance, are typically recursive in nature.
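The recursive definition maps quite directly onto code. Below is a small sketch, not any canonical implementation: the class name `Node` and the functions `preorder` and `postorder` are names chosen just for this illustration, and a node is assumed to simply hold a value plus a list of child subtrees.

```python
class Node:
    """A tree node: a value plus a list of child subtrees (possibly empty)."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []

def preorder(node):
    """Visit the node first, then recursively each of its subtrees."""
    result = [node.value]
    for child in node.children:
        result.extend(preorder(child))
    return result

def postorder(node):
    """Recursively visit all subtrees first, then the node itself."""
    result = []
    for child in node.children:
        result.extend(postorder(child))
    result.append(node.value)
    return result

example = Node("a", [Node("b", [Node("d"), Node("e")]), Node("c")])
print(preorder(example))   # ['a', 'b', 'd', 'e', 'c']
print(postorder(example))  # ['d', 'e', 'b', 'c', 'a']
```

Note how each traversal function calls itself on the children, mirroring the way the definition describes a tree in terms of smaller trees.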

       666
       / \
      /   \
     96   99
     /    /\
    /    /  \
   69   66  71
   /\       /\
  /  \     /  \
 6    9   7    1

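Example of a binary tree with 4 levels (i.e. of height 3 if height is counted as the number of edges on the longest path from the root to a leaf). It's also a heap (a max-heap, to be exact) because each parent is greater in value than any of its children. It might be serialized as: (((6)69(9))96())666((66)99((7)71(1))).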
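As a sketch, the example tree can be built and checked in code. The serialization rule used here is inferred from the string above (a leaf is written as its bare value, any other node as `(left)value(right)` with an empty pair of parentheses for a missing child); the names `BinNode`, `serialize` and `is_max_heap` are assumptions of this illustration.

```python
class BinNode:
    """Binary tree node with optional left and right children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def serialize(node):
    """Write a leaf as its value, other nodes as (left)value(right)."""
    if node is None:
        return ""
    if node.left is None and node.right is None:
        return str(node.value)
    return "(" + serialize(node.left) + ")" + str(node.value) + "(" + serialize(node.right) + ")"

def is_max_heap(node):
    """True if every parent is greater in value than each of its children."""
    if node is None:
        return True
    for child in (node.left, node.right):
        if child is not None and child.value >= node.value:
            return False
    return is_max_heap(node.left) and is_max_heap(node.right)

tree = BinNode(666,
               BinNode(96, BinNode(69, BinNode(6), BinNode(9))),
               BinNode(99, BinNode(66), BinNode(71, BinNode(7), BinNode(1))))

print(serialize(tree))    # (((6)69(9))96())666((66)99((7)71(1)))
print(is_max_heap(tree))  # True
```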

Terminology: the first, topmost node without any parent is called the root node. Nodes that have no children are called leaf nodes; nodes that are neither a root nor a leaf are usually called internal nodes. We may also encounter terms such as subtrees and branches. Relationships between nodes are described by the same nouns used for family relationships, i.e.: parent node, child node, sibling node, ancestor node, descendant node etc., although some relationships are NOT in common use, e.g. "grandfather node", "cousin node" or "uncle node" (:D). Then we name properties such as the node depth (length of the path from the root to the node, usually counted in edges), tree height (maximum of all leaves' depths), tree size (total node count), tree breadth (leaf count) etc.
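The properties just named can each be computed with a short recursive function. This is only a sketch under the assumption that depth is counted in edges (so the root has depth 0); the class `Node` and the function names are chosen for this example, and the tree built is the ten-node one drawn above.

```python
class Node:
    """A tree node holding a value and a list of children."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children if children is not None else []

def size(node):
    """Total node count of the tree rooted at node."""
    return 1 + sum(size(child) for child in node.children)

def breadth(node):
    """Leaf count of the tree rooted at node."""
    if not node.children:
        return 1
    return sum(breadth(child) for child in node.children)

def height(node):
    """Maximum leaf depth, counted in edges (a lone root has height 0)."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)

def depth(root, value, current=0):
    """Depth of the first node found holding value, or None if absent."""
    if root.value == value:
        return current
    for child in root.children:
        found = depth(child, value, current + 1)
        if found is not None:
            return found
    return None

tree = Node(666, [
    Node(96, [Node(69, [Node(6), Node(9)])]),
    Node(99, [Node(66), Node(71, [Node(7), Node(1)])]),
])

print(size(tree), breadth(tree), height(tree), depth(tree, 71))  # 10 5 3 2
```

With edges counted this way the drawn tree, which has four levels, comes out with height 3; counting nodes on the path instead would give 4.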

We classify trees by various properties they may have, for example their height, "density", purpose ("decision tree", ...), constraints they satisfy ("heap", ...) or attributes such as being "balanced". Arguably the most important kinds of trees to introduce are N-ary trees, in which any node is allowed to have no more than N children. N-ary trees, and especially binary trees (N = 2), are frequently encountered in programming because (for simplicity and performance) nodes in computer memory often have a fixed number of preallocated pointers to their child nodes, and this imposes a limit on the maximum number of children. Knowing that a tree is N-ary has additional advantages too, for instance it's possible to easily compute the maximum size a tree of given height will require in memory and so on.
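To illustrate both points, here is a rough sketch of a node with a fixed number of child slots and of the maximum-size computation. It assumes height is counted in edges, so a full N-ary tree has N^0 + N^1 + ... + N^height nodes; the names `NaryNode` and `max_nodes` are made up for this example.

```python
class NaryNode:
    """Node with exactly n preallocated child slots (None means empty)."""
    def __init__(self, value, n):
        self.value = value
        self.children = [None] * n  # fixed-size list, never grown

    def set_child(self, index, child):
        # Raises IndexError if index is past the last slot.
        self.children[index] = child

def max_nodes(n, height):
    """Maximum node count of an n-ary tree of given height (in edges)."""
    # Level i can hold at most n**i nodes; sum over levels 0..height.
    return sum(n ** level for level in range(height + 1))

root = NaryNode("root", 2)
root.set_child(0, NaryNode("left", 2))

print(max_nodes(2, 3))  # 15
print(max_nodes(3, 2))  # 13
```

Multiplying the result by the size of one node then gives an upper bound on the memory such a tree can need.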

TODO: more more more

See Also