Miloslav Ciz 2025-05-25 18:16:38 +02:00
parent 35c0f438a4
commit 23f4bd88fc
20 changed files with 2028 additions and 1990 deletions

tree.md

@@ -27,7 +27,22 @@ Insofar as programming goes, the key characteristic of trees is their **hierarch
**Terminology**: the first, topmost node without any parent is called the **root node**. Nodes that have no children are called **leaf nodes**; nodes that are neither a root nor a leaf are usually called **internal nodes**. We may also encounter terms such as **subtrees** and **branches**. Relationships between nodes are described by the same nouns used for family relationships, i.e.: **parent node**, **child node**, **sibling node**, **ancestor node**, **descendant node** etc., although some relationships are NOT in common use, e.g. "grandfather node", "cousin node" or "uncle node" (:D). Then we name properties such as the **node depth** (length of the path from the root to the node), **tree height** (maximum of all leaves' depths), **tree size** (total node count), **tree breadth** (leaf count) etc.
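To make the properties above more concrete, here is a minimal [C](c.md) sketch that computes tree size, breadth and height recursively. The `Node` type, the `MAX_CHILDREN` limit and all function names are made up for this example; it's just one possible way to do it, not a canonical implementation.

```c
#define MAX_CHILDREN 3 // hypothetical limit, chosen just for this sketch

typedef struct Node
{
  int value;
  struct Node *children[MAX_CHILDREN]; // unused slots hold 0 (null)
} Node;

int isLeaf(const Node *n) // leaf = node with no children
{
  for (int i = 0; i < MAX_CHILDREN; ++i)
    if (n->children[i])
      return 0;

  return 1;
}

int treeSize(const Node *n) // total node count of the (sub)tree
{
  int result = 1; // count this node

  for (int i = 0; i < MAX_CHILDREN; ++i)
    if (n->children[i])
      result += treeSize(n->children[i]);

  return result;
}

int treeBreadth(const Node *n) // leaf count of the (sub)tree
{
  if (isLeaf(n))
    return 1;

  int result = 0;

  for (int i = 0; i < MAX_CHILDREN; ++i)
    if (n->children[i])
      result += treeBreadth(n->children[i]);

  return result;
}

int treeHeight(const Node *n) // maximum leaf depth, counted in edges
{
  int result = 0; // a lone node has height 0

  for (int i = 0; i < MAX_CHILDREN; ++i)
    if (n->children[i])
    {
      int h = treeHeight(n->children[i]) + 1;

      if (h > result)
        result = h;
    }

  return result;
}
```

For a root with two children, one of which has two children of its own, these functions should give size 5, breadth 3 and height 2.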
We classify trees by various properties they may have, for example their height, "density", purpose ("decision tree", "search tree" ...), constraints they satisfy ("heap", AVL, ...), what kind of value the nodes store and where (in all nodes, just leaves, ...) or attributes such as being "balanced". Arguably the most important kinds of trees to introduce are **N-ary trees** in which any node is allowed to have no more than *N* children. N-ary trees, and especially [binary](binary.md) trees (*N = 2*), are frequently encountered in programming because (for [simplicity](simplicity.md) and performance) nodes in computer memory often have a preallocated, fixed number of [pointers](pointer.md) to their child nodes, and this imposes a limit on the maximum number of children. Knowing that a tree is N-ary has additional advantages too, for instance it's possible to easily compute the maximum size a tree of given height will require in memory and so on. In the case of *N = 1* the tree degenerates into a [linked list](linked_list.md).
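As a rough illustration of the fixed number of child pointers and of the size bound, consider the following C sketch. The names `N`, `Node`, `maxTreeSize` and `maxTreeBytes` are hypothetical. Height is assumed to be counted in edges (a lone root has height 0), so the bound is *1 + N + N^2 + ... + N^height*. Overflow for big inputs is not handled here.

```c
#include <stddef.h>

#define N 2 // hypothetical choice: a binary tree

typedef struct Node
{
  int value;
  struct Node *children[N]; // preallocated pointers, at most N children
} Node;

/* Maximum number of nodes an n-ary tree of given height can have, i.e.
   1 + n + n^2 + ... + n^height (height counted in edges). */
unsigned long maxTreeSize(unsigned int n, unsigned int height)
{
  unsigned long result = 0, levelSize = 1; // level 0 is just the root

  for (unsigned int i = 0; i <= height; ++i)
  {
    result += levelSize;
    levelSize *= n; // each next level can hold n times more nodes
  }

  return result;
}

/* Upper bound on memory (in bytes) needed for the nodes of a tree of
   given height built from the Node struct above. */
size_t maxTreeBytes(unsigned int height)
{
  return maxTreeSize(N, height) * sizeof(Node);
}
```

For example `maxTreeSize(2,3)` gives 15 (1 + 2 + 4 + 8), while `maxTreeSize(1,4)` gives 5, which is simply the linked list case.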
Sometimes a tree isn't even a physical data structure but rather an [implicit](implicit.md) structure formed for example by recursive calls of functions. A typical example of this would be the *abstract syntax tree* in [compilers](compiler.md). The compiler sees the language underneath as a hierarchy of symbols (see also [grammar](grammar.md)), i.e. a tree, but a simple (e.g. single-pass) compiler often won't construct the tree as an explicitly stored structure in memory; it will rather call a function to process a block of tokens, which will then call other functions to process smaller blocks and so on, so that the language is processed AS IF it really WAS stored like a tree. For instance an expression such as `(a + b) * f(!x,y,a - b)` might be represented (from the processing point of view) as:
```
        *
       / \
      /   \
     +     f
    /|    /|\
   / |   / | \
  a  b  !  y  -
       /     / \
      /     /   \
     x     a     b
```
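The following C sketch shows how such an implicit tree might arise: a tiny recursive descent parser that never stores any nodes and only emits the expression in postfix form as its functions return. The grammar, the function names (`parseExpression`, `parseTerm`, `parseFactor`, `toPostfix`) and the output format are all assumptions made for this example; error handling is left out and a real compiler will differ.

```c
#include <ctype.h>
#include <string.h>

static const char *input; // next unread character of the expression
static char output[256];  // emitted postfix form
static size_t outputLen;

static void emit(const char *s, size_t len) // appends token and a space
{
  if (outputLen + len + 2 <= sizeof(output)) // room for ' ' and 0 too
  {
    memcpy(output + outputLen, s, len);
    outputLen += len;
    output[outputLen++] = ' ';
    output[outputLen] = 0;
  }
}

static int accept(char c) // skips spaces, consumes c if it comes next
{
  while (*input == ' ')
    input++;

  if (*input != c)
    return 0;

  input++;
  return 1;
}

static void parseExpression(void);

// factor: '!' factor | '(' expression ')' | name | name '(' arguments ')'
static void parseFactor(void)
{
  if (accept('!'))
  {
    parseFactor(); // operand is the only child
    emit("!", 1);
  }
  else if (accept('('))
  {
    parseExpression();
    accept(')');
  }
  else
  {
    const char *name = input;

    while (isalnum((unsigned char) *input))
      input++;

    size_t len = (size_t) (input - name);

    if (accept('(') && !accept(')')) // function call, arguments = children
    {
      do
        parseExpression();
      while (accept(','));

      accept(')');
    }

    emit(name, len);
  }
}

static void parseTerm(void) // term: factor { '*' factor }
{
  parseFactor();

  while (accept('*'))
  {
    parseFactor();
    emit("*", 1);
  }
}

static void parseExpression(void) // expression: term { ('+' | '-') term }
{
  parseTerm();

  for (;;)
  {
    if (accept('+')) { parseTerm(); emit("+", 1); }
    else if (accept('-')) { parseTerm(); emit("-", 1); }
    else break;
  }
}

const char *toPostfix(const char *expression)
{
  input = expression;
  outputLen = 0;
  output[0] = 0;
  parseExpression();
  return output;
}
```

Calling `toPostfix("(a + b) * f(!x,y,a - b)")` should produce `a b + x ! y a b - f * ` (with a trailing space). The tokens come out in post-order, i.e. children before their parent, even though no tree is ever stored in memory; the tree exists only in the pattern of nested function calls.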
TODO: more more more