Update

2025-03-26 21:33:03 +01:00 · 2025-03-26 21:33:03 +01:00 · a8a9ce5cfc
commit a8a9ce5cfc
parent 21973f4724
22 changed files with 2026 additions and 1983 deletions
--- a/probability.md
+++ b/probability.md
@@ -45,7 +45,13 @@ Showing this is nothing more than simply [normalization](normalization.md) by th

 **[Venn diagrams](venn_diagram.md) are excellent for visualizing probabilities**. Imagine the space of all possibilities as a circle with area equal to 1 and then events as other smaller circles inside this circle. The area occupied by each circle is the corresponding event's probability. Now imagine performing an experiment as choosing at random a point in the big circle, for example by blindly throwing a dart. It's clear that the larger an event's circle is, the higher chance it has of being hit. Events with non-overlapping circles are mutually exclusive as there is no way the dart could ever land simultaneously in two non-overlapping areas. It's clear that the probability of one of several mutually exclusive events occurring is the sum of the corresponding circles' areas, just as stated by the equation above. Overlapping circles represent events allowed to happen simultaneously. Should events *x* and *y* overlap, then the conditional probability *P(x|y)* is the proportion of *x*'s area inside *y* to the whole area of *y*. And so on.

-**Probability distribution [functions](function.md)**: until now we've implicitly assumed that the all possible outcomes (events) of an experiment are equally likely to occur, i.e. that for instance each marble in a box has the same likelihood of being picked etc. In real life scenarios this frequently doesn't hold however, for example the likelihood of a human being born with red hair is lower than that of being born with dark hair (considering we don't have further information about parents etc.). This is modeled by so called *probability distribution function* -- this function says how likely each possible outcome is. For a finite number of discrete outcomes, such as possible hair colors, the functions may simply state the probability directly, e.g. *p_hair_color(black) = 0.75*, *p_hair_color(red) = 0.01* etc. For continuous values, such as human height, the situation gets slightly more complicated: the function cannot directly state a probability of a single value, only a probability of a value falling within certain INTERVAL. Consider e.g. asking about the probability of a human being exactly 1.75 meters tall? It's essentially 0 because anyone getting even very short of said height will always be at least a few micrometers off. So we should rather ask what's the probability of someone being between 1.75 and 1.76 meters tall? And this already makes good sense. For this continuous values are rather described by so called **probability density functions**, which must be [integrated](integral.md) over given interval in order to obtain a direct probability. There further exist equivalent kinds of functions such as cumulative distribution functions that say the probability of of the value being *x* or lower, but we won't delve into these now. 
-The most important probability distributions are **uniform** (all events are equally likely) and **normal**, which has the bell curve shape and which describes many variables in nature, for example [IQ](iq.md) distribution or height of trees in a forest.
+**Probability distribution [functions](function.md)**: until now we've implicitly assumed that all possible outcomes (events) of an experiment are equally likely to occur, i.e. that for instance each marble in a box has the same likelihood of being picked etc. In real life scenarios this frequently doesn't hold, however; for example the likelihood of a human being born with red hair is lower than that of being born with dark hair (considering we don't have further information about parents etc.). This is modeled by a so-called *probability distribution function*, which says how likely each possible outcome is. For a finite number of discrete outcomes, such as possible hair colors, the function may simply state the probability directly, e.g. *p_hair_color(black) = 0.75*, *p_hair_color(red) = 0.01* etc. For continuous values, such as human height, the situation gets slightly more complicated: the function cannot directly state the probability of a single value, only the probability of a value falling within a certain INTERVAL. Consider e.g. asking about the probability of a human being exactly 1.75 meters tall. It's essentially 0 because anyone coming even very close to said height will always be at least a few micrometers off. So we should rather ask what the probability is of someone being between 1.75 and 1.76 meters tall, and this already makes good sense. For this reason continuous values are rather described by so-called **probability density functions**, which must be [integrated](integral.md) over a given interval in order to obtain a direct probability. There further exist equivalent kinds of functions, such as cumulative distribution functions that say the probability of the value being *x* or lower, but we won't delve into these now.
+Some of the most important probability distributions are **uniform** (all events are equally likely); **normal**, which is continuous, has the bell curve shape and describes many variables in nature, for example [IQ](iq.md) distribution or the height of trees in a forest; and **binomial** (described below), a discrete distribution similar in shape to the normal distribution, which gives the probability of a given number of successful experiments out of a fixed number of experiments.
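
The point about continuous values having probabilities only over intervals can be illustrated with a small sketch. It is not part of the commit; it assumes height is normally distributed with a mean of 1.75 m and a standard deviation of 0.07 m, which are made-up illustrative numbers, and the function names are hypothetical. The probability over an interval is obtained as the difference of the cumulative distribution function at its two ends, which is the same as integrating the density over the interval.

```python
from math import erf, sqrt

# Hypothetical parameters, chosen only for illustration (not real data).
MEAN_HEIGHT_M = 1.75
SD_HEIGHT_M = 0.07


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Probability that a normally distributed value is x or lower."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def interval_probability(lo: float, hi: float, mean: float, sd: float) -> float:
    """Probability of the value falling within [lo, hi]: the integral of the
    density over the interval, computed as a difference of CDF values."""
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)


# Between 1.75 and 1.76 meters: a small but non-zero probability.
print(interval_probability(1.75, 1.76, MEAN_HEIGHT_M, SD_HEIGHT_M))

# "Exactly" 1.75 meters is a zero-width interval, so the probability is 0.
print(interval_probability(1.75, 1.75, MEAN_HEIGHT_M, SD_HEIGHT_M))
```

With other assumed parameters the first number changes, but the zero-width interval always yields 0.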
+
+The binomial distribution gives the probability of seeing exactly *x* successful experiments if we perform *n* independent experiments in total. Given the success probability *p* of a single experiment, it is computed as:
+
+*Bi(n,p,x) = binomial(n,x) * p^x * (1 - p)^(n - x)*
+
+where *binomial(n,x)* is the binomial coefficient, computed as *n! / (x! * (n - x)!)*. For example, the probability of seeing exactly 6 heads in 10 flips of a fair coin is *10! / (6! * (10 - 6)!) * 0.5^6 * (1 - 0.5)^(10 - 6) = 210 / 1024 ~= 0.21*.
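
As a quick check of the formula added in this hunk, here is a minimal Python sketch (not part of the commit). The function name `binomial_pmf` is hypothetical; `math.comb` from the standard library computes the binomial coefficient.

```python
from math import comb


def binomial_pmf(n: int, p: float, x: int) -> float:
    """Bi(n,p,x): probability of exactly x successes in n independent
    experiments, each succeeding with probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# The coin flip example: exactly 6 heads in 10 flips of a fair coin.
print(round(binomial_pmf(10, 0.5, 6), 2))  # 0.21

# The probabilities over all possible x add up to 1 (up to rounding error).
print(sum(binomial_pmf(10, 0.5, x) for x in range(11)))
```

Summing over all *x* from 0 to *n* is a cheap sanity test that the function really describes a probability distribution.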

 ## See Also