less_retarded_wiki/regex.md

7 lines
1.7 KiB
Markdown
Raw Normal View History

2023-11-05 22:10:49 +01:00
# Regular Expression
Regular expression (shortened *regex* or *regexp*) is a kind of mathematical [expression](expression.md), very often used in [programming](programming.md), that can be used to define simple patterns in [strings](string.md) of characters (usually text). Regular expressions are typically used for searching patterns (i.e. not just exact matches but rather sequences of characters which follow some rules, e.g. numeric values), substitutions (replacement) of such patterns, describing [syntax](syntax.md) of computer languages, their [parsing](parsing.md) etc. (though they may also be used in more wild ways, e.g. for generating strings). Regular expression is itself a string of symbols which however describes potentially many (even [infinitely](infinite.md) many) other strings thanks to containing special symbols that may stand for repetition, alternative etc. For example `a.*.b` is a regular expression describing a string that starts with letter `a`, which is followed by a sequence of at least one character and then ends with `b` (so e.g. `aab`, `abbbb`, `acaccb` etc.).
**Regular expressions are computationally weak**, they are equivalent to the weakest models of computations such as **[finite state machines](finite_state_machine.md)** -- in fact regular expressions are often implemented as finite state machines. This means that **regular expressions can NOT describe any possible pattern**, only relatively simple ones; however it turns out that very many commonly encountered patterns are simple enough to be described this way, so we have a [good enough](good_enough.md) tool. The advantage of regular expressions is exactly that they are simple, yet very often sufficient.
TODO