Update
commit 9969237a2b (parent 349045e2b8)
12 changed files with 2093 additions and 1984 deletions
@@ -59,11 +59,11 @@ The following is an overview of some most common compression techniques.

**[Predictor](predictor.md) compression** is based on making a *predictor* that tries to guess the following data from previous values (which can be done e.g. in the case of pictures, sound or text) and then only storing the difference against such a predicted result. If the predictor is good, we may only need to store the small errors it makes.
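To make the idea concrete, here is a minimal C sketch (purely illustrative, all names made up) of the simplest possible predictor -- "the next value equals the previous one" -- which stores only the prediction errors; on smooth data these residuals are small numbers that a following entropy coder could then store very cheaply:

```
#include <stdio.h>

/* simplest predictor: guess that the next sample equals the previous one,
   then store only the prediction error (residual) */
void predictorEncode(const int *samples, int *residuals, int count)
{
  int prediction = 0; /* initial guess before any data is seen */

  for (int i = 0; i < count; ++i)
  {
    residuals[i] = samples[i] - prediction; /* store only the error */
    prediction = samples[i];                /* next guess = current value */
  }
}

/* decoding replays the same predictions and adds the stored errors back */
void predictorDecode(const int *residuals, int *samples, int count)
{
  int prediction = 0;

  for (int i = 0; i < count; ++i)
  {
    samples[i] = prediction + residuals[i];
    prediction = samples[i];
  }
}

int main(void)
{
  int data[8] = {100, 101, 103, 104, 104, 103, 101, 100}, res[8], back[8];

  predictorEncode(data, res, 8);
  predictorDecode(res, back, 8);

  for (int i = 0; i < 8; ++i)
    printf("%d -> residual %d -> %d\n", data[i], res[i], back[i]);

  return 0;
}
```

Real formats use smarter predictors (e.g. PNG filters or linear prediction in lossless audio codecs), but the principle is the same.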
A famous family of dictionary compression algorithms is **Lempel-Ziv (LZ)** algorithms -- these two guys first proposed [LZ77](lz77.md) (1977, sliding window) and [LZ78](lz78.md) (explicitly stored dictionary, 1978). These were a basis for improved/remix algorithms, most notably [LZW](lzw.md) (1984, Welch). Additionally these algorithms are used and combined in other algorithms, most notably [gif](gif.md) and [DEFLATE](deflate.md) (used e.g. in gzip and png).

A famous family of dictionary compression algorithms is **Lempel-Ziv (LZ)** -- these two guys first proposed [LZ77](lz77.md) (1977, sliding window) and [LZ78](lz78.md) (explicitly stored dictionary, 1978). These methods provided a basis for numerous improved/remixed algorithms, most notably [LZW](lzw.md) (1984, Welch). Additionally these algorithms are used and combined in other ones, most notably [gif](gif.md) and [DEFLATE](deflate.md) (used e.g. in gzip and png).
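Just to illustrate the sliding window principle, here is a toy C sketch (NOT a real compressor; for simplicity matches here aren't allowed to overlap the current position, as they can in real LZ77) that searches a small window of already processed text and outputs LZ77-style triples of back-reference offset, match length and the next literal character:

```
#include <stdio.h>
#include <string.h>

#define WINDOW 32 /* tiny sliding window, just for demonstration */

/* prints LZ77-style triples (offset, length, next char) for a string;
   a real compressor would pack these into a compact binary stream */
void lz77Demo(const char *s)
{
  int len = strlen(s), pos = 0;

  while (pos < len)
  {
    int bestOff = 0, bestLen = 0,
      start = pos > WINDOW ? pos - WINDOW : 0;

    for (int i = start; i < pos; ++i) /* find longest match in the window */
    {
      int l = 0;

      while (pos + l < len - 1 && i + l < pos && s[i + l] == s[pos + l])
        ++l;

      if (l > bestLen)
      {
        bestLen = l;
        bestOff = pos - i;
      }
    }

    printf("(%d,%d,'%c') ", bestOff, bestLen, s[pos + bestLen]);
    pos += bestLen + 1;
  }

  putchar('\n');
}

int main(void)
{
  lz77Demo("abcabcabcabcX"); /* repetition gets replaced by back references */
  return 0;
}
```

For the string in `main` the output is `(0,0,'a') (0,0,'b') (0,0,'c') (3,3,'a') (6,5,'X')` -- the repetition has collapsed into two back references.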
An approach similar to the predictor may be trying to find some general mathematical [model](model.md) of the data and then just finding and storing the right parameters of the model. This may for example mean [vectorizing](vector_graphics.md) a bitmap image, i.e. finding geometrical shapes in an image composed of pixels and then only storing the parameters of the shapes -- of course this may not be 100% accurate, but again if we want to preserve the data accurately, we may additionally also store the small amount of errors the model makes. A similar approach is used in [vocoders](vocoder.md) used in cellphones that try to mathematically model human speech (however here the compression is lossy), or in [fractal](fractal.md) compression of images. A nice feature we gain here is the ability to actually "increase the resolution" (or rather generate detail) of the original data -- once we fit a model onto our data, we may use it to tell us values that are not actually present in the data (i.e. we get a fancy [interpolation](interpolation.md)/[extrapolation](extrapolation.md)).

An approach similar to the predictor is searching for a **mathematical [model](model.md) of the data** and storing only the model parameters (which should be relatively few numbers compared to storing the data explicitly). For example this can mean [vectorizing](vector_graphics.md) a bitmap image, i.e. finding geometric shapes (such as lines and circles) in the image (a grid of pixels) and then storing the shape parameters rather than pixel values -- this may not be 100% accurate due to noise and more complex shapes, but again if we desire to preserve the data without losses, additional error correction may be applied by storing the small remaining error, which will allow for restoring the image precisely (of course, the error must really be small, otherwise we might fail to actually compress the data, and this all depends on how well our model predicts and "fits"). A similar approach is used in [vocoders](vocoder.md) used in cellphones that attempt to mathematically model human speech (however here the compression is lossy), or in [fractal](fractal.md) compression of images. A nice feature we gain here is the ability to actually "increase the resolution" (or rather generate detail) of the original data -- once we fit a model onto our data, we may use it to tell us values that are not actually present in the data (i.e. we get a fancy [interpolation](interpolation.md)/[extrapolation](extrapolation.md)).
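As a toy illustration of the model idea (not any real format, everything here is made up for the example), this C sketch "compresses" a roughly linear sequence by fitting a straight line through its first and last value and storing only the two line parameters plus the tiny per-sample errors, from which the data can be restored exactly:

```
#include <stdio.h>

#define N 8

int main(void)
{
  int data[N] = {10, 12, 15, 16, 19, 20, 23, 24}; /* roughly linear data */

  /* model: a straight line through the first and last sample; we only need
     to store these two parameters plus the small correction errors */
  double a = (double) (data[N - 1] - data[0]) / (N - 1), b = data[0];

  int errors[N];

  for (int i = 0; i < N; ++i)
    errors[i] = data[i] - (int) (a * i + b + 0.5); /* residual vs. the model */

  printf("model: value(i) ~ %.2f * i + %.2f\n", a, b);

  for (int i = 0; i < N; ++i) /* exact reconstruction: model + stored error */
    printf("%d reconstructed as %d\n",
      data[i], (int) (a * i + b + 0.5) + errors[i]);

  return 0;
}
```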
Another property of data to exploit may be its sparsity -- if for example we have a huge image that's prevalently white, we may say white is the implicit color and we only somehow store the pixels of other colors.

Another property of data to exploit may be its **sparsity** -- if for example we were to compress a gigantic image which prevalently consists of large white areas, we could say that white is the implicit color and we'll only explicitly store pixels of other colors.
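A tiny C sketch of the sparse idea (hypothetical image data, made up for the example): instead of storing the whole pixel grid we record only the coordinates and values of the few pixels that differ from the implicit background color:

```
#include <stdio.h>

#define W 8
#define H 4
#define WHITE 0 /* implicit background value, not stored at all */

int main(void)
{
  /* a mostly "white" image: only a few pixels differ from the background */
  unsigned char image[H][W] =
  {
    {0, 0, 0, 0, 0, 0, 0, 0},
    {0, 0, 3, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 7, 0, 0},
    {0, 0, 0, 0, 0, 0, 0, 0}
  };

  int stored = 0;

  /* sparse representation: record only (x, y, value) of non-background pixels */
  for (int y = 0; y < H; ++y)
    for (int x = 0; x < W; ++x)
      if (image[y][x] != WHITE)
      {
        printf("pixel %d %d value %d\n", x, y, image[y][x]);
        ++stored;
      }

  printf("stored %d records instead of %d pixels\n", stored, W * H);
  return 0;
}
```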
Some more wild techniques may include [genetic programming](genetic_programming.md) that tries to evolve a small program that reproduces the input data, or using "[AI](ai.md)" in whatever way to compress the data (in fact compression is an essential part of many [neural networks](neural_network.md) as it forces the network to "understand", make sense of the data -- many neural networks therefore internally compress and decompress the data so as to filter out the unimportant information; [large language models](llm.md) are now starting to beat traditional compression algorithms at compression ratios).