Update
This commit is contained in:
parent
d765507e6a
commit
ecd9c2f991
8 changed files with 22 additions and 6 deletions
|
@ -4,6 +4,8 @@ Compression means encoding [data](data.md) (such as images or texts) in a differ
|
|||
|
||||
{ There is a cool compressing competition known as Hutter Prize that offers 500000 pounds to anyone who can break the current record for compressing [Wikipedia](wikipedia.md). Currently the record is at compressing 1GB down to 115MB. See http://prize.hutter1.net for more. ~drummyfish }
|
||||
|
||||
{ [LMAO](lmao.md) retard [patents](patent.md) are being granted on impossible compression algorithms, see e.g. http://gailly.net/05533051.html. See also [Sloot Digital Coding System](sloot.md), a miraculous compression algorithm that "could store a whole movie in 8 KB" lol. ~drummyfish }
|
||||
|
||||
Let's keep in mind compression is not applied just to files on hard drives, it can also be used e.g. in RAM to utilize it more efficiently.
|
||||
|
||||
Why don't we compress everything? Firstly because compressed data is slow to work with, it requires significant CPU time to compress and decompress data, it's a kind of a space-time tradeoff (we gain more storage space for the cost of CPU time). Secondly compressed data is more prone to [corruption](corruption.md) because redundant information (which can help restoring corrupted data) is removed from it -- in fact we sometimes purposefully do the opposite of compression and make our data bigger to protect it from corruption (see e.g. [error correcting](error_correction.md) codes, [RAID](raid.md) etc.). And last but not least, many data can hardly be compressed or are so small it's not even worth it.
|
||||
|
@ -47,6 +49,8 @@ The following is an overview of some most common compression techniques.
|
|||
|
||||
**[Predictor](predictor.md) compression** is based on making a *predictor* that tries to guess following data from previous values (which can be done e.g. in case of pictures, sound or text) and then only storing the difference against such a predicted result. If the predictor is good, we may only store the small amount of the errors it makes.
|
||||
|
||||
A famous family of dictionary compression algorithms are **Lempel-Ziv (LZ)** algorithms -- these two guys first proposed [LZ77](lz77.md) in (1977, sliding window) and ([LZ78](lz78.md), explicitly stored dictionary, 1978). These were a basis for improved/remix algorithms, most notably [LZW](lzw.md) (1984, Welch). Additionally these algorithms are used and combined in other algorithms, most notably [gif](gif.md) and [DEFLATE](deflate.md) (used e.g. in gzip and png).
|
||||
|
||||
An approach similar to the predictor may be trying to find some general mathematical [model](model.md) of the data and then just find and store the right parameters of the model. This may for example mean [vectorizing](vector_graphics.md) a bitmap image, i.e. finding geometrical shapes in an image composed of pixels and then only storing the parameters of the shapes -- of course this may not be 100% accurate, but again if we want to preserve the data accurately, we may additionally also store the small amount of errors the model makes. Similar approach is used in [vocoders](vocoder.md) used in cellphones that try to mathematically model human speech (however here the compression is lossy), or in [fractal](fractal.md) compression of images. A nice feature we gain here is the ability to actually "increase the resolution" (or rather generate detail) of the original data -- once we fit a model onto our data, we may use it to tell us values that are not actually present in the data (i.e. we get a fancy [interpolation](interpotation.md)/[extrapolation](extrapolation.md)).
|
||||
|
||||
Another property of data to exploit may be its sparsity -- if for example we have a huge image that's prevalently white, we may say white is the implicit color and we only somehow store the pixels of other colors.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue