master
Miloslav Ciz 11 months ago
parent 619bedb131
commit e267fff78e

@ -42,6 +42,7 @@ The methods may be tagged with the following:
|segmented road |*OO 2.5D*, e.g. Outrun |
|[shear warp rendering](shear_warp.md) |*IO*, volumetric |
|[splatting](splatting.md) |*OO*, rendering with 2D blobs |
|[texture slicing](texture_slicing.md) |*OO*, volumetric, layering textures |
|[triangle rasterization](rasterization.md)|*OO*, traditional in GPUs |
|[voxel space rendering](voxel_space.md) |*OO 2.5D*, e.g. Comanche |
|[wireframe rendering](wireframe.md) |*OO*, just lines |

@ -87,6 +87,7 @@ Here is a list of some acronyms:
- **[ENIAC](eniac.md)** (electronic numerical integrator and computer)
- **[EOF](eof.md)** (end of [file](file.md))
- **[EOL](eol.md)** (end of line, end of life)
- **[ERP](erp.md)** (erotic role play)
- **[ESR](esr.md)** (Eric Steven Raymond)
- **[EULA](eula.md)** (end user license agreement)
- **[FAQ](faq.md)** (frequently asked questions)
@ -192,6 +193,7 @@ Here is a list of some acronyms:
- **[KKK](kkk.md)** (ku klux klan)
- **[KYS](kys.md)** ([kill yourself](suicide.md))
- **[LAMP](lamp.md)** (linux apache mysql php)
- **[LARP](larp.md)** (live action role play)
- **[LAN](lan.md)** (local area network)
- **[LCD](lcd.md)** (liquid crystal display)
- **[LED](led.md)** (light emitting diode)
@ -232,6 +234,7 @@ Here is a list of some acronyms:
- **[NASA](nasa.md)** (national aeronautics and space administration)
- **[NAT](nat.md)** (network address translation)
- **[NC](nc.md)** (non commercial)
- **[NEET](neet.md)** (not in education, employment or training)
- **[NGL](ngl.md)** (not gonna lie)
- **[NOP](nop.md)** (no operation)
- **[NP](np.md)** (nondeterministic polynomial)

@ -173,8 +173,11 @@ RETURN_TYPE myFunction (TYPE1 param1, TYPE2 param2, ...)
- [C tutorial](c_tutorial.md)
- [C pitfalls](c_pitfalls.md)
- [C programming style](programming_style.md)
- [C++](cpp.md)
- [IOCCC](ioccc.md)
- [D](d.md)
- [HolyC](holyc.md)
- [QuakeC](quakec.md)
- [Pascal](pascal.md)
- [Fortran](fortran.md)
- [LISP](lisp.md)

@ -25,11 +25,11 @@ The following is an example of how well different types of compression work for
| image lossy (JPG), nearly indistinguishable quality | 164 | 0.054 |
| image lossy (JPG), ugly but readable | 56 | 0.018 |
Mathematically there cannot exist a lossless compression algorithm that would always reduce the size of any input data -- if it existed, we could just repeatedly apply it and compress ANY data to zero bytes. And not only that -- **every lossless compression will inevitably enlarge some input files**. This is also mathematically given -- we can see compression as simply mapping input binary sequences to output (compressed) binary sequences, while such mapping has to be one-to-one ([bijective](bijection.md)); it can be simply shown that if we make any such mapping that reduces the size of some input (maps a longer sequence to a shorter one, i.e. compresses it), we will also have to map some short code to a longer one. However we can make it so that our compression algorithm enlarges a file at most by 1 bit: we can say that the first bit in the compressed data says whether the following data is compressed or not; if our algorithm fails to reduce the size of the input, it simply sets the bit to say so and leaves the original file uncompressed.
Mathematically there cannot exist a lossless compression algorithm that would always reduce the size of any input data -- if it existed, we could just repeatedly apply it and compress ANY data to zero bytes. And not only that -- **every lossless compression will inevitably enlarge some input files**. This is also mathematically given -- we can see compression as simply mapping input binary sequences to output (compressed) binary sequences, while such mapping has to be one-to-one ([bijective](bijection.md)); it can be simply shown that if we make any such mapping that reduces the size of some input (maps a longer sequence to a shorter one, i.e. compresses it), we will also have to map some short code to a longer one. However we can make it so that our compression algorithm enlarges a file at most by 1 bit: we can say that the first bit in the compressed data says whether the following data is compressed or not; if our algorithm fails to reduce the size of the input, it simply sets the bit to say so and leaves the original file uncompressed (in practice many algorithms don't do this though as they try to work as streaming filters, without random access to data, which would be needed here).
**Dude, how does compression really work tho?** The basic principle of lossless compression is **removing [redundancy](redundancy.md)** ([correlations](correlation.md) in the data), i.e. that which is explicitly stored in the original data but doesn't really have to be there because it can be reasoned out from the remaining data. This is why a completely random [noise](noise.md) can't be compressed -- there is no correlated data in it, nothing to reason out from other parts of the data. However human language for example contains many redundancies. Imagine we are trying to compress English text and have a word such as "computer" on the input -- we can really just shorten it to "computr" and it's still pretty clear the word is meant to be "computer" as there is no other similar English word (we also see that compression algorithm is always specific to the type of data we expect on the input -- we have to know what nature of the input data we can expect). Another way to remove redundancy is to e.g. convert a string such as "HELLOHELLOHELLOHELLOHELLO" to "5xHELLO". Lossy compression on the other hand tries to decide what information is of low importance and can be dropped -- for example a lossy compression of text might discard information about case (upper vs lower case) to be able to store each character with fewer bits; an all caps text is still readable, though less comfortably.
**OK, but how much can we really compress?** Well, as stated above, there can never be anything such as a universal uber compression algorithm that just makes any input file super small -- everything really depends on the nature of the data we are trying to compress. The more we know about the nature of the input data, the more we can compress, so a general compression program will compress only a little, while an image-specialized compress program will compress better (but will only work with images). As said, we just cannot compress completely random data at all (as we don't know anything about the nature of such data). On the other hand data with a lot of redundancy, such as video, can be compressed A LOT. **In theory we can make an algorithm that compresses one specific 100GB video to 1 bit** (we just define that a bit "1" decompresses to this specific video), but it will only work for that one single video, not for video in general. Similarly video compression algorithms used in practice work only for videos that appear in the real world which exhibit certain patterns, such as two consecutive frames being very similar -- if we try to compress e.g. static (white noise), video codecs just shit themselves trying to compress it (look up e.g. videos of confetti and see how blocky they get).
**OK, but how much can we really compress?** Well, as stated above, there can never be anything such as a universal uber compression algorithm that just makes any input file super small -- everything really depends on the nature of the data we are trying to compress. The more we know about the nature of the input data, the more we can compress, so a general compression program will compress only a little, while an image-specialized compression program will compress better (but will only work with images). As an extreme example, consider that **in theory we can make e.g. an algorithm that compresses one specific 100GB video to 1 bit** (we just define that a bit "1" decompresses to this specific video), but it will only work for that one single video, not for video in general -- i.e. we made an extremely specialized compression and got an extremely good compression ratio, however due to such extreme specialization we can almost never use it. As said, we just cannot compress completely random data at all (as we don't know anything about the nature of such data). On the other hand data with a lot of redundancy, such as video, can be compressed A LOT. Similarly video compression algorithms used in practice work only for videos that appear in the real world which exhibit certain patterns, such as two consecutive frames being very similar -- if we try to compress e.g. static (white noise), video codecs just shit themselves trying to compress it (look up e.g. videos of confetti and see how blocky they get).
## Methods
@ -49,7 +49,7 @@ An approach similar to the predictor may be trying to find some general mathemat
Another property of data to exploit may be its sparsity -- if for example we have a huge image that's prevalently white, we may say white is the implicit color and we only somehow store the pixels of other colors.
Some more wild techniques may include [genetic programming](genetic_programming.md) that tries to evolve a small program that reproduces the input data, or using "[AI](ai.md)" in whatever way to compress the data.
Some more wild techniques may include [genetic programming](genetic_programming.md) that tries to evolve a small program that reproduces the input data, or using "[AI](ai.md)" in whatever way to compress the data (in fact compression is an essential part of many [neural networks](neural_network.md) as it forces the network to "understand", make sense of the data -- many neural networks therefore internally compress and decompress the data so as to filter out the unimportant information).
Note that many of these methods may be **combined or applied repeatedly** as long as we are getting smaller results.
@ -74,21 +74,143 @@ In **audio** we usually straight remove frequencies that humans can't hear (usua
Here is a list of some common compression programs/utilities/standards/formats/etc:
| util/format | extensions | free? | media | lossless? | notes |
| ----------------- | ---------- | ----- | -------------| --------- | -------------------------------------------- |
|[bzip2](bzip2.md) | .bz2 | yes | general | yes | Burrows-Wheeler alg. |
|[flac](flac.md) | .flac | yes | audio | yes | super free lossless audio format |
|[gif](gif.md) | .gif |now yes| image/anim. | no | limited color palette, patents expired |
|[gzip](gzip.md) | .gz | yes | general | yes | by GNU, DEFLATE, LZ77, mostly used by Unices |
|[jpeg](jpeg.md) | .jpg, .jpeg| yes? | raster image | no | common lossy format, under patent fire |
|[lz4](lz4.md) | .lz4 | yes | general | yes | high compression/decompression speed, LZ77 |
|[mp3](mp3.md) | .mp3 |now yes| audio | no | popular audio format, patents expired |
|[png](png.md) | .png | yes | raster image | yes | popular lossless image format, transparency |
|[rar](rar.md) | .rar | NO | general | yes | popular among normies, PROPRIETARY |
|[vorbis](vorbis.md)| .ogg | yes | audio | no | was a free alternative to mp3, used with ogg |
|[zip](zip.md) | .zip | yes? | general | yes | along with encryption may be patented |
|[7-zip](7zip.md) | .7z | yes | general | yes | more complex archiver |
| util/format | extensions | free? | media | lossless? | notes |
| ----------------- | ---------- | ----- | ------------- | --------- | -------------------------------------------- |
|[bzip2](bzip2.md) | .bz2 | yes | general | yes | Burrows-Wheeler alg. |
|[flac](flac.md) | .flac | yes | audio | yes | super free lossless audio format |
|[gif](gif.md) | .gif |now yes| image/anim. | no | limited color palette, patents expired |
|[gzexe](gzexe.md) | | yes |executable bin.| yes | makes self-extracting executable |
|[gzip](gzip.md) | .gz | yes | general | yes | by GNU, DEFLATE, LZ77, mostly used by Unices |
|[jpeg](jpeg.md) | .jpg, .jpeg| yes? | raster image | no | common lossy format, under patent fire |
|[lz4](lz4.md) | .lz4 | yes | general | yes | high compression/decompression speed, LZ77 |
|[mp3](mp3.md) | .mp3 |now yes| audio | no | popular audio format, patents expired |
|[png](png.md) | .png | yes | raster image | yes | popular lossless image format, transparency |
|[rar](rar.md) | .rar | NO | general | yes | popular among normies, PROPRIETARY |
|[vorbis](vorbis.md)| .ogg | yes | audio | no | was a free alternative to mp3, used with ogg |
|[zip](zip.md) | .zip | yes? | general | yes | along with encryption may be patented |
|[7-zip](7zip.md) | .7z | yes | general | yes | more complex archiver |
## Code Example
TODO
Let's write a simple lossless compression utility in [C](c.md). It will work on binary files and we will use the simplest RLE method, i.e. our program will just shorten continuous sequences of repeating bytes to a short sequence saying "repeat this byte N times". Note that this is very primitive (a small improvement might be actually done by looking for sequences of longer words, not just single bytes), but it somewhat works for many files and demonstrates the basics.
The compression will work like this:
- We will choose some random, hopefully not very frequent byte value, as our special "marker value". Let's say this will be the value 0xf3.
- We will read the input file and whenever we encounter a sequence of 4 or more identical bytes in a row, we will output these 3 bytes:
  - the marker value
  - a byte whose value is the length of the sequence minus 4
  - the byte to repeat
- If the marker value is encountered in the input, we output 2 bytes:
  - the marker value
  - value 0xff (which we won't be able to use for the length of the sequence)
- Otherwise we just output the byte we read from the input.
Decompression is then quite simple -- we simply output what we read, unless we read the marker value; in such case we look whether the following value is 0xFF (then we output the marker value), else we know we have to repeat the next character this many times plus 4.
For example given input bytes
```
0x11 0x00 0x00 0xAA 0xBB 0xBB 0xBB 0xBB 0xBB 0xBB 0x10 0xF3 0x00
                    \___________________________/      \__/
                       long repeating sequence        marker!
```
Our algorithm will output a compressed sequence
```
0x11 0x00 0x00 0xAA 0xF3 0x02 0xBB 0x10 0xF3 0xFF 0x00
                    \____________/      \_______/
                    compressed seq.   encoded marker
```
Notice that, as stated above in the article, there inevitably exists a "danger" of actually enlarging some files. This can happen if the file contains no sequences that we can compress and at the same time there appear the marker values which actually get expanded (from 1 byte to 2).
The nice property of our algorithm is that both compression and decompression can be streaming, i.e. both can be done in a single pass as a filter, without having to load the file into memory or randomly access bytes in files. Also the memory complexity of this algorithm is constant (RAM usage will be the same for any size of the file) and time complexity is linear (i.e. the algorithm is "very fast").
Here is the actual code of this utility (it reads from stdin and outputs to stdout, a flag `-x` is used to set decompression mode, otherwise it is compressing):
```
#include <stdio.h>

#define SPECIAL_VAL 0xf3 // random value, hopefully not very common

void compress(void)
{
  unsigned char prevChar = 0;
  unsigned int seqLen = 0;
  unsigned char end = 0;

  while (!end)
  {
    int c = getchar();

    if (c == EOF)
      end = 1;

    if (c != prevChar || c == SPECIAL_VAL || end || seqLen > 200)
    { // dump the sequence
      if (seqLen > 3)
        printf("%c%c%c",SPECIAL_VAL,seqLen - 4,prevChar);
      else
        for (int i = 0; i < seqLen; ++i)
          putchar(prevChar);

      seqLen = 0;
    }

    prevChar = c;
    seqLen++;

    if (c == SPECIAL_VAL)
    {
      // this is how we encode the special value appearing in the input
      putchar(SPECIAL_VAL);
      putchar(0xff);
      seqLen = 0;
    }
  }
}

void decompress(void)
{
  unsigned char end = 0;

  while (1)
  {
    int c = getchar();

    if (c == EOF)
      break;

    if (c == SPECIAL_VAL)
    {
      unsigned int seqLen = getchar();

      if (seqLen == 0xff)
        putchar(SPECIAL_VAL);
      else
      {
        c = getchar();

        for (int i = 0; i < seqLen + 4; ++i)
          putchar(c);
      }
    }
    else
      putchar(c);
  }
}

int main(int argc, char **argv)
{
  if (argc > 1 && argv[1][0] == '-' && argv[1][1] == 'x' && argv[1][2] == 0)
    decompress();
  else
    compress();

  return 0;
}
```
How well does this perform? If we try to let the utility compress its own source code, we get to 1242 bytes from the original 1344, which is not so great -- the compression ratio is only about 92% here. We can see why: the only repeating bytes in the source code are the space characters used for indentation -- this is the only thing our primitive algorithm manages to compress. However if we let the program compress its own binary version, we get much better results (at least on the computer this was tested on): the original binary has 16768 bytes while the compressed one has 5084 bytes, which is an EXCELLENT compression ratio of 30%! Yay :-)

@ -10,7 +10,7 @@ A demo isn't a video, it is a non-[interactive](interactive.md) [real time](real
Some of the biggest demoparties are or were Assembly (Finland), The Party (Denmark), The Gathering (Norway), Kindergarden (Norway) and Revision (Germany). A guy on https://mlab.taik.fi/~eye/demos/ says that he has never seen a demo [female](female.md) programmer and that females often have free entry to demoparties while men have to pay because there are almost no women anyway xD Some famous demogroups include Farbrausch (Germany, also created a tiny 3D shooter game [.kkrieger](kkrieger.md)), Future Crew (Finland), Pulse (international), Haujobb (international), Conspiracy (Hungary) and [Razor 1911](razor_1911.md) (Norway). { Personally I liked best the name of a group that called themselves *Byterapers*. ~drummyfish } There is an online community of demosceners at https://www.pouet.net.
**On technological side of demos**: great amount of hacking, exploitation of bugs and errors and usage of techniques going against "good programming practices" are made use of in making of demos. They're usually made in [C](c.md), [C++](cpp.md) or [assembly](assembly.md) (though some retards even make demos in [Java](java.md) [lmao](lmao.md)). In intros it is extremely important to save space wherever possible, so things such as [procedural generation](procgen.md) and [compression](compression.md) are heavily used. Manual [assembly](assembly.md) optimization for size can take place. [Tracker music](tracker_music.md), [chiptune](chiptune.md), [fractals](fractal.md) and [ASCII art](ascii_art.md) are very popular. New techniques are still being discovered, e.g. [bytebeat](bytebeat.md). [GLSL](glsl.md) shader source code that's to be embedded in the executable has to be minified or compressed. Compiler flags are chosen so as to minimize size, e.g. small size optimization (`-Os`), turning off buffer security checks or turning on fast [float](float.md) operations. The final executable is also additionally compressed with [specialized executable compression](executable_compression.md).
**On technological side of demos**: great amount of [hacking](hacking.md), exploitation of bugs and errors and usage of techniques going against "good programming practices" are made use of in making of demos. Demosceners make use of and invent many kinds of effects, such as the *plasma* (cycling color palette on a 2D noise pattern), *copper bars*, *[moire](moire.md)* patterns, waving, lens distortion etc. Demos are usually written in [C](c.md), [C++](cpp.md) or [assembly](assembly.md) (though some retards even make demos in [Java](java.md) [lmao](lmao.md)). In intros it is extremely important to save space wherever possible, so things such as [procedural generation](procgen.md) and [compression](compression.md) are heavily used. Manual [assembly](assembly.md) optimization for size can take place. [Tracker music](tracker_music.md), [chiptune](chiptune.md), [fractals](fractal.md) and [ASCII art](ascii_art.md) are very popular. New techniques are still being discovered, e.g. [bytebeat](bytebeat.md). [GLSL](glsl.md) shader source code that's to be embedded in the executable has to be minified or compressed. Compiler flags are chosen so as to minimize size, e.g. small size optimization (`-Os`), turning off buffer security checks or turning on fast [float](float.md) operations. The final executable is also additionally compressed with [specialized executable compression](executable_compression.md).
## See Also

@ -49,6 +49,7 @@ These are some sources you can use for research and gathering information for ar
- **[YouTube](youtube.md)**: Yes, sadly this is nowadays one of the biggest sources of information which is unfortunately hidden in videos full of ads and retarded zoomers, the information is unindexed. If you are brave enough, you can dig this information out and write it here as a proper text.
- Try searching with different search engines than just Google (wiby, marginalia, Yandex, Bing, Yahoo, Internet Archive, ...).
- **Non-web**: When web fails, you can search the [darknet](darknet.md), [gopher](gopher.md), [gemini](gemini.md), [usenet](usenet.md), [tor](tor.md) etc.
- ...
## Purpose
