Miloslav Ciz 2023-08-23 20:31:35 +02:00
parent 9f6a70d07b
commit 976617b212
4 changed files with 11 additions and 5 deletions


@ -1,6 +1,10 @@
# Byte # Byte
TODO Byte (symbol: B) is a basic unit of [information](information.md), nowadays practically always consisting of 8 [bits](bit.md) (in which case it is also called an **octet**), which allow it to store 2^8 = 256 distinct values (for example a number in range 0 to 255). It is usually the smallest unit of memory a [CPU](cpu.md) is able to operate on, and memory addresses are assigned in one-byte steps. We use bytes to measure the size of [memory](memory.md) and derive higher memory [units](memory_units.md) from them, such as a kilobyte (kB, 1000 bytes), kibibyte (KiB, 1024 bytes), megabyte (MB, 10^6 bytes) etc. In [programming](programming.md) a one-byte [variable](variable.md) is nowadays seen as very small and is used if we are really limited by memory constraints (e.g. [embedded](embedded.md)) or to mimic older 8bit computers ("[retro](retro.md) games" etc.): one byte can be used to store very small numbers (while in mainstream processors numbers nowadays mostly have 4 or 8 bytes), text characters ([ASCII](ascii.md), ...), very primitive colors (see [RGB332](rgb332.md), [palettes](palette.md), ...) etc.
Historically *byte* stood for the basic addressable unit of memory that could store one text character or another "basic value", and could therefore have a size other than 8 bits: e.g. ASCII machines might have had a 7bit byte, 16bit machines a 16bit byte etc. In [C](c.md) (standard 99) `char` is the "byte" data type: its size in bytes is always 1 (`sizeof(char) == 1`), though its number of bits (`CHAR_BIT`) can be greater than or equal to 8. If you need an exact 8bit byte, use types such as `int8_t` and `uint8_t` from the standard `stdint.h` header. From now on we will implicitly talk about 8bit bytes.
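The C facts above can be checked with a few lines of code. The following is only a minimal sketch; the function names `char_size_in_bytes`, `bits_per_byte` and `wrapping_add_u8` are made up for this illustration and come from no library.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint8_t */

/* Size of char measured in bytes; this is 1 by definition. */
size_t char_size_in_bytes(void)
{
  return sizeof(char);
}

/* Number of bits in one C byte; at least 8, possibly more on unusual platforms. */
int bits_per_byte(void)
{
  return CHAR_BIT;
}

/* Adds two exact 8bit values; the cast back to uint8_t wraps the result modulo 256. */
uint8_t wrapping_add_u8(uint8_t a, uint8_t b)
{
  return (uint8_t) (a + b);
}
```

On a typical desktop platform `bits_per_byte()` gives 8, and `wrapping_add_u8(250,10)` gives 4 because 260 does not fit into one 8bit byte.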
**Value of one byte can be written exactly with two [hexadecimal](hexadecimal.md) digits** with each digit always corresponding to higher/lower 4 bits, making mental conversions very easy; this is very convenient compared to [decimal](decimal.md) representation, so programmers prefer to write byte values in hexadecimal. For example a byte whose binary value is *11010010* is *D2* in hexadecimal (*1101* is always *D* and *0010* is always *2*), while in decimal we get 210.
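The correspondence between hexadecimal digits and 4 bit halves of a byte can be sketched in C as follows. The helper `byte_to_hex` is a hypothetical name chosen for this example and assumes an 8bit byte.

```c
#include <stdint.h>  /* uint8_t */

/* Writes the two hexadecimal digits of a byte into out, followed by a
   terminating zero; out must have room for at least 3 chars. */
void byte_to_hex(uint8_t byte, char out[3])
{
  static const char digits[] = "0123456789ABCDEF";

  out[0] = digits[byte >> 4];    /* higher 4 bits give the first digit */
  out[1] = digits[byte & 0x0F];  /* lower 4 bits give the second digit */
  out[2] = '\0';
}
```

Calling `byte_to_hex(210,buffer)` stores the string `D2`, matching the example above: no division is needed, each digit is just a lookup by 4 bits.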
**Byte frequency/probability**: it may be [interesting](interesting.md) and/or useful (e.g. for [compression](compression.md)) to know how often different byte values appear in the data we process with computers -- indeed, this always DEPENDS; if we are working with plain [ASCII](ascii.md) text, we will never encounter values above 127, and on the other hand if we are processing photos from a polar expedition, we will likely mostly encounter byte values of 255 (as snow will cause most pixels to be completely white). In general we may expect values such as [0](zero.md), 255, [1](one.md) and [2](two.md) to be most frequent, as many times these are e.g. assigned special meanings in data encodings, they may be cutoff values etc. Here is a table of measured byte frequencies in real data: **Byte frequency/probability**: it may be [interesting](interesting.md) and/or useful (e.g. for [compression](compression.md)) to know how often different byte values appear in the data we process with computers -- indeed, this always DEPENDS; if we are working with plain [ASCII](ascii.md) text, we will never encounter values above 127, and on the other hand if we are processing photos from a polar expedition, we will likely mostly encounter byte values of 255 (as snow will cause most pixels to be completely white). In general we may expect values such as [0](zero.md), 255, [1](one.md) and [2](two.md) to be most frequent, as many times these are e.g. assigned special meanings in data encodings, they may be cutoff values etc. Here is a table of measured byte frequencies in real data:
@ -8,7 +12,7 @@ TODO
| type of data | least c. | 2nd least c. | 3rd least c. | 3rd most c. | 2nd most c. | most c. | | type of data | least c. | 2nd least c. | 3rd least c. | 3rd most c. | 2nd most c. | most c. |
| -------------------------- | --------- | ------------ | ------------ | ------------ | ------------- | ------------- | | -------------------------- | --------- | ------------ | ------------ | ------------ | ------------- | ------------- |
| GNU/Linux 64bit executable | 0x9e (0%) | 0xb2 (0%) | 0x9a (0%) | 0x48 (2%) | 0xff (3%) | 0x00 (32%) | | GNU/Linux x86 executable | 0x9e (0%) | 0xb2 (0%) | 0x9a (0%) | 0x48 (2%) | 0xff (3%) | 0x00 (32%) |
| bare metal ARM executable | 0xcf (0%) | 0xb7 (0%) | 0xa7 (0%) | 0xff (2%) | 0x01 (3%) | 0x00 (15%) | | bare metal ARM executable | 0xcf (0%) | 0xb7 (0%) | 0xa7 (0%) | 0xff (2%) | 0x01 (3%) | 0x00 (15%) |
| UTF8 English txt book | 0x00 (0%) | 0x01 (0%) | 0x02 (0%) |0x74 (`t`, 6%)|0x65 (`e`, 8%) |0x20 (` `, 14%)| | UTF8 English txt book | 0x00 (0%) | 0x01 (0%) | 0x02 (0%) |0x74 (`t`, 6%)|0x65 (`e`, 8%) |0x20 (` `, 14%)|
| C source code | 0x00 (0%) | 0x01 (0%) | 0x02 (0%) |0x31 (`1`, 6%)|0x20 (` `, 12%)|0x2c (`,`, 16%)| | C source code | 0x00 (0%) | 0x01 (0%) | 0x02 (0%) |0x31 (`1`, 6%)|0x20 (` `, 12%)|0x2c (`,`, 16%)|
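Frequencies like the ones in the table can be measured with a simple histogram. The following C code is just an illustrative sketch and not necessarily the method used to produce the table; `count_bytes` and `most_common_byte` are names invented here, and reading the data from a file is left out.

```c
#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint8_t */

/* Counts how many times each of the 256 possible byte values occurs in data. */
void count_bytes(const uint8_t *data, size_t length, size_t counts[256])
{
  for (int i = 0; i < 256; ++i)
    counts[i] = 0;

  for (size_t i = 0; i < length; ++i)
    ++counts[data[i]];
}

/* Returns the most frequent byte value; on a tie the lowest value wins. */
uint8_t most_common_byte(const size_t counts[256])
{
  int best = 0;

  for (int i = 1; i < 256; ++i)
    if (counts[i] > counts[best])
      best = i;

  return (uint8_t) best;
}
```

Dividing `counts[i]` by the total length then gives the percentage of each byte value.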


@ -2,7 +2,7 @@
Encyclopedia (also encyclopaedia, cyclopedia or cyclopaedia) is a large [book](book.md) (or a series of books) providing structured summary of wide knowledge in one or many fields (such as [mathematics](math.md), [history](history.md), engineering, general knowledge etc.), usually in a form of alphabetically ordered articles on terms used in the field. Paper encyclopedias are often printed in several volumes as their scope is too great for a single book. The largest and most famous encyclopedia to date is the online [Wikipedia](wikipedia.md) created by volunteers in [free culture](free_culture.md) spirit, however Wikipedia suffers from significant issues such as [censorship](censorship.md), high political propaganda and low quality of writing, therefore it is important to also stay interested in other encyclopedias such as Britannica or [LRS wiki](lrs_wiki.md). Encyclopedia (also encyclopaedia, cyclopedia or cyclopaedia) is a large [book](book.md) (or a series of books) providing structured summary of wide knowledge in one or many fields (such as [mathematics](math.md), [history](history.md), engineering, general knowledge etc.), usually in a form of alphabetically ordered articles on terms used in the field. Paper encyclopedias are often printed in several volumes as their scope is too great for a single book. The largest and most famous encyclopedia to date is the online [Wikipedia](wikipedia.md) created by volunteers in [free culture](free_culture.md) spirit, however Wikipedia suffers from significant issues such as [censorship](censorship.md), high political propaganda and low quality of writing, therefore it is important to also stay interested in other encyclopedias such as Britannica or [LRS wiki](lrs_wiki.md).
**Similar terms:** encyclopedias, which also used to also be called **cyclopedias** in the past, are similar to **dictionaries** and these works often overlap (many encyclopedias call themselves dictionaries); the main difference is that a dictionary focuses on providing linguistic information about the terms and has shorter term definitions, while encyclopedias has longer articles (which however limits the total number of terms it may contain). Encyclopedias are also a subset of so called **reference works**, i.e. works that serve to provide [information](information.md) and reference to it (other kinds of reference works being e.g. world maps or [API](api.md) references). A **universal/general** encyclopedia is one that focuses on human knowledge at wide, as opposed to an encyclopedia that focuses on one specific field of knowledge. **Compendium** can be seen almost as a synonym to encyclopedia, with encyclopedias perhaps usually being more general and extensive. **Similar terms:** encyclopedias, which also used to be called **cyclopedias** in the past, are similar to **dictionaries** and these works often overlap (many encyclopedias call themselves dictionaries); the main difference is that a dictionary focuses on providing linguistic information about the terms and has shorter term definitions, while encyclopedias have longer articles (which however limits the total number of terms it may contain). Encyclopedias are also a subset of so called **reference works**, i.e. works that serve to provide [information](information.md) and reference to it (other kinds of reference works being e.g. world maps or [API](api.md) references). A **universal/general** encyclopedia is one that focuses on human knowledge at wide, as opposed to an encyclopedia that focuses on one specific field of knowledge. **Compendium** can be seen almost as a synonym to encyclopedia, with encyclopedias perhaps usually being more general and extensive.
## Notable/Nice Encyclopedias ## Notable/Nice Encyclopedias


@ -2,6 +2,8 @@
WORK IN PROGRESS WORK IN PROGRESS
{ Most of these I just heard/read somewhere, e.g. on [4chan](4chan.md), in [Jargon File](jargon_file.md) or from [RMS](rms.md), some terms I made myself. ~drummyfish }
| mainstream | correct/cooler | | mainstream | correct/cooler |
| ------------------------------------------ | -------------------------------------- | | ------------------------------------------ | -------------------------------------- |
| [Apple](apple.md) user | iToddler | | [Apple](apple.md) user | iToddler |
@ -18,7 +20,7 @@ WORK IN PROGRESS
| influencer | manipulator | | influencer | manipulator |
| [Intel](intel.md) | [Incel](incel.md) | | [Intel](intel.md) | [Incel](incel.md) |
| [Internet Explorer](internet_explorer.md) | Internet Exploder, Internet Exploiter | | [Internet Explorer](internet_explorer.md) | Internet Exploder, Internet Exploiter |
| [internet of things](iot.md) | internet of stinks | | [Internet of things](iot.md) | Internet of stinks |
| [iPad](ipda.md) | iBad | | [iPad](ipda.md) | iBad |
| [iPhone](iphone.md) | spyPhone | | [iPhone](iphone.md) | spyPhone |
| "left" | [pseudoleft](pseudoleft.md), SJW | | "left" | [pseudoleft](pseudoleft.md), SJW |


@ -4,7 +4,7 @@ Teletext is now pretty much obsolete technology that allowed broadcasting extrem
{ Just checked on my TV and it still works in 2022 here. For me teletext was something I could pretend was "the internet" when I was little and when we didn't have internet at home yet, it was very cool. Back then it took a while to load any page but I could read some basic news or even browse graphical logos for cell phones. Nowadays TVs have buffers and have all the pages loaded at any time so the browsing is instantaneous. ~drummyfish } { Just checked on my TV and it still works in 2022 here. For me teletext was something I could pretend was "the internet" when I was little and when we didn't have internet at home yet, it was very cool. Back then it took a while to load any page but I could read some basic news or even browse graphical logos for cell phones. Nowadays TVs have buffers and have all the pages loaded at any time so the browsing is instantaneous. ~drummyfish }
The principal difference against the [Internet](internet.md) was that teletext was [broadcast](broadcast.md), i.e. it was a one-way communication. Users couldn't send back any data or even request any page, they could only wait and catch the pages that were broadcast by TV stations (this had advantages though, e.g. it couldn't be [DDOSed](ddos.md)). Each station would have its own teletext with fewer than 1000 pages -- the user would write a three place number of the page he wanted to load ("catch") and the TV would wait until that page was broadcast (this might have been around 30 seconds at most), then it would be displayed. The data about the pages were embedded into unused parts of the TV signal. The principal difference against the [Internet](internet.md) was that teletext was [broadcast](broadcast.md), i.e. it was a one-way communication. Users couldn't send back any data or even request any page, they could only wait and catch the pages that were broadcast by TV stations (this had advantages though, e.g. it couldn't be [DDOSed](ddos.md) and it couldn't spy on its users as they didn't send any information back). Each station would have its own teletext with fewer than 1000 pages -- the user would write a three place number of the page he wanted to load ("catch") and the TV would wait until that page was broadcast (this might have been around 30 seconds at most), then it would be displayed. The data about the pages were embedded into unused parts of the TV signal.
The pages allowed fixed-width text and some very blocky graphics, both could be colored with very few basic colors. It looked like something you render in a very primitive [terminal](terminal.md). The pages allowed fixed-width text and some very blocky graphics, both could be colored with very few basic colors. It looked like something you render in a very primitive [terminal](terminal.md).