Miloslav Ciz 2023-04-28 14:03:26 +02:00
parent e5834a1aaf
commit 4497fd70b5
6 changed files with 30 additions and 18 deletions

18
www.md

@ -1,18 +1,22 @@
# World Wide Web
World Wide Web (www or just *the web*) is (or was, if we accept that by 2021 the web is basically dead) a network of interconnected documents on the [Internet](internet.md) (called *websites* or *webpages*). Webpages are normally written in [HTML](html.md) language and can refer to each other by [hyperlinks](hyperlink.md). The web itself works on top of the [HTTP](http.md) protocol. Some people confuse the web with the Internet, but of course those people are retarded: web is just one of many service existing on the Internet (other ones being e.g. [email](email.md) or [torrents](torrent.md)). In order to browse the web you need an Internet connection and a [web browser](browser.md).
World Wide Web (www or just *the web*) is (or was -- by 2023 mainstream web is dead) a network of interconnected documents on the [Internet](internet.md), which we call *websites* or *webpages*. Webpages are normally written in the [HTML](html.md) [language](language.md) and can refer to each other by [hyperlinks](hyperlink.md) ("clickable" links right in the text). The web itself works on top of the [HTTP](http.md) protocol which says how clients and servers communicate. Some people confuse the web with the Internet, but of course those people are retarded: web is just one of many so called services existing on the Internet (other ones being e.g. [email](email.md) or [torrents](torrent.md)). In order to browse the web you need an Internet connection and a [web browser](browser.md).
{ **How to browse the web in the [age of shit](21st_century.md)?** Currently my "workflow" is following: I use the [badwolf](badwolf.md) browser (a super suckless, very fast from-scratch browser that allows turning JavaScript on/off, i.e. I mostly browse [small web](smol_internet.md) without JS but can still do banking etc.) with a **CUSTOM START PAGE** that I completely own and which only changes when I want it to -- this start page is just my own tiny HTML on my disk that has links to my favorite sites (which serves as my suckless "bookmark" system) AND a number of search bars for different search engines (Google, Duckduckgo, Yandex, wiby, Searx, marginalia, Right Dao, ...). This is important as nowadays you mustn't rely on Google or any other single search engine -- I just use whichever engine I deem best for my request at any given time. ~drummyfish }
An important part of the web is also searching its vast amounts of information with [search engines](search_engine.md) such as the infamous [Google](google.md) engine. It also relies on systems such as [DNS](dns.md).
Web is kind of a bloated [shit](shit.md), for more [suckless](suckless.md) alternatives see [gopher](gopher.md) and [gemini](gemini.md).
Mainstream web is now EXTREMELY bloated and practically unusable, for more [suckless](suckless.md) alternatives see [gopher](gopher.md) and [gemini](gemini.md). See also [smol web](smol_internet.md).
The web is perhaps the best, saddest and funniest example of [capitalist](capitalist_software.md) [bloat](bloat.md), the situation with web sites is completely ridiculous and depressive. A nice article about the issue, called *The Website Obesity Crisis*, can be found at https://idlewords.com/talks/website_obesity.htm. There is a tool for measuring a website bloat at https://www.webbloatscore.com/: it computes the ratio of the page size to the size of its screenshot (e.g. [YouTube](youtube.md) currently scores 35.7).
The web used to be perhaps the greatest part of the Internet, the thing that made the Internet widespread, however it quickly deteriorated by capitalist mainstreamization and commercialization and by now, in the 2020s, it is one of the most illustrative, depressing and most hilarious examples of [capitalist](capitalist_software.md) [bloat](bloat.md). A nice article about the issue, called *The Website Obesity Crisis*, can be found at https://idlewords.com/talks/website_obesity.htm. There is a tool for measuring a website's bloat at https://www.webbloatscore.com/: it computes the ratio of the page size to the size of its screenshot (e.g. [YouTube](youtube.md) currently scores 35.7).
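The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the actual implementation of the tool; the function name `web_bloat_score` and the byte counts in the usage note are made up for the example.

```python
def web_bloat_score(page_bytes, screenshot_bytes):
    """Ratio of total page size to the size of a screenshot of the page.

    Both sizes are in bytes. The name and signature are hypothetical,
    chosen only for this sketch.
    """
    if screenshot_bytes <= 0:
        raise ValueError("screenshot size must be positive")
    return page_bytes / screenshot_bytes
```

For instance `web_bloat_score(2_000_000, 500_000)` gives `4.0`, i.e. the page would be four times bigger than a plain image of itself.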
Back in the days (90s and early 2000s) web used to be a place of freedom working more or less in a decentralized manner and on anarchist principles people used to have their own unique websites, censorship was difficult to implement and mostly non-existent and websites used to have a much better design and were safer, as they were pure [HTML](html.md) documents.
## How It Went To Shit
As the time went web used to become more and more [shit](shit.md), as is the case with everything touched by [capitalism](capitalist_software.md) the advent of so called **web 2.0** brought about a lot of [complexity](complexity.md), websites started to incorporate runnable scripts ([JavaScript](javascript.md), [Flash](flash.md)) which lead to many negative things such as security vulnerabilities (web pages now have power to run code) and more complexity in web browsers, which leads to even more possible vulnerabilities, [bloat](bloat.md) and to browser monopolies (greater effort is needed to develop a browser, making it a privilege of those who can afford it, and those can subsequently dictate de-facto standards that further strengthen their monopolies). Another disaster came with **[social networks](social_network.md)** in mid 2000s, most notably [Facebook](facebook.md) but also [YouTube](youtube.md) and others, which centralized the web and rid people of control. Out of comfort people stopped creating and hosting own websites and rather created a page on Facebook. This gave the power to corporations and allowed **mass-surveillance**, **mass-censorship** and **propaganda brainwashing**. As the web became more and more popular, corporations and governments started to take more control over it, creating technologies and laws to make it less free. By 2020, the good old web is but a memory, everything is controlled by corporations, infected with billions of unbearable ads, [DRM](drm.md), malware (trackers, [crypto](crypto.md) miners), there exist no good web browsers, web pages now REQUIRE JavaScript even if it's not really needed due to which they are painfully slow and buggy, there are restrictive laws and censorship and de-facto laws (site policies) put in place by corporations controlling the web.
Back in the days (90s and early 2000s) web used to be a place of freedom working more or less in a decentralized manner, on [anarchist](anarchism.md) and often even [communist](communism.md) principles -- people used to have their own unique websites where they shared freely and openly, [censorship](censorship.md) was difficult to implement and mostly non-existent and websites used to have a much better design, were [KISS](kiss.md), safer, "open" (no paywalls, registration walls, country blocks, [DRM](drm.md), ...), MUCH faster and more robust as they were pure [HTML](html.md) documents. It was also the case that most websites were truly nice, useful and each one had a "soul" as they were usually made by passionate nerds who had a creative freedom and true desires to create a nice website (yes, even if they were making a commercial website for some company).
As time marched on the web became more and more [shit](shit.md), as is the case with everything touched by the [capitalist](capitalist_software.md) hand -- the advent of so called **web 2.0** brought about a lot of [complexity](complexity.md), websites started to incorporate client-side scripts ([JavaScript](javascript.md), [Flash](flash.md), [Java](java.md) applets, ...) which led to many negative things such as incompatibility with browsers (kickstarting browser consumerism and [update culture](update_culture.md)), performance loss and security vulnerabilities (web pages now became Turing complete programs rather than mere documents) and more complexity in web browsers, which leads to immense [bloat](bloat.md) and browser [monopolies](bloat_monopoly.md) (greater effort is needed to develop a browser, making it a privilege of those who can afford it, and those can subsequently dictate de-facto standards that further strengthen their monopolies). Another disaster came with **[social networks](social_network.md)** in the mid 2000s, most notably [Facebook](facebook.md) but also [YouTube](youtube.md), [Twitter](twitter.md) and others, which centralized the web and rid people of control. Out of comfort people stopped creating and hosting their own websites and rather created a page on Facebook. This gave the power to corporations and allowed **mass-surveillance**, **mass-censorship** and **propaganda brainwashing**. As the web became more and more popular, corporations and governments started to take more control over it, creating technologies and laws to make it less free.
By 2020, the good old web is but a memory and a hobby of a few boomers, everything is controlled by corporations, infected with billions of unbearable ads, [DRM](drm.md), malware (trackers, [crypto](crypto.md) miners, ...), there exist no good web browsers, web pages now REQUIRE JavaScript even if it's not needed in principle due to which they are painfully slow and buggy, there are restrictive laws and censorship and de-facto laws (site policies) put in place by corporations controlling the web.
Mainstream web is quite literally unusable nowadays. What people searched for on the web they now search for on a handful of platforms like Facebook and YouTube (often not even using a web browser but rather a mobile "[app](app.md)"); if you try to "google" something, what you get is just a list of unusable sites written by [AIs](ai.md) that load for several minutes (unless you have the latest 1024 TB RAM beast) and won't let you read beyond the first paragraph without registration. These sites are uplifted by [SEO](seo.md) for pure commercial reasons, they contain no useful information, just ads. Useful sites are buried under several millions of unusable results or downright censored for political reasons (e.g. using some forbidden word). Thankfully you can still try to browse the [smol web](smol_internet.md) with search engines such as [wiby](wiby.md), but still that only gives a glimpse of what the good old web used to be.
## History
@ -39,9 +43,9 @@ The webpages are stored on web [servers](server.md), i.e. computers specialized
When a user enters a URL of a page into the browser, the following happens (it's kind of simplified, there are [caches](cache.md) etc.):
1. The [domain](domain.md) name (e.g. `www.mysite.org`) is converted into an [IP](ip.md) address of the server the site is hosted on. This is done by asking a [DNS](dns.md) server -- these are special servers that hold the database mapping domain names to IP addresses (when you buy a domain, you can edit its record in this database to make it point to whatever address you want).
2. The browser sends a request for given page to the IP address of the server. This is done via [HTTP](http.md) (or [HTTPS](https.md) in the encrypted case) protocol -- this protocol is a language via which web servers and clients talk (it can contain additional data like passwords entered on the site etc.). (If the encrypted HTTPS protocol is used, encryption is performed with [asymmetric cryptography](asymmetric_cryptography.md) using the server's public key whose digital signature additionally needs to be checked with some certificate authority.) This request is delivered to the server by the mechanisms and lower network layers of the [Internet](internet.md), typically [TCP](tcp.md)/[IP](ip.md).
2. The browser sends a request for given page to the IP address of the server. This is done via [HTTP](http.md) (or [HTTPS](https.md) in the encrypted case) protocol (that's the `http://` or `https://` in front of the domain name) -- this protocol is a language via which web servers and clients talk (besides websites it can communicate additional data like passwords entered on the site, [cookies](cookie.md) etc.). (If the encrypted HTTPS protocol is used, encryption is performed with [asymmetric cryptography](asymmetric_cryptography.md) using the server's public key whose digital signature additionally needs to be checked with some certificate authority.) This request is delivered to the server by the mechanisms and lower network layers of the [Internet](internet.md), typically [TCP](tcp.md)/[IP](ip.md).
3. The server receives the request and sends back the webpage embedded again in an [HTTP](http.md) response, along with other data such as the error/success code.
4. Client browser receives the page and displays it. If the page contains additional resources that are needed for displaying the page, such as images, they are automatically retrieved the same way.
4. Client browser receives the page and displays it. If the page contains additional resources that are needed for displaying the page, such as images, they are automatically retrieved the same way (of course things like [caching](cache.md) may be employed so that the same image doesn't have to be redownloaded literally every time).
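The steps above can be sketched as a tiny Python client. This is a simplified illustration under several assumptions, not how a real browser works: it speaks only plain HTTP (no HTTPS), ignores caches, redirects and chunked transfer encoding, and the function names `build_request`, `parse_response` and `fetch` are made up for the example.

```python
import socket


def build_request(host, path="/"):
    """Step 2: build a minimal HTTP/1.1 GET request as raw bytes."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def parse_response(raw):
    """Step 3: split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host, path="/", port=80):
    """Steps 1 to 3 over plain HTTP; needs an Internet connection."""
    ip = socket.gethostbyname(host)  # step 1: ask DNS for the IP address
    with socket.create_connection((ip, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))  # step 2: send the request
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection, response complete
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))
```

Calling `fetch("www.mysite.org")` (the example domain from step 1) would return the status code, the response headers and the raw page bytes; step 4, i.e. rendering the page and fetching its images, is what the browser then does on top of this.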
[Cookies](cookie.md), small files that sites can store in the user's browser, are used on the web to implement stateful behavior (e.g. remembering if the user is signed in on a forum). However cookies can also be abused for tracking users, so they can be turned off.
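How a cookie carries state between requests can be sketched with Python's standard `http.cookies` module. This is only a minimal illustration; the cookie name `signed_in` and the helper names `make_set_cookie` and `read_cookie` are made up for the example.

```python
from http.cookies import SimpleCookie


def make_set_cookie(name, value):
    """Server side: build the response header that asks the browser to store a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"  # cookie applies to the whole site
    return cookie.output(header="Set-Cookie:")


def read_cookie(cookie_header, name):
    """Server side: read a value from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None
```

The server would send the result of `make_set_cookie("signed_in", "yes")` once, and on later requests `read_cookie` applied to the browser's `Cookie` header tells it whether the user is signed in. The same mechanism is what makes tracking possible, since the stored value is sent back with every request to that site.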