less_retarded_wiki/www.md
2023-02-18 16:03:01 +01:00

65 lines
12 KiB
Markdown
Raw Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# World Wide Web
World Wide Web (www or just *the web*) is (or was, if we accept that by 2021 the web is basically dead) a network of interconnected documents on the [Internet](internet.md) (called *websites* or *webpages*). Webpages are normally written in [HTML](html.md) language and can refer to each other by [hyperlinks](hyperlink.md). The web itself works on top of the [HTTP](http.md) protocol. Some people confuse the web with the Internet, but of course those people are retarded: web is just one of many service existing on the Internet (other ones being e.g. [email](email.md) or [torrents](torrent.md)). In order to browse the web you need an Internet connection and a [web browser](browser.md).
{ **How to browse the web in the [age of shit](21st_century.md)?** Currently my "workflow" is following: I use the [badwolf](badwolf.md) browser (a super suckless, very fast from-scratch browser that allows turning JavaScript on/off, i.e. I mostly browse [small web](smol_internet.md) without JS but can still do banking etc.) with a **CUSTOM START PAGE** that I completely own and which only changes when I want it to -- this start page is just my own tiny HTML on my disk that has links to my favorite sites (which serves as my suckless "bookmark" system) AND a number of search bars for different search engines (Google, Duckduckgo, Yandex, wiby, Searx, marginalia, Right Dao, ...). This is important as nowadays you mustn't rely on Google or any other single search engine -- I just use whichever engine I deem best for my request at any given time. ~drummyfish }
An important part of the web is also searching its vast amounts of information with [search engines](search_engine.md) such as the infamous [Google](google.md) engine. It also relies on systems such as [DNS](dns.md).
Web is kind of a bloated [shit](shit.md), for more [suckless](suckless.md) alternatives see [gopher](gopher.md) and [gemini](gemini.md).
The web is perhaps the best, saddest and funniest example of [capitalist](capitalist_software.md) [bloat](bloat.md), the situation with web sites is completely ridiculous and depressive. A nice article about the issue, called *The Website Obesity Crisis*, can be found at https://idlewords.com/talks/website_obesity.htm. There is a tool for measuring a website bloat at https://www.webbloatscore.com/: it computes the ratio of the page size to the size of its screenshot (e.g. [YouTube](youtube.md) currently scores 35.7).
Back in the days (90s and early 2000s) web used to be a place of freedom working more or less in a decentralized manner and on anarchist principles people used to have their own unique websites, censorship was difficult to implement and mostly non-existent and websites used to have a much better design and were safer, as they were pure [HTML](html.md) documents.
As the time went web used to become more and more [shit](shit.md), as is the case with everything touched by [capitalism](capitalist_software.md) the advent of so called **web 2.0** brought about a lot of [complexity](complexity.md), websites started to incorporate runnable scripts ([JavaScript](javascript.md), [Flash](flash.md)) which lead to many negative things such as security vulnerabilities (web pages now have power to run code) and more complexity in web browsers, which leads to even more possible vulnerabilities, [bloat](bloat.md) and to browser monopolies (greater effort is needed to develop a browser, making it a privilege of those who can afford it, and those can subsequently dictate de-facto standards that further strengthen their monopolies). Another disaster came with **[social networks](social_network.md)** in mid 2000s, most notably [Facebook](facebook.md) but also [YouTube](youtube.md) and others, which centralized the web and rid people of control. Out of comfort people stopped creating and hosting own websites and rather created a page on Facebook. This gave the power to corporations and allowed **mass-surveillance**, **mass-censorship** and **propaganda brainwashing**. As the web became more and more popular, corporations and governments started to take more control over it, creating technologies and laws to make it less free. By 2020, the good old web is but a memory, everything is controlled by corporations, infected with billions of unbearable ads, [DRM](drm.md), malware (trackers, [crypto](crypto.md) miners), there exist no good web browsers, web pages now REQUIRE JavaScript even if it's not really needed due to which they are painfully slow and buggy, there are restrictive laws and censorship and de-facto laws (site policies) put in place by corporations controlling the web.
## History
World Wide Web was invented by an English computer scientist [Tim Berners-Lee](berners_lee.md). In 1980 he employed [hyperlinks](hyperlink.md) in a notebook program called ENQUIRE, he saw the idea was good. On March 12 1989 he was working at [CERN](cern.md) where he proposed a system called "web" that would use [hypertext](hypertext.md) to link documents (the term hypertext was already around). He also considered the name *Mesh* but settled on *World Wide Web* eventually. He started to implement the system with a few other people. At the end of 1990 they already had implemented the [HTTP](http.md) protocol for client-server communication, the [HTML](html.md), language for writing websites, the first web server and the first [web browser](browser.md) called *WorldWideWeb*. They set up the first website http://info.cern.ch that contained information about the project.
In 1993 CERN made the web [public domain](public_domain.md), free for anyone without any licensing requirements. The main reason was to gain advantage over competing systems such as [Gopher](gopher.md) that were [proprietary](proprietary.md). By 1994 there were over 500 web servers around the world. WWW Consortium ([W3M](w3m.md)) was established to maintain standards for the web. A number of new browsers were written such as the text-only [Lynx](lynx.md), but the [proprietary](proprietary.md) [Netscape Navigator](netscape_navigator.md) would go to become the most popular one until [Micro$oft](microsoft)'s [Internet Explorer](internet_explorer.md) (see [browser wars](browser_wars.md)). In 1997 [Google](google.md) search engine appeared, as well as [CSS](css.md). There was a economic bubble connected to the explosion of the Web called the [dot-comm boom](dot_com_boom.md).
Between 2000 and 2010 there used to be a mobile alternative to the web called [WAP](wap.md). Back then mobile phones were significantly weaker than PCs so the whole protocol was simplified, e.g. it had a special markup language called [WML](wml.md) instead of [HTML](html.md). But as the phones got more powerful they simply started to support normal web and WAP disappeared.
Around 2005, the time when [YouTube](youtube.md), [Twitter](twitter.md), [Facebook](facebook.md) and other shit sites started to appear and become popular, so called [Web 2.0](web_20.md) started to form. This was a shift in the web's paradigm towards more [shittiness](shit.md) such as more [JavaScript](javascript.md), [bloat](bloat.md), interactivity, websites as programs, [Flash](flash.md), [social networks](social_network.md) etc. This would be the beginning of the web's downfall.
## How It Works
It's all pretty well known, but in case you're a nub...
Users browse the Internet using [web browsers](browser.md), programs made specifically for this purpose. Pages on the [Internet](internet.md) are addressed by their [URL](url.md), a kind of textual address such as `http://www.mysite.org/somefile.html`. This address is entered into the web browser, the browser retrieves it and displays it.
A webpage can contain text, pictures, graphics and nowadays even other media like video, audio and even programs that run in the browser. Most importantly webpages are [hypertext](hypertext.md), i.e. they may contain clickable references to other pages -- clicking a link immediately opens the linked page.
The page itself is written in [HTML](html.md) language (not really a [programming](programming.md), more like a file format), a relatively simple language that allows specifying the structure of the text (headings, paragraphs, lists, ...), inserting links, images etc. In newer browsers there are additionally two more important languages that are used with websites (they can be embedded into the HTML file or come in separate files): [CSS](css.md) which allows specifying the look of the page (e.g. text and font color, background images, position of individual elements etc.) and [JavaScript](js.md) which can be used to embed [scripts](script.md) (small [programs](program.md)) into webpages which will run on the user's computer (in the browser). These languages combined make it possible to make websites do almost anything, even display advanced 3D graphics, play movies etc. However, it's all huge [bloat](bloat.md), it's pretty slow and also dangerous, it was better when webpages used to be HTML only.
The webpages are stored on web [servers](server.md), i.e. computers specialized on listening for requests and sending back requested webpages. If someone wants to create a website, he needs a server to host it on, so called [hosting](hosting.md). This can be done by setting up one's own server -- so called [self hosting](self_hosting.md) -- but nowadays it's more comfortable to buy a hosting service from some company, e.g. a [VPS](vps.md). For running a website you'll also want to buy a web [domain](domain.md) (like `mydomain.com`), i.e. the base part of the textual address of your site (there exist free hosting sites that even come with free domains if you're not picky, just search...).
When a user enters a URL of a page into the browser, the following happens (it's kind of simplified, there are [caches](cache.md) etc.):
1. The [domain](domain.md) name (e.g. `www.mysite.org`) is converted into an [IP](ip.md) address of the server the site is hosted on. This is done by asking a [DNS](dns.md) server -- these are special servers that hold the database mapping domain names to IP addresses (when you buy a domain, you can edit its record in this database to make it point to whatever address you want).
2. The browser sends a request for given page to the IP address of the server. This is done via [HTTP](http.md) (or [HTTPS](https.md) in the encrypted case) protocol -- this protocol is a language via which web servers and clients talk (it can contain additional data like passwords entered on the site etc.). (If the encrypted HTTPS protocol is used, encryption is performed with [asymmetric cryptography](asymmetric_cryptography.md) using the server's public key whose digital signature additionally needs to be checked with some certificate authority.) This request is delivered to the server by the mechanisms and lower network layers of the [Internet](internet.md), typically [TCP](tcp.md)/[IP](ip.md).
3. The server receives the request and sends back the webpage embedded again in an [HTTP](http.md) response, along with other data such as the error/success code.
4. Client browser receives the page and displays it. If the page contains additional resources that are needed for displaying the page, such as images, they are automatically retrieved the same way.
[Cookies](cookie.md), small files that sites can store in the user's browser, are used on the web to implement stateful behavior (e.g. remembering if the user is signed in on a forum). However cookies can also be abused for tracking users, so they can be turned off.
Other programming languages such as [PHP](php.md) can also be used on the web, but they are used for server-side programming, i.e. they don't run in the web browser but on the server and somehow generate and modify the sites for each request specifically. This makes it possible to create dynamic pages such as [search engines](search_engine.md) or [social networks](social_network.md).
## See Also
- [Dark Web](dark_web.md)
- [Dork Web/Smol Internet](smol_internet.md)
- [teletext](teletext.md)
- [minitel](minitel.md)
- [WAP](wap.md)
- [Usenet](usenet.md)
- [TOR](tor.md)
- [Gopher](gopher.md)
- [Gemini](gemini.md)
- [BBS](bbs.md)
- [Freenet](freenet.md)
- [IPFS](ipfs.md)
- [Internet](internet.md)
- [Kwangmyong](kwangmyong.md)