HTML
HTML, short for Hypertext Markup Language, is a relatively simple computer language for describing documents with hyperlinks ("clickable pointers to other such documents"), serving to create websites on the World Wide Web. This makes it the most basic language of the web: it's the text format in which websites are transferred over the Internet. HTML is NOT a programming language, just one for describing documents -- it contains the text of the website along with special tags marking parts of it as paragraphs, headings etc. HTML is easy, anyone can learn it.
Going by traditional definitions, HTML is NOT a programming language because it doesn't express algorithms, just the structure and content of a document (webpage), so boasting about being an "HTML programmer" only results in cringe and embarrassment. Still, under a more liberal definition of a "programming language" (such as the one used in esolang circles) it IS possible to claim HTML is a sort of programming language, specifically a declarative one that's not Turing complete. But this is like stretching the definition of "music" so that it includes any kind of audible noise, like farting for example, so that anyone who farts can be called a musician.
History and context: HTML came to be as a part of the World Wide Web framework created around 1990 by Tim Berners-Lee. Later on it got standardized every few years or so; HTML5 became a W3C recommendation in 2014 and nowadays the language is maintained as a continuously updated "living standard" by WHATWG. In syntax HTML is similar to another widely popular language called XML. This is due to both languages descending from SGML, a standard for markup languages. HTML and XML differ, however, in both syntax and semantics (unlike with XML, HTML tag names are case insensitive, some closing tags may be omitted, the semantics of tags is predefined etc.), and so in general HTML and XML require different parsers and libraries. There was once an effort to make a version of HTML conforming to XML rules, so called XHTML, but it was kind of fruitless as hardly anyone adopted it.
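To make the syntax difference a bit more concrete, here is a small sketch (the list content is made up for illustration): the same fragment written first the loose way HTML allows and then the strict way XML rules would demand.

```html
<!-- Plain HTML: tag names are case insensitive, the closing
     </li> tags may be left out and <br> has no end tag. -->
<UL>
  <LI>first item
  <li>second item<BR>with a line break
</UL>

<!-- The same thing under XML rules (as in XHTML): lowercase names,
     every element explicitly closed, empty ones written as <br />. -->
<ul>
  <li>first item</li>
  <li>second item<br />with a line break</li>
</ul>
```

An HTML parser accepts both versions, while an XML parser only accepts the second one, which is why the two languages generally need different tools.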
HTML can be mixed with other web languages, namely CSS and JavaScript. JavaScript is a terrible scripting language for embedding sneaky, automatically executed programs into the HTML document, such as crypto miners, keyloggers, bloat and other malware, so good programmers consider use of JavaScript a very bad practice and henceforth we'll just ignore it. CSS serves to give the HTML document a specific visual style, for instance to specify concrete fonts, background color, paragraph spacing etc. In its beginning HTML actually contained its own ways of manipulating the visual appearance of the document (and for backwards compatibility still does), but later on a new paradigm was adopted, stating that HTML should only define the "structure and content" of the document, while its appearance would be dictated separately by another language. CSS is crap too, but using it correctly and moderately is justifiable, i.e. as long as the CSS is light and the document stays fine when the style is removed, everything's cool.
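A minimal sketch of what "light" CSS could look like, assuming it's placed in a style element inside the head (the concrete values such as the font and colors are arbitrary picks, not a recommendation):

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Lightly Styled Page</title>
    <style>
      /* A few rules only; the document doesn't depend on any of them. */
      body { font-family: sans-serif; background-color: #eeeeee; }
      p    { margin-bottom: 1.5em; } /* paragraph spacing */
      h1   { color: #223366; }
    </style>
  </head>
  <body>
    <h1>Styled Heading</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
```

Delete the style element and the page still reads fine, just with the browser's default look, which is the test of whether the CSS stayed moderate.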
Is HTML bloat? Is it acceptable? Strictly speaking it's neither the most minimal language, nor the most elegant one, but it definitely leans towards the KISS end of the spectrum, i.e. it is completely acceptable and usable, especially when limited to a subset of the most commonly used tags. Nicely made HTML can relatively easily be auto-converted to other formats too, so in the end it doesn't matter too much whether a document is in HTML or Markdown or whatever. Unfortunately the vast majority of websites nowadays are not nice HTML, but this is due to incompetent developers. HTML's advantage is mainly in its historical status as the most widely supported common denominator of the web -- a plain HTML page can be viewed in EVERY web browser, and of course in the end it's even human readable. HTML is incomparably simpler and more suckless when contrasted with formats such as PDF, LaTeX or MS Word, but formats such as Markdown or even plaintext ASCII txt are indeed yet a lot simpler and more often than not better than HTML. Full HTML compliance is bloat of course, but the same probably holds even for Markdown. To sum up: using HTML is cool if we do it well.
Example
HTML is literally easy as fuck, here's more or less how it works:
The whole HTML document (webpage) is just a text file with the .html extension. So to make a page, create an empty file, name it mypage.html and open it with a text editor (gedit, vim, emacs or whatever), then start editing it. To see the result just open the same file in any web browser (drag-and-drop should just work), then after every edit refresh the page. NOTE: the default page of a website is by convention named index.html (most web servers serve it when no file is specified), so name your main page like this.
PRO TIP: When you're done making the page, always validate it! Browsers tolerate errors and will show the page even if it's faulty, but stupider browsers may not handle it, so you want to make sure there are actually no errors. Just look up "HTML validator" on the web.
Now for the content of the HTML itself. The language works with so called tags. A tag named abc starts with <abc> and ends with </abc>, potentially having some text in between, for example <abc> something </abc>. Tags can also have attributes, e.g. <abc something="somevalue">. The names of tags and their possible attributes are predefined, they can be looked up on the Internet, but most of the important ones are demonstrated by the example below. Tags may also be nested and some may not require an end tag. Whitespace mostly doesn't matter, so you can indent the code however you like. Multiple whitespace characters in text will be reduced to just one space, so you can break longer text into multiple lines. That's basically it. The rest will be demonstrated by an example (just copy paste it and play around with it):
<!DOCTYPE HTML> <!-- Must be here so that programs know this is HTML. -->
<!-- This is a comment, programs ignore it. You can sign yourself here etc. -->
<html> <!-- Must be here. -->
  <head> <!-- Holds meta information. -->
    <title> Cool Site </title> <!-- Name (for bookmarks etc.). -->
    <meta charset="utf-8">
  </head>
  <body> <!-- Actual content goes here. -->
    <h1> Awesome Webpage </h1> <!-- Level 1 heading (biggest). -->
    <p> Welcome to this amazing page. </p> <!-- Paragraph of text. -->
    <p>
      Another paragraph with more text. It can span
      multiple lines, all will be displayed as a
      continuous text.
    </p>
    <br> <!-- Adds a newline. -->
    <p>
      What if we want to render the less than/greater than symbols? It's done
      like this: &lt; &gt;.
      <b>This text is bold</b> and <i>this one is italics</i>.
      <a href="https://www.tastyfish.cz">This</a> is a link to some other page.
      And <a href="#morestuff">this</a> links to a heading below.
      We can also create<sub>subscripts</sub> and<sup>superscripts</sup>.
      Now let's include an image of a cat:
    </p>
    <img src="https://opengameart.org/sites/default/files/catfree.png" alt="cat image">
    <h2 id="morestuff"> More Stuff </h2> <!-- Level 2 heading (smaller). -->
    <table>
      <tr> <th> column 1 </th> <th> column 2 </th> </tr>
      <tr> <td> value 1 </td> <td> value 2 </td> </tr>
      <tr> <td> value 3 </td> <td> value 4 </td> </tr>
    </table>
    <hr> <!-- Horizontal line. -->
    <pre>
Preformatted text, usually used for code and ASCII
art. Will use monospace font and preserve all
whitespaces, which is why we can't indent it like
the other stuff.
</pre>
  </body>
</html>
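To illustrate the validation tip from above, here's a made-up snippet with the kind of mistakes browsers typically render without complaint but a validator reports, followed by a fixed version. The file name cat.png is just a placeholder.

```html
<!-- Faulty: tags closed in the wrong order, image without alt text. -->
<p>This is <b>bold and <i>italic</b></i> text.</p>
<img src="cat.png">

<!-- Fixed: tags properly nested, alt attribute added. -->
<p>This is <b>bold and <i>italic</i></b> text.</p>
<img src="cat.png" alt="cat image">
```

Both versions will most likely look the same in a mainstream browser, which is exactly why checking with a validator is worth the few seconds.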