Update

2024-02-13 17:12:51 +01:00 · 2024-02-13 17:12:51 +01:00 · a5acdddb82
commit a5acdddb82
parent b3106e1ec8
16 changed files with 1921 additions and 1672 deletions
--- a/programming_language.md
+++ b/programming_language.md
@@ -17,7 +17,7 @@ int square(int x)
  return x * x;
 }

-int main()
+int main(void)
 {
  for (int i = 0; i < 5; ++i)
    printf("%d squared is %d\n",i,square(i));
@@ -39,11 +39,16 @@ Which prints:
 We divide programming languages into different groups. Perhaps the most common division is into two groups:

 - **compiled** languages: Meant to be transformed by a [compiler](compiler.md) to a [native](native.md) (directly executable) binary program, i.e. before running the program we have to run it through the process of compilation into runnable form. These languages are typically more efficient but usually more difficult to program in, less flexible and the compiled programs are non-portable (can't just be copy-pasted to another computer with different [architecture](isa.md) and expected to run; note that this doesn't mean compiled languages aren't [portable](portability.md), just that the compiled EXECUTABLE is not). These languages are usually [lower level](low-level), use static and strong [typing](typing.md) and more of manual [memory management](memory_management.md). Examples: [C](c.md), [C++](cpp.md), [go](go.md), [Haskell](haskell.md) or [Pascal](pascal.md).
-- **interpreted** languages: Meant to be interpreted by an [interpreter](interpreter.md) "on-the-go", i.e. what we write we can also immediately run; these languages are often used for [scripting](scripting.md). To run such program you need the interpreter of the language installed on your computer and this interpreter reads the [source code](source_code.md) as it is written and performs what it dictates (well, this is actually simplified as the interpreter normally also internally does a kind of quick "lightweight" compilation, but anyway...). These languages are generally less efficient (slower, use more RAM) but also more flexible, easier to program in and [independent of platforms](platform_independent.md). These languages usually [higher-level](high_level.md), use weak and dynamic [typing](typing.md) and automatic [memory management](memory_management.md) ([garbage collection](garbage_collection.md), ...). Examples: [Python](python.md), [Perl](perl.md), [JavaScript](js.md) and [BASH](bash.md).
+- **interpreted** languages: Meant to be interpreted by an [interpreter](interpreter.md) "on-the-go", i.e. what we write we can also immediately run; these languages are often used for **[scripting](scripting.md)**. To run such a program you need the interpreter of the language installed on your computer and this interpreter reads the [source code](source_code.md) as it is written and performs what it dictates (well, this is actually simplified as the interpreter normally also internally does a kind of quick "lightweight" compilation, but anyway...). These languages are generally less efficient (slower, use more RAM) but also more flexible, easier to program in and [independent of platforms](platform_independent.md). These languages are usually [higher-level](high_level.md), use weak and dynamic [typing](typing.md) and automatic [memory management](memory_management.md) ([garbage collection](garbage_collection.md), ...). Examples: [Python](python.md), [Perl](perl.md), [JavaScript](js.md) and [BASH](bash.md).

 Sometimes the distinction here may not be completely clear, for example Python is normally considered an interpreted language but it can also be compiled into [bytecode](bytecode.md) and even native code. [Java](java.md) is considered more of a compiled language but it doesn't compile to native code (it compiles to bytecode). [C](c.md) is traditionally a compiled language but there also exist C interpreters. [Comun](comun.md) is meant to be both compiled and interpreted etc.

-We can divide language in many more ways, for example based on their **[paradigm](paradigm.md)** (roughly its core idea/model/"philosophy", e.g. [impertaive](imperative.md), [declarative](declarative.md), [object-oriented](oop.md), [functional](functional.md), [logical](logical.md), ...), **purpose** (general purpose, special purpose), computational power ([turing complete](turing_complete.md) or weaker), level of **[abstraction](abstraction.md)** (high, low), [typing](data_type.md) (strong, weak, dynamic, static) or function evaluation (strict, lazy).
+Another common division is by **level of [abstraction](abstraction.md)**, roughly into the following two groups (keep in mind the transition is gradual and depends on context; the line between low and high level is extremely fuzzy):
+
+- **[low level](low_level.md)**: Languages which are so called "closer to [hardware](hardware.md)" ("glorified [assembly](assembly.md)"), using little to no abstraction (reflecting more how a computer actually works under the hood without adding too many artificial concepts above it, allowing direct access to memory with [pointers](pointer.md), ...; for this they very often use the plain [imperative](imperative.md) paradigm), being less comfortable (requiring the programmer to do many things manually), less flexible, less safe (allowing shooting oneself in the foot). However (because [less is more](less_is_more.md)) they have a great many advantages, e.g. being [simple](kiss.md) to implement (and so more [free](freedom.md)) and **greatly efficient** (being fast, memory efficient, ...). One popular definition is also that "a low level language is that which requires paying attention to the irrelevant". Low level languages are **typically compiled** (but it doesn't have to be so). Where exactly low level languages end is highly subjective, many say [C](c.md), [Fortran](fortran.md), [Forth](forth.md) and similar languages are low level (normally when discussing them in context of new, very high level languages), others (mainly the older programmers) say only [assembly](assembly.md) languages are low level and some will even say only [machine code](machine_code.md) is low level.
+- **[high level](high_level.md)**: Languages with a higher level of abstraction than low level ones -- they are normally more complex (though not always), interpreted (again, not necessarily), comfortable, dynamically typed, beginner friendly, "safe" (having various safety mechanisms, automatic checks, automatic memory management such as [garbage collection](garbage_collection.md)) etc. For all this they are typically slower, less memory efficient, and just more [bloated](bloat.md). Examples are [Python](python.md) or [JavaScript](js.md).
+
+We can divide languages in many more ways, for example based on their **[paradigm](paradigm.md)** (roughly the core idea/model/"philosophy", e.g. [imperative](imperative.md), [declarative](declarative.md), [object-oriented](oop.md), [functional](functional.md), [logical](logical.md), ...), **purpose** (general purpose, special purpose), computational power ([turing complete](turing_complete.md) or weaker; many definitions of a programming language require Turing completeness), [typing](data_type.md) (strong, weak, dynamic, static) or function evaluation (strict, lazy).

 A computer language consists of two main parts:

@@ -68,7 +73,7 @@ Now that we have a specification, i.e. the idea, someone has to realize it, i.e.

 A new language comes to existence just as other things do -- when there is a reason for it. I.e. if someone feels there is no good language for whatever he's doing or if someone has a brilliant idea and wants to write a PhD thesis or if someone smokes too much weed or if a corporation wants to control some software platform etc., a new language may be made. This often happens gradually (again, like with many things), i.e. someone just starts modifying an already existing language -- at first he just makes a few macros, then he starts making a more complex preprocessor, then he sees it's starting to become a new language so he gives it a name and makes it a new language -- such language may at first just be transpiled to another language (often [C](c.md)) and over time it gets its own full compiler. At first a new language is written in some other language, however most languages aim for **[self hosted](self_hosting.md) implementation**, i.e. being written in itself. This is natural and has many advantages -- a language written in itself proves its maturity, it becomes independent and as it itself improves, so does its own compiler. Self hosting a language is one of the greatest milestones in its life -- after this the original implementation in the other language often gets deleted as it would just be a burden to keep [maintaining](maintenance.md) it.

-**So can a language be [bloated](bloat.md)?** Well, yes, if we consider that a very complicated language just cannot be implemented in a simple, non-bloated way -- we can say the language itself is inevitably bloated. It may contain features that will be rarely used, it may be inelegant etc. However many times when referring to language we just refer to its implementation(s). **How to tell if language is bloated?** One can get an idea from several things, e.g. list of features, [paradigm](paradigm.md), size of its implementations, size of the specification, year of creation (newer mostly means more bloat) and so on. However be careful, many of these are just clues, for example small specification may just mean it's vague. Even a small self hosted implementation doesn't have to mean the language is small -- imagine e.g. a language that just does what you write in plain English; such language will have just one line self hosted implementation: "Implement yourself." But to actually [bootstrap](boot.md) the language will be immensely difficult and will require a lot of bloat.
+**So can a language be inherently fast, [bloated](bloat.md) etc.?** When we say a language is fast, bloated, memory efficient and so on, we often refer to its implementations because, as mentioned, a language is just an idea which can be implemented in many ways with different priorities and tradeoffs, so just keep in mind that talking about languages like this usually refers to the implementations. But on the other hand yes, a language CAN itself be seen as inherently having a similar property because it's simply such that its implementations more or less have to have this property. A very complicated language just cannot be implemented in a simple, non-bloated way, an extremely high level and flexible language cannot be implemented to be among the fastest -- so when referring to language implementations we also, to a degree, refer to the language itself, as an implementation reflects the abstract language's properties. **How to tell if a language is bloated?** One can get an idea from several things, e.g. list of features, [paradigm](paradigm.md), size of its implementations, size of the specification, year of creation (newer mostly means more bloat) and so on. However be careful, many of these are just clues, for example a small specification may just mean it's vague. Even a small self hosted implementation doesn't have to mean the language is small -- imagine e.g. a language that just does what you write in plain English; such a language will have just a one line self hosted implementation: "Implement yourself." But to actually [bootstrap](boot.md) the language will be immensely difficult and will require a lot of bloat.

 **Can you use multiple programming languages for one project?** Yes, though it may be a burden, so don't do it just because you can. Combining languages is possible in many ways, e.g. by embedding a [scripting](scripting.md) language into a compiled language, linking together object files produced by different languages, creating different programs that communicate over a network etc.