# Programming Language
A programming language is an artificial [formal](formal_language.md) (mathematically precise) language created to allow humans to relatively easily write [algorithms](algorithm.md) for [computers](computer.md). It basically allows a human to tell a computer what to do very specifically and precisely, but still relatively comfortably. We call a program written in a programming language the program's **[source code](source_code.md)**. Programming languages often try to mimic some human language -- practically always [English](english.md) -- so as to be somewhat close to humans, but a programming language is actually MUCH simpler so that a computer can analyze it and understand it precisely, without ambiguity (computers are extremely bad at understanding actual [human language](human_language.md)), so in the end it all also partially looks like [math](math.md) expressions. A programming language can be seen as a middle ground between pure [machine code](machine_code.md) (the computer's native language, very hard to handle by humans) and natural language (very hard to handle by computers).

For beginners: a programming language is actually much easier to learn than a [foreign language](human_language.md), it will typically have fewer than 100 "words" to learn (out of which you'll mostly use like 10) and once you know one programming language, learning another becomes a breeze because they're all (usually) pretty similar in basic concepts. The hard part may be learning some of the concepts if you encounter them for the first time. This is not to say programming is easy -- it is hard, but not because learning the language would be difficult; learning the language is relatively the easier part of programming, the hard parts are for example designing the program's architecture well, designing good protocols and interfaces, learning the math behind the problems you're solving, creating good mathematical models, optimizing and debugging your program well and so on.
A programming language is distinct from a general computer language by its purpose of expressing algorithms and being used for the creation of [programs](program.md). In other words: there are computer languages that are NOT programming languages (at least in the narrower sense), such as [HTML](html.md), [JSON](json.md) and so on. So you shouldn't be calling yourself a programmer if you're just manually writing a website in HTML, people will laugh at you.

We use programming languages to write two basic types of programs: executable programs (programs that can actually be directly run) and [libraries](library.md) (code that cannot be run on its own but is supposed to be reused in other programs, e.g. a library with mathematical functions, networking, [games](game.md) and so on).

A **simple example** of source code in the [C](c.md) programming language is the following:

```
// simple program computing squares of numbers
#include <stdio.h>

int square(int x)
{
  return x * x;
}

int main(void)
{
  for (int i = 0; i < 5; ++i)
    printf("%d squared is %d\n",i,square(i));

  return 0;
}
```

Which prints:

```
0 squared is 0
1 squared is 1
2 squared is 4
3 squared is 9
4 squared is 16
```

We divide programming languages into different groups. Perhaps the most common division is into two groups:

- **compiled** languages: Meant to be transformed by a [compiler](compiler.md) to a [native](native.md) (directly executable) binary program, i.e. before running the program we have to run it through the process of compilation into a runnable form. These languages are typically more efficient but usually more difficult to program in, less flexible, and the compiled programs are non-portable (they can't just be copy-pasted to another computer with a different [architecture](isa.md) and expected to run; note that this doesn't mean compiled languages aren't [portable](portability.md), just that the compiled EXECUTABLE is not). These languages are usually [lower level](low_level.md), use static and strong [typing](typing.md) and more manual [memory management](memory_management.md). Examples: [C](c.md), [C++](cpp.md), [Go](go.md), [Haskell](haskell.md) or [Pascal](pascal.md).
- **interpreted** languages: Meant to be interpreted by an [interpreter](interpreter.md) "on-the-go", i.e. what we write we can also immediately run; these languages are often used for **[scripting](scripting.md)**. To run such a program you need the interpreter of the language installed on your computer and this interpreter reads the [source code](source_code.md) as it is written and performs what it dictates (well, this is actually simplified as the interpreter normally also internally does a kind of quick "lightweight" compilation, but anyway...). These languages are generally less efficient (slower, use more RAM) but also more flexible, easier to program in and [independent of platforms](platform_independent.md). These languages are usually [higher-level](high_level.md), use weak and dynamic [typing](typing.md) and automatic [memory management](memory_management.md) ([garbage collection](garbage_collection.md), ...). Examples: [Python](python.md), [Perl](perl.md), [JavaScript](js.md) and [BASH](bash.md).

Sometimes the distinction between compiled and interpreted languages is not completely clear, for example Python is normally considered an interpreted language but it can also be compiled into [bytecode](bytecode.md) and even native code. [Java](java.md) is considered more of a compiled language but it doesn't compile to native code (it compiles to bytecode). [C](c.md) is traditionally a compiled language but there also exist C interpreters. [Comun](comun.md) is meant to be both compiled and interpreted, etc. So calling a language interpreted vs compiled is more about what it was designed for, what its priorities are, whether the designers made it highly flexible and friendly for interpreting or whether they rather intended the code to be efficiently compiled into fast and compact native code.

Another common division is by **level of [abstraction](abstraction.md)** roughly to (keep in mind the transition is gradual and depends on context, the line between low and high level is extremely fuzzy):
- **[low level](low_level.md)**: Languages which are so called "closer to [hardware](hardware.md)" ("glorified [assembly](assembly.md)"), using little to no abstraction (reflecting more how a computer actually works under the hood without adding too many artificial concepts above it, allowing direct access to memory with [pointers](pointer.md), ...; for this they very often use the plain [imperative](imperative.md) paradigm), being less comfortable (requiring the programmer to do many things manually), less flexible, less safe (allowing shooting oneself in the foot). However (because [less is more](less_is_more.md)) they have a great many advantages, e.g. being [simple](kiss.md) to implement (and so more [free](freedom.md)) and **greatly efficient** (being fast, memory efficient, ...). One popular definition is also that "a low level language is that which requires paying attention to the irrelevant"; another definition says a low level language is that in which one command usually corresponds to one machine instruction. Low level languages are **typically compiled** (but it doesn't have to be so). Where exactly low level languages end is highly subjective, many say [C](c.md), [Fortran](fortran.md), [Forth](forth.md) and similar languages are low level (normally when discussing them in context of new, very high level languages), others (mainly the older programmers) say only [assembly](assembly.md) languages are low level and some will even say only [machine code](machine_code.md) is low level.
- **[high level](high_level.md)**: Languages with a higher level of abstraction than low level ones -- they are normally more complex (though not always), interpreted (again, not necessarily), comfortable, dynamically typed, beginner friendly, "safe" (having various safety mechanisms, automatic checks, automatic memory management such as [garbage collection](garbage_collection.md)) etc. For all this they are typically slower, less memory efficient, and just more [bloated](bloat.md). Examples are [Python](python.md) or [JavaScript](js.md).

We can divide languages in many more ways, for example based on their **[paradigm](paradigm.md)** (roughly its core idea/model/"philosophy", e.g. [imperative](imperative.md), [declarative](declarative.md), [object-oriented](oop.md), [functional](functional.md), [logical](logical.md), ...), **purpose** (general purpose, special purpose), computational power ([turing complete](turing_complete.md) or weaker, many definitions of a programming language require Turing completeness), [typing](data_type.md) (strong, weak, dynamic, static) or function evaluation (strict, lazy).

A computer language consists of two main parts:
- **[syntax](syntax.md)**: The grammar rules and words, i.e. how the language "looks", what expressions we are allowed to write in it. Syntax says which words can follow other words, if indentation has to follow some rules, how to insert comments in the source code, what format numbers can be written in, what kinds of names variables can have etc. Syntax is the surface part, it's often considered not as important or hard as semantics (e.g. syntax errors aren't really a big deal as the language processor immediately catches them and we correct them easily), but a good design of syntax is nevertheless still very important because that's what the programmer actually deals with a great amount of time.
- **[semantics](semantics.md)**: The meaning of what we write, i.e. semantics says what the syntax actually stands for. E.g. when syntax says it is possible to write `a / b`, semantics says this means the mathematical operation of division and furthermore specifies what *a* and *b* can actually be, what happens if *b* is zero etc. Semantics is the deeper part as firstly it is more difficult to define and secondly it gives the language its [features](feature.md), its power to compute, usability, it can make the language robust or prone to errors, it can make it efficient or slow, easy or hard to compile, optimize etc.

We also commonly divide a language into two main parts:

- **core language**: The basis of a programming language is formed by a relatively small "pure" language whose words and rules are all built-in and hard-wired. They include the most elementary mechanisms such as basic arithmetic operators, elementary [data types](data_type.md) such as numbers and strings, control structures such as *if-then-else* branching and loops, the ability to define [functions](function.md) etc. This core language is like an "engine" of the language, it should be simple and well optimized because everything will be built on top of it. Higher level languages often include in their core what in lower level languages is provided by libraries, e.g. sorting functions, dynamic data types and so on.
- **[standard library](stdlib.md)** (*stdlib*): The language standard traditionally also defines a standard library, i.e. a library that has to come with the language and which will provide certain basic functionality such as user [input/output](io.md), working with files, basic mathematical functions like [square root](sqrt.md) or [sine](sin.md) and so on. These are things that are usually deemed too complex to be part of the language core, things that can already be implemented using the core language and which are so common that they will likely be needed by the majority of programs, so the standard will guarantee to programmers that they will always have these basic libraries at hand (with an exactly specified [API](api.md)). How complex the standard library is depends on each language: some languages have huge standard libraries (which makes it very hard to implement them) and, vice versa, some languages have no standard library.

Besides the standard library there will also exist many third party [libraries](library.md), but these are no longer considered part of the language itself, they are already products of the language.

**What is the best programming language and which one should you learn?** (See also [programming](programming.md).) These are the big questions, the topic of programming languages is infamous for being very [religious](holy_war.md) and different people root for different languages like they do e.g. for [football](football.md) teams. For [minimalists](minimalism.md), i.e. [suckless](suckless.md), [LRS](lrs.md) (us), [Unix](unix.md) people, [Plan9](plan9.md) people etc., the standard language is **[C](c.md)**, which is also probably the most important language in [history](history.md). It is not in the league of the absolutely most minimal and objectively best languages, but it's relatively minimalist (much more than practically any [modern](modern.md) language) and has great advantages such as being one of the absolutely fastest languages, being extremely well established, long tested, supported everywhere, having many compilers etc. But C isn't easy to learn as a first language. Some minimalists also promote [go](golang.md), which is kind of like "new C". Among the most minimal usable languages are traditionally [Forth](forth.md) and [Lisp](lisp.md) which kind of compete for who really is the smallest, then there is also our [comun](comun.md) which is a bit bigger but still much smaller than C. To learn programming you may actually want to start with some ugly language such as [Python](python.md), but you should really aim to transition to a better language later on.

**Can you use multiple programming languages for one project?** Yes, though it may be a burden, so don't do it just because you can. Combining languages is possible in many ways, e.g. by embedding a [scripting](scripting.md) language into a compiled language, linking together object files produced by different languages, creating different programs that communicate over a network etc.

## History
WIP
Very early computers were programmed directly in [machine code](machine_code.md), there weren't even any assemblers and assembly languages around, programmers had to do things like search for opcodes in computer manuals, manually encode data and get this all onto punch cards or, in the better case, use some primitive interface such as the so called "front panel" to program the computer. These kinds of machine languages that were used back then are now called **first generation languages**.

The **first higher level programming language** was probably Plankalkül made by Konrad Zuse some time shortly after 1942, though it didn't run on any computer, it was only in the stage of specification -- an implementation of it would only be made much later, in 1975. It was quite advanced -- it had [functions](function.md), arrays, exceptions and some advanced data structures, though it for example didn't support [recursive](recursion.md) calls. It was important as it planted the seed of an idea of an abstract, higher level, machine independent language.

The **first [assembly](assembly.md) language** was created by Maurice Wilkes and his team for the [EDSAC](edsac.md) computer released in 1949. It used single letters for instructions. Assembly languages are called **second generation languages**, they further help with programming, though still at a very low level. Programmers were now able to write text (as opposed to plain numbers), instructions got human friendlier names and assemblers did some simple but tedious tasks automatically, but it was still pretty tedious to write in assembly and programs were still machine specific, non-portable.

Only the **third generation languages** made the step of adding significant [abstraction](abstraction.md) to achieve a level of comfortable development and portability -- programmers would be able to e.g. write algebraic expressions that would be automatically translated to specific instructions by the language compiler; it would be enough to write the program once and then automatically compile it for different CPUs, without the need to rewrite it. **[Fortran](fortran.md)** is considered to be the first such language, made in 1957 by [IBM](ibm.md). Fortran would develop and change throughout the years, it was standardized and added more "features", it became quite popular and is still used even nowadays, it is known for being very fast. In 1958 John McCarthy started to develop **[Lisp](lisp.md)**, a highly elegant, high level language that would spawn many derivatives and remains very popular even nowadays.

During the late 60s the term [object oriented programming](oop.md) (OOP) appeared, as well as the first languages such as Simula and [Smalltalk](smalltalk.md) that were based on this [paradigm](paradigm.md). Back then it was a rather academic experiment, not really harmful in itself; later on OOP would be seized and abused by capitalists to break computers. In 1964 the language called **[BASIC](basic.md)** appeared that was aimed at making programming easier even for non-professionals -- it would become a very popular language for the home computers. On a similar note, in 1970 **[Pascal](pascal.md)** was created to be an educational language -- some hackers already saw this as too much of a dumbing down of programming languages (see the famous *Real Programmers Don't Use Pascal* essay).

One of the most notable events in the history of programming languages was the invention of the **[C](c.md) language** in 1972 by Dennis Ritchie, who together with Ken Thompson used it as a tool for their [Unix](unix.md) operating system. The early version of C was quite different from today's C but the language as a whole is undoubtedly the most important one in history -- it's not the most elegant one but it achieved the exactly correct mixture of features, simplicity and correct design choices such as allowing freedom and flexibility of implementation that would in turn lead to extreme efficiency and adoption by many, to standardization, further leading to many implementations and their high [optimization](optimization.md) which in turn increased C's popularity yet more and so on. From this point on new languages would typically in one way or another try to iterate on C. Also in 1972 the **first [esoteric programming language](esolang.md)** -- INTERCAL -- was created as a kind of parody language. This would create a dedicated community of people creating similar "funny" languages, which is highly active even today.

In 1978 the Intel 8086 [CPU](cpu.md) was released, giving rise to the **[x86](x86.md) assembly** language -- the assembly that would become perhaps the most widely used one, owing to the popularity of Intel CPUs. In 1979 Bjarne Stroustrup sadly started to work on **[C++](cpp.md)**, a language that would butcher the concept of [object oriented programming](oop.md) introduced by languages like Simula and Smalltalk in a highly twisted, [capitalist way](capitalist_software.md), starting the trend of creating ugly, [bloated](bloat.md) languages focused on profit making.

Just before the 90s, in the year of our Lord 1989, the ANSI C standard (also known as C89) was released -- this is considered one of the best C standards. In 1991 **[Java](java.md)**, a slow, bloated, purely capital-oriented language with FORCED [OOP](oop.md) started to be developed by *Sun Microsystems*. This was a disaster, it would lead to completely fucking up computers forever after. In the same year **[Python](python.md)** -- a dumbed-down language -- appeared, which would also greatly contribute to destroying computer technology in a few decades. Meanwhile after some spark of renewed interest in esoteric languages **[Brainfuck](brainfuck.md)** was made in 1993 and went on to become probably the most popular among esoteric languages -- this was at least one good event. However in 1995 another disaster struck when **[JavaScript](javascript.md)** was announced, this would later on completely destroy the whole [web](www.md). At the end of the 90s, in 1999, the other one of the two best C standards -- C99 -- was released. This basically marks the end of good events in the world of programming languages, with some minor exceptions such as the creation of [comun](comun.md) in 2022.

## More Details And Context
What really IS a programming language -- is it software? Is it a standard? Can a language be [bloated](bloat.md)? How do languages evolve? Where is the exact line between a programming language and a non-programming language? Who makes programming languages? Who "owns" them? Who controls them? Why are there so many and not just one? These are just some of the questions one may ask upon learning about programming. Let's try to quickly answer some of them.

Strictly speaking a programming language is a [formal language](formal_language.md) with [semantics](semantics.md), i.e. just something akin to a "mathematical idea" -- as such it cannot be directly "owned", at least not on the grounds of [copyright](copyright.md), as seems to have been quite strongly established by a few court cases now. However things related to a language can sadly be owned, for example its specifications (official standards describing the language), [trademarks](trademark.md) (the name or logo of the language), implementations (specific software such as the language's compiler), [patents](patent.md) on some ideas used in the implementation etc. Also if a language is very complex, it can be owned practically; typically a corporation will make an extremely complicated language which only 1000 paid programmers can maintain, giving the corporation complete control over the language -- see [bloat monopoly](bloat_monopoly.md) and [capitalist software](capitalist_software.md).

At this point we should start to distinguish between the pure language and its **[implementation](implementation.md)**. As has been said, the pure language is just an idea -- this idea is explained in detail in a so called **language specification**, a document that's kind of a standard that precisely describes the language. A specification is a technical document, it is NOT a tutorial or promotional material or anything like that, its purpose is just to DEFINE the language for those who will be implementing it -- sometimes a specification can be a very official standard made by some standardizing organization (as e.g. with C), other times it may be just a collaborative online document that at the same time serves as the language reference (as e.g. with Lua). In any case it's important to [version](version_numbering.md) the specification just as we version programs, because when the specification changes, the specified language usually changes too (unless it's a minor change such as fixing some typos), so we have to have a way to exactly identify WHICH version of the language we are referring to. Theoretically the specification is the first thing, however in practice we usually have someone e.g. program a small language for internal use in a company, then that language becomes more popular and widespread and only then someone decides to standardize it and make the official specification. A specification describes things like syntax, semantics, conformance criteria etc., often using precise formal tools such as [grammars](grammar.md). It's hugely difficult to make a good specification because one has to decide what depth to go to and even what to purposefully leave unspecified! One would think that it's always better to define as many things as possible, but that's naive -- leaving some things up to the choice of those who will be implementing the language gives them freedom to implement it in a way that's fastest, most elegant or convenient in any other way.

It is possible for a language to exist without official specification -- the language is then basically specified by some of its implementations, i.e. we say the language is "what this program accepts as valid input". Many languages go through this phase before receiving their specification. Language specified purely by one implementation is not a very good idea because firstly such specification is not very readable and secondly, as said, here EVERYTHING is specified by this one program (the language EQUALS that one specific compiler), we don't know where the freedom of implementation is. Do other implementations have to produce exactly the same compiled binary as this one (without being able to e.g. optimize it better or produce binaries for other platforms)? If not, how much can they differ? Can they e.g. use different representation of numbers (may be important for compatibility)? Do they have to reproduce even the same bugs as the original compiler? Do they have to have the same technical limitations? Do they have to implement the same command line interface (without potentially adding improvements)? Etc.
Specification typically gets updated just as software does, it has its own version and so we then also talk about version of the language (e.g. C89, C99, C11, ...), each one corresponding to some version of the specification.
Now that we have a specification, i.e. the idea, someone has to realize it, i.e. program it, make the implementation; this mostly means programming the language's [compiler](compiler.md) or [interpreter](interpreter.md) (or both), and possibly other tools (debugger, optimizer, [transpiler](transpiler.md), etc.). A language can (and often does) have multiple implementations; this happens because some people want to make the language as fast as possible while others e.g. want to rather have a small, [minimalist](minimalism.md) implementation that will run on limited computers, others want an implementation under a different license etc. The first implementation is usually the so called **reference implementation** -- the one that will serve as a kind of authority that shows how the language should behave (e.g. in case it's not clear from the specification) to those who will make newer implementations; here the focus is often on correctness rather than e.g. efficiency or minimalism, though it is often the case that reference implementations are among the best as they're developed for the longest time. Reference implementations guide development of the language itself, they help spot and improve weak points of the language etc. Besides this there are third party implementations, i.e. those made later by others. These may add extensions and/or other modifications to the original language so they spawn **dialects** -- slightly different versions of the language. We may see dialects as [forks](fork.md) of the original language, which may sometimes even evolve into a completely new language over time. Extensions of the languages may sound like a good thing as they add more "comfort" and "features", however they're usually bad as they create a [dependency](dependency.md) and fuck up the standardization -- if someone writes a program in a specific compiler's dialect, the program won't compile under other compilers.

A new language comes to existence just as other things do -- when there is a reason for it. I.e. if someone feels there is no good language for whatever he's doing or if someone has a brilliant idea and wants to write a PhD thesis or if someone smokes too much weed or if a corporation wants to control some software platform etc., a new language may be made. This often happens gradually (again, like with many things), i.e. someone just starts modifying an already existing language -- at first he just makes a few macros, then he starts making a more complex preprocessor, then he sees it's starting to become a new language so he gives it a name and makes it a new language -- such a language may at first just be transpiled to another language (often [C](c.md)) and over time it gets its own full compiler. At first a new language is written in some other language, however most languages aim for a **[self hosted](self_hosting.md) implementation**, i.e. being written in itself. This is natural and has many advantages -- a language written in itself proves its maturity, it becomes independent and as it itself improves, so does its own compiler. Self hosting a language is one of the greatest milestones in its life -- after this the original implementation in the other language often gets deleted as it would just be a burden to keep [maintaining](maintenance.md) it.

**So can a language be inherently fast, [bloated](bloat.md), memory efficient etc.?** When we say a language is so and so, we generally refer to its implementations and our experience from practice because, as explained previously, a language in itself is only an idea that can be implemented in many ways with different priorities and tradeoffs. And not only that; even if we choose specific implementations of languages, the matter of [benchmarking](benchmark.md) and comparing them is very complicated because the results will be highly dependent for example on the hardware architecture we use (some [ISAs](isa.md) have slow branching or lack the divide instruction, some MCUs lack a floating point unit etc., all of which may bias results heavily to either side) AND on the test programs we use (some types of problems may better fit the specialization of one language that will do very well at it while it would do much worse at other types of problems), the way they are written (the problem of choosing idiomatic code vs transliteration, i.e. performance will depend on whether we try to solve the benchmark problem in the way that's natural for the language or the way that's more faithful to the described solution) and what weight we give to each one (i.e. even when using multiple benchmarks, we ultimately have to assign a specific importance to each one). It's a bit like trying to say who the fastest human is -- generally we can pick the top sportsmen in the world but then we're stuck because one will win at sprint while the other one at long distance running and another one at swimming, and if we consider even letting them compete in different clothes, weather conditions and so on, we'll just have to give up. So speaking about languages and their quantitative properties in practice generally means talking about their implementations and the practical experience we have.

HOWEVER, on the other hand, it does make sense to talk about properties of languages as such as well -- a language CAN itself be seen as inherently having some property if it's defined so that its every implementation has to have this property, at least practically speaking. Dynamic typing for example means the language will be generally slower because operations on variables will inevitably require some extra runtime checks of what's stored in the variable. A very complicated language just cannot be implemented in a simple, non-bloated way, an extremely high level and flexible language cannot be implemented to be among the fastest -- so in the end we also partially speak about languages as such because eventually implementations just reflect the abstract language's properties.

**How to tell if a language is bloated?** One can get an idea from several things, e.g. list of features, [paradigm](paradigm.md), size of its implementations, number of implementations, size of the specification, year of creation ([newer](modern.md) mostly means more bloat) and so on. However be careful, many of these are just heuristics, for example a small specification may just mean it's vague. Even a small self hosted implementation doesn't have to mean the language is small -- imagine e.g. a language that just does what you write in plain English; such a language will have just a one line self hosted implementation: "Implement yourself." But to actually [bootstrap](boot.md) the language will be immensely difficult and will require a lot of bloat.

Judging languages may further be complicated by the question of what the language encompasses, because some languages are e.g. built on a relatively small "pure language" core while relying on a huge library, preprocessor, other embedded languages and/or other tools of the development environment coming with the language -- for example [POSIX shell](posix_shell.md) makes heavy use of separate programs, utilities that should come with the POSIX system. Similarly [Python](python.md) relies on its huge library. So sometimes we have to state explicitly what exactly we count as part of the language.
|
|
|
|
## Notable Languages
|
|
|
|
Here is a table of notable programming languages in chronological order (keep in mind a language usually has several versions/standards/implementations, this is just an overview).

| language | minimalist/good? | since | speed | mem. | ~min. selfhos. impl. LOC |DT LOC|spec. (~no stdlib pages)| notes |
| ----------------------- | ---------------- | ----- | ------- | -------- | ------------------------ | ---- | ---------------------- | ----------------------------------------------------------------------- |
|"[assembly](assembly.md)"| **yes but...** | 1947? | | | | | | NOT a single language, non-[portable](portability.md) |
| [Fortran](fortran.md) | **kind of** | 1957 | 1.95 (G)| 7.15 (G) | | | 300, proprietary (ISO) | similar to Pascal, compiled, fast, was used by scientists a lot |
| [Lisp](lisp.md)(s) | **yes** | 1958 | 3.29 (G)| 18 (G) | 100 (judg. by jmc lisp) | 35 | 40 (r3rs) | elegant, KISS, functional, many variants (Common Lisp, Scheme, ...) |
| [Basic](basic.md) | kind of? | 1964 | | | | | | meant both for beginners and professionals, probably efficient |
| [Forth](forth.md) | **YES** | 1970 | | | 100 (judg. by milliforth)| 77 | 200 (ANS Forth) | [stack](stack.md)-based, elegant, very KISS, interpreted and compiled |
| [Pascal](pascal.md) | **kind of** | 1970 | 5.26 (G)| 2.11 (G) | | 59 | 80, proprietary (ISO) | like "educational C", compiled, not so bad actually |
| **[C](c.md)** | **kind of** | 1972 | 1.0 | 1.0 | 10K? (judg. by chibicc) | 49 | 160, proprietary (ISO) | compiled, fastest, efficient, established, suckless, low-level, #1 lang.|
| [Prolog](prolog.md) | maybe? | 1972 | | | | | | [logic](logic.md) paradigm, hard to learn/use |
|[Smalltalk](smalltalk.md)| **quite yes** | 1972 | 47 (G) | 41 (G) | | | 40, proprietary (ANSI) | PURE (bearable kind of) [OOP](oop.md) language, pretty minimal |
| [C++](cpp.md) | no, bearable | 1982 | 1.18 (G)| 1.27 (G) | | 51 | 500, proprietary | bastard child of C, only adds [bloat](bloat.md) ([OOP](oop.md)), "games"|
| [Ada](ada.md) | ??? | 1983 | | | | | | { No idea about this, sorry. ~drummyfish } |
| Object Pascal | no | 1986 | | | | | | Pascal with OOP (like what C++ is to C), i.e. only adds bloat |
| Objective-C | probably not | 1986 | | | | | | kind of C with Smalltalk-style "pure" objects? |
| [Oberon](oberon.md) | kind of? | 1987 | | | | | | simplicity as goal, part of project Oberon |
| [Perl](perl.md) | rather not | 1987 | 77 (G) | 8.64 (G) | | | | interpreted, focused on strings, has kinda cult following |
| [Bash](bash.md) | well | 1989 | | | | | | Unix scripting shell, very ugly syntax, not so elegant but bearable |
| [Haskell](haskell.md) | **kind of** | 1990 | 5.02 (G)| 8.71 (G) | | | 150, proprietary | [functional](functional.md), compiled, acceptable |
| [Python](python.md) | NO | 1991 | 45 (G) | 7.74 (G) | | 32 | 200? (p. lang. ref.) | interpreted, huge bloat, slow, lightweight OOP, artificial obsolescence |
| POSIX [shell](shell.md) | well, "kind of" | 1992 | | | | | 50, proprietary (paid) | standardized (std 1003.2-1992) Unix shell, commonly e.g. [Bash](bash.md)|
|[Brainfuck](brainfuck.md)| **yes** | 1993 | | | 100 (judg. by dbfi) | | 1 | extremely minimal (8 commands), hard to use, [esolang](esolang.md) |
| [FALSE](false.md) | **yes** | 1993 | | | | | 1 | very small yet powerful, Forth-like, similar to Brainfuck |
| [Lua](lua.md) | **quite yes** | 1993 | 91 (G) | 5.17 (G) | 7K (LuaInLua) | | 40, free | small, interpreted, mainly for scripting (used a lot in games) |
| [Java](java.md) | NO | 1995 | 2.75 (G)| 21.48 (G)| | | 800, proprietary | forced [OOP](oop.md), "platform independent" (bytecode), slow, bloat |
| [JavaScript](js.md) | NO | 1995 | 8.30 (G)| 105 (G) | 50K (est. from QuickJS) | 34 | 500, proprietary? | interpreted, the [web](web.md) lang., bloated, classless [OOP](oop.md) |
| [PHP](php.md) | no | 1995 | 23 (G) | 6.73 (G) | | | 120 (by Google), CC0 | server-side web lang., OOP |
| [Ruby](ruby.md) | no | 1995 | 122 (G) | 8.57 (G) | | | | similar to Python |
| [C#](c_sharp.md) | NO | 2000 | 4.04 (G)| 26 (G) | | | | proprietary (yes it is), extremely bad lang. owned by Micro$oft, AVOID |
| [D](d.md) | no | 2001 | | | | | | some expansion/rework of C++? OOP, generics etc. |
| [Rust](rust.md) | NO! lol | 2006 | 1.64 (G)| 3.33 (G) | | | 0 :D | extremely bad, slow, freedom issues, toxic community, no standard, AVOID|
| [Go](go.md) | **kind of** maybe| 2009 | 4.71 (G)| 5.20 (G) | | | 130, proprietary? | "successor to C" but not well executed, bearable but rather avoid |
| [LIL](lil.md) | **yea** | 2010? | | | | | | not known too much but nice, "everything's a string" |
| [uxntal](uxn.md) | **yes** but SJW | 2021 | | | 400 (official) | | 2? (est.), proprietary | assembly lang. for a minimalist virtual machine, PROPRIETARY SPEC. |
| **[comun](comun.md)** | **yes** | 2022 | | | 4K | 76 | 2, CC0 | "official" [LRS](lrs.md) language, WIP, similar to Forth |

NOTES on the table above:
|
|
|
|
- performance data: the `speed`/`mem.` columns give a benchmarked estimate of running time/memory consumption in the best case (best compiler, best run, ...) relative to C (i.e. "how many times the language is worse than C"). The data may come from various sources, for example *[The Computer Language Benchmarks Game](https://sschakraborty.github.io/benchmark/task-descriptions.html)* (G), own measurement (O) etc.
- implementation size: this is just a very rough estimate based on the smallest implementation found, sometimes guessed and rounded to some near value (for example, on finding a small implementation whose main goal isn't small size we may conclude it could be written a bit smaller still).
- DT LOC is the number of [lines of code](loc.md) of our standardized [divisor tree](divisor_tree.md) program at the time of writing this.
|
|
|
|
TODO: Tcl, Rebol
|
|
|
|
## Interesting Languages
|
|
|
|
Some programming languages may be [interesting](interesting.md) rather than directly useful -- following this trail may lead you to more obscure and underground programming communities -- however these languages are important too as they teach us a lot and may help us design good practically usable languages. In fact professional researchers in theory of computation spend their whole lives dealing with practically unusable languages and purely theoretical computers. Even a great painter sometimes draws funny silly pictures in his notebook; it helps build a wide relationship with the art and you never know if a serious idea can be spotted in a joke.
|
|
|
|
One such language is e.g. **[Unary](unary_lang.md)**, a programming language that only uses a single character while being Turing complete (i.e. having the highest possible "computing power", being able to express any program). All programs in Unary are just sequences of one character, differing only by their length (i.e. a program can also be seen just as a single natural number, the length of the sequence). We can do this because we can make an ordered list of all (infinitely many) possible programs in some simple programming language (such as a [Turing machine](turing_machine.md) or [Brainfuck](brainfuck.md)), i.e. assign each program its ordinal number (1st, 2nd, 3rd, ...) -- then to express a program we simply say the position of the program on the list.
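The idea of "a program is just a number" can be sketched as follows. The code maps a [Brainfuck](brainfuck.md) program to a natural number (the length of the corresponding Unary program) and back. The 3-bit code table and the leading `1` marker are how Unary is usually described as far as we know, but treat them as an assumption of this sketch; the function names are made up for the example. Note that with this particular encoding not every natural number corresponds to a valid program, so it is a slightly looser numbering than a gapless "1st, 2nd, 3rd, ..." list.

```python
# Assumed 3-bit codes for the eight Brainfuck commands.
CODES = {">": "000", "<": "001", "+": "010", "-": "011",
         ".": "100", ",": "101", "[": "110", "]": "111"}
COMMANDS = {bits: cmd for cmd, bits in CODES.items()}

def brainfuck_to_unary_length(program):
    """Return the number (length of the Unary program) for a Brainfuck program."""
    # The leading "1" keeps leading zero bits (e.g. of ">") from being lost.
    bits = "1" + "".join(CODES[c] for c in program if c in CODES)
    return int(bits, 2)

def unary_length_to_brainfuck(length):
    """Inverse direction: recover the Brainfuck program from the number."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    bits = bin(length)[3:]  # drop the "0b" prefix and the leading "1" marker
    if len(bits) % 3 != 0:
        raise ValueError("this number does not encode a program")
    return "".join(COMMANDS[bits[i:i + 3]] for i in range(0, len(bits), 3))

if __name__ == "__main__":
    n = brainfuck_to_unary_length(",[.,]")  # a simple "cat" program
    print(n)                                # -> 56623
    print(unary_length_to_brainfuck(n))     # -> ,[.,]
```

So even a five command program already needs a Unary source tens of thousands of characters long, which shows why the language is interesting rather than practical.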
|
|
|
|
There is a community around so called **[esoteric programming languages](esolang.md)** which takes great interest in such languages, from mere [jokes](jokes.md) (e.g. languages that look like cooking recipes or languages that can compute everything but can't output anything) to discussing semi-serious and serious, even philosophical and metaphysical questions. They make you think about what really is a programming language; where should we draw the line exactly, what is the absolute essence of a programming language? What's the smallest thing we would call a programming language? Does it have to be Turing complete? Does it have to allow output? What does it even mean to compute? And so on. If you dare, kindly follow the rabbit hole.
|
|
|
|
## See Also
|
|
|
|
- [esoteric programming language](esolang.md)
- [constructed language](conlang.md)
- [human language](human_language.md)
- [computer language](computer_language.md)
- [pseudocode](pseudocode.md)
- [compiler](compiler.md)