# C
{ We have a [C tutorial](c_tutorial.md)! ~drummyfish }
C is an [old](old.md) [low level](low_level.md) structured [statically typed](static_typing.md) [imperative](imperative.md) compiled [programming language](programming_language.md); it is very fast, efficient and currently possibly the most commonly used language by many [minimalist](minimalism.md) programmers including [less retarded software](lrs.md). Though by very strict standards it would still be considered [bloated](bloat.md), compared to any mainstream [modern](modern.md) language it is very bullshitless, [KISS](kiss.md), very well optimized, culturally established and stable, so it is also the go-to language of the [suckless](suckless.md) community as well as most true experts, for example the [Linux](linux.md) and [OpenBSD](openbsd.md) developers, owing to a good, relatively simple design, **uncontested performance**, **wide support**, great number of compilers, high level of control and a status of firmly tested and established language. C doesn't belong to the class of most minimal languages like [Forth](forth.md), [Lisp](lisp.md) and [Brainfuck](brainfuck.md), but it is among the most minimalist "traditional" kind of languages. C is **perhaps the most important language in [history](history.md)**; it influenced, to a smaller or bigger degree, basically all of the widely used languages today such as [C++](c.md), [Java](java.md), [JavaScript](javascript.md) etc., however it is not a relic of the past, it is still actively used -- in the area of low level programming C is probably still the number one unsurpassed language. C is by no means perfect or extremely mathematically [elegant](beauty.md), but it is currently one of the best practical choices of a programming language. Though C is almost always compiled, C interpreters can be found too.
{ See https://wiki.bibanon.org/The_Perpetual_Playthings. Also look up *The Ten Commandments for C Programmers* by Henry Spencer. Also the *Write in C* song (parody of *Let it Be*). ~drummyfish }
It is usually **not considered an easy language to learn** because of its low level nature and the amount of control (fuck up opportunities) it gives: it requires good understanding of how a [computer](computer.md) works on the lower level and doesn't prevent the programmer from shooting himself in the foot. The programmer is given full control (and therefore "responsibility"). There are things considered "tricky" which one must be aware of, such as undefined behavior of certain operators or manual [memory management](memory_management.md). This is what can discourage a lot of modern "[coding monkeys](soydev.md)" from choosing C, but it's also what inevitably allows such great performance -- undefined behavior allows the compiler to choose the most efficient implementation. On the other hand, C as a language is pretty simple without [modern](modern.md) bullshit concepts such as [OOP](oop.md), it is not so much hard to learn as hard to master, like any other true [art](art.md). In any case **you have to learn C** even if you don't plan to program in it regularly, it's the most important language in history and the lingua franca of programming, you will meet C in many places and have to at least understand it: programmers very often use C instead of [pseudocode](pseudocode.md) to explain algorithms, C is used for [optimizing](optimization.md) critical parts even in non-C projects, many languages compile to C, it is just all around and you have to understand it like you have to understand [English](english.md).
Some of the typical traits of C include plentiful (over)utilization of the **[preprocessor](preprocessor.md)** ([macros](macro.md), the underlying C code is infamously littered with "`#ifdefs`" all over the place which modify the code just before compiling -- this is mostly used for compile-time configuration and/or achieving better performance and/or for [portability](portability.md)), **[pointers](pointer.md)** (direct access to memory, used e.g. for memory allocation, this is infamously related to "shooting oneself in the foot", e.g. by getting [memory leaks](memory_leak.md)) and a lot of **[undefined behavior](undefined_behavior.md)** (many things are purposefully left undefined in C to allow compilers to generate greatly efficient code, but this sometimes leads to weird [bugs](bug.md) or a program working on one machine but not another, so C requires some knowledge of its specification). Also somewhat infamous are complicated type declarations like `void (*f(int,void (*n)(int)))(int)`, which are frequently a subject of [jokes](jokes.md) ("look, C is simple").
Unlike many "[modern](modern.md)" languages, C by itself doesn't offer too much advanced and fancy functionality such as displaying graphics, working with [network](network.md), getting raw keyboard state and so on -- the base language doesn't even have any [input/output](io.md), it's a pure processor of values in memory. The standard library offers things like basic I/O with standard input/output streams, handling [files](file.md) on the disk, manipulating [strings](string.md), handling time, evaluating [mathematical](math.md) functions and other things, but for anything more advanced you will need an external library like [SDL](sdl.md) or those defined by [Posix](posix.md).
C is said to be a **"[portable](portability.md) [assembly](assembly.md)"** not only because it is quite low level and almost on par with assembly in performance, but also because many languages just choose to compile to C rather than to compile to assembly. Though C is structured (has control structures such as branches and loops) and can be used in a relatively high level manner, it is also possible to write assembly-like code that operates directly with bytes in memory through [pointers](pointer.md) without many safety mechanisms, so C is often used for writing things like hardware [drivers](driver.md). On the other hand some refrain from likening C to assembly because C compilers still perform many transformations of the code and what you write is not necessarily always what you get.
Mainstream consensus acknowledges that C is among the best languages for writing low level code and code that requires **performance**, such as [operating systems](operating_system.md), [drivers](driver.md) or [games](game.md). Even scientific libraries with normie-language interfaces -- e.g. various [machine learning](machine_learning.md) [Python](python.md) libraries -- usually have the performance critical core written in [C](c.md). Normies will tell you that for things outside this scope C is not a good language, with which we disagree -- [we](lrs.md) recommend using C for basically everything that's [supposed to last](future_proof.md), i.e. if you want to write a good dynamic website, you should probably write it in C.
**C is NOT a subset of C++.** This is proven for example by the simple example of a C program that uses the word `class` as a name for a variable -- in C++ this cannot be done. Sometimes the differences between C and C++ are bigger, for example in semantics, and may cause trouble to those who don't know about them. It is true that many C programs will run as C++ just fine, but not nearly all, and some C programs that run as C++ will have different behavior. We have to always be aware of this.
**Is C low or high level?** This depends on the context. Firstly back in the day when most computers were programmed in [assembly](assembly.md), C was seen as high level, simply because it offered the highest level of abstraction at the time, while nowadays with languages like [Python](python.md) and [JavaScript](js.md) around people see C as very low level by comparison -- so it really depends on if you talk about C in context of "old" or "modern" programming and which languages you compare it to. Secondly it also depends on HOW you program in C -- you may choose to imitate assembly programming in C a lot, avoid using libraries, touch hardware directly, avoid using complex features and creating your own abstractions -- here you are really doing low level programming. On the other hand you can emulate the "modern" high-level style programming in C too, you can even mimic [OOP](oop.md) and make it kind of "C++ with different syntax", you may use libraries that allow you to easily work with strings, heavy macros that pimp the language to some spectacular abomination, you may write your own garbage collector etc. -- here you are basically doing high level programming in C.
**[Fun](fun.md)**: `main[-1u]={1};` is a C [compiler bomb](compiler_bomb.md) :) it's a short program that usually makes the compiler produce a huge binary.
## Examples
Let's write a simple program called **[divisor tree](divisor_tree.md)** -- this program will interactively read positive numbers (smaller than 1000) from the user and for each one it will print the [binary tree](binary_tree.md) of the number's divisors so that if a number has divisors, the ones that are closest to each other will be its children. If invalid input is given, the program ends. The tree will be written in format `(L N R)` where *N* is the number of the tree node, *L* is the node's left subtree and *R* is the right subtree. This problem is made so that it will showcase most of the basic features of a programming language (like control structures, function definition, [recursion](recursion.md), [input/output](io.md) etc.). From now on let's consider this our standardized program for showcasing programming languages.
Here is the program written in C99 (let this also serve as a reference implementation of the program):
```
#include <stdio.h>                  // include standard I/O library

// recursive function, prints divisor tree of x
void printDivisorTree(unsigned int x)
{
  int a = -1, b = -1;

  for (int i = 2; i <= x / 2; ++i)  // find two closest divisors
    if (x % i == 0)
    {
      a = i;
      b = x / i;

      if (b <= a)
        break;
    }

  putchar('(');

  if (a > 1)
  {
    printDivisorTree(a);
    printf(" %u ",x);
    printDivisorTree(b);
  }
  else
    printf("%u",x);

  putchar(')');
}

int main(void)
{
  while (1)                         // main loop, read numbers from the user
  {
    unsigned int number;
    printf("enter a number: ");

    if (scanf("%u",&number) == 1 && number < 1000)
    {
      printDivisorTree(number);
      putchar('\n');
    }
    else
      break;
  }

  return 0;
}
```
A run of this program may look for example like this:
```
enter a number: 32
((((2) 4 (2)) 8 (2)) 32 ((2) 4 (2)))
enter a number: 256
((((2) 4 (2)) 16 ((2) 4 (2))) 256 (((2) 4 (2)) 16 ((2) 4 (2))))
enter a number: 7
(7)
enter a number: 0
(0)
enter a number: 15
((5) 15 (3))
enter a number: quit
```
## History and Context
C was developed in 1972 at [Bell Labs](bell_labs.md) alongside the [Unix](unix.md) operating system by [Dennis Ritchie](dennis_ritchie.md) and [Brian Kernighan](brian_kerninghan.md), as a successor to the [B](b.md) language ([portable](portability.md) language with [recursion](recursion.md)) written by Dennis Ritchie and [Ken Thompson](ken_thompson.md), which was in turn inspired by the [ALGOL](algol.md) language (code blocks, lexical [scope](scope.md), ...). C was for a while called NB for "new B". C was intimately interconnected with Unix and its [hacker culture](hacking.md), both projects would continue to be developed together, influencing each other. In 1973 Unix was rewritten in C. In 1978 Kernighan and Ritchie published a book called *The C Programming Language*, known as *K&R*, which became something akin to the C specification. In March 1987 [Richard Stallman](rms.md) along with others released the first version of [GNU C compiler](gcc.md) -- the official compiler of the [GNU](gnu.md) project and the compiler that would go on to become one of the most widely used. In 1989, the [ANSI C](ansi_c.md) standard, also known as C89, was released by the American ANSI -- this is a very well supported and overall good standard. The same standard was also adopted a year later by the international ISO, so C90 refers to the same language. In 1999 ISO issued a new standard that's known as C99, still a very good standard embraced by [LRS](lrs.md). Later in 2011 and 2017 the standard was revised again to C11 and C17, which are however no longer considered good.
## Standards
C is not a single language, there have been a few standards over the years since its inception in the 1970s. The standard defines two major parts: the base language and the standard library. Notable standards and versions are:
- **K&R C**: C as described by its inventors in the book *The C Programming Language*, before official standardization. This is kind of too ancient nowadays.
- **C89/C90 (ANSI/ISO C)**: First fully standardized version, usable even today, many hardcore C programmers stick to this version so as to enjoy maximum compiler support.
- **C95**: A minor update of the previous standard, adds wide character support.
- **C99**: Updated standard from the year 1999, striking a nice balance between "[modern](modern.md)" and "good old". This is a good version to use in [LRS](lrs.md) programs, but will be a little less supported than C89, even though still very well supported. Notable new features against C89 include `//` comments, [stdint](stdint.md) library (fixed-width integer types), [float](float.md) versions of math functions (such as `sinf`) and the `long long` type, variable length stack-allocated [arrays](array.md), variadic [macros](macro.md) and declaration of variables "anywhere" (not just at function start).
- **C11**: Updated standard from the year 2011. This one is too [bloated](bloat.md) and isn't worth using.
- **C17/C18**: Yet another update, yet more bloated and not worth using anymore.
- ...
A quite nice online reference to all the different standards (including C++) is available at https://en.cppreference.com/w/c/99.
[LRS](lrs.md) should use C99 or C89 as the newer versions are considered [bloat](bloat.md) and don't have such great support in compilers, making them less portable and therefore less free.
The standards of C99 and older are considered pretty [future-proof](future_proof.md) and using them will help your program be future-proof as well. This is to a high degree due to C having been established and tested better than any other language; it is one of the oldest languages and a majority of the most essential software is written in C, a C compiler is one of the very first things a new hardware platform needs to implement, so C compilers will always be around, at least for historical reasons. C has also been very well designed in a relatively minimal fashion, before the advent of modern feature-creep and bullshit such as [OOP](oop.md) which cripples almost all "modern" languages.
## Compilers
C is extremely well established, standardized and implemented so there is a great number of C compilers around. Let us list only some of the more notable ones.
- [gcc](gcc.md): The main "big name" that can compile all kinds of languages including C, used by default in many places, very [bloated](bloat.md) and can take long to compile big programs, but is pretty good at [optimizing](optimization.md) the code and generating fast code. Also has a number of frontends and can compile for many platforms. Uses GENERIC/GIMPLE [intermediate representation](intermediate_representation.md).
- [clang](clang.md): Another big bloated compiler, kind of competes with gcc, is similarly good at optimization etc. Uses [LLVM](llvm.md) intermediate representation.
- [tcc](tcc.md): Tiny C compiler, [suckless](suckless.md), orders of magnitude smaller (currently around 25 KLOC) and simpler than gcc and clang, doesn't use any intermediate representation, cannot optimize nearly as well as the big compilers so the generated executables can be a bit slower and/or bigger (though sometimes they may be smaller), however besides its internal simplicity there are many advantages, mainly e.g. fast compilation (claims to be 9 times faster than gcc) and small tcc executable (about 100 kB). Seems to only support x86 at the moment.
- [scc](scc.md): Another small/suckless C compiler, currently about 30 KLOC.
- [chibicc](chibicc.md): Hell of a small C compiler (looks like around 10 KLOC).
- [DuskCC](duskcc.md): [Dusk OS](duskos.md) C compiler written in [Forth](forth.md), focused on extreme simplicity, probably won't adhere to standards completely.
- [8c](8c.md), [8cc](8cc.md): Other small compilers.
- [c2bf](c2bf.md): Partially implemented C to [brainfuck](brainfuck.md) compiler.
- [lcc](lcc.md): Proprietary, source available small C compiler, about 20 KLOC.
- [pcc](pcc.md): A very early C compiler that was later developed further to support even the C99 standard.
- Borland Turbo C: old proprietary compiler with [IDE](ide.md).
- [sdcc](sdcc.md) (small device C compiler): For small 8 bit [microcontrollers](mcu.md).
- msvc ([Micro$oft](microsoft.md) visual C++): Badly bloated proprietary C/C++ compiler by a shitty [corporation](corporation.md). Avoid.
- [M2-Planet](m2_planet.md): Simple compiler of C subset used for bootstrapping the [GNU](gnu.md) operating system.
- ...
## Standard Library
Besides the pure C language the C standard specifies a set of [libraries](library.md) that have to come with a standard-compliant C implementation -- the so-called standard library. This includes e.g. the *stdio* library for performing standard [input/output](io.md) (reading from and writing to the screen and files) or the *math* library for mathematical functions. It is usually relatively okay to use these libraries as they are required by the standard to exist so the [dependency](dependency.md) they create is not as dangerous, however many C implementations aren't completely compliant with the standard and may come without the standard library. Also many stdlib implementations suck or you just can't be sure what the implementation will prefer (size? speed?) etc. So for the sake of [portability](portability.md) it is best if you can avoid using the standard library.
The standard library (libc) is a subject of lively debate because while its interface and behavior are given by the C standard, its implementation is a matter of each compiler; since the standard library is so commonly used, we should take great care in assuring it's extremely well written, however we ALWAYS have to choose our priorities and make tradeoffs, there just mathematically CANNOT be an ultimate implementation that will be all at once extremely fast and extremely memory efficient and extremely portable and extremely small. So choosing your C environment usually consists of choosing the C compiler and the stdlib implementation. As you probably guessed, the popular implementations ([glibc](glibc.md) et al) are [bloat](bloat.md) and also often just [shit](shit.md). Better alternatives thankfully exist, such as:
- [musl](musl.md)
- [uclibc](uclibc.md)
- [not using](dependency.md) the standard library :)
- ...
## Good And Bad Things About C
Firstly let's sum up some of the reasons why C is so good:
- **C as a language is relatively simple**: Though strictly speaking it's not in the league of most minimal languages like [Forth](forth.md) and [Lisp](lisp.md), C is the next best thing in terms of [minimalism](minimalism.md) and the small amount of bloat it contains is usually somehow justified at least, the language (or its subset) can be implemented in a quite minimal way if one so desires. It employs little [abstraction](abstraction.md). This all helps performance, freedom and encourages many implementations. C's standard library also isn't gigantic like that of for example [Python](python.md), the important parts basically just provide I/O and help with simple things like manipulating strings and memory allocation, so new C implementations aren't burdened by having to implement tons of libraries. There are several C compilers that were made by a single man.
- **It is extremely fast and efficient**: Owing to mentioned attributes such as good specification, simplicity, lack of bullshit and having a good balance between low and high level, C is known for being possibly the fastest [portable](portability.md) language in existence, also with very small memory footprint etc.
- **C doesn't limit you or hold (tie) your hands**: This may be bad for the beginner but is necessary for the expert, most of the time C won't "protect" you from doing anything, even crashing your program -- this kind of freedom is necessary to achieve truly marvelous things, C is like a race car, it doesn't have speed limiters and automatic transmission, nothing that's hidden from you or which would increase the car weight, it trusts in you being a good driver.
- **C is highly standardized**: Many languages have some kind of "online specification", however C is on the next level by literally being officially standardized by the forefront standardizing organizations like ANSI and ISO, by full time paid experts over many years and iterations, so the language is extremely well defined and described, down to saying which exact things are left undefined/unspecified, leaving freedom of implementation that leads to the language's great performance.
- **It's extremely well established, optimized, stable and time tested, plus many helper tools exist**: Being among the oldest languages, the language of the old time [hackers](hacking.md) and of [Unix](unix.md), maybe the most important piece of software in history, C has been so widely adopted, reimplemented, optimized and tested over and over that it's considered to be among the most essential pieces of software any platform has to have. Everything on the low level is written in C, so you essentially first have to have C to be able to run anything else. Many companies have invested great many resources to making C fast as it benefited them. While other languages come and go, or at least mutate and become something else over time, C stands as one of very few stable things in computer technology, which is a rarity. Along the way hackers have also made tons and tons of tools that help with C development, various static analyzers, debuggers, code beautifiers, transpilers etc.
- **It doesn't have any [modern](modern.md) [bullshit](bullshit.md)**: There is no [OOP](oop.md), [generics](generics.md), [garbage collection](garbage_collection.md), no [package manager](package_manager.md), no [furry](furry.md) mascots etc.
- **There is a huge number of [compilers](compiler.md)**: While a "[modern](modern.md)" language has some kind of main reference implementation and then maybe one or two alternative implementations, C has dozens (maybe even hundreds) of compilers. You'll find compilers under all possible [licenses](license.md), huge ones with many features and uber optimizations, small ones that will run on tiny devices, ones that compile very fast, ones that translate C to other languages and so on.
- **It is elitist**: Higher difficulty of learning C creates a nice "barrier to entry" with an effect that keeps absolute idiots away, keeping the language less intoxicated by retarded ideas. { NOTE: The word "elitist" here is not to really mean inherently "discriminating" of course, but rather "unpopular among the stupid" because it's quite different from the mainstream and requires some effort on unlearning bad mainstream habits, i.e. nowadays it needs some dedication, you can't just join in effortlessly. It's elitist in the same way in which Unix systems or suckless software are elitist. ~drummyfish }
- **C is close to the [hardware](hw.md), reflecting how computers work**: This has many advantages: firstly efficiency, as code that maps well to hardware is predictable and efficient, lacking [magic](magic.md) in translation. It simplifies implementations, making the language more free. Then also the programmer himself is close to the machine, he has to learn how it works, what it likes and dislikes -- a knowledge every programmer has to have.
- **There is a good balance between low and high level (minimalism vs "features")**: C seems to have hit the sweet spot at which it offers just enough high level features for comfortable programming, such as [data type](data_type.md) checks, routines and preprocessor, while not crossing the line beyond which it would have to pay an unreasonably high cost for the comfort, i.e. it managed to buy a lot for a very low price. Things like this cannot really be completely planned, it requires a genius, intuition and many years of trial and error iterations to create a language like this.
- **It is [old](old.md), written only by white male [hackers](hacking.md), at times when [capitalism](capitalism.md) was weaker**: No [women](woman.md) were probably involved in the development, making the language wasn't a form of some angry minority's political protest (of course we aren't racists or sexists, it's just a fact that white men are best at programming), the development was largely part of genuine research, at the time when computers weren't mainstream and computer technology wasn't being raped as hard as today. C developers didn't even think of embedding any political message in the language. Times like that will never be repeated.
- ...
Now let's admit that nothing is [perfect](perfect.md), not even C; it was one of the first relatively higher level languages and even though it has proven to be designed extremely well, some things didn't turn out that well. We still prefer C as one of the best choices, but it's good to be aware of its downsides and smaller issues, if only for the sake of one day designing a better language. Please bear in mind that all points here are just suggestions, they may of course be subject to counterarguments and further discussion. Here are some of the **bad things** about the language:
- **C specification (the ISO standard) is [proprietary](proprietary.md)**. The language itself probably can't be copyrighted, nevertheless this may change in the future, and a proprietary spec lowers C's accessibility and moddability (you can't make derivative versions of the spec).
- **The specification is also long as fuck** (approx. 500 pages, out of that 163 of the pure language), indicating [bloat](bloat.md)/complexity/obscurity. A good, free language should have a simple definition and specification. It could be simplified a lot by simplifying the language itself as well as dropping some truly legacy considerations (like [BCD](bcd.md) systems?).
- **Some behavior is weird and has unnecessary exceptions**, for example a function can return anything, including a `struct`, except for an array. This makes it awkward for example when implementing [vectors](vector.md) which would best be made as arrays but you want functions to return them, so you are forced to use ugly hacks like wrapping them inside a struct just for this.
- **Some things could be made simpler**: e.g. using [reverse polish](reverse_polish.md) notation for expressions?
- **Some things could be dropped entirely** ([enums](enum.md), [bitfields](bitfield.md), possibly also unions etc.), they can be done and imitated in other ways without much hassle.
- **The preprocessor isn't exactly elegant**, it has completely different syntax and rules from the main language, not very suckless -- ideally preprocessor uses the same language as the base language.
- **The syntax is sucky sometimes**, infamously e.g. division by pointer dereference can actually create a comment (like `myvalue /*myptr`), also multiplication and pointer dereference use the same symbol `*` while both operations can be used with a pointer -- that may create confusion. Also a case label with variables inside it HAS TO be enclosed in curly brackets while other ones don't have to be, data type names may consist of multiple tokens (`long long int` etc.), many preprocessor commands need to be on separate lines (makes some one liners impossible), also it's pretty weird that the condition after `if` has to be in brackets etc., it could all be designed better. Keywords also might be better being single chars, like `?` instead of `if` etc. (see [comun](comun.md)). A shorter source code that doesn't try to imitate English would probably be better.
- **Some undefined/unspecified behavior is probably unnecessary** -- undefined behavior isn't bad in general of course, but some of it has proven to be rather cumbersome; for example the unspecified representation of integers, their binary size and behavior of floats leads to a lot of trouble (unknown upper bounds, sizes, dangerous and unpredictable behavior of many operators, difficult testing etc.) while practically all computers have settled on using 8 bit bytes, [two's complement](twos_complement.md) and IEEE754 for [floats](float.md) -- this could easily be made a mandatory assumption which would simplify a great many things without doing basically any harm. New versions of C actually already settle on two's complement. This doesn't mean C should be shaped to reflect the degenerate "[modern](modern.md)" trends in programming though!
- Some basic things that are part of libraries or extensions, like fixed width types and binary literals and possibly very basic I/O (putchar/getchar), could be part of the language itself rather than provided by libraries.
- All that stuff with *.c* and *.h* files is unnecessary, there should just be one file type probably.
- It's not [Forth](forth.md).
- ...
## Basics
This is a quick overview, for a more in depth tutorial see [C tutorial](c_tutorial.md).
A simple program in C that writes "welcome to C" looks like this:
```
#include <stdio.h> // standard I/O library

int main(void)
{
  // this is the main program

  puts("welcome to C");

  return 0; // end with success
}
```
You can simply paste this code into a file which you name let's say `program.c`, then you can compile the program from command line like this:
`cc -o program program.c`
Then if you run the program from command line (`./program` on Unix-like systems) you should see the message.
## Cheatsheet/Overview
Here is a quick reference cheatsheet of some of the important things in C, also a possible overview of the language.
**data types** (just some):
| data type | values (size) | printf |notes |
| ------------------------- | ------------------------------------------------------ | ---------- | -------------------------------------------------- |
| `int` (`signed int`, ...) | integer, at least -32767 to 32767 (16 bit), often more | `%d` | native integer, **fast** (prefer for speed) |
| `unsigned int` | integer, non-negative, at least 0 to 65535, often more | `%u` | same as `int` but no negative values |
| `signed char` | integer, at least -127 to 127, mostly -128 to 127 |`%c`, `%hhi`| `char` forced to be signed |
| `unsigned char` | integer, at least 0 to 255 (almost always the case) |`%c`, `%hhu`| smallest memory chunk, **[byte](byte.md)** |
| `char` | integer, at least 256 values | `%c` | signed or unsigned, used for string characters |
| `short` | integer, at least -32767 to 32767 (16 bit) | `%hd` | like `int` but supposed to be smaller |
| `unsigned short` | integer, non-negative, at least 0 to 65535 | `%hu` | like `short` but unsigned |
| `long` | integer, at least -2147483647 to 2147483647 (32 bit) | `%ld` | for big signed values |
| `unsigned long` | integer, at least 0 to 4294967295 (32 bit) | `%lu` | for big unsigned values |
| `long long` | integer, at least some -9 * 10^18 to 9 * 10^18 (64 bit)| `%lld` | for very big signed values |
| `unsigned long long` | integer, at least 0 to 18446744073709551615 (64 bit) | `%llu` | for very big unsigned values |
| `float` | floating point, some -3 * 10^38 to 3 * 10^38 | `%f` |[float](float.md), tricky, bloat, can be slow, avoid|
| `double` | floating point, some -1 * 10^308 to 10^308 | `%lf` | like `float` but bigger |
| `T [N]` | array of `N` values of type `T` | | **array**, if `T` is `char` then **string** |
| `T *` | memory address | `%p` | pointer to type `T`, (if `char` then **string**) |
| `uint8_t` | 0 to 255 (8 bit) |`PRIu8` |exact width, two's compl., must include `<stdint.h>`|
| `int8_t` | -128 to 127 (8 bit) |`PRId8` | like `uint8_t` but signed |
| `uint16_t` | 0 to 65535 (16 bit) |`PRIu16` | like `uint8_t` but 16 bit |
| `int16_t` | -32768 to 32767 (16 bit) |`PRId16` | like `uint16_t` but signed |
| `uint32_t` | 0 to 4294967295 (32 bit) |`PRIu32` | like `uint8_t` but 32 bit |
| `int32_t` | -2147483648 to 2147483647 (32 bit) |`PRId32` | like `uint32_t` but signed |
| `int_least8_t` | at least -128 to 127 |`PRIdLEAST8`| signed integer with at least 8 bits, `<stdint.h>` |
| `int_fast8_t` | at least -128 to 127 |`PRIdFAST8` | fast signed int. with at least 8 bits, `<stdint.h>`|
| struct | | | structured data type |
There is no **bool** (true, false), use any integer type, 0 is false, everything else is true (there may be some bool type in the stdlib, don't use that). A **string** is just an array of chars, it has to end with value 0 (NOT the ASCII character "0" but literally integer value 0)!
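
To make the above more concrete, here is a small sketch of a string being walked up to its terminating 0 and of a plain `int` serving as a truth value. The names `myStrLen` and `isEmptyString` are hypothetical, chosen just for this example (in practice you'd use `strlen` from `<string.h>`).

```c
#include <stddef.h> // size_t

// Counts characters of a string by walking to the terminating 0 value
// (this is basically what strlen from <string.h> does).
size_t myStrLen(const char *str)
{
  size_t len = 0;

  while (str[len] != 0) // 0 is the integer value here, not the '0' character
    len++;

  return len;
}

// "Boolean" function: returns nonzero (true) if the string is empty,
// 0 (false) otherwise, using plain int as the truth value.
int isEmptyString(const char *str)
{
  return str[0] == 0;
}
```

Note that the string `"0"` is NOT empty: it holds the character '0' followed by the terminating 0 value.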
**main program structure**:
```
#include <stdio.h>

int main(void)
{
  // code here
  return 0;
}
```
**branching aka if-then-else**:
```
if (CONDITION)
{
  // do something here
}
else // optional
{
  // do something else here
}
```
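
A concrete instance of branching may look for example like this; the function `sign` is just an illustration, it isn't a standard library function.

```c
// Returns -1 for negative numbers, 1 for positive ones and 0 for zero.
int sign(int x)
{
  if (x < 0)
  {
    return -1;
  }
  else if (x > 0) // else can be followed by another if
  {
    return 1;
  }
  else
  {
    return 0;
  }
}
```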
**for loop** (repeat given number of times):
```
for (int i = 0; i < MAX; ++i)
{
  // do something here, you can use i
}
```
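
As a small sketch of the for loop in use, the following hypothetical function `sumArray` adds up the items of an array. Declaring `i` inside the loop header assumes at least the C99 standard.

```c
// Sums the first count elements of an array.
int sumArray(const int *values, int count)
{
  int sum = 0;

  for (int i = 0; i < count; ++i) // i goes from 0 to count - 1
    sum += values[i];

  return sum;
}
```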
**while loop** (repeat while CONDITION holds):
```
while (CONDITION)
{
  // do something here
}
```
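
The while loop fits cases in which the number of repetitions isn't known ahead, for example Euclid's algorithm for the greatest common divisor. This is only an illustrative sketch, the function name `gcd` is our own.

```c
// Greatest common divisor of two non-negative integers (Euclid's algorithm).
unsigned int gcd(unsigned int a, unsigned int b)
{
  while (b != 0) // repeat until the remainder becomes zero
  {
    unsigned int remainder = a % b;
    a = b;
    b = remainder;
  }

  return a;
}
```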
**do while loop** (same as *while* but CONDITION at the end), not used that much:
```
do
{
  // do something here
} while (CONDITION);
```
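
One of the few places where do while comes naturally is when the body has to run at least once, such as when counting digits of a number (even 0 has one digit). A possible sketch follows, with the made up function name `countDigits`.

```c
// Counts decimal digits of a number; the body runs at least once, so 0
// correctly gives 1 digit.
int countDigits(unsigned int n)
{
  int digits = 0;

  do
  {
    digits++;
    n /= 10;
  } while (n != 0);

  return digits;
}
```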
**function definition**:
```
RETURN_TYPE myFunction (TYPE1 param1, TYPE2 param2, ...)
{ // return type can be void
  // do something here
}
```
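
Here are two example functions filled in according to the template, one returning a value and one with the `void` return type. Both `clamp` and `swapInts` are hypothetical names used just for this illustration.

```c
// Function returning a value: clamps x to the interval from min to max.
int clamp(int x, int min, int max)
{
  if (x < min)
    return min;

  return x > max ? max : x;
}

// Function with void return type: it returns nothing, instead it modifies
// the variables it gets pointers to (swaps their values).
void swapInts(int *a, int *b)
{
  int tmp = *a;
  *a = *b;
  *b = tmp;
}
```

A call may look like `swapInts(&x,&y);`, i.e. the addresses of the variables are passed so that the function can change them.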
**compilation** (you can replace `gcc` with another compiler):
- quickly compile and run: `gcc myprogram.c && ./a.out`.
- compile more properly: `gcc -std=c99 -Wall -Wextra -pedantic -O3 -o myprogram myprogram.c`.
To **[link](linking.md)** a library use `-llibrary`, e.g. `-lm` (when using `<math.h>`), `-lSDL2` etc.
The following are some symbols ([functions](function.md), [macros](macro.md), ...) from the **standard library**:
| symbol | library | description | example |
| ------------------------- | --------------- | ------------------------------------------------------------------ | ---------------------------------------- |
| *putchar(c)* | *stdio.h* | Writes a single character to output. | `putchar('a');` |
| *getchar()* | *stdio.h* | Reads a single character from input. | `int inputChar = getchar();` |
| *puts(s)* | *stdio.h* | Writes string to output (adds newline at the end). | `puts("hello");` |
| *printf(s, a, b, ...)* | *stdio.h* | Complex print func., allow printing numbers, their formatting etc. | `printf("value is %d\n",var);` |
| *scanf(s, a, b, ...)* | *stdio.h* | Complex reading func., allows reading numbers etc. | `scanf("%d",&var);` |
| *fopen(f,mode)* | *stdio.h* | Opens file with given name in specific mode, returns pointer. | `FILE *myFile = fopen("myfile.txt","r");`|
| *fclose(f)* | *stdio.h* | Closes previously opened file. | `fclose(myFile);` |
| *fputc(c,f)* | *stdio.h* | Writes a single character to file. | `fputc('a',myFile);` |
| *fgetc(f)* | *stdio.h* | Reads a single character from file. | `int fileChar = fgetc(myFile);` |
| *fputs(s,f)* | *stdio.h* | Writes string to file (without newline at end). | `fputs("hello",myFile);` |
| *fprintf(f, s, a, b, ...)* | *stdio.h* | Like `printf` but outputs to a file. | `fprintf(myFile,"value is %d\n",var);` |
| *fscanf(f, s, a, b, ...)* | *stdio.h* | Like `scanf` but reads from a file. | `fscanf(myFile,"%d",&var);` |
| *fread(data,size,n,f)* | *stdio.h* | Reads *n* elems to *data* from *file*, returns no. of elems read. | `fread(myArray,sizeof(item),1,myFile);` |
| *fwrite(data,size,n,f)* | *stdio.h* | Writes *n* elems from *data* to *file*, returns no. of elems writ. | `fwrite(myArray,sizeof(item),1,myFile);` |
| *EOF* | *stdio.h* | [End of file](eof.md) value. | `int c = getchar(); if (c == EOF) break;`|
| *rand()* | *stdlib.h* | Returns pseudorandom number. | `char randomLetter = 'a' + rand() % 26;` |
| *srand(n)* | *stdlib.h* | Seeds pseudorandom number generator. | `srand(time(NULL));` |
| *NULL* | *stdlib.h*, ... | Value assigned to pointers that point "nowhere". | `int *myPointer = NULL;` |
| *malloc(size)* | *stdlib.h* | Dynamically allocates memory, returns pointer to it (or NULL). | `int *myArr = malloc(sizeof(int) * 10);` |
| *realloc(mem,size)* | *stdlib.h* | Resizes dynamically allocated memory, returns pointer (or NULL). |`myArr = realloc(myArr,sizeof(int) * 20);`|
| *free(mem)* | *stdlib.h* | Frees dynamically allocated memory. | `free(myArr);` |
| *atof(str)* | *stdlib.h* | Converts string to floating point number. | `double val = atof(answerStr);` |
| *atoi(str)* | *stdlib.h* | Converts string to integer number. | `int val = atoi(answerStr);` |
| *EXIT_SUCCESS* | *stdlib.h* | Value the program should return on successful exit. | `return EXIT_SUCCESS;` |
| *EXIT_FAILURE* | *stdlib.h* | Value the program should return on exit with error. | `return EXIT_FAILURE;` |
| *sin(x)* | *math.h* | Returns [sine](sin.md) of angle in [RADIANS](rad.md). | `float angleSin = sin(angle);` |
| *cos(x)* | *math.h* | Like `sin` but returns cosine. | `float angleCos = cos(angle);` |
| *tan(x)* | *math.h* | Returns [tangent](tan.md) of angle in RADIANS. | `float angleTan = tan(angle);` |
| *asin(x)* | *math.h* | Returns arcus sine of a value, the angle is in RADIANS. | `float angle = asin(angleSine);` |
| *ceil(x)* | *math.h* | Rounds a floating point value up. | `double x = ceil(y);` |
| *floor(x)* | *math.h* | Rounds a floating point value down. | `double x = floor(y);` |
| *fmod(a,b)* | *math.h* | Returns floating point remainder after division. | `double rem = fmod(x,3.5);` |
| *isnan(x)* | *math.h* | Checks if given float value is NaN. | `if (!isnan(x))` |
| *NAN* | *math.h* | Float quiet [NaN](nan.md) (not a number) value, don't compare! | `if (y == 0) return NAN;` |
| *log(x)* | *math.h* | Computes natural [logarithm](log.md) (base [e](e.md)). | `double x = log(y);` |
| *log10(x)* | *math.h* | Computes decadic [logarithm](log.md) (base 10). | `double x = log10(y);` |
| *log2(x)* | *math.h* | Computes binary [logarithm](log.md) (base 2). | `double x = log2(y);` |
| *exp(x)* | *math.h* | Computes exponential function (*e^x*). | `double x = exp(y);` |
| *sqrt(x)* | *math.h* | Computes floating point [square root](sqrt.md). | `double dist = sqrt(dx * dx + dy * dy);` |
| *pow(a,b)* | *math.h* | Power, raises *a* to *b* (both floating point). | `double cubeRoot = pow(var,1.0/3.0);` |
| *fabs(x)* | *math.h* | Computes floating point [absolute value](abs.md). | `double varAbs = fabs(var);` |
| *INT_MAX* | *limits.h* | Maximum value that can be stored in `int` type. | `printf("int max: %d\n",INT_MAX);` |
| *memset(mem,val,size)* | *string.h* | Fills block of memory with given values. | `memset(myArr,0,sizeof(myArr));` |
| *memcpy(dest,src,size)* | *string.h* | Copies bytes of memory from one place to another, returns dest. | `memcpy(destArr,srcArr,sizeof(srcArr));` |
| *strcpy(dest,src)* | *string.h* | Copies string (zero terminated) to dest, unsafe. | `char myStr[16]; strcpy(myStr,"hello");` |
| *strncpy(dest,src,n)* | *string.h* | Like `strcpy` but limits max number of bytes to copy, safer. |`strncpy(destStr,srcStr,sizeof(destStr));`|
| *strcmp(s1,s2)* | *string.h* | Compares two strings, returns 0 if equal. | `if (!strcmp(str1,"something"))` |
| *strlen(str)* | *string.h* | Returns length of given string. | `int l = strlen(myStr);` |
| *strstr(str,substr)* | *string.h* | Finds substring in string, returns pointer to it (or NULL). | `if (strstr(cmdStr,"quit") != NULL)` |
| *time(t)* | *time.h* |Stores calendar time (often Unix t.) in t (can be NULL), returns it.|`printf("tstamp: %d\n",(int) time(NULL));`|
| *clock()* | *time.h* | Returns approx. CPU time used since program start, in ticks. |`printf("CPU ticks: %d\n",(int) clock());`|
| *CLOCKS_PER_SEC* | *time.h* | Number of CPU ticks per second. |`int sElapsed = clock() / CLOCKS_PER_SEC;`|
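
A few of the functions from the table can be combined for example like this. It is only a sketch under our own assumptions: the helper names `makeSequence`, `resizeArray` and `copyString` are made up here, they aren't part of the standard library.

```c
#include <stdlib.h> // malloc, realloc, free
#include <string.h> // strncpy

// Allocates an array holding numbers 0 to count - 1, returns NULL on
// failure. The caller has to release the array with free.
int *makeSequence(size_t count)
{
  int *arr = malloc(sizeof(int) * count);

  if (arr == NULL)
    return NULL;

  for (size_t i = 0; i < count; ++i)
    arr[i] = (int) i;

  return arr;
}

// Resizes the array to newCount items, returns 1 on success, 0 on failure.
// The result of realloc goes to a temporary pointer first so that the
// original block isn't lost (leaked) if realloc fails.
int resizeArray(int **arr, size_t newCount)
{
  int *resized = realloc(*arr, sizeof(int) * newCount);

  if (resized == NULL)
    return 0;

  *arr = resized;
  return 1;
}

// Copies a string into a buffer of given size (at least 1), always leaving
// it zero terminated, which strncpy alone doesn't guarantee.
void copyString(char *dest, const char *src, size_t destSize)
{
  strncpy(dest, src, destSize - 1);
  dest[destSize - 1] = 0;
}
```

Typical use would be `int *a = makeSequence(4);`, then possibly `resizeArray(&a,8);` and finally `free(a);`.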
## See Also
- [B](b.md)
- [D](d.md)
- [comun](comun.md)
- [C tutorial](c_tutorial.md)
- [C pitfalls](c_pitfalls.md)
- [C programming style](programming_style.md)
- [C++](cpp.md)
- [IOCCC](ioccc.md)
- [HolyC](holyc.md)
- [QuakeC](quakec.md)
- [Pascal](pascal.md)
- [Fortran](fortran.md)
- [LISP](lisp.md)
- [FORTH](forth.md)
- [memory management](memory_management.md) in C