Update tuto

Miloslav Ciz 2022-04-03 12:31:40 +02:00
parent 818ab679b9
commit 6ff9159889


@ -13,11 +13,14 @@ You should probably know at least the completely basic ideas of programming befo
- Extremely **fast and efficient**.
- Very **widely supported and portable** to almost anything.
- **[Low level](low_level.md)**, i.e. there is relatively little [abstraction](abstraction.md) and little comfortable built-in functionality such as [garbage collection](garbage_collection.md); you have to write many things yourself, you will deal with [pointers](pointer.md), [endianness](endianness.md) etc.
- [Imperative](imperative.md), without [object oriented programming](oop.md).
- Considered **hard**, but in certain ways it's simple; it lacks the [bloat](bloat.md) and [bullshit](bullshit.md) of "[modern](modern.md)" languages, which is an essential thing. It will take long to learn but it's the most basic thing you should know if you want to create good software. You won't regret it.
- **Not holding your hand**, i.e. you may very easily "shoot yourself in the foot" and crash your program. This is the price for the language's power.
- Very old, well established and tested by time.
- Recommended by us for serious programs.
If you come from a language like [Python](python.md) or [JavaScript](javascript.md), you may be shocked that C doesn't come with its own [package manager](package_manager.md), [debugger](debugger.md) or [build system](build_system.md), and that it doesn't have [modules](module.md), [generics](generics.md), [garbage collection](garbage_collection.md), [OOP](oop.md), [hashmaps](hashmap.md), dynamic [lists](list.md), [type inference](type_inference.md) and similar "[modern](modern.md)" features. When you truly get into C, you'll find it's a good thing.
Programming in C works like this:
1. You write C source code into a file.
@ -127,7 +130,7 @@ int main(void)
- `int myVariable;` is a so called **variable declaration**, it tells the compiler we are creating a new variable with the name `myVariable` and data type `int`. Variables can be created almost anywhere in the code (even outside the `main` function) but that's a topic for later.
- `myVariable = 5;` is a so called **variable assignment**, it stores the value 5 into the variable named `myVariable`. IMPORTANT NOTE: the `=` does NOT signify mathematical equality but an assignment (equality in C is written as `==`); when the compiler encounters `=`, it simply takes the value on the right of it and writes it to the variable on the left side of it. Sometimes people confuse assignment with an equation that the compiler solves -- this is NOT the case, assignment is much simpler, it just writes a value into a variable. So `x = x + 1;` is a valid command even though mathematically it would be an equation without a solution.
- `printf("%d\n",myVariable);` prints out the value currently stored in `myVariable`. Don't get scared by this complicated command, it will be explained later (once we learn about [pointers](pointer.md)). For now only know this prints the variable content.
- `myVariable = 8;` assigns a new value to `myVariable`, overwriting the old one.
- `printf("%d\n",myVariable);` again prints the value in `myVariable`.
@ -617,7 +620,38 @@ Let's also mention some additional data types we can use in programs:
- `char`: A single text character such as *'a'*, *'G'* or *'_'*. We can assign characters as `char c = 'a';` (single characters are enclosed in apostrophes similarly to how text strings are inside quotes). We can read a character as `c = getchar();` and print it as `putchar(c);`. Special characters that can be used are `\n` (newline) or `\t` (tab). Characters are in fact small numbers (usually with 256 possible values) and can be used basically anywhere a number can be used (for example we can compare characters, e.g. `if (c < 'b') ...`). Later we'll see characters are basic building blocks of text strings.
- `unsigned int`: Integer that can only take positive values or 0 (i.e. no negative values). It can store higher positive values than normal `int` (which is called a *signed int*).
- `long`: Big integer, may take more memory but can store numbers in the range of at least roughly plus or minus 2 billion.
- `float` and `double`: [Floating point](float.md) number (`double` is bigger and more precise than `float`) -- an approximation of [real numbers](real_number.md), i.e. numbers with a fractional part such as 2.5 or 0.0001. You can print these numbers as `printf("%lf\n",x);` and read a `float` as `scanf("%f",&x);` (for a `double` use `%lf` in `scanf`).
Here is a short example with the new data types:
```
#include <stdio.h>

int main(void)
{
  char c;
  float f;

  puts("Enter character.");
  c = getchar(); // read character

  puts("Enter float.");
  scanf("%f",&f); // read float, &f tells scanf where to store it

  printf("Your character is: %c.\n",c);
  printf("Your float is %lf\n",f);

  float fSquared = f * f;
  int wholePart = f; // this can be done, the fractional part is dropped

  printf("Its square is %lf.\n",fSquared);
  printf("Its whole part is %d.\n",wholePart);

  return 0;
}
```
Notice mainly how we can assign a `float` value into the variable of `int` type (`int wholePart = f;`). This can be done even the other way around and with many other types. C can do automatic **type conversions** (*[casting](cast.md)*), but of course, some information may be lost in this process (e.g. the fractional part).
In the section about functions we said a function can only call a function that has been defined before it in the source code -- this is because the compiler reads the file from start to finish and if you call a function that hasn't been defined yet, it simply doesn't know what to call. But sometimes we need to call a function that will be defined later, e.g. in cases where two functions call each other (function *A* calls function *B* in its code but function *B* also calls function *A*). For this there exist so called **[forward declarations](forward_decl.md)** -- a forward declaration informs the compiler that a function of a certain name (and with certain parameters etc.) will be defined later in the code. A forward declaration looks the same as a function definition, but it doesn't have a body (the part between `{` and `}`), instead it is terminated with a semicolon (`;`). Here is an example:
@ -763,6 +797,101 @@ As a bonus, let's see a few useful compiler flags:
## Advanced Data Types and Variables (Structs, Arrays)
Until now we've encountered simple data types such as `int`, `char` or `float`. Variables of these types hold a single atomic value (e.g. a number or a text character). Such data types are called **[primitive types](primitive_type.md)**.
Above these there exist **[compound data types](compound_type.md)** (also *complex* or *structured*) which are composed of multiple primitive types. They are necessary for any advanced program.
The first compound type is a structure, or **[struct](struct.md)**. It is a collection of several values of potentially different data types (primitive or compound). The following code shows how a struct can be created and used.
```
#include <stdio.h>

typedef struct
{
  char initial; // initial of name
  int weightKg;
  int heightCm;
} Human;

int bmi(Human human)
{
  // weight in kg divided by height in m squared, computed with integers
  return (human.weightKg * 10000) / (human.heightCm * human.heightCm);
}

int main(void)
{
  Human carl;

  carl.initial = 'C';
  carl.weightKg = 100;
  carl.heightCm = 180;

  if (bmi(carl) > 25)
    puts("Carl is fat.");

  return 0;
}
```
The part of the code starting with `typedef struct` creates a new data type that we call `Human` (one convention for data type names is to start them with an uppercase character). This data type is a structure consisting of three members, one of type `char` and two of type `int`. Inside the `main` function we create a variable `carl` which is of `Human` data type. Then we set the specific values -- we see that each member of the struct can be accessed using the dot character (`.`), e.g. `carl.weightKg`; this can be used just as any other variable. Then we see the type `Human` being used in the parameter list of the function `bmi`, just as any other type would be used.
What is this good for? Why don't we just create global variables such as `carl_initial`, `carl_weightKg` and `carl_heightCm`? In this simple case it might work just as well, but in more complex code this would be burdensome -- imagine we wanted to create 10 variables of type `Human` (`john`, `becky`, `arnold`, ...). We would have to painstakingly create 30 variables (3 for each person), the function `bmi` would have to take two parameters (`height` and `weight`) instead of one (`human`) and if we wanted to e.g. add more information about every human (such as `hairLength`), we would have to manually create another 10 variables and add one parameter to the function `bmi`, while with a struct we only add one member to the struct definition and every variable of type `Human` gets it automatically.
**Structs can be nested**. So you may see things such as `myHouse.groundFloor.livingRoom.ceilingHeight` in C code.
Another extremely important compound type is **[array](array.md)** -- a sequence of items, all of which are of the same data type. Each array is specified with its length (number of items) and the data type of the items. We can have, for instance, an array of 10 `int`s, or an array of 235 `Human`s. The important thing is that we can **index** the array, i.e. we access the individual items of the array by their position, and this position can be specified with a variable. This allows for **looping over array items** and performing certain operations on each item. Demonstration code follows:
```
#include <stdio.h>
#include <math.h> // for sqrt()

int main(void)
{
  float vector[5];

  vector[0] = 1;
  vector[1] = 2.5;
  vector[2] = 0;
  vector[3] = 1.1;
  vector[4] = -405.054;

  puts("The vector is:");

  for (int i = 0; i < 5; ++i)
    printf("%lf ",vector[i]);

  putchar('\n'); // newline

  /* compute vector length with
     Pythagorean theorem: */

  float sum = 0;

  for (int i = 0; i < 5; ++i)
    sum += vector[i] * vector[i];

  printf("Vector length is: %lf\n",sqrt(sum));

  return 0;
}
```
We've included a new library called `math.h` so that we can use a function for square root (`sqrt`). (If you have trouble compiling the code, add the `-lm` flag to the compile command.)
`float vector[5];` is a declaration of an array of length 5 whose items are of type `float`. When the compiler sees this, it creates a contiguous area in memory long enough to store 5 numbers of `float` type, the numbers will reside here one after another.
After doing this, we can **index** the array with square brackets (`[` and `]`) like this: `ARRAY_NAME[INDEX]` where `ARRAY_NAME` is the name of the array (here `vector`) and `INDEX` is an expression that evaluates to an integer, **starting with 0** and going up to the array length minus one (remember that **programmers count from zero**). So the first item of the array is at index 0, the second at index 1 etc. The index can be a numeric constant like `3`, but also a variable or a whole expression such as `x + 3 * myFunction()`. An indexed array can be used just like any other variable, you can assign to it, you can use it in expressions etc. This is seen in the example. Trying to access an item beyond the array's bounds (e.g. `vector[100]`) is an error the language doesn't check for: it may crash your program or silently read or overwrite unrelated memory.
Especially important are the parts of code starting with `for (int i = 0; i < 5; ++i)`: this is an iteration over the array. It's a very common pattern that we use whenever we need to perform some action with every item of the array.
Arrays can also be multidimensional, but we won't bother with that right now.
Why are arrays so important? They allow us to work with large amounts of data, not just a handful of numeric variables. We can create an array of a million structs and easily work with all of them thanks to indexing and loops, this would be practically impossible without arrays. Imagine e.g. a game of [chess](chess.md); it would be very silly to have 64 plain variables, one for each square of the board (`squareA1`, `squareA2`, ..., `squareH8`), it would be extremely difficult to work with such code. With an array we can represent the whole board as a single array, we can iterate over all the squares easily etc.
string
## Macros/Preprocessor
The C language comes with a feature called *preprocessor* which is necessary for some advanced things. It allows automated modification of the source code before it is compiled.