You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

7.9 KiB

Linear Algebra

In mathematics linear algebra is an extension of the classical elemental algebra (which means the basic "operations with numbers/variables") to vectors and matrices (kind of "operations with arrays of numbers"). It is a basic tool of advanced mathematics and computer science (and many other sciences) and at least at the very basic level should be known by every programmer.

Why is it called linear algebra? It is related to the concept of linearity which kind of has to do with "dealing with straight lines" (NOT curved ones), i.e. the results we get in linear algebra are more abstract equivalents of "straight lines", e.g. planes and hyperplanes, though this may be hard too see due to all the abstraction and higher dimensionality. { The concept of linearity has several possibly incompatible definitions and is kinda confusing (for example in whether the lines always have to pass through the origin). I was actually looking up "why is it called LINEAR algebra" and the above explanation is how I understood the answers I found. ~drummyfish }

Basics

In "normal" algebra our basic elements are numbers; we learn to add then, multiply then, solve equation with them etc. In linear algebra we call these "single numbers" scalars (e.g. 1, -10.5 or pi are scalars), and we also add more complex elements: vectors and matrices, with which we may perform similar operations, even though they sometimes behave a bit differently (e.g. the order in multiplication of matrices matters, unlike with scalars).

Vectors are, put in a very simplified and slightly incorrect way, sequences (arrays) of numbers, e.g. a vector of length 3 may be [1.5, 0, -302]. A matrix can similarly be seen as a two dimensional "array of numbers", e.g. a 2x3 matrix may look like this:

|1  2.5 -10|
|24 -3   0 |

We may kind of see vectors as matrices that have either only one column, so called column vectors, or only one row, so called row vectors -- it is only a matter of convention which type of vectors we choose to use (this affects e.g. "from which side" we will multiply vectors by matrices). I.e. we choose which kind of vectors we'll use and then keep using only that kind. For example a row vector

|5 7.3 -2|

is really a 1x3 matrix that as a column vector (3x1 matrix) would look as

|5  |
|7.3|
|-2 |

Why do we even work with vectors and matrices? Because these can represent certain things we encounter in math and programming better than numbers, e.g. vectors may represent points in space or velocities with directions and matrices may represent transformations such as rotations (this is not obvious but it's true).

With vectors and matrices we can perform similar operations as with "normal numbers", i.e. addition, subtraction, multiplication, but there are also new operations and some operations may behave differently. E.g. when dealing with vectors, there are multiple ways to "multiply" them: we may multiply a vector with a scalar but also a vector with vector (and there are multiple ways to do this such as dot product which results in a scalar and cross product which results in a vector). Matrix multiplication is, unlike multiplication of real numbers, non-commutative (A times B doesn't necessarily equal B times A), but it's still distributive. We can also multiply vectors with matrices but only those that have "compatible sizes". And we can also solve equations and systems of equations which have vectors and matrices in them.

There is an especially important matrix called the identity matrix (sometimes also unit matrix), denoted I, an NxN matrix by which if we multiply any matrix we get that same matrix. The identity matrix has 1s on the main diagonal and 0s elsewhere. E.g. a 3x3 identity matrix looks as

|1 0 0|
|0 1 0|
|0 0 1|

Now let's see some the details of basic operations with vectors and matrices:

  • matrix/vector addition/subtraction: We can add (subtract) vectors and matrices only if they have exactly the same size. We perform the operation very simply element-wise. E.g. adding vector [1 0 -2] to vector [3 1.1 3] results in vector [4 1.1 1].
  • matrix/vector multiplication by scalar: We simply multiply each element of the vector/matrix by the scalar, e.g. [2 0 -3] * 7 = [14 0 -21].
  • matrix/vector multiplication: We can multiply matrix (vector) A by matrix (vector) B only if A has the number of columns equal to the number of rows of B. I.e. we can e.g. multiply a 2x3 (2 rows, 3 columns) matrix by a 3x5 matrix, but NOT a 2x4 matrix by 2x4 matrix. Note that unlike with real numbers, order in matrix multiplication matters (matrix multiplication is non-commutative), i.e. AB is not generally equal to BA. Multiplying a MxN matrix by NxO matrix results in a MxO matrix (e.g. 2x3 matrix times 3x4 matrix results in a 2x4 matrix) in which each element is a dot product of the corresponding row from the first matrix with the corresponding column of the second matrix. An example will follow later.
  • matrix/vector "division": We mention just for clarity that the term matrix division isn't really used but we can achieve the principle of division by multiplication by inverse matrices (similarly to how division on real numbers is really a multiplication by reciprocal of a number).
  • matrix/vector transpose: Transpose of a matrix A is denoted as A^T. It is the matrix A flipped by its main (top left to bottom right) diagonal, i.e. the transpose of an NxM matrix is an MxN matrix. Transpose makes column vectors into row vectors and back.
  • matrix inverse: The inverse matrix of an NxN matrix A is denoted as A^-1 and it is a matrix such that if we multiply A by it we get the identity matrix (I). Inverse matrix is similar to a reciprocal value in the world of real numbers. Note that non-square matrices don't have inverses and even some square matrices don't have inverses. How to invert a matrix? A general method is to simply solve the equation that defines it.
  • matrix determinant: Determinant of a matrix is a scalar computed in a specific way from the matrix that reflects some of the properties of the matrix (e.g. its invertibility). It appears in many equations so it's good to know about it.

Example of matrix multiplication: this is a super important operation so let's see an example. Let's have a 2x3 matrix A:

    |1 2 3|
A = |4 5 6|

and a 3x4 matrix B:

    |7  8  9  10|
B = |11 12 13 14|
    |15 16 17 18|

The result, AB, will be a 2x4 matrix in which e.g. the top-left element is equal to 1 * 7 + 2 * 11 + 3 * 15 = 74 (the dot product of the row 1 2 3 with the column 7 11 15). On paper we usually draw the matrices conveniently as follows:

                                |7   8   9   10 |
                                |11  12  13  14 |         
                                |15  16  17  18 |
        |7  8  9  10|
|1 2 3| |11 12 13 14| = |1 2 3| |74  80  86  92 |
|4 5 6| |15 16 17 18|   |4 5 6| |173 188 203 218|

In case it's still not clear, here is a C code of the above shown matrix multiplication:

#include <stdio.h>

int main()
{
  int A[2][3] = {
    {1, 2, 3},
    {4, 5, 6}};

  int B[3][4] = {
    {7,  8,  9,  10},
    {11, 12, 13, 14},
    {15, 16, 17, 18}};
    
  for (int row = 0; row < 2; ++row)
  {
    for (int col = 0; col < 4; ++col)
    {
      int sum = 0;
      
      for (int i = 0; i < 3; ++i)
        sum += A[row][i] * B[i][col];
        
      printf("%d ",sum);
    }
    
    putchar('\n');
  }
  
  return 0;
}

See Also