In the previous parts of this series [1], [2], and [3], we have explored:
- the interpretation of multiplying a matrix by a vector,
- the physical meaning of matrix-matrix multiplication,
- the behavior of several special types of matrices, and
- the visualization of the matrix transpose.
In this story, I would like to share my perspective on what lies beneath matrix inversion, why the various formulas related to inversion look the way they do, and finally, why calculating the inverse can be done much more easily for matrices of several special types.
Here are the notational conventions that I use throughout the stories of this series:
- Matrices are denoted with uppercase letters (like “A”, “B”), while vectors and scalars are denoted with lowercase letters (like ‘x’, ‘y’ or ‘a’, ‘b’).
- |x| – is the length of vector ‘x’,
- Aᵀ – is the transpose of matrix “A”,
- A⁻¹ – is the inverse of matrix “A”.
Definition of the inverse matrix
From part 1 of this series – “matrix-vector multiplication” [1] – we remember that a matrix “A”, when multiplied by a vector ‘x’ as “y = Ax”, can be treated as transforming the input vector ‘x’ into the output vector ‘y’. In that case, the inverse matrix A⁻¹ should perform the reverse transformation – it should transform vector ‘y’ back into ‘x’:
\begin{equation*}
x = A^{-1}y
\end{equation*}
Substituting “y = Ax” in there gives us:
\begin{equation*}
x = A^{-1}y = A^{-1}(Ax) = (A^{-1}A)x
\end{equation*}
which implies that the product of the original matrix and its inverse – A⁻¹A – must be a matrix that applies no transformation at all to any input vector ‘x’. In other words:
\begin{equation*}
(A^{-1}A) = E
\end{equation*}
where “E” is the identity matrix.
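Before moving on, here is a minimal numerical check of this definition – a sketch using NumPy, where the 2×2 matrix is an arbitrary invertible example of my own, not one from the text:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([4.0, 5.0])

y = A @ x                                  # forward transformation: y = Ax
A_inv = np.linalg.inv(A)                   # the inverse matrix

print(A_inv @ y)                           # restores x: [4. 5.]
print(np.allclose(A_inv @ A, np.eye(2)))   # A^{-1}A = E -> True
```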

The first question that may arise here is: is it always possible to reverse the effect of a given matrix “A”? The answer is – it is possible only if no two different input vectors x₁ and x₂ are transformed by “A” into the same output vector ‘y’. In other words, the inverse matrix A⁻¹ exists only if for every output vector ‘y’ there exists exactly one input vector ‘x’ which is transformed by “A” into it:
\begin{equation*}
y = Ax
\end{equation*}
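To make this condition tangible, here is a sketch (NumPy, with a deliberately singular matrix of my own choosing) where two different inputs collapse into the same output, so no matrix can undo the transformation:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # the second row is a multiple of the first

x1 = np.array([2.0, 0.0])
x2 = np.array([0.0, 1.0])
print(A @ x1, A @ x2)         # both print [2. 4.]: two inputs, one output

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print("not invertible:", err)   # "Singular matrix"
```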


In this series, I don’t want to dive too deeply into the formal part of definitions and proofs. Instead, I want to look at several cases where it is indeed possible to invert the given matrix “A”, and we will see how the inverse matrix A⁻¹ is calculated in each of those cases.
Inverting chains of matrices
A very important formula related to the matrix inverse is:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
which states that the inverse of a product of matrices equals the product of the inverse matrices, but in the reverse order. Let’s understand why the order of the matrices is reversed.
What is the physical meaning of the inverse (AB)⁻¹? It must be a matrix that undoes the effect of the matrix (AB). So if:
\begin{equation*}
y = (AB)x,
\end{equation*}
then we should have:
\begin{equation*}
x = (AB)^{-1}y.
\end{equation*}
Now, the transformation “y = (AB)x” goes in two steps: first, we do:
\begin{equation*}
Bx = t,
\end{equation*}
which gives an intermediate vector ‘t’, and then that ‘t’ is multiplied by “A”:
\begin{equation*}
y = At = A(Bx).
\end{equation*}

So the matrix “A” acted on the vector after it had already been acted on by “B”. In this case, to undo such a sequential effect, we should first undo the effect of “A”, by multiplying A⁻¹ by ‘y’, which gives us:
\begin{equation*}
A^{-1}y = A^{-1}(ABx) = (A^{-1}A)Bx = EBx = Bx = t,
\end{equation*}
… the intermediate vector ‘t’, produced a bit above.

Then, after getting back the intermediate vector ‘t’, to restore ‘x’ we should also reverse the effect of matrix “B”. And that is done by multiplying B⁻¹ by ‘t’:
\begin{equation*}
B^{-1}t = B^{-1}(Bx) = (B^{-1}B)x = Ex = x,
\end{equation*}
or, writing it all in an expanded way:
\begin{equation*}
x = B^{-1}(A^{-1}A)Bx = (B^{-1}A^{-1})(AB)x,
\end{equation*}
which explicitly shows that to undo the effect of the matrix (AB) we should use (B⁻¹A⁻¹).

This is why, in the inverse of a product of matrices, their order is reversed:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
The same principle applies when we have more matrices in a chain, like:
\begin{equation*}
(ABC)^{-1} = C^{-1}B^{-1}A^{-1}
\end{equation*}
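Here is a quick numerical confirmation of the reversal rule – a sketch in which the two well-conditioned random matrices are arbitrary placeholders of mine:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # kept well-conditioned
B = rng.standard_normal((3, 3)) + 3 * np.eye(3)

lhs = np.linalg.inv(A @ B)
rhs = np.linalg.inv(B) @ np.linalg.inv(A)
print(np.allclose(lhs, rhs))                      # True: (AB)^{-1} = B^{-1}A^{-1}

wrong = np.linalg.inv(A) @ np.linalg.inv(B)       # order not reversed
print(np.allclose(lhs, wrong))                    # False in general
```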
Inversion of several special matrices
Now, with this perception of what lies beneath matrix inversion, let’s look at how matrices of several special types are inverted.
Inverse of a cyclic-shift matrix
A cyclic-shift matrix is a matrix “V” which, when multiplied by an input vector ‘x’, produces an output vector “y = Vx” where all the values of ‘x’ are cyclically shifted by some ‘k’ positions. To achieve that, the cyclic-shift matrix “V” has two lines of ‘1’s running parallel to its main diagonal, while all its other cells are ‘0’s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Vx =
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_3 \\ x_4 \\ x_5 \\ x_1 \\ x_2
\end{pmatrix}
\end{equation*}

Now, how should we undo the transformation of the cyclic-shift matrix “V”? Obviously, we should apply another cyclic-shift matrix V⁻¹, which now cyclically shifts all the values of its argument vector downwards by ‘k’ positions (remember, “V” was shifting all the values of ‘x’ upwards).
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = V^{-1}Vx =
\begin{bmatrix}
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= V^{-1}y
\end{equation*}

This is why the inverse of a cyclic-shift matrix is another cyclic-shift matrix:
\begin{equation*}
V_1^{-1} = V_2
\end{equation*}
More than that, we can note that the X-diagram of V⁻¹ is exactly the horizontal flip of the X-diagram of “V”. And from the previous part of this series – “transpose of a matrix” [3] – we remember that the horizontal flip of an X-diagram corresponds to the transpose of that matrix. This is why the inverse of a cyclic-shift matrix is equal to its transpose:
\begin{equation*}
V^{-1} = V^T
\end{equation*}
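Here is the 5×5 cyclic-shift matrix from the example above, checked in NumPy – a sketch, where building “V” with np.roll is just one convenient way to construct it:

```python
import numpy as np

V = np.roll(np.eye(5), 2, axis=1)    # shifts vector values up by 2 positions
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print(V @ x)                               # [3. 4. 5. 1. 2.]
print(np.allclose(np.linalg.inv(V), V.T))  # True: V^{-1} = V^T
print(V.T @ (V @ x))                       # restores [1. 2. 3. 4. 5.]
```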
Inverse of an exchange matrix
An exchange matrix, often denoted by “J”, is a matrix which, when multiplied by an input vector ‘x’, produces an output vector ‘y’ containing all the values of ‘x’, but in reverse order. To achieve that, “J” has ‘1’s on its anti-diagonal, while all its other cells are ‘0’s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Jx =
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_5 \\ x_4 \\ x_3 \\ x_2 \\ x_1
\end{pmatrix}
\end{equation*}

Obviously, to undo this type of transformation, we should apply the very same exchange once again.
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = J^{-1}Jx =
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= J^{-1}y
\end{equation*}

This is why the inverse of an exchange matrix is the exchange matrix itself:
\begin{equation*}
J^{-1} = J
\end{equation*}
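The same fact in a couple of NumPy lines – a sketch, where np.fliplr flips the identity to place the ‘1’s on the anti-diagonal:

```python
import numpy as np

J = np.fliplr(np.eye(5))
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print(J @ x)                           # [5. 4. 3. 2. 1.]
print(np.allclose(J @ J, np.eye(5)))   # True: J^{-1} = J
```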
Inverse of a permutation matrix
A permutation matrix is a matrix “P” which, when multiplied by an input vector ‘x’, rearranges its values into a different order. To achieve that, an N×N-sized permutation matrix “P” has N ‘1’s, arranged in such a way that no two ‘1’s appear in the same row or the same column. All other cells of “P” are ‘0’s.
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
= y = Px =
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_3 \\ x_1 \\ x_4 \\ x_5 \\ x_2
\end{pmatrix}
\end{equation*}

Now, what kind of matrix should the inverse of a permutation matrix be? In other words, how do we undo the transformation of a permutation matrix “P”? Obviously, we need to do another rearrangement, one that acts in the reverse direction. So, for instance, if the input value x₃ was moved by “P” to position 1 of the output, then the inverse permutation matrix P⁻¹ should move the value at position 1 back to position 3. This means that when drawing the X-diagrams of the permutation matrices P⁻¹ and “P”, one will be the reflection of the other.

Similarly to the case of the exchange matrix, in the case of a permutation matrix we can visually note that the X-diagrams of “P” and P⁻¹ differ only by a horizontal flip. That is why the inverse of any permutation matrix “P” is equal to its transpose:
\begin{equation*}
P^{-1} = P^T
\end{equation*}
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= x = P^{-1}Px =
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
= P^{-1}y
\end{equation*}
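And here is the same permutation matrix “P” from the example above, checked numerically (a sketch using NumPy):

```python
import numpy as np

P = np.zeros((5, 5))
P[0, 2] = P[1, 0] = P[2, 3] = P[3, 4] = P[4, 1] = 1.0   # the matrix above
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

print(P @ x)                               # [3. 1. 4. 5. 2.]
print(np.allclose(np.linalg.inv(P), P.T))  # True: P^{-1} = P^T
print(P.T @ (P @ x))                       # restores [1. 2. 3. 4. 5.]
```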
Inverse of a rotation matrix
A rotation matrix on the 2D plane is a matrix “R” which, when multiplied by a vector x = (x₁, x₂), rotates the point (x₁, x₂) counter-clockwise by a certain angle “θ” around the origin. Its formula is:
\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2
\end{pmatrix}
= y = Rx =
\begin{bmatrix}
\cos(\theta) & -\sin(\theta) \\
\sin(\theta) & \phantom{+}\cos(\theta)
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
\end{equation*}

Now, what should the inverse of a rotation matrix be? How do we undo the rotation produced by a matrix “R”? Obviously, it should be another rotation matrix, this time with the angle “-θ” (or, equivalently, “360°-θ”):
\begin{equation*}
R^{-1} =
\begin{bmatrix}
\cos(-\theta) & -\sin(-\theta) \\
\sin(-\theta) & \phantom{+}\cos(-\theta)
\end{bmatrix}
=
\begin{bmatrix}
\phantom{+}\cos(\theta) & \sin(\theta) \\
-\sin(\theta) & \cos(\theta)
\end{bmatrix}
=
R^T
\end{equation*}
This is why the inverse of a rotation matrix is another rotation matrix. We also see that the inverse R⁻¹ is equal to the transpose of the original matrix “R”.
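A short numerical check – a sketch, where the angle of 30° is an arbitrary choice of mine:

```python
import numpy as np

theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.allclose(np.linalg.inv(R), R.T))  # True: R^{-1} = R^T
x = np.array([1.0, 0.0])
print(R.T @ (R @ x))                       # restores [1. 0.]
```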
Inverse of a triangular matrix
An upper-triangular matrix is a square matrix that has zeros below its main diagonal. Because of that, in its X-diagram there are no arrows directed downwards:

The horizontal arrows correspond to the cells of the diagonal, while the arrows directed upwards correspond to the cells above the diagonal.
A lower-triangular matrix is defined similarly – it has zeros above its main diagonal. In this article, we will focus only on upper-triangular matrices, as the inversion of lower-triangular ones is performed in a similar way.
For simplicity, let’s first address inverting a 2×2-sized upper-triangular matrix “A”.

Once “A” is multiplied by an input vector ‘x’, the resulting vector “y = Ax” has the following form:
\begin{equation*}
y =
\begin{pmatrix}
y_1 \\ y_2
\end{pmatrix}
=
\begin{bmatrix}
a_{1,1} & a_{1,2} \\
0 & a_{2,2}
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 \\
a_{2,2}x_2
\end{pmatrix}
\end{equation*}
Now, when calculating the inverse matrix A⁻¹, we want it to act in the reverse direction:

How should we restore (x₁, x₂) from (y₁, y₂)? The first and simplest step is to restore x₂, using only y₂, because y₂ was originally affected only by x₂. We don’t need the value of y₁ for that:

Next, how should we restore x₁? This time, we can’t use only y₁, because the value “y₁ = a₁,₁x₁ + a₁,₂x₂” is a mixture of x₁ and x₂. But we can restore x₁ by using both y₁ and y₂ properly. Here, y₂ helps to filter out the influence of x₂, so the pure value of x₁ can be restored:

We now see that the inverse A⁻¹ of the upper-triangular matrix “A” is also an upper-triangular matrix.
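Writing both restoration steps in matrix form gives the explicit 2×2 inverse (this formula simply restates the two steps above):

\begin{equation*}
A^{-1} =
\begin{bmatrix}
\frac{1}{a_{1,1}} & -\frac{a_{1,2}}{a_{1,1}a_{2,2}} \\[0.2cm]
0 & \frac{1}{a_{2,2}}
\end{bmatrix}
\end{equation*}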
What about triangular matrices of larger sizes? This time, let’s take a 3×3-sized matrix and find its inverse analytically.

The values of the output vector ‘y’ are now obtained from ‘x’ in the following way:
\begin{equation*}
y =
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
= Ax =
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} \\
0 & a_{2,2} & a_{2,3} \\
0 & 0 & a_{3,3}
\end{bmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3 \\
a_{2,2}x_2 + a_{2,3}x_3 \\
a_{3,3}x_3
\end{pmatrix}
\end{equation*}
As we are interested in constructing the inverse matrix A⁻¹, our goal is to find (x₁, x₂, x₃), having the values of (y₁, y₂, y₃):
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?}
\end{bmatrix}
*
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
In other words, we must solve the system of linear equations written above.
Doing that will first restore the value of x₃ as:
\begin{equation*}
y_3 = a_{3,3}x_3, \hspace{1cm} x_3 = \frac{1}{a_{3,3}} y_3
\end{equation*}
which clarifies the cells of the last row of A⁻¹:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\
\text{?} & \text{?} & \text{?} \\
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
*
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
Having x₃ found, we can bring all its occurrences to the left side of the system:
\begin{equation*}
\begin{pmatrix}
y_1 - a_{1,3}x_3 \\
y_2 - a_{2,3}x_3 \\
y_3 - a_{3,3}x_3
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 + a_{1,2}x_2 \\
a_{2,2}x_2 \\
0
\end{pmatrix}
\end{equation*}
which allows us to calculate x₂ as:
\begin{equation*}
y_2 - a_{2,3}x_3 = a_{2,2}x_2, \hspace{1cm}
x_2 = \frac{y_2 - a_{2,3}x_3}{a_{2,2}} = \frac{y_2 - (a_{2,3}/a_{3,3})y_3}{a_{2,2}}
\end{equation*}
This already clarifies the cells of the second row of A⁻¹:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\text{?} & \text{?} & \text{?} \\[0.2cm]
0 & \frac{1}{a_{2,2}} & -\frac{a_{2,3}}{a_{2,2}a_{3,3}} \\[0.2cm]
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
*
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
Finally, having the values of x₃ and x₂ found, we can do the same trick of moving x₂ to the left side of the system:
\begin{equation*}
\begin{pmatrix}
\begin{aligned}
y_1 - a_{1,3}x_3 &- a_{1,2}x_2 \\
y_2 - a_{2,3}x_3 &- a_{2,2}x_2 \\
y_3 - a_{3,3}x_3 &
\end{aligned}
\end{pmatrix}
=
\begin{pmatrix}
a_{1,1}x_1 \\
0 \\
0
\end{pmatrix}
\end{equation*}
from which x₁ is derived as:
\begin{equation*}
\begin{aligned}
& y_1 - a_{1,3}x_3 - a_{1,2}x_2 = a_{1,1}x_1, \\
& x_1
= \frac{y_1 - a_{1,3}x_3 - a_{1,2}x_2}{a_{1,1}}
= \frac{y_1 - (a_{1,3}/a_{3,3})y_3 - a_{1,2}\frac{y_2 - (a_{2,3}/a_{3,3})y_3}{a_{2,2}}}{a_{1,1}}
\end{aligned}
\end{equation*}
so the first row of the matrix A⁻¹ can also be clarified:
\begin{equation*}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3
\end{pmatrix}
= A^{-1}y =
\begin{bmatrix}
\frac{1}{a_{1,1}} & -\frac{a_{1,2}}{a_{1,1}a_{2,2}} & \frac{a_{1,2}a_{2,3} - a_{1,3}a_{2,2}}{a_{1,1}a_{2,2}a_{3,3}} \\[0.2cm]
0 & \frac{1}{a_{2,2}} & -\frac{a_{2,3}}{a_{2,2}a_{3,3}} \\[0.2cm]
0 & 0 & \frac{1}{a_{3,3}}
\end{bmatrix}
*
\begin{pmatrix}
y_1 \\ y_2 \\ y_3
\end{pmatrix}
\end{equation*}
After deriving A⁻¹ analytically, we can see that it is also an upper-triangular matrix.
Paying attention to the sequence of actions that we used here to calculate A⁻¹, it is now clear that the inverse of any upper-triangular matrix “A” is also an upper-triangular matrix:

A similar argument shows that the inverse of a lower-triangular matrix is another lower-triangular matrix.
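The back-substitution procedure we just walked through also translates directly into code. Below is a sketch (NumPy; the 3×3 values are arbitrary, with a nonzero diagonal) that inverts an upper-triangular matrix column by column, by solving Ax = eᵢ for every basis vector eᵢ:

```python
import numpy as np

def invert_upper_triangular(A):
    """Invert an upper-triangular matrix via back substitution."""
    n = A.shape[0]
    inv = np.zeros((n, n))
    for col in range(n):
        e = np.zeros(n)
        e[col] = 1.0                     # solve A x = e_col
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):   # from the last row upwards
            x[i] = (e[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
        inv[:, col] = x
    return inv

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])

A_inv = invert_upper_triangular(A)
print(np.allclose(A_inv, np.linalg.inv(A)))   # True
print(np.allclose(A_inv, np.triu(A_inv)))     # True: the inverse is upper-triangular too
```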
A numerical example of inverting a sequence of matrices
Let’s take another look at why, during the inversion of a chain of matrices, their order is reversed. Recalling the formula:
\begin{equation*}
(AB)^{-1} = B^{-1}A^{-1}
\end{equation*}
This time, for both ‘A’ and ‘B’, we will take matrices of specific types. The first matrix “A = V” will be a cyclic-shift matrix:

Let’s recall here that to restore the input vector ‘x’, the inverse V⁻¹ should do the opposite – cyclically shift the values of its argument vector ‘y’ downwards:

The second matrix “B = S” will be a diagonal matrix with different values on its main diagonal:

The inverse S⁻¹ of such a scaling matrix, in order to restore the original vector ‘x’, must halve only the first two values of its argument vector ‘y’:

Now, what kind of behavior will the product matrix “VS” have? When calculating “y = VSx”, it will double only the first two values of the input vector ‘x’, and cyclically shift the entire result upwards.

We already know that after the output vector “y = VSx” is calculated, to reverse the effect of the product matrix “VS” and to restore the input vector ‘x’, we should do:
\begin{equation*}
x = (VS)^{-1}y = S^{-1}V^{-1}y
\end{equation*}
In other words, the order of the matrices ‘V’ and ‘S’ must be reversed during inversion:

And what will happen if we try to invert the effect of “VS” in an improper way, without reversing the order of the matrices, assuming that V⁻¹S⁻¹ is what should be used for it?

We see that the original vector (x₁, x₂, x₃, x₄) from the right side is not restored on the left side now. Instead, we have the vector (2x₁, x₂, 0.5x₃, x₄) there. One reason for this is that the value x₃ should not be halved anywhere on its path, but it actually does get halved: at the moment when the matrix S⁻¹ is applied, x₃ appears at the second position from the top, and so it is halved. The same goes for the path of the value x₁. All of this results in an altered vector on the left side.
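Here is this example worked out in NumPy – a sketch, assuming a 4-vector, a cyclic shift up by one position, and a diagonal matrix that doubles the first two values, matching the description above:

```python
import numpy as np

V = np.roll(np.eye(4), 1, axis=1)     # cyclic shift up by 1 position
S = np.diag([2.0, 2.0, 1.0, 1.0])     # doubles the first two values
x = np.array([1.0, 2.0, 3.0, 4.0])

y = V @ S @ x
print(np.linalg.inv(S) @ np.linalg.inv(V) @ y)   # correct order: [1. 2. 3. 4.]
print(np.linalg.inv(V) @ np.linalg.inv(S) @ y)   # wrong order:   [2.  2.  1.5 4. ]
```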
Conclusion
In this story, we have looked at the matrix inversion operation A⁻¹ as something that undoes the transformation of the given matrix “A”. We have seen why inverting a chain of matrices like (ABC)⁻¹ actually reverses the order of multiplication, resulting in C⁻¹B⁻¹A⁻¹. We have also gained a visual perspective on why inverting several special types of matrices results in another matrix of the same type.
Thanks for reading!
This might be the last part of my “Understanding Matrices” series. I hope you enjoyed reading all 4 parts! If so, feel free to follow me on LinkedIn – other articles will hopefully be coming soon, and I will post updates there!
References:
[1] – Understanding matrices | Part 1: Matrix-Vector Multiplication
[2] – Understanding matrices | Part 2: Matrix-Matrix Multiplication
[3] – Understanding matrices | Part 3: Matrix Transpose