How to Interpret Matrix Expressions — Transformations

-

Let’s return to the matrix

and apply the transformation to a couple of sample points.

The results of transformation B on various input vectors

Notice the following:

  • point x₁​ has been rotated counterclockwise and brought closer to the origin,
  • point x₂​, however, has been rotated clockwise and pushed away from the origin,
  • point x₃​ has only been scaled down, meaning it’s moved closer to the origin while keeping its direction,
  • point x₄ has kept its direction in the same way, but has been scaled up, moving it farther from the origin.

The transformation compresses in the x⁽¹⁾-direction and stretches in the x⁽²⁾-direction. You can think of the grid lines as behaving like an accordion.

Directions such as those represented by the vectors x₃ and x₄ play a crucial role in machine learning, but that's a story for another time.

For now, we can call them eigen-directions, because vectors along these directions are only scaled by the transformation, without being rotated. Every transformation, apart from rotations, has its own set of eigen-directions.
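If you want to find eigen-directions numerically, NumPy exposes them through np.linalg.eig. Here is a minimal sketch using a hypothetical matrix chosen purely for illustration (the matrix B from the figures is not reproduced in the text):

import numpy as np

# A hypothetical matrix, used only to illustrate eigen-directions;
# it is not the matrix B from the figures above.
M = np.array([
    [0.5, 0.0],
    [0.0, 1.5]
])

# np.linalg.eig returns the eigenvalues (scaling factors) and the
# eigenvectors (the eigen-directions) as columns of the second result.
values, vectors = np.linalg.eig(M)

for value, vector in zip(values, vectors.T):
    # A vector along an eigen-direction is only scaled, never rotated.
    print(vector, 'is scaled by', value, np.allclose(M @ vector, value * vector))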

Recall that the transformation matrix is constructed by stacking the transformed basis vectors as columns. You might like to see what happens if we then swap the rows and columns (the transposition).

Let us take, for example, the matrix

where Aᵀ stands for the transposed matrix.

From a geometrical perspective, the coordinates of the first new basis vector come from the first coordinates of all the old basis vectors, the second from the second coordinates, and so on.

In NumPy, it's as simple as this:

import numpy as np

A = np.array([
    [1, -1],
    [1,  1]
])

print(f'A transposed:\n{A.T}')

A transposed:
[[ 1  1]
 [-1  1]]

I have to disappoint you now, as I cannot provide a simple rule that expresses the connection between the transformations A and Aᵀ in just a few words.

Instead, let me show you a property shared by both the original and transposed transformations, which will come in handy later.

Here is the geometric interpretation of the transformation represented by the matrix A. The area shaded in gray is called a parallelogram.

Parallelogram spanned by the basis vectors transformed by matrix A

Compare this with the transformation obtained by applying the matrix Aᵀ:

Parallelogram spanned by the basis vectors transformed by matrix Aᵀ

Now, let us consider another transformation that applies entirely different scales to the unit vectors:

The parallelogram associated with the matrix B is much narrower now:

Parallelogram spanned by the basis vectors transformed by matrix B

but it turns out to be the same size as that for the matrix Bᵀ:

Parallelogram spanned by the basis vectors transformed by matrix Bᵀ

Let me put it this way: you have a set of numbers to assign to the components of your vectors. If you assign a larger number to one component, you will have to use smaller numbers for the others. In other words, the total length of the vectors that make up the parallelogram stays the same. I know this reasoning is a bit vague, so if you're looking for more rigorous proofs, check the literature in the references section.

And here's the kicker at the end of this section: the area of the parallelograms can be found by calculating the determinant of the matrix. What's more, the determinants of a matrix and its transpose are equal.
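A quick numerical check of both claims, using the matrix A from the NumPy example above:

import numpy as np

A = np.array([
    [1, -1],
    [1,  1]
])

# The area of the parallelogram equals |det(A)|, and transposing does not change it.
print(np.linalg.det(A))    # 2.0 (up to floating-point noise)
print(np.linalg.det(A.T))  # the same value
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True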

More on the determinant in the upcoming sections.

You can apply a sequence of transformations: for example, start by applying A to the vector x, and then pass the result through B. This can be done by first multiplying the vector x by the matrix A, and then multiplying the result by the matrix B:

You can multiply the matrices B and A to obtain the matrix C for further use:
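In NumPy, the composition is just a matrix product. Here is a small sketch, with illustrative values for A and B, since the article's matrices are not reproduced in the text:

import numpy as np

# Illustrative matrices, assumed only for the sake of the example.
A = np.array([[1, -1],
              [1,  1]])
B = np.array([[2, 0],
              [0, 1]])
x = np.array([1, 2])

C = B @ A  # the composite transformation: first A, then B
print(np.array_equal(C @ x, B @ (A @ x)))  # True: one matrix does the job of two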

This is the effect of the transformation represented by the matrix C:

Transformation described by the composite matrix BA

You can perform the transformations in reverse order: first apply B, then apply A:

Let D represent the sequence of multiplications performed in this order:

And this is how it affects the grid lines:

Transformation described by the composite matrix AB

So, you can see for yourself that the order of matrix multiplication matters.
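You can confirm this with the same illustrative matrices as above:

import numpy as np

A = np.array([[1, -1],
              [1,  1]])
B = np.array([[2, 0],
              [0, 1]])

print(B @ A)  # first A, then B
print(A @ B)  # first B, then A
print(np.array_equal(B @ A, A @ B))  # False: the order of multiplication matters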

There's a cool property of the transpose of a composite transformation. See what happens when we multiply A by B:

and then transpose the result, which means we'll apply (AB)ᵀ:

You can easily extend this observation to the following rule:
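This is the familiar identity (AB)ᵀ = BᵀAᵀ, which is easy to verify numerically with the same illustrative matrices:

import numpy as np

A = np.array([[1, -1],
              [1,  1]])
B = np.array([[2, 0],
              [0, 1]])

# The transpose of a product reverses the order of the factors.
print(np.array_equal((A @ B).T, B.T @ A.T))  # True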

To finish off this section, consider the inverse problem: is it possible to recover the matrices A and B given only C = AB?

This is matrix factorization, which, as you might expect, does not have a unique solution. Matrix factorization is a powerful technique that can provide insight into transformations, as they can be expressed as a composition of simpler, elementary transformations. But that's a subject for another time.

You can easily construct a matrix representing a do-nothing transformation that leaves the standard basis vectors unchanged:

It is known as the identity matrix.
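In NumPy it is available directly as np.eye, and applying it changes nothing:

import numpy as np

I = np.eye(2)              # ones on the diagonal, zeros everywhere else
x = np.array([3.0, -7.0])

print(I @ x)               # [ 3. -7.]  -- the vector is left unchanged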

Take a matrix A and consider the transformation that undoes its effects. The matrix representing this transformation is A⁻¹. Specifically, when applied after or before A, it yields the identity matrix I:

There are numerous resources that explain how to calculate the inverse by hand. I recommend learning the Gauss-Jordan method, since it involves simple row manipulations on the augmented matrix. At each step, you can swap two rows, rescale any row, or add to a particular row a weighted sum of the remaining rows.
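For reference, here is a compact sketch of those row operations in NumPy. It is a bare-bones illustration of the hand procedure, not a replacement for np.linalg.inv:

import numpy as np

def gauss_jordan_inverse(A):
    # Row-reduce the augmented matrix [A | I]; when the left block becomes
    # the identity, the right block is the inverse of A.
    n = A.shape[0]
    aug = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        # Swap in the row with the largest pivot in this column (partial pivoting).
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError('matrix is singular, no inverse exists')
        aug[[col, pivot]] = aug[[pivot, col]]
        # Rescale the pivot row so the pivot equals 1.
        aug[col] /= aug[col, col]
        # Subtract multiples of the pivot row to clear the rest of the column.
        for row in range(n):
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]

A = np.array([[1, -1],
              [1,  1]])
print(gauss_jordan_inverse(A))  # [[ 0.5  0.5]
                                #  [-0.5  0.5]]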

Take the following matrix as an example for hand calculations:

You should get the inverse matrix:

Verify by hand that equation (4) holds. You can also do this in NumPy.

import numpy as np

A = np.array([
    [1, -1],
    [1,  1]
])

print(f'Inverse of A:\n{np.linalg.inv(A)}')

Inverse of A:
[[ 0.5  0.5]
 [-0.5  0.5]]
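And a quick check that equation (4) indeed holds numerically:

import numpy as np

A = np.array([
    [1, -1],
    [1,  1]
])
A_inv = np.linalg.inv(A)

# Composing A with its inverse, in either order, gives the identity matrix
# (np.allclose absorbs floating-point rounding).
print(np.allclose(A @ A_inv, np.eye(2)))  # True
print(np.allclose(A_inv @ A, np.eye(2)))  # True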

Take a look at how the two transformations differ in the illustrations below.

Transformation A
Transformation A⁻¹

At first glance, it's not obvious that one transformation reverses the effects of the other.

Nevertheless, in these plots you can notice a fascinating and far-reaching connection between the transformation and its inverse.

Take a close look at the first illustration, which shows the effect of transformation A on the basis vectors. The original unit vectors are depicted semi-transparently, while their transformed counterparts, resulting from multiplication by matrix A, are drawn clearly and solidly. Now, imagine that these newly drawn vectors are the basis vectors you use to describe the space, and you perceive the original space from their perspective. Then, the original basis vectors will, first, appear smaller and, second, be oriented towards the east. And this is precisely what the second illustration shows, demonstrating the effect of the transformation A⁻¹.

This is a preview of an upcoming topic I'll cover in the next article about using matrices to represent different perspectives on data.

All of this sounds great, but there’s a catch: some transformations can’t be reversed.

The workhorse of the next experiment will be the matrix with 1s on the diagonal and b on the antidiagonal:

where b is a fraction in the interval (0, 1). This matrix is, by definition, symmetric, as it happens to be equal to its own transpose: A = Aᵀ. But I'm only mentioning this in passing; it's not particularly relevant here.

Invert this matrix using the Gauss-Jordan method, and you will get the following:

You can easily find online the rules for calculating the determinant of 2×2 matrices, which will give

This is no coincidence. In general, it holds that

Notice that when b = 0, the two matrices are identical. This is no surprise, as A then reduces to the identity matrix I.
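Here is a quick numerical check of the general relationship det(A⁻¹) = 1/det(A), using a sample value of b:

import numpy as np

b = 0.5  # an arbitrary value from the interval (0, 1)
A = np.array([[1, b],
              [b, 1]])

det_A = np.linalg.det(A)                    # 1 - b**2 = 0.75
det_A_inv = np.linalg.det(np.linalg.inv(A))

print(det_A, det_A_inv)
print(np.isclose(det_A * det_A_inv, 1.0))   # True: det(A⁻¹) = 1 / det(A)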

Things get tricky when b = 1, because then det(A) = 0 and det(A⁻¹) becomes infinite. Consequently, A⁻¹ does not exist for a matrix A consisting entirely of 1s. In algebra classes, teachers often warn you about a zero determinant. However, once we consider where the matrix comes from, it becomes apparent that an infinite determinant can also occur, leading to a fatal error. Anyway,

a zero determinant means the transformation is non-invertible.
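NumPy reports exactly this situation as an error when you set b = 1:

import numpy as np

# With b = 1 the matrix consists entirely of 1s, its determinant is zero,
# and the transformation cannot be undone.
A = np.ones((2, 2))

try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as error:
    print(error)  # Singular matrix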


What are your thoughts on this topic?
Let us know in the comments below.
