Understanding Tensors: Learning a Data Structure Through 3 Pesky Errors

I’ve recently been tinkering with deep learning models in TensorFlow, and have accordingly been introduced to managing data as tensors.

As a Data Engineer who works all day in tables that I can easily slice, dice, and visualize, I had absolutely no intuition for working with tensors, and I seemed to consistently run into the same errors that, especially at first, went way over my head.

Nevertheless, deep diving into them has taught me quite a bit about tensors and TensorFlow, and I wanted to consolidate those learnings here to use as a reference.

If you have a favorite error, solution, or debugging tip, please leave a comment!

Before we dive into the errors themselves, I wanted to document a few of the lightweight, simple bits of code that I’ve found helpful in debugging. (Although it must be stated for legal reasons that we of course always debug with official debugging features and never just dozens of print statements 🙂)

Seeing inside our TensorFlow Datasets

First off, our actual data. When we print a DataFrame or SELECT * in SQL, we see the data! When we print a tensor dataset we see…

<_TensorSliceDataset element_spec=(TensorSpec(shape=(2, 3), dtype=tf.int32, name=None), TensorSpec(shape=(1, 1), dtype=tf.int32, name=None))>

This is all quite useful information, but it doesn’t help us understand what’s actually going on in our data.

To print a single tensor within the execution graph we can leverage tf.print. This article is a great deep dive into tf.print that I highly recommend if you plan to use it often: Using tf.Print() in TensorFlow
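As a minimal sketch (the function and tensor here are just for illustration), tf.print prints real values even inside a graph built by tf.function, where a plain Python print only runs once at tracing time:

import tensorflow as tf

@tf.function
def scale(t):
    # tf.print executes as part of the graph, so it shows actual runtime values
    tf.print("input tensor:", t)
    return t * 2

scale(tf.constant([1, 2, 3]))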

But when working with TensorFlow datasets during development, sometimes we need to see a few values at a time. For that we can loop through and print individual pieces of data like this:


import numpy as np
import tensorflow as tf

# Generate dummy 2D data
np.random.seed(42)
num_samples = 100
num_features = 5
X_data = np.random.rand(num_samples, num_features).astype(np.float32)
y_data = 2 * X_data[:, 0] + 3 * X_data[:, 1] - 1.5 * X_data[:, 2] + 0.5 * X_data[:, 3] + np.random.randn(num_samples)

# Turn it into a TensorFlow Dataset
dataset = tf.data.Dataset.from_tensor_slices((X_data, y_data))

# Print the first 10 rows
for i, (features, label) in enumerate(dataset.take(10)):
    print(f"Row {i + 1}: Features - {features.numpy()}, Label - {label.numpy()}")

We can also use skip to get to a particular index:

# Skip ahead and take a slice (our dummy dataset only has 100 rows, so stay within bounds)
mini_dataset = dataset.skip(50).take(20)
for i, (features, label) in enumerate(mini_dataset):
    print(f"Row {i + 1}: Features - {features.numpy()}, Label - {label.numpy()}")

Knowing our tensors’ specs

When working with tensors we also need to know their shape, rank, dimension, and data type (if some of that vocabulary is unfamiliar, as it was to me initially, don’t worry, we’ll get back to it later in the article). Anyway, below are a few lines of code to gather this information:


# Create a sample tensor
sample_tensor = tf.constant([[1, 2, 3], [4, 5, 6]])

# Get the size of the tensor (total number of elements)
tensor_size = tf.size(sample_tensor).numpy()

# Get the rank of the tensor
tensor_rank = tf.rank(sample_tensor).numpy()

# Get the shape of the tensor
tensor_shape = sample_tensor.shape

# Get the dimensions of the tensor as a list
tensor_dimensions = sample_tensor.shape.as_list()

# Print the results
print("Tensor Size:", tensor_size)
print("Tensor Rank:", tensor_rank)
print("Tensor Shape:", tensor_shape)
print("Tensor Dimensions:", tensor_dimensions)

The above outputs:

Tensor Size: 6
Tensor Rank: 2
Tensor Shape: (2, 3)
Tensor Dimensions: [2, 3]

Augmenting model.summary()

Finally, it is always helpful to be able to see how data moves through a model, and how shape changes across the inputs and outputs between layers. The source of many an error will be a mismatch between these expected input and output shapes and the shape of a given tensor.

model.summary() of course gets the job done, but we can complement that information with the following snippet, which adds a bit more context about model and layer inputs and outputs:

print("###################Input Shape and Datatype#####################")
[print(i.shape, i.dtype) for i in model.inputs]
print("###################Output Shape and Datatype#####################")
[print(o.shape, o.dtype) for o in model.outputs]
print("###################Layer Input Shape and Datatype#####################")
[print(l.name, l.input, l.dtype) for l in model.layers]

So let’s jump into some errors!

Rank

ValueError: Shape must be rank x but is rank y…

Okay, first of all, what is a rank? Rank is just the unit of dimensionality we use to describe tensors. A rank 0 tensor is a scalar value; a rank 1 tensor is a vector; a rank 2 tensor is a matrix, and so on for all n-dimensional structures.
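As a quick sketch of that vocabulary in code:

import tensorflow as tf

scalar = tf.constant(7)                  # rank 0
vector = tf.constant([1.0, 2.0, 3.0])    # rank 1
matrix = tf.constant([[1, 2], [3, 4]])   # rank 2

print(tf.rank(scalar).numpy(), tf.rank(vector).numpy(), tf.rank(matrix).numpy())
# 0 1 2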

Take, for example, a five-dimensional tensor.

rank_5_tensor = tf.constant([[[[[1, 2], [3, 4]], [[5, 6], [7, 8]]], [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]],
                             [[[[17, 18], [19, 20]], [[21, 22], [23, 24]]], [[[25, 26], [27, 28]], [[29, 30], [31, 32]]]]])
print("\nRank 5 Tensor:", rank_5_tensor.shape)

Rank 5 Tensor: (2, 2, 2, 2, 2)

The code above shows that each of the five dimensions has a size of two. If we wanted to index it, we could do so along any of those axes. To get at the last element, 32, we’d run something like:

rank_5_tensor.numpy()[1][1][1][1][1]
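As a side note, the same element can also be pulled with a single multi-axis index, which reads a little more cleanly:

rank_5_tensor[1, 1, 1, 1, 1].numpy()  # 32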

The official tensor documentation has some really helpful visualizations to make this a bit more comprehensible.

Back to the error: it’s just flagging that the tensor provided has a different rank than what is expected by a particular function. For example, if the error declares that the “Shape must be rank 1 but is rank 0…”, it means that we’re providing a scalar value where it expects a 1-D tensor.

Take the example below, where we attempt to multiply tensors together with the matmul method.

import tensorflow as tf
import numpy as np

# Create a TensorFlow dataset with random matrices
num_samples = 5
matrix_size = 3
dataset = tf.data.Dataset.from_tensor_slices(np.random.rand(num_samples, matrix_size, matrix_size))
mul = [1, 2, 3, 4, 5, 6]

# Define a function that uses tf.matmul
def matmul_function(matrix):
    return tf.matmul(matrix, mul)

# Apply the matmul_function to the dataset using map
result_dataset = dataset.map(matmul_function)

If we take a peek at the documentation, matmul expects at least a rank 2 tensor, so multiplying the matrix by [1,2,3,4,5,6], which is just an array, will raise this error.

ValueError: Shape must be rank 2 but is rank 1 for '{{node MatMul}} = MatMul[T=DT_DOUBLE, transpose_a=false, transpose_b=false](args_0, MatMul/b)' with input shapes: [3,3], [2].

A great first step for this error is to dive into the documentation and understand what the function you are using is looking for. (Here’s a nice list of the functions available on tensors: raw_ops.)

Then use the rank method to determine what we are actually providing.

print(tf.rank(mul))
tf.Tensor(1, shape=(), dtype=int32)

As far as fixes go, tf.reshape is often a good place to start. Let’s take a brief moment to talk a little bit about tf.reshape, since it will be a faithful companion throughout our TensorFlow journey: tf.reshape(tensor, shape, name=None)

Reshape simply takes in the tensor we want to reshape and another tensor containing the shape we want the output to be. For example, let’s reshape our multiplication input:

mul = [1, 2, 3, 4, 5, 6]
tf.reshape(mul, [3, 2]).numpy()

array([[1, 2],
       [3, 4],
       [5, 6]], dtype=int32)

Our variable turns into a (3, 2) tensor (3 rows, 2 columns). A quick note: tf.reshape(t, [3, -1]).numpy() will produce the same thing, because the -1 tells TensorFlow to compute the size of that dimension such that the total size remains constant. The number of elements in the shape tensor is the rank.

Once we create a tensor with the correct rank, our multiplication will work just fine!
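Putting that together, here is a minimal sketch of one possible fix for the earlier map example (mul_matrix is just an illustrative name; note the explicit float64 cast so the dtype matches the matrices generated by np.random.rand):

# Reshape the multiplier into a rank 2 tensor and match the matrices' dtype
mul_matrix = tf.reshape(tf.constant(mul, dtype=tf.float64), [3, 2])

def matmul_function(matrix):
    # (3, 3) x (3, 2) -> (3, 2)
    return tf.matmul(matrix, mul_matrix)

result_dataset = dataset.map(matmul_function)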

Shape

ValueError: Input of layer is incompatible with the layer…

Having an intuitive understanding of tensor shape, and how it interacts and changes across model layers, has made my life with deep learning significantly easier.

First, to get basic vocab out of the way: the shape of a tensor refers to the number of elements along each dimension, or axis, of the tensor. For example, a 2D tensor with 3 rows and 4 columns has a shape of (3, 4).

So what can go wrong with shape? Glad you asked, quite a few things!

First and foremost, the shape and rank of your training data must match the input shape expected by the input layer. Let’s take a look at an example, a basic CNN:

import tensorflow as tf
from tensorflow.keras import layers, models

# Create a function to generate sample data
def generate_sample_data(num_samples=100):
    for _ in range(num_samples):
        features = tf.random.normal(shape=(64, 64, 3))
        labels = tf.one_hot(tf.random.uniform(shape=(), maxval=10, dtype=tf.int32), depth=10)
        yield features, labels

# Create a TensorFlow dataset using the generator function
sample_dataset = tf.data.Dataset.from_generator(
    generate_sample_data,
    output_signature=(
        tf.TensorSpec(shape=(64, 64, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(10,), dtype=tf.float32),
    ),
)

# Create a CNN model with an input layer expecting (128, 128, 3)
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model using the dataset
model.fit(sample_dataset.batch(32).repeat(), epochs=5, steps_per_epoch=100)

Attempting to run the code above will result in:

ValueError: Input 0 of layer "sequential_5" is incompatible with the layer: expected shape=(None, 128, 128, 3), found shape=(None, 64, 64, 3)

This is because our model expects the input tensor to be of shape (128, 128, 3), while our generated data is (64, 64, 3).

In a situation like this, our good friend reshape, or another TensorFlow function, resize, can help. If, as in the case above, we’re working with images, we can simply run resize or change our model’s input expectations:

# Resize the images to the shape the model expects
target_shape = (128, 128)

def resize_image(image, label):
    resized_image = tf.image.resize(image, size=target_shape)
    return resized_image, label

# Apply the resize function to the entire dataset
resized_dataset = sample_dataset.map(resize_image)
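With that mapping applied, the rest of the fix is a sketch as simple as fitting against the resized dataset instead of the original one (alternatively, we could just change input_shape in the first Conv2D layer to (64, 64, 3)):

# Fit against the resized data rather than the original 64x64 samples
model.fit(resized_dataset.batch(32).repeat(), epochs=5, steps_per_epoch=100)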

In this context, it is useful to know a little about how common kinds of models and model layers expect input of different shapes, so let’s take a little detour.

Deep neural networks of Dense layers take in 1-dimensional tensors (or 2-dimensional, depending on whether you include batch size, but we’ll discuss batch size in a bit) of the format (feature_size,), where feature_size is the number of features in each sample.

Convolutional neural networks take in data representing images, using 3-dimensional tensors of (width, height, channels), where channels is the color scheme, 1 for grayscale and 3 for RGB.

And finally, recurrent neural networks such as LSTMs take in 2 dimensions, (time_steps, feature_size), as sketched below.
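As a rough sketch (the specific sizes here are purely illustrative), those expectations translate into Keras input shapes like the following:

from tensorflow.keras import layers, models

# Dense network: each sample is a flat vector of 20 features
dense_model = models.Sequential([layers.Dense(16, activation='relu', input_shape=(20,))])

# CNN: each sample is a 64x64 RGB image
cnn_model = models.Sequential([layers.Conv2D(8, (3, 3), activation='relu', input_shape=(64, 64, 3))])

# LSTM: each sample is 10 time steps of 8 features
rnn_model = models.Sequential([layers.LSTM(4, input_shape=(10, 8))])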

But back to errors! Another common culprit in TensorFlow shape errors has to do with how shape changes as data passes through the model layers. As previously mentioned, different layers take in different input shapes, and they may also reshape the output.

Returning to our CNN example from above, let’s break it again by seeing what happens when we remove the Flatten layer. If we attempt to run the code we’ll see

ValueError: Shapes (None, 10) and (None, 28, 28, 10) are incompatible

This is where printing all of our model input and output shapes together with our data shapes comes in handy to help us pinpoint where there’s a mismatch.

model.summary() will show us

Layer (type)                     Output Shape              Param #
=================================================================
conv2d_15 (Conv2D)               (None, 126, 126, 32)      896
max_pooling2d_10 (MaxPooling2D)  (None, 63, 63, 32)        0
conv2d_16 (Conv2D)               (None, 61, 61, 64)        18496
max_pooling2d_11 (MaxPooling2D)  (None, 30, 30, 64)        0
conv2d_17 (Conv2D)               (None, 28, 28, 64)        36928
flatten_5 (Flatten)              (None, 50176)             0
dense_13 (Dense)                 (None, 64)                3211328
dense_14 (Dense)                 (None, 10)                650
=================================================================
Total params: 3268298 (12.47 MB)
Trainable params: 3268298 (12.47 MB)
Non-trainable params: 0 (0.00 Byte)

And our further diagnostic will reveal

###################Input Shape and Datatype#####################
(None, 128, 128, 3)
###################Output Shape and Datatype#####################
(None, 10)
###################Layer Input Shape and Datatype#####################
conv2d_15 KerasTensor(type_spec=TensorSpec(shape=(None, 128, 128, 3), dtype=tf.float32, name='conv2d_15_input'), name='conv2d_15_input', description="created by layer 'conv2d_15_input'") float32
max_pooling2d_10 KerasTensor(type_spec=TensorSpec(shape=(None, 126, 126, 32), dtype=tf.float32, name=None), name='conv2d_15/Relu:0', description="created by layer 'conv2d_15'") float32
conv2d_16 KerasTensor(type_spec=TensorSpec(shape=(None, 63, 63, 32), dtype=tf.float32, name=None), name='max_pooling2d_10/MaxPool:0', description="created by layer 'max_pooling2d_10'") float32
max_pooling2d_11 KerasTensor(type_spec=TensorSpec(shape=(None, 61, 61, 64), dtype=tf.float32, name=None), name='conv2d_16/Relu:0', description="created by layer 'conv2d_16'") float32
conv2d_17 KerasTensor(type_spec=TensorSpec(shape=(None, 30, 30, 64), dtype=tf.float32, name=None), name='max_pooling2d_11/MaxPool:0', description="created by layer 'max_pooling2d_11'") float32
flatten_5 KerasTensor(type_spec=TensorSpec(shape=(None, 28, 28, 64), dtype=tf.float32, name=None), name='conv2d_17/Relu:0', description="created by layer 'conv2d_17'") float32
dense_13 KerasTensor(type_spec=TensorSpec(shape=(None, 50176), dtype=tf.float32, name=None), name='flatten_5/Reshape:0', description="created by layer 'flatten_5'") float32
dense_14 KerasTensor(type_spec=TensorSpec(shape=(None, 64), dtype=tf.float32, name=None), name='dense_13/Relu:0', description="created by layer 'dense_13'") float32

It’s a lot of output, but we can see that the dense_13 layer is looking for input of shape (None, 50176). However, the conv2d_17 layer outputs (None, 28, 28, 64).

Flatten layers transform the multi-dimensional output from previous layers into the one-dimensional (flat) vector that the Dense layer expects.
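Indeed, 28 × 28 × 64 = 50,176, which is exactly the flattened vector length that dense_13 expects, so adding the Flatten layer back bridges the two shapes.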

Conv2D and MaxPooling layers change their input data in other interesting ways as well, but those are out of scope for this article. For a great breakdown, take a look at: Ultimate Guide to Input shape and Model Complexity in Neural Networks

But what about batch size?! I haven’t forgotten!

If we break our code one more time by removing the .batch(32) from the dataset in model.fit, we’ll get the error:

ValueError: Input 0 of layer "sequential_10" is incompatible with the layer: expected shape=(None, 128, 128, 3), found shape=(128, 128, 3)

That’s because the first dimension of a layer’s input is reserved for the batch size, or the number of samples that we want the model to work through at a time. For a great deep dive, read through Difference between batch and epoch.

Batch size defaults to None prior to fitting, as we can see in the model summary output, and our model expects us to set it elsewhere, depending on how we tune the hyperparameter. We can also force it in our input layer by using batch_input_shape instead of input_shape, but that decreases our flexibility in testing out different values.
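As a sketch of that trade-off (reusing the layers and models imports from the CNN example, and assuming a Keras version that accepts batch_input_shape on the first layer), pinning the batch size in the input layer looks like this, after which model.fit must be fed batches of exactly that size:

# Fix the batch size to 32 in the input layer itself
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', batch_input_shape=(32, 128, 128, 3)))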

Type

TypeError: Failed to convert object of type to Tensor. Unsupported object type

Finally, let’s talk a bit about some data type specifics in tensors.

The error above is another one that, if you’re used to working in database systems with tables built from all sorts of data, can be a bit baffling, but it is one of the more straightforward ones to diagnose and fix, although there are a couple of common causes to look out for.

The main issue is that, although tensors support a variety of data types, when we convert a NumPy array to tensors (a common flow within deep learning), the data types need to be floats. The script below sets up a contrived example of a DataFrame with None values and string data points. Let’s walk through some issues and fixes for this example:

import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

data = [
    [None, 0.2, '0.3'],
    [0.1, None, '0.3'],
    [0.1, 0.2, '0.3'],
]
X_train = pd.DataFrame(data=data, columns=["x1", "x2", "x3"])
y_train = pd.DataFrame(data=[1, 0, 1], columns=["y"])

# Create a TensorFlow dataset
train_dataset = tf.data.Dataset.from_tensor_slices((X_train.to_numpy(), y_train.to_numpy()))

# Define the model
model = Sequential()
model.add(Dense(1, input_dim=X_train.shape[1], activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Fit the model using the TensorFlow dataset
model.fit(train_dataset.batch(3), epochs=3)

Running this code will flag to us that:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type float).

The most obvious issue is that you are sending in a NumPy array that contains some non-float type, an object. If you have an actual column of categorical data, there are many ways to convert it to numeric data (one-hot encoding, etc.), but that’s out of scope for this discussion.

We can determine that by running print(X_train.dtypes), which will tell us what’s in our DataFrame that TensorFlow doesn’t like.

x1 float64
x2 float64
x3 object
dtype: object

If we’re running into non-float data points, the line below will magically solve all of our problems:

X_train = np.asarray(X_train).astype('float32')

Another thing to check for is whether you have None or np.nan anywhere.

To find out, we can use a few lines of code such as:

null_mask = X_train.isnull().any(axis=1)
null_rows = X_train[null_mask]
print(null_rows)

Which tells us that we have nulls in rows 0 and 1:

    x1   x2   x3
0  NaN  0.2  0.3
1  0.1  NaN  0.3

If that’s the case, and it’s expected/intentional, we need to replace those values with an acceptable alternative. fillna can help us here.

X_train.fillna(value=0, inplace=True)
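Putting both fixes together, a minimal sketch of cleaning the DataFrame before building the dataset (assuming X_train, y_train, and the model from the script above) might look like:

# Fill the missing values, then force everything to float32 before converting
X_train_clean = np.asarray(X_train.fillna(value=0)).astype('float32')
y_train_clean = np.asarray(y_train).astype('float32')

train_dataset = tf.data.Dataset.from_tensor_slices((X_train_clean, y_train_clean))
model.fit(train_dataset.batch(3), epochs=3)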

With these changes to the code above, our NumPy array will successfully convert to a tensor dataset and we can train our model!

I often find that I learn the most about a particular technology when I have to work through errors, and I hope this has been somewhat helpful to you too!

If you have cool tips and tricks or fun TensorFlow errors, please pass them along!
