If your watch watched you eat…
Collecting the Data
Preprocessing the Data
Data Training
Conclusion
Resources

From their humble beginnings as simple timekeeping devices, watches have evolved to become essential fashion accessories for both men and women. However, with the advent of smartwatches and wearable technology, watches have taken on a new role: that of "watching" people.

In this article, I'll show you step by step how to turn your precious smartwatch, or any other wearable that can record your physiological data, into a surveillance machine. Specifically, supervised machine learning algorithms (logistic regression and neural networks) are applied to track your daily eating status. With that, your watch will know when you eat during the day.

The device I used to collect my physiological data is the Empatica E4, a wearable that is mainly used in research settings for digital biomarker collection. It recorded my heart rate (HR), blood volume pulse (BVP), electrodermal activity (EDA), temperature (TEMP), and accelerometer (ACC) data. Here I discard the ACC data because I am not actively moving during the eating sessions. The eating schedule was as follows: eat for 2 minutes; rest for 2 minutes; eat for 2 minutes; rest for 2 minutes; then eat until the meal is finished. In total, I collected 253,715 data points, given that the wearable has a sampling rate of 64 Hz.

Simple Flowchart of Data Ingestion and Preprocessing

The original raw data looks like this:

A Column of Raw EDA Data

The first row is the current time in seconds since the epoch, and the second row is the sampling rate.

First, I merge the three sessions of data by concatenation. Some features are collected at different sampling rates, such as EDA at 4 Hz (as shown in the image above) and HR at 1 Hz.

import tensorflow as tf
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
import seaborn as sns
import sklearn.metrics as sk_metrics
import tempfile
import os
import numpy as np

# My original TEMP file is named TEMP.csv
file_name = 'TEMP.csv'
# My final TEMP file name:
export_name = 'TEMP64_02.csv'

# Desired sampling rate is 64 Hz
target_hz = 64
df = pd.read_csv(file_name)

# In my file, the first cell gives the sampling rate
current_hz = df.iloc[0, 0]

# Ratio between the desired and the current sampling rate
multiplication_ratio = target_hz / current_hz
ur = int(multiplication_ratio)

curr_df_headers = df.columns
column_values = df.iloc[1:, :].to_numpy()
new_df = pd.DataFrame()

# This for loop repeatedly inserts values to fill up the
# upsampling gap
for col in range(np.shape(column_values)[1]):
    curr_col = column_values[:, col]
    new_col = np.repeat(curr_col, ur)
    new_df[curr_df_headers[col]] = new_col

# Add a row holding the new sampling rate and give it index -1
new_df.loc[-1] = target_hz
# Shift the index by 1 so that row comes first after sorting
new_df.index = new_df.index + 1
new_df = new_df.sort_index()
# Export the upsampled data under the file name defined above
new_df.to_csv(export_name, index=False)

Using the code above, we can upsample any data stream that is below 64 Hz to exactly 64 Hz (a reusable sketch of the same steps follows).
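Since the recording contains several such files, it can be convenient to wrap the same steps in a small helper function. This is only a sketch; the extra file names below are assumptions based on the standard E4 export and are not taken from the article.

def upsample_to_64(file_name, export_name, target_hz=64):
    # Read the raw Empatica-style CSV: the first cell holds the sampling rate
    df = pd.read_csv(file_name)
    current_hz = df.iloc[0, 0]
    ratio = int(target_hz / current_hz)

    # Repeat each sample to fill the upsampling gap, column by column
    new_df = pd.DataFrame({
        col: np.repeat(df.iloc[1:, i].to_numpy(), ratio)
        for i, col in enumerate(df.columns)
    })

    # Re-insert a first row holding the new sampling rate, as in the raw file
    new_df.loc[-1] = target_hz
    new_df.index = new_df.index + 1
    new_df.sort_index(inplace=True)
    new_df.to_csv(export_name, index=False)

# Assumed file names from the standard E4 export:
upsample_to_64('EDA.csv', 'EDA64_02.csv')
upsample_to_64('HR.csv', 'HR64_02.csv')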

Remember, we're using supervised machine learning methods, so ground truth labels are required to train our model. Thus, the next step is to assign a ground truth label to each of our data points. I won't go in-depth here because I would rather spend more time on the machine learning training methods. Just a hint: you can simply record the time intervals during which you are actively eating and then assign a True or False (1/0) label to each point as a new column in the CSV file, as in the sketch below.
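As a minimal sketch of that labeling step (the intervals below are placeholders rather than my actual meal times, and df is assumed to hold one upsampled session):

# Hypothetical eating intervals, in seconds from the start of the session
eating_intervals = [(0, 120), (240, 360), (480, 900)]

sampling_rate = 64  # Hz
# Elapsed time of each sample relative to the start of the recording
elapsed = np.arange(len(df)) / sampling_rate

# A sample is labeled 1 (eating) if it falls inside any recorded eating interval
df['eating_status'] = 0
for start, end in eating_intervals:
    df.loc[(elapsed >= start) & (elapsed < end), 'eating_status'] = 1

df.to_csv('session1.csv', index=False)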

To concatenate all sessions of data:

df1 = pd.read_csv('session1.csv', index_col=False)
df2 = pd.read_csv('session2.csv',index_col=False)
df3 = pd.read_csv('session3.csv',index_col=False)
frames = [df1, df2, df3]
dataset = pd.concat(frames, ignore_index=True)

Here's a small portion of what our preprocessed dataset looks like:

Concatenated Data with All Features and Ground Truth Label
# A glimpse of the dataset's information

dataset.info()

Dataset Information

The next step is to design a moving average filter for time partitioning. Without it, training would be meaningless, because we want our model to predict our eating status from a window of time points instead of a single instantaneous reading.

Moving Average Filter with Window Size of 32 and Stride of 2
# Here we keep every other row to mimic a stride of 2,
# and use the rolling(window=32) function to define a time window of 0.5 s
dataset.loc[dataset.index[np.arange(len(dataset))%2==1],'hr64_w_s_32']=dataset.hr.rolling(window=32).mean()
dataset.loc[dataset.index[np.arange(len(dataset))%2==1],'bvp64_w_s_32']=dataset.bvp.rolling(window=32).mean()
dataset.loc[dataset.index[np.arange(len(dataset))%2==1],'eda64_w_s_32']=dataset.eda.rolling(window=32).mean()
dataset.loc[dataset.index[np.arange(len(dataset))%2==1],'temp64_w_s_32']=dataset.temp.rolling(window=32).mean()

# Create a new dataframe and drop rows containing NaN
clean_dataset = dataset[['hr64_w_s_32', 'bvp64_w_s_32', 'eda64_w_s_32', 'temp64_w_s_32', 'eating_status']]
clean_dataset.dropna(inplace=True)
clean_dataset

clean_dataset Values and Variable Names

Before starting our training process, we need to divide the data into a training set (80% of the data) and a test set (20% of the data), and define several evaluation functions. Since our variables have different units and ranges, normalization is necessary: it transforms the features so that they are on a similar scale.

train_dataset = clean_dataset[0:int(len(clean_dataset)*0.80)]
test_dataset = clean_dataset.drop(train_dataset.index)

# Set x to be the feature columns and y to be the ground truth column
x_train, y_train = train_dataset.iloc[:, [0,1,2,3]], train_dataset.iloc[:,4]
x_test, y_test = test_dataset.iloc[:, [0,1,2,3]], test_dataset.iloc[:,4]

# convert to trainable tensors
x_train, y_train = tf.convert_to_tensor(x_train, dtype=tf.float32), tf.convert_to_tensor(y_train, dtype=tf.float32)
x_test, y_test = tf.convert_to_tensor(x_test, dtype=tf.float32), tf.convert_to_tensor(y_test, dtype=tf.float32)

class Normalize(tf.Module):
    def __init__(self, x):
        # Initialize the mean and standard deviation for normalization
        self.mean = tf.Variable(tf.math.reduce_mean(x, axis=0))
        self.std = tf.Variable(tf.math.reduce_std(x, axis=0))

    def norm(self, x):
        # Normalize the input
        return (x - self.mean) / self.std

    def unnorm(self, x):
        # Unnormalize the input
        return (x * self.std) + self.mean

norm_x = Normalize(x_train)
x_train_norm, x_test_norm = norm_x.norm(x_train), norm_x.norm(x_test)

def log_loss(y_pred, y):
    # Compute the log loss function
    ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=y_pred)
    return tf.reduce_mean(ce)

# Define the logistic regression module
class LogisticRegression(tf.Module):

    def __init__(self):
        self.built = False

    def __call__(self, x, train=True):
        # Initialize the model parameters on the first call
        if not self.built:
            # Randomly generate the weights and the bias term
            rand_w = tf.random.uniform(shape=[x.shape[-1], 1], seed=22)
            rand_b = tf.random.uniform(shape=[], seed=22)
            self.w = tf.Variable(rand_w)
            self.b = tf.Variable(rand_b)
            self.built = True
        # Compute the model output
        z = tf.add(tf.matmul(x, self.w), self.b)
        z = tf.squeeze(z, axis=1)
        if train:
            return z
        return tf.sigmoid(z)

log_reg = LogisticRegression()

def predict_class(y_pred, thresh=0.5):
    # Return a tensor with `1` if `y_pred` > `thresh`, and `0` otherwise
    return tf.cast(y_pred > thresh, tf.float32)

def accuracy(y_pred, y):
    # Return the proportion of matches between `y_pred` and `y`
    y_pred = tf.math.sigmoid(y_pred)
    y_pred_class = predict_class(y_pred)
    check_equal = tf.cast(y_pred_class == y, tf.float32)
    acc_val = tf.reduce_mean(check_equal)
    return acc_val

batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train))
train_dataset = train_dataset.shuffle(buffer_size=x_train.shape[0]).batch(batch_size)
test_dataset = tf.data.Dataset.from_tensor_slices((x_test_norm, y_test))
test_dataset = test_dataset.shuffle(buffer_size=x_test.shape[0]).batch(batch_size)

With the training set and test set defined above, we are ready to train our innocent smartwatch into a little evil stalker.

Log Loss Function

Our goal is to decrease the loss function as much as possible, so we use the function above as our loss function. The y with the hat is our predicted value, while the y without the hat is the ground truth value.
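For reference, the binary cross-entropy (log loss) shown in the figure can be written as:

L = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log\left(1 - \hat{y}_i\right) \right]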

Substituting ŷ with the sigmoid of the simple linear combination Xw + b expresses the loss directly in terms of the model parameters.
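Written out (my reconstruction of the formula the original figure showed), with σ(z) = 1/(1 + e^{-z}):

L(w, b) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log \sigma\!\left(x_i^{\top}w + b\right) + (1 - y_i)\log\!\left(1 - \sigma\!\left(x_i^{\top}w + b\right)\right) \right]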

The gradient descent update rules for logistic regression are obtained from the partial derivatives of this loss with respect to w and b; they are used to update the parameters at each iteration.
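These are the standard gradients and updates (with learning rate α):

\frac{\partial L}{\partial w} = \frac{1}{m}\, X^{\top}\big(\sigma(Xw + b) - y\big), \qquad \frac{\partial L}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\big(\sigma(x_i^{\top}w + b) - y_i\big)

w \leftarrow w - \alpha\,\frac{\partial L}{\partial w}, \qquad b \leftarrow b - \alpha\,\frac{\partial L}{\partial b}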

# Set training parameters
from tqdm import tqdm
epochs = 200
learning_rate = 0.01
train_losses, test_losses = [], []
train_accs, test_accs = [], []

# Set up the training loop and begin training
for epoch in tqdm(range(epochs)):
    batch_losses_train, batch_accs_train = [], []
    batch_losses_test, batch_accs_test = [], []

    # Iterate over the training data
    for x_batch, y_batch in train_dataset:
        with tf.GradientTape() as tape:
            y_pred_batch = log_reg(x_batch)
            batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Update the parameters with respect to the gradient calculations
        grads = tape.gradient(batch_loss, log_reg.variables)
        for g, v in zip(grads, log_reg.variables):
            v.assign_sub(learning_rate * g)
        # Keep track of batch-level training performance
        batch_losses_train.append(batch_loss)
        batch_accs_train.append(batch_acc)

    # Iterate over the testing data
    for x_batch, y_batch in test_dataset:
        y_pred_batch = log_reg(x_batch)
        batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Keep track of batch-level testing performance
        batch_losses_test.append(batch_loss)
        batch_accs_test.append(batch_acc)

    # Keep track of epoch-level model performance
    train_loss, train_acc = tf.reduce_mean(batch_losses_train), tf.reduce_mean(batch_accs_train)
    test_loss, test_acc = tf.reduce_mean(batch_losses_test), tf.reduce_mean(batch_accs_test)
    train_losses.append(train_loss)
    train_accs.append(train_acc)
    test_losses.append(test_loss)
    test_accs.append(test_acc)
    if epoch % 20 == 0:
        print(f"Epoch: {epoch}, Training log loss: {train_loss:.3f}")

Model Training Output of Logistic Regression

We set the number of epochs to 200 and the learning rate to 0.01 for our first trial. Since our dataset is fairly large, we expect a long training time. If you want to train faster, feel free to decrease the number of epochs and increase the learning rate; however, you may then also see lower accuracy on your own dataset. For more information on the underlying algorithm of logistic regression, please see: Logistic regression for binary classification with Core APIs.

To view our results, simply use the matplotlib package to display the loss and accuracy:

plt.plot(range(epochs), train_losses, label="Training loss")
plt.plot(range(epochs), test_losses, label="Testing loss")
plt.xlabel("Epoch")
plt.ylabel("Log loss")
plt.legend()
plt.title("Log loss vs training iterations");

Log loss graph

plt.plot(range(epochs), train_accs, label="Training accuracy")
plt.plot(range(epochs), test_accs, label="Testing accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy (%)")
plt.legend()
plt.title("Accuracy vs training iterations");

print(f"Final training log loss: {train_losses[-1]:.3f}")
print(f"Final testing log loss: {test_losses[-1]:.3f}")
print(f"Final training accuracy: {train_accs[-1]:.3f}")
print(f"Final testing accuracy: {test_accs[-1]:.3f}")

Final training log loss: 0.570
Final testing log loss: 0.743
Final training accuracy: 0.701
Final testing accuracy: 0.594

As a result, this model can roughly predict the outcome, but not very well, and it starts overfitting as early as epoch 2. One reason may be that the features need a more expressive model to map them onto the ground truth labels, so this model only reaches around 59% accuracy. Another reason may be that logistic regression requires the independent variables to have no or very little multicollinearity; in our case, it is very likely that all four features respond to eating in a similar way (a quick correlation check is sketched below). Since the result is not great, let's try a more complicated model.
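To probe that multicollinearity suspicion, here is a quick sketch that looks at the pairwise Pearson correlations between the four smoothed features:

# Pairwise correlations between the four smoothed features
feature_cols = ['hr64_w_s_32', 'bvp64_w_s_32', 'eda64_w_s_32', 'temp64_w_s_32']
corr = clean_dataset[feature_cols].corr()
print(corr)

# Values close to +/-1 indicate strongly collinear features
sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
plt.title('Feature correlation matrix')
plt.show()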

We first preprocess the data by assigning 80% of the total data to the training set and 20% to the test set, then use StandardScaler to scale the data for training.

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
import math

df = clean_dataset

X = df.drop('eating_status', axis=1)
y = df['eating_status']

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

The general NN setup begins by assigning weights w and biases b. Each computed value is then fed into an activation function, which adds the non-linearity that would not be possible with simple linear regression or logistic regression.

Image of Parameter Flow in NN Set-up from A Gentle Introduction To Sigmoid Function
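Concretely, each layer computes a weighted sum of the previous layer's outputs and passes it through an activation function g; the final layer here uses the sigmoid:

a^{(l)} = g\big(W^{(l)} a^{(l-1)} + b^{(l)}\big), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}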

The computed value ŷ is compared with the ground truth label y, the difference is calculated, and the weights are updated based on how large the loss value is. Here we use the same log loss from the logistic regression as our loss function; it is provided by the TensorFlow built-in loss keras.losses.binary_crossentropy.

The NN layers are set to 128x256x256. Admittedly, this architecture is quite complex and carries a risk of overfitting. For more about overfitting, you can visit this website: Complete Guide to Prevent Overfitting in Neural Networks. Feel free to change the dimensions of the NN and try them on your own dataset to get a better idea of how the layer sizes impact the output. (There is also a great animated explanation of how layers work on the TensorFlow website: Neural Network Layers from Tensorflow.)

import tensorflow as tf
tf.random.set_seed(42)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(
    loss=tf.keras.losses.binary_crossentropy,
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.03),
    metrics=[
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)

history = model.fit(X_train_scaled, y_train, validation_split=0.2, batch_size=32, epochs=100)

After the steps above, you should have a solidly trained model at hand. Now, let's see how our model performs with some visualization tools:

from matplotlib import rcParams

rcParams['figure.figsize'] = (18, 8)
rcParams['axes.spines.top'] = False
rcParams['axes.spines.right'] = False

plt.plot(
    np.arange(1, 101),
    history.history['loss'], label='Training Loss'
)
plt.plot(
    np.arange(1, 101),
    history.history['accuracy'], label='Training Accuracy'
)
plt.plot(
    np.arange(1, 101),
    history.history['val_loss'], label='Validation Loss'
)
plt.plot(
    np.arange(1, 101),
    history.history['val_accuracy'], label='Validation Accuracy'
)
plt.title('Evaluation metrics', size=20)
plt.xlabel('Epoch', size=14)
plt.legend();

The loss and accuracy graph shows surprisingly good results. The accuracy approaches 100% as the number of epochs increases, and the model is not overfitting, since the validation loss is still decreasing while the validation accuracy also approaches 100%. The model looks promising, so let's test it on our test set and see how it performs:

# Simply use the predict method to feed in our test set
predictions = model.predict(X_test_scaled)

# The predicted values are raw probabilities, and we need to classify each one
# based on its value. If it is larger than 0.5, we classify it as positive (True);
# otherwise it is negative (False).
prediction_classes = [
    1 if prob > 0.5 else 0 for prob in np.ravel(predictions)
]

We can also use the confusion matrix to evaluate our performance, and it gives a better sense of the trade-offs behind the AUC/ROC curve (a minimal ROC sketch follows the metrics below; you can also explore it on your own with the help of this article: How to Use ROC Curves and Precision-Recall Curves for Classification in Python):

from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, prediction_classes))
[[14869   368]
[ 240 9892]]
from sklearn.metrics import accuracy_score, precision_score, recall_score

print(f'Accuracy: {accuracy_score(y_test, prediction_classes):.2f}')
print(f'Precision: {precision_score(y_test, prediction_classes):.2f}')
print(f'Recall: {recall_score(y_test, prediction_classes):.2f}')

Accuracy: 0.98
Precision: 0.96
Recall: 0.98
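The ROC curve and AUC mentioned above are just as easy to obtain with scikit-learn. Here is a minimal sketch that uses the raw predicted probabilities rather than the thresholded classes:

from sklearn.metrics import roc_curve, roc_auc_score

# Use the raw sigmoid outputs, not the 0/1 classes
probs = np.ravel(predictions)
fpr, tpr, _ = roc_curve(y_test, probs)
print(f'ROC AUC: {roc_auc_score(y_test, probs):.2f}')

plt.plot(fpr, tpr, label='Neural network')
plt.plot([0, 1], [0, 1], linestyle='--', label='No-skill baseline')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.legend()
plt.title('ROC curve');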

From the results above, we can see a huge performance increase over our logistic regression model. You can also try modifying the complexity of the network, observe how the performance increases or decreases, and find the best model for your dataset; one possible regularized variant is sketched below.
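If a deeper or wider network does start to overfit on your own data, one common mitigation is dropout plus early stopping. The variant below is only a sketch, not the model used in this article; the dropout rates and patience value are assumptions you should tune on your own dataset.

# A hypothetical regularized variant: dropout layers plus early stopping
regularized_model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

regularized_model.compile(
    loss=tf.keras.losses.binary_crossentropy,
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    metrics=['accuracy']
)

# Stop training once the validation loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True
)

history_reg = regularized_model.fit(
    X_train_scaled, y_train,
    validation_split=0.2, batch_size=32, epochs=100,
    callbacks=[early_stop]
)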
