From their humble beginnings as simple timekeeping devices, watches have evolved into essential fashion accessories for both men and women. However, with the arrival of smartwatches and wearable technology, watches have taken on a new role: that of “watching” people.

In this article, I’ll show you, step by step, how to train your precious smartwatch, or any other wearable that can record your physiological data, into a surveillance machine. Specifically, supervised machine learning algorithms (logistic regression and neural networks) are applied to track your daily eating status. With that, your watch will know when you eat during the day.

The device I used to collect my physiological data is the *Empatica E4*, a wearable that is mainly used in research settings for digital biomarker collection. It collected my heart rate (HR), blood volume pulse (BVP), electrodermal activity (EDA), temperature (TEMP), and accelerometer data (ACC). Here I’ll discard the ACC data because, during the eating sessions, I’m not actively moving. The eating schedule is as follows: eat for two minutes; rest for two minutes; eat for two minutes; rest for two minutes; eat the meal until finished. In total, I collected 253,715 data points, given that the wearable has a sampling rate of 64 Hz.

The original raw data looks like this:

The first entry is the start time in seconds since the epoch, and the second entry is the sampling rate.
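As a rough sketch of how that layout can be read, the snippet below parses a small in-memory stand-in for such a file. The sample values and the helper name `read_e4_signal` are made up for illustration; only the two header rows (start time, then sampling rate) follow the description above.

```python
import csv
import io

def read_e4_signal(text):
    """Parse a single-column, E4-style export (hypothetical helper).

    Row 1: session start time in seconds since the epoch.
    Row 2: sampling rate in Hz.
    Remaining rows: one sample per row.
    Returns (start_time, rate_hz, list of (timestamp, value) pairs).
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    start_time = float(rows[0][0])
    rate_hz = float(rows[1][0])
    samples = [float(row[0]) for row in rows[2:]]
    # Sample i was taken i / rate_hz seconds after the start time
    stamped = [(start_time + i / rate_hz, v) for i, v in enumerate(samples)]
    return start_time, rate_hz, stamped

# Made-up example: a 4 Hz signal with four samples
example = "1600000000.0\n4.0\n0.10\n0.12\n0.11\n0.13\n"
start, rate, stamped = read_e4_signal(example)
print(rate, stamped[-1])
```

Reconstructing a timestamp for every sample this way makes it easier to line the signals up with the eating intervals later on.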

First, I merge three sessions of data together by concatenation. Some features are collected at different frequencies, such as EDA with a sampling rate of 4 Hz (as shown in the image above) and HR with a sampling rate of 1 Hz.

```python
import tensorflow as tf
import pandas as pd
import matplotlib
from matplotlib import pyplot as plt
import seaborn as sns
import sklearn.metrics as sk_metrics
import tempfile
import os
import numpy as np

# My original TEMP file is named TEMP.csv
file_name = 'TEMP.csv'
# My final TEMP file name:
export_name = 'TEMP64_02.csv'
# Desired sampling rate is 64 Hz
target_hz = 64

df = pd.read_csv(file_name)
# In my file, the first cell gives the sampling rate
current_hz = df.iloc[0, 0]
# Ratio of the desired Hz to the current Hz
multiplication_ratio = target_hz / current_hz
# np.repeat needs an integer repeat count
ur = int(multiplication_ratio)

curr_df_headers = list(df.columns)
column_values = df.iloc[1:, :].to_numpy()
new_df = pd.DataFrame()

# This for loop repeatedly inserts each value to fill the
# undersampling gap
for col in range(np.shape(column_values)[1]):
    curr_col = column_values[:, col]
    new_col = np.repeat(curr_col, ur)
    new_df[curr_df_headers[col]] = new_col

# Add a row holding the new Hz value, with index -1
new_df.loc[-1] = target_hz
# Shift the index by 1 and sort so the Hz row comes first
new_df.index = new_df.index + 1
new_df = new_df.sort_index()

# Write the upsampled signal to the export file (assumed final step)
new_df.to_csv(export_name, index=False)
```

Using the code above, we can upsample any data that is below 64 Hz to exactly 64 Hz.
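To make the repeat-based upsampling concrete, here is a minimal sketch on a tiny made-up signal. The helper name `upsample_by_repeat` and the sample values are assumptions for illustration; the only technique used is `np.repeat`, the same call as in the script above.

```python
import numpy as np

def upsample_by_repeat(values, current_hz, target_hz):
    """Repeat each sample so a current_hz signal matches target_hz."""
    if target_hz % current_hz != 0:
        raise ValueError("target_hz must be an integer multiple of current_hz")
    factor = int(target_hz // current_hz)
    return np.repeat(np.asarray(values), factor)

hr_1hz = [70.0, 72.0, 71.0]  # made-up 1 Hz heart-rate samples
hr_4hz = upsample_by_repeat(hr_1hz, current_hz=1, target_hz=4)
print(hr_4hz)
```

Repeating values holds each reading constant until the next one arrives. Interpolating between readings would be another option, but it is not what this article uses.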

Remember, we’re using supervised machine learning methods, so ground truth labels are required to train our model. Thus, the next step is to assign a ground truth label to each of our data points. I won’t go in depth because I would rather leave more time to go over the machine learning training methods. Just a hint: you can simply record the time intervals in which you are actively eating and then assign True or False (1/0) labels beside each point as a new column in the CSV file.
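One possible way to turn recorded intervals into labels is sketched below. The function name `label_samples` and the interval values are hypothetical; the schedule simply mirrors the first part of the eat/rest protocol described earlier.

```python
def label_samples(n_samples, rate_hz, eating_intervals):
    """Return a 0/1 label per sample.

    eating_intervals holds (start_s, end_s) pairs in seconds from the
    start of the recording; a sample is labeled 1 if it falls inside
    any interval (start inclusive, end exclusive).
    """
    labels = []
    for i in range(n_samples):
        t = i / rate_hz
        eating = any(start <= t < end for start, end in eating_intervals)
        labels.append(1 if eating else 0)
    return labels

# Made-up schedule: eat 0-120 s, rest 120-240 s, eat 240-360 s
intervals = [(0, 120), (240, 360)]
labels = label_samples(n_samples=64 * 360, rate_hz=64, eating_intervals=intervals)
print(sum(labels), len(labels))
```

The resulting list can then be written into the CSV as the `eating_status` column.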

To concatenate all sessions of data:

```python
df1 = pd.read_csv('session1.csv', index_col=False)
df2 = pd.read_csv('session2.csv', index_col=False)
df3 = pd.read_csv('session3.csv', index_col=False)

frames = [df1, df2, df3]
dataset = pd.concat(frames, ignore_index=True)
```

Here’s a small portion of what our preprocessed dataset looks like:

```python
# A glimpse of the dataset information
dataset.info()
```

The next step is to design a moving average filter for time partitioning. Without it, training would be meaningless, because we want our model to predict our eating status from windows of time instead of from a series of individual *time points*.

```python
# Keep every other row to mimic a stride of 2, and use
# rolling(window=32) to define a time span of 0.5 s at 64 Hz
odd_rows = dataset.index[np.arange(len(dataset)) % 2 == 1]
dataset.loc[odd_rows, 'hr64_w_s_32'] = dataset.hr.rolling(window=32).mean()
dataset.loc[odd_rows, 'bvp64_w_s_32'] = dataset.bvp.rolling(window=32).mean()
dataset.loc[odd_rows, 'eda64_w_s_32'] = dataset.eda.rolling(window=32).mean()
dataset.loc[odd_rows, 'temp64_w_s_32'] = dataset.temp.rolling(window=32).mean()

# Create a new dataframe and drop rows containing NaN
# (chained dropna avoids modifying a slice in place)
clean_dataset = dataset[['hr64_w_s_32', 'bvp64_w_s_32', 'eda64_w_s_32', 'temp64_w_s_32', 'eating_status']].dropna()
clean_dataset
```
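To see what a windowed mean with a stride does on a small scale, here is a NumPy-only sketch. The helper `windowed_mean` and the toy signal are assumptions for illustration; it follows the same idea as the rolling mean above, but it is not the exact pandas call.

```python
import numpy as np

def windowed_mean(signal, window, stride):
    """Mean over each full window of `window` samples, kept every `stride` samples."""
    signal = np.asarray(signal, dtype=float)
    # mode="valid" keeps only positions where the full window fits
    means = np.convolve(signal, np.ones(window) / window, mode="valid")
    return means[::stride]

signal = np.arange(10)  # made-up signal: 0, 1, ..., 9
out = windowed_mean(signal, window=4, stride=2)
print(out)
```

With a window of 32 and a 64 Hz signal, each output value would summarize half a second of data, and a stride of 2 halves the number of rows.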

Before starting our training process, we need to divide the data into a training set (80% of the data) and a test set (20% of the data), and define several evaluation functions. Since our variables have different units and ranges, normalization is necessary. It transforms the features so that they are on a similar scale.

```python
# Split the cleaned, windowed data 80/20 in time order
train_dataset = clean_dataset[0:int(len(clean_dataset) * 0.80)]
test_dataset = clean_dataset.drop(train_dataset.index)

# Set x to be the feature columns and y to be the ground truth column
x_train, y_train = train_dataset.iloc[:, [0, 1, 2, 3]], train_dataset.iloc[:, 4]
x_test, y_test = test_dataset.iloc[:, [0, 1, 2, 3]], test_dataset.iloc[:, 4]

# Convert to trainable tensors
x_train, y_train = tf.convert_to_tensor(x_train, dtype=tf.float32), tf.convert_to_tensor(y_train, dtype=tf.float32)
x_test, y_test = tf.convert_to_tensor(x_test, dtype=tf.float32), tf.convert_to_tensor(y_test, dtype=tf.float32)

class Normalize(tf.Module):
    def __init__(self, x):
        # Initialize the mean and standard deviation for normalization
        self.mean = tf.Variable(tf.math.reduce_mean(x, axis=0))
        self.std = tf.Variable(tf.math.reduce_std(x, axis=0))

    def norm(self, x):
        # Normalize the input
        return (x - self.mean) / self.std

    def unnorm(self, x):
        # Unnormalize the input
        return (x * self.std) + self.mean

norm_x = Normalize(x_train)
x_train_norm, x_test_norm = norm_x.norm(x_train), norm_x.norm(x_test)

def log_loss(y_pred, y):
    # Compute the log loss from raw logits
    ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=y_pred)
    return tf.reduce_mean(ce)

# Define the logistic regression module
class LogisticRegression(tf.Module):
    def __init__(self):
        self.built = False

    def __call__(self, x, train=True):
        # Initialize the model parameters on the first call
        if not self.built:
            # Randomly generate the weights and the bias term
            rand_w = tf.random.uniform(shape=[x.shape[-1], 1], seed=22)
            rand_b = tf.random.uniform(shape=[], seed=22)
            self.w = tf.Variable(rand_w)
            self.b = tf.Variable(rand_b)
            self.built = True
        # Compute the model output
        z = tf.add(tf.matmul(x, self.w), self.b)
        z = tf.squeeze(z, axis=1)
        if train:
            # Return logits during training
            return z
        return tf.sigmoid(z)

log_reg = LogisticRegression()

def predict_class(y_pred, thresh=0.5):
    # Return a tensor with `1` if `y_pred` > `thresh`, and `0` otherwise
    return tf.cast(y_pred > thresh, tf.float32)

def accuracy(y_pred, y):
    # Return the proportion of matches between `y_pred` (logits) and `y`
    y_pred = tf.math.sigmoid(y_pred)
    y_pred_class = predict_class(y_pred)
    check_equal = tf.cast(y_pred_class == y, tf.float32)
    acc_val = tf.reduce_mean(check_equal)
    return acc_val

batch_size = 64
train_dataset = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train))
train_dataset = train_dataset.shuffle(buffer_size=x_train.shape[0]).batch(batch_size)
test_dataset = tf.data.Dataset.from_tensor_slices((x_test_norm, y_test))
test_dataset = test_dataset.shuffle(buffer_size=x_test.shape[0]).batch(batch_size)
```
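The normalization step can be illustrated on its own with a few made-up numbers. This NumPy sketch mirrors what the `Normalize` class does (subtract the training mean, divide by the training standard deviation); the feature values are invented for illustration.

```python
import numpy as np

# Made-up feature matrix: rows are samples, columns are (HR in bpm, TEMP in deg C)
x_train = np.array([[60.0, 30.0], [80.0, 32.0], [100.0, 34.0]])
x_test = np.array([[90.0, 31.0]])

# Statistics come from the training split only
mean = x_train.mean(axis=0)
std = x_train.std(axis=0)

x_train_z = (x_train - mean) / std
x_test_z = (x_test - mean) / std

print(x_train_z.mean(axis=0), x_train_z.std(axis=0))
print(x_test_z)
```

After the transform, both training columns have zero mean and unit standard deviation, even though heart rate and temperature started on very different scales. The test split reuses the training statistics, so no information from the test data leaks into training.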

With the training set and testing set defined above, we’re ready to turn our innocent smartwatch into a little evil stalker.

Our goal is to minimize the loss function as much as possible. So we use the log loss (binary cross-entropy), implemented as `log_loss` above, as our loss function. The y with the hat (ŷ) is our predicted value, while the y without the hat is the ground truth value.

Substituting ŷ with the sigmoid of the simple linear equation Xw + b gives the loss as a function of the weights w and the bias b.

Gradient descent for logistic regression uses the partial derivatives of this loss with respect to w and b. These partial derivatives are used to update the parameters in each new iteration.
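Written out in their standard textbook form, these expressions look as follows. The notation is an assumption made here: $m$ is the number of samples, $\sigma$ is the sigmoid function, and $\eta$ is the learning rate.

$$L = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right], \qquad \hat{y} = \sigma(Xw + b)$$

$$\frac{\partial L}{\partial w} = \frac{1}{m} X^{\top}(\hat{y} - y), \qquad \frac{\partial L}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}_i - y_i)$$

$$w \leftarrow w - \eta\,\frac{\partial L}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial L}{\partial b}$$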

```python
# Set training parameters
from tqdm import tqdm

epochs = 200
learning_rate = 0.01
train_losses, test_losses = [], []
train_accs, test_accs = [], []

# Set up the training loop and start training
for epoch in tqdm(range(epochs)):
    batch_losses_train, batch_accs_train = [], []
    batch_losses_test, batch_accs_test = [], []

    # Iterate over the training data
    for x_batch, y_batch in train_dataset:
        with tf.GradientTape() as tape:
            y_pred_batch = log_reg(x_batch)
            batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Update the parameters with respect to the gradient calculations
        grads = tape.gradient(batch_loss, log_reg.variables)
        for g, v in zip(grads, log_reg.variables):
            v.assign_sub(learning_rate * g)
        # Keep track of batch-level training performance
        batch_losses_train.append(batch_loss)
        batch_accs_train.append(batch_acc)

    # Iterate over the testing data
    for x_batch, y_batch in test_dataset:
        y_pred_batch = log_reg(x_batch)
        batch_loss = log_loss(y_pred_batch, y_batch)
        batch_acc = accuracy(y_pred_batch, y_batch)
        # Keep track of batch-level testing performance
        batch_losses_test.append(batch_loss)
        batch_accs_test.append(batch_acc)

    # Keep track of epoch-level model performance
    train_loss, train_acc = tf.reduce_mean(batch_losses_train), tf.reduce_mean(batch_accs_train)
    test_loss, test_acc = tf.reduce_mean(batch_losses_test), tf.reduce_mean(batch_accs_test)
    train_losses.append(train_loss)
    train_accs.append(train_acc)
    test_losses.append(test_loss)
    test_accs.append(test_acc)
    if epoch % 20 == 0:
        print(f"Epoch: {epoch}, Training log loss: {train_loss:.3f}")
```

We set the epochs to 200 and the learning rate to 0.01 for our first trial. Since our dataset is large, we expect training to take a long time. If you want to train faster, feel free to decrease the number of epochs and increase the learning rate. However, you may also get lower accuracy on your own dataset. For more information on the basic algorithm of logistic regression, please see: Logistic regression for binary classification with Core APIs.
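To get a feel for how the learning rate affects convergence, here is a small NumPy sketch of logistic regression trained by gradient descent. Everything in it is an assumption for illustration: the data is synthetic, the updates are full-batch rather than mini-batches of 64, and the learning rates are chosen only to show a contrast.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_hat, y, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def train(x, y, learning_rate, epochs):
    """Full-batch gradient descent; returns the loss after each epoch."""
    w = np.zeros(x.shape[1])
    b = 0.0
    losses = []
    for _ in range(epochs):
        y_hat = sigmoid(x @ w + b)
        error = y_hat - y
        # Gradient of the mean log loss with respect to w and b
        w -= learning_rate * (x.T @ error) / len(y)
        b -= learning_rate * error.mean()
        losses.append(log_loss(sigmoid(x @ w + b), y))
    return losses

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))              # synthetic, already-normalized features
y = (x[:, 0] + x[:, 1] > 0).astype(float)  # synthetic labels

slow = train(x, y, learning_rate=0.01, epochs=50)
fast = train(x, y, learning_rate=0.5, epochs=50)
print(f"lr=0.01: {slow[-1]:.3f}, lr=0.5: {fast[-1]:.3f}")
```

On this toy problem, the larger learning rate reaches a lower loss in the same number of epochs. That will not hold for every dataset, since a learning rate that is too large can make training unstable.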

To view our results, simply use the *matplotlib* package to display the loss and accuracy:

```python
plt.plot(range(epochs), train_losses, label="Training loss")
plt.plot(range(epochs), test_losses, label="Testing loss")
plt.xlabel("Epoch")
plt.ylabel("Log loss")
plt.legend()
plt.title("Log loss vs training iterations");
```

```python
plt.plot(range(epochs), train_accs, label="Training accuracy")
plt.plot(range(epochs), test_accs, label="Testing accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy (%)")
plt.legend()
plt.title("Accuracy vs training iterations");
```

```python
print(f"Final training log loss: {train_losses[-1]:.3f}")
print(f"Final testing log Loss: {test_losses[-1]:.3f}")
print(f"Final training accuracy: {train_accs[-1]:.3f}")
print(f"Final testing accuracy: {test_accs[-1]:.3f}")
```

```
Final training log loss: 0.570
Final testing log Loss: 0.743
Final training accuracy: 0.701
Final testing accuracy: 0.594
```

As a result, this model can roughly predict the outcome, but not reliably, and it quickly starts overfitting at epoch 2. Perhaps the features of the data need a better loss function to map them onto the ground truth labels, so the model only reaches around 59% accuracy. Another reason may be that logistic regression requires the independent variables to have little or no multicollinearity. In our case, it is very likely that the four features respond in a similar way to eating signals. Since this doesn’t give a great result, let’s try a more complicated model.
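A quick way to check for multicollinearity is to look at the correlation matrix of the features. The sketch below uses synthetic stand-ins, not the recorded data: two signals share a common driver and a third is independent.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Synthetic stand-ins: a shared "eating response" drives two of the signals
shared = rng.normal(size=n)
hr = shared + 0.1 * rng.normal(size=n)
eda = shared + 0.1 * rng.normal(size=n)
temp = rng.normal(size=n)  # unrelated to the other two

features = np.column_stack([hr, eda, temp])
# rowvar=False treats each column as one variable
corr = np.corrcoef(features, rowvar=False)
print(np.round(corr, 2))
```

Off-diagonal values close to 1 or -1 point to strongly correlated features. On the real data, the same check could be run with `clean_dataset.drop(columns='eating_status').corr()`.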

We first preprocess the data by assigning 80% of the total data to the training set and 20% to the testing set. We then use *StandardScaler* to scale the data so that it is ready for machine learning training.

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
import math

df = clean_dataset
X = df.drop('eating_status', axis=1)
y = df['eating_status']

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

The NN set-up begins by assigning the weights w and the bias b. Each evaluated value is then fed into an activation function, which adds a non-linearity that is not possible with simple linear regression or logistic regression algorithms.

The evaluated value ŷ is compared with the ground truth label y to calculate the loss, and the weights are updated based on how large the loss is, with the step size governed by the learning rate. Here we use the same log loss from the logistic regression as our loss function. It is available in TensorFlow as the built-in *keras.losses.binary_crossentropy*.
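The forward pass and the loss can be sketched in plain NumPy. This is an illustration under stated assumptions, not the Keras implementation: the network is deliberately tiny, and the weights and inputs are random.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Hidden layers use ReLU; the single output unit uses a sigmoid."""
    a = x
    for w, b in params[:-1]:
        a = relu(a @ w + b)
    w_out, b_out = params[-1]
    return sigmoid(a @ w_out + b_out).ravel()

def binary_crossentropy(y, y_hat, eps=1e-7):
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

rng = np.random.default_rng(42)
sizes = [4, 8, 1]  # 4 input features, one small hidden layer, 1 output
params = [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(5, 4))  # five made-up, already-scaled samples
y = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
y_hat = forward(x, params)
print(y_hat.shape, binary_crossentropy(y, y_hat))
```

Training then amounts to nudging every weight and bias in the direction that lowers this loss, which is what the optimizer does for us.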

The NN layers are set to 128x256x256. Clearly, this network is quite complicated and carries a risk of overfitting. For more about overfitting, see: Complete Guide to Prevent Overfitting in Neural Networks. Feel free to change the dimensions of the NN and try it on your own dataset to get a better idea of how the layer dimensions affect the output. (For a great animated explanation of how layers work, see the TensorFlow website: Neural Network Layers from Tensorflow.)
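To put a number on “complicated”, the sketch below counts the trainable parameters of a fully connected network with these layer sizes, assuming 4 input features and 1 output unit. The helper name `dense_param_count` is made up for this example.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for each layer in a stack of fully connected layers."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

# 4 input features -> 128 -> 256 -> 256 -> 1 output
counts = dense_param_count([4, 128, 256, 256, 1])
print(counts, sum(counts))  # [640, 33024, 65792, 257] 99713
```

Roughly 100,000 parameters for four input features is a lot of capacity, which is why the overfitting risk is worth keeping in mind.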

```python
import tensorflow as tf

tf.random.set_seed(42)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(
    loss=tf.keras.losses.binary_crossentropy,
    # `learning_rate` replaces the deprecated `lr` argument
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.03),
    metrics=[
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)

history = model.fit(X_train_scaled, y_train, validation_split=0.2, batch_size=32, epochs=100)
```

After the step above, you should have a solid trained model at hand. Now, let’s see how our model performs with some visualization tools:

```python
from matplotlib import rcParams

rcParams['figure.figsize'] = (18, 8)
rcParams['axes.spines.top'] = False
rcParams['axes.spines.right'] = False

plt.plot(
    np.arange(1, 101),
    history.history['loss'], label='Training Loss'
)
plt.plot(
    np.arange(1, 101),
    history.history['accuracy'], label='Training Accuracy'
)
plt.plot(
    np.arange(1, 101),
    history.history['val_loss'], label='Validation Loss'
)
plt.plot(
    np.arange(1, 101),
    history.history['val_accuracy'], label='Validation Accuracy'
)
plt.title('Evaluation metrics', size=20)
plt.xlabel('Epoch', size=14)
plt.legend();
```
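Curves like these can also be checked numerically. The sketch below finds the epoch with the lowest validation loss in a history-style dictionary and reports the gap to the training loss at that point. The helper `best_epoch` and the numbers in `fake_history` are made up for illustration; with a real run, `history.history` could be passed in instead.

```python
def best_epoch(history):
    """Return (1-based epoch with the lowest val_loss, val_loss - loss at that epoch)."""
    val_loss = history["val_loss"]
    idx = min(range(len(val_loss)), key=val_loss.__getitem__)
    return idx + 1, val_loss[idx] - history["loss"][idx]

# Made-up curves: validation loss bottoms out at epoch 3, then rises again
fake_history = {
    "loss":     [0.60, 0.45, 0.35, 0.28, 0.22],
    "val_loss": [0.62, 0.50, 0.44, 0.47, 0.55],
}
epoch, gap = best_epoch(fake_history)
print(epoch, round(gap, 2))
```

If the best epoch is well before the last one while the training loss keeps falling, that is a typical sign of overfitting. If the best epoch is the last one, the validation loss was still improving when training stopped.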

The loss and accuracy graph shows surprisingly good results. The accuracy approaches 100% as the number of epochs increases. The model is not overfitting, since the validation loss is still decreasing and the validation accuracy is approaching 100%. The model looks promising. So, let’s test our model with our testing set and see how it performs:

```python
# Simply use the predict method to feed in our testing set
predictions = model.predict(X_test_scaled)

# The predicted values are plain numbers, and we need to classify each number
# based on its value. If it is larger than 0.5, we classify it as positive (True),
# otherwise as negative (False).
prediction_classes = [
    1 if prob > 0.5 else 0 for prob in np.ravel(predictions)
]
```

We can also use the confusion matrix to evaluate our performance and get a better idea of the AUC/ROC curve (you can try it yourself with the help of this website: How to Use ROC Curves and Precision-Recall Curves for Classification in Python):

```python
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_test, prediction_classes))
```

```
[[14869   368]
 [  240  9892]]
```

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

print(f'Accuracy: {accuracy_score(y_test, prediction_classes):.2f}')
print(f'Precision: {precision_score(y_test, prediction_classes):.2f}')
print(f'Recall: {recall_score(y_test, prediction_classes):.2f}')
```

```
Accuracy: 0.98
Precision: 0.96
Recall: 0.98
```
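These three numbers can be recomputed by hand from the confusion matrix printed earlier. The sketch assumes scikit-learn’s layout, where rows are the true classes and columns are the predicted classes, so the matrix reads `[[TN, FP], [FN, TP]]`.

```python
# Values copied from the confusion matrix above
tn, fp = 14869, 368
fn, tp = 240, 9892

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of all predicted "eating", how many were right
recall = tp / (tp + fn)      # of all true "eating", how many were found
print(f"{accuracy:.2f} {precision:.2f} {recall:.2f}")  # 0.98 0.96 0.98
```

The 368 false positives are moments where the model predicted eating when I was not, and the 240 false negatives are eating moments that it missed.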

From the results above, we can see a huge performance increase over our logistic regression model. You can also try modifying the complexity of the network, watch the performance increase or decrease, and find the best model for your dataset.