Machine Learning for Jiu Jitsu
Pose Tracking to Get Better at Jiu Jitsu
What’s Pose Tracking?
What’s Jiu Jitsu?
Why Pose Tracking for Jiu Jitsu
Why Did I Do This?
Machine Learning Comes In
Steps on Using Pose Tracking to Generate Insights for Jiu Jitsu
Pose Tracking to Track Body Joints
The Preliminary Results
Final Note
References

Photo by Kampus Production from Pexels: https://www.pexels.com/photo/a-judoka-throwing-an-opponent-to-the-ground-6765024/

Using pose estimation with mediapipe to trace Jiu Jitsu movements

Brazilian Jiu-Jitsu is a martial art that has been gaining a lot of popularity recently because of its effectiveness and applicability in real-world combat.

I've been practicing Brazilian Jiu Jitsu for over 10 years, and I decided to join my interests in martial arts and machine learning to come up with a project that lives at the intersection of these two really interesting fields.

Therefore, I turned to pose estimation as a promising technique for a complementary tool to help with my development in Jiu Jitsu.

Pose tracking is the process of detecting and tracking the movement of a person's body in real time using computer vision. It involves using algorithms to capture and interpret the movement of various body parts, such as the arms, legs, and torso.

This technique is relevant for analyzing body movement in sports because it allows coaches and athletes to identify and correct movement patterns that may be hurting performance or causing injuries.

By providing real-time feedback, athletes can make adjustments to their technique, resulting in improved performance and reduced risk of injury. Moreover, this technology can be used to compare movements to those of top performers in the sport, for instance, to help beginners identify areas for improvement and refine their technique accordingly.

Jiu Jitsu is a martial art centered around the concept of subduing opponents through a mix of pins and submission holds like joint locks and chokes.

Jiu Jitsu focuses on grappling and ground-fighting techniques. It was initially developed in Japan and later modified and popularized in Brazil, and it has since spread all around the world thanks to a surge in popularity, particularly in the USA.

The basic principle is that a smaller, weaker person can defend against a bigger, stronger opponent by using leverage and technique. Practitioners aim to control their opponent's body and work their way into a dominant position from which they can execute techniques such as chokes, joint locks, and throws.

Image by Timothy Eberly on https://unsplash.com/photos/7MRajrPiTqw

Jiu Jitsu is now a popular sport and self-defense system practiced all around the world. It requires physical and mental discipline, as well as a willingness to learn and adapt.

It has also been found to have numerous benefits, including improved physical fitness and mental acuity, increased confidence and self-esteem, as well as stress relief.

The heavy emphasis on technique makes this martial art quite unique, and within the context of a Jiu Jitsu gym, it is usually the role of the black belt coach to give feedback to the student regarding the appropriateness of his or her execution of the various techniques.

However, it is often the case that people want to learn but either don't have access to an expert, or the class contains too many students and it becomes difficult for the person running the class to give specific, personal feedback on whether the student is performing the movements correctly.

It is within this feedback gap that I think tools like pose tracking can greatly benefit the world of martial arts in general and Jiu Jitsu in particular (although one can argue the same for Judo, wrestling, and striking-based martial arts as well), because they could be seamlessly integrated into a smartphone, only requiring athletes to film themselves while performing the movement they are trying to improve.

The form this feedback should take is something that still needs to be developed, and this article is an attempt to provide directions for how such a machine-learning-based feedback system could help students get better at performing foundational movements in the sport.

Okay, so here is the story.

Often, as you develop your Jiu Jitsu skills, you end up falling into one of two categories: bottom player or top player. That is, you either tend to play from the bottom using your "guard" (a reference to the use of the legs to attack the opponent) or from the top, by first taking down your opponent and then passing the line of their legs to (usually) reach a dominant position, like mounting your opponent or taking his or her back.

Image by Nolan Kent on https://unsplash.com/photos/x_V62hOwnDk?utm_source=unsplash&utm_medium=referral&utm_content=creditShareLink

This duality is clearly artificial, and typically most experienced players can play both positions extremely well.

However, people do tend to lean toward preferences at the beginning of their Jiu Jitsu journey, and that can hugely impact their progress in other areas if they get stuck executing the same strategy over and over.

In a way, that's what happened to me. I used to fight a lot as a guard player, owing to a predominant culture in Brazil of starting grappling bouts from the knees, either to avoid injuries or because the mat space isn't as big as the wrestling mats in big US high school gyms.

Image by the author. Photos of me in competition pulling guard.

This habit of willingly sitting down and fighting off my back, without engaging my opponents in the standup game, had a negative impact on my development as a martial artist, because as I got better and better at Jiu Jitsu, I noticed that one thing holding me back was my lack of high-level knowledge of how to take people down.

This lit a fire in me to start working more from a standing position, and I went on to study and practice Wrestling and Judo a few years into my brown belt.

Over the last 2 years I have been mostly a top player, and I have indeed improved my ability to take people to the ground quite a bit.

Image by the author. My wrestling journey.

However, there are certain foundational movements in Judo, for example, that are extremely difficult to develop, and since I don't know any Judo experts, nor do I live near any high-level Judo or Wrestling gyms, I realized that I needed another way to improve certain foundational movements, specifically the hip mobility for takedowns like the "Uchimata" and other hip-based throws.

Photo by Kampus Production from Pexels: https://www.pexels.com/photo/a-judoka-throwing-an-opponent-to-the-ground-6765024/

Okay, so with the goal of improving my ability to perform Judo throws like the Uchimata, I concocted a "geeky" plan: I'm gonna use Machine Learning (I know, such a specific plan).

I decided I wanted to investigate whether I could use pose tracking to gather insights on how to correct things like the speed and direction of the feet, and other aspects of executing these movements.

So let’s get into how I did that.

The general plan was this:

1. Find a video reference containing the movement I was trying to emulate

2. Record myself performing the movement repeatedly

3. Generate insights using pose tracking and visualization with Python.

To do all of that I needed a reference video of an elite-level practitioner performing the move I was trying to learn. In the case of the Uchimata, I found this video of an Olympic-level player performing a warm-up drill against the wall that is directly relevant to what I wanted to learn:

Then I started recording myself performing the movement in my training sessions, at least the ones where I am actively learning a certain move.

Image by the author.

With the reference video in hand, and having recorded some of my own footage, I was ready to try out some fun machine learning stuff.

For the pose tracking I used something called mediapipe, Google's open-source project for facilitating the application of machine learning to live and streaming media.

The ease of use of this option got me excited to try it out.

In essence, I did the following:

I wrote this code to create videos where the model estimates the positions of the body joints and overlays them on the actual footage, to showcase the robustness of the model.

Image by the author.

Yes, yes, I know, I don't look exactly elite-level. But give me a break, my Judo skills are under construction!

The code I used for this was:

from base64 import b64encode
import pathlib  # needed for pathlib.Path below
import cv2
import mediapipe as mp
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
import numpy as np
from natsort import natsorted
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.animation import FuncAnimation
from IPython.display import clear_output
%matplotlib inline
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
from IPython.display import HTML, display
import ipywidgets as widgets
from typing import List  # I don't think I need this!

# Custom imports
from pose_tracking_utils import *

mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_pose = mp.solutions.pose

def create_pose_tracking_video(video_path):
    # Open the input video and set up the output writer
    cap = cv2.VideoCapture(video_path)
    frame_width = int(cap.get(3))
    frame_height = int(cap.get(4))
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    output_path = pathlib.Path(video_path).stem + "_pose.mp4"
    out = cv2.VideoWriter(output_path, fourcc, 30.0, (frame_width, frame_height))
    with mp_pose.Pose(min_detection_confidence=0.5,
                      min_tracking_confidence=0.5) as pose:
        while cap.isOpened():
            success, image = cap.read()
            if not success:
                print("Ignoring empty camera frame.")
                break
            # To improve performance, optionally mark the image as
            # not writeable to pass by reference.
            image.flags.writeable = False
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            results = pose.process(image)
            # Draw the pose annotation on the image.
            image.flags.writeable = True
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
            mp_drawing.draw_landmarks(image, results.pose_landmarks,
                                      mp_pose.POSE_CONNECTIONS,
                                      landmark_drawing_spec=mp_drawing_styles.get_default_pose_landmarks_style())

            # Flip the image horizontally for a self-view display.
            out.write(cv2.flip(image, 1))
            if cv2.waitKey(5) & 0xFF == 27:
                break

    cap.release()
    out.release()
    print("Pose video created!")

    return output_path

This essentially leverages the mediapipe package to generate a visualization that detects the keypoints and overlays them on top of the video footage.
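As a minimal usage sketch (the clip path is the same training clip used in the next snippet), this is all it takes to produce the overlay video:

pose_video_path = create_pose_tracking_video("./videos/clip_training_session_1.mp4")
print(pose_video_path)  # "clip_training_session_1_pose.mp4"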

VIDEO_PATH = "./videos/clip_training_session_1.mp4"

# Initialize the MediaPipe Pose model
body_part_index = 32  # landmark 32 = right foot index (toe tip) in MediaPipe Pose
pose = mp_pose.Pose(static_image_mode=False, min_detection_confidence=0.5, min_tracking_confidence=0.5)

# Initialize the OpenCV VideoCapture object to read the video
cap = cv2.VideoCapture(VIDEO_PATH)

# Create an empty list to store the trace of the tracked body part
trace = []

# Create empty lists to store the x, y, z coordinates of the tracked body part
x_vals = []
y_vals = []
z_vals = []

# Create a Matplotlib figure and subplot for the real-time updating plot
# fig, ax = plt.subplots()
# plt.title('Time Lapse of the X Coordinate')
# plt.xlabel('Frames')
# plt.ylabel('Coordinate Value')
# plt.xlim(0, 1)
# plt.ylim(0, 1)
# plt.ion()
# plt.show()
frame_num = 0

while True:
    # Read a frame from the video capture
    success, image = cap.read()
    if not success:
        break
    # Convert the frame to RGB format
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # Process the frame with the MediaPipe Pose model
    results = pose.process(image)

    # Check if any body parts were detected
    if results.pose_landmarks:
        # Get the x, y, z coordinates of the tracked body part
        landmark = results.pose_landmarks.landmark[body_part_index]
        x, y, z = landmark.x, landmark.y, landmark.z

        # Append the x, y, z values to the corresponding lists
        x_vals.append(x)
        y_vals.append(y)
        z_vals.append(z)

        # Add the (x, y) pixel coordinates to the trace list
        trace.append((int(x * image.shape[1]), int(y * image.shape[0])))

    # Draw the trace on the image
    for i in range(len(trace) - 1):
        cv2.line(image, trace[i], trace[i + 1], (255, 0, 0), thickness=2)

    plt.title('Time Lapse of the Y Coordinate')
    plt.xlabel('Frames')
    plt.ylabel('Coordinate Value')
    plt.xlim(0, len(y_vals))
    plt.ylim(0, 1)
    plt.plot(y_vals);
    # Clear the plot and update it with the new x, y, z coordinate values
    # ax.clear()
    # ax.plot(range(0, frame_num + 1), x_vals, 'r.', label='x')
    # ax.plot(range(0, frame_num + 1), y_vals, 'g.', label='y')
    # ax.plot(range(0, frame_num + 1), z_vals, 'b.', label='z')
    # ax.legend(loc='upper left')
    # plt.draw()
    plt.pause(0.00000000001)
    clear_output(wait=True)
    frame_num += 1

    # Convert the image back to BGR format for display
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

    # Display the image
    cv2.imshow('Pose Tracking', image)

    # Wait for user input to exit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video capture, close all windows, and clear the plot
cap.release()
cv2.destroyAllWindows()
plt.close()

After that, I generated a plot containing the timeline of the x, y, z coordinates:

plt.figure(figsize=(15, 7))

plt.subplot(3, 1, 1)
plt.title('Time Lapse of the x Coordinate')
plt.xlabel('Frames')
plt.ylabel('Coordinate Value')
plt.xlim(0, len(x_vals))
plt.ylim(0, 1)
plt.plot(x_vals)

plt.subplot(3, 1, 2)
plt.title('Time Lapse of the y Coordinate')
plt.xlabel('Frames')
plt.ylabel('Coordinate Value')
plt.xlim(0, len(y_vals))
plt.ylim(0, 1.1)
plt.plot(y_vals)

plt.subplot(3, 1, 3)
plt.title('Time Lapse of the z Coordinate')
plt.xlabel('Frames')
plt.ylabel('Coordinate Value')
plt.xlim(0, len(z_vals))
plt.ylim(-1, 1)
plt.plot(z_vals)

plt.tight_layout();

The idea here would be to have granular control over things like the direction of your feet when executing a movement.
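As a rough illustration of what "direction of the feet" could mean in terms of these coordinates, here is a small sketch (not part of my original analysis; the helper name is mine) that turns consecutive x, y samples into a per-frame direction angle:

def foot_direction_angles(x_vals, y_vals):
    """Approximate per-frame direction of travel of the tracked joint, in degrees.
    0 degrees points to the right of the frame; the sign of dy is flipped
    because image y coordinates grow downward."""
    dx = np.diff(np.asarray(x_vals))
    dy = -np.diff(np.asarray(y_vals))
    return np.degrees(np.arctan2(dy, dx))

# Hypothetical usage with the x_vals and y_vals lists collected above:
# plt.plot(foot_direction_angles(x_vals, y_vals))
# plt.xlabel('Frames'); plt.ylabel('Direction (degrees)')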

Now that I was confident that the model was properly capturing my body pose, I created some trace visualizations of relevant body joints like the feet (which are absolutely essential when performing takedown techniques).
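For reference, the landmark indices I use below (31 and 32) can be looked up through MediaPipe's PoseLandmark enum; a quick check, assuming the mp_pose alias defined earlier:

print(mp_pose.PoseLandmark.LEFT_FOOT_INDEX.value)   # 31
print(mp_pose.PoseLandmark.RIGHT_FOOT_INDEX.value)  # 32
print(mp_pose.PoseLandmark.LEFT_ANKLE.value)        # 27
print(mp_pose.PoseLandmark.RIGHT_ANKLE.value)       # 28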

To get an idea of how the move is performed, I produced a visualization that represents the execution of that movement from the perspective of a single body part, in this case, the feet:

Image by the author.

I did this both for my training sessions and for the reference video containing the motion I was trying to imitate.

It's important to note here that there are many issues with doing this, regarding the resolution of the cameras, the distance at which the movements were performed, as well as the frame rate of each recording; however, I'm just going to bypass all of that to create a fancy plot (LoL).

Again, the code for this approach:

def create_joint_trace_video(video_path, body_part_index=32, color_rgb=(255, 0, 0)):
    """
    Creates a video with a trace of the body part being tracked.
    video_path: The path to the video being analysed.
    body_part_index: The index of the body part being tracked.
    """
    # Initialize the MediaPipe Pose model
    pose = mp_pose.Pose(static_image_mode=False, min_detection_confidence=0.5, min_tracking_confidence=0.5)

    # Initialize the OpenCV VideoCapture object and the output writer
    cap = cv2.VideoCapture(video_path)
    frame_width = int(cap.get(3))
    frame_height = int(cap.get(4))
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    output_path = pathlib.Path(video_path).stem + "_trace.mp4"
    out = cv2.VideoWriter(output_path, fourcc, 30.0, (frame_width, frame_height))

    # Create an empty list to store the trace of the body part being tracked
    trace = []

    with mp_pose.Pose(min_detection_confidence=0.5,
                      min_tracking_confidence=0.5) as pose:
        while cap.isOpened():
            success, image = cap.read()
            if not success:
                print("Ignoring empty camera frame.")
                break

            # Convert the frame to RGB format
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

            # Process the frame with the MediaPipe Pose model
            results = pose.process(image)

            # Check if any body parts were detected
            if results.pose_landmarks:
                # Get the x, y pixel coordinates of the body part being tracked
                x = int(results.pose_landmarks.landmark[body_part_index].x * image.shape[1])
                y = int(results.pose_landmarks.landmark[body_part_index].y * image.shape[0])

                # Add the coordinates to the trace list
                trace.append((x, y))

            # Draw the trace on the image
            for i in range(len(trace) - 1):
                cv2.line(image, trace[i], trace[i + 1], color_rgb, thickness=2)

            # Convert the image back to BGR format for writing
            image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)

            # Write the annotated frame to the output video
            out.write(image)
            if cv2.waitKey(5) & 0xFF == 27:
                break

    cap.release()
    out.release()
    print("Joint Trace video created!")

Here I'm simply processing each frame as I did before to produce the pose videos; however, I'm also appending the x and y coordinates of the relevant body part to a list I call `trace`, which is used to draw the tracing line that follows the body part throughout the video.
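A usage sketch, reusing the clip paths that appear elsewhere in this post (the colors are arbitrary):

# Trace of the left foot (landmark 31) for one of my training clips, drawn in red
create_joint_trace_video("./videos/clip_training_session_2.mp4", body_part_index=31, color_rgb=(255, 0, 0))

# Same trace for the reference clip, drawn in green
create_joint_trace_video("./videos/uchimata_wall.mp4", body_part_index=31, color_rgb=(0, 255, 0))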

4. Comparing the Traces

With these capabilities in place, I could finally get to the part of gathering insights from this approach.

To do that, I needed a way to compare these traces in order to produce some kind of visually rich feedback that could help me understand how my poor execution of the movement compared to that of an elite athlete.

Now, the raw traces, without the video in the background, were plotted on a graph.

def get_joint_trace_data(video_path, body_part_index, xmin=300, xmax=1000,
                         ymin=200, ymax=800):
    """
    Creates a graph with the tracing of a specific body part
    while executing a certain movement.
    """
    cap = cv2.VideoCapture(video_path)
    frame_width = int(cap.get(3))
    frame_height = int(cap.get(4))

    # Create an empty list to store the trace of the body part being tracked
    trace = []
    i = 0
    with mp_pose.Pose(min_detection_confidence=0.5,
                      min_tracking_confidence=0.5) as pose:
        while cap.isOpened():
            success, image = cap.read()
            if not success:
                print("Ignoring empty camera frame.")
                break

            # Convert the frame to RGB format
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

            # Process the frame with the MediaPipe Pose model
            results = pose.process(image)

            # Check if any body parts were detected
            if results.pose_landmarks:
                # Get the x, y pixel coordinates of the body part being tracked
                x = int(results.pose_landmarks.landmark[body_part_index].x * image.shape[1])
                y = int(results.pose_landmarks.landmark[body_part_index].y * image.shape[0])

                # Add the coordinates to the trace list
                trace.append((x, y))

            # Plot the trace on a graph
            fig, ax = plt.subplots()
            # ax.imshow(image)
            ax.set_xlim(xmin, xmax)
            ax.set_ylim(ymin, ymax)
            ax.invert_yaxis()
            ax.plot(np.array(trace)[:, 0], np.array(trace)[:, 1], color='r')
            # plt.savefig(f'joint_trace{i}.png')
            # plt.close()
            i += 1
            plt.pause(0.00000000001)
            clear_output(wait=True)
            # Display the graph
            # plt.show()

            if cv2.waitKey(5) & 0xFF == 27:
                break

    cap.release()

    return trace

video_path = "./videos/clip_training_session_2.mp4"
body_part_index = 31
foot_trace = get_joint_trace_data(video_path, body_part_index)

video_path = "./videos/uchimata_wall.mp4"
body_part_index = 31
foot_trace_reference = get_joint_trace_data(video_path, body_part_index, xmin=0, ymin=0, xmax=1300)

foot_trace_clip = foot_trace[:len(foot_trace_reference)]
plt.subplot(1,2,1)
plt.plot(np.array(foot_trace_clip)[:, 0], np.array(foot_trace_clip)[:, 1], color='r')
plt.gca().invert_yaxis();

plt.subplot(1,2,2)
plt.plot(np.array(foot_trace_reference)[:, 0], np.array(foot_trace_reference)[:, 1], color='g')
plt.gca().invert_yaxis();

Okay, with this we begin to see more clearly the differences in the signature shape of the foot's movement in the two contexts.

First, we see that while the elite player does more of a straight step into a turn, generating an almost complete half circle with his feet, I, on the other hand, have a curved initial step inside, and I also don't create a half circle when throwing my leg into the air.

Also, while the elite player generates a wide circle when moving his leg up, I create a shallow circle, almost like an ellipse.

Image by the author, comparing traces for movement execution.

I found these preliminary results quite encouraging because they indicate that, despite the limitations of the comparison, one can gauge differences in the signature shape of the movement's execution just by observing traces like these.

Besides that, I wanted to see if I could make comparisons regarding the speed with which the moves are performed. To investigate that, I visualized the real-time motion of the body-joint coordinates over time, putting the plots of me and the expert side by side to see how far off my timing was.

The challenge with this analysis is that, because the videos have varying speeds and are not aligned in any way, I first needed to align them in a meaningful way.

I wasn't sure which technique to use here, but a conversation with my buddy Aaron (a neuroscientist at the Champalimaud Neuroscience Institute in Lisbon) kind of illuminated an option for me: dynamic time warping.

Dynamic time warping (DTW) is a technique used to measure the similarity between two temporal sequences that may vary in speed.

The basic idea is that you have two different time series that may share some pattern you want to investigate, so you attempt to align them by applying a few rules that allow you to calculate the optimal match between the two sequences.
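A toy example, assuming the fastdtw package that I also use below, makes the idea concrete: the same ramp "played" at two different speeds gets a distance of zero once the indexes are warped onto each other.

from fastdtw import fastdtw
from scipy.spatial.distance import euclidean
import numpy as np

slow = np.array([[0.0], [1.0], [1.0], [2.0], [3.0], [3.0], [4.0]])  # same pattern, slower
fast = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])

distance, path = fastdtw(slow, fast, dist=euclidean)
print(distance)  # 0.0 -> the shapes match perfectly once warped
print(path)      # index pairs, e.g. [(0, 0), (1, 1), (2, 1), (3, 2), ...]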

Two repetitions of a walking sequence; although they have varying speeds, we can observe that the tracings of the limbs are quite similar. Taken from Wikipedia, referencing (Olsen et al., 2017).

I found a nice introduction to this topic in this article:

by Jeremy Zhang.

To use dynamic time warping, I did the following:

from fastdtw import fastdtw
from scipy.spatial.distance import euclidean

max_x = max(max(foot_trace_clip, key=lambda x: x[0])[0], max(foot_trace_reference, key=lambda x: x[0])[0])
max_y = max(max(foot_trace_clip, key=lambda x: x[1])[1], max(foot_trace_reference, key=lambda x: x[1])[1])

foot_trace_clip_norm = [(x/max_x, y/max_y) for (x, y) in foot_trace_clip]
foot_trace_reference_norm = [(x/max_x, y/max_y) for (x, y) in foot_trace_reference]

distance, path = fastdtw(foot_trace_clip_norm, foot_trace_reference_norm, dist=euclidean)

The outputs I get here are:

1. distance: the DTW distance between the two temporal sequences (computed with the Euclidean distance as the local metric).

2. path: a mapping between the indexes of the two temporal sequences, as a list of index tuples.

Now, I can use the output stored in the path variable to create a plot with both sequences aligned:

# path[i] = (index into the clip trace, index into the reference trace)
foot_trace_reference_norm_mapped = [foot_trace_reference_norm[path[i][1]] for i in range(len(path))]
foot_trace_clip_norm_mapped = [foot_trace_clip_norm[path[i][0]] for i in range(len(path))]

plt.subplot(1,2,1)
plt.plot(np.array(foot_trace_reference_norm_mapped)[:, 0], np.array(foot_trace_reference_norm_mapped)[:, 1], color='g')
plt.gca().invert_yaxis();

plt.subplot(1,2,2)
plt.plot(np.array(foot_trace_clip_norm_mapped)[:, 0], np.array(foot_trace_clip_norm_mapped)[:, 1], color='r')
plt.gca().invert_yaxis();
plt.show()

Image by the author, the temporal sequences aligned using the DTW algorithm.

Now, given the shortage of data, mainly for the reference trace, I can't say that this plot gave me much more insight than the elements already discussed; however, it does help to highlight what I said before regarding the shape of the movement.

Still, as a note for the future, my idea here was that, if certain conditions could be met to make both videos more uniform, I would like to have a reference tracing against which I could compare the tracings of my attempts, in order to use it for immediate feedback.

I'd use the distance output from the DTW algorithm as my feedback metric and have an app that would highlight when I am getting closer to or farther from the signature shape I am trying to emulate.
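A minimal sketch of what that feedback function could look like (the helper name and the threshold are hypothetical; it just reuses fastdtw from above on two normalized traces like the ones built in this post):

def score_attempt(attempt_trace, reference_trace, target_distance=20.0):
    """Score one attempt against a reference trace using DTW.
    Both traces are lists of normalized (x, y) tuples; target_distance
    is an arbitrary placeholder for an acceptable distance."""
    distance, _ = fastdtw(attempt_trace, reference_trace, dist=euclidean)
    verdict = "closer to" if distance <= target_distance else "farther from"
    print(f"DTW distance: {distance:.1f} -> {verdict} the reference shape")
    return distance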

To illustrate that with my actual data, let me walk through an example.

def find_individual_traces(trace, window_size=60, color_plot="r"):
    """
    Takes in a list of tuples containing x, y coordinates and plots them
    as separate clips of a given size, to allow the user to find the point
    where a full repetition of the movement has been completed.
    """
    clip_size = 0
    for i in range(len(trace) // window_size):
        plt.plot(np.array(trace[clip_size:clip_size + window_size])[:, 0],
                 np.array(trace[clip_size:clip_size + window_size])[:, 1],
                 color=color_plot)
        plt.gca().invert_yaxis()
        plt.title(f"Trace, clip size = {clip_size}")
        plt.show()
        clip_size += window_size

def get_individual_traces(trace, clip_size):
    num_clips = len(trace) // clip_size
    trace_clips = []
    i = 0
    for clip in range(num_clips):
        trace_clips.append(trace[i:i + clip_size])
        i += clip_size

    return trace_clips

find_individual_traces(foot_trace_clip_norm)

Images by the author. Traces of the movement of the feet executed by me.

Here I'm showing clips from the video where I execute each individual repetition of the movement. Each of these traces can be compared with a reference trace obtained in the same way:

find_individual_traces(foot_trace_reference_norm, window_size=45, color_plot="g")

Images by the author. Traces of the movement of the feet executed by the elite player.

When I obtain the reference tracings I get some noisy signals as well, but I'll use the third one as my reference:

Image by the author.

Now I can loop over the tracings representing my actual movement and see how they compare with this reference trace across a couple of training sessions.

video_path = "./videos/clip_training_session_3.mp4"
body_part_index = 31
foot_trace_clip = get_joint_trace_data(video_path, body_part_index)

video_path = "./videos/uchimata_wall.mp4"
body_part_index = 31
foot_trace_reference = get_joint_trace_data(video_path, body_part_index,xmin=0,ymin=0,xmax=1300)

# Showing a plot with the tracings from the training session
plt.plot(np.array(foot_trace_clip)[:, 0], np.array(foot_trace_clip)[:, 1], color='r')
plt.gca().invert_yaxis();

Image by the author. Tracings of the x, y coordinates of the feet over a few executions of the movement.

Now I get the normalized values for both tracings.

max_x = max(max(foot_trace_clip, key=lambda x: x[0])[0], max(foot_trace_reference, key=lambda x: x[0])[0])
max_y = max(max(foot_trace_clip, key=lambda x: x[1])[1], max(foot_trace_reference, key=lambda x: x[1])[1])

foot_trace_clip_norm = [(x/max_x, y/max_y) for (x, y) in foot_trace_clip]
foot_trace_reference_norm = [(x/max_x, y/max_y) for (x, y) in foot_trace_reference]

I get the individual tracings from the training clip as well as the reference traces to help me set a goal.

The clip size is set manually.

traces = get_individual_traces(foot_trace_clip_norm, clip_size=67)
traces_ref = get_individual_traces(foot_trace_reference_norm, clip_size=60)

Below I show an example of the tracings obtained after removing a few that I manually classified as noise upon empirical observation.

# Here I show an example trace from the new clip
index = 0
color_plot = "black"
plt.plot(np.array(traces[index])[:, 0], np.array(traces[index])[:, 1], color=color_plot)
plt.gca().invert_yaxis()
plt.title(f"Trace {index}")
plt.show()
Image by the author.

Then I loop over the tracings and plot their scores as compared with a reference trace picked from those obtained from the video of the elite player:

trace_ref = traces_ref[2]  # the third reference tracing, as mentioned above
trace_scores = []

for trace in traces:
    distance, path = fastdtw(trace, trace_ref, dist=euclidean)
    trace_scores.append(distance)

plt.plot(trace_scores, color="black")
plt.title("Trace Scores with DTW")
plt.xlabel("Trace Index")
plt.ylabel("Euclidean Distance Score")
plt.show()

Image by the author.

Now, the first weird thing I noticed here is the up and down of the metric, which can only be explained by the fact that some of the tracings obtained corresponded to the foot coming down rather than up while executing the movement.

However, the cool thing about this plot is that the scores for the tracings even appeared to improve a bit, and at the very least stayed consistent around 20 (which in this case is the DTW distance between the two sequences, computed with the Euclidean metric).

Despite not being able to interpret these numbers conclusively at this point, I found it quite insightful that an approach like this could be converted into a measurable metric that compares the quality of one execution of a movement with respect to another.

In the future, I would like to look into how to better extract the training clips to obtain perfectly aligned segments of each execution of a movement, in order to produce more consistent results.

Overall, I think doing these experiments was quite interesting because they pointed to the power of this technique to provide a granular assessment of movement, even though it would still need a lot of work in order to become a useful tool for insight.

If you prefer video, check out my YouTube video on this topic here:
