
Designing and Deploying a Machine Learning Python Application (Part 2)


As we haven’t quite solved the key problems, let’s dig in just a bit further before getting into the low-level nitty-gritty. As stated by Heroku:

Web applications that process incoming HTTP requests concurrently make much more efficient use of dyno resources than web applications that only process one request at a time. For this reason, we recommend using web servers that support concurrent request processing whenever developing and running production services.

The Django and Flask web frameworks feature convenient built-in web servers, but these blocking servers only process a single request at a time. If you deploy with one of these servers on Heroku, your dyno resources will be underutilized and your application will feel unresponsive.

We’re already ahead of the game by using worker multiprocessing for the ML task, but we can take this a step further by using Gunicorn:

Gunicorn is a pure-Python HTTP server for WSGI applications. It allows you to run any Python application concurrently by running multiple Python processes within a single dyno. It provides a perfect balance of performance, flexibility, and configuration simplicity.

Okay, awesome, now we can utilize even more processes, but there’s a catch: each new Gunicorn worker process will represent a copy of the application, meaning that they too will use the base ~150MB RAM in addition to the Heroku process. So, say we pip install gunicorn and now initialize the Heroku web process with the following command:

gunicorn .wsgi:application --workers=2 --bind=0.0.0.0:$PORT

The base ~150MB RAM in the web process turns into ~300MB RAM (base memory usage multiplied by the number of Gunicorn workers).

While staying mindful of the constraints of multithreading a Python application, we can add threads to workers as well using:

gunicorn .wsgi:application --threads=2 --worker-class=gthread --bind=0.0.0.0:$PORT

Even with problem #3, we can still find a use for threads, as we want to ensure our web process is capable of processing more than one request at a time while being careful of the application’s memory footprint. Here, our threads could handle small requests while ensuring the ML task is distributed elsewhere.
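If you want both, workers and threads can be combined in a single command. A rough example mirroring the flags above (tune the counts to your dyno’s memory budget):

gunicorn .wsgi:application --workers=2 --threads=2 --worker-class=gthread --bind=0.0.0.0:$PORT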

Either way, by using Gunicorn workers, threads, or both, we’re setting our Python application up to process more than one request at a time. We’ve more or less solved problem #2 by incorporating various ways to implement concurrency and/or parallel task handling while ensuring our application’s critical ML task doesn’t rely on potential pitfalls, such as multithreading, setting us up for scale and getting to the root of problem #3.

Okay, so what about that tricky problem #1? At the end of the day, ML processes will typically end up taxing the hardware in one way or another, whether that be memory, CPU, and/or GPU. Nevertheless, by using a distributed system, our ML task remains integrally linked to the main web process yet is handled in parallel via a Celery worker. We can track the start and end of the ML task via the chosen Celery broker, as well as review metrics in a more isolated manner. Ultimately, tailoring the Celery and Heroku worker process configurations is up to you, but this is an excellent starting point for integrating a long-running, memory-intensive ML process into your application.
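As a quick illustration of that visibility, once a worker is running you can ask Celery which tasks it is currently executing straight from the command line (using the project name we set up later in this article):

$ celery -A mltutorial inspect active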

Now that we’ve had a chance to really dig in and get a high-level picture of the system we’re building, let’s put it together and focus on the specifics.

For your convenience, here is the repo I will be referencing in this section.

First, we’ll begin by setting up Django and Django Rest Framework, with installation guides here and here respectively. All requirements for this app can be found in the repo’s requirements.txt file (and Detectron2 and Torch will be built from Python wheels specified in the Dockerfile, in order to keep the Docker image size small).

The next part will be setting up the Django app, configuring the backend to save to AWS S3, and exposing an endpoint using DRF, so if you are already comfortable doing this, feel free to skip ahead and go straight to the ML Task Setup and Deployment section.

Django Setup

Go ahead and create a folder for the Django project and cd into it. Activate the virtual/conda env you’re using, make sure Detectron2 is installed as per the installation instructions in Part 1, and install the requirements as well.

Issue the following command in a terminal:

django-admin startproject mltutorial

This will create a Django project root directory titled “mltutorial”. Go ahead and cd into it to find a manage.py file and an mltutorial subdirectory (which is the actual Python package for your project).

mltutorial/
    manage.py
    mltutorial/
        __init__.py
        settings.py
        urls.py
        asgi.py
        wsgi.py

Open settings.py and add ‘rest_framework’, ‘celery’, and ‘storages’ (needed for boto3/AWS) to the INSTALLED_APPS list to register those packages with the Django project.
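The resulting excerpt of settings.py would look something like this (Django’s default apps stay as generated; note that the docreader app we create in the next step will also need to be registered here so its models and migrations are picked up):

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'celery',
    'storages',
    'docreader',  # added once the app below has been created
]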

In the root dir, let’s create an app which will house the core functionality of our backend. Issue another terminal command:

python manage.py startapp docreader

This will create an app in the root dir called docreader.

Let’s also create a file in docreader titled mltask.py. In it, define a simple function for testing our setup that takes in a variable, file_path, and prints it out:

def mltask(file_path):
    return print(file_path)

Now, getting to structure, Django apps use the Model View Controller (MVC) design pattern, defining the Model in models.py, the View in views.py, and the Controller in Django Templates and urls.py. Using Django Rest Framework, we’ll include serialization in this pipeline, which provides a way of serializing and deserializing native Python data structures into representations such as JSON. Thus, the application logic for exposing an endpoint is as follows:

Database ← → models.py ← → serializers.py ← → views.py ← → urls.py

In docreader/models.py, write the following:

from django.db import models
from django.dispatch import receiver
from .mltask import mltask
from django.db.models.signals import (
    post_save
)

class Document(models.Model):
    title = models.CharField(max_length=200)
    file = models.FileField(blank=False, null=False)

@receiver(post_save, sender=Document)
def user_created_handler(sender, instance, *args, **kwargs):
    mltask(str(instance.file.file))

This sets up a Document model that will require a title and file for each entry saved in the database. The @receiver decorator listens for a post_save signal, meaning that the specified model, Document, was saved to the database. Once saved, user_created_handler() takes the saved instance’s file field and passes it to what will become our Machine Learning function.

Anytime changes are made to models.py, you will need to run the following two commands:

python manage.py makemigrations
python manage.py migrate

Moving forward, create a serializers.py file in docreader, allowing for the serialization and deserialization of the Document’s title and file fields. Write in it:

from rest_framework import serializers
from .models import Document

class DocumentSerializer(serializers.ModelSerializer):
    class Meta:
        model = Document
        fields = [
            'title',
            'file'
        ]

Next, in views.py, where we can define our CRUD operations, let’s define the ability to create, as well as list, Document entries using generic views (which essentially allow you to quickly write views using an abstraction of common view patterns):

from django.shortcuts import render
from rest_framework import generics
from .models import Document
from .serializers import DocumentSerializer

class DocumentListCreateAPIView(
    generics.ListCreateAPIView):

    queryset = Document.objects.all()
    serializer_class = DocumentSerializer

Finally, update urls.py in mltutorial:

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path("admin/", admin.site.urls),
    path('api/', include('docreader.urls')),
]

And create urls.py in the docreader app dir and write:

from django.urls import path

from . import views

urlpatterns = [
    path('create/', views.DocumentListCreateAPIView.as_view(), name='document-list'),
]

Now we’re all set up to save a Document entry, with title and file fields, at the /api/create/ endpoint, which will call mltask() post save! So, let’s test this out.

To help visualize testing, let’s register our Document model with the Django admin interface, so we can see when a new entry has been created.

In docreader/admin.py write:

from django.contrib import admin
from .models import Document

admin.site.register(Document)

Create a user that can log in to the Django admin interface using:

python manage.py createsuperuser

Now, let’s test the endpoint we exposed.

To do this without a frontend, run the Django server and use Postman to send a POST request to the /api/create/ endpoint with a title and a PDF file attached.
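If you’d rather test from the command line, an equivalent request could look roughly like this (assuming the dev server is running on localhost:8000 and a sample.pdf sits in the current directory):

$ curl -X POST http://localhost:8000/api/create/ \
    -F "title=Test Document" \
    -F "file=@sample.pdf"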

If we check our Django logs, we should see the file path printed out, as specified in the post save mltask() function call.

AWS Setup

You’ll notice that the PDF was saved to the project’s root dir. Let’s make sure any media is instead saved to AWS S3, getting our app ready for deployment.

Go to the S3 console (and create an account and get your account’s Access and Secret keys if you haven’t already). Create a new bucket; here we will be titling it ‘djangomltest’. Update the permissions to make the bucket public for testing (and revert back, as needed, for production).

Now, let’s configure Django to work with AWS.

Add your model_final.pth, trained in Part 1, into the docreader dir. Create a .env file in the root dir and write the following:

AWS_ACCESS_KEY_ID = 
AWS_SECRET_ACCESS_KEY =
AWS_STORAGE_BUCKET_NAME = 'djangomltest'

MODEL_PATH = './docreader/model_final.pth'

Update settings.py to include the AWS configurations:

import os
from dotenv import load_dotenv, find_dotenv
load_dotenv(find_dotenv())

# AWS
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
AWS_STORAGE_BUCKET_NAME = os.environ['AWS_STORAGE_BUCKET_NAME']

#AWS Config
AWS_DEFAULT_ACL = 'public-read'
AWS_S3_CUSTOM_DOMAIN = f'{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com'
AWS_S3_OBJECT_PARAMETERS = {'CacheControl': 'max-age=86400'}

#Boto3
STATICFILES_STORAGE = 'mltutorial.storage_backends.StaticStorage'
DEFAULT_FILE_STORAGE = 'mltutorial.storage_backends.PublicMediaStorage'

#AWS URLs
STATIC_URL = f'https://{AWS_S3_CUSTOM_DOMAIN}/static/'
MEDIA_URL = f'https://{AWS_S3_CUSTOM_DOMAIN}/media/'
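The STATICFILES_STORAGE and DEFAULT_FILE_STORAGE strings above point to a storage_backends module that lives in the repo. If you are recreating it yourself, a minimal sketch using django-storages’ S3Boto3Storage might look like the following (the repo’s version is the source of truth and may differ; the class names must match the strings in settings.py):

# mltutorial/storage_backends.py
from storages.backends.s3boto3 import S3Boto3Storage

class StaticStorage(S3Boto3Storage):
    # static assets land under the bucket's static/ prefix
    location = 'static'
    default_acl = 'public-read'

class PublicMediaStorage(S3Boto3Storage):
    # uploaded media (our PDFs) land under the bucket's media/ prefix
    location = 'media'
    default_acl = 'public-read'
    file_overwrite = False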

Optionally, with AWS serving our static and media files, you will want to run the following command in order to serve static assets to the admin interface from S3:

python manage.py collectstatic

If we run the server again, our admin should appear the same as it would with our static files served locally.

Once more, let’s run the Django server and test the endpoint to make sure the file is now saved to S3.

ML Task Setup and Deployment

With Django and AWS properly configured, let’s set up our ML process in mltask.py. Because the file is long, see the repo here for reference (with comments added in to help with understanding the various code blocks).

What’s important to see is that Detectron2 is imported and the model is loaded only when the function is called. Here, we’ll call the function only through a Celery task, ensuring the memory used during inference will be isolated to the Heroku worker process.
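To make that pattern concrete, here is a stripped-down, illustrative sketch of the lazy-loading idea. The real mltask.py in the repo does far more (PDF-to-image conversion, per-page inference, saving results), and the base config and CPU device below are assumptions on my part:

import os

def mltask(file_path):
    # Heavy imports happen inside the function, so the Django web process
    # never loads Detectron2/Torch -- only the Celery worker process does.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Assumption: the same base config that was used for training in Part 1.
    cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = os.environ["MODEL_PATH"]  # ./docreader/model_final.pth from .env
    cfg.MODEL.DEVICE = "cpu"                      # assumption: CPU-only dyno
    predictor = DefaultPredictor(cfg)

    # ... convert the PDF at file_path to page images, run predictor() on each page,
    # and save the results to S3 (see the repo's mltask.py for the full version) ...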

So finally, let’s set up Celery and then deploy to Heroku.

In mltutorial/__init__.py write:

from .celery import app as celery_app
__all__ = ('celery_app',)

Create celery.py in the mltutorial dir and write:

import os

from celery import Celery

# Set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'mltutorial.settings')

# We'll specify Broker_URL on Heroku
app = Celery('mltutorial', broker=os.environ['CLOUDAMQP_URL'])

# Using a string here means the worker doesn't have to serialize
# the configuration object to child processes.
# - namespace='CELERY' means all celery-related configuration keys
# must have a `CELERY_` prefix.
app.config_from_object('django.conf:settings', namespace='CELERY')

# Load task modules from all registered Django apps.
app.autodiscover_tasks()

@app.task(bind=True, ignore_result=True)
def debug_task(self):
    print(f'Request: {self.request!r}')
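If you’d like to sanity-check the Celery wiring locally before deploying, you can point CLOUDAMQP_URL at a locally running broker (RabbitMQ in this illustration) and start a worker from the project root; the commands below are only an example:

$ export CLOUDAMQP_URL=amqp://guest:guest@localhost:5672//
$ celery -A mltutorial worker --loglevel=info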

Lastly, make a tasks.py in docreader and write:

from celery import shared_task
from .mltask import mltask

@shared_task
def ml_celery_task(file_path):
    mltask(file_path)
    return "DONE"

This Celery task, ml_celery_task(), should now be imported into models.py and used with the post_save signal instead of the mltask function pulled directly from mltask.py. Update the post_save signal block to the following:

@receiver(post_save, sender=Document)
def user_created_handler(sender, instance, *args, **kwargs):
    ml_celery_task.delay(str(instance.file.file))

And to test Celery, let’s deploy!

In the root project dir, include a Dockerfile and heroku.yml file, each specified in the repo. Most importantly, editing the heroku.yml commands lets you configure the Gunicorn web process and the Celery worker process, which can help further mitigate the potential problems discussed earlier.
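For orientation, a heroku.yml for this kind of setup generally has the following shape; the file in the repo is the source of truth, and the process commands and worker counts here are illustrative only:

build:
  docker:
    web: Dockerfile
run:
  web: gunicorn mltutorial.wsgi:application --workers=2 --bind=0.0.0.0:$PORT
  worker: celery -A mltutorial worker --loglevel=info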

Make a Heroku account, create a new app called “mlapp”, and gitignore the .env file. Then initialize git in the project’s root dir and change the Heroku app’s stack to container (in order to deploy using Docker):

$ heroku login
$ git init
$ heroku git:remote -a mlapp
$ git add .
$ git commit -m "initial heroku commit"
$ heroku stack:set container
$ git push heroku master

Once pushed, we just need to add our env variables to the Heroku app.

Go to Settings in the web interface, scroll down to Config Vars, click Reveal Config Vars, and add each line listed in the .env file.

You might have noticed there was a CLOUDAMQP_URL variable specified in celery.py. We need to provision a Celery broker on Heroku, for which there are a number of options. I will be using the CloudAMQP add-on, which has a free tier. Go ahead and add it to your app. Once added, the CLOUDAMQP_URL environment variable will be included automatically in the Config Vars.

Finally, let’s test the final product.

To monitor requests, run:

$ heroku logs --tail

Issue another Postman POST request to the Heroku app’s URL at the /api/create/ endpoint. In the logs you will see the POST request come through, Celery receive the task, load the model, and start running pages.

We’ll continue to see the “Running for page…” messages until the end of the process, and you can check the AWS S3 bucket as it runs.

Congrats! You’ve now deployed and run a Python backend that uses Machine Learning as part of a distributed task queue running in parallel with the main web process!

As mentioned, you will want to adjust the heroku.yml commands to incorporate Gunicorn threads and/or worker processes and to fine-tune Celery. For further learning, here’s a great article on configuring Gunicorn to meet your app’s needs, one for digging into Celery for production, and another for exploring Celery worker pools, in order to help with properly managing your resources.

Happy coding!

Unless otherwise noted, all images used in this article are by the author.
