From Chaos to Consistency: Docker for Data Scientists
Background
What’s Docker?
Docker Technical Features
Installing Docker
Deploying With Docker Example
Summary & Further Thoughts
References & Further Reading
Connect With Me!

An introduction and application of Docker for Data Scientists

Photo by Ian Taylor on Unsplash

But it works on my machine?

This is a classic meme within the tech community, especially for Data Scientists who want to ship their amazing machine-learning model, only to learn that the production machine has a different operating system. Far from ideal.

However…

There is a solution thanks to these wonderful things called containers, and tools to control them such as Docker.

In this post, we will dive into what containers are and how you can build and run them using Docker. The use of containers and Docker has become an industry standard and common practice for data products. As a Data Scientist, learning these tools is therefore a valuable addition to your arsenal.

Docker is a service that helps build, run and execute code and applications in containers.

Now you may be wondering, what is a container?

Ostensibly, a container is very similar to a virtual machine (VM). It's a small isolated environment where everything is self-'contained' and can be run on any machine. The primary selling point of containers and VMs is their portability, allowing your application or model to run seamlessly on any on-premise server, local machine, or cloud platform.

The main difference between containers and VMs is how they use their host's computer resources. Containers are a lot more lightweight as they don't actively partition the hardware resources of the host machine. I will not delve into the full technical details here; however, if you want to understand a bit more, I have linked a great article explaining their differences here.

Docker is then simply a tool we use to create, manage and run these containers with ease. It's one of the main reasons why containers have become so popular, as it enables developers to easily deploy applications and models that run anywhere.

Diagram by author.

There are three main elements we need to run a container using Docker:

  • Dockerfile: A text file that contains the instructions for how to build a Docker image.
  • Docker Image: A blueprint or template to create a Docker container.
  • Docker Container: An isolated environment that provides everything an application or machine learning model needs to run. Includes things such as dependencies and OS versions.

Diagram by author.

There are also a few other key points to note:

  • Docker Daemon: A background process (daemon) that deals with the incoming requests to Docker.
  • Docker Client: A shell interface that enables the user to talk to Docker through its daemon.
  • Docker Hub: Similar to GitHub, a place where developers can share their Docker images.

Homebrew

The first thing you should install is Homebrew (link here). It is dubbed 'the missing package manager for macOS' and is very useful for anyone coding on their Mac.

To install Homebrew, simply run the command given on their website:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Confirm Homebrew is installed by running brew help.

Docker

Now with Homebrew installed, you can install Docker by running brew install docker. Confirm Docker is installed by running which docker; the output shouldn't raise any errors and should look like this:

/opt/homebrew/bin/docker

Colima

The final part is to install Colima. Simply run brew install colima and confirm it's installed with which colima. Again, the output should look like this:

/opt/homebrew/bin/colima
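
If you would rather check all three tools in one go, the which checks above can be sketched in Python with the standard library's shutil.which. The helper name find_tools is made up for this illustration and is not part of Homebrew, Docker or Colima.

```python
import shutil


def find_tools(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # The three tools installed in this section.
    for name, path in find_tools(["brew", "docker", "colima"]).items():
        print(f"{name}: {path or 'not found'}")
```

Any tool reported as not found needs to be installed (or added to your PATH) before continuing.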

Now you might be wondering, what on earth is Colima?

Colima is a software package that enables container runtimes on macOS. In more layman's terms, Colima creates the environment for containers to work on our system. To achieve this, it runs a Linux virtual machine with a daemon that the Docker client can communicate with.

Alternatively, you can install Docker Desktop instead of Colima. However, I prefer Colima for a few reasons: it's free, more lightweight and I like working in the terminal!

See this blog post here for more arguments for Colima.

Workflow

Below is an example of how Data Scientists and Machine Learning Engineers can deploy their model using Docker:

Diagram by author.

The first step is obviously to build their amazing model. Then, you need to wrap up everything you are using to run the model, such as the Python version and package dependencies, in a requirements file. The final step is to use that requirements file inside the Dockerfile.

If this seems completely arbitrary to you at the moment, don't worry, we will go over this process step by step!

Basic Model

Let's start by building a basic model. The provided code snippet displays a simple implementation of a classification model on the famous Iris dataset:

Dataset from Kaggle with a CC0 licence.

GitHub Gist by author.

This file is named basic_rf_model.py for reference.
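
The Gist itself is not reproduced in this text, so here is a minimal sketch of what such a script could look like. It is not the author's code. It assumes the "rf" in the file name stands for a random forest, and it loads scikit-learn's bundled copy of Iris instead of the Kaggle CSV. The function name train_and_evaluate and the random_state value are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_evaluate(random_state=42):
    """Train a random forest on Iris and return accuracy on a held-out split."""
    # Assumption: scikit-learn's bundled Iris data stands in for the Kaggle CSV.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state
    )
    model = RandomForestClassifier(random_state=random_state)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


if __name__ == "__main__":
    print(f"Accuracy: {train_and_evaluate()}")
```

The only third-party dependency is scikit-learn, which is what keeps the requirements file in the next step so short.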

Create Requirements File

Now that we have our model ready, we need to create a requirements.txt file to house all the dependencies that underpin the running of our model. In this simple example, we luckily only rely on the scikit-learn package. Therefore, our requirements.txt will simply look like this:

scikit-learn==1.2.2

You can check the version you are running on your computer with the pip show scikit-learn command.
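
The same lookup can be done from Python, which is handy if you want to generate the pinned line rather than type it. This is a small sketch using the standard library's importlib.metadata (Python 3.8+); the helper name pin_requirement is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def pin_requirement(package):
    """Return 'package==version' for an installed package, or None if absent."""
    try:
        return f"{package}=={version(package)}"
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a line in the 'name==version' form used by the requirements file.
    print(pin_requirement("scikit-learn") or "scikit-learn is not installed")
```

The printed line can be pasted straight into the requirements file, so the container installs the same version you developed against.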

Create Dockerfile

Now we can finally create our Dockerfile!

So, in the same directory as requirements.txt and basic_rf_model.py, create a file named Dockerfile. Inside the Dockerfile we will have the following:

GitHub Gist by author.

Let's go over it line by line to see what it all means:

  • FROM python:3.9: This is the base image for our image
  • MAINTAINER egor@some.email.com: This indicates who maintains this image
  • WORKDIR /src: Sets the working directory of the image to be /src
  • COPY . .: Copies the files in the current directory to the working directory of the image
  • RUN pip install -r requirements.txt: Installs the requirements from the requirements.txt file into the Docker environment
  • CMD ["python", "basic_rf_model.py"]: Tells the container to execute the command python basic_rf_model.py and run the model
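
Put together in order, the six instructions listed above give the file below. This is a reconstruction from the list, not a copy of the author's Gist. It assumes the dependency file is named requirements.txt, to match the RUN instruction.

```dockerfile
# Base image: official Python 3.9
FROM python:3.9
# Who maintains this image
MAINTAINER egor@some.email.com
# Working directory inside the image
WORKDIR /src
# Copy the project files (model script and requirements file) into /src
COPY . .
# Install the pinned dependencies
RUN pip install -r requirements.txt
# Run the model when the container starts
CMD ["python", "basic_rf_model.py"]
```

Note that current Docker documentation marks MAINTAINER as deprecated in favour of a LABEL instruction. It still builds, but you may see a warning.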

Initiate Colima & Docker

The next step is to set up the Docker environment. First, we need to boot up Colima:

colima start

After Colima has started up, check that the Docker commands are working by running:

docker ps

It should return something like this:

CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

This is good and means both Colima and Docker are working as expected!

Note: the docker ps command lists all the currently running containers.

Build Image

Now it's time to build our first Docker image from the Dockerfile that we created above:

docker build . -t docker_medium_example

The -t flag sets the name of the image and the . tells Docker to build from the current directory.

If we now run docker images, we should see something like this:

Image from author.

Congrats, the image has been built!

Run Container

After the image has been created, we can run it as a container using the IMAGE ID listed above:

docker run bb59f770eb07

Output:

Accuracy: 0.9736842105263158

That is the entire output, because all the container has done is run the basic_rf_model.py script!
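
If you rebuild often, the build and run steps can be scripted. The sketch below wraps the two commands with Python's subprocess module. The function names are illustrative. It runs the image by the tag name given to -t rather than by IMAGE ID, which docker run also accepts and which saves looking the ID up after each rebuild.

```python
import subprocess


def build_command(tag, context="."):
    """Argument list for: docker build <context> -t <tag>."""
    return ["docker", "build", context, "-t", tag]


def run_command(image):
    """Argument list for: docker run <image> (an image ID or a tag name)."""
    return ["docker", "run", image]


def build_and_run(tag, context="."):
    """Build the image, then run it; raises CalledProcessError if either step fails."""
    subprocess.run(build_command(tag, context), check=True)
    subprocess.run(run_command(tag), check=True)
```

Calling build_and_run("docker_medium_example") from the project directory, with Colima running, should build the image and then print the model's accuracy line.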

Extra Information

This tutorial is just scratching the surface of what Docker can do and be used for. There are many more features and commands to learn in order to understand Docker. A great detailed tutorial is given on the Docker website, which you can find here.

One cool feature is that you can run the container in interactive mode and go into its shell. For example, if we run:

docker run -it bb59f770eb07 /bin/bash

You’ll enter the Docker container and it should look something like this:

Image by author.

We also used the ls command to show all the files in the Docker working directory.

Docker and containers are fantastic tools to ensure Data Scientists' models can run anywhere and anytime with no issues. They do this by creating small isolated compute environments, called containers, that contain everything the model needs to run. Docker is easy to use and lightweight, which has made it common industry practice. In this article, we went over a basic example of how you can package your model into a container using Docker. The process was simple and seamless, so it is something Data Scientists can learn and pick up quickly.

The full code used in this article can be found at my GitHub here:

(All emojis designed by OpenMoji — the open-source emoji and icon project. License: CC BY-SA 4.0)
