Build Your Own AI Coding Assistant in JupyterLab with Ollama and Hugging Face


Jupyter AI brings generative AI capabilities right into the interface. Having a local AI assistant ensures privacy, reduces latency, and provides offline functionality, making it a powerful tool for developers. In this article, we’ll learn how to set up a local AI coding assistant in JupyterLab using Jupyter AI, Ollama and Hugging Face. By the end of this article, you’ll have a fully functional coding assistant in JupyterLab capable of autocompleting code, fixing errors, creating new notebooks from scratch, and much more, as shown in the screenshot below.

Coding assistant in Jupyter Lab via Jupyter AI | Image by Creator

First things first — what’s Jupyter AI? As the name suggests, Jupyter AI is a JupyterLab extension for generative AI. This powerful tool transforms your standard Jupyter notebooks or JupyterLab environment into a generative AI playground. The best part? It also works seamlessly in environments like Google Colaboratory and Visual Studio Code. This extension does all the heavy lifting, providing access to a wide range of model providers (both open and closed source) right within your Jupyter environment.

Flow diagram of the installation process | Image by Creator

Setting up the environment involves three main components:

  • JupyterLab
  • The Jupyter AI extension
  • Ollama (for Local Model Serving)
  • [Optional] Hugging Face (for GGUF models)

1. Installing the Jupyter AI Extension

It’s recommended to create a new environment specifically for Jupyter AI to keep your existing environment clean and organised. Jupyter AI requires JupyterLab 4.x or Jupyter Notebook 7+, so make sure you have the latest version of JupyterLab installed. For example, a fresh conda environment could be set up as follows (a minimal sketch; the environment name and Python version are placeholders, not requirements):
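# Create and activate a dedicated environment for Jupyter AI (the name is arbitrary)
conda create -n jupyter-ai-env python=3.11
conda activate jupyter-ai-env

With the environment active, you can install/upgrade JupyterLab with pip or conda: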

# Install JupyterLab 4 using pip
pip install jupyterlab~=4.0

Next, install the Jupyter AI extension as follows.

pip install "jupyter-ai[all]"

This is the easiest way to install it, as it includes all provider dependencies (so it supports Hugging Face, Ollama, etc., out of the box). So far, Jupyter AI supports the following model providers:

Supported model providers in Jupyter AI along with their dependencies | Created by Creator from the documentation

If you encounter errors during the Jupyter AI installation, manually install Jupyter AI using pip without the [all] optional dependency group. This way you can control which models are available in your Jupyter AI environment. For example, to install Jupyter AI with added support for Ollama models only, use the following:

pip install jupyter-ai langchain-ollama

The required dependencies depend on the model providers you want to use (see the table above). Next, restart your JupyterLab instance. If you see a chat icon on the left sidebar, this means everything has been installed correctly. With Jupyter AI, you can chat with models or use inline magic commands directly within your notebooks.
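If you want to double-check the installation from the terminal (an optional sanity check, not an official step), you can confirm that the package is installed and that JupyterLab has picked up the frontend extension:

# Confirm the Python package is installed
pip show jupyter-ai

# List JupyterLab extensions; an entry for Jupyter AI (typically @jupyter-ai/core) should show as enabled
jupyter labextension list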

Native chat UI in JupyterLab | Image by Creator

2. Setting Up Ollama for Local Models

Now that Jupyter AI is installed, we need to configure it with a model. While Jupyter AI integrates with Hugging Face models directly, some models may not work properly. Instead, Ollama provides a more reliable way to load models locally.

Ollama is a handy tool for running Large Language Models locally. It allows you to download pre-configured AI models from its library. Ollama supports all major platforms (macOS, Windows, Linux), so choose the option for your OS and download and install it from the official website. After installation, confirm that it is set up correctly by running:

ollama --version
------------------------------
ollama version is 0.6.2

Also, make sure your Ollama server is running, which you can check by running the following in the terminal:

$ ollama serve
Error: listen tcp 127.0.0.1:11434: bind: address already in use

If the server is already active, you will see an error like the one above, confirming that Ollama is running and in use.
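Alternatively, you can ping the Ollama API directly; by default it listens on port 11434 on localhost:

# Quick health check against the default Ollama endpoint
curl http://localhost:11434
# A running server replies with: Ollama is running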


Option 1: Using Pre-Configured Models

Ollama provides a library of pre-trained models that you can download and run locally. To start using a model, download it using the pull command. For example, to use qwen2.5-coder:1.5b, run:

ollama pull qwen2.5-coder:1.5b

This will download the model to your local environment. To check whether the model has been downloaded, run:

ollama list

This will list all the models you’ve downloaded and stored locally on your system using Ollama.
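Optionally, you can give the downloaded model a quick smoke test from the terminal before wiring it into JupyterLab (the prompt below is just an illustration):

# Send a one-off prompt to the local model and print its reply
ollama run qwen2.5-coder:1.5b "Write a Python one-liner that reverses a string"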

Option 2: Loading a Custom Model

If the model you want isn’t available in Ollama’s library, you can load a custom model by creating a Modelfile that specifies the model’s source. For detailed instructions on this process, refer to the Ollama Import Documentation.
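As a rough illustration (the file and model names below are placeholders, not taken from the Ollama docs), a minimal Modelfile pointing at a local GGUF file contains little more than a FROM line:

# Modelfile: point Ollama at a local GGUF weights file
FROM ./my-custom-model.gguf

You can then register the model with Ollama under a name of your choice:

ollama create my-custom-model -f Modelfile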

Option 3: Running GGUF Models directly from Hugging Face

Ollama now supports GGUF models directly from the Hugging Face Hub, including both public and private models. This means that if you want to use a GGUF model directly from the Hugging Face Hub, you can do so without requiring a custom Modelfile as described in Option 2 above.

For instance, to load a 4-bit quantized Qwen2.5-Coder-1.5B-Instruct model from Hugging Face:

1. First, enable Ollama under your Local Apps settings.

How to enable Ollama under your Local Apps settings on Hugging Face | Image by Creator

2. On the model page, select Ollama from the Use this model dropdown as shown below.

Accessing GGUF model from HuggingFace Hub via Ollama | Image by Creator
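The same can also be done from the terminal. As a sketch (the repository name and quantization tag are assumptions; check the model page for the tags actually available), pulling a GGUF model straight from the Hub follows the hf.co/{username}/{repository}:{quantization} pattern:

# Pull a 4-bit quantized GGUF build directly from the Hugging Face Hub via Ollama
ollama pull hf.co/Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF:Q4_K_M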

3. Configuring Jupyter AI with Ollama

We’re almost there. In JupyterLab, open the Jupyter AI chat interface from the sidebar. At the top of the chat panel, or in its settings (gear icon), there is a dropdown or field to select the model provider and model ID. Select Ollama as the provider, and enter the model name exactly as shown by ollama list in the terminal (e.g. qwen2.5-coder:1.5b). Jupyter AI will connect to the local Ollama server and load that model for queries. No API keys are needed since everything is local.

  • Set the language model, embedding model and inline completions model based on the models of your choice.
  • Save the settings and return to the chat interface.
Configure Jupyter AI with Ollama | Image by Creator

This configuration links Jupyter AI to the locally running model via Ollama. While inline completions should be enabled by this process, if that doesn’t happen, you can do it manually by clicking on the Jupyternaut icon, which is located in the bottom bar of the JupyterLab interface to the left of the Mode indicator (e.g., Mode: Command). This opens a dropdown menu where you can select Enable completions by Jupyternaut to activate the feature.

Enabling code completions in notebook | Image by Creator

Once set up, you can use the AI coding assistant for various tasks like code autocompletion, debugging help, and generating new code from scratch. It’s important to note that you can interact with the assistant either through the chat sidebar or directly in notebook cells using %%ai magic commands. Let’s look at both ways.

Coding assistant via Chat interface

This is pretty straightforward. You can simply chat with the model to perform an action. For instance, here is how we can ask the model to explain the error in the code and then fix it by selecting the code in the notebook.

Debugging Assistance Example using Jupyter AI via Chat | Image by Creator

You can also ask the AI to generate code for a task from scratch, just by describing what you need in natural language. Here’s a Python function that returns all prime numbers up to a given positive integer N, generated by Jupyternaut.

Generating New Code from Prompts using Jupyter AI via Chat | Image by Creator
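The exact code in the screenshot will vary from run to run; the snippet below is a hand-written sketch of what such a function typically looks like, not the assistant’s verbatim output:

def primes_up_to(n):
    """Return all prime numbers up to and including n."""
    primes = []
    for candidate in range(2, n + 1):
        # candidate is prime if no previously found prime divides it
        if all(candidate % p != 0 for p in primes):
            primes.append(candidate)
    return primes

print(primes_up_to(20))  # [2, 3, 5, 7, 11, 13, 17, 19]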

Coding assistant via notebook cell or IPython shell:

You can also interact with models directly inside a Jupyter notebook. First, load the IPython extension:

%load_ext jupyter_ai_magics

Now, you can use the %%ai cell magic to interact with your chosen language model using a specified prompt. Let’s replicate the above example, but this time within the notebook cells.
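For example, assuming the same Ollama model as before, a cell like the following sends the prompt to the model and renders its reply below the cell (the provider:model syntax follows the Jupyter AI magics convention):

%%ai ollama:qwen2.5-coder:1.5b
Write a Python function that returns all prime numbers up to a given positive integer N.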

Generating New Code from Prompts using Jupyter AI in the notebook | Image by Creator

For more details and options, you can refer to the official documentation.

As you can gauge from this article, Jupyter AI makes it easy to set up a coding assistant, provided you have the right installations and setup in place. I used a relatively small model, but you can pick from a wide range of models supported by Ollama or Hugging Face. The key point is that using a local model offers significant advantages: it enhances privacy, reduces latency, and reduces dependence on proprietary model providers. However, running large models locally with Ollama can be resource-intensive, so make sure you have sufficient RAM. With the rapid pace at which open-source models are improving, you can achieve comparable performance even with these alternatives.
