Everyone’s been using Claude and OpenAI as coding assistants for the past few years, but closed models lose some of their shine when you look at the developments coming out of open-source projects like Open R1. In the LiveCodeBench evaluation below, the 7B-parameter variant outperforms Claude 3.7 Sonnet and GPT-4o. These models are the daily drivers of many engineers in applications like Cursor and VS Code.
Evals are great and all, but I want to get my hands dirty and feel the commits! This blog post focuses on how you can integrate these models into your IDE right now. We will set up OlympicCoder 7B, the smaller of the two OlympicCoder variants, using a quantized version for optimal local inference. Here’s the stack we’re going to use:
- OlympicCoder 7B: The 4-bit GGUF version from the LM Studio Community
- LM Studio: A tool that simplifies running AI models
- Visual Studio Code (VS Code)
- Continue: A VS Code extension for local models
It’s worth mentioning that we chose this stack purely for simplicity. You might want to experiment with the larger model and/or different GGUF files, or even with alternative inference engines like llama.cpp.
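For instance, recent llama.cpp builds can pull a GGUF straight from the Hub and serve it behind an OpenAI-compatible API. The exact flags vary between releases, so treat this as a sketch rather than a canonical invocation:

```bash
# Sketch: fetch the Q4_K_M quant from the Hub and serve it locally.
# Flag names and the repo:quant syntax depend on your llama.cpp version.
llama-server -hf lmstudio-community/OlympicCoder-7B-GGUF:Q4_K_M --port 8080
```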
LM Studio is like a control panel for AI models. It integrates with the Hugging Face Hub to pull models, helps you find the right GGUF file, and exposes an API that other applications can use to interact with the model.
In short, it lets you download and run models without any complicated setup.
- Go to the LM Studio website: Open your web browser and go to https://lmstudio.ai/download.
- Select your operating system: Click the download button for your computer (Windows, Mac, or Linux).
- Install LM Studio: Run the downloaded file and follow the instructions. It’s just like installing any other program.
The GGUF files that we need are hosted on the Hub. We can open the model from the Hub in LM Studio using the ‘Use this model’ button:
This will link to the LM Studio application and open it on your machine. You’ll just need to select a quantization. I went for Q4_K_M since it should perform well on most devices. If you have more compute, you might want to try one of the Q8_* options.
If you want to skip the UI, you can also load models in LM Studio via the command line:
```bash
# Download the GGUF from the Hub, load it, and start the local server
lms get lmstudio-community/OlympicCoder-7B-GGUF
lms load olympiccoder-7b
lms server start
```
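Once the server is running, you can sanity-check it from the terminal. LM Studio exposes an OpenAI-compatible API, so, assuming the default port 1234, something like the following should list the loaded model and return a completion:

```bash
# List the models the local server is serving
curl http://localhost:1234/v1/models

# Request a completion from the OpenAI-compatible chat endpoint
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "olympiccoder-7b",
    "messages": [{"role": "user", "content": "Write a function to reverse a string in JavaScript"}]
  }'
```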
This is the important part. We now need to integrate VS Code with the model served by LM Studio.
- In LM Studio, activate the server in the ‘Developer’ tab. This will expose the endpoints at http://localhost:1234/v1.
- Install a VS Code extension to connect to our local server. I went for Continue.dev, but there are other options too.
- In VS Code, go to the Extensions view (click the square icon in the left sidebar, or press Ctrl+Shift+X / Cmd+Shift+X).
- Search for “Continue” and install the extension from “Continue Dev”.
- Configure a new model in Continue.dev:
- Open the Continue tab and, in the models dropdown, select ‘Add new chat model’.
- This will open a JSON configuration file, where you’ll need to specify the model name, i.e. olympiccoder-7b. A sketch of the resulting config follows below.
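For reference, a minimal Continue config.json entry pointing at LM Studio might look like the following. The schema can change between Continue versions, and the title here is just an illustrative label:

```json
{
  "models": [
    {
      "title": "OlympicCoder 7B (local)",
      "provider": "lmstudio",
      "model": "olympiccoder-7b"
    }
  ]
}
```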
Most of the core AI features in VS Code are available via this setup, for example:
- Code Completion: Start typing, and the AI will suggest how to finish your code.
- Generate Code: Ask it to write a function or a whole block of code. For example, you could type (in a comment or a chat window, depending on the extension): // Write a function to reverse a string in JavaScript. A sample of what to expect follows after this list.
- Explain Code: Select some code and ask the AI to explain what it does.
- Refactor Code: Ask the AI to make your code cleaner or more efficient.
- Write Tests: Ask the AI to create unit tests for your code.
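To give a rough idea of what that generate-code prompt might yield, here is one plausible answer; the model’s actual output will vary from run to run:

```javascript
// Reverse a string; spreading into an array handles Unicode code points
// better than str.split("")
function reverseString(str) {
  return [...str].reverse().join("");
}

console.log(reverseString("OlympicCoder")); // "redoCcipmylO"
```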
OlympicCoder is not Claude. It’s optimised on the CodeForces-CoTs dataset, which is based on competitive coding challenges. That means you should not expect it to be super friendly and explanatory. Instead, roll up your sleeves and expect a no-holds-barred competitive coder ready to deal with tough problems.
You may want to mix OlympicCoder with other models to get a rounded coding experience. For example, if you’re trying to squeeze milliseconds out of a binary search, try OlympicCoder. If you want to design a user-facing API, go for Claude-3.7-Sonnet or Qwen-2.5-Coder.
- Share your favorite generations in the comments below.
- Try out another variant of OlympicCoder from the Hub.
- Experiment with quantization types based on your hardware.
- Try out multiple models in LM Studio for different coding vibes! Check the model catalog at https://lmstudio.ai/models.
- Experiment with other VS Code extensions, like Cline, which have agentic functionality.