We’re thrilled to share that Scaleway is now a supported Inference Provider on the Hugging Face Hub!
Scaleway joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.
This launch makes it easier than ever to access popular open-weight models like gpt-oss, Qwen3, DeepSeek R1, and Gemma 3, right from Hugging Face. You can browse Scaleway’s org on the Hub at https://huggingface.co/scaleway and check out trending supported models at https://huggingface.co/models?inference_provider=scaleway&sort=trending.
Scaleway Generative APIs is a fully managed, serverless service that provides access to frontier AI models from leading research labs via simple API calls. The service offers competitive pay-per-token pricing starting at €0.20 per million tokens.
The service runs on secure infrastructure located in European data centers (Paris, France), ensuring data sovereignty and low latency for European users. The platform supports advanced features including structured outputs, function calling, and multimodal capabilities for both text and image processing.
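As a quick illustration of structured outputs, here is a minimal sketch using the OpenAI-compatible response_format parameter of huggingface_hub; the schema and prompt are assumptions for illustration, so adapt them to your use case and check that your chosen model supports structured outputs.

import os
from huggingface_hub import InferenceClient

client = InferenceClient(provider="scaleway", api_key=os.environ["HF_TOKEN"])

# Constrain the output to a JSON schema (assumes the provider supports
# OpenAI-style json_schema response formats for this model).
completion = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "Give me the capital of France as JSON."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "capital_answer",
            "schema": {
                "type": "object",
                "properties": {"capital": {"type": "string"}},
                "required": ["capital"],
            },
            "strict": True,
        },
    },
)
print(completion.choices[0].message.content)  # e.g. {"capital": "Paris"}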
Built for production use, Scaleway’s inference infrastructure delivers sub-200ms response times for first tokens, making it ideal for interactive applications and agentic workflows. The service supports both text generation and embedding models. You can learn more about Scaleway’s platform and infrastructure at https://www.scaleway.com/en/generative-apis/.
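For embedding models, a minimal sketch via the feature-extraction task could look like the following; the model name here is an assumption for illustration, so check Scaleway’s model list on the Hub for currently supported embedding models.

import os
from huggingface_hub import InferenceClient

client = InferenceClient(provider="scaleway", api_key=os.environ["HF_TOKEN"])

# Hypothetical embedding model for illustration; any embedding model
# served by Scaleway works the same way.
vector = client.feature_extraction(
    "Paris is the capital of France.",
    model="BAAI/bge-multilingual-gemma2",
)
print(vector.shape)  # shape depends on the model, e.g. (1, hidden_size)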
Read more about how to use Scaleway as an Inference Provider in its dedicated documentation page.
See the list of supported models here.
How it works
In the website UI
- In your user account settings, you can:
  - Set your own API keys for the providers you’ve signed up with. If no custom key is set, your requests will be routed through HF.
  - Order providers by preference. This applies to the widget and code snippets in the model pages.

- As mentioned, there are two modes when calling Inference Providers (see the sketch after this list):
  - Custom key: calls go directly to the inference provider, using your own API key for the corresponding inference provider.
  - Routed by HF: in that case, you don’t need a token from the provider, and charges are applied directly to your HF account rather than the provider’s account.

- Model pages showcase third-party inference providers (those that are compatible with the current model, sorted by user preference).
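To make the two modes concrete, here is a minimal Python sketch; the SCALEWAY_API_KEY environment variable name is just an assumption for illustration.

import os
from huggingface_hub import InferenceClient

# Routed by HF: authenticate with your Hugging Face token; usage is
# charged to your HF account at the provider's standard rates.
routed_client = InferenceClient(provider="scaleway", api_key=os.environ["HF_TOKEN"])

# Custom key: calls go directly to Scaleway and are billed to your
# Scaleway account (assumes you have your own Scaleway API key).
direct_client = InferenceClient(provider="scaleway", api_key=os.environ["SCALEWAY_API_KEY"])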

From the client SDKs
from Python, using huggingface_hub
The following example shows how to use gpt-oss-120b from OpenAI with Scaleway as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face, or your own Scaleway API key if you have one.
Note: this requires using a recent version of huggingface_hub (>= 0.34.6).
import os
from huggingface_hub import InferenceClient

# Authenticate with your Hugging Face token for automatic routing,
# or pass your own Scaleway API key instead.
client = InferenceClient(
    provider="scaleway",
    api_key=os.environ["HF_TOKEN"],
)

messages = [
    {
        "role": "user",
        "content": "Write a poem in the style of Shakespeare",
    }
]

completion = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=messages,
)

print(completion.choices[0].message)
from JS using @huggingface/inference
import { InferenceClient } from "@huggingface/inference";

// Authenticate with your Hugging Face token for automatic routing.
const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
  model: "openai/gpt-oss-120b",
  messages: [
    {
      role: "user",
      content: "Write a poem in the style of Shakespeare",
    },
  ],
  provider: "scaleway",
});

console.log(chatCompletion.choices[0].message);
Billing
Here is how billing works:
For direct requests, i.e. when you use a key from an inference provider, you are billed by the corresponding provider. For instance, if you use a Scaleway API key, you’re billed on your Scaleway account.
For routed requests, i.e. when you authenticate via the Hugging Face Hub, you only pay the standard provider API rates. There’s no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)
Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥
Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.
We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!
Feedback and next steps
We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49

