Public AI on Hugging Face Inference Providers 🔥

We’re thrilled to share that Public AI is now a supported Inference Provider on the Hugging Face Hub!
Public AI joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.

This launch makes it easier than ever to access public and sovereign models from institutions like the Swiss AI Initiative and AI Singapore, right from Hugging Face. You can browse Public AI’s org on the Hub at https://huggingface.co/publicai and try trending supported models at https://huggingface.co/models?inference_provider=publicai&sort=trending.

The Public AI Inference Utility is a nonprofit, open-source project. The team builds products and organizes advocacy to support the work of public AI model builders such as the Swiss AI Initiative and AI Singapore, among others.

The Public AI Inference Utility runs on a distributed infrastructure that combines a vLLM-powered backend with a deployment layer designed for resilience across multiple partners. Behind the scenes, inference is handled by servers exposing OpenAI-compatible APIs on vLLM, deployed across clusters donated by national and industry partners. A global load-balancing layer ensures requests are routed efficiently and transparently, regardless of which country’s compute is serving the query.
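To make "OpenAI-compatible" concrete, here is a minimal sketch of the request shape such a server typically accepts: a POST to a `/chat/completions` path carrying a JSON body with `model` and `messages`. The base URL and API key below are placeholders, not Public AI's actual endpoint or credentials, and the helper name `build_chat_request` is made up for this illustration. The snippet only builds the request and does not send it.

```python
import json
from urllib import request

# Placeholder base URL: substitute the OpenAI-compatible server you are targeting.
BASE_URL = "https://example.invalid/v1"


def build_chat_request(base_url, api_key, model, user_message):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return request.Request(
        url=f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    BASE_URL,
    "sk-placeholder",  # placeholder key, not a real credential
    "swiss-ai/Apertus-70B-Instruct-2509",
    "What is the capital of France?",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

In practice the client SDKs described later in this post build and send this request for you, so you rarely need to do it by hand.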

Free public access is supported by donated GPU time and advertising subsidies, while long-term stability is intended to be anchored by state and institutional contributions. You can learn more about Public AI’s platform and infrastructure at https://platform.publicai.co/.

You can now use the Public AI Inference Utility as an Inference Provider on Hugging Face. We’re excited to see what you’ll build with this new provider.

Read more about how to use Public AI as an Inference Provider in its dedicated documentation page.

See the list of supported models here.

How it works

In the website UI

In your user account settings, you are able to:

Set your own API keys for the providers you’ve signed up with. If no custom key is set, your requests will be routed through HF.
Order providers by preference. This applies to the widget and code snippets in the model pages.

As mentioned, there are two modes when calling Inference Providers:

Custom key (calls go directly to the inference provider, using your own API key for that provider)
Routed by HF (in that case, you don't need a token from the provider, and the charges are applied directly to your HF account rather than the provider’s account)
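As a rough sketch of how the two modes look from Python: the key you pass to `InferenceClient` determines which mode you are in. The helper names `pick_credentials` and `build_client`, and the `PUBLICAI_API_KEY` environment variable, are hypothetical choices for this illustration. Only `HF_TOKEN`, `InferenceClient`, and `provider="publicai"` come from the examples in this post.

```python
import os
from typing import Mapping, Tuple


def pick_credentials(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (mode, api_key), preferring a provider key over an HF token."""
    # PUBLICAI_API_KEY is a hypothetical variable name chosen for this sketch.
    provider_key = env.get("PUBLICAI_API_KEY")
    if provider_key:
        return "custom key", provider_key
    hf_token = env.get("HF_TOKEN")
    if hf_token:
        return "routed by HF", hf_token
    raise RuntimeError("Set HF_TOKEN or PUBLICAI_API_KEY before creating a client.")


def build_client(env: Mapping[str, str] = os.environ):
    """Create an InferenceClient for Public AI with whichever key is available."""
    # Imported lazily so pick_credentials works without huggingface_hub installed.
    from huggingface_hub import InferenceClient

    mode, api_key = pick_credentials(env)
    return mode, InferenceClient(provider="publicai", api_key=api_key)
```

Calling `build_client()` with only `HF_TOKEN` set would give you a client whose requests are routed by HF. Setting the provider key as well switches the same code to custom key mode.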

Model pages showcase third-party inference providers (the ones that are compatible with the current model, sorted by user preference)

From the client SDKs

from Python, using huggingface_hub

The following example shows how to use Swiss AI’s Apertus-70B with Public AI as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face, or your own Public AI API key if you have one.

Note: this requires using a recent version of huggingface_hub (>= 0.34.6).

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="publicai",
    api_key=os.environ["HF_TOKEN"],
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="swiss-ai/Apertus-70B-Instruct-2509",
    messages=messages,
)

print(completion.choices[0].message)

from JS, using @huggingface/inference

import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
  model: "swiss-ai/Apertus-70B-Instruct-2509",
  messages: [
    {
      role: "user",
      content: "What is the capital of France?",
    },
  ],
  provider: "publicai",
});

console.log(chatCompletion.choices[0].message);

Billing

At the time of writing, usage of the Public AI Inference Utility through Hugging Face Inference Providers is free of charge. Pricing and availability may change.

Here is how billing works for other providers on the platform:

For direct requests, i.e. when you use the key from an inference provider, you’re billed by the corresponding provider. For instance, if you use a Public AI API key, you’re billed on your Public AI account.

For routed requests, i.e. when you authenticate via the Hugging Face Hub, you’ll only pay the standard provider API rates. There’s no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.

We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!

Feedback and next steps

We’d love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49
