Featherless AI on Hugging Face Inference Providers 🔥


We’re thrilled to share that Featherless AI is now a supported Inference Provider on the Hugging Face Hub!
Featherless AI joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub’s model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide range of models with your preferred providers.

Featherless AI supports a wide range of text and conversational models, including the latest open-source models from DeepSeek, Meta, Google, Qwen, and many more.

Featherless AI is a serverless AI inference provider with unique model loading and GPU orchestration capabilities that make an exceptionally large catalog of models available to users. Providers typically offer either low-cost access to a limited set of models, or a limitless range of models where users manage servers and the associated costs of operation. Featherless provides the best of both worlds: unmatched model range and variety, but with serverless pricing. Find the full list of supported models on the models page.
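You can also query the catalog programmatically. Here is a minimal sketch using huggingface_hub, assuming your installed version supports the inference_provider filter on list_models:

from huggingface_hub import HfApi

api = HfApi()

# List Hub models that Featherless AI can serve; the
# `inference_provider` filter is assumed to be available
# in your huggingface_hub version
for model in api.list_models(inference_provider="featherless-ai", limit=10):
    print(model.id)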

We’re super excited to see what you will build with this new provider!

Read more about how to use Featherless as an Inference Provider in its dedicated documentation page.



How it works



In the website UI

  1. In your user account settings, you are able to:
  • Set your own API keys for the providers you’ve signed up with. If no custom key is set, your requests will be routed through HF. Learn more about request types in the docs.
  • Order providers by preference. This applies to the widget and code snippets in the model pages.


  2. As mentioned, there are two modes when calling Inference Providers (see the sketch after this list):
  • Custom key (calls go directly to the inference provider, using your own API key for that provider)
  • Routed by HF (in that case, you don’t need a token from the provider, and the charges are applied directly to your HF account rather than the provider’s account)
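As a minimal sketch of the two modes with the Python client (FEATHERLESS_API_KEY is a placeholder environment variable name, not an official one):

import os
from huggingface_hub import InferenceClient

# Routed by HF: authenticate with your Hugging Face token;
# usage is charged to your HF account.
routed = InferenceClient(
    provider="featherless-ai",
    api_key=os.environ["HF_TOKEN"],
)

# Custom key: pass your own Featherless AI API key instead;
# calls go directly to the provider and are billed there.
direct = InferenceClient(
    provider="featherless-ai",
    api_key=os.environ["FEATHERLESS_API_KEY"],
)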


  3. Model pages showcase third-party inference providers (those compatible with the current model, sorted by user preference)




From the client SDKs



from Python, using huggingface_hub

The following example shows how to use DeepSeek-R1 with Featherless AI as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face, or your own Featherless AI API key if you have one.

Install or upgrade huggingface_hub to make sure you have version v0.33.0 or higher: pip install --upgrade huggingface-hub

import os
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="featherless-ai",
    api_key=os.environ["HF_TOKEN"]
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528", 
    messages=messages, 
)

print(completion.choices[0].message)



from JS using @huggingface/inference

import { InferenceClient } from "@huggingface/inference";

const client = new InferenceClient(process.env.HF_TOKEN);

const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1-0528",
    messages: [
        {
            role: "user",
            content: "What is the capital of France?"
        }
    ],
    provider: "featherless-ai",
});

console.log(chatCompletion.choices[0].message);



Billing

For direct requests, i.e. when you use a key from an inference provider, you are billed by the corresponding provider. For instance, if you use a Featherless AI API key, you are billed on your Featherless AI account.

For routed requests, i.e. when you authenticate via the Hugging Face Hub, you only pay the standard provider API rates. There’s no additional markup from us; we pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)
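For teams, routed usage can also be charged to an organization rather than your personal account. A minimal sketch using the Python client’s bill_to argument, assuming your organization has billing set up ("my-org-name" is a placeholder):

import os
from huggingface_hub import InferenceClient

# Routed request whose usage is charged to an organization
# instead of the personal account behind the token
# ("my-org-name" is a placeholder)
client = InferenceClient(
    provider="featherless-ai",
    api_key=os.environ["HF_TOKEN"],
    bill_to="my-org-name",
)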

Important Note ‼️ PRO users get $2 worth of Inference credits every month. You can use them across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.

We also provide free inference with a small quota for our signed-in free users, but please upgrade to PRO if you can!



Feedback and next steps

We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49


