Making hundreds of open LLMs bloom within the Vertex AI Model Garden

Today, we’re thrilled to announce the launch of Deploy on Google Cloud, a new integration on the Hugging Face Hub to easily deploy hundreds of foundation models to Google Cloud using Vertex AI or Google Kubernetes Engine (GKE). Deploy on Google Cloud makes it easy to deploy open models as API Endpoints within your own Google Cloud account, either directly through Hugging Face model cards or within Vertex Model Garden, Google Cloud’s single place to discover, customize, and deploy a wide variety of models from Google and Google partners. Starting today, we’re enabling the most popular open models on Hugging Face for inference powered by our production solution, Text Generation Inference.

With Deploy on Google Cloud, developers can build production-ready Generative AI applications without managing infrastructure and servers, directly within their secure Google Cloud environment.

A Collaboration for AI Builders

This new experience expands upon the strategic partnership we announced earlier this year to simplify the access and deployment of open Generative AI models for Google customers. One of the main problems developers and organizations face is the time and resources it takes to deploy models securely and reliably. Deploy on Google Cloud offers an easy, managed solution to these challenges, providing dedicated configurations and assets to Hugging Face Models. It’s a simple click-through experience to create a production-ready Endpoint on Google Cloud’s Vertex AI.

“Vertex AI’s Model Garden integration with the Hugging Face Hub makes it seamless to discover and deploy open models on Vertex AI and GKE, whether you start your journey on the Hub or directly in the Google Cloud Console,” says Wenming Ye, Product Manager at Google. “We can’t wait to see what Google Developers build with Hugging Face models.”

How it works – from the Hub

Deploying Hugging Face Models on Google Cloud is super easy. Below, you will find step-by-step instructions on how to deploy Zephyr Gemma. Starting today, all models with the “text-generation-inference” tag will be supported.
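As a rough illustration of that eligibility rule, here is a minimal sketch that checks whether a model’s Hub tags include the “text-generation-inference” tag. The helper name `is_deployable` and the sample tag lists are hypothetical; in practice the tags would come from the model’s page or metadata on the Hub.

```python
# The tag named in this post as the marker for supported models.
TGI_TAG = "text-generation-inference"


def is_deployable(tags):
    """Return True if the given Hub tags include the TGI tag.

    `tags` is any iterable of tag strings (hypothetical input for this sketch).
    """
    return TGI_TAG in set(tags)


# Hypothetical tag lists, for illustration only.
llm_tags = ["transformers", "gemma", "text-generation-inference"]
image_tags = ["diffusers", "text-to-image"]

print(is_deployable(llm_tags))
print(is_deployable(image_tags))
```

This only mirrors the rule stated above; the “Deploy” menu on the model card remains the authoritative indicator of whether a model is supported.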

Open the “Deploy” menu, and select “Google Cloud”. This will bring you straight into the Google Cloud Console, where you can deploy Zephyr Gemma in 1 click on Vertex AI or GKE.

Once you are in the Vertex Model Garden, you can select Vertex AI or GKE as your deployment environment. With Vertex AI, you can deploy the model with a single click on “Deploy”. For GKE, you can follow the instructions and manifest templates to deploy the model on a new or running Kubernetes cluster.
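Once a Vertex AI Endpoint is live, applications call it over HTTPS. The sketch below only builds the request URL and JSON body, so it runs without credentials. The URL follows the Vertex AI `endpoints.predict` REST pattern; the instance schema (`inputs` plus `parameters`) mirrors Text Generation Inference’s generate request and is an assumption about what the deployed container expects. The project, region, and endpoint ID values are placeholders.

```python
import json


def predict_url(project_id, region, endpoint_id):
    """Build the Vertex AI REST URL for an endpoint's predict method."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{region}/"
        f"endpoints/{endpoint_id}:predict"
    )


def build_request_body(prompt, max_new_tokens=128, temperature=0.7):
    """Build a predict request body.

    Assumption: each instance uses the TGI-style `inputs` / `parameters`
    layout; check the deployed model's documentation for the exact schema.
    """
    return {
        "instances": [
            {
                "inputs": prompt,
                "parameters": {
                    "max_new_tokens": max_new_tokens,
                    "temperature": temperature,
                },
            }
        ]
    }


# Placeholder identifiers, not real resources.
url = predict_url("my-project", "us-central1", "1234567890")
body = build_request_body("What is Vertex AI?", max_new_tokens=64)

print(url)
print(json.dumps(body, indent=2))
```

To actually send the request, POST the body to the URL with an `Authorization: Bearer <token>` header, for example using a token from `gcloud auth print-access-token`.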

How it works – from Vertex Model Garden

Vertex Model Garden is where Google Developers can find ready-to-use models for their Generative AI projects. Starting today, the Vertex Model Garden offers a new experience to easily deploy the most popular open LLMs available on Hugging Face!

You can find the new “Deploy From Hugging Face” option inside Google Vertex AI Model Garden, which allows you to search and deploy Hugging Face models directly within your Google Cloud console.

When you click on “Deploy From Hugging Face”, a form will appear where you can quickly search for model IDs. Hundreds of the most popular open LLMs on Hugging Face are available with ready-to-use, tested hardware configurations.

Once you find the model you want to deploy, select it, and Vertex AI will prefill all required configurations to deploy your model to Vertex AI or GKE. You can even verify that you selected the right model by “viewing it on Hugging Face.” If you’re using a gated model, make sure to provide your Hugging Face access token so the model download can be authorized.

And that’s it! Deploying a model like Zephyr Gemma directly from the Vertex Model Garden onto your own Google Cloud account takes just a few clicks.
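For the gated-model case, the console form collects the access token for you. If you adapt the GKE manifest templates by hand, the same information typically reaches the Text Generation Inference container as environment variables. The sketch below is an assumption-laden illustration: `MODEL_ID` and `HUGGING_FACE_HUB_TOKEN` are variable names read by the TGI launcher (newer releases may read `HF_TOKEN` instead), the helper `tgi_container_env` is hypothetical, and the model ID is a placeholder.

```python
import os


def tgi_container_env(model_id, hf_token=None):
    """Return environment variables for a TGI container (illustrative only).

    Assumption: the container reads MODEL_ID and, for gated models,
    HUGGING_FACE_HUB_TOKEN. Verify the names against your TGI version.
    """
    env = {"MODEL_ID": model_id}
    if hf_token:
        # Only set the token when one is provided; public models do not need it.
        env["HUGGING_FACE_HUB_TOKEN"] = hf_token
    return env


# Read the token from the local environment rather than hard-coding it.
token = os.environ.get("HF_TOKEN")
env = tgi_container_env("example-org/gated-model", token)

# Print only the variable names so the token never appears in logs.
print(sorted(env))
```

Keeping the token out of source files and logs matters here, since it grants access to every gated model your Hugging Face account can download.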

We’re just getting started

We’re excited to collaborate with Google Cloud to make AI more open and accessible for everyone. Deploying open models on Google Cloud has never been easier, whether you start from the Hugging Face Hub or within the Google Cloud console. And we’re not going to stop there. Stay tuned as we enable more experiences to build AI with open models on Google Cloud!
