Hugging Face models in Amazon Bedrock

logos

We’re excited to announce that popular open models from Hugging Face are now available on Amazon Bedrock in the new Bedrock Marketplace! AWS customers can now deploy 83 open models with Bedrock Marketplace to build their generative AI applications.

Under the hood, Bedrock Marketplace model endpoints are managed by Amazon SageMaker JumpStart. With Bedrock Marketplace, you can now combine the ease of use of SageMaker JumpStart with the fully managed infrastructure of Amazon Bedrock, including compatibility with high-level APIs such as Agents, Knowledge Bases, Guardrails and Model Evaluations.

When registering your SageMaker JumpStart endpoints in Amazon Bedrock, you only pay for the SageMaker compute resources, and regular Amazon Bedrock API pricing applies.

In this blog we will show you how to deploy Gemma 2 27B Instruct and use the model with Amazon Bedrock APIs. You will learn how to:

  1. Deploy Google Gemma 2 27B Instruct
  2. Send requests using the Amazon Bedrock APIs
  3. Clean up



Deploy Google Gemma 2 27B Instruct

There are two ways to deploy an open model for use with Amazon Bedrock:

  1. You can deploy your open model from the Bedrock Model Catalog.
  2. You can deploy your open model with Amazon SageMaker JumpStart and register it with Bedrock (see the sketch after this list).

Both ways are similar, so we will guide you through the Bedrock Model Catalog.
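If you take the JumpStart route, the registration step can also be done programmatically: the Amazon Bedrock control-plane API includes a RegisterMarketplaceModelEndpoint operation. Below is a minimal sketch with boto3; the two ARNs are placeholders, and the response field names are assumptions based on the API shape, so check the Bedrock API reference before relying on them.

import boto3

# Bedrock control-plane client (not "bedrock-runtime")
bedrock = boto3.client("bedrock")

# Placeholders: the ARN of your existing SageMaker JumpStart endpoint
# and the source identifier of the marketplace model it serves.
response = bedrock.register_marketplace_model_endpoint(
    endpointIdentifier="<sagemaker-endpoint-arn>",
    modelSourceIdentifier="<model-source-arn>",
)

# Assumed response shape: the registered endpoint and its ARN
print(response["marketplaceModelEndpoint"]["endpointArn"])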

To get started, in the Amazon Bedrock console, make sure you are in one of the 14 regions where the Bedrock Marketplace is available. Then, select “Model catalog” in the “Foundation models” section of the navigation pane. Here, you can search for both serverless models and models available in Amazon Bedrock Marketplace. Filter the results by the “Hugging Face” provider and you can browse through the 83 open models available.

For example, let’s search for and select Google Gemma 2 27B Instruct.

model-catalog.png

Selecting the model opens the model detail page, where you can see more information from the model provider, such as highlights about the model and usage details, including sample API calls.

At the top right, let’s click on Deploy.

model-card.png

This brings you to the deployment page, where you can select the endpoint name, the instance configuration, and advanced settings related to the networking configuration and the service role used to perform the deployment in SageMaker. Let’s use the default advanced settings and the recommended instance type.

You are also required to accept the End User License Agreement of the model provider.

At the bottom right, let’s click on Deploy.

model-deploy.png

We just launched the deployment of the Google Gemma 2 27B Instruct model on an ml.g5.48xlarge instance, hosted in your Amazon SageMaker tenancy, compatible with Amazon Bedrock APIs!

The endpoint deployment can take several minutes. It will appear on the “Marketplace deployments” page, which you can find in the “Foundation models” section of the navigation pane.
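The same deployment can also be scripted instead of clicked through: the Bedrock control-plane API exposes a CreateMarketplaceModelEndpoint operation. Here is a minimal sketch with boto3; the model source ARN, endpoint name and execution role are placeholders, and the exact parameter and response shapes are assumptions to verify against the Bedrock API reference.

import boto3

bedrock = boto3.client("bedrock")

# Placeholders: the model ARN shown in the model catalog, a name for
# your endpoint, and a SageMaker execution role from your account.
response = bedrock.create_marketplace_model_endpoint(
    modelSourceIdentifier="<model-source-arn>",
    endpointName="<endpoint-name>",
    acceptEula=True,  # accept the model provider's EULA
    endpointConfig={
        "sagemaker": {
            "initialInstanceCount": 1,
            "instanceType": "ml.g5.48xlarge",  # the recommended instance type
            "executionRole": "<sagemaker-execution-role-arn>",
        }
    },
)

# Assumed response shape: the new endpoint and its ARN
print(response["marketplaceModelEndpoint"]["endpointArn"])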



Use the model with Amazon Bedrock APIs

You can quickly test the model in the Playground through the UI. However, to invoke the deployed model programmatically with any Amazon Bedrock API, you need to get the endpoint ARN.

From the list of managed deployments, select your model deployment to copy its endpoint ARN.

model-arn.png
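You can also retrieve the ARN programmatically. A minimal sketch, assuming the ListMarketplaceModelEndpoints operation of the Bedrock control-plane API (the response field names are assumptions based on the API shape):

import boto3

bedrock = boto3.client("bedrock")

# List the Bedrock Marketplace endpoints in the current region
# and print the ARN of each one.
response = bedrock.list_marketplace_model_endpoints()
for endpoint in response["marketplaceModelEndpoints"]:
    print(endpoint["endpointArn"])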

You can query your endpoint using the AWS SDK in your preferred language or with the AWS CLI.

Here is an example using the Bedrock Converse API through the AWS SDK for Python (boto3):

import boto3

# Create a Bedrock Runtime client (uses your default region and credentials)
bedrock_runtime = boto3.client("bedrock-runtime")

# The endpoint ARN copied from the "Marketplace deployments" page
endpoint_arn = "arn:aws:sagemaker:<region>:<account_id>:endpoint/<endpoint_name>"

# Standard Converse API inference parameters
inference_config = {
    "maxTokens": 256,
    "temperature": 0.1,
    "topP": 0.999,
}

# Model-specific parameters, passed through to the underlying endpoint
additional_model_fields = {"parameters": {"repetition_penalty": 0.9, "top_k": 250, "do_sample": True}}
response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "text": "What is Amazon doing in the field of generative AI?",
                },
            ]
        },
    ],
    inferenceConfig=inference_config,
    additionalModelRequestFields=additional_model_fields,
)
print(response["output"]["message"]["content"][0]["text"])
"Amazon is making significant strides in the sector of generative AI, applying it across various services and products. Here's a breakdown of their key initiatives:nn**1. Amazon Bedrock:**nn* That is their **fully managed service** that permits developers to construct and scale generative AI applications using models from Amazon and other leading AI corporations. n* It offers access to foundational models like **Amazon Titan**, a family of enormous language models (LLMs) for text generation, and models from Cohere"

That’s it! If you want to go further, have a look at the Bedrock documentation.



Clean up

Don’t forget to delete your endpoint at the end of your experiment to stop incurring costs! At the top right of the page where you copied the endpoint ARN, you can delete your endpoint by clicking on “Delete”.
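You can also clean up programmatically. A minimal sketch, assuming the DeleteMarketplaceModelEndpoint operation of the Bedrock control-plane API:

import boto3

bedrock = boto3.client("bedrock")

# Delete the Marketplace endpoint to stop incurring SageMaker compute costs.
# <endpoint-arn> is the ARN copied from the "Marketplace deployments" page.
bedrock.delete_marketplace_model_endpoint(endpointArn="<endpoint-arn>")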


