An introduction to AWS Bedrock


As of the start of 2026, AWS has several related yet distinct components that make up its agentic and LLM abstractions.

  • Bedrock is the model layer that provides access to large language models.
  • Agents for Bedrock is the managed application layer. In other words, AWS runs the agents for you based on your requirements.
  • Bedrock AgentCore is an infrastructure layer that lets AWS run agents you develop using third-party frameworks such as CrewAI and LangGraph.

Aside from these three services, AWS also has Strands, an open-source Python library for building agents outside of the Bedrock service, which can then be deployed on other AWS services such as ECS and Lambda.

It can be confusing because all three agent-related services have the term “Bedrock” in their names, but in this article, I’ll focus on the standard Bedrock service and show how and why you’d use it.

As a service, Bedrock has only been available on AWS since early 2023. That should give you a clue as to why it was introduced. Amazon could clearly see the rise of Large Language Models and their impact on IT architecture and the systems development process. That’s AWS’s meat and potatoes, so they were keen to make sure nobody was going to eat their lunch.

And although AWS has developed a number of LLMs of its own, it realised that to remain competitive, it would have to make the very top models, such as those from Anthropic, available to users. And that’s where Bedrock steps in.

How do I access Bedrock?

Okay, so that’s the thinking behind the why of Bedrock, but how do we get access to it and actually use it? Not surprisingly, the first thing you need is an AWS account. I’m going to assume you already have one, but if not, click the following link to set one up.

https://aws.amazon.com/account

Usefully, after you register for a new AWS account, a good number of the services you use will fall under AWS’s so-called “free tier”, which means your costs should be minimal for one year following your account creation, assuming you don’t go crazy and start firing up huge compute servers and the like.

There are three main ways to use AWS services.

  • Via the console. If you’re a beginner, this will probably be your preferred route, as it’s the easiest way to get started.
  • Via an API. If you’re handy at coding, you can access all of AWS’s services via an API. For example, for Python programmers, AWS provides the boto3 library. There are similar libraries for other languages, such as JavaScript.
  • Via the command line interface (CLI). The CLI is a separate tool you can download from AWS, and it lets you interact with AWS services straight from your terminal.

Note that, to use the latter two methods, you need login credentials set up on your local system (a quick way to check this is shown below).
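For example, here’s a minimal boto3 sketch that verifies your credentials are working once they’ve been configured (credential setup itself is covered in the prerequisites section further down):

import boto3

# Quick sanity check: this call returns the AWS account and IAM identity
# that boto3 is currently using, so it fails fast if credentials are missing.
sts = boto3.client("sts")
print(sts.get_caller_identity()["Arn"])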

What can I do with Bedrock?

The short answer is that you can do most of the things you can do with regular chat models from OpenAI, Anthropic, Google, and so on. Underlying Bedrock is a wide range of foundation models you can use with it, such as:

  • Kimi K2 Thinking. A deep reasoning model.
  • Claude Opus 4.5. To many people, this is the top LLM available to date.
  • GPT-OSS. OpenAI’s open-source LLM.

And many, many others besides. For a full list, check out the following link.

https://aws.amazon.com/bedrock/model-choice

How do I use Bedrock?

To use Bedrock, we’ll use a combination of the AWS CLI and the Python API provided by the boto3 library. Make sure you have the following set up as prerequisites:

  • An AWS account.
  • The AWS CLI downloaded and installed on your system.
  • An Identity and Access Management (IAM) user set up with appropriate permissions and access keys. You can do this via the AWS console.
  • Your user credentials configured via the AWS CLI, as shown below. In general, three pieces of information need to be supplied, all of which you’ll get from the previous step. You will be prompted to enter the relevant information:
$ aws configure

AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]:

Getting access to a model in Bedrock

Back in the day (a few months ago!), you had to use the AWS management console to request access to particular models in Bedrock, but now access is automatically granted when you invoke a model for the first time.

Note that for Anthropic models, first-time users may have to submit use-case details before they can access the model. Also note that access to top models from Anthropic and other providers will incur costs, so please make sure you monitor your billing regularly and remove any model access you no longer need.

However, we still need to know the name of the model we want to use. To get a list of all Bedrock foundation models, we can use the following AWS CLI command.

aws bedrock list-foundation-models

This will return a JSON result set listing various properties of each model, like this.

{
    "modelSummaries": [
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/nvidia.nemotron-nano-12b-v2",
            "modelId": "nvidia.nemotron-nano-12b-v2",
            "modelName": "NVIDIA Nemotron Nano 12B v2 VL BF16",
            "providerName": "NVIDIA",
            "inputModalities": [
                "TEXT",
                "IMAGE"
            ],
            "outputModalities": [
                "TEXT"
            ],
            "responseStreamingSupported": true,
            "customizationsSupported": [],
            "inferenceTypesSupported": [
                "ON_DEMAND"
            ],
            "modelLifecycle": {
                "status": "ACTIVE"
            }
        },
        {
            "modelArn": "arn:aws:bedrock:us-east-2::foundation-model/anthropic.claude-sonnet-4-20250514-v1:0",
...
...
...
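If you prefer to stay in Python, the same listing is available through boto3. A minimal sketch (assuming your credentials and region are already configured) that prints only models you can invoke directly on demand:

import boto3

# "bedrock" is the control-plane client (model discovery and management),
# distinct from "bedrock-runtime", which is used later to invoke models.
bedrock = boto3.client("bedrock", region_name="us-east-1")

models = bedrock.list_foundation_models()["modelSummaries"]

for m in models:
    # Keep only models that support direct on-demand invocation
    if "ON_DEMAND" in m.get("inferenceTypesSupported", []):
        print(m["modelId"], "-", m["providerName"])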

Select the model you want and note its modelId from the JSON output, as we’ll need this in our Python code later. An important caveat is that you’ll often see the following in a model description:

...
...
"inferenceTypesSupported": [
    "INFERENCE_PROFILE"
]
...
...

This is reserved for models that:

  • Are large or in high demand
  • Require reserved or managed capacity
  • Need explicit cost and throughput controls

For these models, we can’t just reference the modelId in our code. Instead, we need to reference an inference profile. An inference profile is a Bedrock resource that is bound to one or more foundation models and a region.

There are two ways to obtain an inference profile you can use. The first is to create one yourself; these are called application profiles. The second is to use one of AWS’s supported (system-defined) profiles. This is the easier option, as it’s pre-built for you and you only need to obtain the relevant profile ID associated with the inference profile to use in your code.

If you want to take the route of creating your own application profile, check out the relevant AWS documentation, but I’m going to use a supported profile in my example code.
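For completeness, here is a rough sketch of what creating an application profile might look like using boto3’s create_inference_profile call. The profile name and model ARN are placeholders, and you should check the AWS documentation for the exact parameters your boto3 version expects:

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Create an application inference profile that copies its settings from an
# existing foundation model or system-defined profile (placeholder ARN below).
profile = bedrock.create_inference_profile(
    inferenceProfileName="my-app-profile",
    modelSource={
        "copyFrom": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
    },
)

print(profile["inferenceProfileArn"])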

For a list of supported profiles in AWS, check out the link below:

https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html#inference-profiles-support-system
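You can also query the profiles available in your region programmatically. A minimal sketch, assuming the list_inference_profiles API is available in your boto3 version:

import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")

# Print the ID and name of each inference profile visible in this region.
# The profile ID is what you pass as modelId when invoking the model.
profiles = bedrock.list_inference_profiles()

for p in profiles["inferenceProfileSummaries"]:
    print(p["inferenceProfileId"], "-", p["inferenceProfileName"])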

For my first code example, I want to use Claude’s Sonnet 3.5 v2 model, so I clicked the link above and saw the following description.

Image from AWS website

I took note of the profile ID (us.anthropic.claude-3-5-sonnet-20241022-v2:0) and one of the valid source regions (us-east-1).

For my next two example code snippets, I’ll use OpenAI’s open-source LLM for text output and AWS’s Titan Image Generator for images. Neither of these models requires an inference profile, so you can just use the regular modelId for them in your code.

NB: Whichever model(s) you choose, make sure your AWS region is set to the correct value for each.

Setting Up a Development Environment

As we’ll be doing some coding, it’s best to isolate our environment so we don’t interfere with any of our other projects. So let’s do that now. I’m using Windows and the UV package manager for this, but use whichever tool you’re most comfortable with. My code will run in a Jupyter notebook.

uv init bedrock_demo --python 3.13
cd bedrock_demo
uv add boto3 jupyter pydantic   # pydantic is used in Example 3 below

# To run the notebook, type this in
uv run jupyter notebook

Using Bedrock from Python

Let’s see Bedrock in action with a few examples. The first will be simple, and we’ll gradually increase the complexity as we go.

Example 1: A simple question and answer using an inference profile

This example uses the Claude Sonnet 3.5 v2 model we talked about earlier. As mentioned, to invoke this model, we use the profile ID associated with its inference profile.

import json
import boto3

brt = boto3.client("bedrock-runtime", region_name="us-east-1")

profile_id = "us.anthropic.claude-3-5-sonnet-20241022-v2:0"

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 200,
    "temperature": 0.2,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is the capital of France?"}
            ]
        }
    ]
})

resp = brt.invoke_model(
    modelId=profile_id,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Claude responses come back as a "content" array, not an OpenAI-style "choices" array
print(data["content"][0]["text"])

#
# Output
#
The capital of France is Paris.
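As an aside, Bedrock also offers the Converse API, which smooths over the provider-specific request and response formats. A minimal sketch of the same question using it, reusing the brt client and profile_id from above:

# Same question via the provider-agnostic Converse API
resp = brt.converse(
    modelId=profile_id,
    messages=[
        {"role": "user", "content": [{"text": "What is the capital of France?"}]}
    ],
    inferenceConfig={"maxTokens": 200, "temperature": 0.2},
)

# The Converse API returns a provider-neutral message structure
print(resp["output"]["message"]["content"][0]["text"])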

Note that invoking this model (and others like it) creates an implied subscription between you and the AWS Marketplace. This is not a recurring charge; it only costs you when the model is actually used, but it’s best to keep an eye on it to avoid unexpected bills. You should receive an email outlining the subscription agreement, with a link to manage and/or cancel any existing model subscriptions that are set up.

Example 2: Create an image

A simple image creation example using AWS’s own Titan model. This model isn’t associated with an inference profile, so we can just reference it using its modelId.

import json
import base64
import boto3

brt_img = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id_img = "amazon.titan-image-generator-v2:0"

prompt = "A hippo riding a motorcycle."

body = json.dumps({
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": prompt
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,
        "height": 1024,
        "width": 1024,
        "cfgScale": 7.0,
        "seed": 0
    }
})

resp = brt_img.invoke_model(
    modelId=model_id_img,
    body=body,
    accept="application/json",
    contentType="application/json"
)

data = json.loads(resp["body"].read())

# Titan returns base64-encoded images in the "images" array
img_b64 = data["images"][0]
img_bytes = base64.b64decode(img_b64)

out_path = "titan_output.png"
with open(out_path, "wb") as f:
    f.write(img_bytes)

print("Saved:", out_path)

On my system, the output image looked like this.

Image by AWS Titan LLM

Example 3: A technical support triage assistant using OpenAI’s OSS model

This is a more complex and useful example. Here, we set up an assistant that takes problems reported by non-technical users and outputs additional questions you might want the user to answer, as well as the most likely causes of the issue and what further steps to take. Like our previous example, this model isn’t associated with an inference profile.

import json
import re
import boto3
from pydantic import BaseModel, Field
from typing import List, Literal, Optional

# ----------------------------
# Bedrock setup
# ----------------------------
REGION = "us-east-2"
MODEL_ID = "openai.gpt-oss-120b-1:0"

brt = boto3.client("bedrock-runtime", region_name=REGION)

# ----------------------------
# Output schema
# ----------------------------
Severity = Literal["low", "medium", "high"]
Category = Literal["account", "billing", "device", "network", "software", "security", "other"]

class TriageResponse(BaseModel):
    category: Category
    severity: Severity
    summary: str = Field(description="One-sentence restatement of the issue.")
    likely_causes: List[str] = Field(description="Top plausible causes, concise.")
    clarifying_questions: List[str] = Field(description="Ask only what is required to proceed.")
    safe_next_steps: List[str] = Field(description="Step-by-step actions protected for a non-technical user.")
    stop_and_escalate_if: List[str] = Field(description="Clear red flags that require knowledgeable/helpdesk.")
    recommended_escalation_target: Optional[str] = Field(
        default=None,
        description="If severity is high, who to contact (e.g., IT admin, bank, ISP)."
    )

# ----------------------------
# Helpers
# ----------------------------
def invoke_chat(messages, max_tokens=800, temperature=0.2) -> dict:
    body = json.dumps({
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature
    })

    resp = brt.invoke_model(
        modelId=MODEL_ID,
        body=body,
        accept="application/json",
        contentType="application/json"
    )
    return json.loads(resp["body"].read())

def extract_content(data: dict) -> str:
    return data["choices"][0]["message"]["content"]

def extract_json_object(text: str) -> dict:
    """
    Extract the primary JSON object from model output.
    Handles common cases like  blocks or extra text.
    """
    text = re.sub(r".*?", "", text, flags=re.DOTALL).strip()

    start = text.find("{")
    if start == -1:
        raise ValueError("No JSON object found.")

    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i+1])

    raise ValueError("Unbalanced JSON braces; couldn't parse.")

# ----------------------------
# The useful function
# ----------------------------
def triage_issue(user_problem: str) -> TriageResponse:
    messages = [
        {
            "role": "system",
            "content": (
                "You are a careful technical support triage assistant for non-technical users. "
                "You must be conservative and safety-first. "
                "Return ONLY valid JSON matching the given schema. No extra text."
            )
        },
        {
            "role": "user",
            "content": f"""
User problem:
{user_problem}

Return JSON that matches this schema:
{TriageResponse.model_json_schema()}
""".strip()
        }
    ]

    raw = invoke_chat(messages)
    text = extract_content(raw)
    parsed = extract_json_object(text)
    return TriageResponse.model_validate(parsed)

# ----------------------------
# Example
# ----------------------------
if __name__ == "__main__":
    problem = "My laptop is connected to Wi-Fi but web sites won't load, and Zoom keeps saying unstable connection."
    result = triage_issue(problem)
    print(result.model_dump_json(indent=2))

Here is the output.

"category": "network",
  "severity": "medium",
  "summary": "Laptop shows Wi‑Fi connection but cannot load web sites and Zoom 
              reports an unstable connection.",
  "likely_causes": [
    "Router or modem malfunction",
    "DNS resolution failure",
    "Local Wi‑Fi interference or weak signal",
    "IP address conflict on the network",
    "Firewall or security software blocking traffic",
    "ISP outage or throttling"
  ],
  "clarifying_questions": [
    "Are other devices on the same Wi‑Fi network able to access the internet?",
    "Did the problem start after any recent changes (e.g., new software, OS update, VPN installation)?",
    "Have you tried moving closer to the router or using a wired Ethernet connection?",
    "Do you see any error codes or messages in the browser or Zoom besides "unstable connection"?"
  ],
  "safe_next_steps": [
    "Restart the router and modem by unplugging them for 30 seconds, then power them back on.",
    "On the laptop, forget the Wi‑Fi network, then reconnect and re-enter the password.",
    "Run the built‑in Windows network troubleshooter (Settings → Network & Internet → Status → Network troubleshooter).",
    "Disable any VPN or proxy temporarily and test the connection again.",
    "Open a command prompt and run `ipconfig /release` followed by `ipconfig /renew`.",
    "Flush the DNS cache with `ipconfig /flushdns`.",
    "Try accessing a simple website (e.g., http://example.com) and note if it loads.",
    "If possible, connect the laptop to the router via Ethernet to see if the issue persists."
  ],
  "stop_and_escalate_if": [
    "The laptop still cannot reach any website after completing all steps.",
    "Other devices on the same network also cannot access the internet.",
    "You receive error messages indicating hardware failure (e.g., Wi‑Fi adapter not found).",
    "The router repeatedly restarts or shows error lights.",
    "Zoom continues to report a poor or unstable connection despite a working internet test."
  ],
  "recommended_escalation_target": "IT admin"
}
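Because the response is a validated Pydantic object, it’s easy to build logic on top of it. For instance, a small hypothetical follow-up that routes on the severity field (the exact severity the model assigns will of course vary from run to run):

# Route on the structured triage result
result = triage_issue("I think someone logged into my bank account without my permission.")

if result.severity == "high" and result.recommended_escalation_target:
    print(f"Escalate immediately to: {result.recommended_escalation_target}")
else:
    print("Suggested next steps:")
    for step in result.safe_next_steps:
        print("-", step)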

Summary

This article introduced AWS Bedrock, AWS’s managed gateway to foundation large language models, explaining why it exists, how it fits into the broader AWS AI stack, and how to use it in practice. We covered model discovery, region and credential setup, and the key distinction between on-demand models and those that require inference profiles, a common source of confusion for developers.

Through practical Python examples, we demonstrated text and image generation using both standard on-demand models and those that require an inference profile.

At its core, Bedrock reflects AWS’s long-standing philosophy: abstract infrastructure complexity without removing control. Rather than pushing a single “best” model, Bedrock treats foundation models as managed infrastructure components: swappable, governable, and region-aware. This suggests a future where Bedrock evolves less as a chat interface and more as a model orchestration layer, tightly integrated with IAM, networking, cost controls, and agent frameworks.

Over time, we’d expect Bedrock to move further toward standardised inference contracts (subscriptions) and a clearer separation between experimentation and production capacity. And with the Agents and AgentCore services, we’re already seeing deeper integration of agentic workflows with Bedrock, positioning models not as products in themselves but as durable building blocks inside AWS systems.


For the avoidance of doubt, other than being an occasional user of their services, I have no connection or affiliation with Amazon Web Services.
