The Death of the “Everything Prompt”: Google’s Move Toward Structured AI

Google has been laying the groundwork for a more structured way to build interactive, stateful AI-driven applications. One of the more interesting outcomes of this effort was the release of their new Interactions API a few weeks ago.

As large language models (LLMs) come and go, it’s often the case that an API developed by an LLM provider can get a bit outdated. After all, it can be difficult for an API designer to anticipate all the various changes and tweaks that might be applied to whichever system the API is designed to serve. That is doubly true in AI, where the pace of change is unlike anything seen in the IT world before.

We’ve seen this before with OpenAI, for instance. Their initial API for their models was called the Completions API. As their models advanced, they had to upgrade and release a new API called Responses.

Google is taking a slightly different tack with the Interactions API. It’s not a complete replacement for their older generateContent API, but rather an extension of it.

As Google says in its own documentation…

“The Interactions API (Beta) is a unified interface for interacting with Gemini models and agents. It simplifies state management, tool orchestration, and long-running tasks.”

The remainder of this article explores the architectural necessity of the Interactions API. We’ll start simple by showing how the Interactions API can do everything its predecessor could, then end with how it enables stateful operations, the explicit integration of Google’s high-latency Deep Research agentic capabilities, and the handling of long-running tasks. We’ll move beyond a “Hello World” example to build systems that require deep thought and the orchestration of asynchronous research.

The Architectural Gap: Why “Chat” is Insufficient

To understand why the Interactions API exists, we must analyse why the standard LLM chat loop is insufficient.

In a standard chat application, “state” is implicit. It exists only as a sliding window of token history. If a user is in step 3 of an onboarding wizard and asks an off-topic question, the model might hallucinate a new path, effectively breaking the wizard. The developer has no programmatic guarantee that the user is where they’re supposed to be.

For modern AI systems development, this is insufficient. To counter it, Google’s new API offers ways to refer to previous context in subsequent LLM interactions. We’ll see an example of that later.

The Deep Research Problem

Google’s Deep Research capability (powered by Gemini) is agentic. It doesn’t just retrieve information; it formulates a plan, executes dozens of searches, reads hundreds of pages, and synthesises an answer. This process is asynchronous and high-latency.

You can’t simply prompt a standard chat model to “do deep research” inside a synchronous loop without risking timeouts or context window overflows. The Interactions API lets you encapsulate this volatile agentic process into a stable, managed step, pausing the interaction state while the heavy lifting occurs and resuming only when structured data is returned. Moreover, if a deep research agent is taking a long time over its research, the last thing you want to do is sit there twiddling your thumbs waiting for it to finish. The Interactions API lets you run the research in the background and poll for its results periodically, so you’re notified as soon as the agent returns them.

Setting Up a Development Environment

Let’s see the Interactions API up close by looking at a few coding examples of its use. As with any development project, it’s best to isolate your environment, so let’s do that now. I’m using Windows and the UV package manager for this, but use whichever tool you’re most comfortable with. My code was run in a Jupyter notebook.

uv init interactions_demo --python 3.12
cd interactions_demo
uv add google-genai jupyter

# To run the notebook, type this in

uv run jupyter notebook

To run my example code, you’ll also need a Google API key. If you don’t have one, go to Google’s AI Studio website and log in. Near the bottom left of the screen, you’ll see a Get API key link. Click on that and follow the instructions to get your key. Once you have a key, create an environment variable named GOOGLE_API_KEY on your system and set its value to your API key.
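Before making any calls, it’s worth a quick sanity check that the key is actually visible to Python. This snippet is my own addition, not part of Google’s setup instructions:

import os

# genai.Client() reads GOOGLE_API_KEY from the environment automatically,
# so we just confirm that it's set before going any further.
api_key = os.environ.get("GOOGLE_API_KEY")
if api_key:
    print(f"API key found (ends in ...{api_key[-4:]})")
else:
    print("GOOGLE_API_KEY is not set - check your environment variables")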

Example 1: A Hello World equivalent

from google import genai

client = genai.Client()

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What's the capital of France?"
)

print(interaction.outputs[-1].text)

#
# Output
#
The capital of France is **Paris**.

Example 2: Using Nano Banana to generate an image

Before we examine the specific capabilities of state management and deep research that the new Interactions API offers, I want to show that it’s a general-purpose, multi-modal tool. For this, we’ll use the API to create an image for us using Nano Banana, which is officially known as Gemini 3 Pro Image Preview.

import base64
import os
from google import genai

# 1. Make sure the output directory exists
output_dir = r"c:\temp"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
    print(f"Created directory: {output_dir}")

client = genai.Client()

print("Sending request...")

try:
    # 2. Correct syntax: pass 'response_modalities' directly (not inside config)
    interaction = client.interactions.create(
        model="gemini-3-pro-image-preview", # Ensure you have access to this model
        input="Generate a picture of a hippo wearing a top-hat riding a uni-cycle.",
        response_modalities=["IMAGE"]
    )

    found_image = False

    # 3. Iterate through the outputs and print everything
    for i, output in enumerate(interaction.outputs):

        # Debug: print the type so we know what we got
        print(f"\n--- Output {i+1} Type: {output.type} ---")

        if output.type == "text":
            # If the model refused or chatted back, this will print why
            print(f"📝 Text Response: {output.text}")

        elif output.type == "image":
            print(f"Image Response: Mime: {output.mime_type}")

            # Construct the filename
            file_path = os.path.join(output_dir, f"hippo_{i}.png")

            # Save the image
            with open(file_path, "wb") as f:
                # The SDK may return raw bytes or a base64 string
                if isinstance(output.data, bytes):
                    f.write(output.data)
                else:
                    f.write(base64.b64decode(output.data))

            print(f"Saved to: {file_path}")
            found_image = True

    if not found_image:
        print("\nNo image was returned. Check the 'Text Response' above for the reason.")

except Exception as e:
    print(f"\nError: {e}")

This was my output (the generated hippo image, not reproduced here).

Example 3: State Management

Stateful management in the Interactions API is built around the “Interaction” resource, which serves as a session record containing the complete history of a task, from user inputs to tool results.

To continue a conversation that remembers previous context, you pass the ID of an earlier interaction into the previous_interaction_id parameter of a new request.

The server uses this ID to automatically retrieve the full context of the particular session it’s associated with, eliminating the need for the developer to resend the entire chat history. A side-effect is that caching can be used more effectively, leading to improved performance and reduced token costs.

Stateful interactions require that the data be stored on Google’s servers. By default, the store parameter is set to true, which enables this feature. If a developer sets store=false, they can’t use stateful features like previous_interaction_id.
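For completeness, here’s what opting out of server-side storage looks like. This is a minimal sketch based on the store parameter described above; everything else mirrors Example 1:

from google import genai

client = genai.Client()

# With store=False, nothing is persisted on Google's servers, so this
# interaction's ID can't be fed into previous_interaction_id later.
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What's the capital of France?",
    store=False
)
print(interaction.outputs[-1].text)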

Stateful mode also allows mixing different models and agents in a single thread. For example, you could use a Deep Research agent for data collection and then reference that interaction’s ID to have a standard (cheaper) Gemini model summarise the findings, as sketched below.
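Here’s a minimal sketch of that pattern. The research ID below is a placeholder for the ID of a completed Deep Research interaction (Example 4 shows how to obtain one):

from google import genai

client = genai.Client()

# Placeholder: the ID of a completed Deep Research interaction
research_id = "v1_YOUR_COMPLETED_RESEARCH_INTERACTION_ID"

# A cheaper Flash model picks up the full research context by ID
# and distils it, without us resending any of the research output.
summary = client.interactions.create(
    model="gemini-2.5-flash",
    input="Summarise the key findings above in five bullet points.",
    previous_interaction_id=research_id
)
print(summary.outputs[-1].text)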

Here’s a quick example where we kick off a simple task by telling the model our name and asking it some simple questions. We record the Interaction ID that the session produces; then, at some later time, we ask the model what our name was and what the second question we asked was.

from google import genai

client = genai.Client()

# 1. First turn
interaction1 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="""
Hi, it's Tom here. Can you tell me the chemical name for water?
Also, which is the smallest recognised country in the world?
And how tall in feet is Mt Everest?
"""
)
print(f"Response: {interaction1.outputs[-1].text}")
print(f"ID: {interaction1.id}")
#
# Output
#

Response: Hi Tom! Here are the answers to your questions:

*   **Chemical name for water:** The most common chemical name is **dihydrogen monoxide** ($H_2O$), though in formal chemistry circles, its systematic name is **oxidane**.
*   **Smallest recognized country:** **Vatican City**. It covers only about 0.17 square miles (0.44 square kilometers) and is an independent city-state enclaved within Rome, Italy.
*   **Height of Mt. Everest:** According to the most recent official measurement (confirmed in 2020), Mt. Everest is **29,031.7 feet** (8,848.86 meters) tall.
ID: v1_ChdqamxlYVlQZ01jdmF4czBQbTlmSHlBOBIXampsZWFZUGdNY3ZheHMwUG05Zkh5QTg

A few hours later…

from google import genai

client = genai.Client()

# 2. Second turn (passing previous_interaction_id)
interaction2 = client.interactions.create(
    model="gemini-3-flash-preview",
    input="Are you able to tell me my name and what was the second query I asked you",
    previous_interaction_id='v1_ChdqamxlYVlQZ01jdmF4czBQbTlmSHlBOBIXampsZWFZUGdNY3ZheHMwUG05Zkh5QTg'
)
print(f"Model: {interaction2.outputs[-1].text}")

#
# Output
#
Model: Hi Tom!

Your name is **Tom**, and the second question you asked was:
**"Which is the smallest recognised country in the world?"**
(to which the answer is Vatican City).

Example 4: The Asynchronous Deep Research Orchestrator

Now, on to something that Google’s old API cannot do. One of the key advantages of the Interactions API is that you can use it to call specialised agents, such as deep-research-pro-preview-12-2025, for complex tasks.

In this example, we’ll build a competitive intelligence engine. The user specifies a business competitor, and the system triggers a Deep Research agent to scour the web, read annual reports, and create a Strengths, Weaknesses, Opportunities and Threats (SWOT) analysis. We split this into two parts. First, we fire off our research request using code like this.

import time
import sys
from google import genai

def competitive_intelligence_engine():
    client = genai.Client()

    print("--- Deep Research Competitive Intelligence Engine ---")
    competitor_name = input("Enter the name of the competitor to analyse (e.g., Nvidia, Coca-Cola): ")

    # We craft a specific prompt to force the agent to look for particular document types
    prompt = f"""
    Conduct a deep research investigation into '{competitor_name}'.

    Your specific tasks are:
    1. Scour the web for the most recent Annual Report (10-K) and latest Quarterly Earnings transcripts.
    2. Search for recent news regarding product launches, strategic partnerships, and legal challenges in the last 12 months.
    3. Synthesize all findings into a detailed SWOT Analysis (Strengths, Weaknesses, Opportunities, Threats).

    Format the output as a professional executive summary with the SWOT section clearly defined in Markdown.
    """

    print(f"\n Deploying Deep Research Agent for: {competitor_name}...")

    # 1. Start the Deep Research agent as a background task
    try:
        initial_interaction = client.interactions.create(
            input=prompt,
            agent="deep-research-pro-preview-12-2025",
            background=True
        )
    except Exception as e:
        print(f"Error starting agent: {e}")
        return None, None, None

    print(f" Research started. Interaction ID: {initial_interaction.id}")
    print("⏳ The agent is now browsing the web and reading reports. This may take several minutes.")
    return client, competitor_name, initial_interaction

# Kick off the research and note when we started, for the polling loop below
client, competitor_name, initial_interaction = competitive_intelligence_engine()
start_time = time.time()

This produces the following output.

--- Deep Research Competitive Intelligence Engine ---
Enter the name of the competitor to analyse (e.g., Nvidia, Coca-Cola):  Nvidia

Deploying Deep Research Agent for: Nvidia...
Research started. Interaction ID: v1_ChdDdXhiYWN1NEJLdjd2ZElQb3ZHdTBRdxIXQ3V4YmFjdTRCS3Y3dmRJUG92R3UwUXc
The agent is now browsing the web and reading reports. This may take several minutes.

Next, since we know the research job will take a while to complete, we can use the Interaction ID printed above to monitor it and check periodically to see whether it’s finished.

Typically, this would be done in a separate process that could email or text you when the research job completes, so you can get on with other tasks in the meantime. (I sketch that variant after the polling loop below.)

try:
    while True:
        # Refresh the interaction status
        interaction = client.interactions.get(initial_interaction.id)

        # Calculate the elapsed time
        elapsed = int(time.time() - start_time)

        # Print a dynamic status line so we know it's working
        sys.stdout.write(f"\r Status: {interaction.status.upper()} | Time Elapsed: {elapsed}s")
        sys.stdout.flush()

        if interaction.status == "completed":
            print("\n\n" + "="*50)
            print(f" INTELLIGENCE REPORT: {competitor_name.upper()}")
            print("="*50 + "\n")

            # Print the report content
            print(interaction.outputs[-1].text)
            break

        elif interaction.status in ["failed", "cancelled"]:
            print(f"\n\nJob ended with status: {interaction.status}")
            # Sometimes error details appear in the output text even on failure
            if interaction.outputs:
                print(f"Error details: {interaction.outputs[-1].text}")
            break

        # Wait before polling again to respect rate limits
        time.sleep(10)

except KeyboardInterrupt:
    print("\nUser interrupted. Research may continue in the background.")

I won’t show the full research output, as it was pretty lengthy, but here is part of it.

==================================================
📝 INTELLIGENCE REPORT: NVIDIA
==================================================

# Strategic Analysis & Executive Review: Nvidia Corporation (NVDA)

### Key Findings
*   **Financial Dominance:** Nvidia reported record Q3 FY2026 revenue of **$57.0 billion** (+62% YoY), driven by a staggering **$51.2 billion** in Data Center revenue. The company has effectively transitioned from a hardware manufacturer to the foundational infrastructure provider for the "AI Industrial Revolution."
*   **Strategic Expansion:** Major moves in late 2025 included a **$100 billion investment roadmap with OpenAI** to deploy 10 gigawatts of compute and a **$20 billion acquisition of Groq's assets**, pivoting Nvidia aggressively into the AI inference market.
*   **Regulatory Peril:** The company faces intensifying geopolitical headwinds. In September 2025, China's SAMR found Nvidia in violation of antitrust laws regarding its Mellanox acquisition. Concurrently, the U.S. Supreme Court allowed a class-action lawsuit regarding crypto-revenue disclosures to proceed.
*   **Product Roadmap:** The launch of the **GeForce RTX 50-series** (Blackwell architecture) and **Project DIGITS** (personal AI supercomputer) at CES 2025 signals a push to democratize AI compute beyond the data center to the desktop.

---

## 1. Executive Summary

Nvidia Corporation (NASDAQ: NVDA) stands at the apex of the artificial intelligence transformation, having successfully evolved from a graphics processing unit (GPU) vendor into a full-stack computing platform company. As of early 2026, Nvidia is not merely selling chips; it is building "AI Factories": entire data centers integrated with its proprietary networking, software (CUDA), and hardware.
The fiscal year 2025 and the first three quarters of fiscal 2026 have demonstrated unprecedented financial acceleration. The company's "Blackwell" architecture has seen demand outstrip supply, creating a backlog that extends well into 2026. However, this dominance has invited intense scrutiny. The geopolitical rift between the U.S. and China poses the single greatest threat to Nvidia's long-term growth, evidenced by recent antitrust findings from Chinese regulators and continued smuggling controversies involving restricted chips like the Blackwell B200.
Strategically, Nvidia is hedging against the commoditization of AI training by aggressively entering the **inference** market: the phase where AI models are used rather than built. The acquisition of Groq's technology in December 2025 is a defensive and offensive maneuver to secure low-latency processing capabilities.

---

## 2. Financial Performance Analysis
**Sources:** [cite: 1, 2, 3, 4, 5]

### 2.1. Fiscal Year 2025 Annual Report (10-K) Highlights
Nvidia's Fiscal Year 2025 (ending January 2025) marked a historic inflection point in the technology sector.
*   **Total Revenue:** $130.5 billion, a **114% increase** year-over-year.
*   **Net Income:** $72.9 billion, soaring **145%**.
*   **Data Center Revenue:** $115.2 billion (+142%), confirming the complete shift of the company's gravity away from gaming and toward enterprise AI.
*   **Gross Margin:** Expanded to **75.0%** (up from 72.7%), reflecting pricing power and the high value of the Hopper architecture.
...
...
...
## 5. SWOT Analysis

### **Strengths**
*   **Technological Monopoly:** Nvidia possesses an estimated 80-90% market share in AI training chips. The **Blackwell** and upcoming **Vera Rubin** architectures maintain a multi-year lead over competitors.
*   **Ecosystem Lock-in (CUDA):** The CUDA software platform remains the industry standard. The recent expansion into "AI Factories" and full-stack solutions (networking + hardware + software) makes switching costs prohibitively high for enterprise customers.
*   **Financial Fortress:** With gross margins exceeding **73%** and free cash flow in the tens of billions, Nvidia has immense capital to reinvest in R&D ($100B OpenAI commitment) and acquire emerging tech (Groq).
*   **Supply Chain Command:** By pre-booking massive capacity at TSMC (CoWoS packaging), Nvidia effectively controls the tap of global AI compute supply.

### **Weaknesses**
*   **Revenue Concentration:** A significant slice of revenue is derived from a handful of "Hyperscalers" (Microsoft, Meta, Google, Amazon). If these clients successfully pivot to their own custom silicon (TPUs, Trainium, Maia), Nvidia's revenue could face a cliff.
*   **Pricing Alienation:** The high cost of Nvidia hardware (e.g., $1,999 for consumer GPUs, $30k+ for enterprise chips) is pushing smaller developers and startups toward cheaper alternatives or cloud-based inference solutions.
*   **Supply Chain Single Point of Failure:** Total reliance on **TSMC** in Taiwan exposes Nvidia to catastrophic risk in the event of a cross-strait conflict or natural disaster.

### **Opportunities**
*   **The Inference Market:** The $20B Groq deal positions Nvidia to dominate the *inference* phase (running models), which is anticipated to be a bigger market than training in the long term.
*   **Sovereign AI:** Nations (Japan, France, Middle Eastern states) are building their own "sovereign clouds" to protect data privacy. This creates a new, massive customer base outside of US Big Tech.
*   **Physical AI & Robotics:** With **Project GR00T** and the **Jetson** platform, Nvidia is positioning itself as the brain for humanoid robots and autonomous industrial systems, a market still in its infancy.
*   **Software & Services (NIMs):** Nvidia is transitioning to a software-as-a-service model with Nvidia Inference Microservices (NIMs), creating recurring revenue streams that are less cyclical than hardware sales.

### **Threats**
*   **Geopolitical Trade War:** The US-China tech war is the existential threat. Further tightening of export controls (e.g., banning H20 chips) or aggressive retaliation from China (SAMR antitrust penalties) could permanently sever access to one of the world's largest semiconductor markets.
*   **Regulatory Antitrust Action:** Beyond China, Nvidia faces scrutiny in the EU and US (DOJ) regarding its bundling practices and market dominance. A forced breakup or behavioral remedies could hamper its "full-stack" strategy.
*   **Smuggling & IP Theft:** As seen with the DeepSeek controversy, export bans may inadvertently fuel a black market and accelerate Chinese domestic innovation (e.g., Huawei Ascend), creating a competitor that operates outside Western IP laws.
*   **"Good Enough" Competition:** For many inference workloads, cheaper chips from AMD or specialized ASICs may eventually become "good enough," eroding Nvidia's pricing power at the lower end of the market.
...
...
...

There’s a lot more you can do with the Interactions API than I’ve shown, including tool and function calling, MCP integration, structured output, and streaming.
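As a taste, here’s what streaming might look like. Be warned that this is an unverified sketch: the stream parameter and the shape of the streamed chunks are my assumptions, modelled on Google’s other SDK surfaces, so check the documentation linked below before relying on it:

from google import genai

client = genai.Client()

# Assumed API shape: a stream flag that yields incremental chunks.
# Verify against Google's Interactions API docs before using this.
for chunk in client.interactions.create(
    model="gemini-2.5-flash",
    input="Write a limerick about state management.",
    stream=True
):
    # Each streamed chunk is assumed to carry a partial text delta
    print(getattr(chunk, "text", "") or "", end="", flush=True)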

But please remember that, as of the time of writing, the Interactions API is still in Beta, and Google’s deep research agent is in preview. This will undoubtedly change in the coming weeks, but it’s best to check before using this tool in a production system.

For more information, see the link below to Google’s official documentation page for the Interactions API.

https://ai.google.dev/gemini-api/docs/interactions?ua=chat

Summary

The Google Interactions API signals a maturing of the AI engineering ecosystem. It acknowledges that the “Everything Prompt”, a single massive block of text attempting to handle personality, logic, tools, and safety, is an anti-pattern.

By using this API, developers on Google AI can effectively decouple Reasoning (the LLM’s job) from Architecture (the developer’s job).

Unlike standard chat loops, where state is implicit and prone to hallucinations, this API uses a structured “Interaction” resource to serve as a permanent session record of all inputs, outputs, and tool results. With stateful management, developers can reference an Interaction ID from a previous chat and retrieve the full context automatically. This optimises caching, improves performance, and lowers costs by eliminating the need to resend entire histories.

Moreover, the Interactions API is uniquely capable of orchestrating asynchronous, high-latency agentic processes, such as Google’s Deep Research, which can scour the web and synthesise massive amounts of data into complex reports. Because this research runs asynchronously, you can fire off long-running tasks and write simple code to be notified when the job finishes, leaving you free to work on other things in the interim.

If you are building a creative writing assistant, a simple chat loop is fine. But if you are building a financial analyst, a medical screener, or a deep research engine, the Interactions API provides the scaffolding necessary to turn a probabilistic model into a more reliable product.
