In December we first introduced native image output in Gemini 2.0 Flash to trusted testers. Today, we’re making it available for developer experimentation across all regions currently supported by Google AI Studio. You can test this capability using an experimental version of Gemini 2.0 Flash (gemini-2.0-flash-exp) in Google AI Studio and via the Gemini API.
Gemini 2.0 Flash combines multimodal input, enhanced reasoning, and natural language understanding to create images.
Listed below are some examples of where 2.0 Flash’s multimodal outputs shine:
1. Text and pictures together
Use Gemini 2.0 Flash to tell a story and it will illustrate it with pictures, keeping the characters and settings consistent throughout. Give it feedback and the model will retell the story or change the style of its drawings.
Story and illustration generation in Google AI Studio
2. Conversational image editing
Gemini 2.0 Flash helps you edit images through multi-turn natural language dialogue, which is great for iterating toward a perfect image or exploring different ideas together.
Multi-turn conversation image editing maintaining context throughout the conversation in Google AI Studio
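As a minimal sketch of what a multi-turn editing session could look like: the edit instructions below are illustrative, and run_edit_session is a hypothetical helper of ours, but it assumes the google-genai SDK's chat interface (client.chats.create / chat.send_message), which carries prior turns as context automatically.

```python
# Illustrative edit instructions; each turn refines the previous image.
EDIT_TURNS = [
    "Draw a cozy cabin in a snowy forest at dusk.",
    "Add warm light glowing from the windows.",
    "Now show the same scene in summer.",
]


def run_edit_session(chat, turns):
    """Send each edit instruction on the same chat so context is preserved."""
    return [chat.send_message(turn) for turn in turns]


# Usage sketch (requires the google-genai package and a valid API key):
#
# from google import genai
# from google.genai import types
#
# client = genai.Client(api_key="GEMINI_API_KEY")
# chat = client.chats.create(
#     model="gemini-2.0-flash-exp",
#     config=types.GenerateContentConfig(response_modalities=["Text", "Image"]),
# )
# responses = run_edit_session(chat, EDIT_TURNS)
```

Because every instruction goes through the same chat object, the model sees the full conversation when producing each new image, rather than treating each prompt as a fresh request.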
3. World understanding
Unlike many other image generation models, Gemini 2.0 Flash leverages world knowledge and enhanced reasoning to create the right image. This makes it perfect for creating detailed imagery that’s realistic, like illustrating a recipe. While it strives for accuracy, like all language models, its knowledge is broad and general, not absolute or complete.
Interleaved text and image output for a recipe in Google AI Studio
4. Text rendering
Most image generation models struggle to accurately render long sequences of text, often producing poorly formatted or illegible characters, or misspellings. Internal benchmarks show that 2.0 Flash has stronger text rendering compared to leading competitive models, making it great for creating advertisements, social posts, and even invitations.
Image outputs with long text rendering in Google AI Studio
Start making images with Gemini today
Start with Gemini 2.0 Flash via the Gemini API. Read more about image generation in our docs.
from google import genai
from google.genai import types

client = genai.Client(api_key="GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=(
        "Generate a story about a cute baby turtle in a 3d digital art style. "
        "For every scene, generate a picture."
    ),
    config=types.GenerateContentConfig(
        response_modalities=["Text", "Image"]
    ),
)
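The response interleaves text parts and inline image parts. As a minimal sketch of post-processing (the helper name is ours, but the candidates/parts/inline_data layout follows the google-genai SDK's response objects), you could save the generated images like this:

```python
# Hypothetical helper: walk the first candidate's parts, print the story
# text, and write each inline image's raw bytes to disk.
def save_inline_images(response, prefix="scene"):
    """Save every inline image part to disk and return the saved filenames."""
    saved = []
    for part in response.candidates[0].content.parts:
        if part.text is not None:
            print(part.text)  # narrative text for this scene
        elif getattr(part, "inline_data", None):
            # inline_data carries the raw image bytes returned by the model
            name = f"{prefix}_{len(saved)}.png"
            with open(name, "wb") as f:
                f.write(part.inline_data.data)
            saved.append(name)
    return saved
```

A call like save_inline_images(response) after the request above would leave one PNG per generated scene in the working directory.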
Whether you’re building AI agents, developing apps with beautiful visuals like illustrated interactive stories, or brainstorming visual ideas in conversation, Gemini 2.0 Flash lets you add text and image generation with just a single model. We’re eager to see what developers create with native image output, and your feedback will help us finalize a production-ready version soon.
