Gemini Image Models (Nano Banana Family)

Last updated: April 22, 2026

Introduction

Scenario offers three Gemini image models from Google, collectively known as the Nano Banana family. These models are built around visual reasoning: you generate and edit images through natural-language instructions while the model preserves subject identity across edits and, on the models that support Google Search, grounds outputs in real-world information.


Model Overview

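| Model          | Official Name   | Reference Images | Max Resolution | Google Search | Best Use Case                   |
|----------------|-----------------|------------------|----------------|---------------|---------------------------------|
| Gemini 2.5     | Nano Banana     | Up to 6          | Auto / 1K      | No            | Standard edits & fast iteration |
| Gemini 3.0 Pro | Nano Banana Pro | Up to 14         | 4K             | Yes           | Studio-quality & complex scenes |
| Gemini 3.1     | Nano Banana 2   | Up to 14         | 4K             | Yes           | Speed & subject consistency     |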

All models are accessible via Edit with Prompts and the Generate Image dashboard.
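The limits in the overview can also be treated as data when deciding which model fits a job. The following is a minimal sketch, not an official client: the dictionary keys, the `ModelSpec` class, and the `candidates` helper are hypothetical names chosen for illustration, and only the numbers and tiers come from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    official_name: str
    max_references: int
    resolutions: tuple  # selectable tiers; empty means no resolution control
    google_search: bool


# Values transcribed from this article. The keys are illustrative labels,
# not official API identifiers.
MODELS = {
    "gemini-2.5": ModelSpec("Nano Banana", 6, (), False),
    "gemini-3.0-pro": ModelSpec("Nano Banana Pro", 14, ("1K", "2K", "4K"), True),
    "gemini-3.1": ModelSpec("Nano Banana 2", 14, ("512", "1K", "2K", "4K"), True),
}


def candidates(n_references=0, resolution=None, needs_search=False):
    """Return the labels of models whose documented limits cover the request."""
    result = []
    for label, spec in MODELS.items():
        if n_references > spec.max_references:
            continue  # too many reference images for this model
        if resolution is not None and resolution not in spec.resolutions:
            continue  # requested tier is not selectable on this model
        if needs_search and not spec.google_search:
            continue  # Google Search grounding is not available
        result.append(label)
    return result


print(candidates(n_references=10))   # models that accept 10 references
print(candidates(resolution="512"))  # models that offer the 512 tier
```

Passing no resolution means "no preference", so Gemini 2.5 (which has no resolution control) stays in the list for requests that only constrain the reference count.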


🍌 Gemini 3.1 (Nano Banana 2)

Gemini 3.1 is the recommended model for most workflows. It shares the 14-reference-image capacity and Google Search support of 3.0 Pro but adds a 512px resolution tier and defaults to 1K output, making it the best choice when speed and credit efficiency matter.

Parameters

  • Prompt: required. Max 4096 characters.

  • Reference Images: optional. Up to 14 reference images.

  • Aspect Ratio: same options as Gemini 2.5 (21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16, or auto).

  • Resolution: 512, 1K (default), 2K, or 4K.

  • Use Google Search: boolean, default false.

  • Number of Outputs: 1 to 4 (default 1).

  • Seed: optional. For reproducible results.

Use Gemini 3.1 for rapid iteration, high-volume workflows, or when 1K output is sufficient. Consider Gemini 3.0 Pro when final deliverables call for its studio-quality rendering at 2K or 4K.
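To make the Gemini 3.1 limits concrete, here is a minimal sketch of a settings object that applies the documented defaults and rejects out-of-range values before anything is submitted. The class and field names are assumptions made for this example and do not mirror an official Scenario API schema. The aspect ratio is stored as a plain string and is not validated here.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_PROMPT_CHARS = 4096
MAX_REFERENCES = 14
RESOLUTIONS = ("512", "1K", "2K", "4K")


@dataclass
class Gemini31Settings:
    """Hypothetical container for the Gemini 3.1 parameters listed above."""
    prompt: str
    reference_images: List[str] = field(default_factory=list)
    aspect_ratio: str = "auto"        # assumed default; not validated in this sketch
    resolution: str = "1K"            # documented default
    use_google_search: bool = False   # documented default
    num_outputs: int = 1              # documented default
    seed: Optional[int] = None        # set for reproducible results

    def __post_init__(self):
        if not self.prompt:
            raise ValueError("prompt is required")
        if len(self.prompt) > MAX_PROMPT_CHARS:
            raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
        if len(self.reference_images) > MAX_REFERENCES:
            raise ValueError(f"at most {MAX_REFERENCES} reference images")
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if not 1 <= self.num_outputs <= 4:
            raise ValueError("num_outputs must be between 1 and 4")


settings = Gemini31Settings(prompt="A red fox in fresh snow", num_outputs=2, seed=7)
print(settings)
```

Validating locally like this catches typos such as an unsupported resolution tier before a generation is spent on them.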


🍌 Gemini 3.0 Pro (Nano Banana Pro)

Gemini 3.0 Pro is the high-capability model for complex creative tasks. It supports up to 14 reference images, selectable output resolutions up to 4K, and optional Google Search grounding for factually accurate outputs.

Parameters

  • Prompt: required. Max 4096 characters.

  • Reference Images: optional. Up to 14 reference images.

  • Aspect Ratio: same options as Gemini 2.5 (21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16, or auto).

  • Resolution: 1K, 2K (default), or 4K.

  • Use Google Search: boolean, default false. When enabled, the model accesses real-time information from Google Search to ground outputs in current data. Useful for infographics, factual scenes, and product content with real-world references.

  • Number of Outputs: 1 to 4 (default 1).

  • Seed: optional. For reproducible results.

What sets it apart

  • Accepts up to 14 reference images, enabling consistency across scenes with multiple characters and objects.

  • Outputs at 1K, 2K, or 4K for print-ready and high-resolution deliverables.

  • Google Search integration allows generating factually grounded visuals such as real-time data infographics, historically accurate scenes, and product label translations.

  • Superior text rendering within images, including multilingual text replacement.

Use Gemini 3.0 Pro when you need maximum reference fidelity, high-resolution output, or real-world grounding.


🍌 Gemini 2.5 (Nano Banana)

Gemini 2.5 is the earlier model in the Nano Banana family. It handles core editing and generation tasks with a simpler parameter set, but has fewer reference image slots (up to 6), no resolution control, and no Google Search support. For most new workflows, Gemini 3.1 offers the same speed at lower cost with more capabilities. Gemini 2.5 remains available for teams with existing workflows built around it.

Parameters

  • Prompt: required. Natural language description of the desired change or generation. Max 4096 characters.

  • Reference Images: optional. Up to 6 reference images for style or content guidance.

  • Aspect Ratio: output aspect ratio. Options: 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16, or auto (default, matches input image).

  • Number of Outputs: 1 to 4 images per generation (default 1).

  • Seed: optional. For reproducible results.
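When you want an explicit aspect ratio instead of auto, it helps to know which supported option is nearest to your source image. The sketch below is one way to compute that. The helper names are hypothetical, and the comparison on a logarithmic scale is a design choice of this example (so landscape and portrait deviations are treated symmetrically), not a description of how the auto setting works internally.

```python
import math

# Explicit ratio options listed for the Aspect Ratio parameter ("auto" excluded).
SUPPORTED_RATIOS = (
    "21:9", "16:9", "3:2", "4:3", "5:4", "1:1", "4:5", "3:4", "2:3", "9:16",
)


def ratio_value(label):
    """Convert a label such as '16:9' into width / height."""
    w, h = label.split(":")
    return int(w) / int(h)


def closest_supported_ratio(width, height):
    """Return the supported ratio label nearest to a width x height image."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    return min(
        SUPPORTED_RATIOS,
        key=lambda label: abs(math.log(ratio_value(label)) - target),
    )


print(closest_supported_ratio(1920, 1080))
print(closest_supported_ratio(2560, 1080))
```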


Using Reference Images

All three models accept reference images to guide the output. The way you use them varies by task.

  • Single reference: provide one image and a prompt describing the edit. The model applies the instruction while preserving the subject.

  • Multiple references: provide 2 to 6 images (Gemini 2.5) or 2 to 14 images (3.0 Pro and 3.1). Useful for compositing characters into new environments, matching an existing art style, generating new assets consistent with a roster, or blending visual elements from different sources.

  • Style references: include images that represent the target aesthetic without describing it in text. The model infers the style from the visual context.

To add reference images in Edit with Prompts, click the image slots in the left panel before generating.


Prompting Guide

Effective prompts for the Gemini family follow a few principles.

  • Be specific about what changes and what stays. Rather than "make it better," describe the exact modification: "Change the jacket to dark brown leather and keep everything else the same."

  • Describe desired outcomes, not restrictions. "Add a warm golden-hour glow to the lighting" works better than "don't make it too dark."

  • Use cinematic or photographic language for composition. Terms like "close-up shot," "drone aerial view," "shallow depth of field," and "three-quarter angle" reliably control framing.

  • Break complex scenes into steps. For multi-element edits, run sequential prompts. Complete one change, then use the output as the new reference for the next edit.
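The step-by-step approach in the last principle amounts to a simple loop: each output becomes the reference for the next prompt. The sketch below shows only that control flow. `run_edit_chain` is a hypothetical helper, and `fake_edit` is a stand-in that returns a string so the example runs without any service; in practice you would pass a function that performs one real edit (for instance, a manual step in Edit with Prompts or your own API wrapper).

```python
from typing import Callable, List


def run_edit_chain(
    start_image: str,
    prompts: List[str],
    edit: Callable[[str, str], str],
) -> List[str]:
    """Apply prompts one at a time, feeding each output back in as the next reference."""
    outputs = []
    current = start_image
    for prompt in prompts:
        current = edit(current, prompt)  # one focused change per step
        outputs.append(current)          # keep every intermediate for review
    return outputs


def fake_edit(image: str, prompt: str) -> str:
    """Stand-in for a real edit call; it only records what would be applied."""
    return f"{image} -> [{prompt}]"


steps = run_edit_chain(
    "hero.png",
    [
        "Change the jacket to dark brown leather and keep everything else the same",
        "Add a warm golden-hour glow to the lighting",
    ],
    fake_edit,
)
for step in steps:
    print(step)
```

Keeping every intermediate output makes it easy to go back to the last good step if a later edit drifts.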


Practical Examples: A Showcase of Possibilities

The power of the Gemini models is best understood through the diverse and complex tasks they can accomplish. The following examples, compiled from verified official announcements, expert reviews, and early access tests, demonstrate the breadth of their capabilities.

Text, Infographics, and Data Visualization

1. Real-Time Weather Infographic

  • Concept: This example showcases the model's ability to connect to real-time data via Google Search and visualize it. It moves beyond static image generation to create dynamic, data-driven content. This is a powerful tool for news, reporting, and personalized information.

Prompt: "Generate an infographic of the current weather in Tokyo."


2. Technical Project Explainer

  • Concept: A demonstration of deep reasoning and knowledge grounding. With a very short prompt, the model researches a complex open-source project and generates a comprehensive, accurate infographic. This highlights its ability to synthesize information and present it visually.

Prompt: "Infographic explaining how the Datasette open source project works"


3. Product Label Translation

  • Concept: This showcases precise, localized image editing. The model can identify, translate, and re-render text in a different language while perfectly preserving the surrounding image details. This is a game-changer for global marketing and product localization.

Prompt: "Translate all the English text on the three yellow and blue cans into Korean, while keeping everything else the same"


4. Recipe Flash Cards

  • Concept: Combining web search with structured content generation. The model can look up information (a recipe) and then reformat it into a different layout (flash cards). This is useful for educational content, study guides, and instructional materials.

Prompt: "Look up a recipe and generate flash cards"


5. Text on Whiteboard

  • Concept: A test of fine-motor skill simulation and text rendering accuracy. The model generates an image of a character performing the action of writing, with the resulting text being legible and contextually placed. It even adds relevant environmental details.

Prompt: "Create a panda writing 'Gemini 3.0 is on Scenario' on a whiteboard"


UI/UX and Application Design

6. Modern App UI

  • Concept: A demonstration of the model's ability to generate modern, professional user interface designs. It understands current design trends and can create assets for different themes (light and dark mode). This can significantly speed up the prototyping and design process.

Prompt: "Create a modern application UI in dark mode with neon accents"


7. Software Interface Simulation

  • Concept: The ability to generate realistic mockups of existing software interfaces. While not pixel-perfect, it can create convincing representations of operating systems and applications. This is useful for creating tutorials, marketing materials, or envisioning integrations.

Prompt: "Create a picture of a Windows computer with YouTube tab open"


8. Brand Variation with Logo Preservation

  • Concept: This example demonstrates the model's ability to preserve the Terra Quest logo's visual identity while generating creative variations across different environments. The model keeps the logo intact, with no distortion in typography or proportions, and produces background variations that remain consistent with the original illustrative style. It also updates internal illustrated elements, such as the mountains inside the boot, so they match the theme of the new background. This approach ensures coherent, professional design outputs suitable for brand-safe workflows.

Prompt: A descriptive prompt to create creative variations of a social asset featuring the Terra Quest logo placed over a new environment background while preserving the logo's structure, color palette, and identity. Update the illustrated elements inside the boot so they visually match the new background.


Storyboarding and Scene Composition

9. Cinematic Storyboarding

  • Concept: Translating a single moment into a narrative sequence. The model can take one image and generate a series of shots with different camera angles, effectively creating a storyboard. This demonstrates an understanding of cinematic language and visual storytelling.

Prompt: "Create a storyboard for this scene"


10. Scene Composition with Mood Matching

  • Concept: Advanced multi-reference composition. The model can take multiple inputs (an illustration, a phone, and a mood board) and blend them into a single, coherent scene. It matches the lighting across sources and adds creative details that fit the mood.

Prompt: "Change the man's pose to hold the banana close to the camera"


11. 2D to 3D Scene Rendering

  • Concept: Transforming a flat collection of 2D assets into a cohesive 3D space. This shows the model's ability to interpret brand guidelines and create a dimensional rendering of an environment. It's a powerful tool for event planning, architectural visualization, and marketing.

Prompt: A descriptive prompt to combine 2D brand elements from a mood board into a single 3D rendering.


Character, Style, and Brand Consistency

12. Lion as Superman

  • Concept: A creative blend of a real-world animal with a fictional character. This example highlights the model's ability to merge concepts and add realistic physical effects, like motion blur on the cape. It's a demonstration of both imagination and technical execution.

Prompt: "A lion as Superman flying in the sky"


13. Low-Poly Game Style

  • Concept: Style transfer for game development. The model can take a concept and render it in a specific, stylized aesthetic like low-poly game art. It can also generate relevant UI elements, showing an understanding of the target medium.

Prompt: "Turn into low-poly style"


14. Professional Headshot Reframing

  • Concept: A practical business use case for maintaining brand consistency. The model can take a new employee's headshot and adjust the background, lighting, and framing to match the style of existing team photos. This is a huge time-saver for corporate branding.

Prompt: "Create a professional headshot with the subject in a tailored suit"


Advanced Transformation and Creative Control

15. 3D Pancake Skull

  • Concept: A test of complex object generation and creative interpretation. The model is asked to create a highly unusual object—a pancake shaped like a skull—and then apply realistic food styling. This demonstrates its ability to handle imaginative and detailed prompts.

Prompt: "Create an image of a three-dimensional pancake in the shape of a skull, garnished on top with blueberries and maple syrup"


16. Selective Focus Control

  • Concept: A demonstration of professional photographic controls. The model can manipulate the depth of field to selectively blur parts of an image, drawing the viewer's attention. This mimics the use of a wide-aperture lens and is a key tool in photography.

Prompt: "Focus on the faces of the crowd and make woman blurry"


17. Time of Day Change

  • Concept: A powerful tool for controlling the mood and atmosphere of a scene. The model can realistically transform the lighting of an image to change the time of day. This is invaluable for real estate, film, and marketing.

Prompt: "Change to daytime"


18. Aspect Ratio Zoom

  • Concept: A practical tool for content creation and reframing. The model can zoom in on a specific part of an image while locking the aspect ratio. This is useful for creating social media cut-downs or focusing on a key detail.

Prompt: "Zoom in on this image, maintaining a 16:9 aspect ratio"
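For intuition about what "zooming while locking the aspect ratio" means geometrically, the sketch below computes the centered region that a 16:9 zoom would cover in the source image. This is an illustration only: `centered_crop_box` is a hypothetical helper describing a plain deterministic crop, whereas the model re-renders the image rather than cropping pixels, and it may choose a different region based on your prompt.

```python
def centered_crop_box(width, height, ratio_w=16, ratio_h=9, zoom=1.0):
    """Return (left, top, right, bottom) of a centered ratio_w:ratio_h region.

    zoom=1.0 gives the largest region of that ratio that fits in the image;
    larger values shrink the region, which corresponds to zooming in.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if zoom < 1.0:
        raise ValueError("zoom must be at least 1.0")
    # Largest box with the target ratio that fits inside the image.
    box_w = min(width, height * ratio_w / ratio_h)
    box_h = box_w * ratio_h / ratio_w
    # Shrink both sides by the same factor so the ratio stays locked.
    box_w /= zoom
    box_h /= zoom
    left = (width - box_w) / 2
    top = (height - box_h) / 2
    return (round(left), round(top), round(left + box_w), round(top + box_h))


print(centered_crop_box(1920, 1080, zoom=2.0))
print(centered_crop_box(1000, 1000))
```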


Conclusion

The Nano Banana family covers the full range of image creation and editing needs within Scenario. Gemini 3.1 is the recommended default for most workflows: it accepts up to 14 reference images, outputs at resolutions from 512 to 4K, and supports Google Search, while remaining fast and credit-efficient. Gemini 3.0 Pro is the right choice when the work demands maximum resolution, multi-reference identity consistency, or factual grounding through Google Search. Gemini 2.5 remains available for teams with existing workflows built around it.

Across all three models, the quality of your prompt remains the single biggest factor in output quality. Clear, specific instructions with well-chosen reference images will consistently outperform elaborate settings with a vague prompt.