Google Gemini: Complete List of All Models Available Across Pro, Flash, Flash-Lite, Live, Audio and Preview Variants

Google Gemini provides a wide catalogue of multimodal models spanning high-reasoning tiers, cost-efficient inference variants, real-time audio and streaming modes, and experimental preview releases integrated across Google AI Studio, the Gemini API and Vertex AI.

The catalogue includes flagship Pro models for advanced reasoning, Flash models for balanced performance, Flash-Lite models optimized for scale and budget, Live and Audio models for speech and streaming workflows, and specialized preview options made available during early release cycles.

Knowing the full list of available Gemini models helps developers choose the right model for their reasoning needs, multimodal inputs, latency requirements, throughput constraints or integration with Google Cloud production systems.

··········

Gemini organizes its ecosystem into Pro, Flash, Flash-Lite, Live, Audio and Preview categories, each designed for specific performance and multimodal requirements.

The structural division of Gemini models reflects Google’s modular approach to generative AI, where each family emphasizes a performance dimension such as reasoning, speed, throughput, or real-time multimodal interaction.

Gemini Pro models are the flagship tier, handling long-context reasoning, code execution and document-level interpretation across text, images, audio, video and PDFs.

Gemini Flash models maintain strong capability at reduced cost and latency, supporting general instruction tasks, multipurpose chat, consumer-facing applications and medium-complexity multimodal reasoning.

Gemini Flash-Lite models offer maximum throughput and minimal latency for large-scale workloads, summarization pipelines, classification tasks, high-volume automation and constrained budget environments.

Real-time interaction is supported by Gemini Live and Audio variants, which handle continuous audio processing, text-to-speech (TTS), voice dialog and low-latency multimodal streaming.

Preview releases extend this catalogue with early-access modes that allow developers to test upcoming architectures such as Gemini 3 Pro.

·····

Gemini Model Families

| Model Group | Primary Purpose | Key Multimodal Behavior |
| --- | --- | --- |
| Gemini Pro | High-grade reasoning | Full multimodal inputs |
| Gemini Flash | Balanced latency/cost | Agile multimodal |
| Gemini Flash-Lite | Scaled throughput | Lightweight multimodal |
| Gemini Live / Audio | Streaming | Real-time voice + audio |
| Preview Releases | Experimental | Mode-specific capabilities |
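
The family split listed above can be read as a simple decision rule. The sketch below is one way to encode it, not an official selection algorithm: the `Workload` fields and the priority order (real-time first, then reasoning, then volume) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what an application needs."""
    realtime_audio: bool = False   # continuous voice or streaming input
    deep_reasoning: bool = False   # long-context or multi-step reasoning
    high_volume: bool = False      # cost and throughput matter most


def pick_family(w: Workload) -> str:
    """Return the Gemini family that best matches the workload."""
    if w.realtime_audio:
        return "Gemini Live / Audio"
    if w.deep_reasoning:
        return "Gemini Pro"
    if w.high_volume:
        return "Gemini Flash-Lite"
    return "Gemini Flash"  # balanced default when nothing else dominates


print(pick_family(Workload(high_volume=True)))  # Gemini Flash-Lite
```

In practice the order of the checks is a product decision: a high-volume pipeline that also needs deep reasoning might still prefer Flash over Pro for cost reasons.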

··········

Gemini 2.5 models form the current production foundation, offering strong multimodal reasoning across Pro, Flash and Flash-Lite tiers.

The Gemini 2.5 Pro family is Google’s most advanced stable multimodal model line, designed for deep reasoning, multi-document interpretation, code analysis and mathematical workflows across all supported modalities.

Gemini 2.5 Flash introduces lower-latency inference with competitive reasoning quality, optimized for everyday tasks, interactive chat, rapid prototyping and general-purpose multimodal analysis.

Gemini 2.5 Flash-Lite prioritizes throughput and cost efficiency, which makes it the natural choice for summarization, classification, high-volume request patterns and large-scale distributed applications where operational cost is critical.

Earlier 2.0 generation models remain accessible for compatibility and benchmarking, though 2.5 variants provide significant improvements in multimodal accuracy and consistency.

·····

Gemini 2.5 Model Availability

| Model Name | Capability Level | Modalities Supported |
| --- | --- | --- |
| Gemini 2.5 Pro | Flagship reasoning | Text, image, audio, video, PDF |
| Gemini 2.5 Flash | Speed + accuracy | Multimodal |
| Gemini 2.5 Flash-Lite | Budget/high-volume | Text + light multimodal |
| Gemini 2.0 Flash | Legacy | Image + text |
| Gemini 2.0 Flash-Lite | Legacy | Text-first |
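
To show how a model name from this list translates into an API call, here is a minimal sketch that builds (but does not send) a `generateContent` request using only the Python standard library. The base URL, the `x-goog-api-key` header and the model ID strings are assumptions drawn from the public Gemini API conventions and are not stated in this article, so check them against the current API reference before use.

```python
import json
import urllib.request

# Assumed values: verify against the current Gemini API reference.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL_IDS = {
    "Gemini 2.5 Pro": "gemini-2.5-pro",
    "Gemini 2.5 Flash": "gemini-2.5-flash",
    "Gemini 2.5 Flash-Lite": "gemini-2.5-flash-lite",
}


def build_request(model_name: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a text-only generateContent request for one model (nothing is sent)."""
    model_id = MODEL_IDS[model_name]
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model_id}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_request("Gemini 2.5 Flash", "Summarize this paragraph.", api_key="YOUR_API_KEY")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` with a valid key would send the call. Switching tiers is then only a matter of changing the model name, which makes it easy to compare Pro, Flash and Flash-Lite on the same prompt.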

··········

Gemini 3 Pro introduces the newest generation of large-scale multimodal reasoning with expanded context and early-access performance previews.

The Gemini 3 Pro Preview model represents Google’s next-generation architecture, combining improved context handling, more refined multimodal alignment, stronger reasoning stability and advanced extraction capabilities across extended document sets.

Gemini 3 Pro supports unified processing across text, images, video, audio and PDFs, providing tighter cross-modal consistency and improved reliability in complex multimodal environments.

Because it is a preview model, its performance characteristics, cost behavior, modality coverage and maximum generation limits may change during the rollout period. In exchange, developers gain early access to evaluate the architecture improvements before full release.
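
Since preview behavior can change, production code often keeps a stable model as a fallback. The sketch below illustrates that pattern under stated assumptions: `call_model` is a hypothetical callable standing in for a real API client, and the model IDs `gemini-3-pro-preview` and `gemini-2.5-pro` are assumed identifiers rather than names confirmed by this article.

```python
from typing import Callable, Optional, Sequence, Tuple


def generate_with_fallback(
    prompt: str,
    model_ids: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_id, output) from the first success."""
    last_error: Optional[Exception] = None
    for model_id in model_ids:
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
    raise RuntimeError("all models failed") from last_error


# Stand-in for a real API call: the preview model is "unavailable" here.
def fake_call(model_id: str, prompt: str) -> str:
    if "preview" in model_id:
        raise ConnectionError("preview model unavailable")
    return f"[{model_id}] {prompt}"


used, text = generate_with_fallback(
    "Hello", ["gemini-3-pro-preview", "gemini-2.5-pro"], fake_call
)
print(used)  # gemini-2.5-pro
```

Returning the model ID alongside the output lets callers log which tier actually served the request, which is useful when comparing preview and stable results.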

The introduction of Gemini 3 Pro marks the next step toward deeper reasoning, higher memory capacity and more accurate multimodal grounding within Google’s expanding generative AI ecosystem.

·····

Gemini 3 Model Availability

| Model Name | Status | Capability Focus |
| --- | --- | --- |
| Gemini 3 Pro | Preview | Deep multimodal reasoning |
| Gemini 3 Pro (Future GA) | Upcoming | Full multimodal fidelity |
| Gemini 3 Extensions | Experimental | Modality enhancements |

··········

Live, audio and streaming variants extend Gemini into real-time multimodal interaction across continuous inputs and dynamic conversational experiences.

Gemini Live Flash models enable real-time audio and streaming workloads, offering voice dialog, continuous mic input, waveform analysis, audio reasoning and dynamic cross-modal alignment across speech and text.

Additional audio dialog and TTS preview models support voice assistants, narration systems, interactive applications and multimodal agents requiring real-time perception capabilities.

Developers integrating real-time input pipelines in mobile apps, web apps or dedicated devices benefit from the low-latency streaming architecture provided by these models.

Google also offers preview streaming modes that combine real-time speech transcription, audio understanding and instantaneous text-to-speech generation, expanding Gemini’s interactive reach.
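
Low-latency streaming usually means sending audio in small fixed-duration chunks rather than as one file. The sketch below shows only that chunking step. The 16 kHz, 16-bit mono PCM format and the 100 ms chunk size are assumptions chosen for illustration, and no Gemini Live connection is opened here.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # assumed input sample rate
BYTES_PER_SAMPLE = 2      # assumed 16-bit mono PCM


def pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive slices of raw PCM audio, each covering chunk_ms of sound."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    if chunk_bytes <= 0:
        raise ValueError("chunk_ms must be positive")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]  # the final chunk may be shorter


one_second_of_silence = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
chunks = list(pcm_chunks(one_second_of_silence))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Smaller chunks reduce the delay before the model hears each piece of speech but increase the number of messages sent, so the chunk duration is a latency versus overhead trade-off.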

·····

Gemini Live and Audio Variants

| Model Variant | Primary Function | Supported Tasks |
| --- | --- | --- |
| Gemini Live Flash | Real-time multimodal | Voice + streaming reasoning |
| Audio Dialog Models | Speech understanding | Natural conversation |
| TTS Preview Models | Output generation | Voice synthesis |
| Live Preview Modes | Experimental | Continuous input flows |

··········

Specialized, experimental and preview models expand the Gemini catalogue with early-access features for developers evaluating next-generation functionalities.

Beyond production models, Google regularly deploys experimental variants to Google AI Studio and the Gemini API, allowing developers to test advanced features before they are stabilized for general availability.

Preview models may include early multimodal upgrades, extended context behaviors, new attention mechanisms, domain-specific optimization and emerging safety or constraint frameworks intended for enterprise integration.

Preview access lets developers benchmark performance, adapt system prompts, refine toolchains and explore next-generation multimodal consistency ahead of formal rollout.
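
Benchmarking a preview model against a stable one can start with a very small harness. The sketch below measures median wall-clock latency per model. The two model functions are stand-ins that only sleep, so the printed numbers say nothing about real Gemini performance; in practice each callable would wrap a real API client.

```python
import statistics
import time
from typing import Callable, Dict, List


def benchmark(
    models: Dict[str, Callable[[str], str]],
    prompts: List[str],
) -> Dict[str, float]:
    """Return the median wall-clock latency in seconds for each model."""
    results: Dict[str, float] = {}
    for name, call in models.items():
        latencies = []
        for prompt in prompts:
            start = time.perf_counter()
            call(prompt)
            latencies.append(time.perf_counter() - start)
        results[name] = statistics.median(latencies)
    return results


# Stand-ins for real clients; the sleeps simulate different response times.
def slow_model(prompt: str) -> str:
    time.sleep(0.02)
    return prompt.upper()


def fast_model(prompt: str) -> str:
    time.sleep(0.005)
    return prompt.upper()


report = benchmark({"stable": slow_model, "preview": fast_model}, ["a", "b", "c"])
for name, seconds in sorted(report.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds * 1000:.1f} ms")
```

The median is used instead of the mean so that a single slow request does not dominate the comparison. A fuller evaluation would also score output quality on the same prompts.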

These experimental modes form part of Google’s iterative release strategy, ensuring a steady progression from stable production models to increasingly capable future generations.

·····

Preview and Experimental Catalogue

| Model Category | Release Status | Intended Use |
| --- | --- | --- |
| Gemini 3 Previews | Limited rollout | Architecture testing |
| Flash Preview Variants | Experimental | Latency evaluation |
| Audio/TTS Previews | Mode-specific | Real-time pipelines |
| Multimodal Extensions | Emerging | Prototype features |

DATA STUDIOS
