Google Gemini: Complete List of All Models Available Across Pro, Flash, Flash-Lite, Live, Audio and Preview Variants
- Graziano Stefanelli
- 4 hours ago

Google Gemini provides a wide catalogue of multimodal models spanning high-reasoning tiers, cost-efficient inference variants, real-time audio and streaming modes, and experimental preview releases integrated across Google AI Studio, the Gemini API and Vertex AI.
This multilayered ecosystem includes flagship Pro models for advanced reasoning, Flash models for balanced performance, Flash-Lite models optimized for scale and budget, Live and Audio models for speech and streaming workflows, and specialised preview options made available during early release cycles.
Knowing the full list of available Gemini models helps developers choose the right model for their reasoning needs, multimodal inputs, latency targets, throughput constraints or integration with Google Cloud production systems.
··········
··········
Gemini organizes its ecosystem into Pro, Flash, Flash-Lite, Live, Audio and Preview categories, each designed for specific performance and multimodal requirements.
The structural division of Gemini models reflects Google’s modular approach to generative AI, where each family emphasizes a performance dimension such as reasoning, speed, throughput, or real-time multimodal interaction.
Gemini Pro models act as the flagship tier, delivering long-context reasoning, code execution and document-level interpretation across text, images, audio, video and PDFs.
Gemini Flash models maintain strong capability at reduced cost and latency, supporting general instruction tasks, multipurpose chat, consumer-facing applications and medium-complexity multimodal reasoning.
Gemini Flash-Lite models offer maximum throughput and minimal latency for large-scale workloads, summarization pipelines, classification tasks, high-volume automation and constrained budget environments.
Real-time interaction is supported by Gemini Live and Audio variants capable of continuous audio processing, text-to-speech (TTS), voice dialog and low-latency multimodal streaming.
Preview releases extend this catalogue with early-access models, such as Gemini 3 Pro, that let developers test upcoming architectures.
·····
Gemini Model Families
Model Group | Primary Purpose | Key Multimodal Behavior |
Gemini Pro | High-grade reasoning | Full multimodal inputs |
Gemini Flash | Balanced latency/cost | Agile multimodal |
Gemini Flash-Lite | Scaled throughput | Lightweight multimodal |
Gemini Live / Audio | Streaming | Real-time voice + audio |
Preview Releases | Experimental | Mode-specific capabilities |
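The family table above can be encoded as a small lookup with a rule-of-thumb selector. This is only a sketch: the `Family` class, the `pick_family` helper and the priority order of its checks are hypothetical illustrations, not part of any Google SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    name: str
    purpose: str
    multimodal: str


# Rows copied from the "Gemini Model Families" table.
FAMILIES = {
    "pro": Family("Gemini Pro", "High-grade reasoning", "Full multimodal inputs"),
    "flash": Family("Gemini Flash", "Balanced latency/cost", "Agile multimodal"),
    "flash-lite": Family("Gemini Flash-Lite", "Scaled throughput", "Lightweight multimodal"),
    "live": Family("Gemini Live / Audio", "Streaming", "Real-time voice + audio"),
    "preview": Family("Preview Releases", "Experimental", "Mode-specific capabilities"),
}


def pick_family(*, realtime_audio: bool = False,
                deep_reasoning: bool = False,
                high_volume: bool = False) -> Family:
    """Hypothetical rule of thumb; the priority order is an assumption."""
    if realtime_audio:
        return FAMILIES["live"]
    if deep_reasoning:
        return FAMILIES["pro"]
    if high_volume:
        return FAMILIES["flash-lite"]
    # Default to the balanced tier when no special requirement is stated.
    return FAMILIES["flash"]


print(pick_family(high_volume=True).name)  # prints "Gemini Flash-Lite"
```

Preview releases are deliberately left out of the selector, since they are an opt-in choice rather than a workload category.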
··········
··········
Gemini 2.5 models form the current production foundation, offering strong multimodal reasoning across Pro, Flash and Flash-Lite tiers.
The Gemini 2.5 Pro family provides Google’s most advanced stable multimodal architecture, designed for deep reasoning, multi-document interpretation, code analysis, mathematical workflows and high-fidelity cross-modal alignment across all supported modalities.
Gemini 2.5 Flash introduces lower-latency inference with competitive reasoning quality, optimized for everyday tasks, interactive chat, rapid prototyping and general-purpose multimodal analysis.
Gemini 2.5 Flash-Lite prioritizes throughput and cost efficiency, providing the best option for summarization, classification, high-volume request patterns and large-scale distributed applications where operational cost is critical.
Earlier 2.0-generation models remain accessible for compatibility and benchmarking, though the 2.5 variants provide significant improvements in multimodal accuracy and consistency.
·····
Gemini 2.5 Model Availability
Model Name | Capability Level | Modalities Supported |
Gemini 2.5 Pro | Flagship reasoning | Text, image, audio, video, PDF |
Gemini 2.5 Flash | Speed + accuracy | Multimodal |
Gemini 2.5 Flash-Lite | Budget/high-volume | Text + light multimodal |
Gemini 2.0 Flash | Legacy | Text, image, audio, video |
Gemini 2.0 Flash-Lite | Legacy | Text-first |
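One way to exploit the cost and capability spread between the 2.5 tiers is a cascade: try the cheapest tier first and escalate only when the answer fails a check. The sketch below assumes the model IDs follow the Gemini API naming pattern (verify them against the current model list), and both `call` and `accept` are hypothetical callables supplied by the application, so no real API is contacted here.

```python
from typing import Callable, List, Tuple

# Cheapest tier first. These IDs are assumptions; check the current model list.
CASCADE: List[str] = [
    "gemini-2.5-flash-lite",
    "gemini-2.5-flash",
    "gemini-2.5-pro",
]


def run_cascade(prompt: str,
                call: Callable[[str, str], str],
                accept: Callable[[str], bool]) -> Tuple[str, str]:
    """Return (model_id, answer) from the first tier whose answer passes `accept`.

    If no tier passes, the answer from the last (most capable) tier is returned.
    """
    model, answer = CASCADE[0], ""
    for model in CASCADE:
        answer = call(model, prompt)  # `call` wraps whatever client is in use
        if accept(answer):
            break
    return model, answer


# Usage with a stand-in client: the Lite tier gives an answer that is too short.
def fake_call(model: str, prompt: str) -> str:
    return "short" if model.endswith("flash-lite") else "a longer, usable answer"


print(run_cascade("Summarize this report", fake_call, lambda a: len(a) > 10))
```

Whether a cascade saves money depends on how often the cheap tier is rejected, so the acceptance check deserves as much attention as the tier order.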
··········
··········
Gemini 3 Pro introduces the newest generation of large-scale multimodal reasoning with expanded context and early-access performance previews.
The Gemini 3 Pro Preview model represents Google’s next-generation architecture, combining improved context handling, more refined multimodal alignment, stronger reasoning stability and advanced extraction capabilities across extended document sets.
Gemini 3 Pro supports unified processing across text, images, video, audio and PDFs, providing tighter cross-modal consistency and improved reliability in complex multimodal environments.
Because it is a preview model, its performance characteristics, pricing, modality coverage and maximum generation limits may change during the rollout period. In exchange, developers get early access to evaluate the architecture before full release.
The introduction of Gemini 3 Pro marks the next step toward deeper reasoning, higher memory capacity and more accurate multimodal grounding within Google’s expanding generative AI ecosystem.
·····
Gemini 3 Model Availability
Model Name | Status | Capability Focus |
Gemini 3 Pro | Preview | Deep multimodal reasoning |
Gemini 3 Pro (Future GA) | Upcoming | Full multimodal fidelity |
Gemini 3 Extensions | Experimental | Modality enhancements |
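Because preview models can change or disappear, an application may prefer to resolve its model at startup and fall back to a stable tier. The following is a minimal sketch under stated assumptions: the preview ID `gemini-3-pro-preview`, the environment variable `APP_GEMINI_MODEL` and the `resolve_model` helper are all illustrative, and the `available` set would come from whatever model listing the deployment uses.

```python
import os
from typing import Iterable, Mapping, Optional

PREVIEW_MODEL = "gemini-3-pro-preview"  # assumed preview ID; may change
STABLE_MODEL = "gemini-2.5-pro"         # assumed stable fallback ID


def resolve_model(available: Iterable[str],
                  env: Optional[Mapping[str, str]] = None) -> str:
    """Pick an explicit override, then the preview, then the stable model."""
    env = os.environ if env is None else env
    available = set(available)
    override = env.get("APP_GEMINI_MODEL")  # hypothetical variable name
    for candidate in (override, PREVIEW_MODEL, STABLE_MODEL):
        if candidate and candidate in available:
            return candidate
    raise LookupError("none of the candidate models is available")


# The preview is not offered in this (made-up) deployment, so the stable ID wins.
print(resolve_model({"gemini-2.5-pro", "gemini-2.5-flash"}, env={}))
```

Keeping the fallback explicit means a withdrawn preview degrades the application to the stable tier instead of failing requests.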
··········
··········
Live, audio and streaming variants extend Gemini into real-time multimodal interaction across continuous inputs and dynamic conversational experiences.
Gemini Live Flash models enable real-time audio and streaming workloads, offering voice dialog, continuous mic input, waveform analysis, audio reasoning and dynamic cross-modal alignment across speech and text.
Additional audio dialog and TTS preview models support voice assistants, narration systems, interactive applications and multimodal agents requiring real-time perception capabilities.
Developers integrating real-time input pipelines in mobile apps, web apps or dedicated devices benefit from the low-latency streaming architecture provided by these models.
Google also offers preview streaming modes that combine real-time speech transcription, audio understanding and instantaneous text-to-speech generation, expanding Gemini’s interactive reach.
·····
Gemini Live and Audio Variants
Model Variant | Primary Function | Supported Tasks |
Gemini Live Flash | Real-time multimodal | Voice + streaming reasoning |
Audio Dialog Models | Speech understanding | Natural conversation |
TTS Preview Models | Output generation | Voice synthesis |
Live Preview Modes | Experimental | Continuous input flows |
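Streaming audio to a real-time model usually means sending small fixed-duration frames rather than a whole recording. The sketch below only shows the framing step and assumes 16 kHz, 16-bit mono PCM input; confirm the required format in the Live API documentation. The `pcm_chunks` helper is hypothetical and does not talk to any service.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # assumed input sample rate
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


def pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive frames of roughly `chunk_ms` milliseconds each."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        # The final frame may be shorter than the others.
        yield pcm[start:start + chunk_bytes]


one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # one second of silence
print([len(chunk) for chunk in pcm_chunks(one_second, chunk_ms=250)])
# prints [8000, 8000, 8000, 8000]
```

Smaller frames lower the delay before the model hears speech but increase the number of messages sent, so the frame length is a latency versus overhead trade-off.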
··········
··········
Specialized, experimental and preview models expand the Gemini catalogue with early-access features for developers evaluating next-generation functionalities.
Beyond production models, Google regularly deploys experimental variants to Google AI Studio and the Gemini API, allowing developers to test advanced features before they are stabilized for general availability.
Preview models may include early multimodal upgrades, extended context behaviors, new attention mechanisms, domain-specific optimization and emerging safety or constraint frameworks intended for enterprise integration.
The preview architecture lets developers benchmark performance, adapt system prompts, refine toolchains and explore next-generation multimodal consistency ahead of formal rollout.
These experimental modes form part of Google’s iterative release strategy, ensuring a steady progression from stable production models to increasingly capable future generations.
·····
Preview and Experimental Catalogue
Model Category | Release Status | Intended Use |
Gemini 3 Previews | Limited rollout | Architecture testing |
Flash Preview Variants | Experimental | Latency evaluation |
Audio/TTS Previews | Mode-specific | Real-time pipelines |
Multimodal Extensions | Emerging | Prototype features |
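Evaluating a preview variant against a production model for latency can start with a very small timing harness. This is a sketch, not a rigorous benchmark: `benchmark` and the injected `call` function are hypothetical, the model IDs are placeholders, and a real comparison would also need warm-up runs, larger samples and output-quality checks.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def benchmark(models: Iterable[str],
              call: Callable[[str, str], str],
              prompt: str,
              runs: int = 3) -> Dict[str, float]:
    """Return the median wall-clock seconds per request for each model."""
    results: Dict[str, float] = {}
    for model in models:
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            call(model, prompt)  # `call` wraps whatever client is in use
            timings.append(time.perf_counter() - start)
        results[model] = statistics.median(timings)
    return results


# Stand-in client so the harness runs without network access.
def fake_call(model: str, prompt: str) -> str:
    return f"{model}: ok"


timings = benchmark(["stable-model", "preview-model"], fake_call, "ping")
for model, seconds in timings.items():
    print(f"{model}: {seconds:.6f} s")
```

The median is used instead of the mean so that a single slow request does not dominate the comparison.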
··········
DATA STUDIOS
··········



