ChatGPT: Updated list of available models, capabilities, and pricing (August 2025)
- Graziano Stefanelli
- Aug 23
- 4 min read

OpenAI’s catalog of available ChatGPT models has expanded significantly. It now spans GPT-5, GPT-4 variants, the o-series reasoning models, the multimodal o4-mini line, research-preview releases such as GPT-4.5, and lightweight open-weight models suited to on-device and edge inference. The lineup balances reasoning depth, multimodality, latency, and cost, giving developers and enterprise teams flexibility when integrating ChatGPT into their workflows.
GPT-5 models redefine flagship capabilities.
The GPT-5 family marks OpenAI’s most advanced release, combining multimodal input (text, image, and audio), improved reasoning, and agentic orchestration for complex tasks. It introduces enhanced context handling up to 256,000 tokens, extending usability for research, analytics, and enterprise-scale datasets.
Key enhancements introduced with GPT-5.
Expanded multimodal support with better image and table parsing.
Planning and decision-making improvements for multi-step workflows.
Improved consistency in responses across large datasets.
Full availability across ChatGPT Plus, Team, and Enterprise subscriptions.
Wider tool integration support for structured outputs and function calls.
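The structured-output and function-call support listed above can be sketched with the standard Chat Completions `tools` schema. This payload-only example (no network call) uses the `gpt-5` model name from the table below; the tool name and its fields are hypothetical placeholders.

```python
# Sketch of a function-call (tool) definition in the standard OpenAI
# "tools" request format. No API call is made; this only assembles the
# request body a client SDK would send.

def build_tool_call_request(model: str, user_prompt: str) -> dict:
    """Assemble a request payload that exposes one callable tool."""
    extract_invoice = {
        "type": "function",
        "function": {
            "name": "extract_invoice_fields",  # hypothetical tool name
            "description": "Pull structured fields out of an invoice.",
            "parameters": {
                "type": "object",
                "properties": {
                    "vendor": {"type": "string"},
                    "total": {"type": "number"},
                    "currency": {"type": "string"},
                },
                "required": ["vendor", "total"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [extract_invoice],
        "tool_choice": "auto",  # let the model decide when to call the tool
    }

payload = build_tool_call_request("gpt-5", "Vendor: Acme, total 120.50 EUR")
```

The model then returns either a plain message or a `tool_calls` entry whose arguments conform to the JSON schema above.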
| Model | Context Window | Use Case | Pricing Status |
| --- | --- | --- | --- |
| GPT-5 | 256,000 tokens | Flagship multimodal reasoning | Pending |
| GPT-5 Pro | 256,000 tokens | Deeper structured orchestration | Enterprise-tier uplift |
GPT-4o models continue leading in speed and multimodality.
The GPT-4o generation, including its smaller variants, remains the primary production family for fast, cost-effective multimodal experiences. These models enable text, image, and table-based tasks while maintaining high performance for real-time chat.
| Model | Context Window | Latency | Main Use | Pricing (per 1,000 tokens, input / output) |
| --- | --- | --- | --- | --- |
| GPT-4o | 128,000 tokens | Fast | Balanced multimodal workflows | $0.18 / $0.54 |
| GPT-4o Mini | 64,000 tokens | Very low | Lightweight chat, summarization | $0.09 / $0.27 |
| GPT-4o Nano 20B | 8,000 tokens | Ultra-low | On-device inference kit | Licensed OEM |
| GPT-4o Vision-Assist | 64,000 tokens | Mid | Image-to-JSON structured parsing | 1.2× GPT-4o |
Details on Nano and Vision-Assist capabilities.
The GPT-4o Nano 20B model targets edge deployments with full local inference on supported hardware such as Pixel 10 devices, while the Vision-Assist beta supports table extraction, structured image parsing, and field-level labeling, ideal for document-heavy workflows.
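A Vision-Assist-style request can be sketched with the standard OpenAI multimodal content-part format. This is a payload-only illustration: the model id `gpt-4o-vision-assist` is an assumed placeholder (check the catalog for the beta's exact identifier), and the image URL is an example.

```python
# Hedged sketch: a multimodal message in the standard OpenAI content-part
# format, asking for table extraction as JSON. Payload only; no API call.

def build_table_extraction_request(image_url: str) -> dict:
    return {
        "model": "gpt-4o-vision-assist",  # hypothetical id for the beta
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Extract every table in this image as JSON rows."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        # Constrain the reply to valid JSON for downstream parsing.
        "response_format": {"type": "json_object"},
    }

req = build_table_extraction_request("https://example.com/invoice.png")
```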
GPT-4.1 and GPT-4 Turbo balance cost and performance.
GPT-4.1 bridges high-precision reasoning and affordable usage, with contexts up to 1,000,000 tokens, which is useful for developers who need fine-grained control without sacrificing reliability.
GPT-4 Turbo, meanwhile, remains attractive for workloads that need stable performance at lower cost, having recently received a price drop.
| Model | Context Window | Primary Focus | Pricing (per 1,000 tokens, input / output) |
| --- | --- | --- | --- |
| GPT-4.1 | 1,000,000 tokens | Coding, analysis, research | $0.24 / $0.48 |
| GPT-4.1 Mini | 512,000 tokens | Middle-tier workflows | $0.12 / $0.32 |
| GPT-4 Turbo | 128,000 tokens | Backward compatibility | $0.24 / $0.48 |
The o-series models redefine structured reasoning.
OpenAI’s o-series introduces a new generation of models optimized for multi-step reasoning, chain-of-thought orchestration, and structured decision-making. Unlike standard GPT models, these variants prioritize analytical depth over speed, allowing developers and enterprises to handle more complex workflows with controlled token budgeting.
Key improvements introduced with o-series models.
Adaptive reasoning levels configurable per request, supporting minimal, low, medium, and high effort modes.
Long-context reasoning up to 200,000 tokens with better consistency across extended analytical chains.
Native support for function calling and tool-driven orchestration.
Optimized pricing tiers to balance cost, latency, and reasoning depth.
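The per-request effort modes listed above map onto a reasoning-effort parameter in the request body. A payload-only sketch, assuming the `reasoning_effort` field name used by recent OpenAI Chat Completions SDKs; consult the API reference for the exact name on your SDK version.

```python
# Build a reasoning request with an explicit effort mode. Payload only;
# no network call is made.

VALID_EFFORTS = {"minimal", "low", "medium", "high"}

def build_reasoning_request(model: str, prompt: str, effort: str) -> dict:
    """Return a request body pinning the reasoning-effort mode."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"unknown effort mode: {effort}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # assumed field name; verify per SDK
    }

req = build_reasoning_request("o3", "Plan a 5-step data migration.", "high")
```

Higher effort modes spend more reasoning tokens per request, so the token budgeting mentioned above applies directly to this knob.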
| Model | Context Window | Reasoning Effort | Use Case | Pricing (per 1M tokens, input / output) |
| --- | --- | --- | --- | --- |
| o1 | 200,000 tokens | Configurable | Advanced research, coding, data modeling | $2.00 / $8.00 |
| o1-mini | 128,000 tokens | Medium | Cost-efficient structured tasks | $1.10 / $4.40 |
| o1-pro | 200,000 tokens | High | Deep technical orchestration | $20.00 / $80.00 |
| o3 | 200,000 tokens | Configurable | Balanced reasoning + multimodal support | $2.00 / $8.00 |
| o3-mini | 200,000 tokens | Low-medium | High-volume production workloads | $1.10 / $4.40 |
| o3-mini-high | 200,000 tokens | High | Extended chain-of-thought tasks | $1.10 / $4.40 |
| o3-pro | 200,000 tokens | Maximum | Competitive research, multi-agent tasks | $20.00 / $80.00 |
These models are especially relevant for enterprises deploying agentic multi-step reasoning pipelines or applications requiring high output quality with predictable orchestration costs.
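Predictable orchestration costs come down to simple arithmetic: cost scales linearly with tokens at the model's input and output rates. A small helper, with placeholder rates; plug in the current figures and billing unit for your model.

```python
# Per-request cost: tokens / unit_size * unit_rate, summed over the
# input and output sides. Rates are illustrative placeholders.

def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float,
                 unit: int = 1_000_000) -> float:
    """Dollar cost of one request given per-`unit`-token rates."""
    return (input_tokens / unit) * input_rate \
         + (output_tokens / unit) * output_rate

# e.g. 12,000 input + 3,000 output tokens at $2.00 / $8.00 per million:
cost = request_cost(12_000, 3_000, 2.00, 8.00)  # 0.024 + 0.024 = $0.048
```

Note that with reasoning models, chain-of-thought tokens are billed as output, so high effort modes raise the output term even when the visible answer is short.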
The o4-mini series focuses on lightweight multimodal reasoning.
The o4-mini models expand the OpenAI catalog with compact, multimodal-optimized architectures that balance speed, reasoning power, and cost. They offer full support for images, text, and table parsing while maintaining sub-second latency for smaller workloads.
Key enhancements introduced with o4-mini models.
Improved vision-to-JSON structured parsing.
Lowered inference costs while retaining access to high reasoning effort modes.
Support for both tool invocation and real-time multimodal workflows.
| Model | Context Window | Reasoning Effort | Capabilities | Pricing (per 1M tokens, input / output) |
| --- | --- | --- | --- | --- |
| o4-mini | 128,000 tokens | Configurable | Balanced multimodal processing | $1.10 / $4.40 |
| o4-mini-high | 128,000 tokens | High | Deeper reasoning in edge tasks | $1.10 / $4.40 |
This makes the o4-mini line especially suitable for enterprise dashboards, structured analytics, and real-time document parsing in environments where latency constraints are critical.
GPT-4.5 expands creativity and pattern recognition.
GPT-4.5 launched as a research preview, designed for users who need improved pattern detection, creative synthesis, and faster context adaptation across long sessions. Unlike GPT-4 Turbo, it focuses less on transactional chat performance and more on exploratory ideation.
| Model | Context Window | Primary Use | Pricing (per 1M tokens, input / output) | Availability |
| --- | --- | --- | --- | --- |
| GPT-4.5 | 128,000 tokens | Brainstorming, complex ideation, high-context summarization | $75.00 / $150.00 | Research preview |
Currently, GPT-4.5 is available under a limited access tier via ChatGPT Pro and Azure Foundry, making it more suitable for R&D teams rather than mainstream enterprise adoption.
GPT-5 mini and GPT-5 nano enable scalable multimodal inference.
In addition to the flagship GPT-5 and GPT-5 Pro, OpenAI has introduced lighter GPT-5 derivatives optimized for speed and affordability while retaining multimodal capabilities.
| Model | Context Window | Latency | Use Case | Pricing (per 1M tokens, input / output) |
| --- | --- | --- | --- | --- |
| GPT-5 | 256,000 tokens | Mid-range | Enterprise-scale reasoning | $1.25 / $10.00 |
| GPT-5 mini | 128,000 tokens | Low | High-volume lightweight workflows | $0.25 / $2.00 |
| GPT-5 nano | 64,000 tokens | Ultra-low | Mobile inference, edge integrations | $0.05 / $0.40 |
These smaller GPT-5 variants automatically integrate into OpenAI’s routing system, allowing developers to scale workloads dynamically without manually switching between model endpoints.
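For stacks that pin explicit endpoints rather than relying on OpenAI's automatic routing, the tiering above can be approximated client-side. The thresholds below are illustrative, derived from the context windows and latency tiers in the table.

```python
# Client-side approximation of GPT-5 tier routing: pick the smallest
# variant whose context window covers the prompt, preferring the
# lighter tiers when latency matters. Thresholds are illustrative.

def pick_gpt5_variant(prompt_tokens: int, latency_sensitive: bool) -> str:
    if latency_sensitive and prompt_tokens <= 64_000:
        return "gpt-5-nano"   # ultra-low latency, smallest window
    if latency_sensitive and prompt_tokens <= 128_000:
        return "gpt-5-mini"   # low latency, mid-size window
    return "gpt-5"            # flagship covers the full 256k context
```

A real router would also weigh output length, reasoning depth, and per-token cost, but window-plus-latency is the minimal viable policy.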
GPT-3.5 Turbo transitions to legacy but remains relevant.
While GPT-3.5 Turbo is scheduled for deprecation in February 2026, it still supports fine-tuning projects and lightweight chat experiences where cost efficiency is the primary driver.
| Model | Context Window | Availability | Pricing (per 1,000 tokens, input / output) |
| --- | --- | --- | --- |
| GPT-3.5 Turbo | 16,000 tokens | Legacy until Feb 2026 | $0.005 / $0.015 |
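Starting a fine-tuning project against the legacy model follows the `/v1/fine_tuning/jobs` request shape. A payload-only sketch: the training-file id is a placeholder for a JSONL file already uploaded through the Files API.

```python
# Sketch of a fine-tuning job request body for the legacy model.
# Payload only; a client SDK would POST this to /v1/fine_tuning/jobs.

def build_finetune_job(training_file_id: str) -> dict:
    return {
        "model": "gpt-3.5-turbo",
        "training_file": training_file_id,   # id from the Files API
        "hyperparameters": {"n_epochs": 3},  # example setting
    }

job = build_finetune_job("file-XXXX")  # placeholder file id
```

Given the February 2026 deprecation, plan to re-run such jobs against a supported successor model before the cutoff.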
Open-weight GPT OSS models enable local deployment.
OpenAI has also introduced GPT OSS models with 20B and 120B parameters, providing developers with self-hosted capabilities for scenarios requiring full control of inference, privacy, and latency.
| Model | Context Window | Parameters | Deployment Type | License |
| --- | --- | --- | --- | --- |
| gpt-oss-20b | 128,000 tokens | 20B | Local inference | Apache 2.0 |
| gpt-oss-120b | 128,000 tokens | 120B | Private hosting | Apache 2.0 |

