Perplexity AI Model Catalog: Search, Advanced Chat, Plugin Engines, and Plan Mapping Explained
- Graziano Stefanelli
- 9 hours ago

Perplexity AI routes every query through a dynamic roster of proprietary and third-party language models.
That roster now spans lightweight retrieval engines for web snippets, heavyweight reasoning systems for deep chat, and specialized plugins for images, audio, and code execution.
Knowing which model appears in each plan helps users align cost, context capacity, and performance with their actual workload.
··········
··········
Perplexity’s model ecosystem divides into three functional groups: Search, Advanced Chat, and Plugin engines.
Search models prioritize speed and web grounding.
Advanced Chat models handle long-context reasoning, coding, and multimodal prompts.
Plugin engines invoke niche tools—vision, audio, shell, or code execution—on demand.
··········
Model Groups Overview
Group | Representative Models | Primary Task |
Search | Sonar (Llama 3.1 70B) | Real-time retrieval & snippets |
Advanced Chat | GPT-5.1, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro | Deep reasoning, coding, multimodal |
Plugin / Specialized | Mistral Large 16 B, Command R-Plus, DeepSeek-R1, Stable Diffusion XL | Images, audio, code execution |
··········
··········
Plan tiers unlock progressively stronger chat models, with Gemini 3 Pro reserved for research mode.
Free users default to GPT-4o-mini and can toggle to Claude Haiku 4.5 for lighter workloads.
Pro subscribers get GPT-5.1 Instant by default and can switch to GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, or Gemini 3 Pro at will.
Enterprise clients receive dedicated GPT-5.1 Thinking endpoints plus compliance variants.
··········
Plan-to-Model Mapping
Plan | Default Chat Model | Selectable Upgrades |
Free | GPT-4o-mini | Claude Haiku 4.5 |
Pro | GPT-5.1 Instant | GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro |
Enterprise | GPT-5.1 Thinking | Dedicated endpoints, compliance models |
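
The mapping above can be expressed as a small lookup table. The following is a minimal sketch, not Perplexity's API: the `Plan` class, the `PLANS` dictionary, and the `resolve_model` helper are hypothetical names introduced for illustration, and only the model names come from the table. Enterprise upgrades are left empty because the table lists "dedicated endpoints, compliance models" without naming specific models.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Plan:
    """Hypothetical container for one row of the plan-to-model table."""
    default: str
    upgrades: Tuple[str, ...]


# Model names copied from the table; the data structure is an assumption.
PLANS = {
    "free": Plan("GPT-4o-mini", ("Claude Haiku 4.5",)),
    "pro": Plan(
        "GPT-5.1 Instant",
        ("GPT-5.1 Thinking", "Claude Sonnet 4.5", "Claude Opus 4.5", "Gemini 3 Pro"),
    ),
    # The table names no concrete upgrade models for Enterprise.
    "enterprise": Plan("GPT-5.1 Thinking", ()),
}


def resolve_model(plan: str, requested: Optional[str] = None) -> str:
    """Return the requested model if the plan lists it, else the plan default."""
    entry = PLANS[plan.lower()]
    if requested is not None and (requested == entry.default or requested in entry.upgrades):
        return requested
    return entry.default
```

For example, `resolve_model("pro", "Claude Opus 4.5")` returns the requested model, while `resolve_model("free", "Gemini 3 Pro")` falls back to the Free default because that model is not listed for the plan.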
··········
··········
Context windows range from 128 K tokens in Sonar to one million tokens in Gemini 3 Pro.
Sonar, GPT-4o-mini, and GPT-5.1 Instant each offer 128 K tokens, sufficient for grounded search and concise chats.
GPT-5.1 Thinking (196 K) and Claude Sonnet 4.5 and Claude Opus 4.5 (200 K each) extend the window to roughly 200 K tokens for multi-document or large-codebase tasks.
Gemini 3 Pro, available in research mode, supports a one-million-token window for enterprise-scale analysis.
··········
Context Window Comparison
Model | Context Window |
Sonar (Llama 3.1 70B) | 128 K |
GPT-4o-mini | 128 K |
GPT-5.1 Instant | 128 K |
GPT-5.1 Thinking | 196 K |
Claude Sonnet 4.5 | 200 K |
Claude Opus 4.5 | 200 K |
Gemini 3 Pro | 1 M |
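
One way to use these figures is to check which models can hold a prompt of a given size. The sketch below assumes "K" means 1,000 tokens and "M" means 1,000,000 tokens, which the table does not state explicitly; `CONTEXT_WINDOWS` and `models_that_fit` are hypothetical names, and real token counts depend on each model's tokenizer.

```python
from typing import List

# Window sizes from the comparison table, assuming K = 1,000 and M = 1,000,000.
CONTEXT_WINDOWS = {
    "Sonar (Llama 3.1 70B)": 128_000,
    "GPT-4o-mini": 128_000,
    "GPT-5.1 Instant": 128_000,
    "GPT-5.1 Thinking": 196_000,
    "Claude Sonnet 4.5": 200_000,
    "Claude Opus 4.5": 200_000,
    "Gemini 3 Pro": 1_000_000,
}


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 0) -> List[str]:
    """List models whose window covers prompt plus reserved output, smallest window first."""
    needed = prompt_tokens + reserved_output_tokens
    fitting = [(name, window) for name, window in CONTEXT_WINDOWS.items() if window >= needed]
    # sorted() is stable, so models with equal windows keep their table order.
    return [name for name, _ in sorted(fitting, key=lambda item: item[1])]
```

Under these assumptions, a 150,000-token prompt rules out the three 128 K models, and a 500,000-token prompt leaves only Gemini 3 Pro.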
··········
··········
API documentation lists active specialized models and forthcoming deprecations.
Perplexity’s August 2025 guide highlights active API-only engines such as Mistral Large 16 B, Command R-Plus, and DeepSeek-R1 for domain-specific reasoning.
A February 2025 changelog warns developers to migrate from codellama-70b-instruct and mixtral-8x22b-instruct before deprecation.
··········
API-Only and Deprecated Models
Status | Model Names |
Active API-only | Mistral Large 16 B, Command R-Plus, DeepSeek-R1 |
Deprecated Feb 2025 | codellama-70b-instruct, mixtral-8x22b-instruct |
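
A client configuration can be screened against the deprecated names before requests are sent. This is a minimal sketch under stated assumptions: `DEPRECATED_MODELS` and `find_deprecated` are hypothetical names, the two identifiers are taken from the table above, and no replacement models are suggested because the source does not name any.

```python
from typing import Iterable, List

# Identifiers listed as deprecated in the table above.
DEPRECATED_MODELS = frozenset({
    "codellama-70b-instruct",
    "mixtral-8x22b-instruct",
})


def find_deprecated(configured_models: Iterable[str]) -> List[str]:
    """Return configured model names that match a deprecated identifier.

    Matching ignores case and surrounding whitespace; the original
    spelling is returned so it can be located in the configuration.
    """
    return [
        name for name in configured_models
        if name.strip().lower() in DEPRECATED_MODELS
    ]
```

Running `find_deprecated(["sonar", "Mixtral-8x22B-Instruct"])` would flag the second entry, which a migration script could then report or block.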
··········
··········
Perplexity’s multi-model architecture offers flexibility at every subscription tier.
Free users wield GPT-4o-mini and Claude Haiku 4.5 for grounded answers and light chat without spending a cent.
Pro subscribers unlock GPT-5.1, Claude 4.5 family models, and Gemini 3 Pro for complex research, coding, and multimodal workflows.
Enterprise teams receive compliance-ready endpoints, extended context, and custom throughput guarantees.
Plugin engines round out the ecosystem with image generation, audio transcription, and live code execution, extending Perplexity beyond text chat to a wider range of knowledge-work tasks.
··········
DATA STUDIOS
··········

