Perplexity AI Model Catalog: Search, Advanced Chat, Plugin Engines, and Plan Mapping Explained

Graziano Stefanelli
9 hours ago

Perplexity AI routes every query through a dynamic roster of proprietary and third-party language models.

That roster now spans lightweight retrieval engines for web snippets, heavyweight reasoning systems for deep chat, and specialized plugins for images, audio, and code execution.

Knowing which model appears in each plan helps users align cost, context capacity, and performance with their actual workload.

··········

Perplexity’s model ecosystem divides into three functional groups: Search, Advanced Chat, and Plugin engines.

Search models prioritize speed and web grounding.

Advanced Chat models handle long-context reasoning, coding, and multimodal prompts.

Plugin engines invoke niche tools—vision, audio, shell, or code execution—on demand.

··········

Model Groups Overview

Group	Representative Models	Primary Task
Search	Sonar (Llama 3.1 70B)	Real-time retrieval & snippets
Advanced Chat	GPT-5.1, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro	Deep reasoning, coding, multimodal
Plugin / Specialized	Mistral Large 16B, Command R-Plus, DeepSeek-R1, Stable Diffusion XL	Images, audio, code exec

··········

Plan tiers unlock progressively stronger chat models, with Gemini 3 Pro reserved for research mode.

Free users default to GPT-4o-mini and can toggle to Claude Haiku 4.5 for lighter workloads.

Pro subscribers get GPT-5.1 Instant by default and can switch to GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, or Gemini 3 Pro at will.

Enterprise clients receive dedicated GPT-5.1 Thinking endpoints plus compliance variants.

··········

Plan-to-Model Mapping

Plan	Default Chat Model	Selectable Upgrades
Free	GPT-4o-mini	Claude Haiku 4.5
Pro	GPT-5.1 Instant	GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro
Enterprise	GPT-5.1 Thinking	Dedicated endpoints, compliance models
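
The mapping above can be expressed as a small lookup table. This is a minimal sketch, not Perplexity's API: the names `PlanModels`, `PLAN_MODELS`, and `can_use` are hypothetical, and the values are copied from the table as written.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanModels:
    """Default chat model and selectable upgrades for one plan (hypothetical type)."""
    default: str
    upgrades: tuple[str, ...]


# Values copied from the Plan-to-Model Mapping table; the structure is illustrative.
PLAN_MODELS = {
    "free": PlanModels("GPT-4o-mini", ("Claude Haiku 4.5",)),
    "pro": PlanModels(
        "GPT-5.1 Instant",
        ("GPT-5.1 Thinking", "Claude Sonnet 4.5", "Claude Opus 4.5", "Gemini 3 Pro"),
    ),
    # The table lists "dedicated endpoints, compliance models" without naming
    # specific models, so no upgrades are enumerated here.
    "enterprise": PlanModels("GPT-5.1 Thinking", ()),
}


def can_use(plan: str, model: str) -> bool:
    """Return True if the model is the plan's default or one of its upgrades."""
    entry = PLAN_MODELS[plan.lower()]
    return model == entry.default or model in entry.upgrades
```

For example, `can_use("pro", "Gemini 3 Pro")` is true under this table, while `can_use("free", "Gemini 3 Pro")` is false.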

··········

Context windows range from 128 K tokens in Sonar to one million tokens in Gemini 3 Pro.

Sonar, GPT-4o-mini, and GPT-5.1 Instant each offer 128 K tokens, sufficient for grounded search and concise chats.

GPT-5.1 Thinking (196 K) and the Claude Sonnet 4.5 and Opus 4.5 models (200 K each) expand to roughly 200 K tokens for multi-document or large codebase tasks.

Gemini 3 Pro, available in research mode, supports a one-million-token window for enterprise-scale analysis.

··········

Context Window Comparison

Model	Context Window (tokens)
Sonar (Llama 3.1 70B)	128 K
GPT-4o-mini	128 K
GPT-5.1 Instant	128 K
GPT-5.1 Thinking	196 K
Claude Sonnet 4.5	200 K
Claude Opus 4.5	200 K
Gemini 3 Pro	1 M
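
One way to use these figures is to filter models by whether a prompt fits. The sketch below is illustrative only: `CONTEXT_WINDOWS` and `models_that_fit` are hypothetical names, and it assumes "K" means 1,000 tokens and "M" means 1,000,000, which the table does not specify.

```python
# Context windows from the comparison table, in tokens.
# Assumption: 128 K is read as 128,000 (not 131,072) and 1 M as 1,000,000.
CONTEXT_WINDOWS = {
    "Sonar (Llama 3.1 70B)": 128_000,
    "GPT-4o-mini": 128_000,
    "GPT-5.1 Instant": 128_000,
    "GPT-5.1 Thinking": 196_000,
    "Claude Sonnet 4.5": 200_000,
    "Claude Opus 4.5": 200_000,
    "Gemini 3 Pro": 1_000_000,
}


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 0) -> list[str]:
    """List models whose window holds the prompt plus reserved output, smallest first."""
    needed = prompt_tokens + reserved_output_tokens
    fits = [(window, name) for name, window in CONTEXT_WINDOWS.items() if window >= needed]
    # Sort by window size only; the stable sort keeps table order for equal windows.
    return [name for window, name in sorted(fits, key=lambda pair: pair[0])]
```

With these numbers, a 150,000-token prompt rules out the three 128 K models, and a 500,000-token prompt leaves only Gemini 3 Pro.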

··········

API documentation lists active specialized models and flags deprecated ones.

Perplexity’s August 2025 guide highlights active API-only engines such as Mistral Large 16B, Command R-Plus, and DeepSeek-R1 for domain-specific reasoning.

A February 2025 changelog warned developers to migrate away from codellama-70b-instruct and mixtral-8x22b-instruct ahead of their deprecation.

··········

API-Only and Deprecated Models

Status	Model Names
Active API-only	Mistral Large 16B, Command R-Plus, DeepSeek-R1
Deprecated Feb 2025	codellama-70b-instruct, mixtral-8x22b-instruct
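
A project that pins model names in its configuration could scan for the deprecated identifiers listed above. This is a hedged sketch: `DEPRECATED_MODELS` and `find_deprecated` are hypothetical names, the check is a plain string comparison, and it does not call any Perplexity endpoint.

```python
from collections.abc import Iterable

# Identifiers taken from the "Deprecated Feb 2025" row of the table.
DEPRECATED_MODELS = frozenset({"codellama-70b-instruct", "mixtral-8x22b-instruct"})


def find_deprecated(configured_models: Iterable[str]) -> list[str]:
    """Return the configured model names that match a deprecated identifier.

    Matching ignores case and surrounding whitespace; original spelling is preserved.
    """
    return [
        name
        for name in configured_models
        if name.strip().lower() in DEPRECATED_MODELS
    ]
```

Calling `find_deprecated(["DeepSeek-R1", "mixtral-8x22b-instruct"])` returns only the second entry, which is the one that needs migrating.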

··········

Perplexity’s multi-model architecture offers flexibility at every subscription tier.

Free users get GPT-4o-mini and Claude Haiku 4.5 for grounded answers and light chat at no cost.

Pro subscribers unlock GPT-5.1, Claude Sonnet 4.5 and Opus 4.5, and Gemini 3 Pro for complex research, coding, and multimodal workflows.

Enterprise teams receive compliance-ready endpoints, extended context, and custom throughput guarantees.

Plugin engines round out the ecosystem with images, audio transcription, and live code execution, ensuring Perplexity adapts to virtually any knowledge-work scenario.

··········

DATA STUDIOS
