Perplexity AI Model Catalog: Search, Advanced Chat, Plugin Engines, and Plan Mapping Explained

Perplexity AI routes every query through a dynamic roster of proprietary and third-party language models.

That roster now spans lightweight retrieval engines for web snippets, heavyweight reasoning systems for deep chat, and specialized plugins for images, audio, and code execution.

Knowing which model appears in each plan helps users align cost, context capacity, and performance with their actual workload.

··········

··········

Perplexity’s model ecosystem divides into three functional groups: Search, Advanced Chat, and Plugin engines.

Search models prioritize speed and web grounding.

Advanced Chat models handle long-context reasoning, coding, and multimodal prompts.

Plugin engines invoke niche tools—vision, audio, shell, or code execution—on demand.

··········

Model Groups Overview

| Group | Representative Models | Primary Task |
|---|---|---|
| Search | Sonar (Llama 3.1 70B) | Real-time retrieval & snippets |
| Advanced Chat | GPT-5.1, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro | Deep reasoning, coding, multimodal |
| Plugin / Specialized | Mistral Large 16B, Command R-Plus, DeepSeek-R1, Stable Diffusion XL | Images, audio, code execution |

··········

··········

Plan tiers unlock progressively stronger chat models, with Gemini 3 Pro reserved for research mode.

Free users default to GPT-4o-mini and can toggle to Claude Haiku 4.5 for lighter workloads.

Pro subscribers get GPT-5.1 Instant by default and can switch to GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, or Gemini 3 Pro at will.

Enterprise clients receive dedicated GPT-5.1 Thinking endpoints plus compliance variants.

··········

Plan-to-Model Mapping

| Plan | Default Chat Model | Selectable Upgrades |
|---|---|---|
| Free | GPT-4o-mini | Claude Haiku 4.5 |
| Pro | GPT-5.1 Instant | GPT-5.1 Thinking, Claude Sonnet 4.5, Claude Opus 4.5, Gemini 3 Pro |
| Enterprise | GPT-5.1 Thinking | Dedicated endpoints, compliance models |
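
The plan-to-model mapping above can be expressed as a small lookup structure. The sketch below is illustrative only: `PlanModels`, `PLAN_MODELS`, `available_models`, and `can_use` are hypothetical names, not part of any Perplexity API, and the values are simply transcribed from the table.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PlanModels:
    """Default chat model and selectable upgrades for one plan (hypothetical type)."""
    default: str
    upgrades: Tuple[str, ...]


# Values copied from the plan-to-model table; the structure itself is an assumption.
PLAN_MODELS: Dict[str, PlanModels] = {
    "free": PlanModels("GPT-4o-mini", ("Claude Haiku 4.5",)),
    "pro": PlanModels(
        "GPT-5.1 Instant",
        ("GPT-5.1 Thinking", "Claude Sonnet 4.5", "Claude Opus 4.5", "Gemini 3 Pro"),
    ),
    # The table lists "dedicated endpoints, compliance models" for Enterprise
    # without naming specific models, so no upgrades are enumerated here.
    "enterprise": PlanModels("GPT-5.1 Thinking", ()),
}


def available_models(plan: str) -> List[str]:
    """Return the default model followed by the plan's selectable upgrades."""
    entry = PLAN_MODELS[plan.strip().lower()]
    return [entry.default, *entry.upgrades]


def can_use(plan: str, model: str) -> bool:
    """Check whether a model name appears in the plan's row of the table."""
    return model in available_models(plan)
```

For instance, `can_use("free", "Gemini 3 Pro")` is false under this table, while `available_models("pro")` lists GPT-5.1 Instant first, followed by the four upgrades.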

··········

··········

Context windows range from 128 K tokens in Sonar to one million tokens in Gemini 3 Pro.

Sonar and GPT-4o-mini each offer 128 K tokens, sufficient for grounded search and concise chats.

GPT-5.1 Thinking and Claude Opus 4.5 expand to roughly 200 K tokens for multi-document or large codebase tasks.

Gemini 3 Pro, available in research mode, supports a one-million-token window for enterprise-scale analysis.

··········

Context Window Comparison

| Model | Context Window |
|---|---|
| Sonar (Llama 3.1 70B) | 128 K |
| GPT-4o-mini | 128 K |
| GPT-5.1 Instant | 128 K |
| GPT-5.1 Thinking | 196 K |
| Claude Sonnet 4.5 | 200 K |
| Claude Opus 4.5 | 200 K |
| Gemini 3 Pro | 1 M |
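
One practical use of these figures is checking which models can hold a given prompt. The following is a minimal sketch, assuming "128 K" means 128,000 tokens (some vendors mean 131,072) and assuming a hypothetical 4,000-token reserve for the reply; `CONTEXT_WINDOW_TOKENS` and `models_that_fit` are illustrative names, not an official API.

```python
from typing import Dict, List

# Context windows from the comparison table, with "K" read as 1,000 tokens (assumption).
CONTEXT_WINDOW_TOKENS: Dict[str, int] = {
    "Sonar (Llama 3.1 70B)": 128_000,
    "GPT-4o-mini": 128_000,
    "GPT-5.1 Instant": 128_000,
    "GPT-5.1 Thinking": 196_000,
    "Claude Sonnet 4.5": 200_000,
    "Claude Opus 4.5": 200_000,
    "Gemini 3 Pro": 1_000_000,
}


def models_that_fit(prompt_tokens: int, reserve_for_output: int = 4_000) -> List[str]:
    """List models whose window covers the prompt plus a reserved output budget.

    The reserve is a hypothetical safety margin, not a documented limit.
    Results keep the table's order, smallest windows first.
    """
    needed = prompt_tokens + reserve_for_output
    return [name for name, window in CONTEXT_WINDOW_TOKENS.items() if window >= needed]
```

Under these assumptions, a 150,000-token prompt rules out the three 128 K models, and a 500,000-token prompt leaves only Gemini 3 Pro.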

··········

··········

API documentation lists active specialized models and forthcoming deprecations.

Perplexity’s August 2025 guide highlights active API-only engines such as Mistral Large 16 B, Command R-Plus, and DeepSeek-R1 for domain-specific reasoning.

A February 2025 changelog warns developers to migrate from codellama-70b-instruct and mixtral-8x22b-instruct before deprecation.

··········

API-Only and Deprecated Models

| Status | Model Names |
|---|---|
| Active API-only | Mistral Large 16 B, Command R-Plus, DeepSeek-R1 |
| Deprecated Feb 2025 | codellama-70b-instruct, mixtral-8x22b-instruct |
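
Developers affected by the deprecation notice could scan their own configuration for the two retired identifiers. This is a rough sketch: `DEPRECATED_MODELS` and `find_deprecated` are hypothetical helpers, and the list holds only the two names given in the table above.

```python
from typing import Iterable, List

# Identifiers named in the February 2025 deprecation notice cited above.
DEPRECATED_MODELS = frozenset({"codellama-70b-instruct", "mixtral-8x22b-instruct"})


def find_deprecated(configured_models: Iterable[str]) -> List[str]:
    """Return the deprecated identifiers found in a config, normalized and sorted.

    Matching ignores case and surrounding whitespace; that leniency is a
    design choice of this sketch, not documented API behavior.
    """
    normalized = {name.strip().lower() for name in configured_models}
    return sorted(normalized & DEPRECATED_MODELS)
```

An empty result means none of the configured names match the deprecated list; it does not confirm that the remaining names are valid model identifiers.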

··········

··········

Perplexity’s multi-model architecture offers flexibility at every subscription tier.

Free users get GPT-4o-mini and Claude Haiku 4.5 for grounded answers and light chat at no cost.

Pro subscribers unlock GPT-5.1, Claude 4.5 family models, and Gemini 3 Pro for complex research, coding, and multimodal workflows.

Enterprise teams receive compliance-ready endpoints, extended context, and custom throughput guarantees.

Plugin engines round out the ecosystem with images, audio transcription, and live code execution, extending Perplexity beyond text-only search and chat.

··········

··········

DATA STUDIOS

··········
