Perplexity AI Context Window: Maximum Token Limits, Memory Retention, Conversation Length, And Context Handling Explained

Perplexity AI’s approach to context and memory combines explicit model token windows, user-centric conversation threading, and configurable retention for both text and file-based workflows. Understanding these mechanisms helps users manage the persistence and recall of information across sessions and tasks.

·····


Model Context Windows Define The Hard Limits For Each Request.

Perplexity AI models enforce strict token limits for each request, setting the boundary for how much input, history, and output can be processed in a single exchange. The Sonar model family forms the backbone of Perplexity’s offerings, each with clearly published context lengths.

Sonar supports up to 128,000 tokens per request, while Sonar Pro extends this to 200,000 tokens. Other variants, such as Sonar Reasoning Pro and Sonar Deep Research, offer 128,000-token windows. These caps cover both the prompt (input, history, and search context) and the generated response, making them the definitive ceiling for complex or long-running interactions.

Users can configure how much retrieved web context is folded into the model's working window via the search_context_size parameter, trading retrieval depth against the capacity left for history and output in each query.
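For API users, this parameter lives on the chat completions endpoint. The Python sketch below is a minimal illustration, assuming the documented https://api.perplexity.ai/chat/completions endpoint and the web_search_options.search_context_size field with values such as "low", "medium", or "high"; confirm the exact names against the current API reference.

```python
import requests

# Minimal sketch: request a Sonar completion while capping how much retrieved
# web context is folded into the model's working window. Endpoint and field
# names follow Perplexity's published API docs; verify before relying on them.
API_KEY = "YOUR_PERPLEXITY_API_KEY"  # placeholder, not a real credential

response = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "sonar",
        "messages": [
            {"role": "user", "content": "Summarize this week's AI research news."}
        ],
        # "low" leaves more of the token window for history and output;
        # "high" spends more of it on retrieved search context.
        "web_search_options": {"search_context_size": "low"},
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```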

........

Perplexity Sonar Model Context Window Limits

| Model | Maximum Tokens | Coverage |
| --- | --- | --- |
| Sonar | 128,000 | Input, history, and output combined |
| Sonar Pro | 200,000 | Input, history, and output combined |
| Sonar Reasoning Pro | 128,000 | Input, history, and output combined |
| Sonar Deep Research | 128,000 | Input, history, and output combined |

The model window is the controlling factor for single-prompt complexity and document length.
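As a back-of-the-envelope way to reason about these ceilings, the sketch below checks whether a prompt, prior history, and reserved output fit a given model's window. The 4-characters-per-token ratio is a rough assumption for English text, not the models' actual tokenizer, and the model names mirror the published API identifiers.

```python
# Rough sketch: budget a request against Perplexity's published context
# windows. The ~4 chars/token heuristic is an assumption for illustration;
# real token counts depend on the model's tokenizer.
CONTEXT_WINDOWS = {
    "sonar": 128_000,
    "sonar-pro": 200_000,
    "sonar-reasoning-pro": 128_000,
    "sonar-deep-research": 128_000,
}

def estimate_tokens(text: str) -> int:
    """Very rough token estimate for English text."""
    return max(1, len(text) // 4)

def fits_window(model: str, prompt: str, history: str, max_output_tokens: int) -> bool:
    """True if prompt + history + reserved output stay under the model's cap."""
    used = estimate_tokens(prompt) + estimate_tokens(history) + max_output_tokens
    return used <= CONTEXT_WINDOWS[model]

# A ~600,000-character document (~150k tokens by this heuristic) plus
# 2,000 reserved output tokens overflows Sonar but fits Sonar Pro.
doc = "x" * 600_000
print(fits_window("sonar", doc, "", 2_000))      # False
print(fits_window("sonar-pro", doc, "", 2_000))  # True
```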

·····

Conversation Length And Thread Retention Are Managed At The Product Level.

Perplexity structures user interactions as “Threads,” each capturing a multi-turn conversation with ongoing context retention. For signed-in users, Threads are stored in the user’s Library indefinitely unless deleted, preserving continuity and allowing long-form research or project-based work over time.

In privacy-focused modes, retention is shorter. Logged-out Threads expire after 14 days, while Incognito Threads are retained for 24 hours. Enterprise organizations can enforce or customize thread retention policies to meet their compliance and security requirements.

The practical limit for active context in any reply is still set by the underlying model’s token window, meaning only the most recent or relevant turns may be fully considered in long conversations.
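A common client-side pattern for staying inside that window is to trim the oldest turns before sending a long conversation. The sketch below is illustrative only, reusing the rough chars-per-token estimate from above; it does not represent how Perplexity's own products prune context.

```python
# Illustrative sketch: keep only the most recent turns that fit an estimated
# token budget. Perplexity's server-side handling may differ; the heuristic
# tokenizer here is an assumption.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic, not a real tokenizer

def trim_history(messages: list[dict], window: int, reserved_output: int) -> list[dict]:
    """Drop the oldest messages until the estimated total fits the budget."""
    budget = window - reserved_output
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest to oldest
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order

# 100 long turns get cut down to the most recent ones that fit a 128k window.
history = [{"role": "user", "content": f"turn {i} " * 3000} for i in range(100)]
trimmed = trim_history(history, window=128_000, reserved_output=4_000)
print(f"{len(history)} -> {len(trimmed)} messages kept")
```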

........

Perplexity Thread Retention Across User Modes

| User Mode | Thread Retention | Notes |
| --- | --- | --- |
| Signed-in | Indefinite until deleted | Persistent Library storage |
| Logged-out | 14 days | Auto-expiration |
| Incognito | 24 hours | Auto-expiration |
| Enterprise | Admin configurable | Organization policy |

Retention time shapes how long conversations are available for review and follow-up.

·····

Memory Features Provide Cross-Thread Personalization Without Storing Full Conversations.

Perplexity AI now includes a "Memory" feature that remembers user preferences and key details across Threads. Memory is designed for personalization, recalling names, tasks, or project goals between sessions, rather than storing full conversation histories.

Memory can be managed or deleted in user settings and is intended to supplement, not replace, Thread-level recall.

Files and images attached to Threads are retained for 30 days by default, while Enterprise Pro users have a 7-day retention window for files. Files in Spaces or repositories persist until manually deleted, allowing longer-term collaboration.

........

Memory And File Retention In Perplexity AI

| Feature | Retention/Behavior | Notes |
| --- | --- | --- |
| Memory (user details) | Until deleted | Personalization across Threads |
| Thread files (standard) | 30 days | For follow-ups within Threads |
| Thread files (Enterprise Pro) | 7 days | Shortened for compliance |
| Spaces/repo files | Until deleted | Long-term collaboration |

Memory bridges sessions, while files and context expire on a set schedule.

·····

Context Handling In Perplexity AI Balances Model Limits, Conversation Length, And User Privacy.

Perplexity AI’s context system is designed to optimize model capacity, conversation continuity, and user privacy. Model token windows cap the complexity and document length of each exchange, while Thread retention enables extended research and project workflows.

Memory features allow for lightweight cross-thread personalization, and file retention is managed to balance convenience and data security. Enterprise users benefit from additional administrative controls, ensuring that retention aligns with organizational needs.

For best results, users should be mindful of token window constraints, periodically review active Threads and files, and leverage Memory for personalized, multi-session assistance.

·····
