
ChatGPT Context Windows, Token Limits, and Memory: Models, Features, and Settings


ChatGPT manages long conversations and uploaded content through context windows, token limits, and memory features that vary by model, subscription tier, and enterprise setup. As of late 2025, OpenAI has clarified practical limits for GPT-4.1, GPT-5, the o-series reasoning models, and the app-level file upload system, and has explained how memories are stored and controlled by users. These differences matter for developers, enterprise deployments, and everyday ChatGPT users alike.

·····

How context windows differ between models.

The GPT-4.1 family, including GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, supports up to 1,000,000 tokens of context in the API. This is the largest window OpenAI currently documents, designed to handle extensive conversations, large documents, and multi-file inputs.
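Before sending a large document, it helps to estimate whether it fits the window at all. The sketch below uses a rough four-characters-per-token heuristic; this is an approximation for planning, not OpenAI's actual tokenizer:

```python
# Rough token estimate: ~4 characters per token for typical English text.
# This is a planning heuristic, not OpenAI's actual tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_context(text: str, context_window: int = 1_000_000) -> bool:
    """Check whether an estimated token count fits a model's context window."""
    return estimate_tokens(text) <= context_window

doc = "word " * 100_000           # ~500,000 characters of sample text
print(estimate_tokens(doc))       # 125000 tokens by this heuristic
print(fits_context(doc))          # True: well under the 1M-token window
```

For exact counts, a real tokenizer library should be used instead; the heuristic only flags obviously oversized inputs.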

The GPT-5 API is documented with a 128,000-token context window. Although ChatGPT may internally route queries to updated models, the API specification confirms this limit when developers explicitly choose GPT-5.

The o-series reasoning models, such as o3 and o3-mini, have context windows published per snapshot. These models allocate a portion of the window to internal reasoning tokens, so effective space for user content is smaller. Model cards provide exact details for each release, and users must account for this distribution when planning long prompts.

For enterprise-level PDF processing, ChatGPT indexes both text and images when a file exceeds ~110,000 tokens. Instead of loading the entire document into the raw context, the system uses retrieval to fetch relevant text-image pairs, making reasoning feasible without requiring the entire file to fit into the window.

·····

Token limits and practical constraints.

Token usage in ChatGPT depends not only on prompt length but also on the reasoning style of the model. With reasoning-enabled models, hidden internal reasoning steps consume output tokens, which count against the total token allocation. This means that even if a user requests a short answer, the model may spend hundreds or thousands of tokens internally before producing its response.
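As a rough planning aid, the space left for user content can be computed by reserving room for both the visible answer and the hidden reasoning. The reserve sizes below are illustrative assumptions, not official allocations:

```python
def effective_input_budget(context_window: int,
                           max_output_tokens: int,
                           reasoning_reserve: int) -> int:
    """Tokens left for the prompt and attached content after reserving
    space for the visible answer and hidden reasoning steps.
    The reserve sizes are illustrative assumptions, not official figures."""
    return context_window - max_output_tokens - reasoning_reserve

# Example: a 128,000-token window with 4,000 tokens reserved for the visible
# answer and 16,000 for hidden reasoning leaves 108,000 tokens for input.
print(effective_input_budget(128_000, 4_000, 16_000))  # 108000
```

The point of the arithmetic is that a reasoning model's usable input space is smaller than its nominal window, so long prompts should be sized against the remainder, not the headline number.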

File uploads in the ChatGPT app are another important constraint. Each file can be up to 512 MB in size, and text or document files can hold approximately 2 million tokens. For spreadsheets, there is no token cap, although file size is still limited. Images are capped at 20 MB, and total per-user storage is 10 GB. These rules apply to both local uploads and files imported from connected applications such as Google Drive or Microsoft OneDrive.
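The upload caps described above can be expressed as a simple validation check. The function name and structure here are illustrative, not part of any OpenAI API:

```python
# Sketch of the app-level upload limits described above: 512 MB per file,
# 20 MB for images, 10 GB total per-user storage. Names are illustrative,
# not part of any OpenAI API.
MB, GB = 1024 ** 2, 1024 ** 3

FILE_CAP = 512 * MB      # per-file cap for documents
IMAGE_CAP = 20 * MB      # stricter cap for images
STORAGE_CAP = 10 * GB    # total per-user storage

def can_upload(size_bytes: int, is_image: bool, used_storage_bytes: int) -> bool:
    """Accept a file only if it fits its per-file cap and total storage."""
    cap = IMAGE_CAP if is_image else FILE_CAP
    return size_bytes <= cap and used_storage_bytes + size_bytes <= STORAGE_CAP

print(can_upload(100 * MB, is_image=False, used_storage_bytes=0))  # True
print(can_upload(25 * MB, is_image=True, used_storage_bytes=0))    # False: over the 20 MB image cap
```

The same checks apply whether a file is uploaded locally or imported from a connected app such as Google Drive or Microsoft OneDrive.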

·····

How memory works in ChatGPT.

ChatGPT’s memory system is divided into two distinct layers:

  • Saved Memories: These are durable, user-approved memories of facts, preferences, or working context. The assistant can recall them across sessions, unless the user edits or deletes them.

  • Chat History: This consists of past conversations that the assistant may use for context. However, details are not guaranteed to persist, and only Saved Memories provide reliable continuity.
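The two layers above can be pictured as a simple data model: a durable, user-editable store alongside ephemeral per-conversation history. This is an illustrative sketch of the described behavior, not OpenAI's implementation:

```python
# Illustrative model of the two memory layers: durable Saved Memories the
# user can edit or delete, and ephemeral chat history that is not
# guaranteed to persist. Not OpenAI's actual implementation.
class MemoryStore:
    def __init__(self) -> None:
        self.saved: dict[str, str] = {}   # durable, user-approved facts
        self.history: list[str] = []      # ephemeral, per-conversation

    def save(self, key: str, fact: str) -> None:
        self.saved[key] = fact            # persists across sessions

    def delete(self, key: str) -> None:
        self.saved.pop(key, None)         # user control: remove any item

    def new_session(self) -> None:
        self.history.clear()              # chat history does not carry over

store = MemoryStore()
store.save("language", "User prefers Python examples")
store.history.append("Asked about context windows")
store.new_session()
print(store.saved)    # Saved Memories survive the new session
print(store.history)  # [] -- history was not retained
```

The asymmetry is the key point: only the saved layer provides reliable continuity, which is why durable facts belong in Saved Memories rather than in chat history.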

Users have full control over memory. Memory can be turned on or off, and individual items can be reviewed, edited, or deleted at any time. A Temporary Chat option ensures that memory is not read or written during the conversation, making it useful for private or sensitive queries.

Memory is also integrated into Projects, where ChatGPT remembers files and previous chats tied to a project, maintaining continuity in that workspace without requiring constant re-uploading. For ChatGPT Business and Enterprise, memory includes organizational controls, and content is not used for model training.

Custom GPTs created in GPT Builder do not support memory, meaning each session is stateless regardless of the user’s personal memory settings.

·····

Memory and search integration.

When ChatGPT is connected to external search providers, Saved Memories can inform how results are retrieved. This is an opt-in feature, giving users the choice to allow memory to shape web-based queries. Combined with Projects, this makes it possible to blend private knowledge with real-time search.

Retention policies are documented separately, with options for users and organizations to delete chat data, uploaded files, or entire memory sets. Enterprise deployments offer stricter guarantees, with retention aligned to security and compliance requirements.

·····

Table of context, token, and memory limits.

| Model / Feature | Context window | Token notes | Memory behavior |
| --- | --- | --- | --- |
| GPT-4.1 family | 1,000,000 tokens | Full multimodal support | Saved Memories + Projects |
| GPT-5 API | 128,000 tokens | No reasoning tokens by default | Saved Memories + Projects |
| o-series reasoning models | Varies by snapshot | Reasoning tokens consume output space | Saved Memories + Projects |
| File uploads | 512 MB/file; ~2M tokens text; 20 MB images; no token cap for spreadsheets | 10 GB per-user storage | Files remembered in Projects |
| Memory | N/A | N/A | User-controlled; can edit, delete, disable; not in Custom GPTs |

This table illustrates how model choice, file size, and memory determine how ChatGPT processes information across sessions.

·····

Operational recommendations.

Users handling very large documents should rely on GPT-4.1 models with a 1M-token context, or combine them with Projects to offload content into retrieval rather than direct context. For reasoning models, prompts must leave space for internal thinking, which is billed and constrained by output token allocations.

File uploads should be kept within the size caps. Spreadsheets are exempt from the token cap, so CSV or XLSX files can contain millions of tokens without issue, though performance improves with smaller, well-structured inputs.

For personalization, users should enable Saved Memories for durable facts, switch to Temporary Chat for sensitive work, and remember that Custom GPTs do not support memory. Enterprise and Business plans allow organizations to manage memory policies at scale, ensuring compliance.

By understanding these distinctions, users and developers can maximize ChatGPT’s ability to manage long contexts, large files, and evolving user needs while keeping control over memory and retention.

DATA STUDIOS