
ChatGPT Context Windows, Token Limits, and Memory: Models, Features, and Settings


ChatGPT manages long conversations and uploaded content through context windows, token limits, and memory features that vary by model, subscription tier, and enterprise setup. As of late 2025, OpenAI has clarified practical limits for GPT-4.1, GPT-5, the o-series reasoning models, and the app-level file upload system, and has explained how memories are stored and controlled by users. These differences matter for developers, enterprise deployments, and everyday ChatGPT users alike.

·····

How context windows differ between models.

The GPT-4.1 family, including GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano, supports up to 1,000,000 tokens of context in the API. This is the largest window OpenAI currently documents, designed to handle extensive conversations, large documents, and multi-file inputs.
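Before sending a large document, it helps to estimate whether it fits the window at all. The sketch below uses a rough four-characters-per-token heuristic; this is an approximation for planning, not OpenAI's actual tokenizer:

```python
# Rough token estimate: ~4 characters per token for typical English text.
# This is a planning heuristic, not OpenAI's actual tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fits_context(text: str, context_window: int = 1_000_000) -> bool:
    """Check whether an estimated token count fits a model's context window."""
    return estimate_tokens(text) <= context_window

doc = "word " * 100_000           # ~500,000 characters of sample text
print(estimate_tokens(doc))       # 125000 tokens by this heuristic
print(fits_context(doc))          # True: well under the 1M-token window
```

For exact counts, a real tokenizer library should be used instead; the heuristic only flags obviously oversized inputs.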

The GPT-5 API is documented with a 128,000-token context window. Although ChatGPT may internally route queries to updated models, the API specification confirms this limit when developers explicitly choose GPT-5.

The o-series reasoning models, such as o3 and o3-mini, have context windows published per snapshot. These models allocate a portion of the window to internal reasoning tokens, so effective space for user content is smaller. Model cards provide exact details for each release, and users must account for this distribution when planning long prompts.

For enterprise-level PDF processing, ChatGPT indexes both text and images when a file exceeds ~110,000 tokens. Instead of loading the entire document into the raw context, the system uses retrieval to fetch relevant text-image pairs, making reasoning feasible without requiring the entire file to fit into the window.

·····

Token limits and practical constraints.

Token usage in ChatGPT depends not only on prompt length but also on the reasoning style of the model. With reasoning-enabled models, hidden internal reasoning steps consume output tokens, which count against the total token allocation. This means that even if a user requests a short answer, the model may spend hundreds or thousands of tokens internally before producing its response.
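As a rough planning aid, the space left for user content can be computed by reserving room for both the visible answer and the hidden reasoning. The reserve sizes below are illustrative assumptions, not official allocations:

```python
def effective_input_budget(context_window: int,
                           max_output_tokens: int,
                           reasoning_reserve: int) -> int:
    """Tokens left for the prompt and attached content after reserving
    space for the visible answer and hidden reasoning steps.
    The reserve sizes are illustrative assumptions, not official figures."""
    return context_window - max_output_tokens - reasoning_reserve

# Example: a 128,000-token window with 4,000 tokens reserved for the visible
# answer and 16,000 for hidden reasoning leaves 108,000 tokens for input.
print(effective_input_budget(128_000, 4_000, 16_000))  # 108000
```

The point of the arithmetic is that a reasoning model's usable input space is smaller than its nominal window, so long prompts should be sized against the remainder, not the headline number.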

File uploads in the ChatGPT app are another important constraint. Each file can be up to 512 MB in size, and text or document files can hold approximately 2 million tokens. For spreadsheets, there is no token cap, although file size is still limited. Images are capped at 20 MB, and total per-user storage is 10 GB. These rules apply to both local uploads and files imported from connected applications such as Google Drive or Microsoft OneDrive.
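The upload caps described above can be expressed as a simple validation check. The function name and structure here are illustrative, not part of any OpenAI API:

```python
# Sketch of the app-level upload limits described above: 512 MB per file,
# 20 MB for images, 10 GB total per-user storage. Names are illustrative,
# not part of any OpenAI API.
MB, GB = 1024 ** 2, 1024 ** 3

FILE_CAP = 512 * MB      # per-file cap for documents
IMAGE_CAP = 20 * MB      # stricter cap for images
STORAGE_CAP = 10 * GB    # total per-user storage

def can_upload(size_bytes: int, is_image: bool, used_storage_bytes: int) -> bool:
    """Accept a file only if it fits its per-file cap and total storage."""
    cap = IMAGE_CAP if is_image else FILE_CAP
    return size_bytes <= cap and used_storage_bytes + size_bytes <= STORAGE_CAP

print(can_upload(100 * MB, is_image=False, used_storage_bytes=0))  # True
print(can_upload(25 * MB, is_image=True, used_storage_bytes=0))    # False: over the 20 MB image cap
```

The same checks apply whether a file is uploaded locally or imported from a connected app such as Google Drive or Microsoft OneDrive.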

·····

How memory works in ChatGPT.

ChatGPT’s memory system is divided into two distinct layers:

  • Saved Memories: These are durable, user-approved memories of facts, preferences, or working context. The assistant can recall them across sessions, unless the user edits or deletes them.

  • Chat History: This consists of past conversations that the assistant may use for context. However, details are not guaranteed to persist, and only Saved Memories provide reliable continuity.
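The two layers above can be pictured as a simple data model: a durable, user-editable store alongside ephemeral per-conversation history. This is an illustrative sketch of the described behavior, not OpenAI's implementation:

```python
# Illustrative model of the two memory layers: durable Saved Memories the
# user can edit or delete, and ephemeral chat history that is not
# guaranteed to persist. Not OpenAI's actual implementation.
class MemoryStore:
    def __init__(self) -> None:
        self.saved: dict[str, str] = {}   # durable, user-approved facts
        self.history: list[str] = []      # ephemeral, per-conversation

    def save(self, key: str, fact: str) -> None:
        self.saved[key] = fact            # persists across sessions

    def delete(self, key: str) -> None:
        self.saved.pop(key, None)         # user control: remove any item

    def new_session(self) -> None:
        self.history.clear()              # chat history does not carry over

store = MemoryStore()
store.save("language", "User prefers Python examples")
store.history.append("Asked about context windows")
store.new_session()
print(store.saved)    # Saved Memories survive the new session
print(store.history)  # [] -- history was not retained
```

The asymmetry is the key point: only the saved layer provides reliable continuity, which is why durable facts belong in Saved Memories rather than in chat history.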

Users have full control over memory. Memory can be turned on or off, and individual items can be reviewed, edited, or deleted at any time. A Temporary Chat option ensures that memory is not read or written during the conversation, making it useful for private or sensitive queries.

Memory is also integrated into Projects, where ChatGPT remembers files and previous chats tied to a project, maintaining continuity in that workspace without requiring constant re-uploading. For ChatGPT Business and Enterprise, memory includes organizational controls, and content is not used for model training.

Custom GPTs created in GPT Builder do not support memory, meaning each session is stateless regardless of the user’s personal memory settings.

·····

Memory and search integration.

When ChatGPT is connected to external search providers, Saved Memories can inform how results are retrieved. This is an opt-in feature, giving users the choice to allow memory to shape web-based queries. Combined with Projects, this makes it possible to blend private knowledge with real-time search.

Retention policies are documented separately, with options for users and organizations to delete chat data, uploaded files, or entire memory sets. Enterprise deployments offer stricter guarantees, with retention aligned to security and compliance requirements.

·····

Table of context, token, and memory limits.

| Model / Feature | Context window | Token notes | Memory behavior |
| --- | --- | --- | --- |
| GPT-4.1 family | 1,000,000 tokens | Full multimodal support | Saved Memories + Projects |
| GPT-5 API | 128,000 tokens | No reasoning tokens by default | Saved Memories + Projects |
| o-series reasoning models | Varies by snapshot | Reasoning tokens consume output space | Saved Memories + Projects |
| File uploads | 512 MB/file; ~2M tokens text; 20 MB images; no token cap for spreadsheets | 10 GB per-user storage | Files remembered in Projects |
| Memory | N/A | N/A | User-controlled; can edit, delete, disable; not in Custom GPTs |

This table illustrates how model choice, file size, and memory determine how ChatGPT processes information across sessions.

·····

Operational recommendations.

Users handling very large documents should rely on GPT-4.1 models with a 1M-token context, or combine them with Projects to offload content into retrieval rather than direct context. For reasoning models, prompts must leave space for internal thinking, which is billed and constrained by output token allocations.

File uploads should be kept within the size caps. Spreadsheets are exempt from the token cap, so CSV or XLSX files can contain millions of tokens without issue, though performance improves with smaller, well-structured inputs.

For personalization, users should enable Saved Memories for durable facts, switch to Temporary Chat for sensitive work, and remember that Custom GPTs do not support memory. Enterprise and Business plans allow organizations to manage memory policies at scale, ensuring compliance.

By understanding these distinctions, users and developers can maximize ChatGPT’s ability to manage long contexts, large files, and evolving user needs while keeping control over memory and retention.

DATA STUDIOS