Claude Opus 4.6 Context Window, Long Projects, Large Files, and 1M-Token Workflows: What Anthropic’s 1M Context Actually Means in the API and How Claude Handles Project-Scale Work in Practice

Claude Opus 4.6 is now officially positioned by Anthropic as a 1M-token context model, but that number by itself does not fully describe how long projects, large file collections, and extended multi-document workflows actually behave across Claude’s developer platform and Claude’s product surfaces.
The key reason is that Anthropic describes one set of limits and capabilities at the raw model level, where Claude Opus 4.6 supports a 1M-token context window and 128K max output tokens. Claude’s product documentation for Projects, uploads, and long-running work then adds separate mechanics, such as project knowledge, retrieval-augmented generation, caching, file-size limits, and usage constraints, that shape what users can actually do in sustained real-world workflows.
That means the phrase “Claude Opus 4.6 context window” only answers one part of the question, because a single giant request in the Claude Developer Platform is not the same thing as a long-lived Claude project with many chats, many documents, repeated queries, reused instructions, and retrieval behavior that changes as the project grows.
·····
Claude Opus 4.6 is officially a 1M-token context model, but Anthropic still frames that capability in a platform-specific way.
Anthropic’s context window documentation states that Claude Opus 4.6 and Claude Sonnet 4.6 support a 1M-token context window, and Anthropic’s model materials also state that Claude Opus 4.6 has 128K max output tokens, which places it in the category of very large context models designed for heavy document sets, large-scale codebases, long reasoning chains, and extended agent-style sessions.
At the same time, Anthropic’s public launch materials say that the 1M-token context window is currently available in beta on the Claude Developer Platform. That qualification matters: the headline 1M figure is not a universal statement about every Claude surface in every workflow, but a documented platform-side capability with a specific rollout status.
That distinction matters because users often treat model context as if it automatically maps one-to-one onto the product experience, when Anthropic’s own documentation suggests a more careful reading in which the model clearly supports very large contexts while the broader Claude ecosystem layers product behavior on top of that raw model capacity.
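As a concrete illustration of the platform-level capability, the sketch below assembles a single large-context API request body. The model id `claude-opus-4-6` and the `context-1m` beta flag are illustrative assumptions, not confirmed identifiers; check Anthropic's current API reference for the exact strings before relying on them.

```python
def build_long_context_request(documents, question,
                               model="claude-opus-4-6",   # assumed model id
                               beta_flag="context-1m"):   # assumed beta flag name
    """Assemble (headers, body) for one large-context call on the
    Claude Developer Platform. Beta opt-ins travel in a header."""
    content = [{"type": "text", "text": d} for d in documents]
    content.append({"type": "text", "text": question})
    headers = {"anthropic-beta": beta_flag}
    body = {
        "model": model,
        "max_tokens": 128_000,  # documented Opus 4.6 output ceiling
        "messages": [{"role": "user", "content": content}],
    }
    return headers, body

headers, body = build_long_context_request(["doc one", "doc two"], "Summarize both.")
```

The point of the sketch is the shape of the call: the entire working set is placed into one request, which is exactly the pattern the 1M window is designed to serve.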
........
Claude Opus 4.6 and Related Official Context Specifications
| Item | Officially Documented Value |
| --- | --- |
| Claude Opus 4.6 context window | 1M tokens |
| Claude Sonnet 4.6 context window | 1M tokens |
| Claude Opus 4.6 max output | 128K tokens |
| Claude Sonnet 4.6 max output | 64K tokens |
| Older Claude Sonnet 4.5 family context | 200K tokens |
·····
Long projects in Claude are governed by project architecture as much as by raw context size.
Anthropic describes Projects as self-contained workspaces with their own chat histories, instructions, and knowledge, which means a long project is not just one long prompt but a structured workspace in which Claude is expected to revisit information, reuse knowledge, and operate across many interactions over time rather than solving everything through one giant request.
That difference is fundamental, because a 1M-token request is mainly a question about how much one inference call can ingest, while a long project is a question about how Claude manages state, retrieves relevant information, preserves useful knowledge, and keeps work coherent across many turns and many files without forcing the user to rebuild the entire working set manually on every interaction.
Anthropic’s documentation makes clear that projects now scale beyond simple context stuffing because Claude can automatically switch to a retrieval-augmented mode as more files and information are added, which means large projects are increasingly handled through a hybrid design in which some knowledge is stored and reused, some is cached, and some is actively retrieved when relevant instead of being kept fully resident in one monolithic live prompt at all times.
This is the most important conceptual correction for anyone thinking about long projects in Claude, because it replaces the simplistic idea of “one giant context window” with a more accurate architecture in which Claude’s project behavior depends on workspace mechanics, retrieval strategy, and knowledge reuse in addition to raw model capacity.
·····
Large files are limited first by upload rules and extracted content, and only then by the theoretical context ceiling.
Anthropic’s upload documentation states that users can upload files up to 30 MB each, attach up to 20 files per chat, and store unlimited project files in count, while also noting that the actual usable content still has to fit within Claude’s context window and token-related constraints once the material is extracted and processed.
That means a file’s size in megabytes is not the same thing as its effective footprint in model context, because the real limiting factor is the extracted or usable content that Claude must process rather than the raw container size of the file itself, and those two values can diverge substantially depending on whether the file is mostly dense text, scanned images, mixed layouts, embedded graphics, or complex formatting.
Anthropic also says that PDFs larger than 30 MB can sometimes be processed through Claude’s computing environment without loading them into the live context window. This is one of the most revealing details in the whole large-file discussion, because it shows that Anthropic does not treat every large document as a brute-force prompt-stuffing problem, and instead routes some oversized material through a separate computational pathway.
That design matters because it means Claude’s practical ability to work with large files is broader than the narrow question of how many tokens fit into the conversational context, while also being more complex than a simple promise that any large file can always be fully loaded and reasoned over in one pass.
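The distinction between file size and context footprint can be made concrete with a rough token estimate. The 4-characters-per-token ratio below is a common heuristic, not an official Anthropic figure, and the 128K reserve mirrors the documented output ceiling.

```python
def estimated_token_footprint(extracted_text, chars_per_token=4):
    """Rough token estimate for a file's *extracted* text content.
    The chars-per-token ratio is a heuristic, not an official figure."""
    return len(extracted_text) // chars_per_token

def fits_in_context(extracted_texts, context_limit=1_000_000, output_reserve=128_000):
    """Check whether a set of extracted documents plausibly fits in the
    1M window while leaving room for the model's completion."""
    total = sum(estimated_token_footprint(t) for t in extracted_texts)
    return total <= context_limit - output_reserve
```

A 30 MB scanned PDF might extract to very little text, while a 2 MB plain-text dump can approach the context ceiling, which is why the check operates on extracted content rather than raw file size.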
........
Claude File and Project Handling Limits That Matter Before Raw Context Does
| Workflow Element | Officially Documented Limit or Behavior |
| --- | --- |
| File size per upload | 30 MB per file |
| Files per chat | Up to 20 files |
| Project file count | Unlimited in count |
| Real project limit | Total usable content must still fit token and context constraints |
| Oversized PDF handling | Large PDFs can sometimes be processed outside the live context window |
·····
A 1M-token workflow is best understood as a single-request capability rather than a complete description of Claude project work.
Anthropic’s platform documentation says that 1M-context models can support up to 600 images or PDF pages in a single request, compared with 100 for 200K-context models, which makes it clear that the intended use case for the 1M window includes very large multimodal and document-heavy requests that would previously have required more aggressive chunking or multiple separate calls.
That capability is highly relevant for developer workflows in which the user can deliberately assemble a massive request containing large amounts of text, code, documents, or images and then ask Claude Opus 4.6 to reason across that whole working set in one call, because this is the most direct expression of what a 1M-token model actually provides at the platform level.
But that is still not the same thing as a long-running project in the Claude app, because a project can contain far more accumulated material over time than would be practical or necessary to inject into every live request, and Anthropic’s own documentation shows that Projects increasingly rely on retrieval and caching so Claude can work efficiently over stored knowledge without pretending that every token is simultaneously resident in every answer generation step.
So the cleanest distinction is that a 1M-token workflow describes the scale of a single context assembly, while a long project describes a managed workspace in which knowledge persists, retrieval becomes more important as the project grows, and Claude’s practical behavior depends on how it selects and reuses the relevant subset of that knowledge for each new turn.
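The documented per-request media caps make the chunking arithmetic simple. The sketch below splits a page count into the minimum number of requests, using the 600-page cap for 1M-context models cited above; the function name is illustrative.

```python
def plan_requests(num_pages, per_request_limit=600):
    """Split a document set into per-request page counts, given the
    documented cap of 600 images/PDF pages per 1M-context request
    (versus 100 for 200K-context models)."""
    return [min(per_request_limit, num_pages - start)
            for start in range(0, num_pages, per_request_limit)]
```

A 1,300-page corpus becomes three calls of 600, 600, and 100 pages, whereas the same corpus under the older 100-page cap would have needed thirteen, which is the practical meaning of "less aggressive chunking."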
·····
Output limits still force staging even when the input side becomes dramatically larger.
Anthropic’s documentation states that Claude Opus 4.6 has a maximum output of 128K tokens, which is very large by historical standards but still much smaller than the model’s 1M-token context window, and that asymmetry matters because it means the model can read much more than it can emit in one completion.
This becomes a practical bottleneck in tasks such as full-document rewrites, exhaustive structured extraction, long legal or technical redrafts, complete transformations of very large reports, or multi-file synthesis work where the combined output the user wants may itself approach or exceed the model’s generation ceiling even if the source material is comfortably within the context window.
In other words, the right question is not simply whether Claude Opus 4.6 can ingest a massive working set, but whether the resulting deliverable can fit into one response without truncation or forced summarization, because very large input capacity does not eliminate the need for staged outputs when the requested result is itself extremely long.
This is one of the most important operational realities behind 1M-token workflows, because large-context models often feel most impressive on ingestion while still requiring careful output planning if the user wants complete rather than selective results.
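Output staging can be planned mechanically. This minimal sketch groups a deliverable's sections into generation stages so that each stage's expected output stays under the 128K completion ceiling; the greedy grouping strategy is an assumption, not a documented Anthropic mechanism.

```python
def plan_output_stages(section_token_counts, output_ceiling=128_000):
    """Greedily group section indices into stages whose combined expected
    output stays under the model's completion ceiling. A single section
    larger than the ceiling still gets its own stage and must be split
    further by the caller."""
    stages, current, used = [], [], 0
    for i, n in enumerate(section_token_counts):
        if used + n > output_ceiling and current:
            stages.append(current)
            current, used = [], 0
        current.append(i)
        used += n
    if current:
        stages.append(current)
    return stages
```

For example, three 60K-token sections cannot be emitted in one completion, so the planner schedules two generation passes even though all three sections fit comfortably on the input side.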
........
Why 1M-Token Input Does Not Eliminate the Need for Multi-Step Work
| Task Type | Main Limiting Factor |
| --- | --- |
| Large document review | Input context and retrieval quality |
| Full rewrite of a very long source | Output ceiling |
| Exhaustive extraction from many files | Output ceiling plus formatting overhead |
| Multi-document reasoning | Context size plus relevant retrieval |
| Large project continuation over time | Project knowledge, caching, and usage limits |
·····
Claude Projects increasingly rely on retrieval and caching rather than on repeated full restuffing of all knowledge.
Anthropic’s help documentation says that project content is cached and that reused project material does not count against limits in the same way on each reuse. That is a major operational detail: long project work is optimized around persistence and reuse rather than the inefficient pattern of reloading the full knowledge base into every interaction from scratch.
Anthropic also says that Projects automatically switch to a RAG-powered mode when more files and information are added, which strongly suggests that as projects scale, Claude becomes more retrieval-oriented and less dependent on the assumption that the entire project corpus must be fully stuffed into the active context for every answer.
Taken together, those two mechanics create a different model of scale from what many users imagine when they hear “1M-token model,” because the long-project experience is not simply a bigger version of a normal chat and is instead a hybrid workflow built around stored knowledge, selective retrieval, cached reuse, and project-level organization.
That architecture is important because it lets Claude work across more material over time than would be comfortable to keep in live conversational context on every turn, while also improving efficiency and helping keep repeated project work from consuming limits as if every query were a cold start.
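The retrieve-then-cache pattern can be sketched in miniature. The toy lexical scorer below stands in for the RAG step the documentation describes (Claude's actual retrieval internals are not public), and the `cache_control` block follows the shape of Anthropic's prompt-caching API, which should be verified against current docs.

```python
def retrieve_relevant(chunks, query, k=2):
    """Toy lexical retrieval: rank stored project chunks by word overlap
    with the query. A stand-in for the project RAG step, not Claude's
    actual retrieval implementation."""
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))[:k]

def project_turn(knowledge_chunks, query):
    """Build one message: retrieved chunks first, marked for prompt
    caching so repeated turns can reuse them, then the new question."""
    content = [
        {"type": "text", "text": c, "cache_control": {"type": "ephemeral"}}
        for c in retrieve_relevant(knowledge_chunks, query)
    ]
    content.append({"type": "text", "text": query})
    return {"role": "user", "content": content}
```

The key property is that each turn carries only the relevant, cache-marked subset of the project corpus rather than the whole knowledge base, which is exactly the efficiency the hybrid architecture is built around.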
·····
Usage limits remain a practical constraint even when the model itself supports massive contexts.
Anthropic’s support documentation on usage and length limits says that message length, file attachments, conversation length, and model choice all affect how quickly usage limits are reached, which means the existence of a 1M-token-capable model does not imply unlimited sustained practical use for users working inside Claude’s consumer or subscription surfaces.
This point matters because very large prompts, heavy file attachments, long chats, and repeated project operations can all make a workflow more expensive in terms of available usage budget or message capacity, so the real ceiling for long projects is partly defined by what the model can technically ingest and partly by what the user’s plan allows them to do repeatedly in a given period.
That creates a second layer of realism around 1M-token workflows, because a model can be capable of huge one-shot requests while the surrounding product economics and plan structure still encourage more selective, retrieved, or staged workflows in day-to-day use.
In practice, this means that the phrase “Claude Opus 4.6 supports 1M tokens” is technically true and strategically important, but it is not the same as saying that all long projects should always be handled as single enormous prompts or that such usage is the most natural pattern in every Claude environment.
·····
Large-file workflows benefit from specialized handling, which shows that raw context is only one part of Claude’s scaling strategy.
Anthropic’s documentation on file creation and editing notes that large PDFs over 30 MB can be processed through Claude’s computing environment without loading them into the context window, which is a strong indication that Claude’s scaling strategy for large files includes external computation and document handling pathways in addition to straightforward context ingestion.
That matters because some of the hardest real-world workflows are not simply text-heavy but structurally messy, involving large PDFs, mixed layouts, image-rich documents, or files whose usefulness depends on transformations, extraction, and processing steps that are better handled outside the narrow boundaries of live conversational context.
The same pattern appears in project-scale retrieval, where Anthropic increasingly relies on RAG as projects grow. Both features point in the same direction: Claude’s approach to scale is not merely to raise the raw token limit and hope that solves everything, but to combine bigger context windows with retrieval, caching, and alternate processing paths for oversized or structurally complex materials.
This is the most mature way to understand Claude Opus 4.6 in practice, because it frames the model not as a simple container that can hold more tokens than before, but as the center of a broader workflow system designed to support large-scale reasoning through several complementary mechanisms.
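The routing decision the documentation implies reduces to a simple threshold check. The sketch below uses the documented 30 MB upload limit; the pathway names are illustrative labels, not official API values.

```python
def choose_file_pathway(file_size_bytes, upload_limit_mb=30):
    """Route a file per the documented rules: files at or under 30 MB go
    through standard upload into project/chat context; larger PDFs may be
    handled via Claude's computing environment instead."""
    if file_size_bytes <= upload_limit_mb * 1024 * 1024:
        return "standard_upload"          # counted against context/token limits
    return "computing_environment"        # processed outside the live context
```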
........
The Most Important Difference Between 1M Requests and Long Claude Projects
| Workflow Type | What Actually Governs Scale |
| --- | --- |
| One massive API request | Raw model context, media limits, output ceiling |
| Long Claude project | Project knowledge, retrieval behavior, caching, usage limits |
| Large PDF processing | Upload rules, extracted content, or computing-environment handling |
| Multi-file reasoning | Relevant retrieval plus context assembly |
| Ongoing technical workspace | Persistent knowledge reuse rather than repeated full prompt loading |
·····
The most accurate conclusion is that Claude Opus 4.6 makes 1M-token workflows real, but long projects are governed by a broader product architecture.
Anthropic’s official documentation supports a very clear raw-model claim, because Claude Opus 4.6 has a 1M-token context window and 128K max output tokens, and the Claude Developer Platform is the place where Anthropic most directly frames that capability as available for very large requests, including large document and multimodal workloads.
At the same time, Anthropic’s product documentation shows that long projects and large file collections in Claude are not governed by that number alone, since Projects use persistent workspace structure, retrieval-augmented generation, cached reuse of project content, file-size and per-chat upload limits, and in some cases alternate processing through the computing environment instead of relying purely on context stuffing.
The result is that Claude Opus 4.6 should be understood in two complementary ways, first as a genuinely large-context model that can support single-request workflows at a scale that used to require much heavier fragmentation, and second as part of a broader Claude product architecture in which long-running work is increasingly handled through retrieval, caching, and project knowledge rather than by keeping the full project corpus inside the live prompt at all times.
That is why the best answer to the question of Claude Opus 4.6 context is not simply “1M tokens,” but “1M tokens at the model level, combined with a project system that manages scale through retrieval, caching, file handling rules, and staged output constraints in real-world long-project and large-file workflows.”

