Claude: Token limits and context windows
- Graziano Stefanelli
- Aug 22
- 3 min read

Anthropic expands Claude's context capabilities.
Anthropic has recently rolled out significant updates to Claude’s context windows, particularly focusing on Claude Sonnet 4, Claude Opus 4.1, and the Heavy 4 private preview. These enhancements allow developers and enterprises to process substantially larger datasets, manage lengthy document uploads, and handle multimodal inputs more efficiently.
The key update is the introduction of a 1,000,000-token beta context window for Sonnet 4, available through Anthropic's API and Amazon Bedrock, with Vertex AI support planned next. This enables more complex operations, such as analyzing hundreds of pages of PDFs, querying large knowledge bases, and handling mixed media inputs, while maintaining low latency and stable performance.
Claude Sonnet 4 now supports a 1,000,000-token beta context window.
Anthropic’s flagship release includes a 1M-token beta program for Sonnet 4, enabled by passing the beta flag context-1m-2025-08-07 with each request (for example, betas=["context-1m-2025-08-07"] in the SDK). In practical terms, this lets developers load extremely large datasets and sustain longer conversations without manual segmentation.
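As an illustration, here is a minimal sketch of how the beta flag might be passed through the Anthropic Python SDK; the model identifier, message content, and token values are placeholders, and the exact SDK surface may differ slightly across versions.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Request against the 1M-token beta context window for Sonnet 4.
# The betas value mirrors the flag mentioned above; the model ID is illustrative.
response = client.beta.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    betas=["context-1m-2025-08-07"],
    messages=[
        {
            "role": "user",
            "content": "Summarize the attached corpus of contracts in under 500 words.",
        }
    ],
)

print(response.content[0].text)
```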
Model | Standard Context | Beta Context | Latency (First Token) | Availability
Sonnet 4 | 200,000 tokens | 1,000,000 tokens | ≈ 1.2 sec | Public beta, API + Bedrock
Opus 4.1 | 200,000 tokens | N/A | ≈ 1.9 sec | Production
Heavy 4 Preview | 256,000 tokens | N/A | ≈ 2.1 sec | Private preview only
The 1M-token beta has been released to address growing demand from enterprise users who work with large-scale legal documents, financial datasets, R&D archives, and multimedia-driven workflows. For now, Anthropic reserves approximately 6% of this window for system and safety tokens, leaving ≈940,000 tokens available to users.
Opus 4.1 and Heavy 4 maintain high-capacity performance.
While Sonnet 4 leads the 1M beta rollout, Opus 4.1 remains optimized for intensive reasoning tasks, sustaining a 200,000-token context window with faster average token generation speeds than previous models. The Heavy 4 private preview offers 256,000 tokens for select enterprise partners and introduces enhanced multimodal features, including depth-map vision for image understanding and early support for structured video frames.
For developers working at scale, the right model choice depends on workload complexity: the trade-off between latency, reasoning depth, and multimodal capability determines which model fits best, as the sketch below illustrates.
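As a rough illustration of that trade-off, the sketch below picks a model from an estimated input size and a latency budget, using the context and latency figures quoted in this article; the thresholds and model names are illustrative, not an official selection policy.

```python
# Illustrative model picker based on the context and latency figures quoted above.
# The numbers mirror this article's table; they are not an official Anthropic policy.
MODELS = [
    # (name, max context tokens, approx. first-token latency in seconds),
    # ordered from smallest to largest window so the smallest sufficient option wins.
    ("sonnet-4", 200_000, 1.2),
    ("opus-4.1", 200_000, 1.9),
    ("heavy-4-preview", 256_000, 2.1),
    ("sonnet-4-1m-beta", 1_000_000, 1.2),
]

def pick_model(estimated_input_tokens: int, max_latency_s: float) -> str:
    """Return the first model whose window fits the input within the latency budget."""
    for name, window, latency in MODELS:
        if estimated_input_tokens <= window and latency <= max_latency_s:
            return name
    raise ValueError("No listed model satisfies the given constraints.")

print(pick_model(estimated_input_tokens=350_000, max_latency_s=1.5))  # -> sonnet-4-1m-beta
```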
Pricing structures adapt to long-context workloads.
Anthropic has introduced new pricing tiers for token-intensive use cases. Costs differ by model and by whether requests use the 1M-token beta or the standard context window.
Tier | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Cache Read Discount
Sonnet 4 (standard) | $3.00 | $15.00 | N/A
Sonnet 4 (1M beta) | $6.00 | $22.50 | -75% on cached tokens
Opus 4.1 | $15.00 | $75.00 | N/A
Heavy 4 Preview | TBD | TBD | TBD
The context-cache beta now enables significant cost optimizations, offering ≈75% discounts on cached token reads, while cached writes add a 25% storage overhead. This is particularly relevant for scenarios involving frequent reference to stable datasets, enabling businesses to reduce costs without sacrificing context length.
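To make the pricing concrete, here is a small worked example that estimates the cost of one long-context request using the per-million-token rates in the table above; the token counts and cache split are invented for illustration, and the discount and overhead percentages are the ones this article quotes.

```python
# Rough cost estimate for one Sonnet 4 (1M beta) request, using the rates quoted above.
# Rates are USD per million tokens; token counts and the cache split are illustrative.
INPUT_RATE = 6.00            # Sonnet 4, 1M beta, input
OUTPUT_RATE = 22.50          # Sonnet 4, 1M beta, output
CACHE_READ_DISCOUNT = 0.75   # ~75% off cached input reads, per this article
CACHE_WRITE_OVERHEAD = 0.25  # ~25% extra on first-time cache writes, per this article

def request_cost(fresh_in: int, cached_in: int, cache_writes: int, out: int) -> float:
    """Estimate USD cost: fresh input + discounted cached reads + write overhead + output."""
    cost = fresh_in / 1e6 * INPUT_RATE
    cost += cached_in / 1e6 * INPUT_RATE * (1 - CACHE_READ_DISCOUNT)
    cost += cache_writes / 1e6 * INPUT_RATE * (1 + CACHE_WRITE_OVERHEAD)
    cost += out / 1e6 * OUTPUT_RATE
    return round(cost, 4)

# 100k fresh input tokens, 600k cached tokens re-read, 50k newly cached, 4k output tokens.
print(request_cost(fresh_in=100_000, cached_in=600_000, cache_writes=50_000, out=4_000))
```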
Technical best practices for managing large context windows.
As Claude’s token limits increase, efficient management becomes critical for maintaining performance and cost control. Anthropic has issued several recommendations:
Scenario | Recommendation | Token Impact
Large PDF uploads | Chunk files into ≤50,000-token segments | Avoids overflow
Code repositories | Compress and exclude build artefacts | Saves ~15-20% of tokens
Image processing | Batch ≤10 images per request | Prevents request failures
Audio inputs | Stream ≤2-minute segments | Minimizes processing delay
Video frames | Budget ≈120 tokens per frame | Controls multimodal usage
By adopting these optimizations, developers can avoid costly overages while ensuring the model processes high-volume data efficiently.
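For the first recommendation in the table, a simple chunker along these lines keeps each segment under a target token budget; it relies on a rough four-characters-per-token heuristic rather than an exact tokenizer, so both the ratio and the 50,000-token limit should be treated as approximations.

```python
# Naive document chunker: split text into segments of at most ~50,000 tokens each,
# using a rough heuristic of ~4 characters per token (an approximation, not a tokenizer).
CHARS_PER_TOKEN = 4
MAX_TOKENS_PER_CHUNK = 50_000
MAX_CHARS = CHARS_PER_TOKEN * MAX_TOKENS_PER_CHUNK

def chunk_text(text: str) -> list[str]:
    """Split on paragraph boundaries, starting a new chunk before the budget is exceeded."""
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        if current and len(current) + len(paragraph) + 2 > MAX_CHARS:
            chunks.append(current)
            current = paragraph
        else:
            current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        chunks.append(current)
    return chunks

# Example: a long extracted PDF text becomes a list of ≤50,000-token segments.
segments = chunk_text("First section...\n\nSecond section...\n\nThird section...")
print(len(segments), "chunk(s)")
```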
Roadmap includes future 2,000,000-token expansion.
Anthropic has confirmed plans to extend Claude’s capabilities to 2,000,000 tokens by the end of the year for enterprise partners, particularly those using Vertex AI and Bedrock. This roadmap reflects a broader trend towards ultra-long context AI systems, aimed at enabling:
- End-to-end reasoning across large-scale corporate datasets
- Improved multimodal indexing across documents, images, and video
- More advanced RAG (retrieval-augmented generation) pipelines
- Efficient consolidation of legal, medical, and financial archives
This evolution positions Claude among the leading tools for high-capacity document intelligence, bridging the gap between token efficiency and deep contextual reasoning.
____________
FOLLOW US FOR MORE.
DATA STUDIOS

