Claude Sonnet 4 now supports 1 million tokens of context
- Graziano Stefanelli
- Aug 13
- 3 min read

Claude reaches the 1M-token threshold with Sonnet 4, marking a pivotal evolution in context-aware AI usage.
Anthropic has officially extended the context window of Claude Sonnet 4 to a staggering 1 million tokens, unlocking the ability to process entire codebases, thousands of document pages, or multi-session workflows without losing coherence. This update, announced on August 12, 2025, places Claude on par with or even ahead of competitors like OpenAI’s GPT-4o and GPT-5 in terms of long-form understanding and retention—at least in enterprise API environments.
The new 1M-token capability is currently available in public beta via API, specifically for users on Tier 4 or custom rate limits. It is also fully integrated in Amazon Bedrock, with Google Vertex AI support expected shortly. However, it is not yet available through the Claude web or mobile interface, nor within the standard consumer version of Claude.ai.
A single prompt can now span an entire book, code repository, or technical archive.
A token budget of 1 million tokens translates into roughly:
2,500+ pages of text, or
75,000–110,000 lines of code, or
a full software product documentation set, or
multiple contracts, policies, and financial disclosures for legal review.
This enables Claude Sonnet 4 to handle high-density enterprise workloads such as full-stack code refactoring, context-aware document synthesis, and multi-modal research sessions without segmentation.
Anthropic’s own demos show the model navigating deep software architectures, writing explanations across files, and revising code across multiple functions—all in one request. Clients like Rakuten, Bolt.new, and iGent AI have been early adopters, especially in agentic coding and compliance automation.
Pricing doubles past 200K tokens, but optimizations bring efficiency gains.
The 1M-token context window comes with a revised pricing model:
For prompts up to 200K tokens: $3 per million tokens (input), $15 per million tokens (output)
For prompts above 200K tokens: $6 per million tokens (input), $22.50 per million tokens (output)
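Under this two-tier scheme, the per-request cost can be sketched as follows. Note one assumption: the announcement reads as though the higher rate applies to the entire request once the prompt crosses 200K input tokens, not just to the tokens above the threshold.

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD under the tiered rates above.

    Assumption: once a prompt exceeds 200K input tokens, the higher
    rate applies to the whole request, per the announced pricing.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 3.00, 15.00
    else:
        in_rate, out_rate = 6.00, 22.50
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A 500K-token prompt with a 4K-token answer:
print(f"${estimate_cost(500_000, 4_000):.2f}")  # → $3.09
```

At these rates, a single full-context (1M-token) prompt costs around $6 in input alone, which is why the mitigation techniques below matter.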
Anthropic recommends techniques like prompt caching, batch inference, and modular agent design to mitigate costs and reduce latency. According to internal benchmarks, these methods can cut compute time and token spend by up to 50%, especially when used in workflows requiring repeated queries against static long-term context.
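Of these, prompt caching is the most directly applicable to long static contexts: the Anthropic Messages API lets you mark a content block with a cache_control field so that a large, unchanging prefix (say, an entire repository) is cached across calls instead of being re-billed at the full input rate. A minimal sketch of such a request body, assuming a hypothetical repo_text placeholder and a model ID that should be checked against Anthropic's current model list:

```python
import json

def cached_context_body(repo_text: str, question: str) -> dict:
    """Build a Messages API request body that caches a large static prefix.

    The cache_control field follows Anthropic's prompt-caching API;
    the model ID is an assumption -- verify it against the docs.
    """
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": repo_text,                       # large, unchanging prefix
                "cache_control": {"type": "ephemeral"},  # cache this block
            }
        ],
        "messages": [{"role": "user", "content": question}],
    }

# Repeated questions reuse the cached prefix rather than re-sending it.
body = cached_context_body("<entire repo here>", "Where is auth handled?")
print(json.dumps(body, indent=2))
```

Each follow-up question then pays the (much cheaper) cache-read rate for the repository text, which is where the claimed savings on repeated queries against static context come from.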
Technical access requires API header and specific model selection.
To enable the 1M-token context, developers must:
Use Claude Sonnet 4 via API (Anthropic or Bedrock)
Include the following header in API calls:
anthropic-beta: context-1m-2025-08-07
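Put together, a direct call against the Messages API with this beta header looks roughly like the sketch below. The endpoint and version header follow Anthropic's public REST documentation; the model ID is an assumption and should be checked against the current model list.

```python
import json
import os
import urllib.request

# Beta flag from the announcement; enables the 1M-token context window.
BETA_HEADER = {"anthropic-beta": "context-1m-2025-08-07"}

def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a Messages API call carrying the long-context beta header."""
    body = json.dumps({
        "model": "claude-sonnet-4-20250514",  # assumed ID -- verify in the docs
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=body,
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
            **BETA_HEADER,
        },
        method="POST",
    )

# Only send the request when credentials are actually configured.
if os.environ.get("ANTHROPIC_API_KEY"):
    req = build_request(os.environ["ANTHROPIC_API_KEY"], "Summarize this repo.")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["content"][0]["text"])
```

The official anthropic SDK accepts the same header via its extra_headers argument, so the raw-HTTP detail above can be hidden behind client.messages.create().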
This extended capability does not apply to Claude Opus, nor to Sonnet 4 on lower API tiers. The upgrade is aimed at developers building complex autonomous agents, codex-style assistants, or document-heavy search tools.
Anthropic has not confirmed whether this upgrade will reach the Claude web app, but hints suggest it may be reserved for enterprise tiers due to the high compute load.
Claude’s positioning strengthens in the AI coding and legal document arenas.
The expansion aligns Claude with advanced use cases across regulated industries. With GPT-5 focusing on general reasoning and GPT-4o on multimodal versatility, Claude Sonnet 4 now carves a niche in:
Software engineering (refactoring, PR review, debugging across repos)
Legal & compliance (contract comparison, regulatory synthesis)
Knowledge extraction (from internal reports, product manuals, or medical datasets)
Agentic workflows that simulate multi-step decision-making based on persistent memory
Reddit users have noted the significance of this release:
“This basically enables Claude to work like a junior developer with context on the entire repo.”
“It finally beats GPT at coherent long-term workflows.”
“Enterprise-only, of course—but a real power move.”
Claude’s context leap reflects Anthropic’s API-first, agent-ready future.
With this release, Anthropic signals its long-term direction: a developer-centric, enterprise-grade platform for building scalable AI agents that operate across full knowledge bases. The 1M-token window is not just about memory—it’s about fluid logic across space, time, and domain.
By doubling down on API performance, Anthropic distances itself from the consumer chatbot market and focuses on Claude as infrastructure. The choice of restricting access to Tier 4 and Bedrock users reinforces this.
In the short term, it means most users won't access this capacity directly via Claude.ai. But for enterprise developers and strategic partners, the Claude 1M API context window could become the foundation for next-generation intelligent systems.
_________
DATA STUDIOS

