Microsoft Copilot Context Window Token Limits and Memory
- Graziano Stefanelli

Microsoft Copilot runs across the Microsoft 365 productivity apps, developer environments such as GitHub Copilot Chat, the consumer web and mobile apps, and custom solutions built in Copilot Studio. Each of these surfaces handles context windows, token limits, and memory differently. In late 2025, Microsoft published practical guidance on document length and file grounding, clarified token capacities in GitHub Copilot, and introduced memory features that separate personalization from conversation history.
·····
How context windows work in Microsoft 365 Copilot.
In Microsoft 365 Copilot, there is no single published token number. Instead, Microsoft provides usage guidance expressed in pages and word counts. For summarization and referencing, Copilot can process documents up to about 300 pages or 1.5 million words. For rewriting tasks, the recommended limit is 3,000 words. These thresholds help ensure that results remain accurate and complete, as very long documents may degrade response quality.
In SharePoint authoring, Microsoft advises that individual rich text blocks be kept under 3,000 words, even though the editor itself allows more, because Copilot’s processing quality drops with larger inputs. This shows that practical context management is defined by functional guidance rather than raw token capacity.
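These word-count thresholds can be applied as a simple pre-flight check before handing a document to Copilot. A minimal sketch, assuming the limits quoted above; the function and constants are illustrative, not part of any Microsoft API:

```python
# Pre-flight check against Microsoft's published word-count guidance:
# ~1.5M words for summarization/referencing, ~3,000 words for rewrites.
# The helper and constant names here are hypothetical, for illustration only.

SUMMARIZE_WORD_LIMIT = 1_500_000
REWRITE_WORD_LIMIT = 3_000

def within_copilot_guidance(text: str, task: str) -> bool:
    """Return True if the text length is within the recommended limit."""
    words = len(text.split())
    limit = SUMMARIZE_WORD_LIMIT if task == "summarize" else REWRITE_WORD_LIMIT
    return words <= limit

draft = "word " * 5_000                              # a 5,000-word draft
print(within_copilot_guidance(draft, "rewrite"))     # False: over 3,000 words
print(within_copilot_guidance(draft, "summarize"))   # True: far under 1.5M
```

The same check applies to SharePoint rich text blocks, where the practical ceiling is also about 3,000 words.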
·····
File grounding and retrieval constraints.
The most important effective limit in Microsoft 365 Copilot is the file grounding cap. When referencing external files, Copilot can fully search and use the contents of up to 20 files. If more than 20 files are listed, it selects the 20 most relevant and ignores the rest.
This rule means that retrieval scope often determines Copilot’s usable context more than the underlying model tokens. Copilot integrates content through the Microsoft Graph and the Semantic Index, which ensures that responses are generated only from files and data the user is authorized to access. This approach adds enterprise-grade compliance but requires careful file selection to stay within the 20-file scope.
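The 20-file cap behaves like a top-k cut over relevance-ranked candidates. A minimal sketch, with made-up scores and filenames; in reality the ranking is performed by Microsoft Graph and the Semantic Index, not by user code:

```python
# Sketch of the 20-file grounding cap: when more than 20 files are supplied,
# only the highest-ranked ones are used. Scores and filenames are invented
# for illustration; actual relevance ranking happens in the Semantic Index.

FILE_GROUNDING_CAP = 20

def select_grounding_files(scored_files: dict) -> list:
    """Return at most FILE_GROUNDING_CAP filenames, most relevant first."""
    ranked = sorted(scored_files, key=scored_files.get, reverse=True)
    return ranked[:FILE_GROUNDING_CAP]

candidates = {f"report_{i}.docx": 1.0 / (i + 1) for i in range(30)}
selected = select_grounding_files(candidates)
print(len(selected))   # 20 -- the remaining 10 files are ignored
print(selected[0])     # report_0.docx, the highest-scoring candidate
```

The practical takeaway is the same as in the text: curating which 20 files are referenced matters more than the model's raw token capacity.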
·····
Token windows in GitHub Copilot Chat.
GitHub Copilot Chat provides explicit numbers for context windows. The standard capacity is 64,000 tokens, and this expands to 128,000 tokens in Visual Studio Code Insiders when running GPT-4o with extended support.
These values allow developers to work with larger files and codebases than in earlier versions. For very large repositories, Copilot uses repo-aware retrieval methods such as embeddings and symbol search rather than trying to include the entire codebase in a single prompt. This improves efficiency and keeps queries relevant without overloading the token window.
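A rough way to reason about the 64k and 128k windows is a character-based token estimate. The 4-characters-per-token ratio below is a common approximation for English text and code, not Copilot's actual tokenizer:

```python
# Rough token-budget check for GitHub Copilot Chat. The 4-chars-per-token
# ratio is an approximation, not the tokenizer Copilot actually uses.

STANDARD_WINDOW = 64_000    # tokens, GitHub Copilot Chat default
EXTENDED_WINDOW = 128_000   # tokens, VS Code Insiders with GPT-4o

def estimate_tokens(text: str) -> int:
    return len(text) // 4   # crude heuristic

def fits_in_window(text: str, window: int = STANDARD_WINDOW) -> bool:
    return estimate_tokens(text) <= window

source = "x = 1\n" * 10_000        # 60,000 characters of code
print(estimate_tokens(source))     # 15000 estimated tokens
print(fits_in_window(source))      # True: well under 64k
```

When a repository exceeds the budget, this is where repo-aware retrieval (embeddings, symbol search) takes over instead of stuffing the whole codebase into one prompt.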
·····
Consumer Copilot retention policies.
In the consumer-facing Copilot web and mobile apps, Microsoft does not publish specific token window numbers. Input size and model choice determine how much Copilot can process, and these limits may vary by mode.
Conversation history is stored for about 18 months by default. Users can delete individual chats or clear all history at any time. This default retention helps maintain continuity but also requires active management by users who prefer not to keep records of their interactions.
·····
Copilot Memory personalization.
Separate from chat history, Microsoft has introduced Copilot Memory, a personalization feature that remembers user preferences and recurring facts. This allows Copilot to tailor responses across sessions.
Users can interact with memory directly by asking, “What do you know about me?” and can request Copilot to add, update, or forget specific items. From settings, memory can also be toggled off entirely. This feature provides personalization while maintaining transparency and control, since users can view, edit, or delete stored memories at any time.
The distinction is important: Copilot Memory refers to a curated set of facts for personalization, while conversation history is a log of past chats retained for a defined period.
·····
Copilot Studio quotas and behavior.
In Copilot Studio, which allows organizations to build custom copilots and agents, there are no fixed token window numbers published. Instead, Microsoft defines limits in terms of request quotas and session allowances. Developers design context management by configuring knowledge sources, connectors, and retrieval behavior.
This approach shifts responsibility to builders, who must balance grounding scope with performance. Studio’s flexibility allows for specialized workflows but requires careful design to manage effective context size.
·····
Table of context window and memory features.
| Surface | Context guidance | File limits | Memory and retention |
| --- | --- | --- | --- |
| Microsoft 365 Copilot | ~300 pages / 1.5M words for summaries, ~3,000 words for rewrites | 20 files fully scanned | Enterprise retention policies, no training use |
| GitHub Copilot Chat | 64k tokens standard, up to 128k in VS Code Insiders | Repo-aware retrieval, not raw file limits | No personalization memory, IDE context only |
| Consumer Copilot | No published token number, varies by mode | N/A | Chats stored ~18 months by default |
| Copilot Memory | N/A | N/A | Personalization feature, user-editable |
| Copilot Studio | No fixed token window, quotas defined per session | Connector-driven | Enterprise-defined retention |
This table illustrates the different ways Microsoft defines or constrains context across its Copilot surfaces. In productivity apps, context is guided by word counts and file caps, while in development environments the token window is explicit. In consumer and enterprise contexts, memory and retention policies take on additional importance.
·····
Operational considerations.
For best results in Microsoft 365 Copilot, documents should be kept within Microsoft’s recommended page and word counts, and only the most relevant 20 files should be referenced in a single prompt. In GitHub Copilot, developers should plan around a 64k–128k token context and rely on repo-aware retrieval for large projects. In consumer Copilot, retention should be managed through chat history controls, while Copilot Memory should be used deliberately to personalize responses. In Copilot Studio, enterprises should design retrieval strategies carefully to align with quotas and ensure efficient processing.
These practices show that Microsoft Copilot balances raw model capacity, retrieval rules, and memory features to create a flexible system that adapts to productivity, coding, consumer, and enterprise environments.
DATA STUDIOS (datastudios.org)