ChatGPT: File Upload and Reading: formats, limits, retention, and Enterprise integrations

Oct 25, 2025

ChatGPT can read, summarize, and analyze documents, spreadsheets, slides, code, and images directly inside a chat. Upload a file with the paperclip, ask questions (“extract all KPIs from pages 6–10”), and the model responds with summaries, tables, or structured JSON. For organizations, ChatGPT adds connected storage, admin governance, and retention controls—turning file Q&A into a dependable document-intelligence workflow.

·····

What you can upload—and what ChatGPT actually “reads.”

File type	What ChatGPT does	Notes
PDF	Parses the text layer and, where enabled, performs visual retrieval over embedded images/diagrams/charts; supports section/page-scoped Q&A.	Upload via paperclip in chat, then ask follow-ups (“Summarize Methods; list all dates on pages 5–8”).
DOCX / TXT / RTF	Extracts headings, lists, paragraphs; rewrites or summarizes.	Works well for policies, memos, essays.
XLSX / CSV	Interprets cells, ranges, headers; explains formulas; can output CSV/Markdown tables.	Spreadsheets are exempt from the ~2M-token per-file cap that applies to text and document files; see the limits table below.
Images (JPG/PNG/WEBP)	Vision analysis of diagrams, screenshots, handwriting; can describe, compare, translate.	20 MB per image file.
Code files	Explains, reviews, and refactors; can generate tests and diffs.	Supply the full function/class for best results.

Visual PDF retrieval is particularly useful for scanned reports and figure-heavy documents: upload once, then ask the model to explain charts, annotate trends, or reconcile a table with narrative text.

·····

Hard limits to plan around (and what “counts”).

Limit	Value	Applies to	Source
Max file size	512 MB	Any single file upload (chat or custom GPT)	Community guidance & product docs summaries.
Text/doc indexing cap	~2,000,000 tokens per file	Text & document files (not spreadsheets)	Builder/community confirmations.
Image file cap	20 MB per image	Vision analysis	Help Center.
Per-user storage cap	10 GB per end user	Aggregate uploads	Help Center File Uploads FAQ.
Per-org storage cap	100 GB per org	Aggregate uploads	Help Center File Uploads FAQ.
Upload cadence	Up to 80 files / 3 hours (Free: 3/day)	Rolling window; may be lowered at peak load	Help Center File Uploads FAQ.
Concurrent batch size	UI often limits ~10–25 files per batch	Practical front-end ceiling	Community reports.

Why it matters: very long PDFs and image-dense decks hit token and size ceilings fast. Split by chapter/section and keep images under 20 MB each to avoid stalls.
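
The size ceilings above can be checked locally before uploading. The following is a minimal sketch, assuming the 512 MB and 20 MB caps from the table and treating 1 MB as 1,048,576 bytes (the sources do not say whether the limits are decimal or binary). The names `check_upload` and `IMAGE_SUFFIXES` are illustrative and not part of any ChatGPT API.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # assumption: binary megabytes
MAX_FILE_BYTES = 512 * MB   # per-file cap from the limits table
MAX_IMAGE_BYTES = 20 * MB   # per-image cap from the limits table
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def size_limit_for(name: str) -> int:
    """Pick the cap that applies to a file, based only on its extension."""
    suffix = Path(name).suffix.lower()
    return MAX_IMAGE_BYTES if suffix in IMAGE_SUFFIXES else MAX_FILE_BYTES


def check_upload(name: str, size_bytes: int) -> Optional[str]:
    """Return a description of the problem, or None if the file fits."""
    limit = size_limit_for(name)
    if size_bytes > limit:
        return f"{name}: {size_bytes / MB:.1f} MB exceeds the {limit // MB} MB cap"
    return None


if __name__ == "__main__":
    # Hypothetical file names and sizes, for illustration only.
    for name, size in [("report.pdf", 100 * MB), ("scan.png", 25 * MB)]:
        print(check_upload(name, size) or f"{name}: within limits")
```

This only covers size. The ~2M-token indexing cap depends on how the service tokenizes the file, so it cannot be verified reliably from the byte count alone.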

·····

Where your files come from (device vs. connected storage).

Device uploads: drag-and-drop or use the paperclip inside any chat.
Connected apps: ChatGPT supports adding files from Microsoft OneDrive and SharePoint by sharing file URLs; the Help Center page that documents this does not list Google Drive support via URL import.
Enterprise & Team integrations: recent product updates and press reporting describe broader cloud integrations (e.g., Drive/Dropbox/Box/SharePoint/OneDrive) for business plans, delivered through enterprise surfaces that enforce tenant permissions.

Practical takeaway: Individuals usually upload from the device; teams should use governed cloud links so permissions, versions, and retention policies carry over automatically.

·····

Retention, privacy, and training.

Standard tiers (consumer): Chats persist until you delete them; when deleted, they’re scheduled for permanent deletion within 30 days (exceptions apply for legal/security holds).
Enterprise/Edu/Business: Inputs/outputs aren’t used to train models by default; admins control retention and governance.
Legal preservation orders: news reporting noted a court directive requiring preservation of deleted consumer chats in the context of litigation; enterprise zero-retention orgs are excluded. Treat this as a legal exception, not normal policy.

Bottom line: for sensitive workflows, use Team/Enterprise/Edu so retention and no-training guarantees are policy-enforced.

·····

How to get the best results with uploaded files.

Target your questions. Ask for page- or section-scoped answers: “Summarize §4.2 (pp. 13–18) and extract dates as JSON.” This keeps token usage low and answers specific.
Prefer text layers. Export PDFs from Docs/Word instead of scanning. If you must scan, crop key figures as high-DPI images (≤20 MB) and ask targeted questions about each.
Request structured outputs. Tell ChatGPT to return CSV/JSON/Markdown tables for KPIs and logs; it reduces ambiguity and makes copy-out simple.
Use chained passes for long documents. Do section summaries → synthesis instead of “summarize 200 pages at once.” It’s faster, cheaper, and more faithful.
For spreadsheets, be explicit. Name columns/ranges (“compare B2:B500 vs. C2:C500; compute YoY%”) and ask for table outputs. Spreadsheet files aren’t subject to the 2M-token file cap, but overall context still matters.
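
The chained-pass tip (section summaries, then a synthesis) can be sketched as a small two-stage routine. This is an illustration under stated assumptions: token counts are estimated with a rough four-characters-per-token heuristic, and `summarize` is a hypothetical callable standing in for whatever step sends text to the model and returns its reply.

```python
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token for English.
    return max(1, len(text) // 4)


def pack_sections(sections: List[str], budget: int) -> List[List[str]]:
    """Greedily group consecutive sections so each batch stays within budget.

    A single section larger than the budget still gets its own batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for section in sections:
        cost = estimate_tokens(section)
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(section)
        used += cost
    if current:
        batches.append(current)
    return batches


def chained_summary(
    sections: List[str],
    summarize: Callable[[str], str],  # hypothetical model call
    budget: int = 2000,
) -> str:
    """First pass: summarize each batch. Second pass: synthesize the partials."""
    partials = [
        summarize("\n\n".join(batch)) for batch in pack_sections(sections, budget)
    ]
    return summarize("\n\n".join(partials))
```

Keeping sections in their original order means each partial summary maps to a contiguous page range, which makes it easier to ask page-scoped follow-ups afterwards.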

·····

Quick decision guide — best way to ingest your document.

Scenario	Best method	Why
Clean, text-layer PDF	Upload PDF; ask page-scoped Q&A	Preserves structure; minimal friction.
Scanned PDF with charts	Upload PDF (visual PDF retrieval) or crop and upload images ≤20 MB	Vision reads figures; images avoid bad OCR.
Huge spreadsheet	Upload XLSX/CSV; request CSV/Markdown outputs	Spreadsheet-friendly ingestion; structure retained.
Team repository (M365)	Share OneDrive/SharePoint links with proper permissions	Keeps enterprise governance intact.
Multi-cloud enterprise	Use org integrations (Drive/Dropbox/Box/SharePoint/OneDrive)	Unified access with tenant enforcement.
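
For teams that route documents programmatically, the guide above can be encoded as a small lookup. This is a sketch only: the `Source` enum, the `has_text_layer` flag, and the function name are hypothetical, and the returned strings simply restate the table rows.

```python
from enum import Enum
from typing import Optional


class Source(Enum):
    DEVICE = "device"
    M365 = "m365"                # OneDrive/SharePoint repository
    MULTI_CLOUD = "multi_cloud"  # org-level integrations


def ingestion_method(
    suffix: str,
    has_text_layer: bool = True,
    source: Source = Source.DEVICE,
) -> Optional[str]:
    """Return the 'best method' from the decision guide, or None if no row applies."""
    suffix = suffix.lower()
    # Governed sources take priority so permissions carry over.
    if source is Source.M365:
        return "Share OneDrive/SharePoint links with proper permissions"
    if source is Source.MULTI_CLOUD:
        return "Use org integrations (Drive/Dropbox/Box/SharePoint/OneDrive)"
    if suffix in {".xlsx", ".csv"}:
        return "Upload XLSX/CSV; request CSV/Markdown outputs"
    if suffix == ".pdf":
        if has_text_layer:
            return "Upload PDF; ask page-scoped Q&A"
        return "Upload PDF (visual PDF retrieval) or crop and upload images <= 20 MB"
    return None  # the guide has no row for this case
```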

·····

Plan differences that affect file workflows.

Plan	File uploads	Connected storage	Retention & training	Notes
Free	Limited quota; fewer uploads/day	—	Standard consumer deletion window; general privacy	Great for casual tests.
Plus	Full uploads within caps	—	Consumer retention; settings control data sharing	Good for solo research.
Team	Higher practical throughput	Org cloud links; collaborative spaces (as available)	Admin-scoped retention; no training on business data	Designed for small teams.
Enterprise / Edu	Highest reliability & throughput	Broad integrations, governed by tenant	Admin-controlled retention; no training by default	For sensitive/regulated use.

Exact numeric allowances can evolve; always check your in-product limits and admin console.

·····

Troubleshooting table (fast fixes).

Symptom	Likely cause	Fix
“File too large”	>512 MB file or image >20 MB	Split/compress; export PDFs without embedded scans; compress images.
“Upload limit reached”	Hit the rolling quota (e.g., 80 files / 3 hrs; Free 3/day)	Wait for the window reset or reduce batch size.
Messy table extraction	Scanned PDF / poor OCR	Re-upload the table as a cropped image; ask for CSV.
No access to cloud link	Permission mismatch on OneDrive/SharePoint	Share the file URL with proper org permissions.
Unexpected chat persistence	Legal hold / policy exception	Use Enterprise with admin-controlled retention for governed workflows.
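
To avoid the "Upload limit reached" error in batch workflows, a client-side counter can track uploads against the rolling quota. The sketch below assumes the 80 files per 3 hours figure from the limits table and a simple sliding window; how the service actually counts uploads is not documented here, and the quota may be lowered at peak load, so treat this as a conservative local estimate. The `RollingQuota` class is hypothetical.

```python
from collections import deque


class RollingQuota:
    """Track uploads inside a sliding time window (timestamps in seconds)."""

    def __init__(self, max_files: int = 80, window_seconds: float = 3 * 60 * 60):
        self.max_files = max_files
        self.window = window_seconds
        self._stamps = deque()  # upload times, oldest first

    def _expire(self, now: float) -> None:
        # Drop uploads that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def remaining(self, now: float) -> int:
        self._expire(now)
        return self.max_files - len(self._stamps)

    def record(self, now: float, count: int = 1) -> bool:
        """Register `count` uploads; return False (recording nothing) if over quota."""
        if count > self.remaining(now):
            return False
        self._stamps.extend([now] * count)
        return True

    def seconds_until_slot(self, now: float) -> float:
        """How long to wait before at least one upload slot frees up."""
        self._expire(now)
        if len(self._stamps) < self.max_files:
            return 0.0
        return self._stamps[0] + self.window - now
```

In practice, `now` would come from `time.monotonic()`; passing it in explicitly keeps the class easy to test.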

·····

Practical prompts to copy.

PDF sections → JSON: “From pages 12–18, extract all dates, amounts, and parties into JSON: {date, amount_usd, party, page}. Cite page numbers.”
Tables → CSV: “Convert the table on page 5 to CSV with headers; normalize thousands separators and ISO-date the first column.”
Spreadsheet analysis: “For Revenue and COGS columns, compute quarterly totals and YoY% by quarter; return a Markdown table and one-paragraph commentary.”
Figures explanation: “Explain the bar chart on page 9 in two bullets: trend, outliers. Then compare to the paragraph beneath it.”

These patterns force focused answers, keep token usage low, and produce outputs you can paste into Sheets/Excel, Docs, or dashboards.
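
Before pasting model output into a sheet or dashboard, it can help to validate it. The sketch below checks a reply to the "PDF sections → JSON" prompt against the {date, amount_usd, party, page} shape. Two assumptions go beyond the prompt: the reply is a JSON array of objects, and dates are expected in ISO format (YYYY-MM-DD). The function name and messages are illustrative.

```python
import json
from datetime import date
from typing import List

# Field names come from the prompt; the expected types are assumptions.
EXPECTED_FIELDS = {
    "date": str,
    "amount_usd": (int, float),
    "party": str,
    "page": int,
}


def validate_records(raw: str, first_page: int, last_page: int) -> List[str]:
    """Return a list of problems found in the model's JSON reply (empty if clean)."""
    try:
        records = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(records, list):
        return ["expected a JSON array of records"]

    problems: List[str] = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: not an object")
            continue
        for key, types in EXPECTED_FIELDS.items():
            value = rec.get(key)
            if key not in rec:
                problems.append(f"record {i}: missing '{key}'")
            elif isinstance(value, bool) or not isinstance(value, types):
                # bool is a subclass of int in Python, so reject it explicitly.
                problems.append(f"record {i}: '{key}' has the wrong type")
        if isinstance(rec.get("date"), str):
            try:
                date.fromisoformat(rec["date"])
            except ValueError:
                problems.append(f"record {i}: date is not YYYY-MM-DD")
        page = rec.get("page")
        if isinstance(page, int) and not isinstance(page, bool):
            if not first_page <= page <= last_page:
                problems.append(
                    f"record {i}: page {page} outside {first_page}-{last_page}"
                )
    return problems
```

A page number outside the requested range is a useful signal that the model drifted beyond the scoped section, which is worth a follow-up question rather than a silent fix.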

·····

Bottom line.

ChatGPT is a capable document and data reader: PDFs (including visual elements), docs, sheets, images, and code—all inside a single chat.
Know your ceilings: 512 MB/file, ~2M tokens per text/doc file, 20 MB/image, and rolling upload quotas. Plan to split long/scanned documents and ask page-scoped questions.
For organizations, Team/Enterprise add connected storage and no-training guarantees with admin-controlled retention—essential for governed document workflows.

.....

DATA STUDIOS

.....