ChatGPT: File Upload and Reading: formats, limits, retention, and Enterprise integrations
- Graziano Stefanelli
- Oct 25, 2025

ChatGPT can read, summarize, and analyze documents, spreadsheets, slides, code, and images directly inside a chat. Upload a file with the paperclip, ask questions (“extract all KPIs from pages 6–10”), and the model responds with summaries, tables, or structured JSON. For organizations, ChatGPT adds connected storage, admin governance, and retention controls—turning file Q&A into a dependable document-intelligence workflow.
·····
What you can upload—and what ChatGPT actually “reads.”
File type | What ChatGPT does | Notes |
PDF | Parses the text layer and, where enabled, performs visual retrieval over embedded images/diagrams/charts; supports section/page-scoped Q&A. | Upload via paperclip in chat, then ask follow-ups (“Summarize Methods; list all dates on pages 5–8”). |
DOCX / TXT / RTF | Extracts headings, lists, paragraphs; rewrites or summarizes. | Works well for policies, memos, essays. |
XLSX / CSV | Interprets cells, ranges, headers; explains formulas; can output CSV/Markdown tables. | Spreadsheet ingestion is “special” in a few limits; see below. |
Images (JPG/PNG/WEBP) | Vision analysis of diagrams, screenshots, handwriting; can describe, compare, translate. | 20 MB per image file. |
Code files | Explains, reviews, and refactors; can generate tests and diffs. | Supply the full function/class for best results. |
Visual PDF retrieval is particularly useful for scanned reports and figure-heavy documents: upload once, then ask the model to explain charts, annotate trends, or reconcile a table with narrative text.
·····
Hard limits to plan around (and what “counts”).
Limit | Value | Applies to | Source |
Max file size | 512 MB | Any single file upload (chat or custom GPT) | Community guidance & product docs summaries. |
Text/doc indexing cap | ~2,000,000 tokens per file | Text & document files (not spreadsheets) | Builder/community confirmations. |
Image file cap | 20 MB per image | Vision analysis | Help Center. |
Per-user storage cap | 10 GB per end user | Aggregate uploads | Help Center File Uploads FAQ. |
Per-org storage cap | 100 GB per org | Aggregate uploads | Help Center File Uploads FAQ. |
Upload cadence | Up to 80 files / 3 hours (Free: 3/day) | Rolling window; may be lowered at peak load | Help Center File Uploads FAQ. |
Concurrent batch size | UI often limits ~10–25 files per batch | Practical front-end ceiling | Community reports. |
Why it matters: very long PDFs and image-dense decks hit token and size ceilings fast. Split by chapter/section and keep images under 20 MB each to avoid stalls.
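
The ceilings above can be turned into a quick pre-upload check. The following is a minimal sketch, not an official tool: the function name `check_upload`, the binary-megabyte interpretation of "MB", and the 4-characters-per-token estimate are all assumptions made for illustration, and the limit values are simply copied from the table.

```python
from pathlib import Path
from typing import List, Optional

# Limits copied from the table above; they can change, so treat them as assumptions.
MAX_FILE_BYTES = 512 * 1024 * 1024   # 512 MB per file (binary MB assumed)
MAX_IMAGE_BYTES = 20 * 1024 * 1024   # 20 MB per image
MAX_DOC_TOKENS = 2_000_000           # ~2M tokens per text/document file
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}
SPREADSHEET_EXTS = {".xlsx", ".csv"}
CHARS_PER_TOKEN = 4                  # rough heuristic for English text, not an official figure


def check_upload(name: str, size_bytes: int, text_chars: Optional[int] = None) -> List[str]:
    """Return a list of likely problems; an empty list means the file looks fine."""
    problems = []
    ext = Path(name).suffix.lower()
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 512 MB")
    if ext in IMAGE_EXTS and size_bytes > MAX_IMAGE_BYTES:
        problems.append("image exceeds 20 MB")
    # The token cap is described as applying to text/doc files, not spreadsheets.
    if text_chars is not None and ext not in IMAGE_EXTS | SPREADSHEET_EXTS:
        estimated_tokens = text_chars // CHARS_PER_TOKEN
        if estimated_tokens > MAX_DOC_TOKENS:
            problems.append(f"~{estimated_tokens:,} estimated tokens exceeds ~2,000,000")
    return problems


if __name__ == "__main__":
    print(check_upload("scan.png", 25 * 1024 * 1024))
    print(check_upload("handbook.txt", 9_500_000, text_chars=9_000_000))
```

Any file that comes back with a non-empty list is a candidate for splitting or compressing before upload.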
·····
Where your files come from (device vs. connected storage).
Device uploads: drag-and-drop or use the paperclip inside any chat.
Connected apps: ChatGPT supports adding files from Microsoft OneDrive and SharePoint by sharing file URLs; the Help Center page describing this feature does not list Google Drive import by URL.
Enterprise & Team integrations: recent product updates and reporting highlight broader cloud integrations (e.g., Drive/Dropbox/Box/SharePoint/OneDrive) for business plans—rolled out within enterprise surfaces that enforce tenant permissions.
Practical takeaway: Individuals usually upload from the device; teams should use governed cloud links so permissions, versions, and retention policies carry over automatically.
·····
Retention, privacy, and training.
Standard tiers (consumer): Chats persist until you delete them; when deleted, they’re scheduled for permanent deletion within 30 days (exceptions apply for legal/security holds).
Enterprise/Edu/Business: Inputs/outputs aren’t used to train models by default; admins control retention and governance.
Legal preservation orders: news reporting noted a court directive requiring preservation of deleted consumer chats in the context of litigation; enterprise zero-retention orgs are excluded. Treat this as a legal exception, not normal policy.
Bottom line: for sensitive workflows, use Team/Enterprise/Edu so retention and no-training guarantees are policy-enforced.
·····
How to get the best results with uploaded files.
Target your questions. Ask for page- or section-scoped answers: “Summarize §4.2 (pp. 13–18) and extract dates as JSON.” This keeps token usage low and answers specific.
Prefer text layers. Export PDFs from Docs/Word instead of scanning. If you must scan, crop key figures as high-DPI images (≤20 MB) and ask targeted questions about each.
Request structured outputs. Tell ChatGPT to return CSV/JSON/Markdown tables for KPIs and logs; it reduces ambiguity and makes copy-out simple.
Use chained passes for long documents. Do section summaries → synthesis instead of “summarize 200 pages at once.” It’s faster, cheaper, and more faithful.
For spreadsheets, be explicit. Name columns/ranges (“compare B2:B500 vs. C2:C500; compute YoY%”) and ask for table outputs. Spreadsheet files aren’t subject to the 2M-token file cap, but overall context still matters.
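
The chained-pass tip can be planned before you start typing. This is a small sketch under stated assumptions: the helper names `page_ranges` and `chained_prompts`, the default of 25 pages per pass, and the prompt wording are all hypothetical choices, not anything ChatGPT requires.

```python
from typing import List, Tuple


def page_ranges(total_pages: int, pages_per_pass: int) -> List[Tuple[int, int]]:
    """Split pages 1..total_pages into consecutive inclusive ranges."""
    if total_pages < 1 or pages_per_pass < 1:
        raise ValueError("total_pages and pages_per_pass must be positive")
    return [
        (start, min(start + pages_per_pass - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_pass)
    ]


def chained_prompts(total_pages: int, pages_per_pass: int = 25) -> List[str]:
    """Build one summary prompt per page range, then a final synthesis prompt."""
    prompts = [
        f"Summarize pages {first}-{last} in five bullets; cite page numbers."
        for first, last in page_ranges(total_pages, pages_per_pass)
    ]
    prompts.append("Using only the section summaries above, write a one-page synthesis.")
    return prompts


if __name__ == "__main__":
    for prompt in chained_prompts(200, pages_per_pass=50):
        print(prompt)
```

Paste the prompts one at a time in the same chat so that the final synthesis step can draw on the earlier section summaries.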
·····
Quick decision guide — best way to ingest your document.
Scenario | Best method | Why |
Clean, text-layer PDF | Upload PDF; ask page-scoped Q&A | Preserves structure; minimal friction. |
Scanned PDF with charts | Upload PDF (visual PDF retrieval) or crop and upload images ≤20 MB | Vision reads figures; images avoid bad OCR. |
Huge spreadsheet | Upload XLSX/CSV; request CSV/Markdown outputs | Spreadsheet-friendly ingestion; structure retained. |
Team repository (M365) | Share OneDrive/SharePoint links with proper permissions | Keeps enterprise governance intact. |
Multi-cloud enterprise | Use org integrations (Drive/Dropbox/Box/SharePoint/OneDrive) | Unified access with tenant enforcement. |
·····
Plan differences that affect file workflows.
Plan | File uploads | Connected storage | Retention & training | Notes |
Free | Limited quota; fewer uploads/day | — | Standard consumer deletion window; general privacy | Great for casual tests. |
Plus | Full uploads within caps | — | Consumer retention; settings control data sharing | Good for solo research. |
Team | Higher practical throughput | Org cloud links; collaborative spaces (as available) | Admin-scoped retention; no training on business data | Designed for small teams. |
Enterprise / Edu | Highest reliability & throughput | Broad integrations, governed by tenant | Admin-controlled retention; no training by default | For sensitive/regulated use. |
Exact numeric allowances can evolve; always check your in-product limits and admin console.
·····
Troubleshooting table (fast fixes).
Symptom | Likely cause | Fix |
“File too large” | >512 MB file or image >20 MB | Split/compress; export PDFs without embedded scans; compress images. |
“Upload limit reached” | Hit the rolling quota (e.g., 80 files / 3 hrs; Free 3/day) | Wait for the window reset or reduce batch size. |
Messy table extraction | Scanned PDF / poor OCR | Re-upload the table as a cropped image; ask for CSV. |
No access to cloud link | Permission mismatch on OneDrive/SharePoint | Share the file URL with proper org permissions. |
Unexpected chat persistence | Legal hold / policy exception | Use Enterprise with admin-controlled retention for governed workflows. |
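
For large batches, the "upload limit reached" case can be anticipated with a local counter. The sketch below assumes the quota behaves like a simple sliding window of 80 files per 3 hours, as quoted in the limits table; the real server-side accounting is not documented here and may differ (the source notes it can be lowered at peak load). The class name `RollingUploadBudget` is hypothetical.

```python
from collections import deque


class RollingUploadBudget:
    """Client-side estimate of a sliding-window upload quota (assumed model)."""

    def __init__(self, max_files: int = 80, window_seconds: int = 3 * 60 * 60):
        self.max_files = max_files
        self.window_seconds = window_seconds
        self._stamps = deque()  # upload timestamps in seconds, oldest first

    def _expire(self, now: float) -> None:
        # Drop uploads that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def remaining(self, now: float) -> int:
        """Number of uploads still available at time `now`."""
        self._expire(now)
        return self.max_files - len(self._stamps)

    def record(self, now: float, count: int = 1) -> None:
        """Register `count` uploads made at time `now`."""
        if count > self.remaining(now):
            raise RuntimeError("batch would exceed the assumed rolling quota")
        self._stamps.extend([now] * count)

    def seconds_until_free(self, now: float) -> float:
        """Seconds to wait until at least one upload slot opens (0 if one is free)."""
        self._expire(now)
        if len(self._stamps) < self.max_files:
            return 0
        return self._stamps[0] + self.window_seconds - now


if __name__ == "__main__":
    budget = RollingUploadBudget()
    budget.record(now=0, count=80)
    print(budget.remaining(now=100), budget.seconds_until_free(now=100))
```

Timestamps are passed in explicitly so the planner can be tested without waiting; in real use you would pass `time.time()`.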
·····
Practical prompts to copy.
PDF sections → JSON: “From pages 12–18, extract all dates, amounts, and parties into JSON: {date, amount_usd, party, page}. Cite page numbers.”
Tables → CSV: “Convert the table on page 5 to CSV with headers; normalize thousands separators and ISO-date the first column.”
Spreadsheet analysis: “For Revenue and COGS columns, compute quarterly totals and YoY% by quarter; return a Markdown table and one-paragraph commentary.”
Figures explanation: “Explain the bar chart on page 9 in two bullets: trend, outliers. Then compare to the paragraph beneath it.”
These patterns force focused answers, keep token usage low, and produce outputs you can paste into Sheets/Excel, Docs, or dashboards.
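
Before pasting structured output into a downstream tool, it can help to check it mechanically. The sketch below validates a reply to the "PDF sections → JSON" prompt against the `{date, amount_usd, party, page}` shape. The function name `validate_records` is hypothetical, and the ISO-date check assumes you also asked for `YYYY-MM-DD` dates, which the prompt above does not explicitly require.

```python
import json
from datetime import date
from typing import List

REQUIRED_KEYS = {"date", "amount_usd", "party", "page"}


def validate_records(raw: str, first_page: int, last_page: int) -> List[str]:
    """Return a list of problems found in a JSON array of extracted records."""
    records = json.loads(raw)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    errors = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            errors.append(f"record {i}: not an object")
            continue
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            errors.append(f"record {i}: missing {sorted(missing)}")
            continue
        try:
            date.fromisoformat(rec["date"])  # assumption: ISO dates were requested
        except (TypeError, ValueError):
            errors.append(f"record {i}: date is not an ISO date")
        amount = rec["amount_usd"]
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            errors.append(f"record {i}: amount_usd is not a number")
        page = rec["page"]
        if not isinstance(page, int) or not first_page <= page <= last_page:
            errors.append(f"record {i}: page outside {first_page}-{last_page}")
    return errors


if __name__ == "__main__":
    reply = '[{"date": "2025-03-01", "amount_usd": 1200.5, "party": "Acme", "page": 12}]'
    print(validate_records(reply, first_page=12, last_page=18))
```

A non-empty result is a cue to re-ask with a tighter prompt rather than to fix the data by hand, since a page number outside the requested range may indicate that the model drew on the wrong part of the document.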
·····
Bottom line.
ChatGPT is a capable document and data reader: PDFs (including visual elements), docs, sheets, images, and code—all inside a single chat.
Know your ceilings: 512 MB/file, ~2M tokens per text/doc file, 20 MB/image, and rolling upload quotas. Plan to split long/scanned documents and ask page-scoped questions.
For organizations, Team/Enterprise add connected storage and no-training guarantees with admin-controlled retention—essential for governed document workflows.