ChatGPT: File Upload and Reading: formats, limits, retention, and Enterprise integrations

ChatGPT can read, summarize, and analyze documents, spreadsheets, slides, code, and images directly inside a chat. Upload a file with the paperclip, ask questions (“extract all KPIs from pages 6–10”), and the model responds with summaries, tables, or structured JSON. For organizations, ChatGPT adds connected storage, admin governance, and retention controls—turning file Q&A into a dependable document-intelligence workflow.

·····

What you can upload—and what ChatGPT actually “reads.”

| File type | What ChatGPT does | Notes |
| --- | --- | --- |
| PDF | Parses the text layer and, where enabled, performs visual retrieval over embedded images/diagrams/charts; supports section/page-scoped Q&A. | Upload via paperclip in chat, then ask follow-ups (“Summarize Methods; list all dates on pages 5–8”). |
| DOCX / TXT / RTF | Extracts headings, lists, paragraphs; rewrites or summarizes. | Works well for policies, memos, essays. |
| XLSX / CSV | Interprets cells, ranges, headers; explains formulas; can output CSV/Markdown tables. | Exempt from the ~2M-token per-file cap that applies to text documents; see the limits section. |
| Images (JPG/PNG/WEBP) | Vision analysis of diagrams, screenshots, handwriting; can describe, compare, translate. | 20 MB per image file. |
| Code files | Explains, reviews, and refactors; can generate tests and diffs. | Supply the full function/class for best results. |

Visual PDF retrieval is particularly useful for scanned reports and figure-heavy documents: upload once, then ask the model to explain charts, annotate trends, or reconcile a table with narrative text.

·····

Hard limits to plan around (and what “counts”).

| Limit | Value | Applies to | Source |
| --- | --- | --- | --- |
| Max file size | 512 MB | Any single file upload (chat or custom GPT) | Community guidance & product docs summaries. |
| Text/doc indexing cap | ~2,000,000 tokens per file | Text & document files (not spreadsheets) | Builder/community confirmations. |
| Image file cap | 20 MB per image | Vision analysis | Help Center. |
| Per-user storage cap | 10 GB per end user | Aggregate uploads | Help Center File Uploads FAQ. |
| Per-org storage cap | 100 GB per org | Aggregate uploads | Help Center File Uploads FAQ. |
| Upload cadence | Up to 80 files / 3 hours (Free: 3/day) | Rolling window; may be lowered at peak load | Help Center File Uploads FAQ. |
| Concurrent batch size | UI often limits ~10–25 files per batch | Practical front-end ceiling | Community reports. |

Why it matters: very long PDFs and image-dense decks hit token and size ceilings fast. Split by chapter/section and keep images under 20 MB each to avoid stalls.
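A quick local pre-flight check can catch oversized files before you upload them. The sketch below is only an illustration of the size caps quoted above: the function name check_upload is hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the cited sources do not say which definition applies.

```python
import os

MB = 1024 * 1024                  # assumption: binary megabytes
MAX_FILE_BYTES = 512 * MB         # per-file cap quoted in the limits above
MAX_IMAGE_BYTES = 20 * MB         # per-image cap quoted in the limits above
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file fits both caps."""
    problems = []
    extension = os.path.splitext(filename)[1].lower()
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{filename}: exceeds the 512 MB per-file cap")
    if extension in IMAGE_EXTENSIONS and size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"{filename}: exceeds the 20 MB per-image cap")
    return problems


if __name__ == "__main__":
    # Hypothetical files and sizes, for illustration only.
    for name, size in [("report.pdf", 40 * MB), ("chart.png", 25 * MB)]:
        print(name, check_upload(name, size) or "OK")
```

For real files, pass os.path.getsize(path) as size_bytes. The check says nothing about the token cap, which depends on the extracted text rather than the size on disk.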

·····

Where your files come from (device vs. connected storage).

  • Device uploads: drag-and-drop or use the paperclip inside any chat.

  • Connected apps: ChatGPT supports adding files from Microsoft OneDrive and SharePoint by sharing file URLs. The Help Center page that documents this feature does not list Google Drive as a URL-import source.

  • Enterprise & Team integrations: recent product updates and reporting highlight broader cloud integrations (e.g., Drive/Dropbox/Box/SharePoint/OneDrive) for business plans—rolled out within enterprise surfaces that enforce tenant permissions.

Practical takeaway: Individuals usually upload from the device; teams should use governed cloud links so permissions, versions, and retention policies carry over automatically.

·····

Retention, privacy, and training.

  • Standard tiers (consumer): Chats persist until you delete them; when deleted, they’re scheduled for permanent deletion within 30 days (exceptions apply for legal/security holds).

  • Enterprise/Edu/Business: Inputs/outputs aren’t used to train models by default; admins control retention and governance.

  • Legal preservation orders: news reporting noted a court directive requiring preservation of deleted consumer chats in the context of litigation; enterprise zero-retention orgs are excluded. Treat this as a legal exception, not normal policy.

Bottom line: for sensitive workflows, use Team/Enterprise/Edu so retention and no-training guarantees are policy-enforced.

·····

How to get the best results with uploaded files.

  1. Target your questions. Ask for page- or section-scoped answers: “Summarize §4.2 (pp. 13–18) and extract dates as JSON.” This keeps token usage low and answers specific.

  2. Prefer text layers. Export PDFs from Docs/Word instead of scanning. If you must scan, crop key figures as high-DPI images (≤20 MB) and ask targeted questions about each.

  3. Request structured outputs. Tell ChatGPT to return CSV/JSON/Markdown tables for KPIs and logs; it reduces ambiguity and makes copy-out simple.

  4. Use chained passes for long documents. Do section summaries → synthesis instead of “summarize 200 pages at once.” It’s faster, cheaper, and more faithful.

  5. For spreadsheets, be explicit. Name columns/ranges (“compare B2:B500 vs. C2:C500; compute YoY%”) and ask for table outputs. Spreadsheet files aren’t subject to the 2M-token file cap, but overall context still matters.
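The chained-pass idea in tip 4 can be planned ahead of time. The following is a minimal sketch, not a prescribed workflow: it splits a document into page ranges and drafts one page-scoped prompt per range, followed by a synthesis prompt. The function names, the batch size, and the prompt wording are all illustrative assumptions.

```python
def page_batches(total_pages: int, batch_size: int) -> list[tuple[int, int]]:
    """Split pages 1..total_pages into consecutive inclusive (first, last) ranges."""
    if total_pages < 1 or batch_size < 1:
        raise ValueError("total_pages and batch_size must be positive")
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]


def chained_prompts(total_pages: int, batch_size: int) -> list[str]:
    """One section-summary prompt per page range, then a final synthesis prompt."""
    prompts = [
        f"Summarize pages {first}-{last} in five bullets and extract dates as JSON."
        for first, last in page_batches(total_pages, batch_size)
    ]
    prompts.append("Combine the section summaries above into a one-page synthesis.")
    return prompts


if __name__ == "__main__":
    # A 200-page document in 60-page passes (the batch size is an arbitrary choice).
    for prompt in chained_prompts(200, 60):
        print(prompt)
```

Paste the prompts into the chat one at a time after uploading the file. Splitting along real chapter boundaries is usually better than fixed-size ranges when the document has a table of contents.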

·····

Quick decision guide — best way to ingest your document.

| Scenario | Best method | Why |
| --- | --- | --- |
| Clean, text-layer PDF | Upload PDF; ask page-scoped Q&A | Preserves structure; minimal friction. |
| Scanned PDF with charts | Upload PDF (visual PDF retrieval) or crop and upload images ≤20 MB | Vision reads figures; images avoid bad OCR. |
| Huge spreadsheet | Upload XLSX/CSV; request CSV/Markdown outputs | Spreadsheet-friendly ingestion; structure retained. |
| Team repository (M365) | Share OneDrive/SharePoint links with proper permissions | Keeps enterprise governance intact. |
| Multi-cloud enterprise | Use org integrations (Drive/Dropbox/Box/SharePoint/OneDrive) | Unified access with tenant enforcement. |

·····

Plan differences that affect file workflows.

| Plan | File uploads | Connected storage | Retention & training | Notes |
| --- | --- | --- | --- | --- |
| Free | Limited quota; fewer uploads/day | | Standard consumer deletion window; general privacy | Great for casual tests. |
| Plus | Full uploads within caps | | Consumer retention; settings control data sharing | Good for solo research. |
| Team | Higher practical throughput | Org cloud links; collaborative spaces (as available) | Admin-scoped retention; no training on business data | Designed for small teams. |
| Enterprise / Edu | Highest reliability & throughput | Broad integrations, governed by tenant | Admin-controlled retention; no training by default | For sensitive/regulated use. |

Exact numeric allowances can evolve; always check your in-product limits and admin console. 

·····

Troubleshooting table (fast fixes).

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| File too large | >512 MB file or image >20 MB | Split/compress; export PDFs without embedded scans; compress images. |
| Upload limit reached | Hit the rolling quota (e.g., 80 files / 3 hrs; Free 3/day) | Wait for the window reset or reduce batch size. |
| Messy table extraction | Scanned PDF / poor OCR | Re-upload the table as a cropped image; ask for CSV. |
| No access to cloud link | Permission mismatch on OneDrive/SharePoint | Share the file URL with proper org permissions. |
| Unexpected chat persistence | Legal hold / policy exception | Use Enterprise with admin-controlled retention for governed workflows. |
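If you batch uploads from a script or a checklist, a small client-side counter can estimate when the rolling quota frees up. This is a sketch under stated assumptions: the UploadWindow class is hypothetical, the defaults mirror the 80 files / 3 hours figure quoted above, and the real quota is enforced server-side with accounting that may differ (and may be lowered at peak load).

```python
from collections import deque


class UploadWindow:
    """Client-side estimate of a rolling upload quota (not the server's accounting)."""

    def __init__(self, max_files: int = 80, window_seconds: float = 3 * 60 * 60):
        self.max_files = max_files
        self.window_seconds = window_seconds
        self._stamps: deque[float] = deque()  # upload times, oldest first

    def _expire(self, now: float) -> None:
        # Drop uploads that have aged out of the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()

    def record(self, now: float) -> bool:
        """Register an upload at time `now`; return False if the quota is used up."""
        self._expire(now)
        if len(self._stamps) >= self.max_files:
            return False
        self._stamps.append(now)
        return True

    def seconds_until_slot(self, now: float) -> float:
        """Seconds to wait before the next upload fits (0.0 if one fits now)."""
        self._expire(now)
        if len(self._stamps) < self.max_files:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])


if __name__ == "__main__":
    import time

    window = UploadWindow()
    print(window.record(time.time()), window.seconds_until_slot(time.time()))
```

Timestamps are plain seconds, so time.time() works for live use and fixed numbers work for testing.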

·····

Practical prompts to copy.

  • PDF sections → JSON: “From pages 12–18, extract all dates, amounts, and parties into JSON: {date, amount_usd, party, page}. Cite page numbers.”

  • Tables → CSV: “Convert the table on page 5 to CSV with headers; normalize thousands separators and ISO-date the first column.”

  • Spreadsheet analysis: “For Revenue and COGS columns, compute quarterly totals and YoY% by quarter; return a Markdown table and one-paragraph commentary.”

  • Figures explanation: “Explain the bar chart on page 9 in two bullets: trend, outliers. Then compare to the paragraph beneath it.”

These patterns force focused answers, keep token usage low, and produce outputs you can paste into Sheets/Excel, Docs, or dashboards.
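Before pasting extracted JSON into a sheet or dashboard, it is worth validating it. The sketch below checks output from the first prompt against the {date, amount_usd, party, page} shape. It assumes the reply is a JSON array of objects and that dates are ISO formatted (YYYY-MM-DD); neither is guaranteed, so ask for both explicitly in the prompt. The function name validate_records is hypothetical.

```python
import json
from datetime import date

REQUIRED_KEYS = {"date", "amount_usd", "party", "page"}


def validate_records(raw: str, first_page: int, last_page: int) -> list[dict]:
    """Parse a JSON array of records and raise ValueError on the first bad one."""
    records = json.loads(raw)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {index}: not an object")
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"record {index}: missing {sorted(missing)}")
        date.fromisoformat(record["date"])  # raises ValueError on a non-ISO date
        amount = record["amount_usd"]
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            raise ValueError(f"record {index}: amount_usd is not a number")
        page = record["page"]
        if isinstance(page, bool) or not isinstance(page, int):
            raise ValueError(f"record {index}: page is not an integer")
        if not first_page <= page <= last_page:
            raise ValueError(f"record {index}: page {page} is outside the requested range")
    return records


if __name__ == "__main__":
    # Made-up sample reply, for illustration only.
    sample = '[{"date": "2024-03-01", "amount_usd": 1200.5, "party": "Acme", "page": 14}]'
    print(validate_records(sample, 12, 18))
```

A record citing a page outside the range you asked for is a useful signal that the answer drifted beyond the requested section and should be rechecked against the source.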

·····

Bottom line.

  • ChatGPT is a capable document and data reader: PDFs (including visual elements), docs, sheets, images, and code—all inside a single chat.

  • Know your ceilings: 512 MB/file, ~2M tokens per text/doc file, 20 MB/image, and rolling upload quotas. Plan to split long/scanned documents and ask page-scoped questions.

  • For organizations, Team/Enterprise add connected storage and no-training guarantees with admin-controlled retention—essential for governed document workflows.

DATA STUDIOS
