ChatGPT: File Upload and Reading: formats, limits, retention, and Enterprise integrations
- Graziano Stefanelli
- Oct 25

ChatGPT can read, summarize, and analyze documents, spreadsheets, slides, code, and images directly inside a chat. Upload a file with the paperclip, ask questions (“extract all KPIs from pages 6–10”), and the model responds with summaries, tables, or structured JSON. For organizations, ChatGPT adds connected storage, admin governance, and retention controls—turning file Q&A into a dependable document-intelligence workflow.
·····
.....
What you can upload—and what ChatGPT actually “reads.”
Visual PDF retrieval is particularly useful for scanned reports and figure-heavy documents: upload once, then ask the model to explain charts, annotate trends, or reconcile a table with narrative text.
·····
.....
Hard limits to plan around (and what “counts”).
Why it matters: very long PDFs and image-dense decks hit token and size ceilings fast. Split by chapter/section and keep images under 20 MB each to avoid stalls.
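A pre-upload check along the following lines can catch oversized files before they stall. This is only a minimal sketch: the 512 MB per-file and 20 MB per-image ceilings are the figures quoted in this article, while the function names, the list of image extensions, and the choice of 1 MB = 1024 × 1024 bytes are illustrative assumptions and not part of any ChatGPT tooling.

```python
from pathlib import Path

# Ceilings as quoted in this article; in-product limits may differ.
# Assumption: 1 MB is treated as 1024 * 1024 bytes.
MAX_FILE_BYTES = 512 * 1024 * 1024
MAX_IMAGE_BYTES = 20 * 1024 * 1024
# Assumed set of image extensions, for illustration only.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def upload_limit_for(name: str) -> int:
    """Return the size ceiling (in bytes) that applies to a file name."""
    suffix = Path(name).suffix.lower()
    return MAX_IMAGE_BYTES if suffix in IMAGE_EXTENSIONS else MAX_FILE_BYTES


def check_upload(name: str, size_bytes: int) -> tuple[bool, str]:
    """Report whether a file of the given size fits under its ceiling."""
    limit = upload_limit_for(name)
    if size_bytes > limit:
        return False, (
            f"{name}: {size_bytes} bytes exceeds the {limit}-byte ceiling; "
            "split or compress it"
        )
    return True, f"{name}: OK"
```

For a file on disk, the size can be read with `Path(name).stat().st_size` and passed to `check_upload`.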
·····
.....
Where your files come from (device vs. connected storage).
Device uploads: drag-and-drop or use the paperclip inside any chat.
Connected apps: ChatGPT supports adding files from Microsoft OneDrive and SharePoint by sharing file URLs; the Help Center page describing this feature does not list Google Drive as a URL-import source.
Enterprise & Team integrations: recent product updates and reporting highlight broader cloud integrations (e.g., Drive/Dropbox/Box/SharePoint/OneDrive) for business plans—rolled out within enterprise surfaces that enforce tenant permissions.
Practical takeaway: Individuals usually upload from the device; teams should use governed cloud links so permissions, versions, and retention policies carry over automatically.
·····
.....
Retention, privacy, and training.
Standard tiers (consumer): Chats persist until you delete them; when deleted, they’re scheduled for permanent deletion within 30 days (exceptions apply for legal/security holds).
Enterprise/Edu/Business: Inputs/outputs aren’t used to train models by default; admins control retention and governance.
Legal preservation orders: news reporting noted a court directive requiring preservation of deleted consumer chats in the context of litigation; enterprise zero-retention orgs are excluded. Treat this as a legal exception, not normal policy.
Bottom line: for sensitive workflows, use Team/Enterprise/Edu so retention and no-training guarantees are policy-enforced.
·····
.....
How to get the best results with uploaded files.
Target your questions. Ask for page- or section-scoped answers: “Summarize §4.2 (pp. 13–18) and extract dates as JSON.” This keeps token usage low and answers specific.
Prefer text layers. Export PDFs from Docs/Word instead of scanning. If you must scan, crop key figures as high-DPI images (≤20 MB) and ask targeted questions about each.
Request structured outputs. Tell ChatGPT to return CSV/JSON/Markdown tables for KPIs and logs; this reduces ambiguity and makes copy-out simple.
Use chained passes for long documents. Do section summaries → synthesis instead of “summarize 200 pages at once.” It’s faster, cheaper, and more faithful.
For spreadsheets, be explicit. Name columns/ranges (“compare B2:B500 vs. C2:C500; compute YoY%”) and ask for table outputs. Spreadsheet files aren’t subject to the 2M-token file cap, but overall context still matters.
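The chained-pass tip can be sketched as a small prompt generator. This is an illustration under assumptions: the function names, the chunk size, and the prompt wording are hypothetical, and the prompts are meant to be pasted into the chat one at a time rather than sent through any particular API.

```python
def page_ranges(total_pages: int, pages_per_pass: int) -> list[tuple[int, int]]:
    """Split pages 1..total_pages into consecutive inclusive ranges."""
    if total_pages < 1 or pages_per_pass < 1:
        raise ValueError("total_pages and pages_per_pass must be positive")
    return [
        (start, min(start + pages_per_pass - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_pass)
    ]


def chained_prompts(total_pages: int, pages_per_pass: int) -> list[str]:
    """Build one summary prompt per page range, then a final synthesis prompt."""
    prompts = [
        f"Summarize pages {first}-{last} in five bullets and extract dates as JSON."
        for first, last in page_ranges(total_pages, pages_per_pass)
    ]
    # The last prompt asks for the synthesis over the earlier section summaries.
    prompts.append("Combine the section summaries above into a one-page synthesis.")
    return prompts
```

For a 200-page document with 50 pages per pass, `chained_prompts(200, 50)` yields four section prompts followed by one synthesis prompt.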
·····
.....
Quick decision guide — best way to ingest your document.
·····
.....
Plan differences that affect file workflows.
Exact numeric allowances can evolve; always check your in-product limits and admin console.
·····
.....
Troubleshooting table (fast fixes).
·····
.....
Practical prompts to copy.
PDF sections → JSON: “From pages 12–18, extract all dates, amounts, and parties into JSON: {date, amount_usd, party, page}. Cite page numbers.”
Tables → CSV: “Convert the table on page 5 to CSV with headers; normalize thousands separators and ISO-date the first column.”
Spreadsheet analysis: “For Revenue and COGS columns, compute quarterly totals and YoY% by quarter; return a Markdown table and one-paragraph commentary.”
Figures explanation: “Explain the bar chart on page 9 in two bullets: trend, outliers. Then compare to the paragraph beneath it.”
These patterns force focused answers, keep token usage low, and produce outputs you can paste into Sheets/Excel, Docs, or dashboards.
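Structured output is easier to trust when it is checked before it goes into a sheet or dashboard. The sketch below validates a JSON reply against the {date, amount_usd, party, page} shape from the first prompt. It is an assumption-laden example: the function name is hypothetical, and it assumes the reply is a JSON array and that the prompt also asked for ISO-formatted dates, which the prompt above does not state.

```python
import json
from datetime import date

# Keys requested in the "PDF sections -> JSON" prompt.
REQUIRED_KEYS = {"date", "amount_usd", "party", "page"}


def validate_records(raw: str) -> list[str]:
    """Return a list of problems found in a JSON array of extracted records."""
    try:
        records = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(records, list):
        return ["expected a JSON array of records"]

    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: not an object")
            continue
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            problems.append(f"record {i}: missing {sorted(missing)}")
            continue
        # Assumption: dates were requested in ISO format.
        try:
            date.fromisoformat(rec["date"])
        except (TypeError, ValueError):
            problems.append(f"record {i}: date is not an ISO date")
        amount = rec["amount_usd"]
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            problems.append(f"record {i}: amount_usd is not a number")
        page = rec["page"]
        if isinstance(page, bool) or not isinstance(page, int):
            problems.append(f"record {i}: page is not an integer")
    return problems
```

An empty list means the reply matched the expected shape; anything else is a cue to re-prompt with the specific problem quoted back.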
·····
.....
Bottom line.
ChatGPT is a capable document and data reader: PDFs (including visual elements), docs, sheets, images, and code—all inside a single chat.
Know your ceilings: 512 MB/file, ~2M tokens per text/doc file, 20 MB/image, and rolling upload quotas. Plan to split long/scanned documents and ask page-scoped questions.
For organizations, Team/Enterprise add connected storage and no-training guarantees with admin-controlled retention—essential for governed document workflows.
.....
DATA STUDIOS
.....




