ChatGPT for extracting tables and text from PDF documents
- Graziano Stefanelli
- Sep 16
- 4 min read

ChatGPT now supports advanced workflows for extracting structured content from PDF documents, including both plain text and tabular data. With file upload capabilities, Python-based processing through Advanced Data Analysis (ADA), and configurable prompts, it offers a flexible solution for tasks that traditionally required dedicated OCR or PDF parsing tools. This article examines how ChatGPT handles PDF inputs, how to extract tables correctly, and what practical limits users should consider across various plans.
You can upload PDFs directly into ChatGPT across all plans.
ChatGPT supports native file uploads across Free, Plus, Team, and Enterprise plans. PDFs up to 512 MB in size and 2 million tokens in content can be parsed by the system. The interface accepts file drops or attachments using the paperclip icon. The current limits are:
- Per file: 512 MB max (all file types)
- Per PDF: capped at 2 million tokens (roughly 1.5 million words)
- Free tier: 3 file uploads per day
- Plus, Pro, Team tiers: up to 80 files per 3 hours
- Per chat: up to 10 files can be attached at once (UI-limited)
The file content is automatically indexed and segmented by ChatGPT for downstream retrieval within the same conversation. PDF structure is preserved in most cases, and users can reference specific pages or sections directly.
Advanced Data Analysis enables table extraction from digital PDFs.
For Plus, Team, and Enterprise users, enabling Advanced Data Analysis activates a code interpreter environment behind the scenes. This allows ChatGPT to use Python libraries such as pandas, camelot, or tabula-py to process uploaded PDFs.
Typical table extraction workflow:
1. Upload the PDF file via the paperclip icon.
2. Use a prompt such as: “Extract all tables and export them as CSV files. Return a zip file and list the page numbers.”
3. ChatGPT identifies table structures, processes them via Python, and packages the results.
The system usually returns:
- A downloadable ZIP file with individual CSVs for each table.
- A Markdown table listing pages and table headers.
- A brief explanation of what was extracted and any formatting issues.
These steps work best on digitally generated PDFs with clear horizontal and vertical borders. Tables with merged cells, rotated layouts, or inconsistent column widths may lead to parsing errors or flattened results.
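The packaging step at the end of this workflow — bundling per-table CSVs into one ZIP — can be sketched with the Python standard library. This is an illustration of the pattern, not ChatGPT's actual internal code; the shape of `tables` (filename mapped to rows) is an assumption about what an upstream camelot/tabula call would hand over.

```python
import csv
import io
import zipfile

def tables_to_zip(tables):
    """Bundle extracted tables into an in-memory ZIP of CSV files.

    `tables` maps a filename like "table_page_3_index_0.csv" to a list
    of rows (each row a list of cell values) — a hypothetical shape for
    whatever the upstream table-extraction library returned.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, rows in tables.items():
            text = io.StringIO()
            csv.writer(text).writerows(rows)   # serialize one table to CSV
            archive.writestr(name, text.getvalue())
    return buffer.getvalue()

# Example: two small tables found on pages 3 and 5.
payload = tables_to_zip({
    "table_page_3_index_0.csv": [["item", "qty"], ["widgets", "4"]],
    "table_page_5_index_0.csv": [["region", "sales"], ["EU", "1200"]],
})
```

Building the archive in memory (rather than on disk) mirrors how a sandboxed interpreter would hand a single downloadable artifact back to the chat interface.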
OCR and embedded images are not processed outside the Enterprise plan.
A frequent misconception is that GPT-4o or GPT-5 preview models can extract text from images embedded in PDFs. This is not the case unless the user is on an Enterprise plan with Visual Retrieval enabled.
For Free, Plus, and Team users:
- Embedded images in PDF files are ignored entirely.
- If the PDF consists of scanned images (e.g., scanned contracts or invoices), ChatGPT will return empty results unless an OCR text layer has already been embedded.
For Enterprise:
- Visual Retrieval enables image parsing, inline OCR, and layout-aware extraction.
- Footnotes, charts, and screenshots are processed with layout sensitivity.
Output formats and export workflows are optimized for tables.
When more than one table is detected:
- ChatGPT automatically creates a ZIP archive containing each table in a separate CSV file.
- File names typically follow the pattern `table_page_5_index_2.csv`.
- Headers are inferred, and column types are detected where possible (e.g., numerical, categorical).
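Header and type inference of this kind can be approximated with a small heuristic. The sketch below is an illustration of the idea, not ChatGPT's actual logic: it treats the first row as the header and calls a column numerical only if every cell parses as a number.

```python
def infer_column_type(values):
    """Classify a column as 'numerical' or 'categorical'.

    A deliberately simple heuristic: if every non-empty cell parses
    as a float, call the column numerical; otherwise categorical.
    """
    cells = [v for v in values if v != ""]
    try:
        for cell in cells:
            float(cell)
        return "numerical"
    except ValueError:
        return "categorical"

# First row is treated as the inferred header.
rows = [["region", "sales"], ["EU", "1200"], ["US", "950.5"]]
header = rows[0]
columns = list(zip(*rows[1:]))          # transpose the data rows
types = {name: infer_column_type(col) for name, col in zip(header, columns)}
# types == {"region": "categorical", "sales": "numerical"}
```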
Users may also prompt the system to:
- Combine all tables into a single Excel file with named sheets.
- Return tables as Markdown, HTML, or LaTeX code for copy-paste use.
- Generate inline charts using Matplotlib or Seaborn for quick visualization.
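The Markdown option is easy to picture. The helper below is a hypothetical sketch of the conversion — one plausible way to render rows as a GitHub-flavored Markdown table for copy-paste use:

```python
def to_markdown(header, rows):
    """Render a table as GitHub-flavored Markdown."""
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",  # separator row
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

md = to_markdown(["region", "sales"], [["EU", 1200], ["US", 950]])
print(md)
```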
Example prompt:
“Extract all tables from pages 3 to 10. Export each table as a CSV and return a ZIP file. List the table headers and their corresponding page numbers.”
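Behind a page-range prompt like this sits a simple filter over the extracted results. A minimal sketch, assuming each table is tagged with its source page (a hypothetical structure, not ChatGPT's internal representation):

```python
def tables_in_range(tables, first_page, last_page):
    """Keep only tables whose source page falls in [first_page, last_page].

    `tables` is a list of (page_number, table_rows) pairs — an assumed
    shape for results coming out of an upstream extraction step.
    """
    return [(page, rows) for page, rows in tables
            if first_page <= page <= last_page]

extracted = [(2, [["a"]]), (4, [["b"]]), (11, [["c"]])]
selected = tables_in_range(extracted, 3, 10)   # keeps only the page-4 table
```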
There are technical and practical limits to consider.
Despite its flexibility, ChatGPT’s PDF capabilities are bounded by both model behavior and file structure:
| Limitation | Details |
| --- | --- |
| Token cap | Content beyond 2 million tokens is silently dropped. |
| No embedded image processing | Images are ignored entirely unless using Enterprise Visual Retrieval. |
| Scanned tables | Often fail to parse unless preprocessed with external OCR tools. |
| Table detection sensitivity | Works best on clean, bordered tables with consistent row spacing. |
| UI file limit | No more than 10 files per chat can be uploaded at once (not documented). |
When working with large documents, users are encouraged to:
- Split PDFs into logical sections to stay under the token cap.
- Pre-OCR scanned documents using tools like Adobe Acrobat or Tesseract.
- Use precise page range references to narrow the scope of extraction.
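To judge whether a document will fit under the 2-million-token cap before uploading, a rough character-based estimate is often enough. The ~4-characters-per-token ratio below is a common rule of thumb for English text — an assumption, not an official tokenizer:

```python
TOKEN_CAP = 2_000_000
CHARS_PER_TOKEN = 4          # rough heuristic for English text, not exact

def estimated_tokens(text):
    """Approximate the token count of already-extracted PDF text."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_cap(text, cap=TOKEN_CAP):
    """True if the document likely stays under ChatGPT's token cap."""
    return estimated_tokens(text) <= cap
```

If the estimate lands near the cap, split the PDF at chapter or section boundaries and upload the parts separately.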
Privacy policies differ based on plan and model training settings.
ChatGPT’s handling of uploaded files is governed by plan-specific data retention and training policies:
- Free / Plus / Pro users: files may be used to improve the model only if the "Improve the model" setting is enabled. Users can disable this in settings.
- Team, Business, and Enterprise users: uploaded documents are excluded from training, encrypted at rest, and stored in a tenant-isolated workspace.
- Enterprise admins can control file retention, audit logs, and data export permissions.
Summary table: ChatGPT PDF and table extraction overview
| Feature | Supported Plans | Notes |
| --- | --- | --- |
| File upload (512 MB) | All plans | Hard size cap |
| Token limit (2M tokens) | All plans | Text beyond this is ignored |
| Table extraction via ADA | Plus, Team, Enterprise | Uses Python backend |
| OCR / Vision on embedded images | Enterprise only | Visual Retrieval required |
| CSV/ZIP export | Plus, Team, Enterprise | Auto-generated on table detection |
| File privacy and retention control | Business, Enterprise | Tenant-isolated, admin-controlled |
This makes ChatGPT a practical tool for anyone needing to parse and extract structured data from PDFs—especially in research, finance, or reporting workflows—so long as the input format is clean and within limits. For highly structured extraction, Enterprise plans unlock OCR and visual document understanding, while ADA gives Plus users powerful scripting support for numerical and tabular output.
____________
DATA STUDIOS