
What Data Formats Can ChatGPT Handle for Knowledge Extraction


ChatGPT supports a wide variety of file formats for knowledge extraction across its different plan tiers. Whether you're uploading PDFs, spreadsheets, presentations, or code files, ChatGPT—especially with Advanced Data Analysis (ADA) enabled—can read, process, and extract structured information from these files with natural-language prompts. However, performance, parsing depth, and supported features vary based on file type, plan tier, and system limitations.



ChatGPT accepts a broad range of document and data formats.

ChatGPT can ingest and extract knowledge from many standard document types. The file types fall into several main categories:

| Category | Supported formats | Notes |
| --- | --- | --- |
| Text documents | PDF, DOCX, RTF, TXT, MD, ODT, EPUB | Text parsing only; embedded images ignored unless on the Enterprise tier. |
| Spreadsheets | CSV, TSV, XLS, XLSX, Google Sheets (via conversion) | Parsed via pandas; roughly 800,000 rows before sampling kicks in. |
| Presentations | PPT, PPTX, Google Slides (via conversion), Keynote | Slide text extracted; images omitted unless using Enterprise. |
| Code and data | JSON, XML, YAML, HTML, .py, .js, .java, etc. | Handled as structured text; UTF-8 encoding recommended. |
| Images | JPEG, PNG, WEBP, TIFF, BMP, GIF (static only) | Supported only on GPT-4o (Plus/Pro) and Enterprise plans. |

Google-native files (Docs, Sheets, Slides) are automatically converted under the hood into Microsoft-compatible formats when uploaded or retrieved from Drive. However, embedded content (e.g., inline images) is only parsed in Enterprise plans that support Visual Retrieval.
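Because spreadsheets are parsed via pandas and sampled beyond roughly 800,000 rows, it can be worth checking a file's shape locally before uploading it. A minimal sketch in Python, assuming a hypothetical warehouse_export.csv; the threshold is the figure quoted above, and the check is a local convenience rather than ADA's actual logic:

```python
import pandas as pd

ROW_THRESHOLD = 800_000  # approximate sampling threshold quoted above

# Hypothetical export; only the shape matters for this pre-upload check.
df = pd.read_csv("warehouse_export.csv")

print(f"{len(df):,} rows x {len(df.columns)} columns")
if len(df) > ROW_THRESHOLD:
    print("ChatGPT is likely to sample rather than read every row; "
          "consider splitting or filtering the file before upload.")
```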


Upload limits and size constraints affect file processing.

While ChatGPT accepts fairly large uploads, system constraints on memory, processing time, and reliability still apply. These limits affect performance across document types:

| Parameter | Limit |
| --- | --- |
| Max file size (all formats) | 512 MB |
| Max files per session | 10 files per conversation |
| Upload quota (Plus/Pro) | 80 files every 3 hours |
| Max practical spreadsheet size | ~50 MB or 800,000 rows |

Spreadsheets larger than 50 MB often trigger sampling or partial loading due to RAM constraints in the Python sandbox. Large PDFs or structured files should be uploaded in 50–100 MB chunks to avoid memory errors during analysis.
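One way to respect these limits is to split an oversized spreadsheet into row-based chunks before uploading. A minimal sketch with pandas, assuming a hypothetical sales_export.csv; the chunk size is something you would tune so each part lands under the ~50 MB mark:

```python
import os
import pandas as pd

SOURCE = "sales_export.csv"      # hypothetical oversized export
ROWS_PER_CHUNK = 500_000         # tune so each part stays under ~50 MB

# Stream the file in row batches so the full dataset never sits in memory,
# then write each batch as its own upload-ready CSV.
for i, chunk in enumerate(pd.read_csv(SOURCE, chunksize=ROWS_PER_CHUNK)):
    part_path = f"sales_export_part{i + 1}.csv"
    chunk.to_csv(part_path, index=False)
    size_mb = os.path.getsize(part_path) / 1_000_000
    print(f"{part_path}: {len(chunk):,} rows, {size_mb:.1f} MB")
```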


The depth of knowledge extraction depends on your plan.

Different subscription tiers provide different levels of file parsing capabilities:

| Plan | Text parsing | Embedded image parsing | Multimodal (vision) |
| --- | --- | --- | --- |
| Free (GPT-3.5) | Basic | No | No |
| Plus / Pro (GPT-4o) | Advanced (ADA) | Yes (in GPT-4o vision tier) | Yes (images only) |
| Enterprise | Advanced + Visual Retrieval | Yes (PDFs, PPTs, Slides) | Enhanced with OCR |

In lower tiers, embedded diagrams, figures, or screenshots inside a PDF or presentation are ignored completely. Enterprise users benefit from Visual Retrieval, which enables image captioning and OCR-driven parsing of images embedded in documents.


Best practices improve extraction reliability and structure.

To optimize results when uploading structured files like spreadsheets, JSON files, or long PDFs, users should apply prompt patterns that clearly define the task and schema. Recommended structures include:


For spreadsheets or tabular data:

File: inventory_2025.xlsx
Columns: Product_ID, Warehouse, Stock_Level, Last_Updated.
Task: Extract rows with Stock_Level < 10 and return sorted by Warehouse.
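
Inside the ADA sandbox, a prompt like this is typically translated into a short pandas operation. A local sketch of the equivalent logic, using the file and column names from the template above (illustrative only, not what ChatGPT literally executes; reading .xlsx with pandas also assumes openpyxl is installed):

```python
import pandas as pd

# File and column names mirror the prompt template above.
df = pd.read_excel("inventory_2025.xlsx")

# Extract rows with Stock_Level < 10, sorted by Warehouse.
low_stock = (
    df[df["Stock_Level"] < 10]
    .sort_values("Warehouse")
    [["Product_ID", "Warehouse", "Stock_Level", "Last_Updated"]]
)
print(low_stock.to_string(index=False))
```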

For long PDFs:

  • Split into smaller parts (≤ 50 MB); see the splitting sketch after this list

  • Use prompts like: “Extract key takeaways from pages 10–20”

  • Specify format: “Return as table with columns: Concept, Description, Source.”
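
For the first point, a long PDF can be split locally before upload. A minimal sketch, assuming the pypdf package and a hypothetical annual_report.pdf; any PDF splitter works equally well:

```python
from pypdf import PdfReader, PdfWriter

reader = PdfReader("annual_report.pdf")   # hypothetical long PDF
PAGES_PER_PART = 20                       # adjust until each part stays small

for start in range(0, len(reader.pages), PAGES_PER_PART):
    end = min(start + PAGES_PER_PART, len(reader.pages))
    writer = PdfWriter()
    for idx in range(start, end):
        writer.add_page(reader.pages[idx])
    part_path = f"annual_report_p{start + 1}-{end}.pdf"
    with open(part_path, "wb") as fh:
        writer.write(fh)
    print(f"Wrote {part_path}")
```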

The difference between “summarise” and “extract” is also important. Use extract when you want verbatim output (e.g., raw tables), and summarise when looking for analytical or prose-style output.


Some file types and encodings may cause issues.

While ChatGPT handles UTF-8, UTF-16, and ASCII encodings reliably, users have reported problems with exotic encodings like UTF-32 or platform-specific formats. In these cases:

  • Convert to UTF-8 before upload (a re-encoding sketch follows this list)

  • Avoid legacy binary formats (e.g., old XLS binaries)

  • Test small sections first if unsure of encoding consistency
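
For the first point, a minimal re-encoding sketch in Python, assuming a hypothetical notes_export.txt saved as UTF-16; substitute whatever encoding your tool actually produced (or detect it with a library such as chardet):

```python
SOURCE, TARGET = "notes_export.txt", "notes_export_utf8.txt"

# The source encoding is an assumption -- replace "utf-16" with the
# encoding your export actually uses.
with open(SOURCE, "r", encoding="utf-16", errors="strict") as src:
    text = src.read()

with open(TARGET, "w", encoding="utf-8", newline="") as dst:
    dst.write(text)

print(f"Wrote {TARGET} ({len(text):,} characters)")
```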

Similarly, scanned PDFs or files with poor OCR quality may not produce usable outputs unless paired with Enterprise-tier OCR tools or pre-processed externally.


Current limitations to be aware of

Despite its file flexibility, ChatGPT still has notable boundaries:

  • No native audio/video file upload support in the chat UI (as of September 2025). Transcription via Whisper is available in API-based workflows but not via drag-and-drop; a minimal API sketch follows this list.

  • No animation support in GIFs; only first frame or static content is parsed.

  • File previews and full parsing are disabled once the session ends unless files are reuploaded.

  • Embedded images in PDFs and presentations are ignored in all plans except Enterprise.
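
For audio, the API route mentioned in the first point looks roughly like this with the official openai Python SDK (a sketch, assuming a hypothetical interview.mp3 and an OPENAI_API_KEY set in the environment):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical audio file: the chat UI will not accept it directly,
# but the Whisper transcription endpoint will.
with open("interview.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# The transcript text can then be pasted or uploaded as a .txt file
# for knowledge extraction in the chat UI.
print(transcript.text)
```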

Users on lower plans will need to rely on external tools to convert visual content to text before ChatGPT can analyze it meaningfully.


ChatGPT’s file handling capabilities are extensive, but nuanced. It’s well-suited to knowledge extraction from PDFs, spreadsheets, structured text, and basic presentations, especially in the Plus, Pro, and Enterprise tiers where ADA and GPT-4o are active. For deep analysis, users should prepare cleanly encoded, well-structured files and use prompt templates that align schema with task. As OpenAI continues expanding multimodal access and visual parsing, the range of supported formats—and the fidelity of extraction—will likely grow further.

