How DeepSeek can be used for data analysis in chat and API workflows

Graziano Stefanelli
Sep 10

DeepSeek allows analysts to interact with spreadsheets through plain-language prompts and structured outputs.

DeepSeek’s models, specifically DeepSeek-V3.1 and the R1 (Reasoner) variant, offer capabilities that make them well suited to exploratory data analysis, even for users without coding expertise. With 128,000-token context windows, natural-language instruction parsing, and support for structured file uploads, the platform lets users work with real datasets directly inside the chat interface or through API integrations. These tools cater to analysts, business users, and developers who need quick insights, automated summaries, and exportable code suggestions.

You can upload spreadsheets directly in the chat interface for immediate insights.

In the DeepSeek web UI, users can upload common spreadsheet formats—including CSV and XLSX—directly into the chat window. The model immediately processes the content, allowing users to pose commands like:

“What are the most significant revenue sources by product line?”
“Summarize profit margin changes by quarter.”
“Create a table showing top 5 regions by net income in 2023.”

DeepSeek then replies with a structured summary, often in a mix of narrative bullets, markdown tables, and optional SQL snippets. This no-code workflow makes it accessible for professionals who need fast answers without writing scripts.

File size limits apply, and very large Excel files must be split or sampled.

As of September 2025, DeepSeek imposes a file upload limit of approximately 10 MB per spreadsheet in the consumer-facing web interface. Larger files often return a “file too large” error or simply fail to load, particularly if they include multiple sheets, embedded images, or many thousands of rows.

To work around these limits, users can:

Remove unnecessary columns or hidden rows.
Export only relevant sheets from Excel.
Convert the workbook to plain CSV, which drops formatting and embedded objects.
Use tools like Google Sheets or Python to downsample.
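The trimming and downsampling steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a prescribed workflow: the function name `shrink_csv`, the column names, and the 10% sample fraction are all assumptions made for the example.

```python
import csv
import io
import random


def shrink_csv(src, dst, keep_columns, sample_fraction, seed=0):
    """Copy only keep_columns and a random sample of rows from src to dst.

    src and dst are open text file objects. Returns the number of data rows written.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently drops columns that are not in keep_columns
    writer = csv.DictWriter(dst, fieldnames=keep_columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for row in reader:
        if rng.random() < sample_fraction:
            writer.writerow(row)
            written += 1
    return written


# Tiny in-memory demo with made-up data; with real files, pass
# open(path, newline="") objects instead of StringIO buffers.
raw = "region,product,revenue,notes\n" + "".join(
    f"R{i % 4},P{i % 3},{100 + i},internal comment {i}\n" for i in range(1000)
)
out = io.StringIO()
rows_kept = shrink_csv(io.StringIO(raw), out, ["region", "product", "revenue"], 0.1)
shrunk = out.getvalue()
print(rows_kept, len(raw), len(shrunk))
```

A random sample preserves the overall shape of the data for summary questions, but totals computed on it will no longer match the full dataset, so it is worth telling the model that the file is a sample.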

This limitation places DeepSeek behind platforms such as ChatGPT (35–100 MB per file) or Claude (up to 50 MB in total), although a planned upgrade for Q4 2025 may narrow the gap.

DeepSeek does not run code directly, but it suggests SQL or Python code for external execution.

One of the key distinctions in DeepSeek’s data handling is the lack of a native code execution sandbox. It does not include a feature comparable to ChatGPT’s Code Interpreter or Claude’s code execution tool.

Instead, when prompted, DeepSeek will:

Suggest a Python pandas snippet to replicate an analysis.
Format a SQL query for external databases.
Describe in prose how to build a certain visualization.

Model	Code Execution	Table Output	Chart Suggestions
DeepSeek-V3.1	❌ (no execution)	✅ Yes	✅ Described in text
DeepSeek-R1	❌ (no execution)	✅ Yes	✅ Described in text

Users are expected to copy and paste code into a Jupyter notebook, RStudio, or SQL console if they wish to perform deeper operations, calculations, or visualization.
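As an illustration of that copy-and-run step, the sketch below executes the kind of SQL query the model might suggest for the "top 5 regions by net income in 2023" prompt. It uses Python's built-in sqlite3 module and an in-memory database; the table name `sales`, its columns, and the sample figures are hypothetical and would need to match the real schema.

```python
import sqlite3

# Hypothetical query of the kind the assistant might return; adapt the
# table and column names to the actual database.
SUGGESTED_SQL = """
SELECT region, SUM(net_income) AS total_net_income
FROM sales
WHERE year = 2023
GROUP BY region
ORDER BY total_net_income DESC
LIMIT 5
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, net_income REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", 2023, 120.0),
        ("North", 2023, 80.0),
        ("South", 2023, 150.0),
        ("East", 2023, 90.0),
        ("West", 2023, 60.0),
        ("Central", 2023, 40.0),
        ("Islands", 2023, 10.0),
        ("South", 2022, 999.0),  # different year, excluded by the WHERE clause
    ],
)
top_regions = conn.execute(SUGGESTED_SQL).fetchall()
conn.close()

for region, total in top_regions:
    print(f"{region}: {total:.1f}")
```

Running suggested code against a small, known dataset like this is a quick way to check that the query does what the prose answer claimed before pointing it at production data.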

DeepSeek retains multi-turn memory across prompts, enabling iterative filtering and reasoning.

The 128,000-token context window is long enough to hold a moderately sized table along with an extended user conversation. This means the assistant can retain prior instructions, track previous calculations, and chain responses over time.
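Whether a given dataset fits in that window can be estimated roughly before uploading. The sketch below assumes about four characters per token, a common rule of thumb for English text rather than a property of DeepSeek's tokenizer, and reserves an arbitrary 20,000 tokens for the conversation itself; both numbers are assumptions.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # figure quoted in this article
CHARS_PER_TOKEN = 4              # rough rule of thumb, not DeepSeek's tokenizer


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(data_text, reserved_for_chat=20_000):
    """Check whether the data leaves room for the conversation itself."""
    return estimate_tokens(data_text) + reserved_for_chat <= CONTEXT_WINDOW_TOKENS


small_table = "region,revenue\n" * 1000  # 15,000 characters
large_table = "x" * 1_000_000            # 1,000,000 characters

print(estimate_tokens(small_table), fits_in_context(small_table))
print(estimate_tokens(large_table), fits_in_context(large_table))
```

Numeric tables often tokenize less efficiently than prose, so the real count may be noticeably higher than this estimate.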

A typical flow might include:

Upload: sales_2021_2024.csv
Prompt 1: “Summarize total annual revenue by region.”
Prompt 2: “Filter out any year with negative gross margin.”
Prompt 3: “Now forecast 2025 using a basic linear trend.”

The assistant will respond to each step using remembered context, offering results without requiring the user to repeat or reframe prior data points. This memory window allows deeper analysis compared to short-context tools, which often forget earlier instructions after a few turns.
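Because the model does not execute code, it can be useful to reproduce a flow like this locally and compare the numbers. The sketch below mirrors prompts 2 and 3 on hypothetical yearly totals (the figures are invented and simplified to one row per year): it drops years with negative gross margin, then fits an ordinary least-squares line to forecast 2025.

```python
# Hypothetical yearly totals standing in for sales_2021_2024.csv
yearly = [
    {"year": 2021, "revenue": 100.0, "gross_margin": 0.20},
    {"year": 2022, "revenue": 120.0, "gross_margin": -0.05},
    {"year": 2023, "revenue": 140.0, "gross_margin": 0.15},
    {"year": 2024, "revenue": 160.0, "gross_margin": 0.18},
]


def linear_forecast(points, target_x):
    """Fit a least-squares line through (x, y) points and evaluate it at target_x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    # Evaluate relative to the means to avoid a huge intercept term
    return mean_y + slope * (target_x - mean_x)


# Prompt 2: filter out any year with negative gross margin
kept = [row for row in yearly if row["gross_margin"] >= 0]

# Prompt 3: forecast 2025 using a basic linear trend
forecast_2025 = linear_forecast([(r["year"], r["revenue"]) for r in kept], 2025)

print([r["year"] for r in kept], round(forecast_2025, 2))
```

With only three or four points, a linear trend is a rough extrapolation at best, which is worth keeping in mind when reading the model's forecast as well.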

The API lets developers build data assistants using the same capabilities.

For teams building internal dashboards or research bots, DeepSeek provides a developer API with support for:

Structured JSON input/output
Batch requests
Multi-document input (via streaming or prompt formatting)
Custom system prompts and pre-injected instruction templates

The deepseek-reasoner model, for example, can take raw text or formatted tabular data, process it according to natural-language instructions, and return a structured output for visualization or logging.

API Feature	Status
Max tokens per request	128,000
JSON output support	✅ Yes
Function calling	✅ Yes (deepseek-chat; check the current docs for deepseek-reasoner)
Built-in tools (e.g. code execution)	❌ Not available
Streaming	✅ Yes

Pricing varies by token usage but remains competitive with models like GPT-4o or Claude 3.5 Sonnet, especially when used for lightweight tabular tasks or repeated analyses over similar data sources.

DeepSeek plans upgrades to support larger files and native spreadsheet functions.

Based on developer forum discussions and leaked roadmap details, the DeepSeek team appears to be preparing features that would address some current limitations:

Larger file uploads (30 MB or more)
Awareness of multiple sheets in Excel
Formula tracing for structured spreadsheets
Live SQL or Python integration (in early alpha)

These are expected by the end of 2025, which would position DeepSeek more competitively against platforms offering full scripting environments.

Analysts can already use DeepSeek for meaningful data analysis with smart prompting and a structured workflow.

To get the most out of DeepSeek for analysis:

Upload clean, trimmed data (≤10 MB).
Start with a high-level summary prompt.
Use follow-ups to filter, sort, group, or forecast.
Ask for code snippets if you plan to export logic.
Rely on markdown tables or structured replies to extract insights.

DeepSeek is already a capable assistant for conversational data analysis, especially when speed and structure matter more than executable code. While it lacks a runtime sandbox and has lower file limits than some competitors, its reasoning strength, long memory, and API flexibility make it a useful research and reporting partner across analytics workflows.

____________

DATA STUDIOS

datastudios.org