All YandexGPT models available in 2025: complete list for web, app, and API with generation 5 variants and developer access
- Graziano Stefanelli
- Aug 11
- 3 min read

YandexGPT now powers the Alice assistant and offers Lite and Pro models through Yandex Cloud.
As of August 2025, Yandex has consolidated its large language model deployment under the YandexGPT brand, available to both end users and developers. The public-facing Alice assistant now runs on the YandexGPT 3+ family, while generation 5 YandexGPT Lite and Pro are offered through the Foundation Models section of Yandex Cloud for API use. These models support long context windows, versioned endpoints, and flexible modes of access.
Here we present a full breakdown of all current YandexGPT models, their availability across consumer and developer channels, and the context in which each should be used.
The Yandex web and app experience runs Alice with YandexGPT models in the backend.
The Alice assistant is available across Yandex-enabled devices, mobile apps, and smart home systems. In April 2024, Yandex confirmed that Alice was upgraded to YandexGPT 3, and since then, the platform has been continuously updated in the backend. Users do not select models manually—Alice automatically runs on the most current YandexGPT deployment, now described as generation 5 in internal documentation.
The experience is consistent across platforms, and updates are invisible to the end user. Alice now supports long-form answers, follow-ups, and document parsing thanks to its integration with YandexGPT’s high-context architecture.
Yandex Cloud exposes Lite and Pro models with version control and 32k token context.
For developers, Yandex offers full API access to its models through the Foundation Models section of Yandex Cloud. Two models are available for production use:
● YandexGPT Pro
This is the flagship model designed for deep reasoning, document analysis, and complex synthesis. It supports up to 32,000 tokens of context and is suitable for enterprise applications, multi-step Q&A, and structured generation. The model ID is yandexgpt, and versioning is managed through /rc, /latest, and /deprecated branches.
● YandexGPT Lite
A lightweight variant optimized for low-latency generation, chat UIs, and high-throughput applications. It shares the same 32k context window as Pro in generation 5 but is tuned for speed and responsiveness. The model ID is yandexgpt-lite.
Both models are accessed using the format:
gpt://<folder_ID>/yandexgpt
or
gpt://<folder_ID>/yandexgpt-liteYou
can append /latest, /rc, or /deprecated to lock the version.
Generation 5 models replace older versions and support structured rollout.
The most recent update (July 29, 2025) confirms that both YandexGPT Pro and Lite have reached generation 5 and are the current versions in the /latest and /rc branches. Prior generation 4 models remain available under /deprecated, but new applications are encouraged to target the newer builds.
Each model supports both synchronous and asynchronous generation modes, and developers can toggle between them based on latency and scalability needs.
Technical capabilities include memory, search augmentation, and chat orchestration.
YandexGPT models integrate seamlessly with the broader Yandex AI Assistant API, allowing developers to build assistant-like flows, including:
Memory persistence across turns
RAG (retrieval-augmented generation) using company data
Integration with Yandex Drive, Disk, and Mail
Developers can use the Quickstart SDKs and sample projects in Python, REST, and gRPC to deploy chatbots, writing tools, and internal copilots using Lite or Pro as the underlying LLM.
Platform availability and usage methods by interface.
The key distinction is that Alice auto-routes to the latest model, while developers on Yandex Cloud can explicitly select model type and generation.
Choosing between Lite and Pro depends on speed, complexity, and budget.
Choose YandexGPT Pro when you need:
Multi-document Q&A
Long-context reasoning
Structured workflows and enterprise copilots
Choose YandexGPT Lite when you need:
Fast, responsive answers
Lower cost per request
Simple completion tasks or short-turn chats
Both models are priced separately, and performance trade-offs are documented in the Foundation Models dashboard.
Summary of YandexGPT’s current model architecture.
Yandex continues to expand its LLM capabilities through a mix of closed and open-weight models, supporting not only its assistant but also a wide range of business integrations. A full benchmarking comparison of Lite vs Pro vs previous gens is available through Yandex Cloud documentation.
____________
FOLLOW US FOR MORE.
DATA STUDIOS

