top of page

All YandexGPT models available in 2025: complete list for web, app, and API with generation 5 variants and developer access

ree

YandexGPT now powers the Alice assistant and offers Lite and Pro models through Yandex Cloud.

As of August 2025, Yandex has consolidated its large language model deployment under the YandexGPT brand, available to both end users and developers. The public-facing Alice assistant now runs on the YandexGPT 3+ family, while generation 5 YandexGPT Lite and Pro are offered through the Foundation Models section of Yandex Cloud for API use. These models support long context windows, versioned endpoints, and flexible modes of access.



Here we present a full breakdown of all current YandexGPT models, their availability across consumer and developer channels, and the context in which each should be used.


The Yandex web and app experience runs Alice with YandexGPT models in the backend.

The Alice assistant is available across Yandex-enabled devices, mobile apps, and smart home systems. In April 2024, Yandex confirmed that Alice was upgraded to YandexGPT 3, and since then, the platform has been continuously updated in the backend. Users do not select models manually—Alice automatically runs on the most current YandexGPT deployment, now described as generation 5 in internal documentation.


The experience is consistent across platforms, and updates are invisible to the end user. Alice now supports long-form answers, follow-ups, and document parsing thanks to its integration with YandexGPT’s high-context architecture.



Yandex Cloud exposes Lite and Pro models with version control and 32k token context.

For developers, Yandex offers full API access to its models through the Foundation Models section of Yandex Cloud. Two models are available for production use:


● YandexGPT Pro

This is the flagship model designed for deep reasoning, document analysis, and complex synthesis. It supports up to 32,000 tokens of context and is suitable for enterprise applications, multi-step Q&A, and structured generation. The model ID is yandexgpt, and versioning is managed through /rc, /latest, and /deprecated branches.


● YandexGPT Lite

A lightweight variant optimized for low-latency generation, chat UIs, and high-throughput applications. It shares the same 32k context window as Pro in generation 5 but is tuned for speed and responsiveness. The model ID is yandexgpt-lite.

Both models are accessed using the format:

gpt://<folder_ID>/yandexgpt 

or

gpt://<folder_ID>/yandexgpt-liteYou

can append /latest, /rc, or /deprecated to lock the version.



Generation 5 models replace older versions and support structured rollout.

The most recent update (July 29, 2025) confirms that both YandexGPT Pro and Lite have reached generation 5 and are the current versions in the /latest and /rc branches. Prior generation 4 models remain available under /deprecated, but new applications are encouraged to target the newer builds.

Each model supports both synchronous and asynchronous generation modes, and developers can toggle between them based on latency and scalability needs.



Technical capabilities include memory, search augmentation, and chat orchestration.

YandexGPT models integrate seamlessly with the broader Yandex AI Assistant API, allowing developers to build assistant-like flows, including:

  • Memory persistence across turns

  • RAG (retrieval-augmented generation) using company data

  • Integration with Yandex Drive, Disk, and Mail

Developers can use the Quickstart SDKs and sample projects in Python, REST, and gRPC to deploy chatbots, writing tools, and internal copilots using Lite or Pro as the underlying LLM.



Platform availability and usage methods by interface.

Platform

Model(s)

Selection Method

Alice (web, mobile, voice)

YandexGPT 3 → auto-updated

Not user-selectable

Yandex Cloud Foundation Models

YandexGPT Lite / Pro

Manual via URI

gRPC / REST API

Same

Controlled via headers and endpoints

AI Assistant (dialog orchestration)

YandexGPT Lite / Pro

Routed through Assistant API config

The key distinction is that Alice auto-routes to the latest model, while developers on Yandex Cloud can explicitly select model type and generation.



Choosing between Lite and Pro depends on speed, complexity, and budget.

  • Choose YandexGPT Pro when you need:

    • Multi-document Q&A

    • Long-context reasoning

    • Structured workflows and enterprise copilots

  • Choose YandexGPT Lite when you need:

    • Fast, responsive answers

    • Lower cost per request

    • Simple completion tasks or short-turn chats

Both models are priced separately, and performance trade-offs are documented in the Foundation Models dashboard.



Summary of YandexGPT’s current model architecture.

Model Name

Model ID

Max Context

Use Case

Status

YandexGPT Pro (Gen 5)

yandexgpt

32,000 tokens

Research, RAG, complex tasks

Stable (/latest)

YandexGPT Lite (Gen 5)

yandexgpt-lite

32,000 tokens

Chat, short answers, speed

Stable (/latest)

YandexGPT 3

Alice internal

n/a (server-managed)

Consumer assistant

Active

Yandex continues to expand its LLM capabilities through a mix of closed and open-weight models, supporting not only its assistant but also a wide range of business integrations. A full benchmarking comparison of Lite vs Pro vs previous gens is available through Yandex Cloud documentation.



____________

FOLLOW US FOR MORE.


DATA STUDIOS


bottom of page