
Meta AI: Rollout updates for advanced models and feature expansions


Meta AI has accelerated its deployment of higher-tier Llama models, introducing larger context windows, faster throughput, and expanded multimodal capabilities. The rollout strategy is structured to move from tightly controlled design-partner programs to general availability, ensuring stability and compliance while delivering performance improvements to all user tiers.



The new model rollout follows a phased approach.

| Phase | Description |
| --- | --- |
| Design-partner trial | Limited to fewer than 500 users, focused on high-load and complex scenario testing. |
| Open beta | Available to subscription tiers with a feature toggle for two weeks. |
| Phased general release | Rolled out to Plus and Enterprise first, then to the Free tier within 10–14 days. |
| Enterprise enablement | Activated after SSO and encryption key validation for each organisation. |

This measured sequence reduces disruption and provides Meta with actionable telemetry from early-stage users before large-scale release.
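The phase gating described above can be sketched as a simple access check. The phase names mirror the table; the user fields (`design_partner`, `beta_toggle`, `sso_validated`, and so on) are hypothetical names chosen for illustration, not part of any Meta API.

```python
# Illustrative rollout gate: maps each phase from the table to an
# access decision. All field names are assumptions for this sketch.

PHASES = ["design_partner", "open_beta", "phased_ga", "enterprise"]

def has_access(current_phase: str, user: dict) -> bool:
    """Return True if the user qualifies for the model at this phase."""
    if current_phase == "design_partner":
        # Fewer than 500 hand-picked users.
        return user.get("design_partner", False)
    if current_phase == "open_beta":
        # Subscription tiers only, behind a feature toggle.
        return user.get("plan") != "free" and user.get("beta_toggle", False)
    if current_phase == "phased_ga":
        # Plus and Enterprise first; Free tier joins wave by wave.
        return user.get("plan") in ("plus", "enterprise") or user.get("free_wave_enabled", False)
    if current_phase == "enterprise":
        # Requires SSO and encryption key validation per organisation.
        return user.get("sso_validated", False) and user.get("keys_validated", False)
    return False
```

A gate like this keeps early phases narrow, which is what produces the early-stage telemetry the rollout depends on.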



New models bring higher limits and faster execution.

| Model version | Max context (tokens) | Output speed (tokens/s) | Multimodal support | Special features |
| --- | --- | --- | --- | --- |
| Llama 3.5 Turbo | 32,000 | 70 | Limited image beta | — |
| Llama 4 Turbo | 64,000 | 92 | Full image analysis | Partial tool-calling upgrade |
| Llama 4 Deep Think | 128,000 | 60 | Full image analysis, audio transcription | Chain-of-thought trace with tagging |
| Llama 4 Ultra | 256,000 | — | Video + code fusion (private alpha) | Not yet released for public use |

These upgrades allow more complex projects to be handled in a single conversation, reducing the need for manual summarisation or splitting inputs into multiple sessions.
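One way to act on these limits is to check, before sending, whether a project fits a given context window. The sketch below uses the context sizes from the table and a rough four-characters-per-token heuristic; that heuristic is an assumption for illustration, not Meta's tokenizer.

```python
# Sketch: pick the smallest model whose context window fits the input,
# falling back to session splitting when nothing fits.
# Context limits come from the table above; token estimation is a
# crude ~4 chars/token approximation, not an official tokenizer.

CONTEXT_LIMITS = {
    "llama-3.5-turbo": 32_000,
    "llama-4-turbo": 64_000,
    "llama-4-deep-think": 128_000,
    "llama-4-ultra": 256_000,
}

def estimate_tokens(text: str) -> int:
    """Rough token count: ~4 characters per token for English prose."""
    return max(1, len(text) // 4)

def smallest_model_for(text: str, reserve: int = 4_000):
    """Return the smallest-context model that fits the input while
    keeping `reserve` tokens free for the model's reply, or None
    if the input must be split across sessions."""
    needed = estimate_tokens(text) + reserve
    for model, limit in sorted(CONTEXT_LIMITS.items(), key=lambda kv: kv[1]):
        if needed <= limit:
            return model
    return None
```

When `None` comes back, the input still needs the manual summarisation or session splitting that the larger windows are meant to avoid.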


Availability varies by plan and location.

| Plan | Default model | Highest model opt-in | Access notes |
| --- | --- | --- | --- |
| Free | Llama 4 Turbo | Deep Think (10 calls/day) | Global, no cost changes |
| Plus | Llama 4 Turbo | Deep Think GA | Included in subscription |
| Meta AI+ | Llama 4 Turbo | Deep Think beta | 15% token surcharge for beta use |
| Enterprise | Custom mix | Deep Think or Ultra | Per-organisation pricing and governance controls |

Deep Think remains opt-in for most plans, with Ultra restricted to a small number of enterprise partners.
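The Free tier's 10-calls-per-day Deep Think allowance suggests a client-side guard so applications fail gracefully before hitting the cap. The limit matches the table above; the quota-tracking class itself is an illustrative sketch, not an official SDK feature.

```python
# Sketch of a client-side daily quota guard for metered Deep Think
# access (10 calls/day on the Free tier, per the table above).
from datetime import date

class DailyQuota:
    def __init__(self, limit: int = 10):
        self.limit = limit
        self.day = date.today()
        self.used = 0

    def try_consume(self) -> bool:
        """Reset the counter when the date rolls over, then permit a
        call only while the daily cap has not been reached."""
        today = date.today()
        if today != self.day:
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True
```

An application would call `try_consume()` before each Deep Think request and fall back to the default Turbo model when it returns `False`.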



Governance tools accompany each rollout.

| Control | Purpose |
| --- | --- |
| Model allow-list | Restricts available versions for a workspace. |
| Spend caps | Separate token and call limits by model tier. |
| Audit tagging | Flags preview calls with model-stage metadata. |
| Data residency | Ensures processing stays in EU, US, or APAC clusters. |
| No-train setting | Prevents conversation data from being stored or reused. |

These safeguards enable organisations to adopt new models without losing oversight of usage or compliance.
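Two of these controls, the allow-list and spend caps, naturally compose into a pre-flight check. The policy field names below are hypothetical, chosen to mirror the table; they are not Meta's admin API.

```python
# Illustrative workspace policy combining the governance controls
# above. All keys are assumed names for this sketch.
policy = {
    "model_allow_list": ["llama-4-turbo", "llama-4-deep-think"],
    "spend_caps": {"llama-4-deep-think": {"tokens_per_day": 500_000}},
    "audit_tagging": True,
    "data_residency": "EU",
    "no_train": True,
}

def check_request(model: str, tokens_used_today: int, policy: dict):
    """Apply allow-list and spend-cap checks before dispatching a call.
    Returns (allowed, reason)."""
    if model not in policy["model_allow_list"]:
        return False, f"{model} is not on the workspace allow-list"
    cap = policy["spend_caps"].get(model, {}).get("tokens_per_day")
    if cap is not None and tokens_used_today >= cap:
        return False, f"daily token cap ({cap}) reached for {model}"
    return True, "ok"
```

Running every request through a single check like this is what lets an organisation adopt a new model tier without losing usage oversight.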


Performance benchmarks show tangible improvements.

| Task | Llama 4 Turbo | Llama 4 Deep Think |
| --- | --- | --- |
| PDF Q&A with 10,000 tokens and images | 1.9 s first token, 420 ms retrieval | 2.6 s first token, 390 ms retrieval |
| Multi-image scene analysis | 5.3 s total | 5.0 s total |
| Audio-to-summary for 2-minute clip | — | 11.8 s total |

The Deep Think model trades slightly slower first-token latency for longer memory and reasoning depth, making it more effective for extended projects.
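To reproduce the first-token metric used above in your own tests, the measurement is straightforward against any streaming response. The helper below works over a generic token iterator; it is not tied to any particular SDK.

```python
# Sketch: measure first-token latency over a streaming response,
# the same metric reported in the benchmark table. `stream` can be
# any iterator that yields tokens as they arrive.
import time

def first_token_latency(stream):
    """Return (seconds until the first token, list of all tokens).
    Reports infinity if the stream yields nothing."""
    start = time.perf_counter()
    first = None
    tokens = []
    for tok in stream:
        if first is None:
            first = time.perf_counter() - start
        tokens.append(tok)
    return (first if first is not None else float("inf")), tokens
```

Comparing this number across models makes the Turbo/Deep Think trade-off concrete: Deep Think is slower to the first token but retains more context for the rest of the exchange.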



Known issues are tracked with workarounds.

| Issue | Affected model | Suggested workaround |
| --- | --- | --- |
| Memory truncation at 110,000 tokens | Deep Think | Split the session and send a summarised recap |
| Evening latency spikes in the EU | Turbo (Free tier) | Schedule generation during off-peak hours |
| Occasional JSON mismatch in tools | Turbo beta tools | Re-register the schema or simplify parameters |

Meta publishes these limitations with recommended mitigation steps in the model release notes.
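The "split session and send a summarised recap" workaround for the Deep Think truncation can be automated. The 110,000-token threshold comes from the table; the safety margin and the `summarise` callable are assumptions for this sketch (in practice, a cheap summarisation call would fill that role).

```python
# Sketch of the recommended Deep Think workaround: start a new
# session, seeded with a recap, before the ~110,000-token
# truncation point is reached.

TRUNCATION_THRESHOLD = 110_000  # known issue threshold (from the table)
SAFETY_MARGIN = 10_000          # assumed buffer; tune to taste

def needs_split(session_tokens: int) -> bool:
    """Trigger a split well before the documented truncation point."""
    return session_tokens >= TRUNCATION_THRESHOLD - SAFETY_MARGIN

def start_recap_session(history, summarise):
    """Begin a fresh session seeded with a compact recap of the old
    one. `summarise` is any callable that condenses the prior
    transcript into a short briefing (e.g. a cheap model call)."""
    recap = summarise("\n".join(history))
    return [f"Recap of previous session: {recap}"]
```

Splitting proactively, rather than after truncation is observed, avoids silently losing the oldest turns of a long project.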


The roadmap points toward even larger context and customisation.

Planned upgrades include a fine-tuning toolkit for Llama 4 Turbo, enabling LoRA adaptation on training sets of up to 25 million tokens; on-device inference for Ray-Ban smart glasses using a quantised Llama 3.5; and a code execution sandbox inside chats for Deep Think. These features are expected to extend both the autonomy and portability of the models in production environments.


