
All Meta AI models available in 2025: complete list for web, mobile, and developer APIs including Llama 4, 3.3, 3.2, and 3.1


Meta now runs Llama 4 across its consumer AI experiences and offers open and cloud-based access to Llama 3.x models.

As of August 2025, Meta AI's lineup spans mobile assistants, web interfaces, and API-ready open-weight models. The company has unified its consumer-facing assistant under the Llama 4 family, while continuing to support developers through a growing suite of open and managed models from the Llama 3.1, 3.2, and 3.3 series. Meta also provides real-time image generation through its proprietary Emu engine, available inside Meta AI products.


Here we present a detailed breakdown of every Meta AI model currently in use or available, structured by platform, deployment method, and technical capabilities.



Meta AI web and mobile assistants run on Llama 4 with Emu for image generation.

The standalone Meta AI assistant—accessible via meta.ai or inside WhatsApp, Instagram, Facebook, and Messenger—uses a server-side deployment of Llama 4. This model is responsible for all conversational tasks, reasoning, and question answering. Users do not manually select a model.


● Llama 4 (Scout and Maverick variants)

Meta’s flagship family, released in April 2025, includes Scout 17B-16E and Maverick 17B-128E, both based on a Mixture-of-Experts (MoE) architecture. These models are natively multimodal, accepting both text and images, and support long-context, grounded outputs. They are used inside Meta AI products and exposed to developers as open weights and cloud-managed endpoints.
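The Mixture-of-Experts idea behind Scout and Maverick can be illustrated with a minimal top-k router. This is a toy sketch, not Meta's implementation: the dimensions, expert count, and router are arbitrary, and real MoE layers route per token inside each transformer block.

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Toy Mixture-of-Experts forward pass for a single token.

    x        : (d,) input vector
    experts  : list of (d, d) weight matrices, one per expert
    gate_w   : (d, n_experts) router weights
    top_k    : number of experts each token is routed to
    """
    logits = x @ gate_w                   # one router score per expert
    top = np.argsort(logits)[-top_k:]     # keep the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only the chosen experts run, which is how a model with ~17B *active*
    # parameters can sit inside a much larger total parameter count.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16                      # arbitrary toy sizes
x = rng.normal(size=d)
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
y = moe_layer(x, experts, gate_w)
print(y.shape)  # (8,)
```

The point of the sparsity is compute cost: Maverick's 128 experts increase capacity without increasing the per-token work, since each token still activates only a small subset of them.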


● Emu (Imagine with Meta AI)

Emu is Meta’s generative image model, responsible for the “Imagine” experience available in the Meta AI app. It creates 1280×1280 images with watermarks, supporting style variations and basic editing. Emu is not publicly available as open weights or via a direct API, but it powers all in-app image generation.



Llama 4 models are also available as open weights and via managed cloud APIs.

While Llama 4 powers the assistant experience, it is also offered to developers through both open-weight model cards and cloud services like Amazon Bedrock. This provides multiple deployment options for enterprise and research users.

| Model Name | Type | Access Methods |
| --- | --- | --- |
| Llama 4 Scout (17B) | Text & image (MoE) | Hugging Face, Amazon Bedrock |
| Llama 4 Maverick (17B) | Text & image (high expert count) | Amazon Bedrock, self-host |
| Emu | Image generation only | Meta AI app only |

The Llama 4 models emphasize low latency, structured output, and real-world grounding through fine-tuning on supervised instruction data.
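As a sketch of the cloud-managed path, the following assembles a Llama 4 request for Amazon Bedrock's Converse API. The model ID and region are assumptions for illustration; check the Bedrock console for the identifiers actually enabled in your account.

```python
# Assumed Bedrock model ID for Llama 4 Scout -- verify in your region/console.
MODEL_ID = "meta.llama4-scout-17b-instruct-v1:0"

def build_converse_request(prompt: str, max_tokens: int = 512,
                           temperature: float = 0.2) -> dict:
    """Assemble keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": MODEL_ID,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": temperature},
    }

request = build_converse_request("List the Llama 4 variants in one sentence.")

# Live invocation (requires boto3, AWS credentials, and Bedrock model access):
#   import boto3
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   response = client.converse(**request)
#   print(response["output"]["message"]["content"][0]["text"])
```

One advantage of the Converse API is that the same request shape works across the Llama 3.x and Llama 4 families, so switching models is largely a matter of changing `modelId`.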



Llama 3.3 is widely used for high-performance text generation in the cloud.

Released in December 2024, Llama 3.3 70B has become a preferred choice for developers who need high-quality output at lower compute cost. The model is text-only and optimized for inference efficiency.


● Available via:

  • Amazon Bedrock

  • Amazon SageMaker

  • Azure AI Foundry

Its architecture offers strong performance in summarization, structured completions, and factual question answering.



Llama 3.2 introduces small text models and open vision capabilities.

The Llama 3.2 family expands on Meta’s open model strategy by including small text models and the first open-weight vision models. These models support image+text input and text output, making them suitable for hybrid reasoning tasks.

| Model Name | Modality | Intended Use |
| --- | --- | --- |
| Llama 3.2 1B / 3B | Text-only | Efficient inference (edge or mobile) |
| Llama 3.2 Vision 11B / 90B | Image+text in → text out | Vision tasks, open weights |

These are available on Hugging Face and some are integrated into Amazon Bedrock’s vision model set. Their release is part of Meta’s push toward open, explainable, and verifiable AI systems.
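Whether a 3.2 model fits on-device comes down largely to weight memory. A rough estimate, using the parameter counts from the model names and common quantization widths (the bit widths are stated assumptions; KV cache and activations add overhead on top):

```python
def weight_memory_gb(n_params_billion: float, bits_per_param: int = 16) -> float:
    """Rough memory needed for model weights alone, in gigabytes.

    Excludes KV cache, activations, and runtime overhead.
    """
    return n_params_billion * 1e9 * bits_per_param / 8 / 1e9

for model, size_b in [("Llama 3.2 1B", 1.0), ("Llama 3.2 3B", 3.0)]:
    for bits in (16, 8, 4):  # fp16/bf16, int8, int4 quantization
        print(f"{model} @ {bits}-bit: ~{weight_memory_gb(size_b, bits):.1f} GB")
```

By this estimate the 1B model needs roughly 2 GB at 16-bit and about 0.5 GB at 4-bit, which is why the small 3.2 models are the ones pitched at mobile and edge hardware.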



Llama 3.1 still powers legacy and enterprise deployments with 405B-scale capability.

Despite the availability of newer versions, Llama 3.1 remains in active use across academic and enterprise settings. It includes one of the largest open-weight models ever released:

  • Llama 3.1 8B / 70B – multilingual text models for general use

  • Llama 3.1 405B – frontier-scale model designed for high-end reasoning and completions


● Available via:

  • Open weights

  • Amazon Bedrock

  • Google Vertex AI (the 405B version is generally available)

The 405B model is frequently used in research and competitive benchmarking due to its scale and reproducibility.



Overview by product and developer access.

| Product / Platform | Model(s) Used | How It’s Selected |
| --- | --- | --- |
| Meta AI app (web/mobile) | Llama 4 (auto-routed) | Not user-selectable |
| Meta AI image generation | Emu | App-only |
| Developer (cloud-managed) | Llama 4, Llama 3.3 | Amazon Bedrock, Vertex AI |
| Developer (self-hosted) | Llama 4, 3.3, 3.2, 3.1 | Hugging Face / open weights |

Meta does not currently offer a public inference API for Emu or its LLMs; instead, it relies on open access and cloud partnerships for developers.



Choosing the right Meta model depends on task type and deployment environment.

  • For natural conversation or search-style queries → use Meta AI assistant (Llama 4)

  • For image generation → use Imagine with Meta AI (Emu, inside app)

  • For multimodal long-context reasoning → deploy Llama 4 Maverick or Scout via Bedrock

  • For cost-efficient text applications → use Llama 3.3 70B

  • For lightweight inference or on-device → use Llama 3.2 1B/3B

  • For vision-based tasks → use Llama 3.2 Vision models

  • For frontier-scale reasoning or research → use Llama 3.1 405B
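The decision list above can be collapsed into a small lookup helper. This is purely illustrative: the task categories and model names mirror this article, not any official Meta API.

```python
def pick_llama_model(task: str) -> str:
    """Map a task category to the model suggested in the list above."""
    table = {
        "conversation": "Llama 4 (Meta AI assistant)",
        "image_generation": "Emu (Imagine with Meta AI, in-app only)",
        "multimodal_long_context": "Llama 4 Scout/Maverick via Bedrock",
        "cost_efficient_text": "Llama 3.3 70B",
        "on_device": "Llama 3.2 1B/3B",
        "vision": "Llama 3.2 Vision 11B/90B",
        "frontier_research": "Llama 3.1 405B",
    }
    try:
        return table[task]
    except KeyError:
        raise ValueError(f"unknown task category: {task!r}") from None

print(pick_llama_model("cost_efficient_text"))  # Llama 3.3 70B
```

In practice the deployment environment matters as much as the task: a vision workload that must run in a regulated on-prem environment, for example, points to the self-hosted 3.2 Vision weights rather than a managed endpoint.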


Meta’s Llama family now represents one of the most flexible open AI stacks, covering real-time assistants, multimodal tasks, and custom deployments—from mobile to server scale.





DATA STUDIOS

