Meta Launches Llama 4 Scout and Maverick, Open-Weight Multimodal Models That Outperform GPT-4 and Gemini

An AI-generated image of a llama standing next to the number four. Image credit: Adobe Firefly
Welcome to "The AI Economy," a weekly newsletter by Ken Yeung on how AI is influencing business, work, society, and technology. Subscribe now to stay ahead with expert insights and curated updates, delivered straight to your inbox.

Meta has released the first two models in its new Llama 4 family, which it claims will power “more personalized multimodal experiences.” On Saturday, the company introduced Llama 4 Scout and Maverick, both with 17 billion active parameters. Their debut follows a report that Meta delayed Llama 4’s launch multiple times after failing to meet the firm’s technical benchmark expectations.

The two Llama 4 variations are described as Meta's first "open-weight natively multimodal models with unprecedented context length support" and were built using a mixture-of-experts (MoE) architecture. Although it is not part of today's release, the company also said it has started previewing Llama 4 Behemoth, its "most intelligent teacher model for distillation," with 288 billion active parameters.

Image credit: Meta

“Whether you’re a developer building on top of our models, an enterprise integrating them into your workflows, or simply curious about the potential uses and benefits of AI, Llama 4 Scout and Llama 4 Maverick are the best choices for adding next-generation intelligence to your products,” the company wrote in a blog post.

In its own testing, Meta claimed the Llama 4 family bested OpenAI's GPT models and Google's Gemini. Specifically, Llama 4 Scout outperformed Google's Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1, while Maverick surpassed GPT-4o and Gemini 2.0 Flash and delivered results comparable to DeepSeek v3 on reasoning and coding. Meta also boasted that Llama 4 Behemoth, while still in training, topped GPT-4.5, Anthropic's Claude Sonnet 3.7, and Gemini 2.0 Pro across several STEM benchmarks.


Pre-Training Llama 4 With MoE

The new Llama 4 models were pre-trained using MoE, a method for making AI models smarter and more efficient. The approach is similar to the one Chinese AI lab DeepSeek used for the models behind its eponymous mobile app earlier this year. Instead of running every one of a model's "experts" on each input, an MoE model activates only the experts needed for a given token. That spares the model from burning unnecessary computing power on every task, making operations faster and cheaper while still delivering high-quality results.
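The routing idea can be illustrated with a minimal sketch. This is not Meta's implementation; the names, shapes, and random weights are purely illustrative, using a Scout-like configuration of 16 experts with one active per token:

```python
import numpy as np

rng = np.random.default_rng(0)

DIM, NUM_EXPERTS, TOP_K = 8, 16, 1  # illustrative sizes, not Llama 4's real dimensions

# Each "expert" is a small feed-forward layer (random weights for the sketch).
experts = [rng.standard_normal((DIM, DIM)) for _ in range(NUM_EXPERTS)]
gate = rng.standard_normal((DIM, NUM_EXPERTS))  # router that scores experts per token

def moe_forward(token):
    """Route one token vector to its top-k experts and mix their outputs."""
    scores = token @ gate                 # router logits, one per expert
    top = np.argsort(scores)[-TOP_K:]     # indices of the k highest-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()              # softmax over only the chosen experts
    # Only the selected experts run; the other 15 cost no compute for this token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(DIM))
print(out.shape)  # (8,)
```

The efficiency win is visible in the last step: per token, only `TOP_K` of the `NUM_EXPERTS` weight matrices are ever multiplied, so total parameter count can grow without a matching growth in per-token compute.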

A workflow showing how Meta used mixture of experts architecture to train its Llama 4 models. Image credit: Meta

Llama 4 Scout has 16 experts, Maverick has 128, and Behemoth has 16. The discrepancy is notable because Scout is geared toward lightweight, accessible use cases that can run on a single H100 GPU. Maverick, on the other hand, has more experts because it's the flagship Llama 4 variant, built to be robust and versatile across a wide range of tasks. With a greater expert count comes greater capacity and flexibility.

Since Behemoth, with its 16 experts, is unlikely to be used as frequently as its Maverick sibling, Meta has positioned it as the wise sage of the family, distilling its knowledge down to the smaller models. That it gets by with so few experts could suggest each of Behemoth's experts is more capable than those in the other variants.

Meta also says these models are designed with “native multimodality” and leverage “early fusion.” According to the company, this allows Llama 4 to be pre-trained with large amounts of “unlabeled text, image, and video data.”
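In rough terms, early fusion means image and text inputs are mapped into a shared embedding space and joined into one sequence before the model backbone processes anything, rather than bolting a vision encoder onto a finished text model. A minimal sketch, with entirely illustrative names and shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 8  # shared embedding width (illustrative, not Llama 4's actual size)

# Hypothetical encoders: text tokens and image patches are both projected
# into the SAME embedding space before the transformer ever sees them.
text_embeddings = rng.standard_normal((5, DIM))   # 5 text-token embeddings
image_embeddings = rng.standard_normal((3, DIM))  # 3 image-patch embeddings

# "Early fusion": concatenate modalities into one sequence so a single
# backbone attends jointly over text and image from the very first layer.
sequence = np.concatenate([text_embeddings, image_embeddings], axis=0)
print(sequence.shape)  # (8, 8)
```

Because both modalities live in one sequence from the start, the same pre-training run can consume mixed text, image, and video data, which is what "native multimodality" refers to here.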

Llama 4 Post-Training Challenges

Once pre-training was complete, Meta applied a post-training process designed to make Llama 4 smarter, more helpful, and more capable of handling conversations, images, and complex reasoning. For Maverick specifically, instead of feeding the model tons of easy examples, the company filtered those out and focused on harder, more challenging prompts, forcing Maverick to think more critically.

Fine-tuning for this model was done in several stages, moving from lightweight supervised fine-tuning to online reinforcement learning, followed by a final polish to improve response quality. Meta believes this makes Llama 4 Maverick a balanced, general-purpose AI capable of understanding images, solving problems, and carrying on natural conversations without sacrificing speed or efficiency.

How Meta's Llama 4 Maverick performs compared to leading competitors Google Gemini 2.0 Flash, DeepSeek v3.1, and OpenAI's GPT-4o. Image credit: Meta

What Makes This Different From Llama 3?

These advancements build on key improvements over the previous generation, as Llama 4 introduces capabilities well beyond those found in Llama 3. Unlike its predecessor, Llama 4 was designed with native multimodality. It uses early fusion to combine vision and text tokens within a single model, which should better support image-based applications. And as mentioned earlier, Llama 4 features a mixture-of-experts architecture, whereas Llama 3 uses a standard dense model, which is simpler but less efficient.

Other improvements include far longer context windows (Scout supports a context length of up to 10 million tokens), better benchmark performance, and broader multilingual training data (now spanning 200 languages).

Regardless, Meta believes its powerful new models will usher in more interactive experiences with AI agents and applications. These include general AI assistants, enterprise workflows, services that need to process and understand both images and text, vibe-coding tools, and help communicating with people around the world.

Llama 4 Scout and Llama 4 Maverick can be downloaded from Hugging Face and Llama.com. Meta says its Meta AI assistant is powered by these models starting today.
