Amazon Nova: Everything About This Next-Gen Foundation Model

Amazon Chief Executive Andy Jassy appears on stage at the day one keynote during the company's re:Invent developer show. He would later go on to announce Amazon Nova. Photo credit: Ken Yeung

Among the many announcements Amazon revealed at this year’s re:Invent conference, introducing a new foundation model family was among the most interesting. Named Amazon Nova, it is designed to enhance intelligence and content generation in applications while providing significant advancements in latency, cost-efficiency, customization, information grounding, and agent-like capabilities.

“It’s actually quite difficult to build a really good generative AI application,” Amazon Chief Executive Andy Jassy pointed out during a keynote address. He explains that simply having a model isn’t enough. Applications must also include proper guardrails, ensure fluent messaging, maintain low latency, and adopt a cost-effective structure. He emphasizes that developers often assume they’ve built a great AI app just because they’re using a powerful model, but true success requires much more than that. “It turns out you’re really only about 70 percent of the way there. The reality is customers don’t take kindly to apps that have 30 percent wonkiness.”

Image credit: Screenshot/AWS
Image credit: Screenshot/AWS

Amazon Nova is just one of Amazon’s foundational models. In addition, the company offers its Titan models, a series of large language models (LLMs) designed for natural language understanding and generation tasks. AWS also features CodeWhisperer, a model focused on code generation, and AlexaTM, which powers the conversational capabilities of Amazon Alexa. Furthermore, the text-to-speech model Polly and several other innovative models round out AWS’s extensive AI portfolio.

Take a look at everything you need to know about Amazon Nova:

Disclosure: I attended Amazon's 2024 re:Invent as a guest, with a portion of my travel expenses covered by the company. However, Amazon had no influence over the content of this post—these thoughts are entirely my own.

The Six Variants of Amazon Nova

Though most new model families typically have a couple of variations at launch, Amazon Nova has six. However, not all of them are generally available today. All can be accessed through Amazon Bedrock.

A list of Amazon Nova's text-only and multimodal model variations. Photo credit: Ken Yeung
A list of Amazon Nova’s text-only and multimodal model variations. Photo credit: Ken Yeung

Amazon Nova Micro

The first is Amazon Nova Micro, a text-only model offering the lowest latency response (200 tokens per second) at a very low cost. Jassy explained that some company employees use it to help with simple tasks. It supports more than 200 languages and includes a maximum of 128,000 tokens.

Amazon Nova Lite

The first of Nova’s three multimodal models, Nova Lite, is low-cost and “lightning fast” for processing image, video, and text inputs. Amazon describes it as useful for interactive and high-volume apps where cost is a significant factor. Like its Micro peer, Nova Lite supports more than 200 languages but has a higher maximum token limit of 300,000.

Amazon Nova Pro

The second multimodal model, Nova Pro, is highly capable and offers the best combination of accuracy, speed, and cost to handle a multitude of tasks. Amazon cites it as a model optimized for video summarization, Q&A, mathematical reasoning, software development, and AI agents capable of multi-step workflows. In addition, Nova Pro “excels” at instruction following and agentic workflows. It has language support and token limits similar to those of Nova Lite.

Amazon Nova ProAmazon Nova LiteAmazon Nova Micro
Output modalitiesTextTestText
Input modalities Text, Image, VideoText, Image, VideoText
Context Window300K300K128K
Max Output Tokens5K5K5K
Supported Languages200+200+200+
Document SupportPDF, CSV, DOC, DOCX, XLS, XLSX, HTML, TXT, MDPDF, CSV, DOC, DOCX, XLS, HTML, TXT, MDNo
Converse APIYesYesYes
Invoke APIYesYesYes
StreamingYesYesYes
Batch InferenceYesYesYes
Provisioned ThroughputYesYesYes
Bedrock Knowledge BasesYesYesYes
Bedrock AgentsYesYesYes
Bedrock GuardrailsYes (text only)Yes (text only)Yes
Bedrock EvaluationsYes (text only)Yes (text only)Yes
Bedrock Prompt FlowsYesYesYes
Bedrock StudioYesYesYes
Bedrock Batch InferenceYesYesYes

🔗 Amazon Nova Model Information

Amazon Nova Premier

Called the “most capable of Amazon’s multimodal models,” Amazon Nova Premier is suited for complex reasoning tasks within challenging applications. It can also serve as a teacher for custom models, distilling knowledge down to them so developers can build specialized versions for specific use cases. Unlike its other multimodal model peers, Nova Premier is unavailable today—Amazon lists it as coming sometime in Q1 2025.

Amazon Nova Canvas and Reels

The remaining two Nova variations are creative content generation and multimodal models. Amazon Nova Canvas is an image-generation model that would directly compete with DALL-E, Midjourney, Stable Diffusion, Adobe Firefly, and Imagen. It creates studio-quality images and features editing tools, controls for adjusting color scheme and layout, and guardrails for safe and responsible usage (e.g. watermarking and content moderation).

YouTube player

Amazon Nova Reels is a video-generation model that competes against Runway, Dream Machine, Sora, and Veo. That being said, a key difference is that Amazon’s model has an API and automated benchmarks. Initially, it will create up to six-second clips, but eventually, that limit will increase to two minutes. It supports the use of natural language prompts to control the visual style and pacing, features camera motion control (e.g. panning and 360-degree rotation), and has safe and responsible guardrails.

Both Nova Canvas and Nova Reels are coming soon.

Future Model Roadmap

Jassy declared Amazon plans to work on a second generation of Nova in 2025.

In addition, the company is working on two additional variations: The first is a speech-to-speech model in which you can speak to the model, and it will generate a speech response. It sounds similar to capabilities available in Google’s Gemini and OpenAI’s ChatGPT voice mode (remember how we thought Scarlett Johannsen was the voice of “Sky”?). Amazon plans to release this sometime in Q1 2025.

Amazon CEO Andy Jassy previews at re:Invent 2024 the future Nova AI models that the company plans to release in 2025. Photo credit: Screenshot
Amazon CEO Andy Jassy previews at re:Invent 2024 the future Nova AI models that the company plans to release in 2025. Photo credit: Screenshot

By mid-2025, the company hopes to release an any-to-any model, meaning multimodal-to-multimodal. Users should be able to provide a text, speech, image, or video input and receive an in-kind response.

“This is the future of how frontier models are going to be built and consumed,” Jassy claimed. “We really look forward to giving this to you.”

How Amazon Nova Compares

  • Benchmark evaluations of Amazon Nova Micro against Gemini 1.5 Flash 8B and Llama 3.1 8B. Image credit: Amazon
  • Benchmark evaluations of Amazon Nova Lite against Claude 3.5 Haiku, GPT-4o Mini, Gemini 1.5 Flash, and Llama 3.2 11B. Image credit: Amazon
  • Benchmark evaluations of Amazon Nova Pro against Claude 3.5 Sonnet, GPT 4.0, Gemini 1.5 Pro, and Llama 3.2 90B. Image credit: Amazon

Amazon has published benchmarks demonstrating that Nova performs competitively with many of the leading models in its class. With Nova Micro, Jassy states it performs equally or better than Google Gemini 1.5 Flash 8B and Meta’s Llama 3.1 8B across 11 tasks.

Regarding Nova Lite, Jassy claims it’s equal to or outperforms OpenAI’s GPT-4o Mini across 17 of 19 benchmarks, “equal to or better than” Google Gemini 1.5 Flash on 17 of 21 benchmarks, and does better than Anthropic Claude 3.5 Haiku on 10 of the 12 benchmarks.

Finally, Jassy highlights that Nova Pro matches or outperforms OpenAI’s GPT-4o on 17 out of 20 benchmarks. It also surpasses Google Gemini 1.5 Pro on 16 of 21 benchmarks. Compared to Anthropic’s Claude 3.5 Sonnet v2, the top model in its class, Nova Pro performs equally or better on half of the benchmarks.

The Amazon Nova Differentiation

During his presentation, Jassy mentioned several significant factors that may sway developers to Amazon Nova. The first is how cost-effective the model is—”They’re about 75 percent less expensive than the other leading models,” Amazon’s leader bragged.

He also argues that Amazon Nova is fast: “They’re the fastest models that you’ll see with respect to latency.” And developers will not only find the model on Amazon Bedrock, but also

He emphasized Nova’s low latency as a key advantage, stating, “They’re the fastest models you’ll see with respect to latency.” Nova also supports fine-tuning and model distillation—a feature AWS introduced last week that allows large models to train smaller, more efficient ones. Additionally, it integrates with Bedrock Knowledge Bases for Retrieval-Augmented Generation (RAG) and is specifically optimized for agent-based applications.

Another noticeable difference is that Amazon Nova is a proprietary foundation model. The company has not chosen to make it open-source in any way (e.g. weights, dataset, models, etc.).

It’s All About Choice

Although Amazon Nova has some heft, the company is not under any illusion that it will be the last model standing in the marketplace. Instead, it accepts that “one model to rule them all” is a myth, a belief more companies appear to be accepting. This is evident in the numerous models available within Amazon Bedrock, from the more popular large language models to more specialized ones.

However, as Ishit Vachhrajani, AWS’ Global Head of Enterprise Strategy, explained, the pitch to developers is that Amazon Nova is a model that has been put through the proverbial ringer. It’s been vetted through many areas within the tech company’s extensive systems, from its Rufus shopping assistant to Pharmacy, Prime Video, and other portfolio businesses. In short, Nova has proven use cases in which developers and business leaders can feel assured that the model will perform as expected.

“What we saw was that what we are building here is something that can help many other customers as well, which has always been our approach in terms of thinking in primitives, that if we are solving our own problems, maybe somebody else has the same issues that they would need our help with,” Vachhrajani remarked.

This might be an advantage Amazon brings that most model providers can’t replicate. The company has, in Vachhrajani’s words, “drank our own champagne,” applying its model in scenarios it has encountered across multiple product verticals, demonstrating its value and that it will specifically address workflows companies may want with their intelligent apps.

“All of our technology innovations are not just done for the sake of technology,” he clarifies. “They’re done to solve real-world business problems. And so that’s the huge differentiator—the fact that we actually use it inside of our large-scale operation across diverse sets of businesses. But now, you combine that differentiator with the fact that it’s also 75 percent cheaper in terms of price performance, that’s a pretty solid value proposition. So we are pretty excited about what customers are going to do with that.”

Featured Image: Amazon Chief Executive Andy Jassy appears on stage at the day one keynote during the company's re:Invent developer show. He would later go on to announce Amazon Nova. Photo credit: Ken Yeung

Leave a Reply

Discover more from Ken Yeung

Subscribe now to keep reading and get access to the full archive.

Continue reading