Last year, Amazon Web Services introduced Trainium3, its third-generation chip for training large language models (LLMs). At the time, it was state-of-the-art, promising twice the speed of its predecessor. However, as AI models become larger and more complex, a better way is needed to reduce latency and training times. Today, the company is introducing Amazon EC2 Trainium3 UltraServers, a single, fully integrated system housing up to 144 Trainium3 chips, each built on a three-nanometer process.
AWS claims the new Trainium3 UltraServers can deliver 4.4 times more compute performance than the previous generation. With this power, developers should be able to work on AI projects “that were previously impractical or too expensive by training models faster, cutting time from months to weeks, serving more inference requests from users simultaneously, and reducing both time-to-market and operational costs.”
This announcement comes as AWS holds its annual re:Invent conference. But it’s not the only chip news—the company also announced that Trainium3 is now generally available and revealed details about its forthcoming Trainium4 chip.
Disclosure: I am attending AWS’ 2025 re:Invent as a guest, with a portion of my travel expenses covered by the company. However, Amazon did not influence the content of this post—these thoughts are entirely my own.
AWS reports that when testing Trainium3 UltraServers with OpenAI’s open-weight GPT-OSS model, each chip achieved three times the throughput and delivered responses four times faster than Trainium2 UltraServers. The company highlights that customers won’t need as much infrastructure to grow their AI apps, thereby lowering costs.
The EC2 Trainium3 UltraServers are now generally available.
Trainium3: Most Advanced Chip Set
When Trainium3 was unveiled in 2024, it was considered to be the most advanced chip set in AWS’ arsenal. It delivers a more than 40 percent improvement in energy efficiency compared to previous generations and comes with enhanced memory systems that eliminate bottlenecks when processing large AI models. Companies including Anthropic, Karakuri, Metagenomics, Neto.ai, Ricoh, and Splashmusic are said to be leveraging Trainium3 to halve training costs.
AWS initially said Trainium3 would launch by late 2025, and the company has kept its word.
Previewing Trainium4
Although AWS’s focus is on Trainium3, it couldn’t help but share details about its next silicon chip, Trainium4. The company asserts that Trainium4 will deliver six times the FP4 processing performance, three times the FP8 performance, and four times the memory bandwidth, which it claims will be plenty to tackle the next generation of frontier training and inference tasks.
The fourth-generation training chip is being designed to support NVIDIA’s NVLink Fusion high-speed interconnect technology, allowing Trainium4, Graviton, and EFA to work together within MGX racks. This will save IT departments money on AI infrastructure by enabling a single server to support both GPUs and Trainium chips.
There’s no word on when this new chip will be unveiled or when it will be generally available, though it could be by next year’s re:Invent.
Featured Image: At re:Invent 2024, AWS introduced Trainium3, its custom silicon chip for training large language models. Credit: Ken Yeung