Arm’s original products were designed for battery-powered devices and helped revolutionize the mobile phone. That energy-efficiency DNA, deeply embedded in Arm, is now prompting the industry to rethink how chips should be built to meet the growing demands of AI.
In a typical server rack, compute chips alone can consume more than 50% of the power budget. Engineering teams are looking for ways to bring that number down, and every watt of reduction counts.
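To make that concrete, here is a back-of-the-envelope sketch of rack-level savings. Only the >50% compute share comes from the text above; the rack power budget and the efficiency gain are illustrative assumptions, not vendor figures.

```python
# Back-of-the-envelope rack power math.
# RACK_POWER_W and EFFICIENCY_GAIN are illustrative assumptions;
# the 50% compute share is the figure quoted in the text.
RACK_POWER_W = 20_000      # assumed total rack power budget (W)
COMPUTE_SHARE = 0.50       # compute chips' share of the budget
EFFICIENCY_GAIN = 0.15     # hypothetical 15% cut in compute power

compute_power = RACK_POWER_W * COMPUTE_SHARE   # power drawn by compute chips
watts_saved = compute_power * EFFICIENCY_GAIN  # savings per rack

print(f"Compute draw: {compute_power:.0f} W, savings: {watts_saved:.0f} W per rack")
```

At data-center scale, a savings of this size per rack multiplies across thousands of racks, which is why even single-digit efficiency gains matter.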
Because of this, the world’s largest AI-focused cloud service providers are turning to Arm technology to reduce power consumption. Compared with alternatives in the industry, Arm’s new Neoverse CPUs are the highest-performing and most power-efficient processors for cloud data centers. Neoverse gives leading cloud service providers the flexibility to customize chips for their most demanding workloads while delivering leading performance and energy efficiency. Every watt saved can be redirected to more compute. That is why Amazon Web Services (AWS), Microsoft, Google, and Oracle now use Neoverse technology for their general-purpose computing and CPU-based AI inference and training. The Neoverse platform is becoming the de facto standard in the cloud data center space.
From recent industry releases:
AWS Graviton, based on the Arm architecture: it delivers 25 percent better AI inference performance in Amazon SageMaker, 30 percent better performance for web applications, 40 percent better performance for databases, and 60 percent better energy efficiency than comparable products in the industry.
Google Cloud Axion, based on the Arm architecture: it improves performance by 50% and energy efficiency by 60% compared with traditional architectures, and supports CPU-based AI inference and training as well as services such as YouTube and Google Earth.
Microsoft Azure Cobalt, based on the Arm architecture: it delivers 40% higher performance, supports services such as Microsoft Teams, and couples with Maia accelerators to drive Azure’s end-to-end AI architecture.
Oracle Cloud uses Ampere Altra Max, based on the Arm architecture: the servers deliver 2.5x higher performance and 2.8x lower power consumption per rack than traditional equivalents, and are used for generative AI inference tasks such as summarization, tokenizing data for large language model training, and bulk inference use cases.
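The rack-level Oracle/Ampere figures can be folded into a single performance-per-watt ratio; treating the 2.5x performance and 2.8x power numbers as applying to the same workload is an assumption made here for illustration.

```python
# Derived performance-per-watt from the quoted rack-level figures.
# Combining the two ratios assumes they refer to the same workload.
perf_ratio = 2.5           # 2.5x higher performance
power_reduction = 2.8      # 2.8x lower power consumption

perf_per_watt = perf_ratio * power_reduction  # relative perf/W vs. baseline

print(f"Relative performance per watt: {perf_per_watt:.1f}x")  # prints 7.0x
```

The multiplication works because performance per watt is throughput divided by power: a 2.5x throughput gain at 1/2.8 the power compounds into roughly a 7x efficiency advantage.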
Clearly, Neoverse greatly improves the performance and energy efficiency of general-purpose computing in the cloud. Partners have also found that Neoverse brings the same benefits to accelerated computing. Large-scale AI training requires dedicated accelerated computing architectures, such as the NVIDIA Grace Blackwell platform (GB200), which combines NVIDIA’s Blackwell GPU architecture with the Arm-based Grace CPU. For large language models, this Arm-based computing architecture enables system-level design optimization, delivering 25x lower power consumption and 30x higher performance per GPU compared with NVIDIA H100 GPUs. These optimizations can lead to disruptive gains in performance and energy savings, all thanks to the unprecedented flexibility of chip customization that Neoverse brings.