Amazon unveils new chips for training and running AI models. As demand for generative artificial intelligence, which is typically trained and run on GPUs, continues to surge, GPUs have grown scarce. Nvidia’s best-performing GPUs are reportedly sold out until 2024, and the chief executive of chipmaker TSMC recently offered a less bullish outlook, suggesting the shortage of GPUs from Nvidia and its rivals could stretch into 2025.
To reduce their dependence on GPUs, companies with the financial means, chiefly the tech giants, are developing custom chips tailored to creating, iterating on, and productizing AI models, and in some cases offering those chips to customers. Amazon is one of them: at its annual re:Invent conference today, the company presented the latest generation of its chips for model training and inferencing (that is, running trained models).
The first of the two, AWS Trainium2, is designed to deliver up to four times better performance and twice the energy efficiency of the first-generation Trainium, introduced in December 2020. Trainium2 will be available in EC2 Trn2 instances in the AWS cloud in clusters of 16 chips, and AWS’s EC2 UltraCluster product can scale them up to 100,000 chips.
According to Amazon, a cluster of 100,000 Trainium2 chips can deliver 65 exaflops of compute, which works out to 650 teraflops per chip. (Exaflops and teraflops measure the number of compute operations a chip can perform in a single second.) There are likely complicating factors that make that back-of-the-napkin math not entirely precise. But assuming a single Trainium2 chip can deliver around 200 teraflops of performance, that would put it well above the capacity of Google’s custom AI training chips circa 2017.
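The per-chip figure follows from a straightforward unit conversion. A quick sketch of the napkin math, using only the numbers from Amazon's claim (and ignoring the real-world complicating factors noted above):

```python
# Back-of-the-napkin check of Amazon's cluster claim:
# 65 exaflops spread across 100,000 chips -> per-chip throughput.
TERAFLOPS_PER_EXAFLOP = 1_000_000  # 1 exaflop = 10^18 FLOPS; 1 teraflop = 10^12

cluster_exaflops = 65
chip_count = 100_000

per_chip_teraflops = cluster_exaflops * TERAFLOPS_PER_EXAFLOP / chip_count
print(per_chip_teraflops)  # 650.0
```

This assumes linear scaling across the cluster, which is exactly the simplification that makes the figure a rough estimate rather than a measured spec.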
Amazon also says a cluster of 100,000 Trainium2 chips can train a 300-billion-parameter AI large language model in weeks rather than months. (Parameters are the parts of a model learned from training data, and they essentially define the model’s skill at a problem, such as generating text or code.) That is roughly 1.7 times the size of OpenAI’s GPT-3, the predecessor of the text-generating GPT-4.
“Silicon underpins every customer workload, making it a critical area of innovation for an organization like AWS,” David Brown, vice president of compute and networking at Amazon Web Services, said in a press release. With interest in generative AI surging, he added, Trainium2 will help customers train their machine learning models faster, at lower cost, and with better energy efficiency.
As for when Trainium2 instances will be available to AWS customers, Amazon has said only “sometime next year.” We will keep an eye out for further details.
The second chip Amazon announced this morning, the Arm-based Graviton4, is designed for inferencing. It is distinct from Amazon’s other inferencing chip, Inferentia, and is the fourth generation in the Graviton chip family (as the “4” suffix indicates).
According to Amazon, Graviton4 delivers up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than the previous-generation Graviton3 (though not the more recent Graviton3E) running on Amazon Elastic Compute Cloud (EC2). In another upgrade over Graviton3, which launched in April, all of Graviton4’s physical hardware interfaces are “encrypted,” which Amazon says better secures AI training workloads and data for customers with heightened encryption needs. (We have asked Amazon what “encrypted” means here precisely, and we will update this article once we hear back.)
“Graviton4 is the most powerful and energy-efficient chip we have ever built for a wide range of workloads,” Brown said in a statement. It is the fourth Graviton generation the company has produced in just five years, he added, and by focusing its chip designs on real workloads that matter to customers, AWS can offer them the most advanced cloud infrastructure.
Graviton4 will be available in Amazon EC2 R8g instances, which are in preview today, with general availability planned for the coming months.