News
On November 28th, Amazon unveiled two AWS-designed chips: Graviton4, a CPU powering its AWS cloud services, and Trainium2, a second-generation AI chip tailored for training large language models. Both chips boast substantial performance upgrades. With a positive market outlook, Amazon is intensifying its competition with Microsoft and Google for dominance in the AI cloud market. Demand for in-house chips is surging, driving increased orders for key suppliers such as wafer foundry TSMC and silicon design and production services company Alchip, as reported by UDN News.
According to reports, AWS CEO Adam Selipsky presented Graviton4, the fourth generation of AWS's custom CPU, at AWS re:Invent 2023 in Las Vegas. It claims a 30% improvement in computing performance over the current Graviton3, along with a 75% increase in memory bandwidth. Instances powered by the processor are slated to go live in the coming months.
Trainium2, the second-generation chip for AI training, boasts a computing speed three times that of its predecessor and double the energy efficiency. Selipsky announced that AWS will begin offering the new training chip next year.
AWS is accelerating the development of chips, maintaining its lead over Microsoft Azure and Google Cloud platforms. Amazon reports that over 50,000 AWS customers are currently utilizing Graviton chips.
Notably, Amazon's in-house chip development relies heavily on Taiwan's supply chain, chiefly TSMC and Alchip. Alchip primarily provides application-specific integrated circuit (ASIC) design services for Amazon's chips, while TSMC manufactures them on advanced process nodes.
TSMC consistently refrains from commenting on individual customers' products. Analysts estimate that TSMC has recently secured numerous indirect orders from cloud service providers (CSPs), mainly through ASIC design service providers assisting CSP giants in launching new in-house AI chips. This is expected to contribute significantly to high utilization of TSMC's 5nm process family.
In recent years, TSMC has introduced successive technologies such as N4, N4P, N4X, and N5A to strengthen its 5nm family. The N4P, showcased at TSMC's 2023 Technology Symposium, is projected to drive increased demand from 2024 onwards, mainly from AI, networking, and automotive products.
(Image: Amazon)
Insights
Microsoft announced its in-house AI chip, Azure Maia 100, at the Ignite developer conference in Seattle on November 15, 2023. The chip is designed to handle OpenAI models, Bing, GitHub Copilot, ChatGPT, and other AI services. Support for Copilot and Azure OpenAI is expected to commence in early 2024.
TrendForce’s Insights:
Microsoft has not disclosed detailed specifications for Azure Maia 100. Currently, it is known that the chip will be manufactured using TSMC’s 5nm process, featuring 105 billion transistors and supporting at least INT8 and INT4 precision formats. While Microsoft has indicated that the chip will be used for both training and inference, the computational formats it supports suggest a focus on inference applications.
This inference emphasis is suggested by the chip's support for INT4, a low-precision computational format uncommon among other CSPs' AI ASICs. Lower precision reduces power consumption and shortens inference times, enhancing efficiency; the drawback is a sacrifice in accuracy.
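The efficiency-versus-accuracy trade-off between INT8 and INT4 can be illustrated with a generic sketch of symmetric linear quantization. This is not Maia 100's actual scheme (Microsoft has not disclosed it); it is only a minimal illustration of why fewer bits mean larger rounding error:

```python
import numpy as np

def quantize(x, bits):
    """Symmetric linear quantization: map floats onto a signed integer
    grid, then map back, so we can measure the rounding error."""
    qmax = 2 ** (bits - 1) - 1              # 127 for INT8, 7 for INT4
    scale = np.max(np.abs(x)) / qmax        # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                        # dequantized approximation

rng = np.random.default_rng(0)
weights = rng.normal(size=10_000).astype(np.float32)

for bits in (8, 4):
    err = np.mean(np.abs(weights - quantize(weights, bits)))
    print(f"INT{bits}: mean abs quantization error = {err:.4f}")
```

Running this shows the INT4 error is roughly an order of magnitude larger than the INT8 error, which is the accuracy cost a chip accepts in exchange for smaller, faster, lower-power arithmetic units.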
Microsoft initiated its in-house AI chip project, "Athena," in 2019, developing it in collaboration with OpenAI. Like other CSPs' in-house chips, Azure Maia 100 aims to reduce costs and decrease dependence on NVIDIA. Although Microsoft entered the proprietary AI chip field later than its primary competitors, its formidable ecosystem is expected to gradually yield a competitive advantage.
Google led the way with its first in-house AI chip, TPU v1, introduced as early as 2016, and has since iterated to the fifth generation with TPU v5e. Amazon followed suit in 2018 with Inferentia for inference, introduced Trainium for training in 2020, and launched the second generation, Inferentia2, in 2023, with Trainium2 expected in 2024.
Meta plans to debut its inaugural in-house AI chip, MTIA v1, in 2025. Given the releases from major competitors, Meta has expedited its timeline and is set to unveil the second-generation in-house AI chip, MTIA v2, in 2026.
Unlike the other CSPs, which opt for the ARM architecture, Meta's MTIA v1 and MTIA v2 adopt RISC-V. RISC-V is a fully open-source architecture that requires no instruction-set licensing fees, and its instruction count (approximately 200) is far lower than ARM's (approximately 1,000).
This choice allows chips utilizing the RISC-V architecture to achieve lower power consumption. However, the RISC-V ecosystem is currently less mature, resulting in fewer manufacturers adopting it. Nevertheless, with the growing trend in data centers towards energy efficiency, it is anticipated that more companies will start incorporating RISC-V architecture into their in-house AI chips in the future.
The competition among AI chips will ultimately hinge on ecosystems. NVIDIA introduced the CUDA architecture in 2006, and it is now nearly ubiquitous in educational institutions; almost all AI engineers encounter CUDA during their studies.
NVIDIA further solidified its ecosystem with the NGC GPU cloud service platform in 2017 and the RAPIDS AI acceleration libraries in 2018. Notably, over 70% of NVIDIA's workforce comprises software engineers, underscoring its character as a software company: the performance of NVIDIA's AI chips can be further enhanced through software innovation.
By contrast, Microsoft possesses a robust ecosystem of its own in Windows. The recent Intel Arc A770 GPU showed a 1.7x performance improvement in AI-driven Stable Diffusion when optimized with Microsoft Olive, demonstrating that, like NVIDIA, Microsoft can enhance GPU performance through software.
Consequently, Microsoft's in-house AI chips are poised to achieve better software-hardware synergy than those of other CSPs, giving Microsoft a competitive advantage in the AI race.
Read more
News
According to a report by Taiwan's Economic Daily, TSMC's CoWoS advanced packaging capacity is running at full throttle. As the company actively expands this capacity, major customers such as NVIDIA are reportedly increasing their AI chip orders, while industry giants including AMD and Amazon have rushed in with urgent orders.
In response, TSMC is actively approaching equipment suppliers to expand its CoWoS tool procurement. Beyond its existing expansion goals, the company is increasing equipment orders by a further 30%, highlighting the ongoing fervor in the AI market.
It is reported that TSMC has sought assistance from equipment manufacturers such as Scientech, Allring, Grand Process Technology, E&R Engineering, and GP Group for this endeavor. They plan to complete the delivery and installation of the equipment by the first half of the coming year. The related equipment manufacturers are experiencing a surge in activity.
Industry sources reveal that TSMC’s CoWoS advanced packaging monthly production capacity is currently around 12,000 units. With their previous expansion efforts, they aimed to gradually increase this to 15,000 to 20,000 units per month. Now, with the addition of more equipment, they are looking at the possibility of reaching capacities of over 25,000 units per month, potentially even approaching 30,000 units. This substantial increase in production capacity positions TSMC to handle a significantly larger volume of AI-related orders.
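A back-of-envelope calculation, using only the capacity figures cited above, shows how large the jump is in relative terms:

```python
# Capacity figures cited in the article (units per month).
current = 12_000
prior_target = (15_000, 20_000)   # earlier expansion goal
new_target = (25_000, 30_000)     # goal with the additional equipment

for (lo, hi), label in ((prior_target, "prior expansion"),
                        (new_target, "with extra equipment")):
    print(f"{label}: +{lo / current - 1:.0%} to +{hi / current - 1:.0%} over current")
```

The earlier goal of 15,000 to 20,000 units represents a 25-67% increase over today's capacity, while 25,000 to 30,000 units would more than double it (+108% to +150%).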
Equipment providers have pointed out that NVIDIA is currently TSMC’s largest customer for CoWoS advanced packaging, accounting for 60% of the production capacity. Recently, in response to robust demand in AI computing, NVIDIA has increased its orders. Additionally, urgent orders from other customers such as AMD, Amazon, and Broadcom have started to pour in.
(Photo credit: TSMC)
Press Releases
Thanks to their flexible pricing schemes and diverse service offerings, CSPs have been a direct, major driver of enterprise demand for cloud services, according to TrendForce's latest investigations. The rise of CSPs has in turn gradually shifted the prevailing business model of server supply chains from sales of traditional branded servers (that is, server OEMs) to ODM Direct sales.
Notably, the global public cloud market operates as an oligopoly dominated by North American companies, namely Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), which collectively hold an above-50% share of this market. GCP and AWS are the most aggressive in their data center build-outs: each is expected to increase its server procurement by 25-30% YoY this year, followed closely by Azure.
TrendForce indicates that, in order to expand the presence of their respective ecosystems in the cloud services market, the aforementioned three CSPs have begun collaborating with various countries' domestic CSPs and telecom operators in compliance with data residency and data sovereignty regulations. For instance, thanks to accelerating digital transformation efforts in the APAC region, Google is ramping up its supply chain strategies for 2021.
As part of Google’s efforts at building out and refreshing its data centers, not only is the company stocking up on more weeks’ worth of memory products, but it has also been increasing its server orders since 4Q20, in turn leading its ODM partners to expand their SMT capacities. As for AWS, the company has benefitted from activities driven by the post-pandemic new normal, including WFH and enterprise cloud migrations, both of which are major sources of data consumption for AWS’ public cloud.
Conversely, Microsoft Azure will adopt a relatively more cautious and conservative approach to server procurement, likely because the Ice Lake-based server platforms used to power Azure services have yet to enter mass production. In other words, only after these Ice Lake servers enter mass production will Microsoft likely ramp up its server procurement in 2H21, during which TrendForce expects Microsoft’s peak server demand to take place, resulting in a 10-15% YoY growth in server procurement for the entirety of 2021.
Finally, compared to its three competitors, Facebook will experience relatively more stable growth in server procurement, owing to two factors. First, the implementation of the GDPR in the EU and its data sovereignty implications mean that data gathered on EU residents is now subject to each country's legal regulations, so more servers are required to meet the domestic data processing and storage needs arising from the GDPR. Second, most servers used by Facebook are custom spec'ed to the company's requirements, so Facebook's server needs are accordingly higher than its competitors'. As such, TrendForce forecasts double-digit YoY growth in Facebook's server procurement this year.
Chinese CSPs are limited in their pace of expansions, while Tencent stands out with a 10% YoY increase in server demand
On the other hand, Chinese CSPs are expected to show relatively weak server demand this year due to their limited pace of expansion and service areas. Case in point: Alibaba Cloud is currently planning to procure the same volume of servers as it did last year, and will ramp up procurement only after the Chinese government implements its new infrastructure policies. Tencent, the other dominant Chinese CSP, will benefit from increased commercial activity on domestic online service platforms, including JD, Meituan, and Kuaishou, and will therefore see corresponding growth in its server colocation business.
Tencent's demand for servers this year is expected to increase by about 10% YoY. Baidu will primarily focus on autonomous driving projects this year; its server procurement for 2021 will see a slight YoY increase, mostly thanks to increased demand for roadside servers used in autonomous driving applications. Finally, ByteDance's server procurement will undergo a 10-15% YoY decrease, as it will adopt colocation services rather than run its own servers in overseas markets, owing to its shrinking presence there.
Looking ahead, TrendForce believes that as enterprise clients become more familiar with various cloud services and related technologies, competition in the cloud market will no longer be confined to the traditional segments of computing, storage, and networking infrastructure. The major CSPs will pay greater attention to emerging fields such as edge computing, as well as software-hardware integration for related services.
With the commercialization of 5G services that is taking place worldwide, the concept of “cloud, edge, and device” will replace the current “cloud” framework. This means that cloud services will not be limited to software in the future because cloud service providers may also want to offer their branded hardware in order to make their solutions more comprehensive or all-encompassing. Hence, TrendForce expects hardware to be the next battleground for CSPs.
For more information on reports and market data from TrendForce’s Department of Semiconductor Research, please click here, or email Ms. Latte Chung from the Sales Department at lattechung@trendforce.com