Insights
The US Department of Commerce issued new restrictions on AI chips on October 17, 2023, with a focus on controlling the export of chips to China, including NVIDIA’s A800, H800, L40S, and RTX 4090, among others. Taiwanese manufacturers primarily serve cloud service providers and brand owners in North America, with relatively few shipments going to Chinese server customers. However, Chinese manufacturers, having already faced two rounds of US chip restrictions, recognize the significance of AI chips in server applications and are expected to accelerate their in-house chip development.
TrendForce’s Insights:
1. Limited Impact on Taiwanese Manufacturers in Shipping AI Servers with H100 GPUs
Major Taiwanese server manufacturing companies, including Foxconn, Quanta, Inventec, GIGABYTE, and Wiwynn, provide AI servers equipped with H100 GPUs to cloud data centers and brand owners in Europe and the United States. These Taiwanese companies have established some AI server factories outside China, in countries such as the US, the Czech Republic, Mexico, Malaysia, and Thailand, focusing on producing L10 server units and L11 cabinets in proximity to end-users. This strategy aligns with the strategic needs of US cloud providers and brand owners for global server product deployment.
On the other hand, Taiwanese manufacturers including MiTAC, Wistron, and Inventec also provide server assembly services for Chinese brands such as Inspur and Lenovo. Although MiTAC has a significant share in assembling Inspur’s servers, it acquired Intel’s DSG (Data Center Solutions Group) business in July 2023. As a result, its AI server focus remains on brand customers using H100 GPUs, including Twitter, Dell, AWS, and European cloud service provider OVH. It is speculated that the production ratio of brand servers will be adjusted before the new restrictions are enforced.
Wistron is a major supplier for NVIDIA’s AI server modules, DGX A100, and HGX H100. Its primary shipments are to end-users in Europe and the United States. It is expected that there will be adjustments in the proportion of shipments to Chinese servers following the implementation of the restrictions.
Compal has fewer AI server orders compared to other Taiwanese manufacturers. It has not yet manifested any noticeable changes in Lenovo server assembly proportions. The full extent of the impact will only become more apparent after the enforcement of the ban.
During the transitional period before the US chip ban takes effect, the server supply chain can still adjust shipments based on local chip demand in China to cushion the market impact of the subsequent chip controls.
2. Chinese Manufacturers Focusing on Accelerating In-House Chip Development
Chinese cloud companies had already started developing their AI chips before the first U.S. chip restrictions in 2022. This included self-developed AI chips like Alibaba Cloud’s T-HEAD, a data center AI chip, and they expanded investments in areas such as DRAM, AI chips, and semiconductors with the aim of establishing a comprehensive IoT system from chips to the cloud.
Baidu Cloud, on the other hand, accelerated the development of its third-generation self-developed Kunlun chip, designed for cloud and edge computing, with plans for an early 2024 release.
Tencent introduced three self-developed chips in 2021, including an AI inference chip called Zixiao, used for Tencent’s meeting business; a video transcoding chip called Canghai, used in cloud gaming and live streaming applications; and a smart network card chip named Xuanling, applied in network storage and computing.
ByteDance made investments in cloud AI chips through its MooreThread initiative in 2022 for applications in AI servers. Huawei released the Ascend 900 chip in 2019 and is expected to introduce the Ascend 930B AI chip in the latter half of 2024. While this chip has the same computational power as the NVIDIA A100 chip, its performance still requires product validation, and it is speculated that it may not replace the current use of NVIDIA GPUs in Chinese AI servers.
Despite the acceleration of self-developed chip development among Chinese cloud server manufacturers, the high technological threshold, lengthy development cycles, and high costs associated with GPU development often delay the introduction of new server products. Therefore, Chinese cloud companies and brand manufacturers continue to purchase NVIDIA GPUs for the production of mid to high-end servers to align with their economic scale and production efficiency.
In response to the new U.S. restrictions, Chinese cloud companies have adopted short-term measures such as increasing imports of existing NVIDIA chips and building up stockpiles before the new restrictions are enforced. They are also focusing on medium to long-term strategies, including accelerating resource integration and shortening development timelines to expedite GPU chip manufacturing, thus reducing their exposure to U.S. restrictions.
News
According to China Times, in response to increased visibility in AI server orders and optimistic future demand, two Taiwan-based ODM-direct manufacturers, Wiwynn and Quanta, are accelerating the expansion of their server production lines in non-Chinese regions, and both have recently reported progress. Wiwynn has completed the first phase of its own new factory in Malaysia, dedicated to L10 production. Quanta, for its part, has further expanded its L10 production line in California. Both are gearing up for future AI server orders.
Wiwynn’s new server assembly factory, located in Senai Airport City in Johor, Malaysia, was officially inaugurated on the 12th and will provide full cabinet assembly services for large-scale data centers. Additionally, the second phase, a front-end server motherboard production line, is expected to be completed and operational next year, allowing Wiwynn to offer high-end AI servers and advanced cooling technology to cloud service providers and customers in the SEA region.
While Wiwynn has experienced some slowdown in shipments and revenue due to its customers adjusting to inventory and CAPEX impacts in recent quarters, Wiwynn still chooses to continue its overseas factory expansion efforts. Notably, with the addition of the new factory in Malaysia, Wiwynn’s vision of establishing a one-stop manufacturing, service, and engineering center in the APAC region is becoming a reality.
Especially as we enter Q4, the shipment of AI servers based on NVIDIA’s AI-GPU architecture is expected to boost Wiwynn’s revenue. The market predicts that after a strong fourth quarter, this momentum will carry forward into the next year.
How significant is the demand for AI servers?
TrendForce projects a dramatic surge in AI server shipments for 2023, with an estimated 1.2 million units (outfitted with GPUs, FPGAs, and ASICs) destined for markets around the world, marking robust YoY growth of 38.4%. This increase reflects the mounting demand for AI servers and chips, with AI servers poised to constitute nearly 9% of total server shipments, a figure projected to increase to 15% by 2026. TrendForce has revised its CAGR forecast for AI server shipments between 2022 and 2026 upward to 29%.
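To put these percentages in unit terms, the short sketch below back-calculates what the quoted figures imply. It assumes the 29% CAGR compounds over four annual periods (2022 to 2026) from a 2022 base derived from the 1.2 million units and 38.4% YoY growth. The derived values are illustrative arithmetic only, not figures published by TrendForce, and the variable names are chosen here for the example.

```python
# Figures quoted in the text (TrendForce projections).
shipments_2023 = 1_200_000   # estimated AI server units shipped in 2023
yoy_growth_2023 = 0.384      # 38.4% YoY growth for 2023
cagr_2022_2026 = 0.29        # 29% CAGR between 2022 and 2026

# Derived values (assumption: four compounding periods from 2022 to 2026).
implied_2022 = shipments_2023 / (1 + yoy_growth_2023)
implied_2026 = implied_2022 * (1 + cagr_2022_2026) ** 4

print(f"Implied 2022 shipments: {implied_2022:,.0f} units")
print(f"Implied 2026 shipments: {implied_2026:,.0f} units")
```

Under these assumptions, the 2022 base works out to roughly 867,000 units and the 2026 figure to roughly 2.4 million units, about double the 2023 estimate.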
Quanta has also been rapidly expanding its production capacity in North America and Southeast Asia in recent years. This year, in addition to establishing new facilities in Vietnam, they have recently expanded their production capacity at their California-based Fremont plant.
The Fremont plant in California has been Quanta’s primary location for the L10 production line in the United States. In recent years, it has expanded several times. With the increasing demand for data center construction by Tier 1 CSP, Quanta’s Tennessee plant has also received multiple investments to prepare for operational needs and capacity expansion.
In August of this year, Quanta injected US$135 million into its California subsidiary, which then leased a site of nearly 4,500 square meters in the Bay Area. Recently, Quanta announced a US$79.6 million contract awarded to McLarney Construction, Inc. for three construction projects at its new factory locations.
It is expected that Quanta’s new production capacity will gradually come online, with the earliest capacity expected in 2H24, and full-scale production scheduled for 1H25. With the release of new high-end AI servers featuring the H100 architecture, Quanta has been shipping these products since August and September, contributing to its revenue growth. They aim to achieve a 20% YoY increase in server sales for 2023, with the potential for further significant growth in 2024.
News
According to a report by Taiwanese media TechNews, industry sources have indicated that Microsoft has recently reduced its orders for Nvidia’s H100 graphics cards. This move suggests that the demand for H100 graphics cards in the large-scale artificial intelligence computing market has tapered off, and the frenzy of orders from previous customers is no longer as prominent.
In this wave of artificial intelligence trends, the major purchasers of related AI servers come from large-scale cloud computing service providers. Regarding Microsoft’s reported reduction in orders for Nvidia’s H100 graphics cards, market experts point to a key factor being the usage of Microsoft’s AI collaboration tool, Microsoft 365 Copilot, which did not perform as expected.
Another critical factor affecting Microsoft’s decision to reduce orders for Nvidia’s H100 graphics cards is the usage statistics of ChatGPT. Since its launch in November 2022, this generative AI application has experienced explosive growth in usage and has been a pioneer in the current artificial intelligence trend. However, ChatGPT experienced a usage decline for the first time in June 2023.
Industry insiders have noted that the reduction in Microsoft’s H100 graphics card orders was predictable. In May, both server manufacturers and direct customers stated that they would have to wait for over six months to receive Nvidia’s H100 graphics cards. However, in August, Tesla announced the deployment of a cluster of ten thousand H100 graphics cards, meaning that even those who placed orders later were able to receive sufficient chips within a few months. This indicates that the demand for H100 graphics cards, including from customers like Microsoft, has already been met, signifying that the fervent demand observed several months ago has waned.
News
According to a report by Taiwan’s Commercial Times, NVIDIA is facing repercussions from US chip restrictions, which now extend export controls on high-end AI GPU chips to certain countries in the Middle East. NVIDIA claims that these controls won’t have an immediate impact on its performance, and industry insiders in the Taiwanese supply chain believe the initial effects are minimal. However, judging from the earlier ban on exports to China, the controls could trigger another wave of preemptive stockpiling.
Industry sources from the supply chain note that following the US restrictions on exporting chips to China last year, the purchasing power of Chinese clients increased rather than decreased, resulting in a surge in demand for second-tier and lower chip products and setting off a wave of stockpiling.
Take NVIDIA’s previous generation A100 chip for instance. After the US implemented export restrictions on China, NVIDIA replaced it with the lower-tier A800 chip, which quickly became a sought-after product in the Chinese market, driving prices to surge. It’s reported that the A800 has seen a cumulative price increase of 60% from the start of the year to late August, and it remains one of the primary products ordered by major Chinese CSPs.
Furthermore, the L40S GPU server that NVIDIA launched in August has become a market focal point. While it may not match the performance of systems like HGX H100/A100 in large-scale AI algorithm training, it outperforms the A100 in AI inference and small-scale AI algorithm training. As the L40S GPU is positioned in the mid-to-low range, it is currently not included in the list of chips subject to export controls to China.
Supply chain insiders suggest that even if the control measures on exporting AI chips to the Middle East are further enforced, local clients are likely to turn to alternatives like the A800 and L40S. However, with uncertainty about whether the US will extend the scope of controlled chip categories, this could potentially trigger another wave of purchasing and stockpiling.
The primary direct beneficiaries in this scenario are still the chip manufacturers. Within the Taiwanese supply chain, Wistron, which supplies chip brands in the AI server front-end GPU board sector, stands to gain. Taiwanese supply chain companies producing A800 series AI servers and the upcoming L40S GPU servers, such as Quanta, Inventec, Gigabyte, and ASUS, have the opportunity to benefit as well.
News
According to Chinatimes, Asus, a prominent technology company, announced on the 30th of this month the release of AI servers equipped with NVIDIA’s L40S GPUs, which are now available for order. NVIDIA introduced the L40S GPU in August to address the shortage of H100 and A100 GPUs. Asus responded by unveiling AI server products in less than two weeks, showcasing its optimism about the imminent surge in AI applications and its eagerness to seize the opportunity.
Solid AI Capabilities of Asus Group
Apart from being among the first manufacturers to introduce the NVIDIA OVX server system, Asus has leveraged resources from its subsidiaries, such as TaiSmart and Asus Cloud, to establish a formidable AI infrastructure. This not only involves in-house innovation like the Large Language Model (LLM) technology but also extends to providing AI computing power and enterprise-level generative AI applications. These strengths position Asus as one of the few all-encompassing providers of generative AI solutions.
Projected Surge in Server Business
Regarding server business performance, Asus envisions a yearly compounded growth rate of at least 40% until 2027, with a goal of achieving a fivefold growth over five years. In particular, the data center server business catering primarily to Cloud Service Providers (CSPs) anticipates a tenfold growth within the same timeframe, driven by the adoption of AI server products.
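As a quick consistency check on these targets, the sketch below converts between a compound annual growth rate and a cumulative growth multiple. It assumes five annual compounding periods for both targets. The helper names `growth_multiple` and `required_cagr` are hypothetical, introduced only for this illustration, and the results are arithmetic on the quoted targets rather than figures from Asus.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Cumulative growth multiple after compounding `cagr` for `years` periods."""
    return (1 + cagr) ** years


def required_cagr(multiple: float, years: int) -> float:
    """Annual growth rate needed to reach `multiple` over `years` periods."""
    return multiple ** (1 / years) - 1


# Targets quoted in the text; five compounding periods is an assumption.
five_year_multiple = growth_multiple(0.40, 5)   # 40% CAGR sustained for five years
tenfold_cagr = required_cagr(10, 5)             # rate implied by tenfold growth

print(f"40% CAGR over 5 years -> {five_year_multiple:.2f}x")
print(f"10x over 5 years -> {tenfold_cagr:.1%} per year")
```

Under that assumption, a sustained 40% rate yields a multiple of about 5.4, which is consistent with the fivefold goal, while tenfold growth in the data center server business would require roughly 58% annual growth.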
Asus’s CEO recently emphasized that the company moved into AI server development early and collaborated with NVIDIA from the outset. While its product lineup might be more streamlined than those of other OEM/ODM manufacturers, Asus had secured numerous GPU orders ahead of the AI server demand surge. The company is optimistic about the shipping momentum and order visibility for the new generation of AI servers in the latter half of the year.
Embracing NVIDIA’s Versatile L40S GPU
The NVIDIA L40S GPU, built on the Ada Lovelace architecture, stands out as one of the most powerful general-purpose GPUs in data centers. It offers groundbreaking multi-workload computations for large language model inference, training, graphics, and image processing. Not only does it facilitate rapid hardware solution deployment, but it also holds significance due to the current scarcity of higher-tier H100 and A100 GPUs, which have reached allocation stages. Consequently, businesses seeking to repurpose idle data centers are anticipated to shift their focus toward AI servers featuring the L40S GPU.
Asus’s newly introduced L40S GPU servers include the ESC8000-E11/ESC4000-E11 models with built-in Intel Xeon processors, as well as the ESC8000A-E12/ESC4000A-E12 models utilizing AMD EPYC processors. These servers can be configured with up to four or up to eight NVIDIA L40S GPUs. This configuration assists enterprises in enhancing training, fine-tuning, and inference workloads, facilitating AI model creation. It also establishes Asus’s platforms as the preferred choice for multi-modal generative AI applications.