Monday, December 23, 2024

Supermicro Expands AI and GPU Rack Scale Solutions with Support for AMD Instinct MI300 Series Accelerators

New 8-GPU systems with AMD Instinct™ MI300X Accelerators are now available with breakthrough AI and HPC performance for large-scale AI training and LLM deployments

Supermicro, Inc., a provider of Total IT solutions for AI, Cloud, Storage and 5G/Edge, announces three additions to its AMD-based H13 generation of GPU servers, optimized to deliver industry-leading performance and energy efficiency and powered by accelerators from the new AMD Instinct MI300 series. Supermicro’s high-performance, scalable rack-scale solutions built on 8-GPU servers in the AMD Instinct MI300X OAM configuration are ideal for large model training.

The new liquid-cooled 2U and air-cooled 4U servers with AMD Instinct MI300A Accelerated Processing Units (APUs) are available now. They improve data center efficiency and meet the rapidly growing, complex demands of AI, LLM and HPC workloads. The new systems include quad APUs for scalable applications. Supermicro can provide complete liquid-cooled racks for large-scale environments with up to 1,728 TFlops of FP64 performance per rack. Supermicro’s global manufacturing facilities streamline the delivery of these new servers for AI and HPC convergence.

“We are very excited to expand our Rack Scale Total IT Solutions for AI training with the latest generation of AMD Instinct accelerators, delivering up to 3.4X performance improvement compared to previous generations,” said Charles Liang, president and CEO of Supermicro. “With our ability to supply 4,000 liquid-cooled racks per month from our global manufacturing facilities, we can deliver the latest H13 GPU solutions with either the AMD Instinct MI300X accelerator or the AMD Instinct MI300A APU. Our proven architecture enables 1:1 400G networking for each GPU, is designed for large-scale AI and supercomputing clusters, and supports fully integrated liquid cooling solutions. This gives customers a competitive advantage in performance and superior efficiency with ease of implementation.”


The LLM-optimized AS-8125GS-TNMR2 system is built on Supermicro’s architecture, a proven design for high-performance AI systems with air- and liquid-cooled scalable rack-mount designs. The balanced system design pairs each GPU with a 1:1 network connection to provide a large pool of high-bandwidth memory across nodes and racks for today’s largest language models with up to trillions of parameters, maximizing parallel computing while keeping training time and latency to a minimum. The 8U system with the MI300X OAM accelerator delivers the raw acceleration power of 8 GPUs connected by AMD Infinity Fabric™ Links, providing up to 896 GB/s of theoretical peak P2P I/O bandwidth on an open standard platform with an industry-leading 1.5 TB of HBM3 GPU memory in one system, in addition to native sparse matrix support, designed to save energy and reduce compute cycles and memory usage for AI calculations. Each server features dual-socket AMD EPYC™ 9004 series processors with up to 256 cores. At rack scale, more than 1,000 CPU cores, 24 TB of DDR5 memory, 6.144 TB of HBM3 memory, and 9,728 compute units are available for the most demanding AI environments. Using the OCP Accelerator Module (OAM), with which Supermicro has extensive experience in 8U configurations, brings a fully configured server to market faster than a custom design, reducing costs and delivery time.
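The rack-scale figures for the 8-GPU system can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the four-systems-per-rack count, the 192 GB of HBM3 per MI300X (1.5 TB spread over eight GPUs) and the 304 compute units per MI300X are assumptions not stated in the announcement, chosen because they are consistent with the totals it quotes.

```python
# Rough check of the rack-scale totals for the 8-GPU MI300X system.
# Values marked "assumption" do not appear in the announcement.

SYSTEMS_PER_RACK = 4     # assumption: inferred from the quoted totals
GPUS_PER_SYSTEM = 8      # from the announcement
CORES_PER_SYSTEM = 256   # from the announcement (dual-socket, up to 256 cores)
HBM3_GB_PER_GPU = 192    # assumption: 1.5 TB per system divided over 8 GPUs
CUS_PER_GPU = 304        # assumption: compute units per MI300X


def mi300x_rack_totals(systems=SYSTEMS_PER_RACK):
    """Return aggregate CPU cores, GPUs, HBM3 (GB) and compute units per rack."""
    gpus = systems * GPUS_PER_SYSTEM
    return {
        "cpu_cores": systems * CORES_PER_SYSTEM,
        "gpus": gpus,
        "hbm3_gb": gpus * HBM3_GB_PER_GPU,
        "compute_units": gpus * CUS_PER_GPU,
    }


if __name__ == "__main__":
    for name, value in mi300x_rack_totals().items():
        print(f"{name}: {value:,}")
```

Under these assumptions a rack holds 1,024 CPU cores, 6,144 GB of HBM3 (roughly 6.1 TB) and 9,728 compute units.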

Supermicro is also introducing a density-optimized 2U liquid-cooled server, the AS-2145GH-TNMR, and a 4U air-cooled server, the AS-4145GH-TNMR, each with 4 AMD Instinct™ MI300A accelerators. The new servers are designed for HPC and AI applications that require extremely fast communication between CPU and GPU. The APU eliminates redundant memory copies by combining the highest-performing AMD CPU, GPU and HBM3 memory on a single chip. Each server includes industry-leading x86 “Zen4” CPU cores for application scaling. In addition, each server includes 512 GB of HBM3 memory. In a full rack solution (48U) consisting of 21 2U systems, more than 10 TB of HBM3 memory is available, as well as 19,152 compute units. The memory bandwidth from HBM3 to CPU is 5.3 TB/second.
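The full-rack totals for the MI300A systems follow from the per-server figures. The sketch below reproduces them; the value of 228 compute units per MI300A APU is an assumption not given in the announcement, and the function name is purely illustrative.

```python
# Rough check of the full-rack totals for the 2U MI300A system.

SYSTEMS_PER_RACK = 21     # from the announcement (21 2U systems in a 48U rack)
APUS_PER_SYSTEM = 4       # from the announcement
HBM3_GB_PER_SYSTEM = 512  # from the announcement
CUS_PER_APU = 228         # assumption: compute units per MI300A APU


def mi300a_rack_totals(systems=SYSTEMS_PER_RACK):
    """Return aggregate APUs, HBM3 (GB) and compute units for one rack."""
    apus = systems * APUS_PER_SYSTEM
    return {
        "apus": apus,
        "hbm3_gb": systems * HBM3_GB_PER_SYSTEM,
        "compute_units": apus * CUS_PER_APU,
    }


if __name__ == "__main__":
    for name, value in mi300a_rack_totals().items():
        print(f"{name}: {value:,}")
```

With these inputs a rack holds 84 APUs, 10,752 GB of HBM3 (a little over 10 TB) and 19,152 compute units.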

Both systems feature dual AIOMs with 400G Ethernet support and extensive networking options designed to improve space, scalability and efficiency for high-performance computing. The 2U direct-to-chip liquid-cooled system delivers excellent TCO with energy savings of over 35%, based on a rack of 21 2U systems at 61,780 watts per rack versus more than 95,256 watts for an air-cooled rack, and a 70% reduction in the number of fans compared to an air-cooled system.
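The energy-savings claim can be reproduced from the two rack power figures. This is a minimal sketch that treats both wattages as total rack power, which is an interpretation of the announcement rather than something it states explicitly.

```python
# Fractional power saving of the liquid-cooled rack versus the air-cooled rack.

LIQUID_COOLED_RACK_W = 61_780  # from the announcement
AIR_COOLED_RACK_W = 95_256     # from the announcement


def power_saving_fraction(liquid_w, air_w):
    """Return the fraction of power saved relative to the air-cooled rack."""
    return 1.0 - liquid_w / air_w


if __name__ == "__main__":
    saving = power_saving_fraction(LIQUID_COOLED_RACK_W, AIR_COOLED_RACK_W)
    print(f"Power saving: {saving:.1%}")
```

The result is slightly above 35%, in line with the "over 35%" figure.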

“AMD Instinct MI300 Series accelerators deliver industry-leading performance for both long-standing accelerated computing applications and the rapidly growing demand for generative AI,” said Forrest Norrod, executive vice president and general manager of Data Center Solutions Business Group, AMD. “We continue to partner closely with Supermicro to bring to market leading AI and HPC end-to-end solutions based on MI300 Series accelerators and leveraging Supermicro’s expertise in system and data center design.”

SOURCE: PRNewswire
