The AMD Instinct MI300X and MI300A are among the most anticipated accelerators in the AI segment, and they launch next month. There’s a lot of anticipation surrounding AMD’s first full-fledged AI masterpiece, so today we are giving you a roundup of what to expect from this technical marvel.
AMD Instinct MI300X Is Designed For GPU-Accelerated AI Workloads While MI300A Tackles HPC With The Most Technically Advanced APU Package
On the 6th of December, AMD will host its “Advancing AI” keynote, where one of the main agenda items is a full unveiling of the next-gen Instinct accelerator family codenamed MI300. This new GPU- and CPU-accelerated family will be the lead product of the AI segment, which is AMD’s No. 1 and most important strategic priority right now, as the company finally rolls out a product that is not only advanced but is also designed to meet the critical AI requirements within the industry. The MI300 class of AI accelerators will be another chiplet powerhouse, making use of advanced packaging technologies from TSMC, so let’s see what’s under the hood of these AI monsters.
AMD Instinct MI300X – Challenging NVIDIA’s AI Supremacy With CDNA 3 & Huge Memory
The AMD Instinct MI300X is definitely the chip that will be highlighted the most since it is clearly targeted at NVIDIA’s Hopper and Intel’s Gaudi accelerators within the AI segment. This chip has been designed solely on the CDNA 3 architecture and there is a lot going on. The chip is going to host a mix of 5nm and 6nm IPs, all combining to deliver up to 153 billion transistors (MI300X).
Starting with the design, the main interposer is laid out with a passive die which houses the interconnect layer using a next-gen Infinity Fabric solution. The interposer carries a total of 28 dies: eight HBM3 packages, 16 dummy dies between the HBM packages, and four active dies. Each of these active dies gets two compute dies.
Each GCD based on the CDNA 3 GPU architecture features a total of 40 compute units, which equals 2,560 cores. There are eight compute dies (GCDs) in total, so that gives us a total of 320 compute units and 20,480 cores. For yields, AMD will be scaling back a small portion of these cores, and we will be getting more details on exact configurations a month from now.
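The die and core counts above can be checked with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in the text; the variable names are illustrative, and the 64 cores per compute unit is derived from the article's own numbers (2,560 / 40) rather than stated directly.

```python
# Figures quoted in the article (full, un-harvested configuration).
HBM_PACKAGES = 8
DUMMY_DIES = 16
ACTIVE_DIES = 4
GCDS_PER_ACTIVE_DIE = 2
CUS_PER_GCD = 40
CORES_PER_GCD = 2560

# Derived values (assumption: every GCD is identical and fully enabled).
dies_on_interposer = HBM_PACKAGES + DUMMY_DIES + ACTIVE_DIES
gcd_count = ACTIVE_DIES * GCDS_PER_ACTIVE_DIE
cores_per_cu = CORES_PER_GCD // CUS_PER_GCD
total_cus = gcd_count * CUS_PER_GCD
total_cores = total_cus * cores_per_cu

print(f"dies on interposer: {dies_on_interposer}")
print(f"GCDs: {gcd_count}, cores per CU: {cores_per_cu}")
print(f"compute units: {total_cus}, cores: {total_cores}")
```

Since AMD is expected to disable some units for yield, shipping parts would land somewhat below these full-die totals.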
Memory is another area where you will see a huge upgrade, with the MI300X boasting 50% more HBM capacity than its predecessor, the MI250X (128 GB). To achieve a memory pool of 192 GB, AMD is equipping the MI300X with 8 HBM3 stacks. Each stack is 12-Hi and uses 16 Gb ICs, which gives us 2 GB of capacity per IC or 24 GB per stack.
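The capacity math is simple enough to spell out. The sketch below uses only the numbers from the paragraph above (16 Gb ICs, 12-Hi stacks, 8 stacks, 128 GB on the MI250X); the variable names are illustrative.

```python
# Figures quoted in the article.
GBIT_PER_IC = 16        # density of each DRAM IC, in gigabits
ICS_PER_STACK = 12      # a "12-Hi" stack has 12 stacked ICs
STACKS = 8
MI250X_CAPACITY_GB = 128

gb_per_ic = GBIT_PER_IC // 8                 # 8 bits per byte
gb_per_stack = gb_per_ic * ICS_PER_STACK
total_gb = gb_per_stack * STACKS
uplift_pct = (total_gb - MI250X_CAPACITY_GB) / MI250X_CAPACITY_GB * 100

print(f"{gb_per_ic} GB per IC, {gb_per_stack} GB per stack, {total_gb} GB total")
print(f"uplift over MI250X: {uplift_pct:.0f}%")
```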
The memory will offer up to 5.2 TB/s of bandwidth, alongside 896 GB/s of Infinity Fabric bandwidth. For comparison, NVIDIA’s upcoming H200 AI accelerator offers 141 GB of capacity while Gaudi 3 from Intel will offer 144 GB. Large memory pools matter a lot in LLMs, which are mostly memory bound, and AMD can definitely show its AI prowess by leading in the memory department. Here is how the capacities stack up:
- Instinct MI300X – 192 GB HBM3
- Gaudi 3 – 144 GB HBM3
- H200 – 141 GB HBM3e
- MI300A – 128 GB HBM3
- MI250X – 128 GB HBM2e
- H100 – 96 GB HBM3
- Gaudi 2 – 96 GB HBM2e
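To put the list in perspective, here is a rough sketch that expresses the MI300X's capacity as a multiple of each other part. The capacities are copied from the list above as given; the dictionary and variable names are illustrative.

```python
# Capacities in GB, copied from the comparison list in the article.
capacity_gb = {
    "MI300X": 192,
    "Gaudi 3": 144,
    "H200": 141,
    "MI300A": 128,
    "MI250X": 128,
    "H100": 96,
    "Gaudi 2": 96,
}

mi300x = capacity_gb["MI300X"]

# How many times larger the MI300X pool is than each other accelerator.
advantage = {
    name: mi300x / gb for name, gb in capacity_gb.items() if name != "MI300X"
}

for name, ratio in advantage.items():
    print(f"MI300X vs {name}: {ratio:.2f}x")
```

Capacity is only one axis, of course; memory type and bandwidth differ between these parts as well.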
In terms of power consumption, the AMD Instinct MI300X is rated at 750W, which is a 50% increase over the 500W of the Instinct MI250X and 50W more than the NVIDIA H200.
AMD Instinct MI300A – Densely Packaged Exascale APUs Now A Reality
We have waited for years for AMD to finally deliver on the promise of an Exascale-class APU, and the day is nearing as we move closer to the launch of the Instinct MI300A. The packaging on the MI300A is very similar to the MI300X except it makes use of TCO-optimized memory capacities and Zen 4 cores.
One of the active dies has its two CDNA 3 GCDs cut out and replaced with three Zen 4 CCDs, which offer their own separate pool of cache and core IPs. You get 8 cores and 16 threads per CCD, so that’s a total of 24 cores and 48 threads on the active die. There’s also 24 MB of L2 cache (1 MB per core) and a separate pool of L3 cache (32 MB per CCD). It should be remembered that the CDNA 3 GCDs also have their own separate L2 cache.
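The CPU-side totals follow directly from the per-CCD figures. The sketch below uses only numbers from the paragraph above; the combined figure for the per-CCD cache pool is derived here and is not quoted in the article, and the variable names are illustrative.

```python
# Figures quoted in the article.
CCDS = 3
CORES_PER_CCD = 8
THREADS_PER_CCD = 16
L2_MB_PER_CORE = 1
CCD_CACHE_MB = 32       # the separate per-CCD cache pool

total_cores = CCDS * CORES_PER_CCD
total_threads = CCDS * THREADS_PER_CCD
total_l2_mb = total_cores * L2_MB_PER_CORE
# Derived, not stated in the article: the per-CCD pools summed together.
total_ccd_cache_mb = CCDS * CCD_CACHE_MB

print(f"{total_cores} cores / {total_threads} threads")
print(f"L2: {total_l2_mb} MB, per-CCD cache combined: {total_ccd_cache_mb} MB")
```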
Rounding up some of the highlighted features of the AMD Instinct MI300 accelerators, we have:
- First Integrated CPU+GPU Package
- Aimed At The Exascale Supercomputer Market
- AMD MI300A (Integrated CPU + GPU)
- AMD MI300X (GPU Only)
- 153 Billion Transistors
- Up To 24 Zen 4 Cores
- CDNA 3 GPU Architecture
- Up To 192 GB HBM3 Memory
- Up To 8 Chiplets + 8 Memory Stacks (5nm + 6nm process)
Bringing all of these together, AMD will work with its ecosystem enablers and partners to offer MI300 AI accelerators in 8-way configurations featuring SXM designs that connect to the mainboard with mezzanine connectors. It will be interesting to see what kind of configurations these will be offered in, and while SXM boards are a given, we can also expect a few variants in PCIe form factors.
For now, AMD should know that its competitors are also going full steam ahead on the AI craze, with NVIDIA already teasing some huge figures for its 2024 Blackwell GPUs and Intel prepping its Gaudi 3 and Falcon Shores GPUs for launch in the coming years too. One thing is for certain at the current moment: AI customers will gobble up almost anything they can get, and everyone is going to take advantage of that. But AMD has a very formidable solution that is aiming to be not just an alternative to NVIDIA but a leader in the AI segment, and we hope that MI300 can help the company achieve that success.
AMD Radeon Instinct Accelerators
|Accelerator Name||AMD Instinct MI400||AMD Instinct MI300||AMD Instinct MI250X||AMD Instinct MI250||AMD Instinct MI210||AMD Instinct MI100||AMD Radeon Instinct MI60||AMD Radeon Instinct MI50||AMD Radeon Instinct MI25||AMD Radeon Instinct MI8||AMD Radeon Instinct MI6|
|CPU Architecture||Zen 5 (Exascale APU)||Zen 4 (Exascale APU)||N/A||N/A||N/A||N/A||N/A||N/A||N/A||N/A||N/A|
|GPU Architecture||CDNA 4||Aqua Vanjaram (CDNA 3)||Aldebaran (CDNA 2)||Aldebaran (CDNA 2)||Aldebaran (CDNA 2)||Arcturus (CDNA 1)||Vega 20||Vega 20||Vega 10||Fiji XT||Polaris 10|
|GPU Process Node||4nm||5nm+6nm||6nm||6nm||6nm||7nm FinFET||7nm FinFET||7nm FinFET||14nm FinFET||28nm||14nm FinFET|
|GPU Chiplets||TBD||8 (MCM)||2 (MCM), 1 (Per Die)||1 (Per Die)||1 (Per Die)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)||1 (Monolithic)|
|GPU Cores||TBD||Up To 19,456||14,080||13,312||6,656||7,680||4,096||3,840||4,096||4,096||2,304|
|GPU Clock Speed||TBD||TBA||1700 MHz||1700 MHz||1700 MHz||1500 MHz||1800 MHz||1725 MHz||1500 MHz||1000 MHz||1237 MHz|
|FP16 Compute||TBD||TBA||383 TOPs||362 TOPs||181 TOPs||185 TFLOPs||29.5 TFLOPs||26.5 TFLOPs||24.6 TFLOPs||8.2 TFLOPs||5.7 TFLOPs|
|FP32 Compute||TBD||TBA||95.7 TFLOPs||90.5 TFLOPs||45.3 TFLOPs||23.1 TFLOPs||14.7 TFLOPs||13.3 TFLOPs||12.3 TFLOPs||8.2 TFLOPs||5.7 TFLOPs|
|FP64 Compute||TBD||TBA||47.9 TFLOPs||45.3 TFLOPs||22.6 TFLOPs||11.5 TFLOPs||7.4 TFLOPs||6.6 TFLOPs||768 GFLOPs||512 GFLOPs||384 GFLOPs|
|VRAM||TBD||192 GB HBM3||128 GB HBM2e||128 GB HBM2e||64 GB HBM2e||32 GB HBM2||32 GB HBM2||16 GB HBM2||16 GB HBM2||4 GB HBM1||16 GB GDDR5|
|Memory Clock||TBD||5.2 Gbps||3.2 Gbps||3.2 Gbps||3.2 Gbps||1200 MHz||1000 MHz||1000 MHz||945 MHz||500 MHz||1750 MHz|
|Memory Bus||TBD||8192-bit||8192-bit||8192-bit||4096-bit||4096-bit||4096-bit||4096-bit||2048-bit||4096-bit||256-bit|
|Memory Bandwidth||TBD||5.2 TB/s||3.2 TB/s||3.2 TB/s||1.6 TB/s||1.23 TB/s||1 TB/s||1 TB/s||484 GB/s||512 GB/s||224 GB/s|
|Form Factor||TBD||OAM||OAM||OAM||Dual Slot Card||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Full Length||Dual Slot, Half Length||Single Slot, Full Length|
|Cooling||TBD||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling||Passive Cooling|