AMD Instinct
Release date: June 20, 2017
Designed by: AMD
Marketed by: AMD
Models: MI Series
Transistors:
  • 5.7B (Polaris 10) 14 nm
  • 8.9B (Fiji) 28 nm
  • 12.5B (Vega 10) 14 nm
  • 13.2B (Vega 20) 7 nm
  • 25.6B (Arcturus) 7 nm
  • 58.2B (Aldebaran) 6 nm

AMD Instinct is AMD's brand of professional GPUs.[1][2] It replaced AMD's FirePro S brand in 2016. Compared to the Radeon brand of mainstream consumer/gamer products, the Instinct product line is intended to accelerate deep learning, artificial neural network, and high-performance computing/GPGPU applications.

The Instinct product line competes directly with Nvidia's Ampere, Intel's Xeon Phi, and Intel's forthcoming Xe lines of machine learning and GPGPU cards.

Before the introduction of the MI100 in November 2020, the family was known as AMD Radeon Instinct; AMD then dropped the Radeon brand from the name.

Supercomputers based on AMD CPUs and AMD Instinct GPUs lead the Green500 supercomputer list, holding the top four spots with a lead of more than 50% over any other system. The second of these, Frontier, is also the fastest system in the world on the TOP500 list.

Products

The three initial Radeon Instinct products were announced on December 12, 2016, and released on June 20, 2017, with each based on a different architecture.[3][4]

MI6

The MI6 is a passively cooled, Polaris 10-based card with 16 GB of GDDR5 memory and a TDP below 150 W.[1][2] At 5.7 TFLOPS (FP16 and FP32), the MI6 is expected to be used primarily for inference rather than neural network training. The MI6 has a peak double precision (FP64) compute performance of 358 GFLOPS.[5]

MI8

The MI8 is a Fiji-based card, analogous to the R9 Nano, and expected to have a TDP below 175 W.[1] The MI8 has 4 GB of High Bandwidth Memory. At 8.2 TFLOPS (FP16 and FP32), the MI8 is marketed toward inference. The MI8 has a peak double precision (FP64) compute performance of 512 GFLOPS.[6]

MI25

The MI25 is a Vega-based card that uses HBM2 memory. Its performance is expected to be 12.3 TFLOPS with FP32 numbers. In contrast to the MI6 and MI8, the MI25 gains performance at lower precision, and is accordingly expected to reach 24.6 TFLOPS with FP16 numbers. The MI25 is rated at a TDP below 300 W with passive cooling. It also provides a peak double precision (FP64) performance of 768 GFLOPS, which is 1/16 of its FP32 rate.[7]
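The peak figures quoted for the three initial cards can be reproduced with a short calculation. The following is a minimal sketch, assuming that one fused multiply-add (FMA) counts as two floating-point operations per shader per clock, and using the shader counts and boost clocks listed in the chipset table of this article. The function name `peak_tflops` and the `rate` parameter are illustrative only and do not come from any AMD tool. The 1/16 FP64 rate is stated above for the MI25; applying it to the MI6 and MI8 is an assumption that is consistent with the table figures.

```python
def peak_tflops(shaders: int, clock_mhz: float, rate: float = 1.0) -> float:
    """Peak throughput in TFLOPS, counting one FMA as two operations.

    `rate` scales the result relative to FP32 throughput
    (2.0 for double-rate FP16, 1/16 for 1/16-rate FP64).
    """
    ops_per_second = shaders * 2 * clock_mhz * 1e6 * rate
    return ops_per_second / 1e12


# (shaders, boost clock in MHz, FP16 rate relative to FP32), taken from the
# chipset table in this article. The FP64 rate of 1/16 is assumed for all three.
CARDS = {
    "MI6": (2304, 1233, 1.0),
    "MI8": (4096, 1000, 1.0),
    "MI25": (4096, 1500, 2.0),
}

for name, (shaders, clock_mhz, fp16_rate) in CARDS.items():
    fp32 = peak_tflops(shaders, clock_mhz)
    fp16 = peak_tflops(shaders, clock_mhz, fp16_rate)
    fp64 = peak_tflops(shaders, clock_mhz, 1 / 16)
    print(f"{name}: FP16 {fp16:.2f}, FP32 {fp32:.2f}, FP64 {fp64:.3f} TFLOPS")
```

For the MI25 this gives 12.288 TFLOPS (FP32), 24.576 TFLOPS (FP16) and 0.768 TFLOPS (FP64), which round to the figures quoted above.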

MI300 Series

The MI300A and MI300X are data center accelerators that use the CDNA 3 architecture, which is optimized for high-performance computing (HPC) and generative artificial intelligence (AI) workloads. The CDNA 3 architecture features a scalable multi-chip module (MCM) design that leverages TSMC’s advanced packaging technologies, such as CoWoS (chip-on-wafer-on-substrate) and InFO (integrated fan-out), to combine multiple chiplets on a single interposer. The chiplets are interconnected by AMD’s Infinity Fabric, which enables high-speed and low-latency data transfer between the chiplets and the host system.

The MI300A is an accelerated processing unit (APU) that integrates 24 Zen 4 CPU cores with six CDNA 3 accelerator complex dies (XCDs), for a total of 228 CUs in the GPU section, together with 128 GB of HBM3 memory. The Zen 4 CPU cores are based on the 5 nm process node and support the x86-64 instruction set, as well as AVX-512 and BFloat16 extensions. They can run general-purpose applications and provide host-side computation for the GPU cores. The MI300A has a peak performance of 61.3 TFLOPS of FP64 (122.6 TFLOPS FP64 matrix) and 980.6 TFLOPS of FP16 (1961.2 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300A supports PCIe 5.0 and CXL 2.0 interfaces, which allow it to communicate with other devices and accelerators in a heterogeneous system.

The MI300X is a dedicated generative AI accelerator that replaces the CPU cores with additional GPU cores and memory, resulting in a total of 304 CUs and 192 GB of HBM3 memory. The MI300X is designed to accelerate generative AI applications, such as natural language processing, computer vision, and deep learning. The MI300X has a peak performance of 653.7 TFLOPS of TF32 (1307.4 TFLOPS with sparsity) and 1307.4 TFLOPS of FP16 (2614.9 TFLOPS with sparsity), as well as 5.3 TB/s of memory bandwidth. The MI300X also supports PCIe 5.0 and CXL 2.0 interfaces, as well as AMD’s ROCm software stack, which provides a unified programming model and tools for developing and deploying generative AI applications on AMD hardware.[8][9][10]
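The relationship between the MI300A and MI300X figures can be checked with simple arithmetic. The sketch below uses only numbers quoted in this article (CU counts, dense FP16 and FP64 figures, and the 2100 MHz clock from the chipset table). The per-CU rate it prints is inferred from those numbers, not taken from AMD documentation, and the helper names are hypothetical. It also assumes that the "with sparsity" and "FP64 matrix" figures are twice the dense and vector figures respectively, which is what the quoted numbers suggest.

```python
# Figures quoted in the text and chipset table of this article.
CLOCK_GHZ = 2.1  # 2100 MHz engine clock listed for the MI300 series
MI300A = {"cus": 228, "fp16_tflops": 980.6, "fp64_tflops": 61.3}
MI300X = {"cus": 304, "fp16_tflops": 1307.4, "fp64_tflops": 81.7}


def flops_per_cu_per_clock(tflops: float, cus: int, clock_ghz: float) -> float:
    """Back out the implied operations per compute unit per clock cycle."""
    return tflops * 1e12 / (cus * clock_ghz * 1e9)


def scale_by_cus(tflops: float, from_cus: int, to_cus: int) -> float:
    """Scale a peak figure linearly with CU count at an unchanged clock."""
    return tflops * to_cus / from_cus


for name, chip in (("MI300A", MI300A), ("MI300X", MI300X)):
    per_cu = flops_per_cu_per_clock(chip["fp16_tflops"], chip["cus"], CLOCK_GHZ)
    print(f"{name}: ~{per_cu:.0f} FP16 FLOPs per CU per clock")
    print(f"  FP16 with sparsity (2x dense): {2 * chip['fp16_tflops']:.1f} TFLOPS")
    print(f"  FP64 matrix (2x vector): {2 * chip['fp64_tflops']:.1f} TFLOPS")

predicted = scale_by_cus(MI300A["fp16_tflops"], MI300A["cus"], MI300X["cus"])
print(f"MI300X FP16 predicted from MI300A: {predicted:.1f} TFLOPS")
```

Both parts imply roughly the same per-CU FP16 rate, so the MI300X's higher peak follows from its larger CU count alone. Small differences in the last digit (for example 2614.8 versus the quoted 2614.9) come from rounding in the published figures.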

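| Accelerator | Architecture | Lithography | Compute Units | Memory | Memory Type | PCIe Support | Form Factor | FP16 Performance | BF16 Performance | FP32 Performance | FP32 Matrix Performance | FP64 Performance | FP64 Matrix Performance | INT8 Performance | INT4 Performance | TBP Peak |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MI6 | GCN 4 | 14 nm | 36 | 16 GB | GDDR5 | 3.0 | PCIe | 5.7 TFLOPS | N/A | 5.7 TFLOPS | N/A | 358 GFLOPS | N/A | N/A | N/A | 150 W |
| MI8 | GCN 3 | 28 nm | 64 | 4 GB | HBM | 3.0 | PCIe | 8.2 TFLOPS | N/A | 8.2 TFLOPS | N/A | 512 GFLOPS | N/A | N/A | N/A | 175 W |
| MI25 | GCN 5 | 14 nm | 64 | 16 GB | HBM2 | 3.0 | PCIe | 24.6 TFLOPS | N/A | 12.3 TFLOPS | N/A | 768 GFLOPS | N/A | N/A | N/A | 300 W |
| MI50 | GCN 5 | 7 nm | 60 | 16 GB | HBM2 | 4.0 | PCIe | 26.5 TFLOPS | N/A | 13.3 TFLOPS | N/A | 6.6 TFLOPS | N/A | 53 TOPS | N/A | 300 W |
| MI60 | GCN 5 | 7 nm | 64 | 32 GB | HBM2 | 4.0 | PCIe | 29.5 TFLOPS | N/A | 14.7 TFLOPS | N/A | 7.4 TFLOPS | N/A | 59 TOPS | N/A | 300 W |
| MI100 | CDNA | 7 nm | 120 | 32 GB | HBM2 | 4.0 | PCIe | 184.6 TFLOPS | 92.3 TFLOPS | 23.1 TFLOPS | 46.1 TFLOPS | 11.5 TFLOPS | N/A | 184.6 TOPS | 184.6 TOPS | 300 W |
| MI210 | CDNA 2 | 6 nm | 104 | 64 GB | HBM2e | 4.0 | PCIe | 181 TFLOPS | 181 TFLOPS | 22.6 TFLOPS | 45.3 TFLOPS | 22.6 TFLOPS | 45.3 TFLOPS | 181 TOPS | 181 TOPS | 300 W |
| MI250 | CDNA 2 | 6 nm | 208 | 128 GB | HBM2e | 4.0 | OAM | 362.1 TFLOPS | 362.1 TFLOPS | 45.3 TFLOPS | 90.5 TFLOPS | 45.3 TFLOPS | 90.5 TFLOPS | 362.1 TOPS | 362.1 TOPS | 560 W |
| MI250X | CDNA 2 | 6 nm | 220 | 128 GB | HBM2e | 4.0 | OAM | 383 TFLOPS | 383 TFLOPS | 47.9 TFLOPS | 95.7 TFLOPS | 47.9 TFLOPS | 95.7 TFLOPS | 383 TOPS | 383 TOPS | 560 W |
| MI300A | CDNA 3 | 6 & 5 nm | 228 | 128 GB | HBM3 | 5.0 | APU SH5 socket | 980.6 TFLOPS; 1961.2 TFLOPS (with sparsity) | 980.6 TFLOPS; 1961.2 TFLOPS (with sparsity) | 122.6 TFLOPS | 122.6 TFLOPS | 61.3 TFLOPS | 122.6 TFLOPS | 1961.2 TOPS; 3922.3 TOPS (with sparsity) | N/A | 550 W; 760 W (liquid cooling) |
| MI300X | CDNA 3 | 6 & 5 nm | 304 | 192 GB | HBM3 | 5.0 | OAM | 1307.4 TFLOPS; 2614.9 TFLOPS (with sparsity) | 1307.4 TFLOPS; 2614.9 TFLOPS (with sparsity) | 163.4 TFLOPS | 163.4 TFLOPS | 81.7 TFLOPS | 163.4 TFLOPS | 2614.9 TOPS; 5229.8 TOPS (with sparsity) | N/A | 750 W |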

Software

ROCm

As of 2022, the following software is grouped under the Radeon Open Compute (ROCm) meta-project.

MxGPU

The MI6, MI8, and MI25 products all support AMD's MxGPU virtualization technology, enabling sharing of GPU resources across multiple users.[1][11]

MIOpen

MIOpen is AMD's library for GPU acceleration of deep learning.[1] Much of it extends GPUOpen's Boltzmann Initiative software.[11] It is intended to compete with the deep learning portions of Nvidia's CUDA library. It supports the following deep learning frameworks: Theano, Caffe, TensorFlow, MXNet, Microsoft Cognitive Toolkit, Torch, and Chainer. Programming is supported in OpenCL and Python, and CUDA code can be compiled through AMD's Heterogeneous-compute Interface for Portability and Heterogeneous Compute Compiler.

Chipset table

| Model (Code name) | Release date | Architecture & fab | Transistors & die size | Core config[e] | Core clock[a] (MHz) | Texture fillrate[a][b] (GT/s) | Pixel fillrate[a][c] (GP/s) | Half precision[a][d] (TFLOPS) | Single precision[a][d] (TFLOPS) | Double precision[a][d] (TFLOPS) | Memory size (GB) | Memory bus type & width | Memory bandwidth (GB/s) | Memory clock (MT/s) | TBP | Bus interface |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Radeon Instinct MI6 (Polaris 10)[12][13][14][15][16][17] | Jun 20, 2017 | GCN 4, GloFo 14LP | 5.7×10⁹, 232 mm² | 2304:144:32, 36 CU | 1120 / 1233 | 161.3 / 177.6 | 35.84 / 39.46 | 5.161 / 5.682 | 5.161 / 5.682 | 0.323 / 0.355 | 16 | GDDR5, 256-bit | 224 | 7000 | 150 W | PCIe 3.0 ×16 |
| Radeon Instinct MI8 (Fiji)[12][13][14][18][19][20] | Jun 20, 2017 | GCN 3, TSMC 28 nm | 8.9×10⁹, 596 mm² | 4096:256:64, 64 CU | 1000 | 256.0 | 64.00 | 8.192 | 8.192 | 0.512 | 4 | HBM, 4096-bit | 512 | 1000 | 175 W | PCIe 3.0 ×16 |
| Radeon Instinct MI25 (Vega 10)[12][13][14][21][22][23][24] | Jun 20, 2017 | GCN 5, GloFo 14LP | 12.5×10⁹, 510 mm² | 4096:256:64, 64 CU | 1400 / 1500 | 358.4 / 384.0 | 89.60 / 96.00 | 22.94 / 24.58 | 11.47 / 12.29 | 0.717 / 0.768 | 16 | HBM2, 2048-bit | 484 | 1890 | 300 W | PCIe 3.0 ×16 |
| Radeon Instinct MI50 (Vega 20)[25][26][27][28][29][30] | Nov 18, 2018 | GCN 5, TSMC N7 | 13.2×10⁹, 331 mm² | 3840:240:64, 60 CU | 1450 / 1725 | 348.0 / 414.0 | 92.80 / 110.4 | 22.27 / 26.50 | 11.14 / 13.25 | 5.568 / 6.624 | 16 or 32 | HBM2, 4096-bit | 1024 | 2000 | 300 W | PCIe 4.0 ×16 |
| Radeon Instinct MI60 (Vega 20)[26][31][32][33] | Nov 18, 2018 | GCN 5, TSMC N7 | 13.2×10⁹, 331 mm² | 4096:256:64, 64 CU | 1500 / 1800 | 384.0 / 460.8 | 96.00 / 115.2 | 24.58 / 29.49 | 12.29 / 14.75 | 6.144 / 7.373 | 32 | HBM2, 4096-bit | 1024 | 2000 | 300 W | PCIe 4.0 ×16 |
| AMD Instinct MI100 (Arcturus)[34][35][36] | Nov 16, 2020 | CDNA, TSMC N7 | 25.6×10⁹, 750 mm² | 7680:480:-, 120 CU | 1000 / 1502 | 480.0 / 721.0 | - | 122.9 / 184.6 | 15.36 / 23.07 | 7.680 / 11.54 | 32 | HBM2, 4096-bit | 1228.8 | 2400 | 300 W | PCIe 4.0 ×16 |
| AMD Instinct MI210 (Aldebaran)[37][38][39] | Mar 22, 2022 | CDNA 2, TSMC N6 | 28×10⁹, ~770 mm² | 6656:416:-, 104 CU (1 × GCD)[f] | 1000 / 1700 | 416.0 / 707.2 | - | 106.5 / 181.0 | 13.31 / 22.63 | 13.31 / 22.63 | 64 | HBM2e, 4096-bit | 1638.4 | 3200 | 300 W | PCIe 4.0 ×16 |
| AMD Instinct MI250 (Aldebaran)[40][41][42] | Nov 8, 2021 | CDNA 2, TSMC N6 | 58×10⁹, 1540 mm² | 13312:832:-, 208 CU (2 × GCD) | 1000 / 1700 | 832.0 / 1414 | - | 213.0 / 362.1 | 26.62 / 45.26 | 26.62 / 45.26 | 2 × 64 | HBM2e, 2 × 4096-bit[g] | 2 × 1638.4 | 3200 | 500 W; 560 W (peak) | PCIe 4.0 ×16 |
| AMD Instinct MI250X (Aldebaran)[43][41][44] | Nov 8, 2021 | CDNA 2, TSMC N6 | 58×10⁹, 1540 mm² | 14080:880:-, 220 CU (2 × GCD) | 1000 / 1700 | 880.0 / 1496 | - | 225.3 / 383.0 | 28.16 / 47.87 | 28.16 / 47.87 | 2 × 64 | HBM2e, 2 × 4096-bit[g] | 2 × 1638.4 | 3200 | 500 W; 560 W (peak) | PCIe 4.0 ×16 |
| AMD Instinct MI300A (Antares)[45][46][47][48] | Dec 6, 2023 | CDNA 3, TSMC N5 & N6 | 146×10⁹, 1017 mm² | 14592:912:-, 228 CU (6 × XCD), plus 24 AMD Zen 4 x86 CPU cores | 2100 | 912.0 / 1550.4 | - | 980.6; 1961.2 (with sparsity) | 122.6 | 61.3; 122.6 (FP64 matrix) | 128 | HBM3, 8192-bit | 5300 | 5200 | 550 W; 760 W (liquid cooling) | PCIe 5.0 ×16 |
| AMD Instinct MI300X (Aqua Vanjaram)[49][50][51][52] | Dec 6, 2023 | CDNA 3, TSMC N5 & N6 | 153×10⁹, 1017 mm² | 19456:1216:-, 304 CU (8 × XCD) | 2100 | 1216.0 / 2062.1 | - | 1307.4; 2614.9 (with sparsity) | 163.4 | 81.7; 163.4 (FP64 matrix) | 192 | HBM3, 8192-bit | 5300 | 5200 | 750 W | PCIe 5.0 ×16 |

  a. Boost values (if available) are stated after the base value, separated by a slash.
  b. Texture fillrate is calculated as the number of texture mapping units multiplied by the base (or boost) core clock speed.
  c. Pixel fillrate is calculated as the number of render output units multiplied by the base (or boost) core clock speed.
  d. Precision performance is calculated from the base (or boost) core clock speed based on an FMA operation.
  e. Unified shaders : Texture mapping units : Render output units, followed by the number of compute units (CU).
  f. GCD refers to a Graphics Compute Die. Each GCD is a separate piece of silicon.
  g. CDNA 2 based cards use a design with two dies on the same package. The dies are linked by a 400 GB/s bidirectional Infinity Fabric link and are addressed as individual GPUs by the host system.
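
The fillrate definitions in the footnotes above translate directly into code. The following is a minimal sketch that applies them to the MI6 row (144 texture mapping units, 32 render output units, 1120 MHz base clock). The memory bandwidth helper is not described in the footnotes; it uses the general relationship of bus width in bytes multiplied by the transfer rate, which is an assumption added here and happens to reproduce the table's bandwidth column. All function names are illustrative.

```python
def texture_fillrate_gt_s(tmus: int, clock_mhz: float) -> float:
    """Texture fillrate in GT/s: texture mapping units times core clock."""
    return tmus * clock_mhz / 1000


def pixel_fillrate_gp_s(rops: int, clock_mhz: float) -> float:
    """Pixel fillrate in GP/s: render output units times core clock."""
    return rops * clock_mhz / 1000


def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Memory bandwidth in GB/s: bytes per transfer times transfers per second.

    Assumption: not taken from the footnotes, but consistent with the table.
    """
    return bus_width_bits / 8 * data_rate_mt_s / 1000


# MI6 (Polaris 10): config 2304:144:32, base clock 1120 MHz,
# 256-bit GDDR5 at 7000 MT/s, as listed in the table.
print(f"Texture: {texture_fillrate_gt_s(144, 1120):.1f} GT/s")
print(f"Pixel:   {pixel_fillrate_gp_s(32, 1120):.2f} GP/s")
print(f"Memory:  {memory_bandwidth_gb_s(256, 7000):.0f} GB/s")
```

The same helpers give about 484 GB/s for the MI25 (2048-bit bus at 1890 MT/s), matching its row in the table.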

References

  1. Smith, Ryan (December 12, 2016). "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017". AnandTech. Retrieved December 12, 2016.
  2. Shrout, Ryan (December 12, 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Perspective. Retrieved December 12, 2016.
  3. WhyCry (December 12, 2016). "AMD announces first VEGA accelerator:RADEON INSTINCT MI25 for deep-learning". VideoCardz. Retrieved June 6, 2022.
  4. Mujtaba, Hassan (June 21, 2017). "AMD Radeon Instinct MI25 Accelerator With 16 GB HBM2 Specifications Detailed – Launches Today Along With Instinct MI8 and Instinct MI6". Wccftech. Retrieved June 6, 2022.
  5. "Radeon Instinct MI6". Radeon Instinct. AMD. Retrieved June 22, 2017.
  6. "Radeon Instinct MI8". Radeon Instinct. AMD. Retrieved June 22, 2017.
  7. "Radeon Instinct MI25". Radeon Instinct. AMD. Retrieved June 22, 2017.
  8. "AMD CDNA 3 Architecture" (PDF). AMD CDNA Architecture. AMD. Retrieved December 7, 2023.
  9. "AMD INSTINCT MI300A APU" (PDF). AMD Instinct Accelerators. AMD. Retrieved December 7, 2023.
  10. "AMD INSTINCT MI300X APU" (PDF). AMD Instinct Accelerators. AMD. Retrieved December 7, 2023.
  11. Kampman, Jeff (December 12, 2016). "AMD opens up machine learning with Radeon Instinct". Tech Report. Retrieved December 12, 2016.
  12. Smith, Ryan (December 12, 2016). "AMD Announces Radeon Instinct: GPU Accelerators for Deep Learning, Coming in 2017". AnandTech. Retrieved December 12, 2016.
  13. Shrout, Ryan (December 12, 2016). "Radeon Instinct Machine Learning GPUs include Vega, Preview Performance". PC Perspective. Retrieved December 12, 2016.
  14. Kampman, Jeff (December 12, 2016). "AMD opens up machine learning with Radeon Instinct". Tech Report. Retrieved December 12, 2016.
  15. "Radeon Instinct MI6". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
  16. "AMD Radeon Instinct MI6 Datasheet" (PDF). usermanual.wiki. Retrieved May 27, 2022.
  17. "AMD Radeon Instinct MI6 Specs". TechPowerUp. Retrieved May 27, 2022.
  18. "Radeon Instinct MI8". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
  19. "AMD Radeon Instinct MI8 Datasheet" (PDF). usermanual.wiki. Retrieved May 27, 2022.
  20. "AMD Radeon Instinct MI8 Specs". TechPowerUp. Retrieved May 27, 2022.
  21. Smith, Ryan (January 5, 2017). "The AMD Vega Architecture Teaser: Higher IPC, Tiling, & More, coming in H1'2017". AnandTech. Retrieved January 10, 2017.
  22. "Radeon Instinct MI25". AMD. Archived from the original on August 1, 2017. Retrieved May 27, 2022.
  23. "AMD Radeon Instinct MI25 Datasheet" (PDF). AMD. Retrieved May 27, 2022.
  24. "AMD Radeon Instinct MI25 Specs". TechPowerUp. Retrieved May 27, 2022.
  25. Walton, Jarred (January 10, 2019). "Hands on with the AMD Radeon VII". PC Gamer.
  26. "Next Horizon – David Wang Presentation" (PDF). AMD.
  27. "AMD Radeon Instinct MI50 Accelerator (16GB)". AMD. Retrieved December 24, 2022.
  28. "AMD Radeon Instinct MI50 Accelerator (32GB)". AMD. Retrieved December 24, 2022.
  29. "AMD Radeon Instinct MI50 Datasheet" (PDF). AMD. Retrieved December 24, 2022.
  30. "AMD Radeon Instinct MI50 Specs". TechPowerUp. Retrieved May 27, 2022.
  31. "Radeon Instinct MI60". AMD. Archived from the original on November 22, 2018. Retrieved May 27, 2022.
  32. "AMD Radeon Instinct MI60 Datasheet" (PDF). AMD. Retrieved December 24, 2022.
  33. "AMD Radeon Instinct MI60 Specs". TechPowerUp. Retrieved May 27, 2022.
  34. "AMD Instinct MI100 Accelerator". AMD. Retrieved May 27, 2022.
  35. "AMD Instinct MI100 Accelerator Brochure" (PDF). AMD. Retrieved May 27, 2022.
  36. "AMD Radeon Instinct MI100 Specs". TechPowerUp. Retrieved May 26, 2022.
  37. "AMD Instinct MI210 Accelerator". AMD. Retrieved May 27, 2022.
  38. "AMD Instinct MI210 Accelerator Brochure" (PDF). AMD. Retrieved May 27, 2022.
  39. "AMD Radeon Instinct MI210 Specs". TechPowerUp. Retrieved May 27, 2022.
  40. "AMD Instinct MI250 Accelerator". AMD. Retrieved May 27, 2022.
  41. "AMD Instinct MI200 Series Accelerator Datasheet" (PDF). AMD. Retrieved December 24, 2022.
  42. "AMD Radeon Instinct MI250 Specs". TechPowerUp. Retrieved May 26, 2022.
  43. "AMD Instinct MI250X Accelerator". AMD. Retrieved May 27, 2022.
  44. "AMD Radeon Instinct MI250X Specs". TechPowerUp. Retrieved May 26, 2022.
  45. "AMD Instinct MI300A APU". AMD. Retrieved December 12, 2023.
  46. "AMD Instinct MI300A Series Accelerator Datasheet" (PDF). AMD. Retrieved December 12, 2023.
  47. "AMD Radeon Instinct MI300 Specs". TechPowerUp. Retrieved December 12, 2023.
  48. "AMD-CDNA3-white-paper" (PDF). AMD. Retrieved December 12, 2023.
  49. "AMD Instinct MI300X GPU". AMD. Retrieved December 12, 2023.
  50. "AMD Instinct MI300X Series Accelerator Datasheet" (PDF). AMD. Retrieved December 12, 2023.
  51. "AMD Radeon Instinct MI300 Specs". TechPowerUp. Retrieved December 12, 2023.
  52. "AMD-CDNA3-white-paper" (PDF). AMD. Retrieved December 12, 2023.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.