GPU Die Overview
Click "Next →" or "▶ Auto-play" to tour the GPU die hierarchy layer by layer.
GPC × 12 (Graphics Processing Clusters)
GPC 0 (expanded view)
SM (Streaming Multiprocessor)
CUDA Cores
128 FP32
Tensor Cores
4× 5th Gen
RT Core
4th Gen
Shared Mem / L1
228 KB SRAM
Register File: 256 KB (65536 × 32-bit)
SM 1
SM 2
SM 3
... ×12 SMs/GPC
GPC 1
GPC 2
GPC 3
GPC 4-11
...
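The per-SM figures above can be rolled up into whole-die totals. The sketch below is plain arithmetic on the numbers shown in the diagram (12 GPCs, 12 SMs per GPC, 128 FP32 cores and 4 Tensor Cores per SM, 65536 32-bit registers per SM). It assumes every GPC and SM is enabled; shipping products often disable some units, so real parts may report lower counts.

```python
# Per-unit figures taken from the diagram above.
GPCS = 12
SMS_PER_GPC = 12
FP32_CORES_PER_SM = 128
TENSOR_CORES_PER_SM = 4
REGISTERS_PER_SM = 65536      # 32-bit registers
BYTES_PER_REGISTER = 4

# Assumption: a fully enabled die (no GPCs or SMs fused off).
total_sms = GPCS * SMS_PER_GPC
total_fp32_cores = total_sms * FP32_CORES_PER_SM
total_tensor_cores = total_sms * TENSOR_CORES_PER_SM

# Register file size per SM in KB, and summed over the die in MB.
register_file_kb_per_sm = REGISTERS_PER_SM * BYTES_PER_REGISTER // 1024
register_file_mb_total = total_sms * register_file_kb_per_sm / 1024

print(total_sms)                # 144
print(total_fp32_cores)         # 18432
print(total_tensor_cores)       # 576
print(register_file_kb_per_sm)  # 256
print(register_file_mb_total)   # 36.0
```

The 256 KB result matches the register file label in the SM box, which is a quick consistency check on the diagram's own numbers.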
L2 Cache
96 MB | ~12 TB/s | All SMs share
GigaThread Engine
Block scheduling
Warp assignment
Load balancing
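To get a feel for what block scheduling has to work with, the sketch below estimates how many thread blocks can be resident on one SM and how many "waves" a grid needs to cover all SMs. It is a deliberately simplified model, not the hardware's actual policy: it considers only the register and shared-memory budgets from the diagram, assumes the full 228 KB of SRAM is usable as shared memory, and ignores per-SM thread, warp, and block limits as well as allocation granularity. The kernel parameters are hypothetical.

```python
import math

# Figures from the diagram above.
REGISTERS_PER_SM = 65536          # 32-bit registers
SHARED_MEM_PER_SM = 228 * 1024    # bytes; assumes all SRAM is usable as shared memory
TOTAL_SMS = 12 * 12               # 12 GPCs x 12 SMs per GPC, fully enabled die

def resident_blocks_per_sm(threads_per_block, regs_per_thread, smem_per_block):
    """Upper bound on co-resident blocks per SM from registers and shared memory only."""
    by_regs = REGISTERS_PER_SM // (threads_per_block * regs_per_thread)
    if smem_per_block == 0:
        return by_regs
    by_smem = SHARED_MEM_PER_SM // smem_per_block
    return min(by_regs, by_smem)

def waves(grid_blocks, blocks_per_sm):
    """Number of scheduling rounds needed if every SM holds blocks_per_sm blocks."""
    return math.ceil(grid_blocks / (blocks_per_sm * TOTAL_SMS))

# Hypothetical kernel: 256 threads/block, 64 registers/thread, 48 KB shared memory/block.
per_sm = resident_blocks_per_sm(256, 64, 48 * 1024)
print(per_sm)                  # 4
print(waves(10_000, per_sm))   # 18
```

In this example both budgets allow 4 blocks per SM, so 576 blocks run at once and a 10,000-block grid takes 18 waves, the last of which is only partly full.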
HBM3e (GDDR7)
Stack 0
Stack 1
Stack 2
Stack 3
Stack 4
Stack 5
96 GB total capacity
1792 GB/s bandwidth
512-bit interface
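The bandwidth and interface-width figures are linked by the usual peak-bandwidth relation: bandwidth in GB/s equals bus width in bits times the per-pin data rate in Gbit/s, divided by 8. The sketch below uses that relation to back out the per-pin rate implied by the diagram's numbers. The function names are illustrative, and the result is a theoretical peak, not a measured throughput.

```python
def implied_pin_rate_gbps(bandwidth_gb_s, bus_width_bits):
    """Per-pin data rate (Gbit/s) implied by a peak bandwidth and a bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits

def peak_bandwidth_gb_s(bus_width_bits, pin_rate_gbps):
    """Theoretical peak bandwidth (GB/s) for a given bus width and per-pin rate."""
    return bus_width_bits * pin_rate_gbps / 8

# Figures from the diagram: 1792 GB/s over a 512-bit interface.
rate = implied_pin_rate_gbps(1792, 512)
print(rate)                            # 28.0
print(peak_bandwidth_gb_s(512, rate))  # 1792.0
```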
NVLink 5.0
1800 GB/s
GPU↔GPU
PCIe 5.0 x16
128 GB/s
CPU↔GPU
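To compare the links, the sketch below estimates how long it would take to move the full 96 GB of device memory over each path at the rates shown in the diagram. It treats each quoted figure as a sustained one-way rate, which is optimistic: link figures are often quoted as bidirectional aggregates, and real transfers also pay protocol and latency overheads.

```python
# Rates in GB/s, taken from the diagram above.
LINK_RATES_GB_S = {
    "PCIe 5.0 x16 (CPU<->GPU)": 128,
    "NVLink 5.0 (GPU<->GPU)": 1800,
    "Device memory": 1792,
}
DEVICE_MEMORY_GB = 96

def transfer_seconds(size_gb, rate_gb_s):
    """Idealized transfer time: size divided by rate, with no overheads."""
    return size_gb / rate_gb_s

for name, rate in LINK_RATES_GB_S.items():
    seconds = transfer_seconds(DEVICE_MEMORY_GB, rate)
    print(f"{name}: {seconds * 1000:.1f} ms")
```

Under these assumptions the PCIe path takes 0.75 s, roughly 14 times longer than a single pass over device memory, which is why keeping data resident on the GPU matters.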
CUDA Core (FP32/INT32)
Tensor Core (FP4/FP8/BF16/TF32)
RT Core (Ray Tracing)
SRAM (Shared Mem + L1)
HBM (Global Memory)
L2 Cache